

The Complete Pandas Bootcamp 2021: Data Science with Python

Pandas fully explained | 150+ Exercises | Must-have skills for Machine Learning & Finance | + Scikit-Learn and Seaborn

4.74 (2162 reviews)


34 hours


Last update: Jul 2021


What you will learn

Bring your Data Handling & Data Analysis skills to an outstanding level.

Learn and practice all relevant Pandas methods and workflows with Real-World Datasets

Learn Pandas based on NEW Version 1.x (the days of versions 0.x are over)

Import, clean, and merge messy Data and prepare Data for Machine Learning

Master a complete Machine Learning Project A-Z with Pandas, Scikit-Learn, and Seaborn

Analyze, visualize, and understand your Data with Pandas, Matplotlib, and Seaborn

Practice and master your Pandas skills with Quizzes, 150+ Exercises, and Comprehensive Projects

Import Financial/Stock Data from Web Sources and analyze them with Pandas

Learn and master the most important Pandas workflows for Finance

Learn how to best transition from Versions 0.x to new Version 1.x

Learn the Basics of Pandas and Numpy Coding (Appendix)

Learn and master important Statistical Concepts with scipy


######### UPDATE (November 2020) ###########

  • Added: Introduction to Machine Learning with Pandas and scikit-learn - incl. a comprehensive ML Project A-Z

  • Added: Another comprehensive Final Project (Exploratory Data Analysis) to test your skills

  • Updated to latest Pandas Version 1.1! This is the first course that covers Pandas 1.x. It gives optimal guidance on how to transition from version 0.x to version 1.x!


Welcome to the web's most comprehensive Pandas Bootcamp with 34 hours of video content, 150+ exercises, and two large and comprehensive Final Projects that test your skills! This course has one goal: to bring your data handling skills to the next level so you can build your career in Data Science, Machine Learning, Finance & co.

This course has five parts:

  • Pandas Basics - from Zero to Hero (Part 1).

  • The complete data workflow A-Z with Pandas: Importing, Cleaning, Merging, Aggregating, and Preparing Data for Machine Learning. (Part 2)

  • Two Comprehensive Project Challenges that are frequently used in Data Science job recruiting/assessment centers: Test your skills! (Part 3).

  • Application 1: Pandas for Finance, Investing and other Time Series Data (Part 4)

  • Application 2: Machine Learning with Pandas and scikit-learn (Part 5)

Why should you learn Pandas?

The world is getting more and more data-driven. Data Scientists are gaining ground with $100k+ salaries. It's time to switch from soapbox cars (spreadsheet software like Excel) to highly tuned racing cars (Pandas)!

Python is a great platform/environment for Data Science with powerful Tools for Science, Statistics, Finance, and Machine Learning. The Pandas Library is the Heart of Python Data Science. Pandas enables you to import, clean, join/merge/concatenate, manipulate, and deeply understand your Data and finally prepare/process Data for further Statistical Analysis, Machine Learning, or Data Presentation. In reality, all of these tasks require a high proficiency in Pandas! Data Scientists typically spend up to 85% of their time manipulating Data in Pandas.
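The import, clean, merge, and aggregate steps named above can be sketched in a few lines. This is a minimal illustration, not course material: the two in-memory CSV snippets, the column names, and the choice to fill the missing amount with 0 are all assumptions made for the example.

```python
import io
import pandas as pd

# Hypothetical stand-ins for two messy CSV files (not from the course).
orders_csv = io.StringIO(
    "order_id,customer,amount\n"
    "1,alice,10.5\n"
    "2,BOB,\n"
    "3,alice,4.5\n"
)
customers_csv = io.StringIO(
    "customer,country\n"
    "alice,DE\n"
    "bob,US\n"
)

# Import
orders = pd.read_csv(orders_csv)
customers = pd.read_csv(customers_csv)

# Clean: normalize inconsistent casing and fill the missing amount
# (filling with 0.0 is an assumption made for this sketch).
orders["customer"] = orders["customer"].str.lower()
orders["amount"] = orders["amount"].fillna(0.0)

# Merge: attach the customer's country to every order.
merged = orders.merge(customers, on="customer", how="left")

# Aggregate: total order amount per country.
totals = merged.groupby("country")["amount"].sum()
print(totals)
```

On real data, each step usually needs more inspection (for example `info()` and `isna().sum()`) before a fill or join strategy is chosen.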

Can you start right now?

A frequently asked question of Python Beginners is: "Do I need to become an expert in Python coding before I can start working with Pandas?"

The clear answer is: "No! Do you need to become a Microsoft Software Developer before you can start with Excel? Probably not!"

You require some Python Basics like data types, simple operations/operators, lists and numpy arrays. In the Appendix of this course, you can find a Python crash course. This Python Introduction is tailor-made and sufficient for Data Science purposes!

In addition, this course covers fundamental statistical concepts (coding with scipy).   

In summary, if you primarily want to use Python for Data Science or as a replacement for Excel, this course is a perfect match!

Why should you take this Course?

  • It is the most relevant and comprehensive course on Pandas.

  • It is the most up-to-date course and the first that covers Pandas Version 1.x. The Pandas Library has experienced massive improvements in the last couple of months. Working with and relying on outdated code can be painful.

  • Pandas isn't an isolated tool. It is used together with other Libraries: Matplotlib and Seaborn for Data Visualization; Numpy, Scipy, and Scikit-Learn for Machine Learning and scientific and statistical computing. This course covers all these Libraries.

  • In real-world projects, coding and the business side of things are equally important. This is probably the only Pandas course that teaches both: in-depth Pandas Coding and Big-Picture Thinking.

  • It serves as a Pandas Encyclopedia covering all relevant methods, attributes, and workflows for real-world projects. If you have problems with any method or workflow, you will most likely get help and find a solution in this course.

  • It shows and explains the full real-world Data Workflow A-Z: Starting with importing messy data, cleaning data, merging and concatenating data, grouping and aggregating data, and Exploratory Data Analysis, through to preparing and processing data for Statistics, Machine Learning, Finance, and Data Presentation.

  • It explains Pandas Coding on real Data and real-world Problems. No toy data! This is the best way to learn and understand Pandas.

  • It gives you plenty of opportunities to practice and code on your own. Learning by doing. In the exercises, you can select the level of difficulty with optional hints and guidance/instruction.

  • Pandas is a very powerful tool. But it also has pitfalls that can lead to unintended and undiscovered errors in your data. This course also focuses on commonly made mistakes and errors and teaches you what not to do.

  • Satisfaction guaranteed: Otherwise, get your money back with the 30-day money-back guarantee.

I am looking forward to seeing you in the course!




Getting Started

Overview / Student FAQ

Tips: How to get the most out of this course

Did you know that...?

More FAQ / Important Information

Installation of Anaconda

Opening a Jupyter Notebook

How to use Jupyter Notebooks

How to tackle Pandas Version 1.0


Intro to Tabular Data / Pandas

Download: Part 1 Course Materials

Pandas Basics (DataFrame Basics I)

Create your very first Pandas DataFrame (from csv)

Pandas Display Options and the methods head() & tail()

First Data Inspection

Built-in Functions, Attributes and Methods with Pandas

Make it easy: TAB Completion and Tooltip

First Steps

Explore your own Dataset: Coding Exercise 1 (Intro)

Explore your own Dataset: Coding Exercise 1 (Solution)

Selecting Columns

Selecting one Column with the "dot notation"

Zero-based Indexing and Negative Indexing

Selecting Rows with iloc (position-based indexing)

Slicing Rows and Columns with iloc (position-based indexing)

Position-based Indexing Cheat Sheets

Selecting Rows with loc (label-based indexing)

Slicing Rows and Columns with loc (label-based indexing)

Label-based Indexing Cheat Sheets

Indexing and Slicing with reindex()

Summary, Best Practices and Outlook

Indexing and Slicing

Coding Exercise 2 (Intro)

Coding Exercise 2 (Solution)
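As a rough illustration of the position-based (`iloc`) and label-based (`loc`) indexing listed in this section, here is a small sketch. The DataFrame, its labels, and its values are invented for the example and do not come from the course.

```python
import pandas as pd

# Hypothetical toy DataFrame with string row labels.
df = pd.DataFrame(
    {"age": [30, 25, 41], "city": ["Rome", "Oslo", "Lima"]},
    index=["ann", "bob", "cy"],
)

# Position-based: zero-based, negative positions count from the end,
# and the stop position of a slice is excluded.
first_row = df.iloc[0]
last_two = df.iloc[-2:]

# Label-based: uses index and column labels,
# and a label slice includes BOTH endpoints.
bob_city = df.loc["bob", "city"]
label_slice = df.loc["ann":"bob", "age"]

print(first_row, last_two, bob_city, label_slice, sep="\n\n")
```

The inclusive endpoint of `loc` slices is the detail that most often surprises people coming from plain Python slicing.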

Pandas Series and Index Objects


First Steps with Pandas Series

Analyzing Numerical Series with unique(), nunique() and value_counts()

UPDATE Pandas Version 0.24.0 (Jan 2019)

EXCURSUS: Updating Pandas / Anaconda

Analyzing non-numerical Series with unique(), nunique(), value_counts()

Creating Pandas Series (Part 1)

Creating Pandas Series (Part 2)

Indexing and Slicing Pandas Series

Sorting of Series and Introduction to the inplace - parameter

nlargest() and nsmallest()

idxmin() and idxmax()

Manipulating Pandas Series

Pandas Series

Coding Exercise 3 (Intro)

Coding Exercise 3 (Solution)

First Steps with Pandas Index Objects

Creating Index Objects from Scratch

Changing Row Index with set_index() and reset_index()

Changing Column Labels

Renaming Index & Column Labels with rename()

Pandas Index objects

Coding Exercise 4 (Intro)

Coding Exercise 4 (Solution)

DataFrame Basics II


Filtering DataFrames by one Condition

Filtering DataFrames by many Conditions (AND)

Filtering DataFrames by many Conditions (OR)

Advanced Filtering with between(), isin() and ~

any() and all()

Removing Columns

Removing Rows

Adding new Columns to a DataFrame

Creating Columns based on other Columns

Adding Columns with insert()

Creating DataFrames from Scratch with pd.DataFrame()

Adding new Rows (hands-on approach)

DataFrame Basics II

Coding Exercise 5 (Intro)

Coding Exercise 5 (Solution)

Manipulating Elements in a DataFrame / Slice +++Important, know the Pitfalls!+++


Best Practice (How you should do it)

Chained Indexing: How you should NOT do it (Part 1)

Chained Indexing: How you should NOT do it (Part 2)

View vs. Copy

Simple Rules what to do when...

Manipulating DataFrames / Slices

Coding Exercise 6 (Intro)

Coding Exercise 6 (Solution)
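The pitfall this section warns about is assigning through chained indexing such as `df[mask]["col"] = value`, which may modify a temporary copy instead of the original. A minimal sketch of the commonly recommended alternative follows; the data and column names are assumptions made for illustration, and the course's own rules may be more detailed.

```python
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({"name": ["a", "b", "c"], "age": [17, 25, 40]})

# Recommended: one .loc call with a row mask and a column label,
# so the assignment is applied to df itself.
df.loc[df["age"] < 18, "age"] = 18

# When an independent subset is wanted, make the copy explicit.
adults = df.loc[df["age"] > 18].copy()
adults["age"] = adults["age"] + 1  # changes adults only, not df

print(df)
print(adults)
```

Making the copy explicit documents the intent, and it behaves the same regardless of how a given pandas version treats views and copies.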

DataFrame Basics III


Sorting DataFrames with sort_index() and sort_values()

Ranking DataFrames with rank()

nunique() and nlargest() / nsmallest() with DataFrames

Summary Statistics and Accumulations

The agg() method

Coding Exercise 7 (Intro)

Coding Exercise 7 (Solution)

User-defined Functions with apply(), map() and applymap()

Hierarchical Indexing (Part 1)

Hierarchical Indexing (Part 2)

String Operations (Part 1)

String Operations (Part 2)

Coding Exercise 8 (Intro)

Coding Exercise 8 (Solution)

Visualization with Matplotlib


The plot() method

Customization of Plots

Histograms (Part 1)

Histograms (Part 2)

Barcharts and Piecharts


Coding Exercise 9 (Intro)

Coding Exercise 9 (Solution)


Welcome to PART 2: Full Data Workflow A-Z

Download: Part 2 Course Materials

Importing Data

Importing csv-files with pd.read_csv

Importing messy csv-files with pd.read_csv

Importing Data from Excel with pd.read_excel()

Importing messy Data from Excel with pd.read_excel()

Importing Data from the Web with pd.read_html()

Coding Exercise 10

Cleaning Data

First Inspection & Handling of inconsistent Data

String Operations

Changing Datatype of Columns with astype()

Intro NA values / missing values

Detection of missing Values

Removing missing values

Replacing missing values

Intro Duplicates

Detection of Duplicates

Handling / Removing Duplicates

Detection of Outliers

Handling / Removing Outliers

Categorical Data

Coding Exercise 11 (Intro)

Coding Exercise 11 (Solution)

Merging, Joining, and Concatenating Data


Adding Rows with append() and pd.concat() (Part 1)

Adding Rows with pd.concat() (Part 2)

Arithmetic with Pandas Objects / Data Alignment

EXCURSUS: Comparing two DataFrames / Identify Differences

Outer Joins with merge()

Inner Joins with merge()

Outer Joins (without Intersection) with merge()

Left Joins (without Intersection) with merge()

Right Joins (without Intersection) with merge()

Left Joins with merge()

Right Joins with merge()

Joining on different Column Names / Indexes

Joining on more than one Column

pd.merge() and join()

Coding Exercise 12
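The join types listed in this section can be compared on two tiny tables. The sketch below is only an illustration with invented data; it uses the `indicator` parameter of `merge()` as one possible way to build a "left join without intersection", which may differ from the approach shown in the course.

```python
import pandas as pd

# Hypothetical toy tables sharing the column "key".
left = pd.DataFrame({"key": [1, 2, 3], "l_val": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "r_val": ["x", "y", "z"]})

# Outer join: all keys from both sides; indicator=True adds a "_merge"
# column that says where each row came from.
outer = left.merge(right, on="key", how="outer", indicator=True)

# Inner join: only keys present on both sides.
inner = left.merge(right, on="key", how="inner")

# Left join without intersection: rows that exist only in the left table.
left_only = outer.loc[outer["_merge"] == "left_only", ["key", "l_val"]]

print(outer)
print(inner)
print(left_only)
```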

GroupBy Operations


Understanding the GroupBy Object

Splitting with many Keys

split-apply-combine explained

split-apply-combine applied

GroupBy 1

Advanced aggregation with agg()

GroupBy Aggregation with Relabeling (NEW - Pandas Version 0.25)

Transformation with transform()

Replacing NA Values by group-specific Values

Generalizing split-apply-combine with apply()

Hierarchical Indexing with Groupby

stack() and unstack()

GroupBy 2

Coding Exercise 13 (Intro)

Coding Exercise 13 (Solution)
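Two of the GroupBy topics above, aggregation with relabeling and replacing NA values by group-specific values, can be sketched as follows. The data and the column names are hypothetical; using the group mean as the fill value is an assumption made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing score.
df = pd.DataFrame({
    "team": ["A", "A", "B", "B"],
    "score": [1.0, np.nan, 4.0, 6.0],
})

# Aggregation with relabeling (named aggregation, pandas 0.25+):
# each keyword becomes an output column name.
summary = df.groupby("team").agg(
    mean_score=("score", "mean"),
    n_valid=("score", "count"),
)

# transform() returns a result aligned with the original rows, so it can
# be used to fill each missing value with the mean of its own group.
group_means = df.groupby("team")["score"].transform("mean")
df["score_filled"] = df["score"].fillna(group_means)

print(summary)
print(df)
```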

Reshaping and Pivoting DataFrames


Transposing Rows and Columns

Pivoting DataFrames with pivot()

Limits of pivot()



melting DataFrames with melt()

Coding Exercise 14

Data Preparation and Feature Creation


Arithmetic Operations (Part 1)

Arithmetic Operations (Part 2)

Transformation/Mapping with map()

Conditional Transformation

Discretization and Binning with pd.cut() (Part 1)

Discretization and Binning with pd.cut() (Part 2)

Discretization and Binning with pd.qcut()

Floors and Caps

Scaling / Standardization

Creating Dummy Variables

String Operations

Coding Exercise 15
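Binning with `pd.cut()` and creating dummy variables, both listed in this section, fit together naturally. The following is a small sketch under assumed inputs: the ages, the bin edges, and the labels are invented for illustration.

```python
import pandas as pd

# Hypothetical ages.
ages = pd.Series([5, 17, 18, 40, 70], name="age")

# pd.cut() bins are right-inclusive by default:
# (0, 17], (17, 64], (64, 120]
groups = pd.cut(
    ages,
    bins=[0, 17, 64, 120],
    labels=["child", "adult", "senior"],
)

# One indicator column per category, e.g. as input for a model.
dummies = pd.get_dummies(groups, prefix="age")

print(groups)
print(dummies)
```

`pd.qcut()` works the same way but chooses the bin edges from quantiles, so each bin holds roughly the same number of observations.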

Advanced Visualization with Seaborn


First Steps in Seaborn

Categorical Plots

Joint Plots / Regression Plots

Matrixplots / Heatmaps

Coding Exercise 16


Download: Part 3 Course Materials

Olympic Medal Tables (Instruction & Hints)

Olympic Medal Tables (Solution Part 1)

Olympic Medal Tables (Solution Part 2)

Olympic Medal Tables (Solution Part 3)


Welcome to PART 4: Time Series Data with Pandas

Download: Part 4 Course Materials

Time Series Basics

Importing Time Series Data from csv-files

Converting strings to datetime objects with pd.to_datetime()

Initial Analysis / Visualization of Time Series

Indexing and Slicing Time Series

Creating a customized DatetimeIndex with pd.date_range()

More on pd.date_range()

Downsampling Time Series with resample() (Part 1)

Downsampling Time Series with resample (Part 2)

The PeriodIndex object

Advanced Indexing with reindex()

Time Series Advanced / Financial Time Series


Getting Ready (Installing required package)

Importing Stock Price Data from Yahoo Finance (it still works!)

Initial Inspection and Visualization

Normalizing Time Series to a Base Value (100)

The shift() method

The methods diff() and pct_change()

Measuring Stock Performance with MEAN Returns and STD of Returns

Financial Time Series - Return and Risk

Financial Time Series - Covariance and Correlation

Helpful DatetimeIndex Attributes and Methods

Filling NA Values with bfill, ffill and interpolation

Coding Exercise 17
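Several of the time series methods named above can be tried on a short artificial price series. This sketch uses invented prices rather than downloaded stock data, and the two-day resampling frequency is an arbitrary choice for the example.

```python
import pandas as pd

# Hypothetical daily prices (not real market data).
idx = pd.date_range("2021-01-01", periods=6, freq="D")
prices = pd.Series([100.0, 102.0, 101.0, 105.0, 110.0, 121.0], index=idx)

# Normalize to a base value of 100 at the first observation.
normalized = prices / prices.iloc[0] * 100

# Simple period-over-period returns; the first value is NaN
# because there is no previous price.
returns = prices.pct_change()

# Downsample to two-day bins, keeping the last price in each bin.
two_day_last = prices.resample("2D").last()

# Mean return and standard deviation of returns as simple
# reward and risk measures.
print(returns.mean(), returns.std())
print(normalized)
print(two_day_last)
```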


Intro and Overview

How to update Pandas to Version 1.0

Downloads for this Section

Important Recap: Pandas Display Options (Changed in Version 0.25)

Info() method - new and extended output

NEW Extension dtypes ("nullable" dtypes): Why do we need them?

Creating the NEW extension dtypes with convert_dtypes()

NEW pd.NA value for missing values

The NEW "nullable" Int64Dtype

The NEW StringDtype

The NEW "nullable" BooleanDtype

Addition of the ignore_index parameter

Removal of prior Version Deprecations
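The nullable dtypes and the `ignore_index` parameter introduced with pandas 1.0 can be illustrated briefly. This is a minimal sketch with invented values, assuming pandas 1.0 or later is installed.

```python
import pandas as pd

# With the classic NumPy-backed dtypes, a missing value forces an
# integer column to become float64.
s = pd.Series([1, 2, None])
print(s.dtype)  # float64

# convert_dtypes() switches to the nullable extension dtypes;
# integer-valued floats become Int64 and the gap becomes pd.NA.
s_nullable = s.convert_dtypes()
print(s_nullable.dtype)  # Int64

# ignore_index=True (added to several methods in version 1.0)
# returns a fresh 0..n-1 index instead of the original labels.
df = pd.DataFrame({"x": [3, 1, 2]})
sorted_df = df.sort_values("x", ignore_index=True)
print(sorted_df)
```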


Welcome to the Appendix

Python Basics


First Steps


Data Types: Integers and Floats

Data Types: Strings

Data Types: Lists (Part 1)

Data Types: Lists (Part 2)

Data Types: Tuples

Data Types: Sets

Operators & Booleans

Conditional Statements (if, elif, else, while)

For Loops

Keywords break, pass, continue

Generating Random Numbers

User Defined Functions (Part 1)

User Defined Functions (Part 2)

User Defined Functions (Part 3)

Visualization with Matplotlib

Python Basics

Python Basics Quiz: Solution

The Numpy Package

Introduction to Numpy Arrays

Numpy Arrays: Vectorization

Numpy Arrays: Indexing and Slicing

Numpy Arrays: Shape and Dimensions

Numpy Arrays: Indexing and Slicing of multi-dimensional Arrays

Numpy Arrays: Boolean Indexing

Generating Random Numbers

Performance Issues

Case Study: Numpy vs. Python Standard Library

Summary Statistics

Visualization and (Linear) Regression


Numpy Quiz: Solution

Statistical Concepts

Statistics - Overview, Terms and Vocabulary

Population vs. Sample

Visualizing Frequency Distributions with plt.hist()

Relative and Cumulative Frequencies with plt.hist()

Measures of Central Tendency (Theory)

Coding Measures of Central Tendency - Mean and Median

Coding Measures of Central Tendency - Geometric Mean

Variability around the Central Tendency / Dispersion (Theory)

Minimum, Maximum and Range with Python/Numpy

Percentiles with Python/Numpy

Variance and Standard Deviation with Python/Numpy

Skew and Kurtosis (Theory)

How to calculate Skew and Kurtosis with scipy.stats

How to generate Random Numbers with Numpy

Reproducibility with np.random.seed()

Probability Distributions - Overview

Discrete Uniform Distributions

Continuous Uniform Distributions

The Normal Distribution (Theory)

Creating a normally distributed Random Variable

Normal Distribution - Probability Density Function (pdf) with scipy.stats

Normal Distribution - Cumulative Distribution Function (cdf) with scipy.stats

The Standard Normal Distribution and Z-Values

Properties of the Standard Normal Distribution (Theory)

Probabilities and Z-Values with scipy.stats

Confidence Intervals with scipy.stats

Covariance and Correlation Coefficient (Theory)

Cleaning and preparing the Data - Movies Database (Part 1)

Cleaning and preparing the Data - Movies Database (Part 2)

How to calculate Covariance and Correlation in Python

Correlation and Scatterplots – visual Interpretation

What is Linear Regression? (Theory)

A simple Linear Regression Model with numpy & Scipy

How to interpret Intercept and Slope Coefficient

Case Study (Part 1): The Market Model (Single Factor Model)

Case Study (Part 2): The Market Model (Single Factor Model)

What's next?

Get your special BONUS here!


Abhishek, 13 October 2020

Course content is fantastic, instructor is good, but the only problem is your pronunciation is not clear.

Sean, 11 October 2020

A well organized and presented course. I had some experience with Pandas prior to starting this but still found a lot of useful information and expanded my skill set. It was a great second step for me. Thanks for a great course.

Joanna, 23 September 2020

Excellent comprehensive learning resource, very well structured. The course covers a wide range of topics and everything is explained in detail. Also a big plus for the additional sections on Python basics, numpy and statistical concepts.

Sebastian, 19 September 2020

Very good course! The course covers all the things I needed to start with Pandas and will be my first reference, before I search Stack Overflow. It was very easy to follow along and the exercises are fun and very well set up. I wish there were more of them!

John, 18 September 2020

I found the instructions on how to manage the video quality very clear and useful. I learned several new things I didn't know before about the video controls in Udemy.

Asier, 1 March 2020

The course is very detailed and you can concentrate on some parts. I like the opinions based on experience.

Bhanurdra, 23 February 2020

Thanks Alex for such a great course. This is the best course for anyone who would like to learn Pandas. I can sense that you have brought in the do's and don'ts from your real-life work experience. The course is curated in such a way that anyone can learn. For anyone who wants to learn Machine Learning, this is the starting point. The only thing I would like is some videos on handling JSON and databases; apart from that, it is perfect. Thanks Alex for such a great course!!

Mauro, 1 February 2020

Honestly pretty bad, the course was useful in the sense that I had to see a good part of pandas functions and use cases. The huge problem is that probably 70/75% of the course time is completely useless as it's just the instructor wasting time on repeating the same steps or reading values/column names. I'm currently at maybe 80% and udemy doesn't even count the videos as seen since I have to skip most of the view time to arrive to the functions the video is actually supposed to show, which aren't even explained well (maybe literally 5/10% of the video time to explain what it does, the rest is just showing values and saying them out loud). As I already said pretty much most of the time is spent on repeating things that pretty much anyone already understood after 2 videos, and thats for EVERY SINGLE VIDEO. Like, whats the point of doing .describe/.info to say out loud every single variable type "you see thats a string, thats a float, average is 6000, blabla" dude I can see that, you dont have to spend 1 minute and a half reading types and numbers. Has made me so nervous at a point that I had to write this to feel better

Sanjai, 21 January 2020

Very comprehensive and covers a lot of ground. It took me a while, but I am so happy I did this course.

Shubham, 14 January 2020

By far it is the most thorough course I have come across on udemy. Plenty of examples as exercises to develop understanding.

Sven, 12 January 2020

Great course. Very clear instructions. Probably the best Pandas course out there. Maybe a couple of more coding exercises would help, but nonetheless excellent course!

Robert, 29 December 2019

So far it has been a very nice match. Actually, I appreciate the instructor's accent more than the course content at this point (just a personal opinion).

Mert, 21 December 2019

Excellent course. The teacher organized everything and every detail very well. He offers a lot of deep, detailed information about pandas and its data structures, and the course has a lot of exercises. If you are looking for a pandas or data science course, you cannot find a better one! Superbly prepared, I definitely recommend it.

Mohit, 9 December 2019

All good content, mini projects type of something could also be added after every 2/3 sections based on cumulative course completed on that specific step.

Konstantin, 7 December 2019

Great comprehensive course. After completion I'm able to use Pandas to work with data tables. For complex tasks I use it as reference material. I would definitely recommend this course for those who are interested in data analysis.

