4.74 (2162 reviews)
☑ Bring your Data Handling & Data Analysis skills to an outstanding level.
☑ Learn and practice all relevant Pandas methods and workflows with Real-World Datasets
☑ Learn Pandas based on NEW Version 1.x (the days of versions 0.x are over)
☑ Import, clean, and merge messy Data and prepare Data for Machine Learning
☑ Master a complete Machine Learning Project A-Z with Pandas, Scikit-Learn, and Seaborn
☑ Analyze, visualize, and understand your Data with Pandas, Matplotlib, and Seaborn
☑ Practice and master your Pandas skills with Quizzes, 150+ Exercises, and Comprehensive Projects
☑ Import Financial/Stock Data from Web Sources and analyze them with Pandas
☑ Learn and master the most important Pandas workflows for Finance
☑ Learn how to best transition from Versions 0.x to new Version 1.x
☑ Learn the Basics of Pandas and Numpy Coding (Appendix)
☑ Learn and master important Statistical Concepts with scipy
######### UPDATE (November 2020) ###########
Added: Introduction to Machine Learning with Pandas and scikit-learn - incl. a comprehensive ML Project A-Z
Added: Another comprehensive Final Project (Exploratory Data Analysis) to test your skills
Updated to the latest Pandas Version 1.1! This is the first course that covers Pandas 1.x. It gives optimal guidance on how to transition from version 0.x to version 1.x!
Welcome to the web's most comprehensive Pandas Bootcamp with 34 hours of video content, 150+ exercises, and two large and comprehensive Final Projects that test your skills! This course has one goal: bringing your data handling skills to the next level to build your career in Data Science, Machine Learning, Finance & co.
This course has five parts:
Pandas Basics - from Zero to Hero (Part 1).
The complete data workflow A-Z with Pandas: Importing, Cleaning, Merging, Aggregating, and Preparing Data for Machine Learning. (Part 2)
Two Comprehensive Project Challenges that are frequently used in Data Science job recruiting/assessment centers: Test your skills! (Part 3).
Application 1: Pandas for Finance, Investing and other Time Series Data (Part 4)
Application 2: Machine Learning with Pandas and scikit-learn (Part 5)
Why should you learn Pandas?
The world is becoming more and more data-driven. Data Scientists are gaining ground with $100k+ salaries. It's time to switch from soapbox cars (spreadsheet software like Excel) to highly tuned racing cars (Pandas)!
Python is a great platform/environment for Data Science with powerful Tools for Science, Statistics, Finance, and Machine Learning. The Pandas Library is the Heart of Python Data Science. Pandas enables you to import, clean, join/merge/concatenate, manipulate, and deeply understand your Data and finally prepare/process Data for further Statistical Analysis, Machine Learning, or Data Presentation. In reality, all of these tasks require a high proficiency in Pandas! Data Scientists typically spend up to 85% of their time manipulating Data in Pandas.
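As a rough illustration of the import, clean, merge, and aggregate steps described above, here is a minimal sketch. The data, the column names (store, product, units, price), and the variable names are made up for this example and are not taken from the course materials.

```python
import io
import pandas as pd

# Hypothetical raw data; in practice this would be a file passed to pd.read_csv().
sales_csv = io.StringIO(
    "store,product,units\n"
    "A,apples,10\n"
    "A,apples,10\n"   # exact duplicate row
    "B,pears,\n"      # missing value in "units"
    "B,apples,7\n"
)
# Hypothetical lookup table with one price per product.
prices = pd.DataFrame({"product": ["apples", "pears"], "price": [0.5, 0.8]})

sales = pd.read_csv(sales_csv)                           # import
sales = sales.drop_duplicates()                          # clean: drop duplicate rows
sales["units"] = sales["units"].fillna(0)                # clean: replace missing values
merged = sales.merge(prices, on="product", how="left")   # merge with the price table
merged["revenue"] = merged["units"] * merged["price"]    # create a new column
revenue_by_store = merged.groupby("store")["revenue"].sum()  # aggregate per store

print(revenue_by_store)
```

Real-world data usually needs more cleaning than this (inconsistent labels, wrong dtypes, outliers), but the order of the steps stays much the same.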
Can you start right now?
A question Python beginners frequently ask is: "Do I need to become an expert in Python coding before I can start working with Pandas?"
The clear answer is: "No! Do you need to become a Microsoft Software Developer before you can start with Excel? Probably not!"
You require some Python Basics like data types, simple operations/operators, lists and numpy arrays. In the Appendix of this course, you can find a Python crash course. This Python Introduction is tailor-made and sufficient for Data Science purposes!
In addition, this course covers fundamental statistical concepts (coding with scipy).
In summary, if you primarily want to use Python for Data Science or as a replacement for Excel, this course is a perfect match!
Why should you take this Course?
It is the most relevant and comprehensive course on Pandas.
It is the most up-to-date course and the first that covers Pandas Version 1.x. The Pandas Library has experienced massive improvements in the last couple of months. Working with and relying on outdated code can be painful.
Pandas isn't an isolated tool. It is used together with other Libraries: Matplotlib and Seaborn for Data Visualization, and Numpy, Scipy and Scikit-Learn for Machine Learning and for scientific and statistical computing. This course covers all these Libraries.
In real-world projects, coding and the business side of things are equally important. This is probably the only Pandas course that teaches both: in-depth Pandas Coding and Big-Picture Thinking.
It serves as a Pandas Encyclopedia covering all relevant methods, attributes, and workflows for real-world projects. If you have problems with any method or workflow, you will most likely get help and find a solution in this course.
It shows and explains the full real-world Data Workflow A-Z: starting with importing messy data, then cleaning data, merging and concatenating data, grouping and aggregating data, and Exploratory Data Analysis, through to preparing and processing data for Statistics, Machine Learning, Finance, and Data Presentation.
It explains Pandas Coding on real Data and real-world Problems. No toy data! This is the best way to learn and understand Pandas.
It gives you plenty of opportunities to practice and code on your own. Learning by doing. In the exercises, you can select the level of difficulty with optional hints and guidance/instruction.
Pandas is a very powerful tool. But it also has pitfalls that can lead to unintended and undiscovered errors in your data. This course also focuses on commonly made mistakes and errors and teaches you what you should not do.
Satisfaction guaranteed: otherwise, get your money back under the 30-day money-back guarantee.
I am looking forward to seeing you in the course!
Overview / Student FAQ
Tips: How to get the most out of this course
Did you know that...?
More FAQ / Important Information
Installation of Anaconda
Opening a Jupyter Notebook
How to use Jupyter Notebooks
How to tackle Pandas Version 1.0
---PART 1: PANDAS FROM ZERO TO HERO (BUILDING BLOCKS)---
Intro to Tabular Data / Pandas
Download: Part 1 Course Materials
Pandas Basics (DataFrame Basics I)
Create your very first Pandas DataFrame (from csv)
Pandas Display Options and the methods head() & tail()
First Data Inspection
Built-in Functions, Attributes and Methods with Pandas
Make it easy: TAB Completion and Tooltip
Explore your own Dataset: Coding Exercise 1 (Intro)
Explore your own Dataset: Coding Exercise 1 (Solution)
Selecting one Column with the "dot notation"
Zero-based Indexing and Negative Indexing
Selecting Rows with iloc (position-based indexing)
Slicing Rows and Columns with iloc (position-based indexing)
Position-based Indexing Cheat Sheets
Selecting Rows with loc (label-based indexing)
Slicing Rows and Columns with loc (label-based indexing)
Label-based Indexing Cheat Sheets
Indexing and Slicing with reindex()
Summary, Best Practices and Outlook
Indexing and Slicing
Coding Exercise 2 (Intro)
Coding Exercise 2 (Solution)
Pandas Series and Index Objects
First Steps with Pandas Series
Analyzing Numerical Series with unique(), nunique() and value_counts()
UPDATE Pandas Version 0.24.0 (Jan 2019)
EXCURSUS: Updating Pandas / Anaconda
Analyzing non-numerical Series with unique(), nunique(), value_counts()
Creating Pandas Series (Part 1)
Creating Pandas Series (Part 2)
Indexing and Slicing Pandas Series
Sorting of Series and Introduction to the inplace parameter
nlargest() and nsmallest()
idxmin() and idxmax()
Manipulating Pandas Series
Coding Exercise 3 (Intro)
Coding Exercise 3 (Solution)
First Steps with Pandas Index Objects
Creating Index Objects from Scratch
Changing Row Index with set_index() and reset_index()
Changing Column Labels
Renaming Index & Column Labels with rename()
Pandas Index objects
Coding Exercise 4 (Intro)
Coding Exercise 4 (Solution)
DataFrame Basics II
Filtering DataFrames by one Condition
Filtering DataFrames by many Conditions (AND)
Filtering DataFrames by many Conditions (OR)
Advanced Filtering with between(), isin() and ~
any() and all()
Adding new Columns to a DataFrame
Creating Columns based on other Columns
Adding Columns with insert()
Creating DataFrames from Scratch with pd.DataFrame()
Adding new Rows (hands-on approach)
DataFrame Basics II
Coding Exercise 5 (Intro)
Coding Exercise 5 (Solution)
Manipulating Elements in a DataFrame / Slice +++Important, know the Pitfalls!+++
Best Practice (How you should do it)
Chained Indexing: How you should NOT do it (Part 1)
Chained Indexing: How you should NOT do it (Part 2)
View vs. Copy
Simple Rules what to do when...
Manipulating DataFrames / Slices
Coding Exercise 6 (Intro)
Coding Exercise 6 (Solution)
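To make the chained-indexing pitfall named in this section concrete, here is a small sketch. A chained assignment such as df[df["age"] > 30]["age"] = 30 may act on a temporary copy, so the original DataFrame can stay unchanged. The example below shows two safer patterns instead; the DataFrame and its column names are invented for illustration.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [25, 40, 35]})

# Pattern 1: a single .loc call with a row condition and a column label
# changes the original DataFrame directly.
df.loc[df["age"] > 30, "age"] = 30

# Pattern 2: if an independent subset is wanted, ask for a copy explicitly;
# changes to the copy do not touch the original DataFrame.
young = df[df["age"] < 30].copy()
young["age"] = 0

print(df)
print(young)
```

After running this, df holds the capped ages, and the row for "Ann" in df keeps its original value even though it was overwritten in young.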
DataFrame Basics III
Sorting DataFrames with sort_index() and sort_values()
Ranking DataFrames with rank()
nunique() and nlargest() / nsmallest() with DataFrames
Summary Statistics and Accumulations
The agg() method
Coding Exercise 7 (Intro)
Coding Exercise 7 (Solution)
User-defined Functions with apply(), map() and applymap()
Hierarchical Indexing (Part 1)
Hierarchical Indexing (Part 2)
String Operations (Part 1)
String Operations (Part 2)
Coding Exercise 8 (Intro)
Coding Exercise 8 (Solution)
Visualization with Matplotlib
The plot() method
Customization of Plots
Histograms (Part 1)
Histograms (Part 2)
Barcharts and Piecharts
Coding Exercise 9 (Intro)
Coding Exercise 9 (Solution)
----PART 2: FULL DATA WORKFLOW A-Z----
Welcome to PART 2: Full Data Workflow A-Z
Download: Part 2 Course Materials
Importing csv-files with pd.read_csv
Importing messy csv-files with pd.read_csv
Importing Data from Excel with pd.read_excel()
Importing messy Data from Excel with pd.read_excel()
Importing Data from the Web with pd.read_html()
Coding Exercise 10
First Inspection & Handling of inconsistent Data
Changing Datatype of Columns with astype()
Intro NA values / missing values
Detection of missing Values
Removing missing values
Replacing missing values
Detection of Duplicates
Handling / Removing Duplicates
Detection of Outliers
Handling / Removing Outliers
Coding Exercise 11 (Intro)
Coding Exercise 11 (Solution)
Merging, Joining, and Concatenating Data
Adding Rows with append() and pd.concat() (Part 1)
Adding Rows with pd.concat() (Part 2)
Arithmetic with Pandas Objects / Data Alignment
EXCURSUS: Comparing two DataFrames / Identify Differences
Outer Joins with merge()
Inner Joins with merge()
Outer Joins (without Intersection) with merge()
Left Joins (without Intersection) with merge()
Right Joins (without Intersection) with merge()
Left Joins with merge()
Right Joins with merge()
Joining on different Column Names / Indexes
Joining on more than one Column
pd.merge() and join()
Coding Exercise 12
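The join types listed in this section can be sketched as follows. This is only an illustration with two invented tables; it assumes that "without intersection" means keeping the rows that found no match on the other side, which is one common reading and is implemented here with the indicator parameter of merge().

```python
import pandas as pd

# Hypothetical tables sharing the key column "id".
left = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
right = pd.DataFrame({"id": [2, 3, 4], "city": ["Rome", "Oslo", "Lima"]})

# Inner join: only ids present in both tables.
inner = left.merge(right, on="id", how="inner")

# Outer join: all ids; indicator=True adds a "_merge" column that records
# whether a row came from the left table, the right table, or both.
outer = left.merge(right, on="id", how="outer", indicator=True)

# Left join without intersection: rows found only in the left table.
left_only = outer[outer["_merge"] == "left_only"]

# Outer join without intersection: rows found in exactly one table.
no_overlap = outer[outer["_merge"] != "both"]

print(outer)
```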
Understanding the GroupBy Object
Splitting with many Keys
Advanced aggregation with agg()
GroupBy Aggregation with Relabeling (NEW - Pandas Version 0.25)
Transformation with transform()
Replacing NA Values by group-specific Values
Generalizing split-apply-combine with apply()
Hierarchical Indexing with Groupby
stack() and unstack()
Coding Exercise 13 (Intro)
Coding Exercise 13 (Solution)
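Two of the GroupBy topics above, aggregation with relabeling and replacing NA values by group-specific values, might look roughly like this. The data and the column names (team, score) are assumptions made for this sketch.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing score.
df = pd.DataFrame({
    "team": ["x", "x", "y", "y"],
    "score": [10.0, np.nan, 4.0, 6.0],
})

# Named aggregation (relabeling): each keyword becomes an output column name,
# defined by a (column, function) pair.
summary = df.groupby("team").agg(
    mean_score=("score", "mean"),
    n_valid=("score", "count"),
)

# transform() returns a result aligned with the original rows, so it can be
# used to fill each missing value with the mean of its own group.
group_mean = df.groupby("team")["score"].transform("mean")
df["score_filled"] = df["score"].fillna(group_mean)

print(summary)
print(df)
```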
Reshaping and Pivoting DataFrames
Transposing Rows and Columns
Pivoting DataFrames with pivot()
Limits of pivot()
Melting DataFrames with melt()
Coding Exercise 14
Data Preparation and Feature Creation
Arithmetic Operations (Part 1)
Arithmetic Operations (Part 2)
Transformation/Mapping with map()
Discretization and Binning with pd.cut() (Part 1)
Discretization and Binning with pd.cut() (Part 2)
Discretization and Binning with pd.qcut()
Floors and Caps
Scaling / Standardization
Creating Dummy Variables
Coding Exercise 15
Advanced Visualization with Seaborn
First Steps in Seaborn
Joint Plots / Regression Plots
Matrixplots / Heatmaps
Coding Exercise 16
---PART 3: COMPREHENSIVE PROJECT CHALLENGE---
Download: Part 3 Course Materials
Olympic Medal Tables (Instruction & Hints)
Olympic Medal Tables (Solution Part 1)
Olympic Medal Tables (Solution Part 2)
Olympic Medal Tables (Solution Part 3)
----PART 4: MANAGING TIME SERIES DATA WITH PANDAS----
Welcome to PART 4: Time Series Data with Pandas
Download: Part 4 Course Materials
Time Series Basics
Importing Time Series Data from csv-files
Converting strings to datetime objects with pd.to_datetime()
Initial Analysis / Visualization of Time Series
Indexing and Slicing Time Series
Creating a customized DatetimeIndex with pd.date_range()
More on pd.date_range()
Downsampling Time Series with resample() (Part 1)
Downsampling Time Series with resample (Part 2)
The PeriodIndex object
Advanced Indexing with reindex()
Time Series Advanced / Financial Time Series
Getting Ready (Installing required package)
Importing Stock Price Data from Yahoo Finance (it still works!)
Initial Inspection and Visualization
Normalizing Time Series to a Base Value (100)
The shift() method
The methods diff() and pct_change()
Measuring Stock Performance with MEAN Returns and STD of Returns
Financial Time Series - Return and Risk
Financial Time Series - Covariance and Correlation
Helpful DatetimeIndex Attributes and Methods
Filling NA Values with bfill, ffill and interpolation
Coding Exercise 17
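Several of the time series methods named in Part 4 can be tried on a tiny synthetic series. The prices below are invented and only serve to show what each call returns; real stock data would be imported from a file or web source.

```python
import pandas as pd

# Hypothetical daily prices on a DatetimeIndex built with pd.date_range().
idx = pd.date_range("2020-01-01", periods=6, freq="D")
prices = pd.Series([100.0, 102.0, 101.0, 105.0, 110.0, 108.0], index=idx)

normalized = prices / prices.iloc[0] * 100    # normalize to a base value of 100
lagged = prices.shift(1)                      # previous day's price
changes = prices.diff()                       # absolute day-to-day change
returns = prices.pct_change()                 # relative day-to-day change
two_day_last = prices.resample("2D").last()   # downsample: last price per 2-day bin

print(returns.mean(), returns.std())          # mean return and risk (std of returns)
```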
+++ WHAT'S NEW IN PANDAS VERSION 1.0? - A HANDS-ON GUIDE +++
Intro and Overview
How to update Pandas to Version 1.0
Downloads for this Section
Important Recap: Pandas Display Options (Changed in Version 0.25)
Info() method - new and extended output
NEW Extension dtypes ("nullable" dtypes): Why do we need them?
Creating the NEW extension dtypes with convert_dtypes()
NEW pd.NA value for missing values
The NEW "nullable" Int64Dtype
The NEW StringDtype
The NEW "nullable" BooleanDtype
Addition of the ignore_index parameter
Removal of prior Version Deprecations
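The nullable extension dtypes covered in this section can be tried with a short sketch like the one below. The DataFrame is made up for illustration. The behavior shown assumes pandas 1.0 or later.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a missing value forces "count" to float64 and
# leaves "label" and "flag" as generic object columns.
df = pd.DataFrame({
    "count": [1, np.nan, 3],
    "label": ["a", None, "c"],
    "flag": [True, np.nan, False],
})

# convert_dtypes() picks the best nullable extension dtype per column.
converted = df.convert_dtypes()
print(converted.dtypes)

# Missing entries in the nullable integer column are represented by pd.NA.
print(converted["count"].iloc[1] is pd.NA)

# ignore_index=True gives the sorted result a fresh 0..n-1 index.
sorted_df = converted.sort_values("count", ascending=False, ignore_index=True)
print(sorted_df)
```

The integer column keeps whole numbers despite the missing value, which is the main practical benefit over the older float-with-NaN representation.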
---APPENDIX: PYTHON BASICS, NUMPY & STATISTICS---
Welcome to the Appendix
Data Types: Integers and Floats
Data Types: Strings
Data Types: Lists (Part 1)
Data Types: Lists (Part 2)
Data Types: Tuples
Data Types: Sets
Operators & Booleans
Conditional Statements (if, elif, else, while)
Keywords break, pass, continue
Generating Random Numbers
User Defined Functions (Part 1)
User Defined Functions (Part 2)
User Defined Functions (Part 3)
Visualization with Matplotlib
Python Basics Quiz: Solution
The Numpy Package
Introduction to Numpy Arrays
Numpy Arrays: Vectorization
Numpy Arrays: Indexing and Slicing
Numpy Arrays: Shape and Dimensions
Numpy Arrays: Indexing and Slicing of multi-dimensional Arrays
Numpy Arrays: Boolean Indexing
Generating Random Numbers
Case Study: Numpy vs. Python Standard Library
Visualization and (Linear) Regression
Numpy Quiz: Solution
Statistics - Overview, Terms and Vocabulary
Population vs. Sample
Visualizing Frequency Distributions with plt.hist()
Relative and Cumulative Frequencies with plt.hist()
Measures of Central Tendency (Theory)
Coding Measures of Central Tendency - Mean and Median
Coding Measures of Central Tendency - Geometric Mean
Variability around the Central Tendency / Dispersion (Theory)
Minimum, Maximum and Range with Python/Numpy
Percentiles with Python/Numpy
Variance and Standard Deviation with Python/Numpy
Skew and Kurtosis (Theory)
How to calculate Skew and Kurtosis with scipy.stats
How to generate Random Numbers with Numpy
Reproducibility with np.random.seed()
Probability Distributions - Overview
Discrete Uniform Distributions
Continuous Uniform Distributions
The Normal Distribution (Theory)
Creating a normally distributed Random Variable
Normal Distribution - Probability Density Function (pdf) with scipy.stats
Normal Distribution - Cumulative Distribution Function (cdf) with scipy.stats
The Standard Normal Distribution and Z-Values
Properties of the Standard Normal Distribution (Theory)
Probabilities and Z-Values with scipy.stats
Confidence Intervals with scipy.stats
Covariance and Correlation Coefficient (Theory)
Cleaning and preparing the Data - Movies Database (Part 1)
Cleaning and preparing the Data - Movies Database (Part 2)
How to calculate Covariance and Correlation in Python
Correlation and Scatterplots – visual Interpretation
What is Linear Regression? (Theory)
A simple Linear Regression Model with numpy & Scipy
How to interpret Intercept and Slope Coefficient
Case Study (Part 1): The Market Model (Single Factor Model)
Case Study (Part 2): The Market Model (Single Factor Model)
Get your special BONUS here!
Course content is fantastic, instructor is good, but the only problem is your pronunciation is not clear.
A well organized and presented course. I had some experience with Pandas prior to starting this but still found a lot of useful information and expanded my skill set. It was a great second step for me. Thanks for a great course.
Excellent comprehensive learning resource, very well structured. The course covers a wide range of topics and everything is explained in detail. Also a big plus for the additional sections on Python basics, numpy and statistical concepts.
Very good course! The course covers all the things I needed to start with Pandas and will be my first reference - before I search stackoverflow. It was very easy to follow along and the exercises are fun and very well set up. I wish there were more of them!
I found the instructions on how to manage the video quality very clear and useful. I learned several new things I didn't know before about the video controls in Udemy.
The course is very detailed and you can concentrate on some parts. I like the opinions based on experience.
Thanks Alex for such a great course. This is the best course for anyone who would like to learn Pandas. I can sense that you have brought the do's and don'ts from your real-life work experience. The course is curated in such a way that anyone can learn. For anyone who wants to learn Machine Learning, this is the starting point. The only thing I would like is some videos on handling JSON and databases; apart from that it is perfect. Thanks Alex for such a great course!
Honestly pretty bad, the course was useful in the sense that I had to see a good part of pandas functions and use cases. The huge problem is that probably 70/75% of the course time is completely useless as it's just the instructor wasting time on repeating the same steps or reading values/column names. I'm currently at maybe 80% and Udemy doesn't even count the videos as seen since I have to skip most of the view time to arrive at the functions the video is actually supposed to show, which aren't even explained well (maybe literally 5/10% of the video time to explain what it does, the rest is just showing values and saying them out loud). As I already said, pretty much most of the time is spent on repeating things that pretty much anyone already understood after 2 videos, and that's for EVERY SINGLE VIDEO. Like, what's the point of doing .describe/.info to say out loud every single variable type "you see that's a string, that's a float, average is 6000, blabla" dude I can see that, you don't have to spend 1 minute and a half reading types and numbers. Has made me so nervous at a point that I had to write this to feel better
Very comprehensive and covers a lot of ground. It took me a while, but I am so happy I did this course.
By far the most thorough course I have come across on Udemy. Plenty of examples and exercises to develop understanding.
Great course. Very clear instructions. Probably the best Pandas course out there. Maybe a couple of more coding exercises would help, but nonetheless excellent course!
So far it has been a very nice match. Actually, I appreciate the instructor's accent more than the course content at this point (just a personal opinion).
Excellent course. The teacher organized everything and every detail very well. He offers a lot of in-depth, detailed information about Pandas and its data structures. It has a lot of exercises. If you are looking for a Pandas or data science course, you cannot find a better one! Wonderfully prepared, I definitely recommend it.
All good content. Mini-projects could also be added after every two or three sections, based on the cumulative material covered up to that point.
Great comprehensive course. After completion I'm able to use Pandas to work with data tables. For complex tasks I use it as reference material. I would definitely recommend this course to those who are interested in data analysis.