The Ultimate Pandas Bootcamp: Advanced Python Data Analysis

Master the powerful pandas library to analyze, manipulate and visualize data. More than 10 datasets & bonuses included!

4.79 (355 reviews)


32 hours


Last Update: Jul 2021

What you will learn

Learn everything there is to know about pandas - from absolute scratch!

Gain a deep and hands-on understanding of pandas data structures.

Transform, clean, filter, groupby, pivot, and otherwise manipulate any dataset.

Understand related computer science topics like random-number generators, binary operators, memory pointers, and more!

Practice reading data from the web, pickles, and Excel files right within pandas.

Discover and learn hundreds of methods, attributes, and techniques to manipulate data in pandas and python.
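
As a rough illustration of the kind of manipulation listed above (filter, groupby, pivot), here is a minimal sketch on a made-up table. The column names and values are hypothetical and are not taken from the course datasets.

```python
import pandas as pd

# Hypothetical toy data; not one of the course datasets.
sales = pd.DataFrame({
    "platform": ["PC", "PC", "Console", "Console", "Console"],
    "genre": ["Action", "Puzzle", "Action", "Puzzle", "Action"],
    "units": [10, 4, 7, 3, 5],
})

# Filter: keep only rows with at least 4 units sold.
filtered = sales[sales["units"] >= 4]

# Group the filtered rows by platform and total the units.
units_by_platform = filtered.groupby("platform")["units"].sum()

# Pivot the full table: platforms as rows, genres as columns, summed units as values.
pivoted = sales.pivot_table(
    index="platform", columns="genre", values="units", aggfunc="sum"
)

print(units_by_platform)
print(pivoted)
```

The same three steps apply to any tabular dataset; only the column names change.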


Welcome to the best resource online for learning and mastering data analysis with pandas and python.

Over 32 hours, 10+ datasets, and 50+ skill challenges, you will gain hands-on mastery not only of pandas 1.x but also of dozens of computer science, statistics, and programming concepts.

We will break down, understand, and practice hundreds of methods, attributes, and techniques in pandas and python that will fundamentally change the way you work with data.

In The Ultimate Pandas Bootcamp (2021) you won’t be working with outdated versions of pandas or writing repetitive commands on the same boring dataset. Instead, you’ll learn pandorable and pythonic solutions to interesting, real-world data problems while working with many diverse datasets that range from wine servings, video game sales, and SAT scores to stock prices, college salaries, and more!

Data analysis is an applied science, which is why in each section, you’ll stop and practice what you learn in dedicated skill challenges, followed by detailed solutions where we often consider and compare alternative solutions.

Data analysis is one of the most in-demand skills across all industries and in an increasing number of roles, and python is increasingly the language of choice.

Pandas is the wonderful open-source library that is the embodiment of those trends: based on the python programming language, pandas is the de facto data analysis library in the python data science community.

––––– Structure & Curriculum –––––

Over more than 31 hours, we'll cover everything that pandas has to offer, from manipulating series and dataframes, to merging datasets, handling time series, aggregations, filtering, sorting and much more!

The first four sections of the bootcamp constitute the core curriculum. You'll get acquainted with series and dataframes and develop an in-depth understanding of pandas data structures.

· Series at a Glance

· Series Methods and Handling

· Introducing DataFrames

· DataFrames More In Depth
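
To give a feel for the two core data structures named above, here is a minimal sketch of a Series and a DataFrame. The labels and numbers are invented for illustration and do not come from the course material.

```python
import pandas as pd

# A Series is a one-dimensional labeled array (values here are made up).
servings = pd.Series([3, 1, 2], index=["red", "white", "rose"], name="servings")

by_label = servings.loc["white"]   # label-based access with .loc
by_position = servings.iloc[0]     # position-based access with .iloc
mask = servings[servings > 1]      # boolean mask keeps matching entries

# A DataFrame is a two-dimensional table whose columns are Series
# that share one index.
wines = pd.DataFrame({"servings": servings, "price": [12.5, 9.0, 11.0]})

print(wines)
```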

In the next eight sections, you will dive into more advanced topics and take your pandas skills to another level, learning how to work with multiple datasets, manipulate time series, visualize data, write custom functions to transform data and much more.

· Working With Multiple DataFrames

· Going MultiDimensional

· GroupBy And Aggregates

· Reshaping With Pivots

· Working With Dates And Time

· Regular Expressions And Text Manipulation

· Visualizing Data

· Data Formats And I/O
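
Two of the advanced topics above, combining datasets and working with time series, can be sketched briefly as follows. All table names, columns, and values are hypothetical stand-ins, not the course's actual SAT or price data.

```python
import pandas as pd

# Hypothetical tables that share a "school_id" key.
students = pd.DataFrame({"school_id": [1, 2, 3], "avg_score": [480, 510, 455]})
schools = pd.DataFrame(
    {"school_id": [1, 2, 4], "borough": ["Bronx", "Queens", "Brooklyn"]}
)

# An inner join keeps only keys present in both tables;
# an outer join keeps every key and fills the gaps with NaN.
inner = students.merge(schools, on="school_id", how="inner")
outer = students.merge(schools, on="school_id", how="outer")

# A made-up price series indexed by date.
prices = pd.Series(
    [60.0, 62.0, 61.0, 65.0],
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-08", "2020-01-09"]),
)

# Downsample to weekly frequency, averaging the observations in each week.
weekly_mean = prices.resample("W").mean()

print(inner)
print(outer)
print(weekly_mean)
```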

Pandas and python go hand in hand, which is why this bootcamp also includes a full-length introduction to the python programming language to get you up and running with pythonic code in no time.

This is the ultimate course on one of the most valuable skills today. I hope you commit to mastering data analysis with pandas.

See you inside!





Course Structure

Pandas Is Not Single


Jupyter Notebooks

Cloud vs Local

Hello, Python


Series At A Glance

Section Intro

What Is A Series?

Parameters vs Arguments

What’s In The Data?

The .dtype Attribute

BONUS: What Is dtype('O'), Really?

Index And RangeIndex

Series And Index Names

Skill Challenge


Another Solution

The head() And tail() Methods

Extracting By Index Position

Accessing Elements By Label

BONUS: The add_prefix() And add_suffix() Methods

Using Dot Notation

Boolean Masks And The .loc Indexer

Extracting By Position With .iloc

BONUS: Using Callables With .loc And .iloc

Selecting With .get()

Selection Recap

Skill Challenge


Series Methods And Handling

Section Intro

Reading In Data With read_csv()

Series Sizing With .size, .shape, And len()

Unique Values And Series Monotonicity

The count() Method

Accessing And Counting NAs

BONUS: Another Approach

The Other Side: notnull() And notna()

BONUS: Booleans Are Literally Numbers In Python

Skill Challenge


Dropping And Filling NAs

Descriptive Statistics

The describe() Method

mode() And value_counts()

idxmax() And idxmin()

Sorting With sort_values()

nlargest() And nsmallest()

Sorting With sort_index()

Skill Challenge


Series Arithmetics And fill_value()

BONUS: Calculating Variance And Standard Deviation

Cumulative Operations

Pairwise Differences With diff()

Series Iteration

Filtering: filter(), where(), And mask()

Transforming With update(), apply() And map()

Skill Challenge

Solution I - Reading Data

Solution II - Mean, Median, And Standard Deviation

Solution III - Z-scores

Working With DataFrames

Section Intro

What Is A DataFrame

Creating A DataFrame

BONUS - Four More Ways To Build DataFrames

The info() Method

Reading In Nutrition Data

Some Cleanup: Removing The Duplicated Index

The sample() Method

BONUS - Sampling With Replacement Or Weights

BONUS - How Are Random Numbers Generated?

DataFrame Axes

Changing The Index

Extracting From DataFrames By Label

DataFrame Extraction by Position

Single Value Access With .at And .iat

BONUS - The get_loc() Method

Skill Challenge


More Cleanup: Going Numeric

The astype() Method

DataFrame replace() + A Glimpse At Regex

Part I: Collecting The Units

The rename() Method

DataFrame dropna()

BONUS - dropna() With Subset

Part II: Merging Units With Column Names

Part III: Removing Units From Values

Filtering in 2D

DataFrame Sorting

Using Series between() With DataFrames

BONUS - Min, Max and Idx[MinMax], And Good Foods

DataFrame nlargest() And nsmallest()

Skill Challenge


Another Skill Challenge


DataFrames In Depth

Section Intro

Introducing A New Dataset

Quick Review: Indexing With Boolean Masks

More Approaches To Boolean Masking

Binary Operators With Booleans

BONUS - XOR and Complement Binary Ops

Combining Conditions

Conditions As Variables

Skill Challenge


2d Indexing

Fancy Indexing With lookup()

Sorting By Index Or Column

Sorting vs. Reordering

BONUS - Another Way

BONUS - Please Avoid Sorting Like This

Skill Challenge


Identifying Dupes

Removing Duplicates

Removing DataFrame Rows

BONUS - Removing Columns

BONUS - Another Way: pop()

BONUS - A Sophisticated Alternative

Null Values In DataFrames

Dropping And Filling DataFrame NAs

BONUS - Methods And Axes With fillna()

Skill Challenge


Calculating Aggregates With agg()

Same-shape Transforms

More Flexibility With apply()

Element-wise Operations With applymap()

Skill Challenge


Setting DataFrame Values

The SettingWithCopy Warning

View vs Copy

Adding DataFrame Columns

Adding Rows To DataFrames

BONUS - How Are DataFrames Stored In Memory

Skill Challenge


Working With Multiple DataFrames

Section Intro

Introducing (Five?) New Datasets

Concatenating DataFrames

The Duplicated Index Issue

Enforcing Unique Indices

BONUS - Creating Multiple Indices With concat()

Column Axis Concatenation

The append() Method: A Special Case Of concat()

Concat On Different Columns

Skill Challenge


The merge() Method

The left_on And right_on Params

Inner vs Outer Joins

Left vs Right Joins

One-to-One and One-to-Many Joins

Many-to-Many Joins

Merging By Index

The join() Method

Skill Challenge


Going MultiDimensional

Section Intro

Introducing New Data

Index And RangeIndex

Creating A MultiIndex

MultiIndex From read_csv()

Indexing Hierarchical DataFrames

Indexing Ranges And Slices

BONUS - Use : With pd.IndexSlice!

Cross Sections With xs()

Skill Challenge


The Anatomy Of A MultiIndex Object

Adding Another Level

Shuffling Levels

Removing MultiIndex Levels

MultiIndex sort_index()

More MultiIndex Methods

Reshaping With stack()

The Flipside: unstack()

BONUS: Creating MultiLevel Columns Manually

An Easier Way: transpose()

BONUS - What About Panels?

Skill Challenge


GroupBy And Aggregates

Section Intro

New Data: Game Sales

Simple Aggregations Review

Conditional Aggregates

The Split-Apply-Combine Pattern

The groupby() Method

The DataFrameGroupBy Object

Customizing Index To Group Mappings

BONUS - Series groupby()

Skill Challenge


Iterating Through Groups

Handpicking Subgroups

MultiIndex Grouping

Fine-tuned Aggregates

Named Aggregations

The filter() Method

GroupBy Transformations

BONUS - There's Also apply()

Skill Challenge


Reshaping With Pivots

Section Intro

New Data: New York City SAT Scores

Pivoting Data

Undoing Pivots

What About Aggregates?

The pivot_table()

BONUS: The Problem With Average Percentage

Replicating Pivot Tables With GroupBy

Adding Margins

MultiIndex Pivot Tables

Applying Multiple Functions

Skill Challenge


Handling Date And Time

Section Intro

The Python datetime Module

Parsing Dates From Text

Even Better: dateutil

From Datetime To String

Performant Datetimes With Numpy

The Pandas Timestamp

Our Dataset: Brent Prices

Date Parsing And DatetimeIndex

A Cool Shortcut: read_csv() With parse_dates

Indexing Dates

Skill Challenge


DateTimeIndex Attribute Accessors

Creating Date Ranges

Shifting Dates With pd.DateOffset

BONUS: Timedeltas And Absolute Time

Resampling Timeseries

Upsampling And Interpolation

What About asfreq()?

BONUS: Rolling Windows

Skill Challenge


Regex And Text Manipulation

Section Intro

Our Data: Boston Marathon Runners

String Methods In Python

Vectorized String Operations In Pandas

Case Operations

Finding Characters And Words

Strips And Whitespace

String Splitting And Concatenation

More Split Parameters

Skill Challenge


Slicing Substrings

Masking With String Methods

BONUS: Parsing Indicators With get_dummies()

Text Replacement

Introduction To Regular Expressions

More Regex Concepts

How To Approach Regex?

Is This A Valid Email?

BONUS: What's The Point Of re.compile()?

Pandas str contains(), split() And replace() With Regex

Skill Challenge


Visualizing Data

Section Intro

The Art Of Data Visualization

The Preliminaries Of matplotlib

Line Graphs

Bar Charts

Pie Plots


Scatter Plots

Other Visualization Options

BONUS: Data Ink And Chartjunk

Skill Challenge


Data Formats And I/O

Section Intro

Reading JSON

Reading HTML

Reading Excel

Creating Output: The to_* Family Of Methods

BONUS: Introduction To Pickling

Pickles In Pandas

The Many Other Formats

Skill Challenge


Appendix A - Rapid-Fire Python Fundamentals

Section Intro

Data Types


Arithmetic And Augmented Assignment Operators

Ints And Floats

Booleans And Comparison Operators



Containers I: Lists

Lists vs. Strings

List Methods And Functions

Containers II: Tuples

Containers III: Sets

Containers IV: Dictionaries

Dictionary Keys And Values

Membership Operators

Controlling Flow: if, else, And elif

Truth Value Of Non-booleans

For Loops

The range() Immutable Sequence

While Loops

Break And Continue

Zipping Iterables

List Comprehensions

Defining Functions

Function Arguments: Positional vs Keyword


Importing Modules

Appendix B - Going Local: Installation And Setup

Installing Anaconda And Python - Windows

Installing Anaconda And Python - Mac

Installing Anaconda And Python - Linux


Geert, 15 January 2021

Tutor talks very slowly, I have to put it at 1,5x speed and even then I'm skipping a lot because I had already time to test it. I also find that a lot of exotic stuff is explained, only to be followed by "will hardly be used" or "you won't need this". I was hoping to get a more structured approach with also guidelines on which steps to take to approach a typical data analysis. Leave the details for later or refer to documentation, focus on the most commonly used tools and explain via projects, from basic to more advanced.

Shibbu, 28 December 2020

I found this course very detailed and well arranged from start to end. All concepts are explained very well, from basic to advanced. After taking this course I feel confident in analyzing data with Pandas.

Abhishek, 16 December 2020

This is an excellent course in pandas, regular expressions and string manipulation. The best part about this course is that it covers all the use cases that one can think about. The clarity and depth that Andy exhibits is truly remarkable. I am glad that I enrolled for this course. Would highly recommend this course :)

Ruparaja, 13 November 2020

Alhamdhulilah!!! Excellent. From my own experience, this course is an amazing way to learn python for data analysis.

Neil, 8 October 2020

Superb. I am an advanced pandas user and I find this course an excellent reference material. Highly recommended. Now I have both (the other one also 30+ hours long) pandas courses available on Udemy and they are both worth keeping. I hope both authors continue to add to the course as there is just so, so much more pandas can do. This course, as is, is a steal too. Thank you for creating it.

Enes, 22 August 2020

Best course on Udemy! Such a nice, deep and brilliant course! I have learnt so many things from Andy, and he is an amazing person!

Pravin, 7 August 2020

This is really good for me. It improved my knowledge of Python (linear regression). I knew about regression, but now I can explore it with Python, which helps me a lot. Thank you.

Yogesh, 28 July 2020

Course Contents are meeting my expectations. Course is going to cover each and every detail areas required to perform operations with Pandas module. Instructor is having good knowledge of Pandas, clear with his language and pronunciation.

Rohan, 4 July 2020

This is the only Pandas Bootcamp that covers everything. I haven't found anything else. Andy is also using Google Colab, which I like more than Jupyter Notebooks, which kinda seems old. He goes at a somewhat fast pace, which I really like, and he teaches well. Would definitely recommend this course.

Isidor, 26 June 2020

Impressive depth. Massive amount of very useful content. Well-broken up into bite-size pieces. Well-designed "skill challenges". I also like the sprinkling of humor and brief detours to follow his curiosity when interesting results show up in the example data sets ("Hmm...I wonder what brains are made of?" :D ). I'm only about a third of the way through, and this is my first Udemy course...so I don't know what other Udemy courses are like...but I'm learning a lot from this course of Andy's. Thank you Andy and Udemy.

Mochamad, 25 June 2020

It could be better if there were tasks for us, such as project problems that use pandas, but the contents are all super detailed and well explained.

Turki, 21 June 2020

Good so far, though it could be better. It sometimes seems a little confusing, but a good job in general.

Emin, 13 June 2020

The topics are explained in a really simple and understandable way :) Thanks to the examples given, the topics are easier to understand.

Dosses, 12 June 2020

The author expresses himself clearly, in an ordered way and demonstrates familiarity with the topics introduced. The content is very complete, easy to understand and to follow, with exercises that cover solutions for everyday work. Great job and excellent course!

Dener, 11 June 2020

This is an amazing course! Andy knows a lot, and his teaching is awesome and easy to understand! It's a professional course; the instructor doesn't make jokes while teaching. Thank you, Andy!

