Natural Language Processing (NLP) Fundamentals in Python

Learn the fundamentals of NLP and Text Mining by using NLTK, Word2Vec, Neural Networks and Sentiment Analysis

4.41 (11 reviews)


14 hours


Last update: Jul 2021

What you will learn

Dealing with Strings in Python

Working with the Natural Language Toolkit Library

Understanding the Intuition behind Word Vectors

Pre-Processing Text for Analytics

Understanding Text Vectorization

Train a Neural Network to generate Word Embeddings

Obtain Text Data from Web Pages

Read Files with Textual Data

Developing a Sentiment Analysis Tool

Train a Machine Learning Model


Have you ever wondered how big companies like Google, Amazon or Facebook work with textual data?

Natural Language Processing is one of the most exciting fields in Data Science and Analytics nowadays. The ability to make a computer understand words and phrases is a technological innovation that brought a huge transformation to tasks such as Information Retrieval, Translation or Text Classification.

In this course we are going to learn the fundamentals of working with text data in Python and discuss the most important techniques that you should know to start your journey in Natural Language Processing. This course was designed for absolute beginners - every NLP concept that we cover will be explained during the lectures, assuming that the student has no prior knowledge of the subject.

Don't worry if you don't know Python by heart - this course also contains a Python crash course that will help you get familiar with the language and will support the use cases that we develop throughout the lectures. In this course we are going to cover the following concepts:

  • Working with the raw material of Natural Language Processing - strings - in Python;

  • Tokenizing Sentences and Documents;

  • Stemming and Lemmatizing words;

  • Training machine learning models using text;

  • Extracting the Part-of-Speech Tag from words in a sentence;

  • Extracting Text Data from a Web Page;

  • Training a Neural Network to extract Word Embeddings;

  • Developing your own sentiment classifier (Sentiment Analysis);

  • Representing Sentences as Tabular Data.
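
To give a flavour of two of the concepts above (tokenizing sentences and representing them as tabular data), here is a minimal sketch. It is not taken from the course materials: it uses only the Python standard library instead of NLTK, and the function names `tokenize` and `count_vectorize` are hypothetical names chosen for this illustration.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", sentence.lower())


def count_vectorize(sentences):
    """Return (vocabulary, rows), where each row holds one sentence's word counts.

    The vocabulary is the sorted list of all distinct tokens; row i, column j
    is the number of times vocabulary[j] occurs in sentences[i].
    """
    tokenized = [tokenize(s) for s in sentences]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)  # missing words count as 0
        rows.append([counts[word] for word in vocabulary])
    return vocabulary, rows


# Illustrative input sentences (made up for this example)
sentences = ["The cat sat on the mat", "The dog sat"]
vocabulary, rows = count_vectorize(sentences)

print(vocabulary)
for row in rows:
    print(row)
```

Each sentence becomes one row and each vocabulary word one column, which is the tabular form that a machine learning model can consume. A real tokenizer, such as the ones in NLTK, handles punctuation and edge cases that this regular expression ignores.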

After finishing the course you should be able to build your own NLP applications and understand the fundamental concepts that underlie most NLP algorithms. This will give you the flexibility to study more advanced Natural Language Processing concepts and help you get familiar with the strategies and techniques that most companies used when they started building their NLP applications.

Join me in this exciting NLP journey - I'm looking forward to seeing you in the course!




Course Introduction


Course Materials and Speed Up

Installing Anaconda and Initial Setup

Installing the Anaconda Distribution

3 Alternatives to Set Up your Environment

[1] - Creating an Environment and Installing Libraries via Anaconda

[2] - Creating an Environment by Importing the YML File

Launching a Jupyter Notebook via Anaconda Navigator

[3] - Creating an Environment via Conda

Installing Libraries via Conda

Launching a Jupyter Notebook via Conda

Testing if your environment is OK

Summary on Environment Setup

Setting up Environment - Quiz

Python Basics Mini-Course

Jupyter Notebook Overview

Python Integers, Floats and Strings

Python Libraries

Python Lists and Sets

Python Dictionaries and Tuples

Python Control Flow

Python Functions

Numpy Overview

Pandas Overview

Tutorial - How to Complete the Exercises

Quiz - Python Quick Course

Python Quick Course - Exercises

Basic Text Processing

Manipulating Text Objects

Combining Strings

Iterating Strings and Format Method

Testing if String is in Sentence

Escaping Characters

Sentence Length, Conversions and Casing Methods

Is Alpha, Strip and Split

Join and Capitalize

Replace, Count and Find

Working with Text - Quiz

Working with Text - Exercises

Exploring NLTK (Natural Language Toolkit)

Natural Language Toolkit Introduction and Sentence Tokenizer

Word Tokenizer

Tokenizer Application and Cleaning Tokens

Counting Frequency of Digits in Sentence

FreqDist NLTK Function

Porter, Snowball and Lancaster Stemmers

Stemming Sentences

Sentence Lemmatization

Part-of-Speech (POS) Tagging

Training a POS Tagger from Scratch - Accessing Tagged Data from Brown Corpus

Training a POS Tagger from Scratch - Unigram Tagger

Training a POS Tagger from Scratch - Bigram Tagger

Plotting the Frequency of Tags in a Sentence

Lemmatization and POS Tagging

Stop Words


Natural Language Toolkit - Quiz

Natural Language Toolkit - Exercises

Reading Text Data into Python

Read Data from a CSV File - Using Pandas

Read Data from a CSV File - Using Python CSV

Read Data from a TXT File

Scraping a Web Page using Requests and BeautifulSoup - Wikipedia Example

Scraping a Web Page using Requests and BeautifulSoup - Yahoo Finance Example

Scraping a Web Page - Errors in Request

Scraping a Web Page using Specific Libraries

Reading Text Data - Quiz

Reading Text Data - Exercises

Word Vectors Intuition

Introduction to Word Vectors

Binary Word Vectors

Word Co-Occurrence Matrices

Filling Co-Occurrence Matrix

Visualizing Word Vectors

Similarity between Words - Cosine

Word Similarities from Co-Occurrence Matrix

Word as Vectors - Quiz

Word Vectors - Exercises

Continuous Bag of Words Implementation and Word2Vec

Continuous Bag of Words Model (CBOW) Introduction

CBOW - Creating Vocab and Binary Word Arrays

CBOW - Building Features and Target Variable

CBOW - Accuracy of Random Model and Training Process

CBOW - Training the Neural Network

CBOW - Obtaining Word Vectors (Embeddings)

Pre-Processing Wikipedia Data for CBOW Model

Building Features and Target for Wikipedia Data

Fitting Neural Network on Wikipedia Data

Performance of the Neural Network

Predicting a Word Given a Context

Retrieving Word Embeddings and Word Similarities


Word2Vec - Operations with Vectors

Word2Vec - Word Clustering

Continuous Bag of Words Implementation and Word2Vec - Quiz

Continuous Bag of Words Implementation - Exercises

Text Representation

Binary Vectorizer

Count Vectorizer


Text Representation - Exercises

Course Ending

Thank you!


Tiago, 25 July 2021

The course is really well structured and Ivo explains everything in great detail. Above expectations.

