Udemy

Platform

English

Language

Other

Category

Deep Learning for NLP - Part 2

Part 2: Encoder-decoder models, attention and Transformers

Students

3 hours

Content

Jul 2021

Last Update
Regular Price


What you will learn

Deep Learning for Natural Language Processing

Encoder-decoder models, Attention models, ELMo

GLUE, Transformers, GPT, BERT

DL for NLP


Description

This course is part of the "Deep Learning for NLP" series. In this course, I will introduce concepts such as encoder-decoder attention models, ELMo, GLUE, Transformers, GPT, and BERT. These concepts form the foundation for understanding advanced deep learning models for modern Natural Language Processing.

The course consists of two main sections.

In the first section, I will talk about encoder-decoder models in the context of machine translation and how a beam search decoder works. Next, I will introduce the concept of encoder-decoder attention. I will then elaborate on different types of attention: global attention, local attention, hierarchical attention, and attention for sentence pairs using CNNs as well as LSTMs. We will also look at attention visualization. Finally, we will discuss ELMo, a way of using recurrent models to compute context-sensitive word embeddings.
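
As a rough illustration of the beam search decoding mentioned above, here is a minimal sketch in plain Python. It is not taken from the course material. The names `beam_search`, `step_fn`, and `toy_model`, and the toy probability table, are hypothetical and exist only for this example. The sketch assumes the model exposes a function that maps a token prefix to next-token probabilities, and it applies no length normalization.

```python
import math


def beam_search(step_fn, start_token, end_token, beam_width=2, max_len=10):
    """Return the highest-scoring token sequence found by beam search.

    step_fn(prefix) is assumed to return a dict {next_token: probability}.
    """
    beams = [([start_token], 0.0)]  # (tokens, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for tokens, score in beams:
            for token, prob in step_fn(tokens).items():
                if prob > 0.0:
                    candidates.append((tokens + [token], score + math.log(prob)))
        if not candidates:
            break
        # Keep only the beam_width best expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == end_token:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses at max_len
    return max(finished, key=lambda c: c[1])[0]


def toy_model(prefix):
    """Hypothetical next-token distribution, hard-coded for illustration."""
    table = {
        ("<s>",): {"a": 0.6, "b": 0.4},
        ("<s>", "a"): {"x": 0.5, "y": 0.5},
        ("<s>", "b"): {"z": 0.9, "w": 0.1},
    }
    return table.get(tuple(prefix), {"</s>": 1.0})


greedy = beam_search(toy_model, "<s>", "</s>", beam_width=1)
wider = beam_search(toy_model, "<s>", "</s>", beam_width=2)
print(greedy)  # ['<s>', 'a', 'x', '</s>']
print(wider)   # ['<s>', 'b', 'z', '</s>']
```

With a beam width of 1 the search is greedy and commits to "a", which ends with total probability 0.3. With a beam width of 2 it keeps "b" alive long enough to find the sequence with probability 0.36.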

In the second section, I will describe the tasks that make up the GLUE benchmark, as well as other benchmark NLP datasets across tasks. Then we will start our modern NLP journey by understanding the different parts of an encoder-decoder Transformer model. We will delve into the details of Transformers through concepts like self-attention, multi-head attention, positional embeddings, residual connections, and masked attention. After that, I will talk about two of the most popular Transformer models: GPT and BERT. In the GPT part, we will discuss how GPT is trained and how variants like GPT-2 and GPT-3 differ. In the BERT part, we will discuss how BERT differs from GPT and how it is pretrained using the masked language modeling and next sentence prediction tasks. We will also briefly cover fine-tuning for BERT and multilingual BERT.
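
To make the self-attention and masked attention concepts listed above more concrete, here is a minimal single-head sketch of scaled dot-product attention in plain Python. It is not taken from the course material. The function names `softmax` and `self_attention` are hypothetical, and the sketch assumes the queries, keys, and values are the raw input vectors, with none of the learned projection matrices a real Transformer would use.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values, causal=False):
    """Scaled dot-product attention for one head.

    With causal=True, position i may only attend to positions j <= i,
    which is the masking used in decoder-style models.
    Returns (outputs, attention_weights).
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                scores.append(float("-inf"))  # masked: weight becomes 0
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
        weights = softmax(scores)
        # Output is the attention-weighted sum of the value vectors.
        out = [
            sum(w * v[dim] for w, v in zip(weights, values))
            for dim in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy 2-dimensional token vectors (made up for illustration).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, w = self_attention(x, x, x, causal=True)
print(w[0])  # [1.0, 0.0, 0.0]: the first token can only see itself
```

Multi-head attention would run several such heads in parallel on different learned projections of the input and concatenate their outputs. That step is omitted here to keep the sketch short.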



Content

Encoder-decoder attention models, ELMo

Introduction

Encoder-decoder models

Global, local, hierarchical attention; attention for sentence pairs

Attention based models

ELMo

Summary

GLUE, Transformers, GPT, BERT

Introduction

GLUE benchmark

Transformers-1

Transformers-2

GPT

BERT

Summary


4037084

Udemy ID

5/9/2021

Course created date

5/30/2021

Course Indexed date
Bot
Course Submitted by