Optical Character Recognition (OCR) in Python

OpenCV, Tesseract, EasyOCR and EAST applied to images and videos! Create your own OCR from scratch using Deep Learning!

Rating: 4.25 (28 reviews)
Platform: Udemy
Language: English
Category: Data Science
Students: 501
Content: 13 hours
Last update: Apr 2022
Regular price: $19.99

What you will learn

Use Tesseract, EAST and EasyOCR tools for text recognition in images and videos

Understand the differences between OCR in controlled and natural environments

Apply image pre-processing techniques to improve image quality, such as: thresholding, inversion, resizing, morphological operations and noise reduction

Use EAST architecture and EasyOCR library for better performance in natural scenes

Train an OCR from scratch using Deep Learning and Convolutional Neural Networks

Apply natural language processing techniques to the texts extracted by OCR (word cloud and named entity recognition)

License plate reading
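
To make the pre-processing item above more concrete, here is a minimal sketch of Otsu thresholding followed by optional color inversion, written in plain Python so it runs without extra libraries. It is not taken from the course material: the function names `otsu_threshold` and `binarize` and the tiny sample image are assumptions made for illustration. In practice the same step is usually done with OpenCV's `cv2.threshold` and the `cv2.THRESH_OTSU` flag.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of gray levels of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is background from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image, threshold, invert=False):
    """Map pixels above the threshold to 255 and the rest to 0 (or the reverse)."""
    high, low = (0, 255) if invert else (255, 0)
    return [[high if p > threshold else low for p in row] for row in image]


# Hypothetical 2x3 grayscale image: dark text pixels next to bright background.
image = [[10, 12, 200],
         [11, 210, 205]]

t = otsu_threshold([p for row in image for p in row])
binary = binarize(image, t)
inverted = binarize(image, t, invert=True)
```

Inversion matters because OCR engines such as Tesseract generally work best with dark text on a light background, so light-on-dark images are often inverted before recognition.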

Description

Optical Character Recognition (OCR) is a sub-area of Computer Vision that aims to transform images into text. It can be described as converting images containing typed, handwritten or printed text into characters that a machine can understand. Scanned or photographed documents can be converted into text that is editable in any tool, such as Microsoft Word. A common application is automatic form reading: you send a photo of your credit card or your driver's license, and the system reads all your data without you having to type it manually. A self-driving car can use OCR to read traffic signs, and a parking lot can grant access by reading the license plates of cars!

To introduce you to this area, this course teaches you in practice how to use OCR libraries to recognize text in images and videos, with all the code implemented step by step in the Python programming language! We are going to use Google Colab, so you do not have to worry about installing libraries on your machine, as everything will be developed online using Google's GPUs! You will also learn how to build your own OCR from scratch using Deep Learning and Convolutional Neural Networks! Below you can check the main topics of the course:

  • Recognition of texts in images and videos using Tesseract, EasyOCR and EAST

  • Search for specific terms in images using regular expressions

  • Techniques for improving image quality, such as: thresholding, color inversion, grayscale, resizing, noise removal, morphological operations and perspective transformation

  • EAST architecture and EasyOCR library for better performance in natural scenes

  • Training an OCR from scratch using TensorFlow and modern Deep Learning techniques, such as Convolutional Neural Networks

  • Application of natural language processing techniques to the texts extracted by OCR (word cloud and named entity recognition)

  • License plate reading

These are just some of the main topics! By the end of the course, you will know everything you need to create your own text recognition projects using OCR!
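
As an illustration of the regular-expression topic listed above, the following sketch searches already-extracted text for dates and e-mail addresses using Python's standard `re` module. The sample string, the two patterns, and the helper `find_terms` are assumptions made for this example, not part of the course. In a real pipeline the text would come from an OCR engine such as Tesseract or EasyOCR.

```python
import re

# Hypothetical OCR output; a real string would be produced by an OCR engine.
ocr_text = """Invoice 2022-04-12
Contact: support@example.com
Total due: $19.99 by 04/23/2022"""

# Illustrative patterns: dates written as MM/DD/YYYY and simple e-mail addresses.
DATE_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def find_terms(text, pattern):
    """Return (matched text, start offset) for every match of pattern in text."""
    return [(m.group(0), m.start()) for m in pattern.finditer(text)]


dates = find_terms(ocr_text, DATE_PATTERN)
emails = find_terms(ocr_text, EMAIL_PATTERN)

for value, offset in dates + emails:
    print(f"found {value!r} at offset {offset}")
```

OCR output is often noisy, so patterns used on real documents may need to tolerate misread characters, for example `O` in place of `0`.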

Content

Introduction

Course content
Introduction to OCR
Course materials
FREE COURSE! Until April 22

OCR with Tesseract

Introduction to Tesseract
Preparing the environment
First text recognition
Support for other languages
Page segmentation mode (PSM)
Page orientation detection
Selection of texts 1
Selection of texts 2
Selection of texts 3
Search using regular expressions
Detections in natural scenarios

Techniques for image pre-processing

Grayscale
Thresholding - intuition
Simple thresholding
Thresholding with Otsu method
Adaptive thresholding
Gaussian adaptive thresholding
Color inversion
Resizing - intuition
Resizing - implementation
Morphological operations - intuition
Morphological operations - implementation
Noise removal - intuition
Noise removal - implementation
Text recognition with OCR
HOMEWORK
Homework solution

OCR with EAST for natural scenes

EAST - introduction
Pre-processing the image
Loading the neural network
Decoding the image 1
Decoding the image 2
Text recognition

Training a custom OCR

Importing the libraries
MNIST 0-9 dataset
Kaggle A-Z dataset
Joining the datasets
Pre-processing the data
Building the neural network
Training the neural network
Evaluating the neural network
Saving the neural network
Testing with images
Preparing the environment
Pre-processing the image
Contour detection
Processing the detections 1
Processing the detections 2
Character recognition
Problems with 0 and O, 1 and l, 5 and S
Problems with undetected texts

Natural scenarios with EasyOCR

Preparing the environment
Text recognition
Writing the results on the image
Other languages - French and Chinese
Text recognition (background)

OCR in videos

Preparing the environment
Video settings
Processing the video
OCR with EAST and Tesseract
OCR with EasyOCR

Project 1: Searching for specific terms

Preparing the environment
Text recognition
Searching for texts
Word cloud
Named entity recognition
Search for texts in images
Saving the results

Project 2: Scanner + OCR

Preparing the environment
Contour detection
Perspective transformation
OCR with Tesseract
Improving image quality
Putting all together

Project 3: License plate reading

Pre-processing the image
Text recognition
Improving image quality

Extra content 1: artificial neural networks

Biological fundamentals
Single layer perceptron
Multilayer perceptron – sum and activation functions
Multilayer perceptron – error calculation
Gradient descent
Delta parameter
Updating weights with backpropagation
Bias, error, stochastic gradient descent, and more parameters

Extra content 2: convolutional neural networks

Introduction to convolutional neural networks
Convolutional operation
Pooling
Flattening
Dense neural network

Final remarks

Final remarks
BONUS

Reviews

Ezekiel
June 12, 2022
would be great if you gave all the coding resources in a IPYNB with all the demo images preloaded rather than making a clone of the collab book and uploading the images manually
Daniel
May 23, 2022
He gives very detailed explanations about what the different variables are, and what they mean. I 100% recommend the course.

Charts

Price chart, ratings chart, and enrollment distribution chart for Optical Character Recognition (OCR) in Python.

Udemy ID: 4639122
Course created date: 4/12/2022
Course indexed date: 4/23/2022
Course submitted by: Bot