Computer Vision: Python OCR & Object Detection Quick Starter

Quick Starter for Optical Character Recognition, Image Recognition, Object Detection and Object Recognition using Python

Rating: 4.53 (528 reviews)
Platform: Udemy
Language: English
Category: Data Science
Instructor:
Students: 5,943
Content: 4.5 hours
Last update: Nov 2023
Regular price: $69.99

What you will learn

Optical Character Recognition with Tesseract Library, Image Recognition using Keras, Object Recognition using MobileNet SSD, Mask R-CNN, YOLO, Tiny YOLO

Optical Character Recognition

Image Recognition

Object Recognition

Description

Hi There!


Welcome to my new course, 'Optical Character Recognition and Object Recognition Quick Start with Python'. This is the third course in my Computer Vision series.


Image Recognition, Object Detection, Object Recognition and also Optical Character Recognition are among the most used applications of Computer Vision.


Using these techniques, the computer can recognize and classify either a whole image or multiple objects inside a single image, predicting the class of each object along with a confidence score. Using OCR, it can also recognize text in images and convert it to a machine-readable format such as plain text or a document.


Object Detection and Object Recognition are widely used in many simple applications and also in complex ones such as self-driving cars.


This course is a quick starter for people who want to dive into Optical Character Recognition, Image Recognition and Object Detection using Python without having to deal with all the complexities and mathematics associated with a typical Deep Learning process.


Let's now see the list of interesting topics that are included in this course.


At first we will have an introductory theory session about Optical Character Recognition technology.


After that, we will prepare our computer for Python coding by downloading and installing the Anaconda package, and then check that everything is installed correctly.


Most of you may not be coming from a Python-based programming background. The next few sessions and examples will help you gain the basic Python programming skills needed for the rest of this course. The topics include Python assignment, flow control, functions and data structures.
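
As a rough illustration of those four basics together, here is a minimal sketch. The names `threshold`, `scores` and `confident_labels` are made up for this example and are not taken from the course material.

```python
# Assignment: names are bound to values; no type declarations are needed.
threshold = 0.5

# Data structures: a dict maps keys (labels) to values (scores).
scores = {"cat": 0.91, "dog": 0.42, "car": 0.77}


# Function: takes a dict of scores and returns the labels that pass a cutoff.
def confident_labels(scores, cutoff):
    kept = []  # list: ordered and mutable
    # Flow control: loop over the items and branch on a condition.
    for label, score in scores.items():
        if score >= cutoff:
            kept.append(label)
    return kept


result = confident_labels(scores, threshold)
print(result)  # ['cat', 'car']
```

The same loop-and-filter pattern shows up later whenever detections are kept or dropped by confidence.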


Then we will install the dependencies and libraries that we need for Optical Character Recognition. We are using the Tesseract library to do the OCR. First we will install the library and then its Python bindings. We will also install OpenCV, the Open Source Computer Vision library, for Python.
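
One simple way to confirm that the Python side of these installs is visible to the interpreter is sketched below. It assumes the usual import names (`cv2` for OpenCV, `pytesseract` for the Tesseract bindings, `PIL` for Pillow). It only checks the Python modules, not whether the Tesseract program itself is installed.

```python
import importlib.util

# Import names (not pip package names) assumed for this part of the course.
REQUIRED = ["cv2", "pytesseract", "PIL"]


def missing_modules(names):
    """Return the module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required modules were found.")
```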


We will also install the Pillow library, a fork of the Python Imaging Library (PIL). Then we will have an introduction to the steps involved in Optical Character Recognition and later proceed with coding and implementing the OCR program. We will use a few example images to test character recognition and verify the results.
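
The OCR steps described above might look roughly like the sketch below. It is not the course's own code. It assumes OpenCV and pytesseract are installed and that the Tesseract program is on the system path. The grayscale and Otsu thresholding steps are a common pre-processing choice, and `clean_ocr_text` is a hypothetical helper added for this example.

```python
import re


def clean_ocr_text(raw):
    """Collapse runs of whitespace and drop empty lines from raw OCR output."""
    lines = [re.sub(r"\s+", " ", line).strip() for line in raw.splitlines()]
    return "\n".join(line for line in lines if line)


def ocr_image(path):
    """Read an image, binarize it, and return the recognized text."""
    # Imported here so the helper above stays usable without these packages.
    import cv2
    import pytesseract

    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding picks the black/white cutoff automatically.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    return clean_ocr_text(pytesseract.image_to_string(binary))
```

Calling `ocr_image("sample.png")` (a placeholder file name) would return the recognized text as a string, which can then be compared by eye against the image.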


Then we will have an introduction to Convolutional Neural Networks, which we will be using to do the Image Recognition. Here we will be classifying a full image based on the single primary object in it.


We will then proceed with installing the Keras library, which we will be using to do the Image Recognition. We will be using the built-in, pre-trained models that are included in Keras. The base code in Python is also provided in the Keras documentation.


First we will use the popular pre-trained model architecture called VGGNet. We will have an introductory session about the architecture of VGGNet. Then we will proceed with using the pre-trained VGGNet 16 model included in Keras to do Image Recognition and classification. We will try a few sample images to check the predictions. Then we will move on to the deeper VGGNet 19 model included in Keras to do Image Recognition and classification.
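
A minimal sketch of classifying one image with the pre-trained VGG16 model is shown below. It assumes a TensorFlow installation recent enough to provide `tensorflow.keras.utils.load_img`; older versions expose the same functions under `tensorflow.keras.preprocessing.image`. The ImageNet weights are downloaded on first use. `top_labels` is a hypothetical formatting helper added for this example.

```python
def top_labels(decoded, min_score=0.0):
    """Turn one image's decoded predictions into 'label: 12.3%' strings.

    `decoded` is a list of (class_id, label, score) tuples, which is the
    per-image shape returned by Keras' decode_predictions.
    """
    return [f"{label}: {score * 100:.1f}%"
            for _, label, score in decoded if score >= min_score]


def classify_with_vgg16(path, top=3):
    """Classify one image file with the ImageNet-trained VGG16 model."""
    # Heavy imports are kept local so top_labels works without TensorFlow.
    import numpy as np
    from tensorflow.keras.applications.vgg16 import (
        VGG16, decode_predictions, preprocess_input)
    from tensorflow.keras.utils import img_to_array, load_img

    model = VGG16(weights="imagenet")
    image = load_img(path, target_size=(224, 224))  # VGG16 expects 224x224
    batch = np.expand_dims(img_to_array(image), axis=0)  # add a batch axis
    predictions = model.predict(preprocess_input(batch))
    return top_labels(decode_predictions(predictions, top=top)[0])
```

Swapping in VGG19 should only require changing the import to `tensorflow.keras.applications.vgg19` and the class to `VGG19`, since both use the same input size.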


Then we will try the ResNet pre-trained model included with the Keras library. We will include the model in the code and then try a few sample images to check the predictions.


After that we will try the Inception pre-trained model. We will again include the model in the code and then try a few sample images to check the predictions. Then we will go ahead with the Xception pre-trained model. Here also, we will include the model in the code and then try a few sample images.
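
Because the Keras pre-trained models share the same calling pattern, switching between them mostly comes down to the module, the class and the expected input size. The lookup table below is one hypothetical way to organise that. It assumes ResNet50 for "ResNet" and InceptionV3 for "Inception", since the description does not name the exact variants.

```python
# name -> (module under tf.keras.applications, class name, input side length)
PRETRAINED = {
    "vgg16": ("vgg16", "VGG16", 224),
    "vgg19": ("vgg19", "VGG19", 224),
    "resnet": ("resnet50", "ResNet50", 224),          # assumed variant
    "inception": ("inception_v3", "InceptionV3", 299),  # assumed variant
    "xception": ("xception", "Xception", 299),
}


def input_size(name):
    """Return the (height, width) the named model expects by default."""
    side = PRETRAINED[name][2]
    return (side, side)


def load_pretrained(name):
    """Return (model, preprocess_input, decode_predictions) for a model name."""
    import tensorflow as tf  # local import: the table above needs no TensorFlow

    module_name, class_name, _ = PRETRAINED[name]
    module = getattr(tf.keras.applications, module_name)
    model = getattr(module, class_name)(weights="imagenet")
    return model, module.preprocess_input, module.decode_predictions
```

The detail that is easy to miss is the input size: Inception and Xception default to 299x299 images, while the VGG and ResNet models default to 224x224.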


Those were the Image Recognition pre-trained models, which can only label and classify a complete image based on the primary object in it. Now we will proceed with Object Recognition, in which we can detect and label multiple objects in a single image.


First we will have an introduction to the MobileNet-SSD pre-trained model, which is a single shot detector capable of detecting multiple objects in a scene. We will also have a quick discussion about the dataset that is used to train this model.


Later we will implement the MobileNet-SSD pre-trained model in our code and get the predictions and bounding box coordinates for every object detected. We will draw the bounding box around each object in the image and write the label along with the confidence value.
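
The sketch below shows one way this could be done with OpenCV's `dnn` module. It assumes the commonly distributed Caffe version of MobileNet-SSD, whose output rows have the layout `[image_id, class_id, confidence, x1, y1, x2, y2]` with corners given as fractions of the image size. The file paths, `class_names` and the scale and mean values passed to `blobFromImage` are assumptions to check against the model files actually used.

```python
def scale_detections(rows, width, height, min_confidence=0.5):
    """Convert SSD output rows to pixel boxes, dropping weak detections."""
    boxes = []
    for _, class_id, confidence, x1, y1, x2, y2 in rows:
        if confidence < min_confidence:
            continue
        boxes.append((int(class_id), float(confidence),
                      int(x1 * width), int(y1 * height),
                      int(x2 * width), int(y2 * height)))
    return boxes


def detect_and_draw(image_path, prototxt, weights, class_names):
    """Run the detector on one image and return the annotated image."""
    import cv2  # local import: scale_detections itself needs no OpenCV

    net = cv2.dnn.readNetFromCaffe(prototxt, weights)
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    height, width = image.shape[:2]
    # Assumed pre-processing for the Caffe MobileNet-SSD: 300x300 input,
    # pixel values shifted by 127.5 and scaled by 0.007843.
    blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    rows = net.forward()[0, 0]  # shape (N, 7)
    for class_id, conf, x1, y1, x2, y2 in scale_detections(rows, width, height):
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(image, f"{class_names[class_id]}: {conf:.2f}",
                    (x1, max(y1 - 5, 15)), cv2.FONT_HERSHEY_SIMPLEX,
                    0.5, (0, 255, 0), 2)
    return image
```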


Then we will go ahead with object detection from a live video. We will stream real-time video from the computer's webcam and try to detect objects in it. We will draw a rectangle around each object detected in the live video, along with the label and confidence.


In the next session, we will go ahead with object detection from a pre-saved video. We will read the saved video from our folder and try to detect objects in it. We will draw a rectangle around each object detected, along with the label and confidence.
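
With OpenCV, the webcam and saved-video cases differ only in what is passed to `cv2.VideoCapture`: a camera index or a file path. The sketch below assumes a caller-supplied `process_frame` function that runs the detector on one frame and returns the annotated frame. Both `open_source` and `run_on_video` are hypothetical names for this example. The network should be loaded once, before the loop, because reloading it for every frame slows the video down considerably.

```python
def open_source(source):
    """Map a user-supplied source to what cv2.VideoCapture expects.

    "webcam" or a digit string selects a camera index; anything else is
    treated as the path to a saved video file.
    """
    if source == "webcam":
        return 0
    if isinstance(source, str) and source.isdigit():
        return int(source)
    return source


def run_on_video(source, process_frame):
    """Call process_frame(frame) on every frame and display the result."""
    import cv2  # local import: open_source itself needs no OpenCV

    capture = cv2.VideoCapture(open_source(source))
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of file or camera error
                break
            cv2.imshow("detections", process_frame(frame))
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```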


Later we will go ahead with the Mask-RCNN pre-trained model. With the previous model, we were only able to get a bounding box around the object, but with Mask-RCNN we can get both the box coordinates and a mask over the exact shape of the object detected. We will have an introduction to this model and its details.


Then we will implement the Mask-RCNN pre-trained model in our code, and as the first step we will get the predictions and bounding box coordinates for every object detected. We will draw the bounding box around each object in the image and write the label along with the confidence value.


Later we will get the mask returned for each object predicted. We will process that data and use it to draw translucent, multi-coloured masks over every object detected and write the label along with the confidence value.
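
A translucent mask is usually produced by blending a colour into the pixels that the mask covers: each covered pixel becomes (1 - alpha) times the original value plus alpha times the mask colour. The pure-Python sketch below shows that arithmetic on nested lists so it can be followed step by step. In practice the same formula would normally be applied to NumPy image arrays. The function names are made up for this example.

```python
def blend_pixel(pixel, color, alpha):
    """Mix a mask color into one pixel; alpha=0 keeps the pixel, 1 replaces it."""
    return tuple(int(round((1 - alpha) * p + alpha * c))
                 for p, c in zip(pixel, color))


def overlay_mask(image, mask, color, alpha=0.5):
    """Return a copy of `image` with `color` blended in wherever mask is true.

    `image` is a list of rows of (b, g, r) tuples and `mask` is a same-sized
    list of rows of booleans.
    """
    return [[blend_pixel(px, color, alpha) if inside else px
             for px, inside in zip(image_row, mask_row)]
            for image_row, mask_row in zip(image, mask)]


# Tiny worked example: one row, two grey pixels, only the first is masked.
image = [[(100, 100, 100), (100, 100, 100)]]
mask = [[True, False]]
print(overlay_mask(image, mask, (0, 200, 0)))
# [[(50, 150, 50), (100, 100, 100)]]
```

Giving each detected object its own colour then just means calling the overlay once per object with a different `color`.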


Then we will go ahead with object detection from a live video using Mask-RCNN. We will stream real-time video from the computer's webcam and try to detect objects in it. We will draw the mask over each object detected in the live video, along with the label and confidence.


And as we did for our previous model, we will go ahead with object detection from a pre-saved video using Mask-RCNN. We will read the saved video from our folder and try to detect objects in it. We will draw coloured masks for each object detected, along with the label and confidence.


Mask-RCNN is very accurate and has a large class list, but it is very slow at processing images on low-power, CPU-based computers. MobileNet-SSD is fast but less accurate and supports fewer classes. We need a good balance of speed and accuracy, which takes us to Object Detection and Recognition using the YOLO pre-trained model. We will have an overview of the YOLO model in the next session and then implement YOLO object detection on a single image.
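
YOLO reports many overlapping candidate boxes per object, so its raw output needs some post-processing before drawing. The sketch below assumes a YOLOv3-style row layout, `[center_x, center_y, width, height, objectness, class scores...]` with the first four values given as fractions of the image size. The description does not say which YOLO version the course uses, so treat the layout as an assumption. The suppression step here ignores class labels, which is a simplification.

```python
def yolo_row_to_box(row, width, height):
    """Convert one output row to (class_id, score, x, y, w, h) in pixels."""
    cx, cy = row[0] * width, row[1] * height
    bw, bh = row[2] * width, row[3] * height
    class_scores = list(row[5:])
    class_id = max(range(len(class_scores)), key=class_scores.__getitem__)
    x, y = int(cx - bw / 2), int(cy - bh / 2)  # top-left corner
    return class_id, class_scores[class_id], x, y, int(bw), int(bh)


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0


def non_max_suppression(detections, iou_threshold=0.4):
    """Keep the highest-scoring box among heavily overlapping ones."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(det[2:], k[2:]) < iou_threshold for k in kept):
            kept.append(det)
    return kept
```

OpenCV also provides `cv2.dnn.NMSBoxes` for the suppression step, which would normally be preferred over a hand-written loop.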


Using that as the base, we will try the YOLO model for object detection from a real-time webcam video and check the performance. Later we will use it for object recognition from a pre-saved video file.


To further improve the number of frames processed per second, we will use a model called Tiny YOLO, which is a lightweight version of the full YOLO model. We will first use Tiny YOLO on the pre-saved video and analyse the accuracy as well as the speed, and then we will try the same on real-time video from the webcam and see the difference in performance compared to the full YOLO model.
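
To put a number on the speed difference, one option is to time each detector over the same set of frames and compare frames per second. The sketch below uses only the standard library. `measure_fps` and `compare_speed` are hypothetical helpers, and the detector functions passed in are assumed to take one frame each.

```python
import time


def measure_fps(process_frame, frames):
    """Run process_frame over the frames and return frames per second."""
    start = time.perf_counter()
    count = 0
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def compare_speed(detectors, frames):
    """Map each detector name to its measured FPS on the same frames."""
    frames = list(frames)  # reuse identical input for a fair comparison
    return {name: measure_fps(fn, frames) for name, fn in detectors.items()}
```

A call such as `compare_speed({"yolo": run_yolo, "tiny": run_tiny_yolo}, frames)` would return one FPS figure per model, where `run_yolo` and `run_tiny_yolo` stand in for whatever per-frame functions are being compared.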


That's all about the topics currently included in this quick course. The code, images and libraries used in this course have been uploaded and shared in a folder. I will include the link to download them in the last session or the resource section of this course. You are free to use the code in your projects with no questions asked.


Also, after completing this course, you will be provided with a course completion certificate, which will add value to your portfolio.


So that's all for now. See you soon in the classroom. Happy learning and have a great time.

Content

Course Introduction and Table of Contents

Course Introduction and Table of Contents

Introduction to OCR Concepts and Libraries

Introduction to OCR Concepts and Libraries

Setting up Environment - Anaconda

Setting up Environment - Anaconda

Python Basics (Optional)

Python Basics - Part 1 - Assignment
Python Basics - Part 2 - Flow Control
Python Basics - Part 3 - Data Structures
Python Basics - Part 4 - Functions

Tesseract OCR Setup

Tesseract OCR Setup - Part 1
Tesseract OCR Setup - Part 2

OpenCV Setup

OpenCV Setup

Tesseract Image OCR Implementation

Tesseract Image OCR Implementation - Part 1
Tesseract Image OCR Implementation - Part 2

cv2.imshow() Not Responding Issue Fix

cv2.imshow() Not Responding Issue Fix

Introduction to CNN - Convolutional Neural Networks - Theory Session

Introduction to CNN - Convolutional Neural Networks - Theory Session

Installing Additional Dependencies for CNN

Installing Additional Dependencies for CNN

Introduction to VGGNet Architecture

Introduction to VGGNet Architecture

Image Recognition using Pre-Trained VGGNet16 Model

Image Recognition using Pre-Trained VGGNet16 Model - Part 1
Image Recognition using Pre-Trained VGGNet16 Model - Part 2
TensorFlow "Module Not Found" Error Fix (Optional) - Do ONLY if you have error

Image Recognition using Pre-Trained VGGNet19 Model

Image Recognition using Pre-Trained VGGNet19 Model

Image Recognition using Pre-Trained ResNet Model

Image Recognition using Pre-Trained ResNet Model

Image Recognition using Pre-Trained Inception Model

Image Recognition using Pre-Trained Inception Model

Image Recognition using Pre-Trained Xception Model

Image Recognition using Pre-Trained Xception Model

Introduction to MobileNet-SSD Pretrained Model

Introduction to MobileNet-SSD Pretrained Model

Mobilenet SSD Object Detection

Mobilenet SSD Object Detection - Part 1
Mobilenet SSD Object Detection - Part 2

Mobilenet SSD Realtime Video

Mobilenet SSD Realtime Video

Mobilenet SSD Pre-saved Video

Mobilenet SSD Pre-saved Video

Mask RCNN Pre-trained model Introduction

Mask RCNN Pre-trained model Introduction

MaskRCNN Bounding Box Implementation

MaskRCNN Bounding Box Implementation - Part 1
MaskRCNN Bounding Box Implementation - Part 2

MaskRCNN Object Mask Implementation

MaskRCNN Object Mask Implementation - Part 1
MaskRCNN Object Mask Implementation - Part 2

MaskRCNN Realtime Video

MaskRCNN Realtime Video - Part 1
MaskRCNN Realtime Video - Part 2

MaskRCNN Pre-saved Video

MaskRCNN Pre-saved Video

YOLO Pre-trained Model Introduction

YOLO Pre-trained Model Introduction

YOLO Implementation

YOLO Implementation - Part 1
YOLO Implementation - Part 2

YOLO Real-time Video

YOLO Real-time Video

YOLO Pre-saved Video

YOLO Pre-saved Video

Tiny YOLO Pre-saved Video

Tiny YOLO Pre-saved Video

Tiny YOLO Real-time Video

Tiny YOLO Real-time Video

SOURCE CODE AND FILES ATTACHED

SOURCE CODE AND FILES ATTACHED


Reviews

Xacobe
November 9, 2023
The course is generally well structured, with lot of practical examples. However, the OCR part is quite short, it does not go into detail, just showing how to configure Tesseract using Python.
Yazid
July 25, 2023
The course is overall good, but I thought we would be training a model to recognize an object, and we ended up only using pre-trained models. The course would have been so much better if he taught us the way of teaching the computer so that it can recognize some other stuff that we wanted it to recognize.
Johnbull
February 11, 2023
This is the 2nd course that I got from Abhilash, his mode of teaching is unique. I gained a lot from this course. Thanks, Abhilash.
Richard
November 23, 2022
He cuts through so much of the BS I've endured in other courses, is easy going and funny in his own way. If you learn best by example of application, this is 100% for you. It is how I best learn, and after taking several ML course over the same material, this is the only one that gets to the point with just enough explanation of what is going on under the hood; without hours of unnecessary math.
Alexander
July 18, 2022
The examples in the videos are way outdated; if you install the latest versions, most of the examples described in the videos will not work anymore, as so many things have changed with the updates. In between the videos there are several pieces of information about how the examples have to be changed to get them working. The author has just been too lazy to rework his examples with the latest versions. But the extensions between the videos with bug fixes for the errors of the latest versions just make it more complicated. In my case this was not successful. Also, his English is not easy to understand. Because of this: Not usable, not working, NOT HELPFUL, OUTDATED
Ahmad
June 6, 2022
A very good introduction to Optical Character Recognition, Image Recognition, Object Detection and Object Recognition. Thank you.
Simon
June 2, 2022
It's difficult to understand the lecturer. The subtitles help greatly, but then that takes my eyes off the presentation which may be showing me something important.
Thaika
December 28, 2021
I knew the concepts little and I wanted to know how to implement it in a code. I'm at section 7, So far it is unto my expectation. May be later part of the course I may give complete feedback.
Talal
October 26, 2021
Amazing Workshop. Speaker has knowledge, and speed of delivery is Excellent. Recommended this workshop to starters or to the beginners Level.
Steven
October 18, 2021
This course is good for getting you up and running. However, there are things in the code that could be improved. For instance, it seems the author reads in the model for every iteration of the loop. That is why some of his video work is very slow. Additionally, only tells you that using a GPU is faster. However, he doesn't go over how to implement it. Getting opencv to use your GPU is not a trivial task. The code examples are good, but it would be nice if there were more explanations about concepts and theories.
Jaswant
August 30, 2021
awesome course to understand basic of OCR AND DETECTION OF IMAGE, I will suggest to follow this course
Pramod
July 3, 2021
Every topic was very well explained audio-visually with relevant details. I would highly recommend this course to all those who are thinking of object detection & work on Yolo modal. Thank you Mr Abhilash Sir for this exceptionally informative course which includes a lot of knowledge regarding object detection
Tushar
June 24, 2021
Was expecting a more diverse session covering more real life scenario problems(even the introduction would be fine). Rather the whole lecture was covering the same functionality again and again just using different preloaded configs.
Vit
November 9, 2020
Well cooked course! Explanations are clear. Each line of code is commented. THANK YOU ABHILASH!!!!!!!!!!!

Coupons

Date        Discount    Status
5/13/2020   100% OFF    expired

Charts

[Charts not reproduced: Price, Rating, Enrollment distribution]

Udemy ID: 2983186
Course created date: 4/10/2020
Course indexed date: 5/13/2020
Course submitted by: Lee Jia Cheng