Udemy

Platform

English

Language

Data Science

Category

Basic Statistics & Regression for Machine Learning in Python

A quick and easy guide on statistical regression for machine learning

4.50 (6 reviews)

Students

5 hours

Content

Apr 2021

Last Update
Regular Price


What you will learn

Python Basics, Statistics and Regression behind Machine Learning in Python and also using Manual Calculations


Description

Hello and welcome to my new course, Basic Statistics and Regression for Machine Learning.


You know, there are mainly two kinds of ML enthusiasts.


The first kind fantasize about Machine Learning and Artificial Intelligence, thinking that it's a magical voodoo thing. Even if they are into coding, they will just import the library, use the class and its functions, and rely on the function to do the magic in the background.


The second kind are curious people. They want to learn what's actually happening behind the scenes of these classes and functions. Even though they don't want to go deep into all the mathematical complexities, they are still interested in what's going on behind the scenes, at least from a layman's perspective.


In this course, we focus mainly on the second kind of learner.


That's why this is a special kind of course. Here we discuss the basics of Machine Learning and the mathematics of statistical regression, which powers many Machine Learning algorithms.


We will work through the regression exercises using plain manual mathematical calculations and then compare the results with the ones we get from ready-made Python functions.


Here is the list of contents that are included in this course.


In the first session, we will set up your computer for the basic machine learning exercises in Python. We will install Anaconda, a Python distribution, and then discuss the components included in it. For the manual method, a spreadsheet program like MS Excel is enough.


Before we proceed, for those who are new to Python, we have included a few sessions covering the very basics of the Python programming language. We will learn assignment, flow control, lists and tuples, dictionaries and functions. We will also take a quick look at NumPy, a Python library for array and matrix calculations that is very useful for machine learning, and get an overview of Matplotlib, a Python plotting library used for drawing graphs.
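
As a rough illustration of the Python basics listed above, here is a minimal sketch that touches assignment, lists, tuples, dictionaries, functions and flow control. The variable and function names are invented for this example and are not taken from the course material.

```python
# Assignment: bind names to values.
course_hours = 5
topics = ["mean", "median", "mode"]   # list: ordered and mutable
rating = (4.5, 6)                     # tuple: ordered and immutable
lengths = {}                          # dictionary: key -> value pairs


def describe(word):
    """Function: return a short label for a word based on its length."""
    # Flow control: if / else picks one branch.
    if len(word) > 4:
        return "long"
    return "short"


# Flow control: a for loop visits each list item in turn.
for topic in topics:
    lengths[topic] = describe(topic)

topics.append("variance")             # lists can grow after creation
```

After this runs, `lengths` maps each of the first three topic names to a label, and `topics` holds four items.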


In the third session, we will discuss the basics of machine learning and different types of data.


In the next session, we will learn a statistics technique called central tendency analysis, which finds a single central value that best describes a set of data. In statistics, the three common measures of central tendency are the mean, median, and mode. We will find the mean, median, and mode using both manual calculation and Python functions.
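
To make the comparison between manual calculation and Python functions concrete, here is a small sketch. The sample values are hypothetical, and the standard library `statistics` module is used as the ready-made reference; the course itself may use different functions.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [4, 8, 6, 5, 3, 8, 9, 8, 7]

# Mean: the sum of the values divided by how many there are.
manual_mean = sum(values) / len(values)

# Median: the middle value of the sorted data
# (or the average of the two middle values when the count is even).
ordered = sorted(values)
n = len(ordered)
middle = n // 2
if n % 2 == 1:
    manual_median = ordered[middle]
else:
    manual_median = (ordered[middle - 1] + ordered[middle]) / 2

# Mode: the value that occurs most often.
counts = {}
for v in values:
    counts[v] = counts.get(v, 0) + 1
manual_mode = max(counts, key=counts.get)

# The same three measures from the standard library, for comparison.
library_mean = statistics.mean(values)
library_median = statistics.median(values)
library_mode = statistics.mode(values)
```

The manual and library results should agree for all three measures.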


After that, we will look at variance and standard deviation. The variance of a dataset measures how far the numbers are spread out from their average or central value. The standard deviation is the square root of the variance, so it expresses that spread in the same units as the data. We will first calculate the variance and standard deviation manually using plain mathematical calculations. After that, we will write a Python program to find both values for the same dataset and verify the results.
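
The following sketch shows one way the manual steps and the library call could line up. The data is hypothetical. It computes the population variance (dividing by n), which is what `np.var` and `np.std` return by default; a sample variance would divide by n - 1 instead.

```python
import math

import numpy as np

# Hypothetical data, chosen only for illustration.
data = [32, 111, 138, 28, 59, 77, 97]

# Manual calculation, step by step.
mean = sum(data) / len(data)
squared_deviations = [(x - mean) ** 2 for x in data]
manual_variance = sum(squared_deviations) / len(data)   # population variance
manual_std = math.sqrt(manual_variance)                 # square root of variance

# The same quantities from NumPy (ddof=0 by default, i.e. divide by n).
numpy_variance = np.var(data)
numpy_std = np.std(data)
```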


Then comes a simple yet very useful technique called the percentile. In statistics, a percentile is a score below which a given percentage of the scores in a distribution fall. For easy understanding, we will first calculate a percentile manually from a raw dataset, and later do it with the help of Python functions. We will then double-check the results.
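
Several conventions exist for computing a percentile from raw data. The sketch below assumes linear interpolation between neighbouring sorted values, which is NumPy's default method, so the manual result can be checked against `np.percentile`. The data and the helper name `manual_percentile` are hypothetical, and the course may use a different manual convention.

```python
import numpy as np

# Hypothetical data, chosen only for illustration.
ages = [12, 15, 18, 21, 24, 30, 35, 41, 47, 52, 60]


def manual_percentile(values, p):
    """Percentile p (0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * p / 100    # fractional index into the sorted list
    lower = int(position)                      # index just below the position
    upper = min(lower + 1, len(ordered) - 1)   # index just above, clamped to the end
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


manual_p75 = manual_percentile(ages, 75)
numpy_p75 = np.percentile(ages, 75)
```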


After that, we will learn about distributions. A distribution describes the grouping or density of the samples in a dataset. We will look at two types: the normal distribution, where the probability of x is highest at the centre and lowest at the ends, and the uniform distribution, where the probability of x is constant. We will explore both distributions by visualizing data, doing the calculations manually and also in Python.
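
A quick way to see the difference between the two distributions is to draw random samples and count how many fall into equal-width bins. This is only a sketch: the sample size, seed, ranges and bin edges are arbitrary choices for this example, and the counts are computed with `np.histogram` rather than plotted.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # fixed seed so the run is repeatable

# Uniform distribution: every value between 0 and 5 is equally likely.
uniform_sample = rng.uniform(0.0, 5.0, size=10_000)

# Normal distribution: values cluster around the mean (5.0) with spread 1.0.
normal_sample = rng.normal(loc=5.0, scale=1.0, size=10_000)

# Count samples in five equal-width bins for each distribution.
uniform_counts, _ = np.histogram(uniform_sample, bins=5, range=(0.0, 5.0))
normal_counts, _ = np.histogram(normal_sample, bins=5, range=(2.5, 7.5))
```

The uniform counts come out roughly equal across the bins, while the normal counts peak in the central bin and fall off towards both ends.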


There is a value in statistics called the z score, or standard score, which tells us where a value lies in the distribution. For the z score, we will first do the calculation using Python functions. Later, we will calculate the z score manually and compare the results.
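
The z score of a value is its distance from the mean, measured in standard deviations: z = (x - mean) / standard deviation. The sketch below uses hypothetical scores and the population standard deviation, and compares a step-by-step calculation for one value with a vectorized NumPy calculation for all values. SciPy also offers a ready-made `scipy.stats.zscore` function, which is not used here to keep the example small.

```python
import math

import numpy as np

# Hypothetical scores, chosen only for illustration.
scores = [62.0, 70.0, 75.0, 81.0, 88.0, 94.0]

# Manual z score for a single value.
value = 88.0
mean = sum(scores) / len(scores)
std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
manual_z = (value - mean) / std

# Vectorized z scores for the whole dataset with NumPy.
array = np.array(scores)
z_scores = (array - array.mean()) / array.std()
```

A positive z score means the value lies above the mean, a negative one below it.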


Those were all cases of a single-variable dataset, that is, a dataset containing only a single column of data. For a multi-variable dataset, we have to calculate the regression, or the relation, between the columns of data. We will first visualize the data and analyse its form and structure using a scatter plot.


Then, as the first type of regression analysis, we will start with an introduction to simple linear regression. We will first find the coefficient of correlation by manual calculation and store the results. After that, we will find the slope equation (the equation of the regression line) using the obtained results. Then, using the slope equation, we will predict future values. This prediction is the basic and important feature of all Machine Learning systems: we give the input variable and the system predicts the value of the output variable.
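
The manual steps described above can be sketched in plain Python using the standard least-squares formulas. The data is hypothetical, and the names `s_xy`, `s_xx`, `s_yy` and `predict` are invented for this example; the course may lay out the intermediate sums differently.

```python
import math

# Hypothetical data: x is the input variable, y is the output variable.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of products of deviations from the means.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)

# Coefficient of correlation (between -1 and 1).
r = s_xy / math.sqrt(s_xx * s_yy)

# Slope and intercept of the regression line y = slope * x + intercept.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict(new_x):
    """Predict y for a new x using the fitted line."""
    return slope * new_x + intercept


predicted_y = predict(6)
```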


Then we will repeat all of this using the methods of the Python NumPy library, predict future values, and compare the results. We will also discuss the scenarios that we can consider a strong or a weak linear regression.
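
As one possible NumPy version, the sketch below fits a straight line with `np.polyfit`, reads the correlation coefficient from `np.corrcoef`, and predicts a new value with `np.polyval`. The data is hypothetical and the course may use other NumPy or SciPy functions. A correlation coefficient close to 1 or -1 indicates a strong linear relation, and one close to 0 a weak one.

```python
import numpy as np

# Hypothetical data: x is the input variable, y is the output variable.
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Degree-1 polynomial fit returns [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)

# Correlation coefficient: off-diagonal entry of the 2x2 correlation matrix.
r = np.corrcoef(x, y)[0, 1]

# Predict the output for a new input value.
predicted_y = np.polyval([slope, intercept], 6.0)
```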


Then we will see another regression analysis technique called polynomial linear regression, which is better suited when the relation between the independent variable x and the dependent variable y follows a curve rather than a straight line.


For simple linear regression, the regression line in the graph will be a straight line with a slope; for polynomial linear regression, it will be a curve.


In the coming sessions, we will have a brief introduction to polynomial linear regression and visualize the modified dataset with its x and y values. Using Python, we will then find the polynomial regression coefficient values and the r2 value, and predict future values using the NumPy library.
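
Here is a minimal sketch of a polynomial fit with NumPy, assuming a second-degree polynomial y = a*x^2 + b*x + c. The data is hypothetical, and the r2 value is computed directly from its definition (1 minus the ratio of the residual sum of squares to the total sum of squares) rather than with any particular library helper.

```python
import numpy as np

# Hypothetical data that follows a curve rather than a straight line.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([3.2, 6.9, 13.1, 20.8, 31.3, 42.9, 57.2, 72.8])

# Degree-2 fit returns the coefficients [a, b, c], highest power first.
a, b, c = np.polyfit(x, y, 2)
model = np.poly1d([a, b, c])

# r2: how much of the variation in y the fitted curve explains.
fitted = model(x)
ss_res = np.sum((y - fitted) ** 2)       # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
r_squared = 1 - ss_res / ss_tot

# Predict the output for a new input value.
prediction = model(9.0)
```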


Then we will repeat the same using the plain old manual calculation method. First we have to manually find the standard deviation (SD) components. Then we will substitute these SD components into the equations to find the a, b and c values. Using these a, b and c values, we will find the final polynomial regression equation, which will enable us to manually predict future values.
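
The course's exact manual formulas are not shown here, so the following is only one common manual route, not necessarily the one used in the lectures. It assumes a second-degree polynomial y = a*x^2 + b*x + c, builds the sums that appear in the least-squares normal equations, and solves the resulting 3x3 system for a, b and c. The data is hypothetical.

```python
import numpy as np

# Hypothetical data that follows a curve.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3.2, 6.9, 13.1, 20.8, 31.3, 42.9, 57.2, 72.8]

n = len(x)

# Sums that a spreadsheet-style manual calculation would tabulate.
sum_x = sum(x)
sum_x2 = sum(v ** 2 for v in x)
sum_x3 = sum(v ** 3 for v in x)
sum_x4 = sum(v ** 4 for v in x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))

# Normal equations for y = a*x^2 + b*x + c:
#   sum_x2y = a*sum_x4 + b*sum_x3 + c*sum_x2
#   sum_xy  = a*sum_x3 + b*sum_x2 + c*sum_x
#   sum_y   = a*sum_x2 + b*sum_x  + c*n
lhs = np.array([[sum_x4, sum_x3, sum_x2],
                [sum_x3, sum_x2, sum_x],
                [sum_x2, sum_x, n]], dtype=float)
rhs = np.array([sum_x2y, sum_xy, sum_y], dtype=float)
a, b, c = np.linalg.solve(lhs, rhs)


def predict(new_x):
    """Predict y for a new x using the fitted polynomial."""
    return a * new_x ** 2 + b * new_x + c


prediction = predict(9)
```

The coefficients obtained this way should match those from a library fit of the same degree on the same data.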


After that comes multiple regression. In this type of regression, we can consider multiple independent x variables and one dependent y variable. We will have an introduction to this type of regression and make the necessary changes to our dataset to match the multiple regression requirements.


Since our dataset is getting more complex with the introduction of multiple independent variable columns, it can no longer be managed conveniently with a plain array. We will use a CSV (comma-separated values) file to save the dataset. We will have an exercise to read data from a CSV file and store it in data frames. Once the data is imported into our Python program, we will visualize it using a new library called seaborn, which is built on top of Matplotlib.
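
To show what reading a CSV dataset into named columns involves, here is a small sketch using only the Python standard library. The CSV content and the column names `x1`, `x2` and `y` are hypothetical, and the text is read from an in-memory string so the example runs without a file on disk. For real data frames, `pandas.read_csv` is the usual tool; it is not used here only to keep the example dependency-free.

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from a file,
# for example via open("data.csv", newline="").
csv_text = """x1,x2,y
1,2,13
2,1,12
3,4,23
"""

reader = csv.DictReader(io.StringIO(csv_text))

# One list per column, keyed by the header names in the first row.
columns = {name: [] for name in reader.fieldnames}
for row in reader:
    for name, text in row.items():
        columns[name].append(float(text))   # CSV values arrive as strings
```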


Using the Python NumPy and scikit-learn libraries, multiple regression can be done very easily: just call the method and pass in the required parameters, and the library does the rest. We will create the regression object and then use it to predict future values.
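
The sketch below fits a multiple regression with two independent variables. Instead of a scikit-learn regression object, it swaps in NumPy's least-squares solver `np.linalg.lstsq` to keep the example small; both approaches fit an ordinary least-squares model. The data is hypothetical and deliberately built from y = 2*x1 + 3*x2 + 5 so the expected coefficients are known in advance.

```python
import numpy as np

# Hypothetical data with two independent variables.
x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
x2 = np.array([2, 1, 4, 3, 6, 5], dtype=float)
y = 2 * x1 + 3 * x2 + 5   # constructed so the true coefficients are 2, 3 and 5

# Design matrix: one column per variable plus a column of ones for the intercept.
design = np.column_stack([x1, x2, np.ones_like(x1)])

# Least-squares solution of design @ coefficients = y.
coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
b1, b2, intercept = coefficients

# Predict the output for new input values x1 = 7, x2 = 8.
prediction = b1 * 7 + b2 * 8 + intercept
```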


But with manual calculation, things start getting complex. It's a lengthy calculation that needs to be done in multiple steps. In the first step, we will introduce the equations that we are going to use in the manual method and find the mean values. In the second step, we will find the components required to find the a, b and c values. In the third step, we will find the a, b and c values themselves. In the final step, using the a, b and c values, we will find the multiple regression equation and use it to predict future values for our dataset. We will also try to get the value of the coefficient of regression.
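
The multi-step manual method can be sketched for the case of two independent variables using the standard closed-form least-squares formulas. This is an illustration under assumptions: the data is hypothetical (built from y = 2*x1 + 3*x2 + 5), and the names `intercept`, `b1` and `b2` are used here because how the course assigns the letters a, b and c to these quantities is not stated.

```python
# Hypothetical data with two independent variables.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [2 * u + 3 * v + 5 for u, v in zip(x1, x2)]

n = len(y)

# Step 1: mean values.
mean_x1 = sum(x1) / n
mean_x2 = sum(x2) / n
mean_y = sum(y) / n

# Step 2: components (sums of products of deviations from the means).
s11 = sum((u - mean_x1) ** 2 for u in x1)
s22 = sum((v - mean_x2) ** 2 for v in x2)
s12 = sum((u - mean_x1) * (v - mean_x2) for u, v in zip(x1, x2))
s1y = sum((u - mean_x1) * (w - mean_y) for u, w in zip(x1, y))
s2y = sum((v - mean_x2) * (w - mean_y) for v, w in zip(x2, y))

# Step 3: coefficients from the two-variable normal equations.
denominator = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / denominator
b2 = (s11 * s2y - s12 * s1y) / denominator
intercept = mean_y - b1 * mean_x1 - b2 * mean_x2

# Step 4: regression equation and prediction for x1 = 7, x2 = 8.
prediction = b1 * 7 + b2 * 8 + intercept
```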


That's all about the popular regression methods that are included in the course. Now we can go ahead with a very important topic in data preparation for machine learning. Many machine learning algorithms work better when the input values are scaled to a standard range. We will learn a technique called feature scaling (data normalization or standardization), in which the different ranges of data values are rescaled to a common scale, such as 0 to 1 for min-max normalization, or zero mean and unit standard deviation for standardization. This can noticeably improve the performance of many algorithms compared to a non-scaled dataset.


For normalization also, just like the regression examples, we will first try it using Python code, which makes the values very easy to generate. Later we will repeat it with plain old-school mathematical calculations.
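
As a rough sketch of feature scaling, the example below applies two common rescalings to one hypothetical column of values using plain NumPy arithmetic: min-max normalization, which maps the values into the range 0 to 1, and standardization, which shifts them to zero mean and unit standard deviation. Library helpers for scaling exist as well, but the arithmetic is written out here so it can be compared with a manual calculation.

```python
import numpy as np

# Hypothetical feature column with a wide range of values.
values = np.array([790.0, 1160.0, 929.0, 865.0, 1140.0])

# Min-max normalization: (x - min) / (max - min), giving values in [0, 1].
min_max_scaled = (values - values.min()) / (values.max() - values.min())

# Standardization: (x - mean) / standard deviation,
# giving a mean of 0 and a standard deviation of 1.
standardized = (values - values.mean()) / values.std()
```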


In the final session, we will discuss more resources that you can follow to go further from what we have learned.


That's all about the topics currently included in this quick course. The code, notepad and Jupyter notebook files used in this course have been uploaded and shared in a folder. I will include the link to download them in the last session or the resource section of this course. You are free to use the code in your projects with no questions asked.


Also after completing this course, you will be provided with a course completion certificate which will add value to your portfolio.

So that's all for now. See you soon in the classroom. Happy learning and have a great time.



Content

Course Introduction and Table of Contents

Course Introduction and Table of Contents

Environment Setup: Preparing your Computer

Environment Setup - Part 1

Environment Setup - Part 2

Essential Components Included in Anaconda

Essential Components Included in Anaconda

Python Basics - Assignment

Python Basics - Assignment

Python Basics - Flow Control

Python Basics - Flow Control - Part 1

Python Basics - Flow Control - Part 2

Python Basics - List and Tuples

Python Basics - List and Tuples

Python Basics - Dictionary and Functions

Python Basics - Dictionary and Functions - part 1

Python Basics - Dictionary and Functions - part 2

Numpy Basics

Numpy Basics - Part 1

Numpy Basics - Part 2

Matplotlib Basics

Matplotlib Basics - part 1

Matplotlib Basics - part 2

Basics of Data for Machine Learning

Basics of Data for Machine Learning

Central Data Tendency - Mean

Central Data Tendency - Mean

Central Data Tendency - Median and Mode

Central Data Tendency - Median and Mode - Part 1

Central Data Tendency - Median and Mode - Part 2

Variance and Standard Deviation Manual Calculation

Variance and Standard Deviation Manual Calculation - Part 1

Variance and Standard Deviation Manual Calculation - Part 2

Variance and Standard Deviation using Python

Variance and Standard Deviation using Python

Percentile Manual Calculation

Percentile Manual Calculation

Percentile using Python

Percentile using Python

Uniform Distribution

Uniform Distribution

Normal Distribution

Normal Distribution - Part 1

Normal Distribution - Part 2

Manual Z score calculation

Manual Z score calculation

Z score calculation using python

Z score calculation using python

Multi Variable Dataset Scatter Plot

Multi Variable Dataset Scatter Plot

Introduction to Linear Regression

Introduction to Linear Regression

Manually finding Linear Regression Correlation Coefficient

Manually finding Linear Regression Correlation Coefficient - Part 1

Manually finding Linear Regression Correlation Coefficient - Part 2

Manually finding Linear Regression Slope Equation

Manually finding Linear Regression Slope Equation - Part 1

Manually finding Linear Regression Slope Equation - Part 2

Manually Predicting the Future Value using Equation

Manually Predicting the Future Value using Equation

Linear Regression using Python Introduction

Linear Regression using Python Introduction

Linear Regression using Python

Linear Regression using Python - Part 1

Linear Regression using Python - Part 2

Strong and Weak Linear Regression

Strong and Weak Linear Regression

Predicting Future value using Linear Regression in Python

Predicting Future value using Linear Regression in Python

Polynomial Regression Introduction

Polynomial Regression Introduction

Polynomial Regression Visualization

Polynomial Regression Visualization

Polynomial Regression Prediction and R2 value

Polynomial Regression Prediction and R2 value

Polynomial Regression Finding SD Components

Polynomial Regression Finding SD Components

Polynomial Regression Manual Method Equations

Polynomial Regression Manual Method Equations

Finding SD components for abc

Finding SD components for abc

Finding abc

Finding abc

Polynomial Regression Equation and Prediction

Polynomial Regression Equation and Prediction

Polynomial Regression coefficient

Polynomial Regression coefficient

Multiple Regression Introduction

Multiple Regression Introduction

Multiple Regression using Python - Part 1 - Data Import as CSV

Multiple Regression using Python - Part 1 - Data Import as CSV

Multiple Regression using Python - Part 2 - Data Visualization

Multiple Regression using Python - Part 2 - Data Visualization

Creating Multiple Regression Object and Prediction using Python

Creating Multiple Regression Object and Prediction using Python

Manual Multiple Regression - Intro and Finding Means

Manual Multiple Regression - Intro and Finding Means

Manual Multiple Regression - Finding Components

Manual Multiple Regression - Finding Components - Part 1

Manual Multiple Regression - Finding Components - Part 2

Manual Multiple Regression - Finding a b c

Manual Multiple Regression - Finding a b c

Manual Multiple Regression Equation Prediction and Coefficients

Manual Multiple Regression Equation Prediction and Coefficients

Feature Scaling Introduction

Feature Scaling Introduction

Standardization Scaling using Python

Standardization Scaling using Python - Part 1

Standardization Scaling using Python - Part 2

Standardization Scaling using Manual Calculation

Standardization Scaling using Manual Calculation - Part 1

Standardization Scaling using Manual Calculation - Part 2

Further Learning References and Resource Download

Further Learning References and Resource Download

Download Source code, datasets and text files from here

Download Source code, datasets and text files from here


Reviews

Karthik, 4 June 2021

As I'm from a non-computer science background, the course is designed in such a way to easily understand the basic concepts, and Python is taught in a step-by-step process...


Coupons

Date       Discount  Status
4/28/2021  100% OFF  Expired

3677816

Udemy ID

12/3/2020

Course created date

4/28/2021

Course Indexed date
Bot
Course Submitted by