Data Science:Hands-on Diabetes Prediction with Pyspark MLlib

Diabetes Prediction using Machine Learning in Apache Spark

Rating: 4.50 (157 reviews)
Platform: Udemy
Language: English
Category: Data Science
Students: 11,846
Content: 1 hour
Last update: Sep 2020
Regular price: $19.99

What you will learn

Diabetes Prediction using Spark Machine Learning (Spark MLlib)

Learn PySpark fundamentals

Working with DataFrames in PySpark

Analyzing and cleaning data

Process data with a machine learning model in Spark MLlib

Build and train a logistic regression model

Performance evaluation and saving model
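
For readers new to the model named in the list above, here is a minimal plain-Python sketch of how an already trained logistic regression turns one row of feature values into a probability and a class label. It is not Spark MLlib code and not the course's implementation; the weights, bias, and feature values are made-up numbers chosen only for illustration.

```python
import math


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one row of features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict(features, weights, bias, threshold=0.5):
    """Return 1 when the probability reaches the threshold, else 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical model parameters and one hypothetical input row.
weights = [0.8, -0.4]
bias = -0.2
row = [1.5, 0.5]

print(predict_proba(row, weights, bias))
print(predict(row, weights, bias))
```

Training is the process of finding the weights and bias from labeled data; in this course that step is handled by Spark MLlib.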

Why take this course?

Would you like to build, train, test and evaluate a machine learning model that is able to detect diabetes using logistic regression?


This is a hands-on machine learning course in which you practice alongside the lectures. The dataset is provided during the lectures. For the best learning experience, we highly recommend that you code along.


You will learn more in this one hour of practice than in hundreds of hours of unnecessary theoretical lectures.


Learn the most important aspects of Spark machine learning (Spark MLlib):


  • PySpark fundamentals and implementing Spark machine learning

  • Importing and working with datasets

  • Processing data with a machine learning model in Spark MLlib

  • Build and train a logistic regression model

  • Test and analyze the model
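
As an illustration of what testing and analyzing a binary classifier can involve, the following plain-Python sketch computes a confusion matrix, accuracy, precision, and recall. It is an assumption about which metrics are useful here, not a description of the metrics the course reports, and the label and prediction lists are invented example data.

```python
def confusion_counts(labels, predictions):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, predictions):
        if y == 1 and p == 1:
            tp += 1
        elif y == 0 and p == 1:
            fp += 1
        elif y == 0 and p == 0:
            tn += 1
        elif y == 1 and p == 0:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all rows that were classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of true positives that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Invented example: 1 = diabetic, 0 = not diabetic.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(labels, predictions)
print("accuracy:", accuracy(tp, fp, tn, fn))
print("precision:", precision(tp, fp))
print("recall:", recall(tp, fn))
```

For a medical screening task, recall is often watched closely, because a false negative means a diabetic case was missed.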


The entire course has been divided into tasks. Each task has been very carefully created and designed to give you the best learning experience. In this hands-on project, we will complete the following tasks:


  • Task 1: Project overview

  • Task 2: Intro to the Colab environment & installing dependencies to run Spark on Colab

  • Task 3: Clone & explore the diabetes dataset

  • Task 4: Data Cleaning

  • Task 5: Correlation & feature selection

  • Task 6: Build and train Logistic Regression Model using Spark MLlib

  • Task 7: Performance evaluation & Test the model

  • Task 8: Save & load model
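
Tasks 3 to 8 above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the course's code: it assumes PySpark is already installed (for example with `pip install pyspark`), that the dataset is a CSV file with a header row and only numeric columns, and that dropping rows with missing values is an acceptable cleaning step. The names `csv_path`, `label_col`, `model_dir`, `min_abs_corr`, and both function names are hypothetical.

```python
def select_by_correlation(correlations, min_abs_corr=0.0):
    """Keep feature names whose |correlation with the label| meets the cutoff."""
    return [name for name, corr in correlations.items()
            if abs(corr) >= min_abs_corr]


def run_pipeline(csv_path, label_col, model_dir, min_abs_corr=0.0, seed=42):
    # Spark imports live inside the function so the helper above can be
    # used on a machine without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import (
        LogisticRegression,
        LogisticRegressionModel,
    )
    from pyspark.ml.evaluation import BinaryClassificationEvaluator

    spark = SparkSession.builder.appName("diabetes-sketch").getOrCreate()

    # Task 3: load the dataset (assumed: CSV with header, numeric columns).
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Task 4: simplest possible cleaning (assumption): drop rows with nulls.
    df = df.na.drop()

    # Task 5: Pearson correlation of each column with the label.
    candidates = [c for c in df.columns if c != label_col]
    correlations = {c: df.stat.corr(c, label_col) for c in candidates}
    features = select_by_correlation(correlations, min_abs_corr)

    # Task 6: assemble a feature vector, split, and train.
    assembler = VectorAssembler(inputCols=features, outputCol="features")
    data = assembler.transform(df).select("features", label_col)
    train, test = data.randomSplit([0.7, 0.3], seed=seed)
    model = LogisticRegression(
        featuresCol="features", labelCol=label_col
    ).fit(train)

    # Task 7: evaluate on the held-out split using area under the ROC curve.
    evaluator = BinaryClassificationEvaluator(
        labelCol=label_col, metricName="areaUnderROC"
    )
    auc = evaluator.evaluate(model.transform(test))

    # Task 8: save the model, then load it back.
    model.write().overwrite().save(model_dir)
    reloaded = LogisticRegressionModel.load(model_dir)
    return auc, reloaded
```

A call such as `run_pipeline("diabetes.csv", "Outcome", "lr_model")` would be one way to use it; the file name, the `Outcome` label column, and the output directory are placeholders and would need to match the dataset provided in the lectures.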


About PySpark:


PySpark brings together Apache Spark and Python. It is a tool used in big data analytics.

Apache Spark is an open-source cluster-computing framework built around speed, ease of use, and streaming analytics, whereas Python is a general-purpose, high-level programming language. Python provides a wide range of libraries and is widely used for machine learning and real-time streaming analytics.

In other words, PySpark is a Python API for Spark that lets you harness the simplicity of Python and the power of Apache Spark in order to tame big data. We will be using big data tools in this project.


Make a leap into data science with this Spark MLlib project and showcase your skills on your resume.


Click on the “ENROLL NOW” button and start learning.


Happy Learning.

Reviews

Aya_Hamdy_
January 11, 2021
I just think it would have been nice to explain a bit more the code and break it down a bit to see each step or each part of the step and its impact on the data. Thank you for the effort :)
Ayushman
July 26, 2020
It would've been great if you could've said a little more about pyspark sql queries. Would've also liked a little bit about how to properly filter for feature selection on simple models like this. But loved how you made it look so simple!
Gokhan
July 25, 2020
Training is good, just more explanation for codes and algorithm would be better. explaining by flowcharts for each steps and equations for each analysis makes it perfect
Mayank
July 21, 2020
Nicely explained the library and methods used in the ML model training, and other concepts. Thanks for this good video and explanation.

Charts

Price chart, ratings chart, enrollment distribution chart
Udemy ID: 3304348
Course created date: 7/6/2020
Course indexed date: 7/17/2020
Course submitted by: Bot