Building Big Data Pipelines with PySpark + MongoDB + Bokeh

Build intelligent data pipelines with big data processing and machine learning technologies

Rating: 3.85 (67 reviews)
Platform: Udemy
Language: English
Category: Data Science
Students: 2,407
Content: 5 hours
Last update: Feb 2020
Regular price: $54.99

What you will learn

PySpark Programming

Data Analysis

Python and Bokeh

Data Transformation and Manipulation

Data Visualization

Big Data Machine Learning

Geo Mapping

Geospatial Machine Learning

Creating Dashboards

Why take this course?

Welcome to the Building Big Data Pipelines with PySpark & MongoDB & Bokeh course. In this course we will build an intelligent data pipeline using big data technologies such as Apache Spark and MongoDB.


We will build an ETLP pipeline, where ETLP stands for Extract, Transform, Load and Predict. These are the stages that our data has to go through in order to become useful. Once the data has gone through this pipeline, we will be able to use it to build reports and dashboards for data analysis.
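The four ETLP stages can be sketched in plain Python as a rough illustration. This is not the course's code: the field names (`depth_km`, `magnitude`) and the sample rows are hypothetical, a dict stands in for a MongoDB collection, and a hand-written least-squares fit stands in for a Spark MLlib model.

```python
import csv
import io

# Hypothetical sample data; row 3 is deliberately incomplete.
RAW_CSV = """id,depth_km,magnitude
1,10,4.0
2,20,5.0
3,,4.5
4,30,6.0
"""


def extract(text):
    """Extract: read raw rows from a CSV source (here an in-memory string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop incomplete rows and cast fields to numeric types."""
    cleaned = []
    for row in rows:
        if not row["depth_km"] or not row["magnitude"]:
            continue  # skip rows with missing values
        cleaned.append({
            "id": int(row["id"]),
            "depth_km": float(row["depth_km"]),
            "magnitude": float(row["magnitude"]),
        })
    return cleaned


def load(records, store):
    """Load: write records to a store keyed by id (a dict stands in for a database)."""
    for record in records:
        store[record["id"]] = dict(record)
    return store


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(store):
    """Predict: fit a simple model on stored records and attach a prediction to each."""
    records = list(store.values())
    slope, intercept = fit_line(
        [r["depth_km"] for r in records],
        [r["magnitude"] for r in records],
    )
    for record in records:
        record["predicted_magnitude"] = slope * record["depth_km"] + intercept
    return slope, intercept


def run_pipeline(text):
    """Run all four stages in order and return the populated store."""
    store = {}
    load(transform(extract(text)), store)
    predict(store)
    return store


if __name__ == "__main__":
    for record in run_pipeline(RAW_CSV).values():
        print(record)
```

Keeping each stage as its own function mirrors the idea that data moves through the pipeline one step at a time; in a real setup each function body would presumably be swapped for the corresponding PySpark, MongoDB or MLlib calls.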


The data pipeline that we will build will consist of data processing using PySpark, predictive modelling using Spark’s MLlib machine learning library, and data analysis using MongoDB and Bokeh.


  • You will learn how to create data processing pipelines using PySpark
  • You will learn machine learning with geospatial data using the Spark MLlib library
  • You will learn data analysis using PySpark, MongoDB and Bokeh inside a Jupyter notebook
  • You will learn how to manipulate, clean and transform data using PySpark DataFrames
  • You will learn basic geo mapping
  • You will learn how to create dashboards
  • You will also learn how to create a lightweight server to serve Bokeh dashboards


Screenshots

[Four course screenshots: Screenshot_01 to Screenshot_04]

Reviews

Evgenia
July 8, 2023
I am disappointed because it's outdated in terms of the code, it takes too much time to adjust the code to the relevant syntax and it's not worth it. There are also very minimal explanations. I had to research every step myself to understand why it's done and then spend hours adjusting the code to run.
Marcos
October 6, 2022
The course is fine, but support from the instructor is scarce. I had problems running the scripts in sections 22, 23 and 24.
Harry
March 26, 2021
Really enjoy the straightforward style, so we can actually code and get some repetitions using functions rather than focusing on underlying concepts, which can be studied outside the course.
Rez
September 23, 2020
The material covered is not sufficient to make a learner comfortable with Spark and the Apache ecosystem. I didn't learn enough by following this course.

Charts

[Charts omitted: Price, Rating, Enrollment distribution]
Udemy ID: 2806989
Course created date: 2/10/2020
Course indexed date: 2/25/2020
Course submitted by: Bot