Best Hands-on Big Data Practices with PySpark & Spark Tuning

Semi-Structured (JSON), Structured and Unstructured Data Analysis with Spark and Python & Spark Performance Tuning

Rating: 4.65 (744 reviews)
Platform: Udemy
Language: English
Category: Other
Instructor: Amin Karami
Students: 7,018
Content: 13 hours
Last update: Mar 2024
Regular price: $89.99

What you will learn

Understand Apache Spark’s framework, execution and programming model for the development of Big Data Systems

Learn step-by-step hands-on PySpark practices on structured, unstructured and semi-structured data using RDD, DataFrame and SQL

Learn how to set up and configure Spark on both a free cloud-based environment and a desktop computer

Build simple to advanced Big Data applications for different types of data (volume, variety, veracity) through real case studies

Investigate and apply optimization and performance tuning methods to manage data skew and prevent spill

Investigate and apply Adaptive Query Execution (AQE) to optimize Spark SQL query execution at runtime

Investigate and be able to explain lazy evaluation (narrow vs. wide transformations) and the internal workings of Spark

Build and learn Spark SQL applications using JDBC (Java Database Connectivity)
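
As an illustration of the lazy evaluation and narrow vs. wide transformation item above, here is a minimal sketch in plain Python. It does not use Spark; `LazyDataset` and its methods are hypothetical stand-ins that only mimic the idea: transformations record a plan, narrow steps work inside each partition, a wide step moves records between partitions, and nothing runs until an action is called.

```python
from collections import defaultdict
from functools import reduce


class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not the Spark API)."""

    def __init__(self, partitions, plan=()):
        self.partitions = partitions  # list of lists, one list per partition
        self.plan = plan              # recorded (kind, function) steps

    def map(self, fn):
        # Narrow: each output partition depends on exactly one input partition.
        return LazyDataset(self.partitions, self.plan + (("map", fn),))

    def filter(self, fn):
        # Narrow: also handled partition by partition.
        return LazyDataset(self.partitions, self.plan + (("filter", fn),))

    def reduce_by_key(self, fn):
        # Wide: records with the same key must meet in one partition (a shuffle).
        return LazyDataset(self.partitions, self.plan + (("reduce_by_key", fn),))

    def collect(self):
        # Action: only now is the recorded plan executed.
        parts = [list(p) for p in self.partitions]
        for kind, fn in self.plan:
            if kind == "map":
                parts = [[fn(x) for x in p] for p in parts]
            elif kind == "filter":
                parts = [[x for x in p if fn(x)] for p in parts]
            else:
                n = len(parts)
                buckets = [defaultdict(list) for _ in range(n)]
                for p in parts:
                    for key, value in p:
                        buckets[hash(key) % n][key].append(value)
                parts = [[(k, reduce(fn, vs)) for k, vs in b.items()]
                         for b in buckets]
        return [x for p in parts for x in p]


calls = []  # records every time the map function really runs


def to_pair(word):
    calls.append(word)
    return (word, 1)


dataset = LazyDataset([["a", "b", "a"], ["b", "a", "c"]])
counts = (dataset.map(to_pair)
                 .filter(lambda kv: kv[0] != "c")
                 .reduce_by_key(lambda x, y: x + y))

calls_before_action = len(calls)   # still 0: only a plan has been built
result = sorted(counts.collect())  # the action triggers execution
```

In real PySpark the corresponding calls would be `rdd.map`, `rdd.filter`, `rdd.reduceByKey`, and `rdd.collect`; the sketch only mirrors their behaviour at a very small scale.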

Why take this course?

In this course, students will be provided with hands-on PySpark practices using real case studies from academia and industry to be able to work interactively with massive data. In addition, students will consider distributed processing challenges, such as data skew and spill, within big data processing. We designed this course for anyone seeking to master Spark and PySpark and spread the knowledge of Big Data analytics using real and challenging use cases.
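
To make the data skew challenge concrete, the following is a rough sketch of key salting in plain Python rather than Spark. Everything in it is an assumption for illustration: the partition count, the number of salts, the toy `partition_for` partitioner, and the round-robin salt (in Spark a random salt is more typical). It shows one hot key overloading a single partition, and a two-stage aggregation over salted keys that spreads the load while producing the same totals.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 4  # assumed partition count
NUM_SALTS = 8       # assumed number of salt values


def partition_for(key, salt=0):
    """Toy deterministic partitioner over the composite (key, salt)."""
    return (zlib.crc32(key.encode()) + salt) % NUM_PARTITIONS


# Skewed input: one hot key with 900 records, 100 other keys with 1 each.
records = [("hot", 1)] * 900 + [(f"k{i}", 1) for i in range(100)]

# Without salting, every "hot" record lands in the same partition.
plain_load = Counter(partition_for(key) for key, _ in records)

# With salting, the hot key is split across NUM_SALTS composite keys.
salted = [(key, i % NUM_SALTS, value)
          for i, (key, value) in enumerate(records)]
salted_load = Counter(partition_for(key, salt) for key, salt, _ in salted)

# Stage 1: partial aggregation per (key, salt).
partial = Counter()
for key, salt, value in salted:
    partial[(key, salt)] += value

# Stage 2: drop the salt and combine the partial results per key.
totals = Counter()
for (key, _salt), value in partial.items():
    totals[key] += value

print("max partition load without salt:", max(plain_load.values()))
print("max partition load with salt:   ", max(salted_load.values()))
```

The trade-off is an extra aggregation stage in exchange for a more even distribution of work across partitions.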

We will work with Spark RDD, DF, and SQL to process huge volumes of semi-structured, structured, and unstructured data. The learning outcomes and the teaching approach in this course accelerate learning by identifying the most critical skills required in industry and by understanding the demands of Big Data analytics.

We will not only cover the details of the Spark engine for large-scale data processing but also drill down into big data problems, moving from an overview of large-scale data to a more detailed and granular view using RDD, DF, and SQL in real-life examples. We will walk through the Big Data case studies step by step to achieve the aim of this course.

By the end of the course, you will be able to build Big Data applications for different types of data (volume, variety, veracity) and you will get acquainted with best-in-class examples of Big Data problems using PySpark.

Reviews

Mabs
October 5, 2023
The course is to the point. The instructor makes it very easy to set up the system and makes complicated things very easy to understand.
Ali
July 30, 2023
Very informative course, explained briefly. I think the hands-on part for the salting technique was not clear to me. Apart from that, the rest of the course was clear and informative. Thanks.
Ting
July 14, 2023
The world needs more amazing instructors like Dr Amin Karami. So practical and industry relevant. He deserves a standing ovation.
Vignesan
July 6, 2023
This course really helps me a lot. Thank you, Amin, and please post a real-time streaming video (Kinesis, Kafka).
Anonymized
June 21, 2023
This is an excellent course on Spark. Learned a lot from your simple explanation of complex concepts. Thank you !!
Arnab
June 20, 2023
This is one of the best Spark courses you can find on Udemy. It is real-time and almost zero PowerPoint. Make sure you follow along and code with Amin: the best way to understand the concepts is by getting your hands dirty, not by PPT. This course surely does that. Conceptual explanations are on a whiteboard, again a better way to explain than PPT. Amin is one of the trainers on Udemy who is easily accessible. Very few trainers reply regularly in the Q&A section. Amin helped me a lot in setting up an environment along with his course. Very few people understand the environment as he does. A definite must for anyone who wants to learn Spark or anyone who already knows Spark. There is a lot of information in this course that you can otherwise only get after spending many hours with Spark.
Sanjit
June 19, 2023
This course really helps you get good hands-on experience working with various types of data, including structured, semi-structured, and unstructured data. Along with that, it also covers some optimization techniques which are really helpful. Thank you, Amin, for this course.
Vish
June 17, 2023
This was an amazing course about Spark. I have taken other Spark courses where they teach you the mechanics of using Spark and talk a bit about the architecture, and this course does a great job of diving deep into practical concepts that will get used in your day-to-day job. I also learnt some interesting optimization techniques that I did not know before.
Sumit
June 3, 2023
This is an Unbiased Review. This is by far the best course available on PySpark. I have considered courses available on both Udemy and YouTube. I am from India and as per top in-demand skills, Pyspark tops the list. I am halfway through the course and trust me each and every explanation is crystal clear. The best part is the support provided by the instructor and also the fact that he is not reading through the presentation but using a digital board to explain the concepts. Amin is very helpful. I would like to see more courses on the subject of BigData. I will write a detailed review after I complete the course.
Kondapalli
April 24, 2023
100% useful course so far on Udemy. I enrolled in a couple of courses related to PySpark, but none of them helped me in practical application. This course has the most practical explanation of optimizations using RDDs and DataFrames. It also went through the Spark UI, which covers task performance and jobs/stages/plans analysis. This is what 99% of courses are missing, but this one is the most practical and helpful to me. Now I am truly confident in attending data engineering interviews, all thanks to the instructor and Udemy. Most humble moment.
Apan
April 14, 2023
This course is above expectations. The teaching style is very good and gives you this chance to learn PySpark without difficulties.
LBP
April 11, 2023
Amazing course. Thanks a lot for this great teaching. I have not seen such a high-quality PySpark course anywhere else.
Kortly
April 10, 2023
I wrote a thankful message inside the course to express my feeling to the learners. This is a world-class course with a very knowledgeable teacher. I strongly recommend this to everyone.
Juno
April 10, 2023
I got advice from one of my friends to learn this course. This is a super fantastic course with a great combination of theory and practice that makes this course unique with lots of detailed discussions and practices that are not available anywhere else. Very hands-on and unique. I strongly recommend this course to everyone. Enjoy PySpark with this course.
Eugenio
April 5, 2023
Best PySpark course I have ever done. It blends theory and practice perfectly. It really helped me to understand why you have to do calculations like that while working with RDDs that I completely missed in other courses. The assignments are challenging and checked by the instructor. It took my theoretical and practical PySpark knowledge to the level that I needed.

Udemy ID: 4496750
Course created date: 1/15/2022
Course indexed date: 4/17/2022
Course submitted by: Bot