Data Engineering with Spark Databricks Delta Lake Lakehouse

Apache Spark Databricks Lakehouse Delta Lake Delta Tables Delta Caching Scala Python Data Engineering for beginners

Rating: 4.47 (99 reviews)
Platform: Udemy
Language: English
Category: Data Science
Students: 1,654
Content: 3.5 hours
Last update: Feb 2024
Regular price: $49.99

What you will learn

Acquiring the necessary skills to qualify for an entry-level Data Engineering position

Developing a practical comprehension of Data Lakehouse concepts through hands-on experience

Learning to operate a Delta table by accessing its version history, recovering data, and utilizing time travel functionality

Optimizing a delta table with various techniques like caching, partitioning, and z-ordering for faster analytics

Obtaining practical knowledge in constructing a data pipeline through the usage of Apache Spark on the Databricks platform

Performing analytics within a Databricks AWS account

Why take this course?

Data Engineering is a vital component of modern data-driven businesses. The ability to process, manage, and analyze large-scale data sets is a core requirement for organizations that want to stay competitive. In this course, you will learn how to build a data pipeline using Apache Spark on Databricks' Lakehouse architecture. This will give you practical experience in working with Spark and Lakehouse concepts, as well as the skills needed to excel as a Data Engineer in a real-world environment.


Throughout the Course, You Will Learn:

  • Conducting analytics using Python and Scala with Spark.

  • Applying Spark SQL and Databricks SQL for analytics.

  • Developing a data pipeline with Apache Spark.

  • Becoming proficient in Databricks Community Edition.

  • Managing a Delta table by accessing version history, restoring data, and utilizing time travel features.

  • Optimizing query performance using Delta Cache.

  • Working with Delta Tables and Databricks File System.

  • Gaining insights into real-world scenarios from experienced instructors.

Course Structure:

  • Beginning by familiarizing yourself with Databricks Community Edition and creating a basic pipeline using Spark.

  • Progressing to more complex topics after gaining comfort with the platform.

  • Learning analytics with Spark using Python and Scala, including Spark transformations, actions, joins, Spark SQL, and DataFrame APIs.

  • Acquiring the knowledge and skills to operate a Delta table, including accessing its version history, restoring data, and utilizing time travel functionality using Spark and Databricks SQL.

  • Understanding how to use Delta Cache to optimize query performance.
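The version history, time travel, and restore ideas listed above can be pictured with a small toy model. The sketch below is plain Python and is not the Delta Lake or Databricks API: the class `ToyVersionedTable` and its methods are hypothetical names invented for illustration. It assumes only that every write produces a new numbered snapshot, which is enough to read an older version or restore one. In Delta Lake itself the log is stored with the table data and these operations are exposed through Spark and SQL commands such as `DESCRIBE HISTORY` and `RESTORE`.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]


class ToyVersionedTable:
    """Hypothetical in-memory stand-in for a versioned table (not the Delta Lake API)."""

    def __init__(self) -> None:
        # Version 0 is the empty table; every later write adds a new snapshot.
        self._snapshots: List[List[Row]] = [[]]
        self._operations: List[str] = ["CREATE"]

    def _commit(self, rows: List[Row], operation: str) -> int:
        # Store a copy so later changes to caller data cannot alter history.
        self._snapshots.append([dict(row) for row in rows])
        self._operations.append(operation)
        return len(self._snapshots) - 1

    def read(self, version: Optional[int] = None) -> List[Row]:
        # "Time travel": read the latest snapshot, or an older one by number.
        latest = len(self._snapshots) - 1
        index = latest if version is None else version
        if not 0 <= index <= latest:
            raise ValueError(f"unknown version {version}")
        return [dict(row) for row in self._snapshots[index]]

    def append(self, rows: List[Row]) -> int:
        return self._commit(self.read() + list(rows), "APPEND")

    def delete_where(self, predicate: Callable[[Row], bool]) -> int:
        kept = [row for row in self.read() if not predicate(row)]
        return self._commit(kept, "DELETE")

    def restore(self, version: int) -> int:
        # Restoring writes the old snapshot as a NEW version; history is kept.
        return self._commit(self.read(version), "RESTORE")

    def history(self) -> List[Tuple[int, str]]:
        # Oldest first: (version number, operation that produced it).
        return list(enumerate(self._operations))


table = ToyVersionedTable()
table.append([{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lima"}])  # version 1
table.delete_where(lambda row: row["id"] == 2)                        # version 2
rows_before_delete = table.read(version=1)                            # time travel
table.restore(1)                                                      # version 3
```

In this sketch, restoring does not erase the accidental delete: version 2 stays in the history, and the restored data becomes version 3.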

Optional Lectures on AWS Integration:

  • 'Setting up Databricks Account on AWS' and 'Running Notebooks Within a Databricks AWS Account.'

  • Building an ETL pipeline with Delta Live Tables.

  • Providing additional opportunities to explore Databricks within the AWS ecosystem.


This course is designed for Data Engineering beginners; no prior knowledge of Python or Scala is required. However, some familiarity with databases and SQL is necessary to succeed in this course. Upon completion, you will have the skills and knowledge required to succeed in a real-world Data Engineer role.


Throughout the course, you will work with hands-on examples and real-world scenarios to apply the concepts you learn. By the end of the course, you will have the practical experience and skills required to understand Spark and Lakehouse concepts, and to build a scalable and reliable data pipeline using Apache Spark on Databricks' Lakehouse architecture.


Reviews

Rupak
November 11, 2023
As part of lakehouse I was also expecting to see how to create sql endpoints and how the applications can connect to Lakehouse

TRGalan
August 29, 2023
Very good and to-the-point review of Delta and Python-SQL; I've recommended to my team. Is code script notebook available for download to make this more relevant? Thanks.

Gene
August 27, 2023
Where is the instructor getting all of these long commands he's pasting into the screens? I can't find a place to cut/paste so it's wasting my time typing commands and validating against what I see on the screen. The content is good for the most part but capturing and issuing most of the commands takes a lot of time. The transcript for the course is OK but useless for commands and no external files containing them.

Bo
August 8, 2023
One of the top three courses on the subject on Udemy. Well presented, illustrated and explained. I will definitely look for more courses by this instructor. 5 stars PLUS!

Udemy ID: 5133314
Course created date: 2/3/2023
Course indexed date: 2/18/2023
Course submitted by: Bot