4.28 (98 reviews)
☑ Build your first data pipeline to process CSV, JSON, XML
☑ Orchestrate a data pipeline with Azure Data Factory
☑ Spin up a Spark cluster
☑ Delta tables
☑ Concept of time travel and vacuum on Delta tables
☑ Apache Spark SQL
☑ Filtering DataFrames
☑ Rename, Drop, Select, Cast
☑ Aggregation operations SUM, AVERAGE, MAX, MIN
☑ Rank, Row Number, Dense Rank
☑ Building dashboards
☑ Build a complete project
☑ Build an end-to-end data pipeline
Welcome to the course Mastering Databricks & Apache Spark - Build ETL Data Pipeline
Databricks combines the best of data warehouses and data lakes into a lakehouse architecture. In this course we will learn how to perform various operations in Scala, Python and Spark SQL. This will help every student build solutions that create value and develop the mindset to build batch processes in any of these languages. Because the course shows how to write the same commands in each language, you can adopt whichever one your client needs and still deliver a world-class solution. We will be building an end-to-end solution in Azure Databricks.
Key Learning Points
We will build our own cluster to process our data, and with a one-click operation we will load data from different sources into Azure SQL and Delta tables
After that we will use Databricks notebooks to prepare a dashboard that answers business questions
Based on the needs, we will deploy infrastructure on the Azure cloud
These scenarios will give students 360-degree exposure to the cloud platform and to how to set up various resources
All activities are performed in Azure Databricks
Concept of versions and vacuum on Delta tables
Apache Spark SQL
Rename, Drop, Select, Cast
Aggregation operations SUM, AVERAGE, MAX, MIN
Rank, Row Number, Dense Rank
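As a rough illustration of how the three ranking functions listed above differ when values tie, here is a plain-Python sketch. It is not Spark code and is not taken from the course material; the helper name `rank_rows` and the sample names and scores are hypothetical. It orders rows from highest to lowest score and assigns a row number (always unique), a rank (ties share a value and leave a gap afterwards) and a dense rank (ties share a value with no gap).

```python
def rank_rows(scores):
    """Return (name, score, row_number, rank, dense_rank) tuples, highest score first."""
    # sorted() is stable, so tied rows keep their original relative order.
    ordered = sorted(scores, key=lambda pair: pair[1], reverse=True)
    results = []
    rank = 0
    dense_rank = 0
    previous_score = None
    for row_number, (name, score) in enumerate(ordered, start=1):
        if score != previous_score:
            rank = row_number   # rank jumps to the current position after a tie
            dense_rank += 1     # dense rank only ever increases by one
            previous_score = score
        results.append((name, score, row_number, rank, dense_rank))
    return results


# Hypothetical sample data: two people tie on the top score.
sample = [("Asha", 90), ("Ben", 85), ("Chen", 90), ("Dee", 70)]
for row in rank_rows(sample):
    print(row)
```

In this sample the two tied rows get row numbers 1 and 2 but share rank 1 and dense rank 1; the next row then gets rank 3 but dense rank 2.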
This course is suitable for data engineers, BI architects, data analysts, ETL developers and BI managers
Getting Started with Databricks
What is Databricks
Create Azure Account
Setting up databricks environment
Understanding Distributed Processing
How to create cluster
Create table or dataframe by uploading data
Extraction of Data
Extraction of data from Azure account
Adding Schema to data files
Transformation of Data
Scala - Filtering Dataframe
Scala - Common Operations
Scala - Aggregation commands
Scala - Rank, Row Number, Dense Rank
Python - Filtering Dataframe
Python - Common Operations
Python - Aggregation commands
Python - Rank, Row Number, Dense Rank
Spark SQL - Common Operations
Spark SQL - Aggregation Commands
Spark SQL - Rank, Row Number, Dense Rank
Spark SQL - Global View
Spark SQL - Temp View
Scala - Joins
Python - Joins
Spark SQL - Joins
Processing XML, JSON, Delta tables
Processing Nested XML file
Processing Nested JSON file
Delta Table - Time Travel and Vacuum
Loading data and building ETL data pipeline with dashboard
Spinning up Azure SQL
Project building and mounting of containers
Reading XML, JSON, CSV and loading to Delta tables & Azure SQL
Move files from one container to another
Azure Data Factory to orchestrate
Thank you for this fundamental yet detailed course on Databricks. This was my introduction to this software, and the content included was totally appropriate and helpful for my basic understanding of it. This course depicts the capabilities of this tool in a smooth way; the interactive notebooks and workspaces and the highly optimized processing of data definitely motivate me to explore more. The step-by-step demonstration of setting up the resources, databases and containers, processing data using all the supported languages, loading onto Delta tables and Azure SQL, and developing a simple dashboard gives a clear overview of the lifecycle of data analytics project development using Databricks. This course was perfectly paced for a fresher like me. Looking forward to many such courses from the instructor. It was a great learning experience.
What a great course for starting my Databricks journey. Kudos to the instructor for making such a great course on Databricks. The course is structured well for better understanding and is simple to follow, with clear audio and good resolution. Lectures are very detailed and concepts are well explained. The time spent on each module was worth it. The best part was delving into the project; it's great to see how data can be used to draw meaningful insights from the dashboard we built. I've improved my knowledge of not only Databricks but also Apache Spark SQL and building dashboards. I really recommend this course for everyone!
I am really excited to use this skill to make an impact in my future projects. A simple dataset always helps me understand the concept much better. Simple and precise are the two words that I have for this course. I really don't like lengthy courses that run 10 or 15 hours. This course meets all the criteria that I have and is totally worth the money. I like the passion that the author has for teaching. This course does talk about building something end to end.
Excellent knowledge sharing. It really helped me implement it in my project. I can really build an ETL pipeline that can run in production.
This is a very good course that dives into building an end-to-end data pipeline in the simplest way, and I really love the part where the author builds dashboards. Looking forward to more courses.
This course can be a lot better. The instructor at times presented the materials poorly and sometimes it felt very robotic. The content can be enriched with more useful stuff. I didn't have much trouble because I have some Databricks experience and this was more like a refresher course for me. Overall it's not bad but it definitely can be a lot better...
Up to this point (50% of the course) I'm very interested and happy with my choice, because the teacher has good experience and adds the information step by step.
Impossible to see what's going on. Explanations seem to be missing entirely or very poor. E.g., magic commands don't get explained. Overall, I'm feeling very disconnected and disappointed.
I'm transitioning into a tech career and have found this course helpful. Part of me wants to jump ahead to the other courses in the specialization, but I have been learning more about the basics, which is good. Good pace in explaining different topics and nice choice of colors while presenting. I would encourage the author to create more advanced materials for us. I really like the part about processing the same dataset in different languages. Keep teaching.
What a fantastic course! The content is very well organized, and the instructor makes it all easy to understand. The project we develop along the course is really helpful and gives us a good knowledge of Databricks. Totally recommend!
The course is sensational! Easy to understand and follow. It has helped me a lot to expand my knowledge.
The instructor did an excellent job with this course. He has prepared excellent study material and presents the information in a very clear manner. Great value for money; this course is an easy 5-star rating.
This course provides in-depth knowledge of the concepts. I feel the modules are divided perfectly so that you don't get confused. I like the way Priyank has stated everything; it is easy to listen to and understand.
Amazing course! It has an excellent instructor with a clear accent. I've improved my knowledge of not only Databricks and Apache Spark, but also Scala, SQL and Python!
I bought multiple courses on Databricks and this is the best so far. It covers different aspects of the platform, architecting databases, and ADF.