Master Big Data - Apache Spark/Hadoop/Sqoop/Hive/Flume/Mongo

In-depth course on Big Data - Apache Spark, Hadoop, Sqoop, Flume & Apache Hive, MongoDB & Big Data Cluster setup

Rating: 4.57 (1576 reviews)
Platform: Udemy
Language: English
Category: Other
Students: 13,144
Content: 11.5 hours
Last update: Feb 2024
Regular price: $74.99

What you will learn

  • Hadoop Distributed File System (HDFS) and its commands.

  • Lifecycle of a Sqoop command.

  • Sqoop import command to migrate data from MySQL to HDFS.

  • Sqoop import command to migrate data from MySQL to Hive.

  • Working with various file formats, compression codecs, file delimiters, where clauses, and queries while importing data.

  • Understand split-by and boundary queries.

  • Use incremental mode to migrate data from MySQL to HDFS.

  • Using Sqoop export, migrate data from HDFS to MySQL.

  • Using Sqoop export, migrate data from Hive to MySQL.

  • Understand Flume architecture.

  • Using Flume, ingest data from Twitter and save it to HDFS.

  • Using Flume, ingest data from netcat and save it to HDFS.

  • Using Flume, ingest data from an exec source and show it on the console.

  • Flume interceptors.

Why take this course?

In this course, you will start by learning what the Hadoop Distributed File System (HDFS) is and the most common Hadoop commands required to work with it.


Then you will be introduced to Sqoop import:

  • Understand the lifecycle of a Sqoop command.

  • Use the Sqoop import command to migrate data from MySQL to HDFS.

  • Use the Sqoop import command to migrate data from MySQL to Hive.

  • Use various file formats, compression codecs, file delimiters, where clauses, and queries while importing data.

  • Understand split-by and boundary queries.

  • Use incremental mode to migrate data from MySQL to HDFS.
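
To make the idea behind split-by and boundary queries more concrete, here is a minimal Python sketch. It is not Sqoop's actual code. It assumes a boundary query has already returned the minimum and maximum of an integer split column, and that the range is then divided roughly evenly among the parallel import tasks. The function name `split_ranges` and the sample values are hypothetical.

```python
def split_ranges(lo, hi, num_mappers):
    """Divide the inclusive integer range [lo, hi] into contiguous sub-ranges.

    lo and hi stand in for the result of a boundary query
    (the MIN and MAX of the split column); this is an assumption
    for illustration, not Sqoop's implementation.
    """
    if lo > hi or num_mappers < 1:
        raise ValueError("need lo <= hi and at least one mapper")
    total = hi - lo + 1
    # Never create more ranges than there are distinct values.
    n = min(num_mappers, total)
    base, extra = divmod(total, n)
    ranges = []
    start = lo
    for i in range(n):
        # The first `extra` ranges take one additional value each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    # Each tuple would correspond to one task's "col >= start AND col <= end" filter.
    for start, end in split_ranges(1, 100, 4):
        print(start, end)
```

A skewed split column (many rows clustered in one sub-range) would leave some tasks with far more work than others, which is one reason the choice of split column matters.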


Further, you will learn to use Sqoop export to migrate data:

  • What Sqoop export is.

  • Using Sqoop export, migrate data from HDFS to MySQL.

  • Using Sqoop export, migrate data from Hive to MySQL.



Further, you will learn about Apache Flume:

  • Understand Flume architecture.

  • Using Flume, ingest data from Twitter and save it to HDFS.

  • Using Flume, ingest data from netcat and save it to HDFS.

  • Using Flume, ingest data from an exec source and show it on the console.

  • Describe Flume interceptors and see examples of using interceptors.

  • Flume multiple agents.

  • Flume consolidation.


In the next section, we will learn about Apache Hive

  • Hive Intro

  • External & Managed Tables

  • Working with Different File Formats - Parquet, Avro

  • Compressions

  • Hive Analysis

  • Hive String Functions

  • Hive Date Functions

  • Partitioning

  • Bucketing
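
The difference between partitioning and bucketing can be sketched in a few lines of Python. This is an illustrative model only, not Hive code: it assumes partitioning groups rows by the value of a column (shown here as a `column=value` directory name) and bucketing assigns rows to a fixed number of buckets by hashing a column. The function name `layout`, the sample rows, and the use of `zlib.crc32` as the hash are assumptions; Hive uses its own hash function.

```python
import zlib
from collections import defaultdict


def layout(rows, partition_col, bucket_col, num_buckets):
    """Group rows by (partition directory, bucket number).

    zlib.crc32 is only a stand-in for a deterministic hash;
    it is not the hash function Hive uses.
    """
    files = defaultdict(list)
    for row in rows:
        # Partitioning: one directory per distinct column value.
        part_dir = f"{partition_col}={row[partition_col]}"
        # Bucketing: a fixed number of buckets chosen by hash modulo.
        key = str(row[bucket_col]).encode("utf-8")
        bucket = zlib.crc32(key) % num_buckets
        files[(part_dir, bucket)].append(row)
    return dict(files)


if __name__ == "__main__":
    sample = [
        {"country": "US", "user_id": 1},
        {"country": "US", "user_id": 2},
        {"country": "IN", "user_id": 1},
    ]
    for (part_dir, bucket), group in sorted(layout(sample, "country", "user_id", 4).items()):
        print(part_dir, bucket, group)
```

In this model the number of partitions grows with the number of distinct values, while the number of buckets stays fixed, which is the usual reason to bucket a high-cardinality column rather than partition by it.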


You will learn about Apache Spark

  • Spark Intro

  • Cluster Overview

  • RDD

  • DAG/Stages/Tasks

  • Actions & Transformations

  • Transformation & Action Examples

  • Spark DataFrames

  • Spark DataFrames - working with different File Formats & Compression

  • DataFrame APIs

  • Spark SQL

  • DataFrame Examples

  • Spark with Cassandra Integration

  • Running Spark on IntelliJ IDEA

  • Running Spark on EMR
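
The distinction between transformations and actions listed above can be illustrated without Spark itself. The following plain-Python sketch assumes the usual description of the model: transformations only record what should be done, and nothing is computed until an action is called. The class `LazyDataset` and its methods are hypothetical and are not the Spark API.

```python
class LazyDataset:
    """Toy model of lazy evaluation: transformations are recorded, actions run them."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def map(self, fn):
        # Transformation: returns a new dataset with one more recorded step.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: also recorded, not executed.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def collect(self):
        # Action: only now are the recorded steps applied to the data.
        out = []
        for item in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out


if __name__ == "__main__":
    ds = LazyDataset([1, 2, 3, 4]).map(lambda x: x * 2).filter(lambda x: x > 4)
    # No work has been done yet; collect() triggers it.
    print(ds.collect())
```

Because the steps are known before any data is touched, an engine built this way can plan the whole chain at once, which is the intuition behind the DAG, stages, and tasks topics in the list.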


Reviews

Anthony
September 25, 2023
Basically good. Sqoop is deprecated, which may not be obvious to a beginner and should be disclosed in the course description. I wasted a bunch of hours learning a package that is unlikely to be used in practice and that I can no longer even download and install.
Daniel
September 20, 2023
The course is very rich in practical examples of the tools used for data engineering. Plenty of command-line examples, great for those who are just starting out.
Amit
July 14, 2023
Unorganized course. Sample files are missing in too many lectures; for example, in lecture 78 I am unable to perform the hands-on exercise because no sample file is provided (this is just one example, there are many like it). The content needs more improvement so that people can learn easily.
Kumar
March 30, 2023
Many thanks, Navdeep, for the course. Your teaching is crystal clear. I connected a lot of dots in my understanding.
David
February 12, 2023
Just reached Section 3, and the link for the Cloudera VM is no longer valid. Cloudera has now moved to Cloudera Data Platform (CDP), so the course is no longer usable.
Rohit
November 23, 2022
This is a good course for understanding the basics and detailed processing of Hadoop and specifically Spark...
Nagendra
August 31, 2022
The given questions, resources, and answers are out of sync, but the content is good for technical understanding.
Vanita
May 19, 2022
Great Course and very well explained. Easy to follow. Thanks a lot for your efforts in making this course so easy.
SHIVENDRA
March 27, 2022
The setup on Google Cloud has changed and is no longer as shown in the video. That needs a fix, as it is completely messed up. It is a little risky and challenging because it uses credits and real money; it may charge the card, since card information is also stored. I am not sure how to remove and delete it. This is a serious issue; the rest is good.
Arjun
March 6, 2022
The course only covers things like "do this and then this will happen." In the era of data, actual teaching should start with why we need something, how we do it, and then what comes out at the end. People still follow it just to get things done, so I am not sure about the pathway for real growth-based learning, but I feel the approach could be much better if the roadmap were managed differently.
Ruben
January 9, 2022
The video is not aligned with the resource files, and there is no further explanation of some seemingly random behaviors. The instructor does not respond to questions submitted.
Ankush
November 15, 2021
I'm sure, based on other reviews, that the concepts were explained well, however I failed to reach that point as the Cluster Setup on Google Cloud on my end (and seemingly others) was riddled with errors that I failed to solve after weeks of trying and corresponding with various sources, including the course owner. The explanation at this part wasn't great, just instructing us to copy and paste code from a .txt file that was different to the video's progression of the task and as such I was never able to get to a stage where I could do any of the actual course.
Kunal
November 12, 2021
I felt really disappointed because the course started by configuring Google Cloud, which was not done correctly, and I faced an issue at the end for which I haven't received any response from the instructor in the last couple of weeks. Most probably she was also unable to configure it, which is why she switched to Cloudera and ran every command there for all the explanatory videos.
Rhosung
October 6, 2021
Makes it very easy for beginners. Gives a comprehensive understanding, from the concept of big data to the practice of each big data component. Would recommend.
Chandra
September 22, 2021
I would recommend this course to everyone as the concepts of the BigData were explained very well and the examples provided as part of the explanation are very basic to the beginner level and a beginner can understand the concepts very well. Definitely I would give 5 star rating to this course. Thanks to Kaur for the course.

Udemy ID: 2170564
Course created date: 1/23/2019
Course indexed date: 11/22/2019
Course submitted by: Bot