Flume and Sqoop for Ingesting Big Data

Import data to HDFS, HBase and Hive from a variety of sources, including Twitter and MySQL

3.50 (172 reviews)
Platform: Udemy
Language: English
Category: Software Engineering
Students: 3,432
Content: 2.5 hours
Last update: Oct 2016
Regular price: $19.99

What you will learn

Use Flume to ingest data to HDFS and HBase

Use Sqoop to import data from MySQL to HDFS and Hive

Ingest data from a variety of sources including HTTP, Twitter and MySQL

Description

Taught by a team that includes two Stanford-educated ex-Googlers, with decades of combined practical experience working with Java and with billions of rows of data.

Use Flume and Sqoop to import data to HDFS, HBase and Hive from a variety of sources, including Twitter and MySQL

Let’s parse that.

Import data: Flume and Sqoop play a special role in the Hadoop ecosystem. They transport data from sources that hold or produce it, such as local file systems, HTTP endpoints, MySQL and Twitter, into data stores like HDFS, HBase and Hive. Both tools come with built-in functionality and shield users from the complexity of moving data between these systems.

Flume: Flume Agents transport data produced by streaming applications to data stores like HDFS and HBase.
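To make that concrete, here is a minimal sketch of a Flume Agent defined in a single properties file, watching a spooling directory and writing to HDFS. The agent name, spool directory and NameNode address are illustrative assumptions, not values from the course:

```properties
# Name the components of agent1: one source, one channel, one sink
agent1.sources  = src1
agent1.channels = ch1
agent1.sinks    = sink1

# Source: watch a local directory for new files (illustrative path)
agent1.sources.src1.type     = spooldir
agent1.sources.src1.spoolDir = /tmp/flume-spool
agent1.sources.src1.channels = ch1

# Channel: buffer events in memory between source and sink
agent1.channels.ch1.type     = memory
agent1.channels.ch1.capacity = 1000

# Sink: write events to HDFS as plain text (assumed NameNode address)
agent1.sinks.sink1.type          = hdfs
agent1.sinks.sink1.hdfs.path     = hdfs://localhost:9000/flume/events
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.channel       = ch1
```

The agent would then be started with `flume-ng agent --conf conf --conf-file spool-to-hdfs.properties --name agent1`.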

Sqoop: Use Sqoop to bulk-import data from a traditional RDBMS into Hadoop storage such as HDFS or Hive.
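As a sketch of what such a bulk import looks like on the command line (the JDBC connection string, username, table and target directory here are illustrative assumptions):

```shell
# Import the MySQL table 'customers' into HDFS as delimited text
sqoop import \
  --connect jdbc:mysql://localhost/shop \
  --username dbuser -P \
  --table customers \
  --target-dir /user/hadoop/customers \
  --num-mappers 1

# The same import lands in a Hive table instead if you add:
#   --hive-import --hive-table customers
```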

What's Covered:

Practical implementations for a variety of sources and data stores:

  • Sources: Twitter, MySQL, Spooling Directory, HTTP
  • Sinks: HDFS, HBase, Hive

Flume features: Flume Agents, Flume Events, Event bucketing, Channel selectors, Interceptors
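Event bucketing, for example, is driven by escape sequences in the HDFS sink's path; a short sketch (the path and sink name are assumptions) that buckets incoming events into per-day, per-hour directories:

```properties
# Bucket events into one HDFS directory per day and per hour
agent1.sinks.sink1.hdfs.path = /flume/events/%Y-%m-%d/%H
# Stamp events with the agent's local time so the escapes resolve
agent1.sinks.sink1.hdfs.useLocalTimeStamp = true
```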

Sqoop features: Sqoop import from MySQL, Incremental imports using Sqoop Jobs
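Incremental imports work by saving the import definition as a named Sqoop Job that remembers the last value it imported; a sketch of the idea (job name, connection string and check column are illustrative assumptions):

```shell
# Define a reusable job that appends only rows with id greater
# than the stored last-value
sqoop job --create daily-orders -- import \
  --connect jdbc:mysql://localhost/shop \
  --username dbuser \
  --table orders \
  --incremental append \
  --check-column id \
  --last-value 0

# Each execution picks up where the previous run left off
sqoop job --exec daily-orders
```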

Content

You, This Course and Us

Why do we need Flume and Sqoop?

Flume

Installing Flume
Flume Agent - the basic unit of Flume
Example 1 : Spool to Logger
Flume Events are how data is transported
Example 2 : Spool to HDFS
Example 3: HTTP to HDFS
Example 4: HTTP to HDFS with Event Bucketing
Example 5: Spool to HBase
Example 6: Using multiple sinks and Channel selectors
Example 7: Twitter Source with Interceptors
[For Linux/Mac OS Shell Newbies] Path and other Environment Variables

Sqoop

Installing Sqoop
Example 8: Sqoop Import from MySQL to HDFS
Example 9: Sqoop Import from MySQL to Hive
Example 10: Incremental Imports using Sqoop Jobs

Screenshots

[Course screenshots 01-04]

Reviews

Pardeep
December 6, 2020
Flume was explained nicely with examples. Sqoop could be explained in more details. Overall the learning experience was nice.
Amin
May 13, 2020
Very simple and shallow lesson. Sqoop and Flume have very important options and configurations that have not covered here. The most commands explained in theory that we can find them easily in Internet. This is not hands-on. Only wasting my time on this lesson.
Shaik
November 24, 2018
Very precise and also detailed explanation when required.. I liked it very much and appreciate the practical use case examples.
Khalidktl
September 16, 2018
Got a good basic regarding for both Flume and Sqoop. The examples are easy to practice with full explanation. Really nice course to get start with both technologies.
Pallab
August 15, 2018
This course is very good but it is very basic to learn and Sqoop part need to be describe more like flume
Subhadip
July 17, 2018
Short crisp course. Suggestion: Discuss the internals/architecture of Sqoop/Flume; and Real World troubleshooting/optimizations.
Yusra
January 12, 2018
I love all loony corn courses. Other than having to bear hearing the bragging about IIT, Stanford, Google and Flipkart everytime ;D (just kidding :) ), all their courses including this one are always practical, useful and simple to understand. This course may not be a extensive and definitive guide about flume or sqoop, but a good quick start for a beginner that allows you to grasp what the software is about and demonstrates how simply you can start using it.
Lokanathan
November 28, 2017
Well explained, I'm starting to like the sessions more... just that the back ground music is sometimes distracting... Overall I'm going to subscribe to more Loonycorn classes
Shiv
November 22, 2017
just the instruction of standalone installation of both flume and sqoop is not helpful. At least talk about how these are implemented in production environment and how these tie into existing Hadoop cluster. Talk about scalability and availability, fail-over of flume agents and sqoop jobs
Denise
December 5, 2016
good introduction. need more of an architecture overview. great use cases to illustrates the topics being presented.
Manish
July 31, 2016
Good info but below are missing Would like to see support for different file formats like ORC, Parquet, AVRO etc. as the syntax differs for formats like ORC etc as we need to use HCatalog. Whether we can change engine from MR to Tez or even Spark for Sqoop jobs? No tutorials on Export!? Special Sqoop connectors for Oracle and other databases?

Charts

  • Price chart
  • Ratings chart
  • Enrollment distribution chart

Udemy ID: 886072
Course created date: 6/23/2016
Course indexed date: 11/22/2019
Course submitted by: Bot