4.47 (199 reviews)
☑ Learn the ins and outs of Google Cloud BigQuery from scratch, with proper hands-on examples.
☑ Get an Overview of Google Cloud Platform and a brief introduction to the set of services it provides.
☑ Start with BigQuery core concepts such as its architecture, datasets, tables, views, materialized views, scheduled queries, and limitations and quotas.
☑ Advanced BigQuery topics such as the query execution plan, efficient schema design, optimization techniques, partitioning, clustering, etc.
☑ Build big data pipelines using various Google Cloud Platform services: Dataflow, Pub/Sub, BigQuery, Cloud Storage, Beam, Data Studio, Cloud Composer/Airflow, etc.
☑ Learn to interact with BigQuery using the web console, command line, Python client library, etc.
☑ Learn best practices to follow in real-world projects for performance and cost savings across every component of BigQuery.
☑ BigQuery pricing models for storage, querying, API requests, DML, and free operations.
☑ Datasets and queries used in the lectures are available in the Resources tab, which will save you typing effort.
Note: This BigQuery course is NOT intended to teach SQL or PostgreSQL. The focus of the course is to give you in-depth knowledge of Google BigQuery concepts and internals.
"BigQuery is a serverless, highly scalable, and cost-effective data warehouse designed for Google Cloud Platform (GCP) to store and query petabytes of data."
What's included in the course?
Brief introduction to the set of services Google Cloud provides.
Complete, in-depth coverage of Google BigQuery concepts, from scratch through advanced topics to real-world implementation.
Every BigQuery concept is explained with hands-on examples.
Covers even the finest details of BigQuery.
Learn to interact with BigQuery using its web console, the bq CLI, and the Python client library.
Create, Load, Modify and Manage BigQuery Datasets, Tables, Views, Materialized Views etc.
*Exclusive* - Query Execution Plan, Efficient schema design, Optimization techniques, Partitioning, Clustering.
Build and deploy end-to-end data pipelines (batch and streaming) based on real-world case studies in GCP.
Services used in the pipelines: Dataflow, Apache Beam, Pub/Sub, BigQuery, Cloud Storage, Data Studio, Cloud Composer/Airflow, etc.
Learn best practices and optimization techniques to follow in real-world Google Cloud BigQuery projects.
After completing this course, you can start working on any BigQuery project with full confidence.
Questions and Queries will be answered very quickly.
Queries and datasets used in lectures are attached in the course for your convenience.
I will update the course frequently, adding new BigQuery components each time.
Introduction to GCP & its services
Introduction to Google Cloud Platform
GCP vs AWS vs Azure - Why choose GCP
Compute Services in GCP
Storage Services in GCP
Big data Services in GCP
AI & ML Services in GCP
Big data ecosystem in GCP
Introduction to BigQuery
Conventional Data Warehouse Problems
What is BigQuery
BigQuery Out-of-the-Box Features
Architecture of BigQuery
Dataset & Table creation
Set up a GCP account
Create a Project
BigQuery UI Tour
Region vs Multi-region
Create a Dataset
Create a Table
Using BigQuery Dashboard options
Running query with various Query Settings
Caching features & limitations
Querying Wildcard Tables
Wildcard Table Limitations
Schedule, Save, Share a Query
Schema Auto detection
Efficient Schema Design in BigQuery
Design an Efficient schema for BigQuery Tables
Nested & Repeated Columns
Operations on Datasets & Tables
Transfer Service for scheduling Copy Jobs
Native operations on Table for Schema change
Manual operations on Table
Execution Plan of BigQuery
How BigQuery creates Execution Plan of a Query
Understanding Execution Plan in UI Dashboard
Partitioned Tables in BigQuery
What is Partitioning & its benefits
Ingestion time Partitioned Tables
Date column Partitioned Tables
Integer based Partitioned Tables
ALTER, COPY operations on Partitioned Tables
DML operations on Partitioned Tables
Best Practices for Partitioning
Clustered Tables in BigQuery
What is Clustering
When to use Clustering OR Partitioning OR Both
Create Clustered Table
Dos & Don'ts for Clustering
Loading & Querying External Data Sources
Introduction and Create Cloud Storage Bucket
Create & Query Permanent Table on Cloud Storage bucket
External data source Limitations
Views in BigQuery
Introduction to Views & its Advantages
Create Views in BigQuery
Restrict rows at User level in Views
Limitations of Views
Materialized Views in BigQuery
What are Materialized Views
Create a Materialized View
ALTER Materialized View
Design an optimized query for Materialized View
Auto & Manual Refreshes of Materialized Views
Limitations & Quotas of Materialized Views
Best Practices in Materialized Views
BQ Command Line
Cloud SDK Setup
BQ Basic commands
BQ - Querying Commands
BQ - Dataset creation command
BQ - Create all types of Tables
BQ - Load data into Table
BQ - Exclusive operations
Python Client Library of BigQuery
Python code to create dataset
Python code to create table
Python code to query tables
Build end-to-end Data Pipelines
Case Study Requirements
Apache Beam Pipeline creation
Write Transformations in Beam
Write to BigQuery
Create View for Daily data
Run the Beam Pipeline
Create Reports in Data Studio
Create monthly reports in Data Studio
API, DML pricing
Free operations in BigQuery
Google Cloud Pricing Calculator
Best Practices / Optimization Techniques
Methods to restrict data scan
Ways to reduce CPU time
Which SQL anti-patterns to avoid
BONUS - Different File Formats & BEAM
What do we need from a File
Text, Sequence, Avro Files
RC, ORC, Parquet Files
Performance Test results of Various Files
Which File Format to choose
Introduction to Apache Beam
Google Pub/Sub Architecture
The UI for BigQuery has been updated since this course was published, so it takes a little longer to follow.
The content is great and practical. One suggestion is to increase the number of assignments in each section. Right now there are only 2 assignments in the whole course, which is not sufficient for self-practice.
This is a very detailed course on BigQuery that explains its various features in depth.
So far it's a good match. But I was hoping to get a better and deeper understanding of the other GCP services that are explained in sub-section 3 of section 1. Still, I can understand if the explanation isn't too deep, since this course mainly covers GCP's BigQuery service.
As with all his previous courses, J Garg kept the course simple and engaging while covering almost all aspects of the subject. The course will give Data Engineers enough confidence to deal with real-world problems. A few things I would like to see added, though: publish the course presentations, and add BigQuery ML capabilities.
Good, but if you already have basic knowledge of SQL and data modeling, some lessons can be skipped.