Data Engineering
Airflow · Spark · Warehousing — 14 Weeks
14 Weeks · Hands-On · Project-Based

Data Engineering Program

Master the pipelines, infrastructure, and systems behind large-scale data platforms. Learn Airflow workflows, Spark processing, data warehousing, and ETL best practices.

Duration: 14 weeks
Format: Live labs + recorded content
Level: Intermediate

Program Overview

This 14-week Data Engineering program covers building robust, scalable data infrastructure, scheduling workflows with Airflow, processing data at scale with Spark, and architecting data warehouses. Weekly labs, mini-projects, and a final capstone ensure practical, hands-on experience.

Core Skills Covered

  • ETL/ELT pipeline design & implementation using Airflow
  • Distributed processing with Apache Spark
  • Data modelling for warehouses & lakes
  • SQL & database technologies
  • Performance optimization & resource management
  • Incremental data loading & monitoring
  • Batch & streaming data handling

Who Should Join?

Software engineers, data analysts, ML engineers, and anyone building backend pipelines and infrastructure. Some Python and SQL experience is helpful.

Prerequisites

  • Programming in Python/Scala and SQL basics
  • Understanding of data structures & systems
  • Laptop + internet; cloud access optional

Outcomes & Support

  • Multiple pipeline projects + final capstone
  • Hands-on experience with monitoring, logging & orchestration
  • Resume & interview prep for Data Engineering roles
  • Completion certificate with technical feedback

14-Week Curriculum

Weekly labs & mini-projects for practical experience.

Week 1 — Fundamentals of Data Engineering
Data ecosystem, components, batch vs streaming, data formats.
Week 2 — SQL & Databases for Warehousing
Relational vs columnar stores, indexing, query optimization.
Week 3 — Data Modelling & Warehousing Principles
Star/snowflake schemas, dimensional modelling, fact vs dimension tables.
Week 4 — Data Ingestion & ELT/ETL
Scheduling, pipelines, ingestion from various sources, incremental loads.
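
To give a flavour of the lab work, here is a minimal sketch of an incremental (high-water-mark) load; the table and column names are illustrative, and sqlite3 stands in for any source or warehouse connection:

    import sqlite3  # stand-in for any source/warehouse connection

    source = sqlite3.connect("source.db")       # hypothetical source database
    target = sqlite3.connect("warehouse.db")    # hypothetical warehouse

    # 1. Find how far the previous run loaded (the high-water mark).
    last_loaded = target.execute(
        "SELECT COALESCE(MAX(updated_at), '1970-01-01') FROM orders"
    ).fetchone()[0]

    # 2. Pull only the rows that changed since then.
    rows = source.execute(
        "SELECT order_id, amount, updated_at FROM orders WHERE updated_at > ?",
        (last_loaded,),
    ).fetchall()

    # 3. Append the new slice to the warehouse table.
    target.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    target.commit()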
Week 5 — Apache Spark Basics
RDD vs DataFrame APIs, transformations & actions, performance fundamentals.
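
A minimal PySpark sketch of lazy transformations versus actions, the core idea behind the Week 5 lab (the data and names are illustrative):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("spark-basics-demo").getOrCreate()

    orders = spark.createDataFrame(
        [("a1", "books", 120.0), ("a2", "books", 80.0), ("a3", "toys", 40.0)],
        ["order_id", "category", "amount"],
    )

    # Transformations (filter, groupBy, agg) are lazy: Spark only builds a plan.
    revenue = (
        orders.filter(F.col("amount") > 50)
              .groupBy("category")
              .agg(F.sum("amount").alias("revenue"))
    )

    # Actions such as show() or collect() trigger the actual computation.
    revenue.show()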
Week 6 — Spark Advanced & Streaming
Structured Streaming, stateful operations, windowing.
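
A minimal Structured Streaming sketch of a windowed count with a watermark, using Spark's built-in rate source so it runs without external infrastructure:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("streaming-demo").getOrCreate()

    # The built-in "rate" source emits (timestamp, value) rows for testing.
    events = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

    # Count events per 1-minute window, tolerating 30 seconds of late data.
    counts = (
        events.withWatermark("timestamp", "30 seconds")
              .groupBy(F.window("timestamp", "1 minute"))
              .count()
    )

    query = counts.writeStream.outputMode("update").format("console").start()
    query.awaitTermination()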
Week 7 — Workflow Orchestration with Airflow
DAGs, scheduling, dependencies, retries & failure handling.
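
A minimal sketch of the kind of DAG built in the Airflow week, with a daily schedule and retries; the task names are illustrative and exact parameters vary by Airflow version:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pulling source data")          # placeholder task body

    def load():
        print("writing to the warehouse")     # placeholder task body

    with DAG(
        dag_id="example_daily_etl",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        load_task = PythonOperator(task_id="load", python_callable=load)

        extract_task >> load_task  # load runs only after extract succeeds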
Week 8 — Monitoring, Logging & Observability
Metrics, logs, data lineage, alerting.
Week 9 — Data Quality & Testing
Assertions, unit tests, schema evolution, error handling.
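
Data-quality checks can start as plain assertions before graduating to dedicated tooling; a small pandas-based sketch (the column names are illustrative):

    import pandas as pd

    def check_orders(df: pd.DataFrame) -> None:
        """Fail fast if the extracted orders violate basic expectations."""
        assert df["order_id"].is_unique, "duplicate order_id values found"
        assert (df["amount"] >= 0).all(), "negative amounts are not allowed"
        assert df["customer_id"].notna().all(), "customer_id must not be null"

    orders = pd.DataFrame({
        "order_id": [1, 2, 3],
        "customer_id": ["a", "b", "c"],
        "amount": [10.0, 5.5, 99.0],
    })
    check_orders(orders)  # raises AssertionError if any check fails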
Week 10 — Performance & Scalability
Partitioning, caching, join optimization, resource management.
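
A sketch of common Spark tuning levers covered here: repartitioning by the join key, caching a reused DataFrame, and broadcasting a small dimension table (the paths, columns, and table names are hypothetical):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("tuning-demo").getOrCreate()

    events = spark.read.parquet("s3://example-bucket/events/")        # hypothetical path
    dim_users = spark.read.parquet("s3://example-bucket/dim_users/")  # hypothetical path

    # Repartition on the join key to spread work evenly, then cache for reuse.
    events = events.repartition(200, "user_id").cache()

    # Broadcast the small dimension table to avoid a full shuffle join.
    enriched = events.join(F.broadcast(dim_users), on="user_id", how="left")

    # Partition the output by date so downstream queries can prune files.
    enriched.write.mode("overwrite").partitionBy("event_date").parquet(
        "s3://example-bucket/enriched_events/"
    )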
Week 11 — Real-World Tools
Cloud warehouses, lakehouses, storage formats (Parquet/ORC), distributed file systems.
Week 12 — Security & Governance
Access control, data privacy, auditing, and compliance (e.g., GDPR).
Week 13 — Incremental & Streaming Architectures
Change data capture, micro-batch streaming, lambda/kappa architectures.
Week 14 — Capstone Project & Hiring Prep
End-to-end pipeline, portfolio, mock interviews.

Capstone & Projects

Build a full data engineering pipeline, including ingestion, storage, processing, quality checks, monitoring, and reporting.

Sample Projects

  • End-to-end product analytics pipeline
  • Streaming ETL for IoT sensors to lake + dashboard
  • Data lakehouse with versioned storage & partitions

Assessment

Weekly labs, mid-program reviews, and a final capstone evaluation with feedback and a completion certificate.

How to Apply

  1. Fill out the application form with your basic details and background information.
  2. Complete the course payment of ₹9,500/- to confirm your enrollment. Multiple payment options are available.
  3. Receive your official admission letter after payment confirmation.
  4. Start attending classes, available both online and offline according to your preference.

Refund & Cancellation

Full refund if cancelled within 7 days of enrollment and before course start. Contact admissions for details.