Data Engineering
Airflow · Spark · Warehousing — 14 Weeks
14 Weeks · Hands-On · Project-Based

Data Engineering Program

Master the pipelines, infrastructure, and systems behind large-scale data platforms. Learn Airflow workflows, Spark processing, data warehousing, and ETL best practices.

Duration: 14 weeks
Format: Live labs + recorded content
Level: Intermediate

Program Overview

This 14-week Data Engineering program covers building robust, scalable data infrastructure, scheduling workflows with Airflow, processing data at scale with Spark, and architecting data warehouses. Weekly labs, mini-projects, and a final capstone ensure practical, hands-on experience.

Core Skills Covered

  • ETL/ELT pipeline design & implementation using Airflow
  • Distributed processing with Apache Spark
  • Data modelling for warehouses & lakes
  • SQL & database technologies
  • Performance optimization & resource management
  • Incremental data loading & monitoring
  • Batch & streaming data handling

Who Should Join?

Software engineers, data analysts, ML engineers, and anyone building backend pipelines and infrastructure. Some Python and SQL experience is helpful.

Prerequisites

  • Programming in Python/Scala and SQL basics
  • Understanding of data structures & systems
  • Laptop + internet; cloud access optional

Outcomes & Support

  • Multiple pipeline projects + final capstone
  • Hands-on experience with monitoring, logging & orchestration
  • Resume & interview prep for Data Engineering roles
  • Completion certificate with technical feedback

14-Week Curriculum

Weekly labs & mini-projects for practical experience.

Week 1 — Fundamentals of Data Engineering
Data ecosystem, components, batch vs streaming, data formats.
Week 2 — SQL & Databases for Warehousing
Relational vs columnar stores, indexing, query optimization.
Week 3 — Data Modelling & Warehousing Principles
Star/snowflake schemas, dimensional modelling, fact vs dimension tables.
Week 4 — Data Ingestion & ELT/ETL
Scheduling, pipelines, ingestion from various sources, incremental loads.
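
To give a flavour of the lab work, here is a minimal sketch of an incremental (high-water-mark) load; the table and column names are illustrative, and sqlite3 stands in for any source or warehouse connection:

    import sqlite3  # stand-in for any source/warehouse connection

    source = sqlite3.connect("source.db")       # hypothetical source database
    target = sqlite3.connect("warehouse.db")    # hypothetical warehouse

    # 1. Find how far the previous run loaded (the high-water mark).
    last_loaded = target.execute(
        "SELECT COALESCE(MAX(updated_at), '1970-01-01') FROM orders"
    ).fetchone()[0]

    # 2. Pull only the rows that changed since then.
    rows = source.execute(
        "SELECT order_id, amount, updated_at FROM orders WHERE updated_at > ?",
        (last_loaded,),
    ).fetchall()

    # 3. Append the new slice to the warehouse table.
    target.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    target.commit()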
Week 5 — Apache Spark Basics
RDD vs DataFrame APIs, transformations & actions, performance fundamentals.
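
A minimal PySpark sketch of lazy transformations versus actions, the core idea behind the Week 5 lab (the data and names are illustrative):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("spark-basics-demo").getOrCreate()

    orders = spark.createDataFrame(
        [("a1", "books", 120.0), ("a2", "books", 80.0), ("a3", "toys", 40.0)],
        ["order_id", "category", "amount"],
    )

    # Transformations (filter, groupBy, agg) are lazy: Spark only builds a plan.
    revenue = (
        orders.filter(F.col("amount") > 50)
              .groupBy("category")
              .agg(F.sum("amount").alias("revenue"))
    )

    # Actions such as show() or collect() trigger the actual computation.
    revenue.show()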
Week 6 — Spark Advanced & Streaming
Structured Streaming, stateful operations, windowing.
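
A minimal Structured Streaming sketch of a windowed count with a watermark, using Spark's built-in rate source so it runs without external infrastructure:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("streaming-demo").getOrCreate()

    # The built-in "rate" source emits (timestamp, value) rows for testing.
    events = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

    # Count events per 1-minute window, tolerating 30 seconds of late data.
    counts = (
        events.withWatermark("timestamp", "30 seconds")
              .groupBy(F.window("timestamp", "1 minute"))
              .count()
    )

    query = counts.writeStream.outputMode("update").format("console").start()
    query.awaitTermination()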
Week 7 — Workflow Orchestration with Airflow
DAGs, scheduling, dependencies, retries & failure handling.
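
A minimal sketch of the kind of DAG built in the Airflow week, with a daily schedule and retries; the task names are illustrative and exact parameters vary by Airflow version:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pulling source data")          # placeholder task body

    def load():
        print("writing to the warehouse")     # placeholder task body

    with DAG(
        dag_id="example_daily_etl",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        load_task = PythonOperator(task_id="load", python_callable=load)

        extract_task >> load_task  # load runs only after extract succeeds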
Week 8 — Monitoring, Logging & Observability
Metrics, logs, data lineage, alerting.
Week 9 — Data Quality & Testing
Assertions, unit tests, schema evolution, error handling.
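
Data-quality checks can start as plain assertions before graduating to dedicated tooling; a small pandas-based sketch (the column names are illustrative):

    import pandas as pd

    def check_orders(df: pd.DataFrame) -> None:
        """Fail fast if the extracted orders violate basic expectations."""
        assert df["order_id"].is_unique, "duplicate order_id values found"
        assert (df["amount"] >= 0).all(), "negative amounts are not allowed"
        assert df["customer_id"].notna().all(), "customer_id must not be null"

    orders = pd.DataFrame({
        "order_id": [1, 2, 3],
        "customer_id": ["a", "b", "c"],
        "amount": [10.0, 5.5, 99.0],
    })
    check_orders(orders)  # raises AssertionError if any check fails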
Week 10 — Performance & Scalability
Partitioning, caching, join optimization, resource management.
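
A sketch of common Spark tuning levers covered here: repartitioning by the join key, caching a reused DataFrame, and broadcasting a small dimension table (the paths, columns, and table names are hypothetical):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("tuning-demo").getOrCreate()

    events = spark.read.parquet("s3://example-bucket/events/")        # hypothetical path
    dim_users = spark.read.parquet("s3://example-bucket/dim_users/")  # hypothetical path

    # Repartition on the join key to spread work evenly, then cache for reuse.
    events = events.repartition(200, "user_id").cache()

    # Broadcast the small dimension table to avoid a full shuffle join.
    enriched = events.join(F.broadcast(dim_users), on="user_id", how="left")

    # Partition the output by date so downstream queries can prune files.
    enriched.write.mode("overwrite").partitionBy("event_date").parquet(
        "s3://example-bucket/enriched_events/"
    )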
Week 11 — Real-World Tools
Cloud warehouses, lakehouses, storage formats (Parquet/ORC), distributed file systems.
Week 12 — Security & Governance
Access control, data privacy, auditing, and compliance (e.g., GDPR).
Week 13 — Incremental & Streaming Architectures
Change data capture, micro-batch streaming, lambda/kappa architectures.
Week 14 — Capstone Project & Hiring Prep
End-to-end pipeline, portfolio, mock interviews.

Capstone & Projects

Build a full data engineering pipeline, including ingestion, storage, processing, quality checks, monitoring, and reporting.

Sample Projects

  • End-to-end product analytics pipeline
  • Streaming ETL for IoT sensors to lake + dashboard
  • Data lakehouse with versioned storage & partitions

Assessment

Weekly labs, mid-program reviews, and a final capstone evaluation with feedback and a completion certificate.

How to Apply

  1. Fill out the application form with your basic details and background information.
  2. Complete the course payment of ₹9,500/- to confirm your enrollment. Multiple payment options are available.
  3. Receive your official admission letter after payment confirmation.
  4. Start attending classes, available both online and offline according to your preference.

Refund & Cancellation

Full refund if cancelled within 7 days of enrollment and before course start. Contact admissions for details.