ETL (Extract, Transform, Load) is the process of pulling data from source systems, transforming it into a usable format, and loading it into a data warehouse or database. It's how companies consolidate data from dozens of sources for analytics and reporting.

How ETL Works

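Extract: pull sales data from the Shopify API, user data from PostgreSQL, and marketing data from Google Analytics. Transform: clean dates, normalize currencies, join datasets, and compute metrics. Load: write the transformed data into BigQuery or Snowflake for analysis. Common tools cover different parts of this flow: Fivetran handles extraction and loading, dbt handles transformation, and Airflow schedules and orchestrates the jobs.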
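The three stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the in-memory lists stand in for the Shopify API and PostgreSQL, the exchange rates are made-up constants, SQLite stands in for BigQuery or Snowflake, and all function and table names (extract, transform, load, daily_revenue) are hypothetical.

```python
import sqlite3
from datetime import datetime

# Hypothetical stand-ins for source systems. A real pipeline would call
# the Shopify API and query PostgreSQL instead of reading these lists.
RAW_ORDERS = [
    {"order_id": 1, "user_id": 10, "amount": "20.00", "currency": "USD", "date": "2024-01-05"},
    {"order_id": 2, "user_id": 11, "amount": "10.00", "currency": "EUR", "date": "05/01/2024"},
    {"order_id": 3, "user_id": 10, "amount": "5.50", "currency": "USD", "date": "2024-01-06"},
]
RAW_USERS = [
    {"user_id": 10, "country": "US"},
    {"user_id": 11, "country": "DE"},
]

# Illustrative fixed exchange rates (assumption), not real market data.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.1}


def extract():
    """Extract: return copies of the raw records from each source."""
    return list(RAW_ORDERS), list(RAW_USERS)


def parse_date(text):
    """Accept ISO dates or DD/MM/YYYY and return an ISO date string."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def transform(orders, users):
    """Transform: clean dates, convert to USD, join users, total revenue per day and country."""
    country_by_user = {u["user_id"]: u["country"] for u in users}
    totals = {}
    for o in orders:
        day = parse_date(o["date"])
        usd = float(o["amount"]) * RATES_TO_USD[o["currency"]]
        key = (day, country_by_user[o["user_id"]])
        totals[key] = totals.get(key, 0.0) + usd
    return [(day, country, round(total, 2)) for (day, country), total in sorted(totals.items())]


def load(rows, conn):
    """Load: write the transformed rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_revenue (day TEXT, country TEXT, revenue_usd REAL)"
    )
    conn.executemany("INSERT INTO daily_revenue VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    orders, users = extract()
    rows = transform(orders, users)
    load(rows, conn)
    return rows
```

Calling run_pipeline(sqlite3.connect(":memory:")) runs all three stages in order. Keeping each stage as its own function makes it possible to test the transformation logic without touching a real source or warehouse.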

Key Concepts

  • Extract: pull data from source systems such as APIs, databases, files, and streams
  • Transform: clean, validate, reshape, and enrich data, for example by handling nulls, normalizing formats, and computing metrics
  • Load: write processed data into the destination, which may be a data warehouse, a database, or a data lake
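The Transform step is where most data-quality decisions live. The sketch below shows one possible way to handle nulls, normalize formats, and validate records, keeping rejected rows and the reason for rejection instead of silently dropping them. The field names (email, country, amount), the "UNKNOWN" default, and the helper names clean_record and clean_batch are assumptions made for illustration.

```python
def clean_record(record):
    """Return (cleaned, None) for a valid record or (None, reason) for a rejected one."""
    # Validate: an email is required. None and empty strings are treated the same.
    email = (record.get("email") or "").strip().lower()
    if not email or "@" not in email:
        return None, "missing or invalid email"

    # Handle nulls: fall back to a default when the field is absent or None.
    country = (record.get("country") or "unknown").strip().upper()

    # Normalize formats: amounts may arrive as "1,234.50" strings or as numbers.
    raw_amount = record.get("amount")
    if raw_amount is None:
        return None, "missing amount"
    try:
        amount = float(str(raw_amount).replace(",", ""))
    except ValueError:
        return None, "amount is not numeric"

    return {"email": email, "country": country, "amount": amount}, None


def clean_batch(records):
    """Split a batch into cleaned rows and rejected rows with reasons."""
    cleaned, rejected = [], []
    for r in records:
        row, reason = clean_record(r)
        if row is not None:
            cleaned.append(row)
        else:
            rejected.append((r, reason))
    return cleaned, rejected
```

Returning the rejected rows alongside the cleaned ones is a design choice: it lets the pipeline load the good data while still surfacing what was filtered out and why.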

Frequently Asked Questions

ETL vs ELT?

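ETL transforms data before loading it, which is the traditional approach. ELT loads the raw data first and transforms it inside the warehouse, a more recent approach that typically uses dbt with BigQuery or Snowflake. ELT is more flexible because the raw data stays in the warehouse, so transformations can be changed and re-run without extracting from the sources again.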
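To make the ELT ordering concrete, here is a small sketch in which raw rows are loaded untouched and the transformation then runs as SQL inside the database. SQLite is used only as a local stand-in for a warehouse such as BigQuery or Snowflake, and the table names (raw_orders, orders_usd), the function names, and the 1.1 EUR rate are illustrative assumptions. In practice the SQL step is the kind of work a tool like dbt would manage.

```python
import sqlite3


def load_raw(conn, orders):
    """Load step: store source rows as-is, with no cleaning."""
    conn.execute("CREATE TABLE raw_orders (order_id INTEGER, amount TEXT, currency TEXT)")
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :amount, :currency)", orders
    )
    conn.commit()


def transform_in_warehouse(conn):
    """Transform step: rebuild a cleaned table from the raw one using SQL."""
    conn.execute("DROP TABLE IF EXISTS orders_usd")
    conn.execute(
        """
        CREATE TABLE orders_usd AS
        SELECT order_id,
               -- Illustrative fixed rate (assumption), not real market data.
               ROUND(CAST(amount AS REAL)
                     * CASE currency WHEN 'EUR' THEN 1.1 ELSE 1.0 END, 2) AS amount_usd
        FROM raw_orders
        WHERE amount IS NOT NULL
        """
    )
    conn.commit()
```

Because raw_orders is never modified, transform_in_warehouse can be edited and re-run at any time to rebuild orders_usd, which is the flexibility the answer above describes.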