What Is ETL?
Extract, Transform, Load
ETL (Extract, Transform, Load) is the process of pulling data from source systems, transforming it into a usable format, and loading it into a data warehouse or database. It's how companies consolidate data from dozens of sources for analytics and reporting.
How ETL Works
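Extract: pull sales data from the Shopify API, user data from PostgreSQL, and marketing data from Google Analytics. Transform: clean dates, normalize currencies, join datasets, and compute metrics. Load: write the transformed data into a warehouse such as BigQuery or Snowflake for analysis. Common tools include dbt (SQL transformations inside the warehouse), Airflow (scheduling and orchestrating pipeline steps), and Fivetran (managed extraction and loading).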
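The three stages can be sketched end to end in a few lines of Python. This is a minimal illustration, not a production pipeline: the CSV text stands in for a source system, an in-memory SQLite database stands in for the warehouse, and the names RAW_ORDERS, RATES_TO_USD, and run_pipeline, along with the exchange rates, are assumptions made up for the example.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical raw export standing in for a source system such as an orders API.
RAW_ORDERS = """order_id,order_date,amount,currency
1,2024-01-05,100.00,USD
2,05/01/2024,80.00,EUR
3,2024-01-06,,USD
"""

# Assumed fixed exchange rates, for illustration only.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25}


def extract(raw_text):
    """Extract: read rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def parse_date(value):
    """Accept ISO (YYYY-MM-DD) or DD/MM/YYYY dates and return an ISO string."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")


def transform(rows):
    """Transform: skip rows with no amount, clean dates, convert amounts to USD."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue
        amount_usd = round(float(row["amount"]) * RATES_TO_USD[row["currency"]], 2)
        cleaned.append((int(row["order_id"]), parse_date(row["order_date"]), amount_usd))
    return cleaned


def load(conn, rows):
    """Load: write the transformed rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, order_date TEXT, amount_usd REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn, raw_text=RAW_ORDERS):
    load(conn, transform(extract(raw_text)))


conn = sqlite3.connect(":memory:")  # stand-in for a warehouse
run_pipeline(conn)
result = conn.execute(
    "SELECT order_id, order_date, amount_usd FROM orders ORDER BY order_id"
).fetchall()
print(result)
```

Keeping each stage in its own function means the transform logic can be tested without touching a real source or destination.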
Key Concepts
- Extract: Pull data from source systems such as APIs, databases, files, and streams.
- Transform: Clean, validate, reshape, and enrich the data, for example by handling nulls, normalizing formats, and computing metrics.
- Load: Write the processed data into the destination, typically a data warehouse, database, or data lake.
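The transform step is usually where most of the logic lives. The sketch below shows one way to handle nulls, normalize formats, and compute a metric in plain Python. The records, the field names, and the choice to treat a missing amount as zero and a missing country as "UNKNOWN" are all assumptions for illustration; a real pipeline might reject or quarantine such rows instead.

```python
from collections import defaultdict

# Hypothetical records as they might arrive from an extract step.
records = [
    {"email": " Ada@Example.com ", "country": "us", "amount": "19.99"},
    {"email": "ada@example.com", "country": None, "amount": "5.01"},
    {"email": "bob@example.com", "country": "DE", "amount": None},
]


def clean(record):
    """Normalize formats and replace nulls with explicit defaults."""
    amount = record["amount"]
    return {
        # Trim whitespace and lowercase so the same address always matches.
        "email": record["email"].strip().lower(),
        # Assumed default: a missing country becomes "UNKNOWN".
        "country": (record["country"] or "UNKNOWN").upper(),
        # Store money as integer cents; assumed default of 0 for a missing amount.
        "amount_cents": round(float(amount) * 100) if amount is not None else 0,
    }


def revenue_by_email(cleaned_rows):
    """Compute a simple metric: total revenue in cents per customer email."""
    totals = defaultdict(int)
    for row in cleaned_rows:
        totals[row["email"]] += row["amount_cents"]
    return dict(totals)


cleaned = [clean(r) for r in records]
totals = revenue_by_email(cleaned)
print(totals)
```

Normalizing the email before aggregating is what lets the first two records count as the same customer.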
Frequently Asked Questions
ETL vs ELT?
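ETL transforms data before loading it, which is the traditional approach. ELT loads the raw data first and then transforms it inside the warehouse, a more recent approach commonly built on dbt with BigQuery or Snowflake. ELT is more flexible because the raw data stays in the warehouse, so transformations can be revised and rerun without extracting from the sources again.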
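The ELT ordering can be illustrated with a small sketch: raw rows are loaded untouched, and the transformation is a SQL statement run inside the destination. An in-memory SQLite database is used here only as a stand-in for a warehouse such as BigQuery or Snowflake, and the table names raw_orders and paid_orders and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a warehouse

# Load: raw data lands as-is, including inconsistent casing and text-typed numbers.
conn.execute("CREATE TABLE raw_orders (order_id TEXT, amount TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [
        ("1", "100.50", "PAID"),
        ("2", "20.00", "paid"),
        ("3", "15.00", "refunded"),
    ],
)

# Transform: SQL executed inside the destination builds a clean table
# from the raw one. The raw table is left in place.
conn.execute(
    """
    CREATE TABLE paid_orders AS
    SELECT CAST(order_id AS INTEGER) AS order_id,
           CAST(amount AS REAL) AS amount
    FROM raw_orders
    WHERE LOWER(status) = 'paid'
    """
)

paid = conn.execute(
    "SELECT order_id, amount FROM paid_orders ORDER BY order_id"
).fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0]
print(paid, raw_count)
```

Because raw_orders still holds every original row, a different transformation (for example, one that includes refunds) can be built later without going back to the source.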