dbt: Data transformation using software engineering practices
SQL-based transformation framework for analytics data warehouses.
dbt is a Python-based tool that compiles SQL select statements into executable transformations for data warehouses. It operates on a directed acyclic graph (DAG) model where each SQL query represents a node, and dbt automatically determines execution order based on declared dependencies between models. The tool supports multiple data warehouse backends and provides features for testing data quality, generating documentation, and version controlling transformation logic. Common use cases include building data marts, creating dimensional models, and establishing repeatable ELT (Extract, Load, Transform) pipelines in analytics workflows.

SQL-first approach
dbt uses standard SQL select statements as the primary interface for defining transformations, rather than requiring a proprietary language or visual interface. This allows analysts familiar with SQL to define complex data pipelines without learning additional frameworks.
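To make the idea concrete, here is a minimal sketch of how a bare select statement can become a transformation: the tool wraps it in DDL so the result is stored as a queryable relation. This is an illustration only, not dbt's implementation. It uses Python's built-in sqlite3 in place of a real warehouse, and materialize_as_view is a hypothetical helper name.

```python
import sqlite3


def materialize_as_view(conn, model_name, select_sql):
    """Wrap a bare select in DDL so it becomes a queryable relation.

    Hypothetical helper for illustration; not part of dbt.
    """
    conn.execute(f"drop view if exists {model_name}")
    conn.execute(f"create view {model_name} as {select_sql}")


# In-memory database standing in for a data warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table stg_orders (customer_id integer, order_total real)")
conn.executemany(
    "insert into stg_orders values (?, ?)",
    [(1, 10.0), (1, 15.0), (2, 7.5)],
)

# The analyst writes only this select; no DDL is needed.
orders_summary_sql = """
select
    customer_id,
    count(*) as order_count,
    sum(order_total) as total_spent
from stg_orders
group by customer_id
"""

materialize_as_view(conn, "orders_summary", orders_summary_sql)
rows = conn.execute(
    "select * from orders_summary order by customer_id"
).fetchall()
print(rows)
```

The analyst only ever touches the select; the choice between a view, a table, or another materialization stays a configuration concern.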
Dependency management via DAG
dbt builds a directed acyclic graph from the ref() calls in each model and runs models in dependency order, so upstream models are always built first. This removes the need for hand-written scheduling logic and lets developers reference upstream models with the ref() function instead of hardcoding table names.
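The dependency resolution described above can be sketched in a few lines. This is a simplified illustration, not dbt's parser: it finds ref() calls with a regular expression and orders the models with Python's standard graphlib module. The model names raw_orders and customer_report, and the helper names find_refs and execution_order, are hypothetical.

```python
import re
from graphlib import TopologicalSorter

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def find_refs(sql):
    """Return the set of model names referenced via ref() in a SQL string."""
    return set(REF_PATTERN.findall(sql))


def execution_order(models):
    """Order models so that every model comes after the models it references."""
    graph = {name: find_refs(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


# Models deliberately listed out of order.
models = {
    "orders_summary": (
        "select customer_id, count(*) as order_count "
        "from {{ ref('stg_orders') }} group by customer_id"
    ),
    "stg_orders": "select * from raw_orders",
    "customer_report": "select * from {{ ref('orders_summary') }}",
}

order = execution_order(models)
print(order)
```

Because the graph is derived from the SQL itself, adding a new ref() to a model is enough to change the run order; graphlib raises CycleError if the references form a cycle.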
Built-in testing and documentation
dbt includes native support for data quality tests and generates documentation from model definitions and their descriptions. These capabilities live inside the project structure and run in CI/CD workflows, so no separate tooling is required.
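A dbt data test is, conceptually, a query that selects failing rows: the test passes when the query returns nothing. The sketch below illustrates that idea for uniqueness and not-null checks. It is an assumption-laden illustration using sqlite3 rather than dbt's own test SQL, and the helper names failing_rows_unique and failing_rows_not_null are hypothetical.

```python
import sqlite3


def failing_rows_unique(conn, table, column):
    """Return values of `column` that appear more than once (hypothetical helper)."""
    query = f"""
        select {column}, count(*) as n
        from {table}
        where {column} is not null
        group by {column}
        having count(*) > 1
    """
    return conn.execute(query).fetchall()


def failing_rows_not_null(conn, table, column):
    """Return rows where `column` is null (hypothetical helper)."""
    query = f"select * from {table} where {column} is null"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders_summary (customer_id integer, order_count integer)"
)
# Sample data with one duplicated customer_id and one null customer_id.
conn.executemany(
    "insert into orders_summary values (?, ?)",
    [(1, 2), (2, 1), (2, 3), (None, 4)],
)

unique_failures = failing_rows_unique(conn, "orders_summary", "customer_id")
not_null_failures = failing_rows_not_null(conn, "orders_summary", "customer_id")

# A test passes only when its query returns no rows.
print("unique passed:", not unique_failures)
print("not_null passed:", not not_null_failures)
```

Returning the failing rows, rather than a bare pass/fail flag, means a failed test already points at the records that need attention.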
-- models/orders_summary.sql
-- Reference a staging model to build aggregated orders
select
    customer_id,
    count(*) as order_count,
    sum(order_total) as total_spent
from {{ ref('stg_orders') }}
group by customer_id

Fixed deadlock issues in concurrent batch execution and removed unnecessary deprecation warnings for Python models.
- Avoid deadlock edge cases in concurrent microbatch/batch execution
- Stop raising deprecation warnings for internal Python model configs
- Raise the minimum click version to 8.3.0
Fixed catalog integration functionality to ensure proper initialization even when a manifest already exists.
- Add add_catalog_integration call even if we have a pre-existing manifest
Introduced new configuration metadata methods and fixed a critical bug in the meta_require method implementation.
- Implement config.meta_get and config.meta_require
- Add omitted return statement to RuntimeConfigObject.meta_require method
- Bump lower bound for dbt-common to 1.37.2
Related Repositories
Discover similar tools and frameworks used by developers
n8n
Node-based automation platform with JavaScript and Python scripting.
Apache Airflow
Python platform for DAG-based task orchestration and scheduling.
pandas
Labeled data structures for tabular data analysis.
ClickHouse
Column-oriented database for real-time analytics with SQL support and distributed computing capabilities.
PostHog
Event tracking, analytics, and experimentation platform.