Introduction
This page describes dltHub features, which require a license. Join our early access program for a trial license.
What is dltHub?
dltHub is an LLM-native data engineering platform that lets any Python developer build, run, and operate production-grade data pipelines, and deliver end-user-ready insights without managing infrastructure.
dltHub is built around the open-source library dlt. It uses the same core concepts (sources, destinations, pipelines) and extends the extract-and-load focus of dlt with:
- Enhanced developer experience
- Transformations
- Data quality
- AI-assisted (“agentic”) workflows
- Managed runtime
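To make the core model concrete: in dlt, a source yields data, a pipeline moves it, and a destination stores it. The sketch below mirrors that flow in plain Python. It is a conceptual illustration only, not the dlt API; all names in it are made up for the example.

```python
from typing import Any, Callable, Iterable, Iterator

# Illustrative stand-ins for dlt's concepts, not the real dlt API:
# a "source" yields records, a "destination" receives them, and a
# "pipeline" connects the two.

def users_source() -> Iterator[dict[str, Any]]:
    """A source: yields records, e.g. rows pulled from an API."""
    yield {"id": 1, "name": "ada"}
    yield {"id": 2, "name": "grace"}

def run_pipeline(source: Callable[[], Iterable[dict[str, Any]]],
                 destination: list[dict[str, Any]]) -> int:
    """A pipeline: extracts from the source and loads into the destination."""
    loaded = 0
    for record in source():
        destination.append(record)  # the load step
        loaded += 1
    return loaded

warehouse: list[dict[str, Any]] = []  # destination: here just an in-memory list
count = run_pipeline(users_source, warehouse)
```

In real dlt, the destination would be a database or lakehouse and the pipeline would also handle typing and schema evolution; the shape of the flow is the same.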
dltHub supports both local and managed cloud development. A single developer can deploy and operate pipelines, transformations, and notebooks directly from a dltHub Workspace with a single command. The dltHub Runtime, customizable pipeline dashboard, and validation tools make it straightforward to monitor, troubleshoot, and keep data reliable across the end-to-end data workflow.
In practice, this means any Python developer can:
- Build and customize data pipelines quickly (with LLM help when desired).
- Derisk data insights by keeping data quality high with checks, tests, and alerts.
- Ship fresh dashboards, reports, and data apps.
- Scale data workflows without babysitting infrastructure or fighting schema drift and silent failures.
Want to see it end-to-end? Watch the dltHub Workspace demo.
To get started quickly, follow the installation instructions.
Overview
Key capabilities
- LLM-native workflow: accelerate pipeline authoring and maintenance with guided prompts and copilot experiences.
- Transformations: write Python or SQL transformations with @dlt.hub.transformation, orchestrated within your pipeline.
- Data quality: define correctness rules, run checks, and fail fast with actionable messages.
- Data apps & sharing: build lightweight, shareable data apps and notebooks for consumers.
- AI agentic support: use MCP servers to analyze pipelines and datasets.
- Managed runtime: deploy and run with a single command, with no infra to provision or patch.
- Storage choice: pick a managed Iceberg-based lakehouse, DuckLake, or bring your own storage.
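dltHub's own API for quality rules is not detailed on this page. As a rough plain-Python illustration of the define-rules, run-checks, fail-fast pattern (every name below is invented for the example):

```python
from typing import Any, Callable

# A hypothetical rule pairs an actionable message with a predicate
# that every valid row must satisfy.
Rule = tuple[str, Callable[[dict[str, Any]], bool]]

rules: list[Rule] = [
    ("'id' must be a positive integer",
     lambda row: isinstance(row.get("id"), int) and row["id"] > 0),
    ("'email' must contain '@'",
     lambda row: "@" in str(row.get("email", ""))),
]

def check_rows(rows: list[dict[str, Any]], rules: list[Rule]) -> None:
    """Fail fast on the first violation, with a message naming the rule and row."""
    for i, row in enumerate(rows):
        for message, predicate in rules:
            if not predicate(row):
                raise ValueError(f"row {i} failed check: {message} (got {row!r})")

good = [{"id": 1, "email": "ada@example.com"}]
check_rows(good, rules)  # passes silently
```

Failing fast with the rule text and the offending row in the error is what makes a check actionable: the message tells you what to fix without re-running the pipeline.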
How dltHub fits with dlt (OSS)
dltHub builds on the dlt library rather than replacing it:
- dlt (OSS): Python library focused on extract & load with strong typing and schema handling.
- dltHub: Adds transformations, quality, agentic tooling, managed runtime, and storage choices, so you can move from local dev to production seamlessly.
If you like the dlt developer experience, dltHub gives you everything around it to run in production with less toil.
dltHub products
dltHub consists of three main products. You can use them together or compose them based on your needs.
Workspace
Workspace [Public preview] - the unified environment for building, running, and maintaining data workflows end-to-end.
- Scaffolding and LLM helpers for faster pipeline creation.
- Integrated transformations (@dlt.hub.transformation decorator).
- Data quality rules, test runs, and result surfacing.
- Notebook and data apps (e.g., Marimo) for sharing insights.
- Visual dashboards for pipeline health and run history.
Runtime
Runtime [Private preview] - a managed cloud runtime operated by dltHub:
- Scalable execution for pipelines and transformations.
- APIs, web interfaces, and auxiliary services.
- Secure, multi-tenant infrastructure with upgrades and patching handled for you.
Prefer full control? See Enterprise below for self-managed options.
Storage
Storage [In development] - choose where your data lives:
- Managed lakehouse: Iceberg open table format (or DuckLake) managed by dltHub.
- Bring your own storage: connect to your own lake/warehouse when needed.
Tiers & licensing
Some of the features described in this documentation are free to use; others require a paid plan. The latest pricing and full feature matrix are available on our website. Most features support a self-guided trial right after install; see the installation instructions for details.
| Tier | Best for | Runtime | Typical use case | Notes | Availability |
|---|---|---|---|---|---|
| dltHub Basic | Solo developers or small teams owning a single pipeline + dataset + reports end-to-end | Managed dltHub Runtime | Set up a pipeline quickly, add tests and transformations, share a simple app | Optimized for velocity with minimal setup | Private preview |
| dltHub Scale | Data teams building composable data platforms with governance and collaboration | Managed dltHub Runtime | Multiple pipelines, shared assets, team workflows, observability | Team features and extended governance | Alpha |
| dltHub Enterprise | Organizations needing enterprise controls or self-managed runtime | Managed or self-hosted Runtime | On-prem/VPC deployments, custom licensing, advanced security | Enterprise features and deployment flexibility | In development |
Who is dltHub for?
- Python developers who want production outcomes without becoming infra experts.
- Lean data teams standardizing on dlt and wanting integrated quality, transforms, and sharing.
- Organizations that prefer managed operations but need open formats and portability.
- You can start on Basic and upgrade to Scale or Enterprise later; no code rewrites are needed.
- We favor open formats and portable storage (e.g., Iceberg), whether you choose our managed lakehouse or bring your own.
- For exact features and pricing, check the site; this section is meant to help you choose a sensible starting point.