Version: devel

Introduction

dltHub

This page describes a dltHub feature, which requires a license. Join our early access program for a trial license.

What is dltHub?

dltHub is an LLM-native data engineering platform that lets any Python developer build, run, and operate production-grade data pipelines, and deliver end-user-ready insights without managing infrastructure.

dltHub is built around the open-source library dlt. It uses the same core concepts (sources, destinations, pipelines) and extends the extract-and-load focus of dlt with:

  • Enhanced developer experience
  • Transformations
  • Data quality
  • AI-assisted (“agentic”) workflows
  • Managed runtime

dltHub supports both local and managed cloud development. A single developer can deploy and operate pipelines, transformations, and notebooks directly from a dltHub Workspace, using a single command. The dltHub Runtime, customizable pipeline dashboard, and validation tools make it straightforward to monitor, troubleshoot, and keep data reliable throughout the end-to-end data workflow.

In practice, this means any Python developer can:

  • Build and customize data pipelines quickly (with LLM help when desired).
  • Derisk data insights by keeping data quality high with checks, tests, and alerts.
  • Ship fresh dashboards, reports, and data apps.
  • Scale data workflows without babysitting infrastructure, schema drift, or silent failures.

Tip: Want to see it end-to-end? Watch the dltHub Workspace demo.

To get started quickly, follow the installation instructions.

Overview

Key capabilities

  1. LLM-native workflow: accelerate pipeline authoring and maintenance with guided prompts and copilot experiences.

  2. Transformations: write Python or SQL transformations with @dlt.hub.transformation, orchestrated within your pipeline.

  3. Data quality: define correctness rules, run checks, and fail fast with actionable messages.

  4. Data apps & sharing: build lightweight, shareable data apps and notebooks for consumers.

  5. AI agentic support: use MCP servers to analyze pipelines and datasets.

  6. Managed runtime: deploy and run with a single command, with no infrastructure to provision or patch.

  7. Storage choice: pick managed Iceberg-based lakehouse, DuckLake, or bring your own storage.
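
To make the "define correctness rules, run checks, and fail fast" idea from item 3 concrete, here is a minimal sketch in plain Python. It is not the dltHub data quality API: `Rule`, `DataQualityError`, `check_rows`, and the example rules are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable

Row = dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A named correctness rule (hypothetical helper, not a dltHub type)."""
    name: str
    predicate: Callable[[Row], bool]
    message: str


class DataQualityError(ValueError):
    """Raised as soon as a row violates a rule."""


def check_rows(rows: Iterable[Row], rules: list[Rule]) -> int:
    """Check every row against every rule; stop at the first violation.

    Returns the number of rows that passed all rules.
    """
    checked = 0
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.predicate(row):
                # Fail fast with a message that names the row and the rule.
                raise DataQualityError(
                    f"row {index} failed '{rule.name}': {rule.message} (row={row!r})"
                )
        checked += 1
    return checked


# Example rules for an illustrative table with "id" and "amount" columns.
RULES = [
    Rule("id_present", lambda r: r.get("id") is not None, "id must not be null"),
    Rule(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
        "amount must be a number >= 0",
    ),
]
```

Calling `check_rows(rows, RULES)` before loading would stop the run at the first bad row, which is the fail-fast behavior described above. A real setup would more likely collect metrics and alerts as well.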

How dltHub fits with dlt (OSS)

dltHub builds on the dlt library rather than replacing it:

  • dlt (OSS): Python library focused on extract & load with strong typing and schema handling.
  • dltHub: Adds transformations, quality, agentic tooling, managed runtime, and storage choices, so you can move from local dev to production seamlessly.

If you like the dlt developer experience, dltHub gives you everything around it to run in production with less toil.
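
For readers new to dlt, the extract-and-load core that dltHub builds on can be sketched as follows. This uses only the open-source library and assumes it is installed with the DuckDB extra (`pip install "dlt[duckdb]"`). The pipeline, dataset, and table names and the sample rows are made up for illustration, and no dltHub-specific API is shown.

```python
from typing import Any, Iterator


def user_rows() -> Iterator[dict[str, Any]]:
    """A tiny in-memory source; a real one would call an API or a database."""
    yield {"id": 1, "name": "Ada"}
    yield {"id": 2, "name": "Grace"}


def load_users() -> None:
    """Load the rows into a local DuckDB database with dlt."""
    # Imported here so user_rows() stays usable without dlt installed.
    import dlt

    pipeline = dlt.pipeline(
        pipeline_name="intro_example",  # illustrative name
        destination="duckdb",
        dataset_name="intro_data",      # illustrative name
    )
    # dlt infers the schema from the rows and creates the "users" table.
    load_info = pipeline.run(user_rows(), table_name="users")
    print(load_info)
```

Calling `load_users()` runs the pipeline locally. The source, destination, and pipeline concepts are the same ones dltHub uses.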

dltHub products

dltHub consists of three main products. You can use them together or compose them based on your needs.

Workspace

Workspace [Public preview] - the unified environment for building, running, and maintaining data workflows end-to-end.

  • Scaffolding and LLM helpers for faster pipeline creation.
  • Integrated transformations (@dlt.hub.transformation decorator).
  • Data quality rules, test runs, and result surfacing.
  • Notebooks and data apps (e.g., Marimo) for sharing insights.
  • Visual dashboards for pipeline health and run history.

Runtime [Private preview]

Runtime - a managed cloud runtime operated by dltHub:

  • Scalable execution for pipelines and transformations.
  • APIs, web interfaces, and auxiliary services.
  • Secure, multi-tenant infrastructure with upgrades and patching handled for you.

Tip: Prefer full control? See Enterprise below for self-managed options.

Storage

Storage [In development]. Choose where your data lives:

  • Managed lakehouse: Iceberg open table format (or DuckLake) managed by dltHub.
  • Bring your own storage: connect to your own lake/warehouse when needed.

Tiers & licensing

Some of the features described in this documentation are free to use; others require a paid plan. The latest pricing and the full feature matrix are available on our website. Most features support a self-guided trial right after install. Check the installation instructions for more information.

| Tier | Best for | Runtime | Typical use case | Notes | Availability |
|---|---|---|---|---|---|
| dltHub Basic | Solo developers or small teams owning a single pipeline + dataset + reports end-to-end | Managed dltHub Runtime | Set up a pipeline quickly, add tests and transformations, share a simple app | Optimized for velocity with minimal setup | Private preview |
| dltHub Scale | Data teams building composable data platforms with governance and collaboration | Managed dltHub Runtime | Multiple pipelines, shared assets, team workflows, observability | Team features and extended governance | Alpha |
| dltHub Enterprise | Organizations needing enterprise controls or self-managed runtime | Managed or self-hosted Runtime | On-prem/VPC deployments, custom licensing, advanced security | Enterprise features and deployment flexibility | In development |

Who is dltHub for?

  • Python developers who want production outcomes without becoming infra experts.
  • Lean data teams standardizing on dlt and wanting integrated quality, transforms, and sharing.
  • Organizations that prefer managed operations but need open formats and portability.

Note:
  • You can start on Basic and upgrade to Scale or Enterprise later with no code rewrites.
  • We favor open formats and portable storage (e.g., Iceberg), whether you choose our managed lakehouse or bring your own.
  • For exact features and pricing, check the site; this section is meant to help you choose a sensible starting point.

This demo runs on GitHub Codespaces, a development environment available for free to anyone with a GitHub account. You'll be asked to fork the demo repository; from there, the README guides you through the remaining steps.
The demo uses the Continue VS Code extension.

