Extend Databricks to Your On-Premises Data: Native Delta Sharing. Zero-Copy. Zero Pipelines.

About this Resource

Enterprises have standardized on Databricks for AI and analytics, but much of the data those workloads need still lives on-premises, held there by sovereignty rules, latency requirements, or the cost of moving petabytes. Until now, bridging that gap meant compromises: masking sensitive fields, downsampling volumes, and maintaining pipelines that keep stale cloud copies in sync. Each workaround reduced the fidelity of the data Databricks could access.

AIStor Table Sharing removes those compromises by building the open Delta Sharing protocol directly into the MinIO AIStor data platform. Databricks queries on-premises Apache Iceberg and Delta tables live, through Unity Catalog, with read-only access enforced at the storage layer via scoped bearer tokens. Data never moves.

This solution brief explains how it works in four steps (store, share credentials, mount the catalog, query directly), covers supported use cases (hybrid AI/ML, enterprise analytics, regulated workloads), and details why it matters: live data instead of stale snapshots, eliminated cloud storage costs for mirrored datasets, full Unity Catalog governance including audit logs and column-level permissions, and no rewrites, migrations, or forced format standardization. Databricks SVP Stephen Orban endorsed this integration directly, noting that it accelerates time-to-insight for hybrid workloads without complex replication.

Key Takeaways:

AIStor Table Sharing builds Delta Sharing directly into the storage platform. There is no separate sharing server to deploy or manage, and data never leaves the on-premises environment.

Access is read-only and authenticated with scoped bearer tokens that can be configured to expire. Full Unity Catalog governance, including access controls, audit logs, and column-level permissions, applies automatically.

The solution supports Apache Iceberg and Delta tables natively in mixed-format environments with no rewrites, migrations, or forced standardization required.
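
For readers who want a concrete picture of the "share credentials" step, the sketch below shows roughly what an expiring, token-based credential looks like. It assumes AIStor issues the standard profile file defined by the open Delta Sharing protocol (shareCredentialsVersion, endpoint, bearerToken, expirationTime); the brief itself does not show the file. The endpoint URL, token value, and helper names are hypothetical placeholders, not AIStor or Databricks APIs.

```python
import json
from datetime import datetime, timedelta, timezone

# Timestamp layout used in the Delta Sharing protocol's profile examples.
_TS_FORMAT = "%Y-%m-%dT%H:%M:%S.0Z"


def build_profile(endpoint: str, bearer_token: str, ttl_hours: int, now: datetime) -> dict:
    """Build a version-1 Delta Sharing profile as a plain dict.

    Hypothetical helper: in practice the data provider issues this file.
    `now` must be a timezone-aware UTC datetime.
    """
    expires = now + timedelta(hours=ttl_hours)
    return {
        "shareCredentialsVersion": 1,
        "endpoint": endpoint,
        "bearerToken": bearer_token,
        "expirationTime": expires.strftime(_TS_FORMAT),
    }


def is_expired(profile: dict, now: datetime) -> bool:
    """Return True once `now` has reached the profile's expirationTime."""
    expires = datetime.strptime(profile["expirationTime"], _TS_FORMAT)
    return now >= expires.replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    issued_at = datetime.now(timezone.utc)
    profile = build_profile(
        endpoint="https://aistor.example.internal/delta-sharing",  # placeholder URL
        bearer_token="<scoped-read-only-token>",  # placeholder token
        ttl_hours=24,
        now=issued_at,
    )
    print(json.dumps(profile, indent=2))
    print("expired:", is_expired(profile, issued_at))
```

A recipient would typically hand a file like this to its Delta Sharing client or catalog configuration; once the expiration time passes, the provider has to issue a new token before queries can resume.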

Who this is for

Data platform engineers and analytics architects at organizations that run Databricks in the cloud but store regulated, high-gravity, or latency-sensitive datasets on-premises and need live query access without replication.
