AIStor Table Sharing: The Storage Layer Databricks Has Been Waiting For

Denis Dubeau, a Partner Solution Architect at Databricks, and I recently published a piece on connecting your on-prem data estate to Databricks without replication. The post makes a clear architectural argument: as enterprises scale AI and analytics in the cloud, the traditional approach of copying data into cloud storage before analysis can begin introduces latency, cost, governance risk, and operational drag that compound as data volumes grow.

The conclusion — that Delta Sharing should be embedded directly into the storage layer rather than deployed as a standalone service — is one that MinIO shares. In fact, it's exactly what MinIO built with AIStor Table Sharing.

This post picks up where Denis and I left off. Our walkthrough covers the architectural rationale and the Databricks perspective. This one covers the implementation — what AIStor Table Sharing actually does, how it works, and why embedding Delta Sharing into the object store changes the economics and operations of hybrid analytics.

The Architectural Gap Denis Identified

Denis frames the problem precisely. Replication creates a tax on insight. Every copy of a dataset multiplies storage cost, widens governance exposure, and introduces staleness. Data engineers spend their time maintaining sync logic rather than enabling analytics. AI teams train on yesterday's data. Business teams wait.

Delta Sharing addresses this by enabling secure, read-only access to live tables without replication. But we also identified a second-order problem that many organizations overlook: even after adopting Delta Sharing, most hybrid deployments still require a standalone sharing server sitting between storage and compute. That intermediary introduces its own operational overhead — another service to deploy, secure, patch, monitor, and integrate with authentication and authorization systems.

The architectural insight is that the sharing plane should converge with the data plane. The object store that already manages the data should also be the system that shares it.

That insight is the design principle behind AIStor Table Sharing.

What AIStor Table Sharing Does

AIStor Table Sharing is a native implementation of the Delta Sharing protocol, built directly into MinIO AIStor. It is the first on-premises object store to embed Delta Sharing at the storage layer.

There is no separate sharing server. No sidecar governance service. No additional infrastructure between the data and the analytics platform. When you define a table share in AIStor, you define it in the same system that stores, governs, encrypts, and manages the lifecycle of the data.

Databricks connects to AIStor via Delta Sharing, and on-premises tables — both Delta and Apache Iceberg — appear as queryable objects in the Databricks workspace. Queries execute against live, current data. The data never moves.

The Layers AIStor Table Sharing Eliminates

Denis and I described a progression from standalone sharing servers toward embedded sharing. Here is what that progression looks like concretely when sharing moves into AIStor:

No standalone Delta Sharing server. The reference architecture for Delta Sharing typically requires deploying and maintaining a separate server that brokers access between storage and analytics consumers. AIStor eliminates that server entirely. The sharing endpoint is the storage endpoint.

No separate governance plane. In many hybrid deployments, the sharing server maintains its own authentication, authorization, and token management — separate from the governance already enforced at the storage layer. AIStor Table Sharing uses the same identity and access controls that govern the data itself. One system, one set of policies.

No export or staging layer. Organizations that can't deploy a sharing server often fall back to SFTP, S3 sync jobs, or manual exports to get data from on-premises storage into cloud analytics. AIStor Table Sharing replaces all of these with a protocol-native connection. Data is shared in place, not extracted.

No pipeline maintenance. Every replication pipeline is a system that must be monitored, maintained, and fixed when it breaks. Removing the need for replication removes the need for the infrastructure that supports it. The operational savings compound with every dataset that no longer needs to be copied.

Multi-Format Flexibility with UniForm

One detail that matters at enterprise scale: AIStor Table Sharing supports both Delta and Apache Iceberg tables through Delta Universal Format (UniForm).

This is significant because enterprise data environments are rarely standardized on a single table format. Different teams, different workloads, and different points in the adoption curve mean that Delta and Iceberg tables often coexist within the same organization.

AIStor Tables — the Iceberg V3-native table engine built into AIStor — manages both structured and unstructured data in a single platform. Table shares can include Delta tables, Iceberg tables, or a mix of both. Databricks queries all of them through the same Delta Sharing protocol.

Organizations don't have to converge on a single format before they can start sharing. They share what they have, in the format it already exists.

The Databricks Experience

For Databricks users, the experience is designed to be invisible. AIStor Table Sharing implements Delta Sharing 1.0, which means any Databricks workspace that supports Delta Sharing can connect to AIStor without custom connectors or proprietary gateways.

The connection uses either bearer tokens or OAuth2 for authentication. Databricks Unity Catalog can import share credentials directly. Once connected, shared tables appear alongside cloud-native tables in the Databricks workspace — same query interface, same governance model, same notebooks and dashboards.

The difference is that the data behind those tables hasn't moved. It still resides on-premises in AIStor, governed by the policies the organization already has in place. Databricks sends queries; AIStor returns results. The data stays put.

As Denis notes in his post, this is the model that preserves both the flexibility of cloud compute and the control of on-premises data. AIStor Table Sharing is the implementation that makes it operational.

Why This Matters Now

Three forces are converging that make this architecture not just attractive but necessary.

Data gravity is real and growing. Enterprises generate and retain more data on-premises than ever, driven by regulatory requirements, performance needs, and the simple economics of storing petabytes. That data isn't moving to the cloud — and the analytics platforms that need it have to meet it where it is.

AI demands fresh data. Models trained on stale snapshots produce stale predictions. Real-time analytics require real-time access. The latency introduced by replication pipelines — even well-maintained ones — directly undermines the value of the AI workloads that Databricks enables.

Cost scrutiny is intensifying. Duplicate storage, continuous sync jobs, cloud egress fees, and pipeline maintenance all add up. Organizations are asking harder questions about the cost of making data available versus the cost of the analysis itself. AIStor Table Sharing shifts that equation by eliminating the data movement costs entirely.

Two Posts, One Architecture

Denis's post and this one describe the same architecture from two vantage points. His starts from Databricks and looks toward the data. This one starts from the data and looks toward Databricks.

The shared conclusion is the same: the most effective hybrid analytics architecture is one where data stays in place and compute reaches it through open protocols. Delta Sharing is the protocol. AIStor is the storage layer that implements it natively.

Together, they establish a pattern for hybrid AI and analytics that eliminates the replication tax without compromising governance, performance, or control.

Getting Started

AIStor Table Sharing is generally available. Here is how to evaluate the approach for your environment:

Check out our white paper. Looking for a more in-depth overview on native Delta Sharing with AIStor? Read our new white paper “Extending Databricks AI and Analytics to Your On-Premises Data.”

Start with AIStor Free. A single-node license for development and testing at no cost. Deploy a local AIStor instance, create Delta or Iceberg tables, and configure your first table share.

Walk through the setup. The Table Sharing documentation covers share configuration, authentication (bearer token and OAuth2), and Databricks connectivity. For a more guided walkthrough, see Zero-Copy Data Sharing: AIStor to Databricks.

Read Denis's post. Our walkthrough covers the rationale, the protocol, and the architectural principles. Between his perspective and the AIStor implementation, both ends of the connection are covered.

Watch the webinar. Join Denis, Dwight Evers and me for a live demo of AIStor Table Sharing at the webinar "From On-Prem to Insight: Secure, Zero-Copy Analytics with Databricks and MinIO AIStor."

The infrastructure to connect on-premises data to Databricks without replication exists today. The question is no longer whether to adopt it — it's how quickly you can retire the pipelines you built to work around not having it.

Whether you're exploring AI-native object storage or planning your next deployment, we'd love to help.
Let's start a conversation or jump right in and try AIStor yourself.

Related Posts