How Modern Table Formats Actually Work (and what they mean for your data architecture)
Most data professionals use the term “lakehouse” without fully understanding what it means. The word appears in vendor marketing, conference talks, and architectural diagrams, often referring to slightly different things depending on who’s using it. This is partly because the lakehouse is a real architectural shift and partly because the term has been claimed by every vendor looking to position their product.
Underneath the marketing is a genuine technical innovation worth understanding: modern table formats. These are metadata layers that sit on top of Parquet files and turn collections of immutable files into something resembling a real database table — with transactions, schema evolution, time travel, and concurrent write coordination. Apache Iceberg, Delta Lake, and Apache Hudi are the three major implementations. The lakehouse architecture is what you get when you combine table formats with cloud object storage and multiple query engines.
This article is about what’s actually happening at the file and metadata level when you use these formats. By the end, you should understand why Parquet alone isn’t enough for modern workloads, how table formats provide database-like guarantees on top of immutable object storage, where Iceberg, Delta, and Hudi actually differ, and when the lakehouse architecture earns its complexity versus when it doesn’t.
Most of this transfers across vendors and tools. The specific implementations differ, of course, but the underlying architecture is increasingly standardized.
Why Parquet alone isn’t enough
In an earlier article on columnar storage, I described Parquet as a self-describing, compressed, statistics-rich, parallel-readable analytical artifact. That description is accurate at the file level. It’s also incomplete in a way that becomes obvious as soon as you try to use Parquet for anything beyond read-only analytics.
Parquet files are immutable. Once written, you cannot modify them. If a row needs to change, you have two options: rewrite the entire file with the change applied, or write a new file containing the change and somehow merge it during reads. Neither operation is a transaction, neither handles concurrent writers gracefully, and neither tracks history.
Parquet has no concept of a “table.” A directory full of Parquet files is not a table — it’s a directory full of Parquet files. There’s no schema enforcement across files (one file might have an evolved schema, another might not). There’s no atomic commit across files (a query might see some files but not others if writes are happening). There’s no version history or time travel.
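To make this concrete, here is a minimal sketch using pyarrow (the directory, file, and column names are invented for illustration). Two writers drop incompatible files into the same directory, both writes succeed, and nothing records which schema is authoritative:

```python
import os
import pyarrow as pa
import pyarrow.parquet as pq

os.makedirs("events", exist_ok=True)

# Writer A: the original schema.
pq.write_table(
    pa.table({"user_id": [1, 2], "amount": [9.99, 4.50]}),
    "events/part-0001.parquet",
)

# Writer B: an "evolved" schema with a renamed key column and a new column.
pq.write_table(
    pa.table({"user": ["3", "4"], "amount": [1.25, 7.00], "country": ["DE", "US"]}),
    "events/part-0002.parquet",
)

# Both writes succeed independently. The directory now holds two
# incompatible schemas and no record of which one is "current".
for path in ("events/part-0001.parquet", "events/part-0002.parquet"):
    print(path, pq.read_schema(path))
```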
Parquet provides no transactional guarantees. If two writers append to the same dataset, there’s no coordination mechanism to ensure consistency. If one writer fails partway through writing multiple files, you are left with orphaned files with no way to atomically roll back.
For analytical workloads in the early 2010s, this was acceptable. You wrote Parquet files in batch, queried them read-only, and accepted that the dataset was effectively immutable. The world has changed. Modern analytical workloads need to update data continuously, evolve schemas as business requirements change, support concurrent writers across teams, and reason about data over time for audit and reproducibility.
Parquet alone cannot do any of this. Table formats are metadata layers that add these capabilities to collections of Parquet files. They turn a directory of files into a queryable, mutable, transactional table — without modifying the underlying file format. The Parquet stays Parquet. The transformation happens in metadata.
How table formats actually work
The fundamental mechanism shared across all three major formats is straightforward:
Data files hold the actual rows. These are typically Parquet, sometimes ORC or Avro. The data files themselves are unchanged from how they’d exist without a table format.
Metadata files track which data files belong to the table at any given point in time. They contain schema information, column-level statistics, file paths, partition information, and version or snapshot identifiers.
A pointer indicates the “current” version of the table — typically resolved through a single atomic operation that says “this is the table as it exists right now.”
When you write to a table, you don’t modify existing files. You write new data files (or marker files indicating deletions), then update the metadata to point to the new state. The update of the metadata pointer is the atomic commit. Until that commit happens, readers see the old version. After it happens, they see the new version. There’s no in-between state where readers might see partial changes.
This is how table formats provide ACID guarantees on top of immutable cloud storage like S3. Object storage doesn’t natively support multi-file transactions, but it does support atomic single-file operations. Table formats exploit this by reducing every transaction to a single atomic metadata pointer update.
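To see the shape of this, here is a toy model in Python. It is an illustration only, not any real format's layout: data files are immutable, every commit writes a new metadata file, and the only atomic step is the pointer swap (os.replace on a POSIX filesystem).

```python
import json
import os
import tempfile

class ToyTable:
    """A toy versioned table: immutable metadata files plus one CURRENT pointer."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)
        self._commit(version=0, files=[])

    def _commit(self, version, files):
        # Step 1: write a new, immutable metadata file for this version.
        meta_path = os.path.join(self.root, f"v{version}.metadata.json")
        with open(meta_path, "w") as f:
            json.dump({"version": version, "files": files}, f)
        # Step 2: atomically swap the CURRENT pointer to the new metadata.
        # Until this replace succeeds, readers keep seeing the old version.
        tmp = meta_path + ".ptr"
        with open(tmp, "w") as f:
            f.write(meta_path)
        os.replace(tmp, os.path.join(self.root, "CURRENT"))

    def current(self):
        # Readers resolve the pointer once and get a consistent snapshot.
        with open(os.path.join(self.root, "CURRENT")) as ptr:
            with open(ptr.read()) as meta:
                return json.load(meta)

    def append(self, data_file):
        # New data files are written first (not shown here); only then does
        # the metadata advance. Two concurrent appends would race on the
        # pointer swap, which is why real formats add commit coordination.
        snap = self.current()
        self._commit(snap["version"] + 1, snap["files"] + [data_file])

table = ToyTable(tempfile.mkdtemp())
table.append("part-0001.parquet")
table.append("part-0002.parquet")
print(table.current())  # {'version': 2, 'files': ['part-0001.parquet', 'part-0002.parquet']}
```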
This architecture has consequences worth understanding
Reads are cheap. A reader gets the current metadata pointer, reads the metadata, and now has a complete list of files representing its consistent snapshot of the table. No locking, no waiting for writers, no risk of seeing partial changes.
Writes are versioned by design. Every write produces a new version of the metadata, and old versions remain referenceable until garbage collected. This is what enables time travel — you can query the table as it existed at any past point in time simply by reading an older metadata pointer. Audit, reproducibility, and rollback all become natural operations rather than special engineering efforts.
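In PyIceberg, for example, this looks roughly like the sketch below. The catalog and table names are placeholders, and the exact API surface may vary between versions:

```python
# Assumes a configured PyIceberg catalog named "default" and an existing
# table "analytics.events"; both are placeholder names.
from pyiceberg.catalog import load_catalog

catalog = load_catalog("default")
table = catalog.load_table("analytics.events")

# Every committed snapshot stays addressable until it is expired.
for snap in table.snapshots():
    print(snap.snapshot_id, snap.timestamp_ms)

# Querying an older version means scanning against older metadata.
oldest = table.snapshots()[0]
old_rows = table.scan(snapshot_id=oldest.snapshot_id).to_arrow()
```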
Concurrent writers need coordination. If two writers both produce new metadata pointing to different new versions, only one can win the atomic commit. The other must retry. The three major formats handle this differently, which is one of the main ways they technically diverge.
Compaction is necessary maintenance. Because every write produces new files (small files for small writes), tables accumulate many small files over time. Periodic compaction merges them into larger files for read efficiency. This is the table format’s equivalent of database vacuum — invisible when working, painful when neglected.
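With Iceberg, for instance, this maintenance is typically run as stored procedures from Spark. A minimal sketch, assuming an existing SparkSession named spark with an Iceberg catalog registered as demo, and a placeholder table name:

```python
# "spark" is an existing SparkSession with an Iceberg catalog registered
# as "demo"; the table name is a placeholder.

# Compact many small files into fewer large ones for read efficiency.
spark.sql("CALL demo.system.rewrite_data_files(table => 'analytics.events')")

# Expire old snapshots so unreferenced files can be garbage collected.
# Time travel to anything older than this timestamp is no longer possible.
spark.sql("""
    CALL demo.system.expire_snapshots(
        table => 'analytics.events',
        older_than => TIMESTAMP '2026-01-01 00:00:00'
    )
""")
```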
Iceberg, Delta, Hudi — what’s actually different
The marketing makes these formats sound radically different. Technically, they are variations on the same theme. The differences matter, but for most workloads they are roughly equivalent in capability.
Apache Iceberg originated at Netflix in 2017 and is now governed by the Apache Software Foundation. Its key design choice is a hierarchical metadata structure: a current metadata file points to a manifest list, which points to manifest files, which point to data files. This indirection makes metadata operations efficient even on tables with millions of files. Iceberg has the strongest separation between table format and execution engine — the format is a true open standard, with implementations across BigQuery, Snowflake, Spark, Trino, DuckDB, Athena, and others. Its REST catalog specification has emerged as the standard interoperability layer for cross-engine deployments.
Delta Lake was created at Databricks, open-sourced in 2019, and is now governed by the Linux Foundation, though Databricks remains the dominant implementer and contributor. Its design uses a transaction log — a series of JSON files in a _delta_log/ directory — that records every change to the table. Reading a Delta table means reading the log to reconstruct the current state. Delta has the tightest integration with Spark and the Databricks platform, which is both its strength (excellent performance in that ecosystem) and its constraint (engines outside Databricks support it but typically lag in feature parity).
Apache Hudi originated at Uber in 2017. Its design distinguishes between “copy-on-write” tables (full file rewrites for updates) and “merge-on-read” tables (delta files merged at read time). This makes Hudi particularly strong for streaming ingestion workloads where new data arrives constantly and individual records need updating. The tradeoff is more complex operational behavior — merge-on-read tables require careful compaction strategies and produce more variable read performance.
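A rough sketch of how that choice surfaces when writing with Spark. The option keys are standard Hudi write configs, but the table, field, and path names are placeholders, and a real job needs the Hudi bundle on the classpath:

```python
# "spark" is an existing SparkSession with the Hudi bundle on the classpath;
# "df" is a placeholder DataFrame of incoming change records.
(
    df.write.format("hudi")
    .option("hoodie.table.name", "events")
    # COPY_ON_WRITE rewrites whole data files on update;
    # MERGE_ON_READ writes delta files and merges them at query time.
    .option("hoodie.datasource.write.table.type", "MERGE_ON_READ")
    .option("hoodie.datasource.write.recordkey.field", "event_id")
    .option("hoodie.datasource.write.precombine.field", "updated_at")
    .mode("append")
    .save("s3://analytics-bucket/hudi/events")
)
```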
The technical differences in 2026:
Iceberg has the broadest cross-engine support and is becoming the default choice for new lakehouse deployments. Its hierarchical metadata scales well, and its open standard nature has attracted the broadest ecosystem.
Delta Lake has the best integration with Databricks-centric workflows and has matured significantly. For teams already invested in Spark and Databricks, it remains the natural choice.
Hudi has the best support for streaming and CDC workloads with frequent updates. It has retreated from being a general-purpose competitor to being the right tool for specific high-frequency update patterns.
For most decisions, the format choice matters less than ecosystem fit. The technical capabilities have converged. The choice is increasingly about which format your tools support best, not which format is fundamentally superior.
The current state of the format competition
The competition between Iceberg, Delta, and Hudi has resolved more than most discussions acknowledge. By 2026, Apache Iceberg has emerged as the industry standard for open table formats. The signals are significant and converging:
AWS S3 Tables launched in late 2024 with Apache Iceberg as the native format. This was a strong endorsement from a major cloud provider that had previously been format-neutral.
Snowflake and BigQuery both added first-class support for Iceberg tables, allowing users to manage Iceberg tables natively while keeping the underlying files in their own object storage.
Databricks acquired Tabular — the company founded by Iceberg’s original creators — in 2024. Databricks also ships UniForm, which lets Delta tables be read as Iceberg tables, signaling that even the Delta ecosystem is converging toward Iceberg interoperability.
Apache Polaris, a vendor-neutral catalog implementation of the Iceberg REST specification, graduated to a top-level Apache project in February 2026. This addressed what had previously been the most vendor-locked layer of the lakehouse stack.
The Iceberg REST Catalog specification has become the standard interoperability layer, allowing different engines to share table metadata through a common API. Unity Catalog, AWS Glue, Nessie, and Lakekeeper all implement it.
The technical reasons for Iceberg’s adoption are well-understood: ACID transactions on object storage without a centralized lock service, schema evolution without data rewrites, partition evolution (changing partition schemes as query patterns evolve), and time travel through snapshot history. But the decisive factor has been governance. Iceberg is controlled by the Apache Software Foundation. Delta Lake, while open-source, is effectively controlled by Databricks — and that vendor coupling matters increasingly to enterprises wanting to avoid lock-in.
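As a brief illustration of those last technical points, Iceberg exposes schema and partition evolution as plain DDL through its Spark SQL extensions. A sketch, with placeholder catalog, table, and column names:

```python
# "spark" is an existing SparkSession with Iceberg's SQL extensions enabled
# and a catalog registered as "demo"; names are placeholders.

# Schema evolution: a metadata-only change, no data files are rewritten.
spark.sql("ALTER TABLE demo.analytics.events ADD COLUMNS (country string)")

# Partition evolution: new writes partition by day on event_ts, while
# existing files keep their old layout and remain queryable.
spark.sql("ALTER TABLE demo.analytics.events ADD PARTITION FIELD days(event_ts)")
```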
This is not to say Delta or Hudi are obsolete. Delta remains strong inside Databricks-heavy environments and continues to evolve. Hudi remains the right tool for streaming CDC workloads with high-frequency updates. But for teams making fresh architectural decisions today, Iceberg is the safest cross-engine bet. The momentum, the ecosystem, and the strategic decisions of major vendors all point in the same direction.
This convergence matters because it reduces the operational risk of adopting open table formats. Three years ago, choosing between Iceberg, Delta, and Hudi was a significant architectural decision with uncertain long-term implications. Today, that uncertainty has narrowed substantially.
What the lakehouse actually is
A “lakehouse” is what you get when you combine table formats with cloud object storage and bring multiple query engines to the same data.
The traditional architecture splits the stack into two systems:
Data lake. Cheap object storage (S3, GCS, Azure Blob), flexible file formats, weak guarantees, hard to query directly with SQL tools. Used primarily for raw data, ML training sets, and unstructured data.
Data warehouse. Expensive managed storage, structured tables, strong ACID guarantees, optimized query engines. Used for analytics, BI, and structured data.
Data flowed from lake to warehouse via ETL pipelines. The lake held everything; the warehouse held what analytics needed. Two systems, two storage costs, two governance models, two infrastructures to maintain — and inevitably, two copies of much of the same data drifting independently.
The lakehouse collapses this into one architecture. The same Parquet files sit in cloud object storage. Table formats provide the warehouse-like guarantees on top of those files. Query engines — Trino, Spark, DuckDB, Snowflake reading external tables, BigQuery reading external Iceberg tables — operate against the same data without copying. ML pipelines read the same files through Python libraries like PyIceberg or directly through Spark.
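As a sketch of the laptop end of that spectrum, DuckDB's iceberg extension can scan an Iceberg table directly. The bucket path below is a placeholder, and real deployments usually resolve tables through a catalog rather than a raw path:

```python
import duckdb

con = duckdb.connect()
# The iceberg extension reads Iceberg metadata; httpfs provides S3 access.
# Both installs assume network access and configured S3 credentials.
con.sql("INSTALL iceberg")
con.sql("LOAD iceberg")
con.sql("INSTALL httpfs")
con.sql("LOAD httpfs")

con.sql("""
    SELECT country, count(*) AS events
    FROM iceberg_scan('s3://analytics-bucket/warehouse/analytics/events')
    GROUP BY country
    ORDER BY events DESC
""").show()
```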
The benefits the lakehouse promises:
Single source of truth. No copy from lake to warehouse means no drift between them. The data that powers your dashboards is the same data that trains your ML models.
Lower storage cost. Object storage is dramatically cheaper than warehouse-managed storage — often by an order of magnitude. At scale, this becomes substantial.
Engine flexibility. Different workloads can use different engines against the same data. SQL analytics through Trino or BigQuery, ML training through Spark or PyTorch, ad-hoc exploration through DuckDB on a laptop.
Open formats. Your data is not locked into a proprietary format. Switching engines later becomes possible without data migration.
The complications most lakehouse marketing downplays:
Performance is engine-dependent. Not every engine is equally fast on Iceberg or Delta. Snowflake reading external Iceberg is generally slower than Snowflake reading native Snowflake tables. BigQuery reading external Iceberg has similar tradeoffs. The performance gap has narrowed but hasn’t disappeared.
High operational complexity. You’re now managing table formats, compaction schedules, metadata garbage collection, snapshot expiration, and engine-specific quirks. The warehouse vendor handled all of this for you. The lakehouse pushes it back to your team.
Governance is harder. When multiple engines access the same data, access control and audit logging become coordination problems across engines rather than a single system’s responsibility. The promise of “one source of truth” comes with the reality of “one source of truth that multiple systems need to access correctly.”
The “open formats” promise is more theoretical than practical for many teams. Switching engines later is possible but rarely free. Production pipelines, dashboards, and ML training jobs all encode assumptions about specific engines. Migration costs remain substantial.
The honest framing: the lakehouse architecture is genuinely better for some workloads and roughly equivalent (with different tradeoffs) for others. It’s not universally superior to the warehouse model. It is the direction the industry is moving, but adoption should be deliberate.
What’s worth taking away
The architectural conversation about lakehouses is often framed as a binary choice — warehouse versus lakehouse, old versus new. This framing is misleading. Most production architectures aren’t pure lakehouses or pure warehouses. They’re hybrid systems that use whichever component fits each workload.
The deeper insight is that table formats are an enabling technology, and the lakehouse is one architectural pattern they enable. Other patterns are emerging too: managed warehouses such as BigQuery and Snowflake are adding native Iceberg support, enabling lakehouse-like flexibility within warehouse-managed environments.
For practitioners, this has practical implications. The skills worth investing in are the underlying mechanics — how table formats work, how metadata layers provide ACID guarantees on object storage, how to reason about the analytics/ML split in your own organization. The specific vendor or product choice matters less than understanding the architecture well enough to evaluate options as the landscape continues to shift.
The trend is unmistakable: open table formats are becoming the standard substrate for analytical data, regardless of which engine sits on top. Whether you call the resulting architecture a “lakehouse” or something else is mostly semantic. Whether you understand the architecture well enough to make good decisions about it is not.
Most data professionals will encounter these formats in the next few years, whether by choice or because their existing tools quietly adopt them. The work of understanding the architecture now pays back when those decisions become unavoidable.