What Isolation Actually Means (and what it looks like in analytical systems)
Have you ever wondered what Isolation (ACID) is or how it relates to analytical data systems? This article translates the principles to analytical systems like BigQuery.
This is the third article in a series on ACID. The first covered atomicity. The second covered consistency. This one covers isolation, which is the hardest of the four properties to understand.
The reason isolation is hard isn’t conceptual difficulty. It’s because isolation isn’t a guarantee in itself. It’s a spectrum of guarantees, with named levels, each allowing specific anomalies that the level above it forbids. And to understand isolation, you need to understand the anomalies and the levels in tandem.
This article covers what isolation actually provides in OLTP databases, where the concept originated, and then translates those principles to analytical systems like BigQuery, because that’s where most of the article’s readers actually work, and the analytical analog of isolation is rarely discussed.
Why isolation exists in the first place
The need for isolation comes from concurrent transactions. If only one transaction ran at a time, isolation would be free — each transaction would see the database as it was when the previous one finished. No anomalies possible. No coordination required.
The problem is that running transactions one at a time is unacceptably slow. A modern database serves thousands of concurrent connections. Forcing them to run sequentially would essentially waste all the parallelism available in modern hardware.
So databases run transactions concurrently. Multiple transactions are in flight at any moment, reading and writing the same data. The question is: what does each transaction see when it reads?
Consider two transactions running simultaneously:
Transaction A: Transaction B:
read balance from acc1 update acc1 set balance = balance + 100
if balance > 500: ... commitThe order matters. If A reads first and B commits later, A sees the original balance. If B commits first and A reads after, A sees the new balance. Both outcomes are reasonable.
But what if A reads while B is in the middle of its update? B has changed the balance in memory but hasn’t committed yet. Does A see the changed value or the original?
The answer depends on the database’s isolation level. The strictest answer is that A sees the original value because B hasn’t committed, so B’s changes don’t officially exist yet. The most permissive answer is that A might see either value, depending on timing.
Real databases sit somewhere on this spectrum, and where they sit is determined by how much you’ve configured them to care about concurrent anomalies. To understand the levels, you need to understand the anomalies they prevent.
The four classical anomalies
Four anomalies form the historical foundation for thinking about isolation. They’re listed here in roughly increasing order of subtlety.
Dirty read. Transaction A reads data that transaction B has written but not yet committed. If B later rolls back, A has seen a value that never officially existed.
Example: B starts a transaction that updates an account balance from $1000 to $500. Before B commits, A reads the balance and sees $500. Then B rolls back because of an error. The balance was never actually $500 — but A already used that value to make a decision. A’s logic is now based on a value that never existed.
Non-repeatable read. Transaction A reads the same row twice within its transaction and gets different values, because transaction B updated the row in between.
Example: A reads account 123’s balance and sees $1000. B updates account 123’s balance to $500 and commits. A reads account 123’s balance again, in the same transaction, and now sees $500. The same row, two different values within A’s transaction.
Phantom read. Transaction A runs the same query twice within its transaction and gets a different set of rows, because transaction B inserted (or deleted) data matching the query in between.
Example: A runs SELECT * FROM orders WHERE amount > 1000 and gets 5 rows. B inserts a new order with amount = 1500 and commits. A runs the same query again and now gets 6 rows. The new row “appeared” from A’s perspective — a phantom that wasn’t there before.
The distinction between non-repeatable reads and phantoms is worth stating precisely, because it confuses people: non-repeatable reads are about existing rows being modified. Phantoms are about new rows appearing (or rows disappearing) that match your filter criteria.
This distinction matters for implementation. Preventing non-repeatable reads is straightforward — when you read a row, the database can lock that row (or, in MVCC, give you a snapshot version) and guarantee you see the same value if you read it again. Preventing phantoms is harder. It’s not enough to lock the rows you’ve read; you have to prevent new matching rows from being inserted. This requires range locks or predicate locks, which are more complex and more restrictive than row locks.
Lost update. Two transactions read the same row, both modify it based on what they read, and one of the modifications is lost because the other transaction committed last.
Example: A reads balance = $1000 and plans to add $100. B reads balance = $1000 and plans to subtract $50. A writes $1100 and commits. B writes $950 and commits. The final balance is $950, but it should have been $1050. A’s $100 addition was lost. Both transactions read the same starting value, and the one that wrote last overwrote the other’s work.
These four anomalies are the historical foundation of isolation theory. The SQL standard’s isolation levels are defined by which of these each level prevents.
The four SQL isolation levels
The SQL standard defines four isolation levels, ordered from weakest to strongest.
Read Uncommitted. The weakest level. Allows all four anomalies, including dirty reads. A transaction can see uncommitted changes from other transactions. Almost never used in practice — modern databases provide stronger guarantees by default.
Read Committed. Prevents dirty reads. A transaction only sees data that has been committed. This is the default in PostgreSQL, Oracle, and SQL Server. It still allows non-repeatable reads, phantom reads, and lost updates.
Repeatable Read. Prevents dirty reads and non-repeatable reads. Within a transaction, every read of the same row returns the same value, even if other transactions have updated it. The SQL standard allows phantom reads at this level, though some databases (notably MySQL with InnoDB) prevent them too.
Serializable. The strongest level. Prevents all four anomalies. Transactions appear to execute as if they ran one at a time, in some serial order. The most expensive level to implement.
The standard’s definitions are useful but somewhat dated. They were written when locking was the dominant implementation strategy, and they don’t fully capture the anomalies that can occur with modern MVCC-based implementations. Postgres’s Serializable level, SQL Server’s Serializable, and Oracle’s Serializable all differ in their guarantees in specific edge cases. The standard is a starting point, not a precise specification.
The practical takeaway: most workloads run at Read Committed by default, and most application code is written assuming this level. Many bugs in transactional logic come from assuming stronger isolation than what’s actually provided.
How isolation is actually implemented
Two fundamental approaches are: locking and MVCC. Most modern databases use MVCC, but understanding both is important because they have different performance characteristics and different failure modes.
Locking-based isolation. When a transaction reads a row, it acquires a read lock. When it writes a row, it acquires a write lock. Other transactions wanting to access the same row have to wait for the lock to be released.
This is conceptually simple. It also creates a significant problem. Two transactions reading the same row can share read locks, but a transaction that wants to write has to wait for all readers to finish. Long-running transactions can block many shorter ones. This is why systems with heavy locking can have surprisingly poor concurrency under load — the locks serialize work that could, in principle, run in parallel.
Locking was the dominant strategy in older systems. It still appears in specific contexts, but most modern databases have moved to MVCC.
MVCC (Multi-Version Concurrency Control). Instead of locking rows, the database keeps multiple versions of each row. When a transaction starts, it sees a snapshot of the database as it existed at the moment the transaction began. Other transactions can update rows during the transaction’s lifetime — their updates create new versions — but the original transaction continues seeing the versions that were current when it started.
The mechanism worth being precise about: MVCC doesn’t pick a “winning” version. It keeps all versions, and each transaction sees the version that matches its snapshot.
If transaction A starts at time T1, A will see the data as it existed at T1. If transaction B starts at time T2 (slightly later), B sees the data as of T2. Both transactions run concurrently, looking at different snapshots. Neither blocks the other for reads.
Old versions are eventually cleaned up by a background process — the “vacuum” in Postgres, similar processes in other MVCC databases. Cleanup happens once no active transaction could possibly need an old version anymore.
The cost of MVCC: storage overhead from multiple versions, plus the periodic cleanup work. The benefit: readers never block writers, and writers never block readers. This is one of the reasons modern databases can handle high concurrency that older locking-based systems couldn’t.
PostgreSQL, Oracle, MySQL with InnoDB, and most modern databases use MVCC. Implementations vary in detail — how they track versions, how they decide which version a given transaction sees, when they clean up — but the core idea is the same.
The anomalies MVCC introduces
MVCC eliminates contention but introduces new anomalies that the classical four don’t capture cleanly. Two are worth knowing.
Write skew. Two transactions read overlapping data, make decisions based on what they read, and write to different rows. Neither transaction’s write conflicts with the other’s write, so both commit successfully — but the combined effect violates an invariant that depended on both transactions seeing each other’s changes.
The canonical example: a hospital requires at least one doctor on call at all times. Two doctors are currently on call. Both want to take themselves off call. Transaction A reads “2 doctors on call” and decides it’s safe to remove doctor A. Transaction B simultaneously reads “2 doctors on call” and decides it’s safe to remove doctor B. Neither transaction writes to the same row as the other, so neither sees a conflict at the row level. Both commit. Now zero doctors are on call — violating the invariant.
Write skew is not one of the four classical anomalies because the classical framework was developed in the era of lock-based databases, where shared read locks would force one of the two transactions to wait. The waiting either resolved the conflict (one transaction sees the other’s commit and decides differently) or produced a deadlock (which the database detects and aborts). Either way, the anomaly was prevented through the contention and deadlocks that shared read locks produced — at significant performance cost.
Snapshot isolation removes shared read locks to eliminate that contention. The performance improvement is dramatic, but a side effect is that write skew becomes possible. Two transactions can read the same data without taking lasting locks, both make decisions based on what they read, and both commit non-overlapping writes successfully — because neither transaction holds a lock that would block the other.
Read skew. A transaction reads multiple rows at different points in time, and another transaction commits between the reads. The first transaction now has a view of the data that was never consistent — some rows are from one moment, others from another.
Example: A reads account 1’s balance ($500). B transfers $200 from account 1 to account 2 and commits. A then reads account 2’s balance ($800, which includes the $200 from the transfer). A now believes the total balance is $1300. But the actual total at any single moment was $1100 (either $700 + $400 before the transfer, or $500 + $800 after). A has a view that never existed at any single point in time.
These MVCC-specific anomalies are why Serializable isolation requires more than just snapshot isolation. PostgreSQL’s “Serializable Snapshot Isolation” detects potential write skew and aborts one of the conflicting transactions. SQL Server’s Serializable uses different mechanisms. The implementations vary, but the goal is the same: prevent the anomalies that snapshot isolation alone allows.
What isolation looks like in analytical systems
Most isolation content is written from the OLTP perspective. That’s the natural fit — isolation is fundamentally an OLTP concept, because OLTP systems have many concurrent transactions reading and writing the same data.
Analytical systems are different. BigQuery, Snowflake, Databricks — these are query-oriented, not transaction-oriented. The unit of work is the query, not a multi-statement transaction. You don’t BEGIN ... COMMIT your way through a series of analytical operations the way you do in Postgres.
This changes how isolation manifests in analytical work.
A single query gets a consistent snapshot. When a query starts in BigQuery or Snowflake, the engine determines the snapshot time, and the entire query — including all CTEs, subqueries, and joins — sees the data as it existed at that snapshot. You cannot have non-repeatable reads or phantom reads within a single query. The query is consistent with itself.
This means complex analytical queries with many CTEs referencing the same table are safe in a way OLTP queries aren’t. The first CTE and the tenth CTE both see the same version of the table. Even if data is loading into the table while your query runs, your query sees the snapshot from when it started.
Multi-statement transactions are limited or absent. BigQuery supports multi-statement transactions only in scripted contexts. Snowflake supports them more broadly. Most analytical work doesn’t use multi-statement transactions at all — each query is its own atomic unit, and there’s no concept of “this query and the next one should see the same snapshot.”
This has practical consequences. If your analysis requires running two queries that need to see the same point-in-time data, you have to design around the absence of multi-statement transactions. Often this means materializing intermediate results to a table you control, then querying that table from subsequent steps.
Concurrent DML on the same table gets handled, but with care. If two MERGE statements run concurrently against the same table, the warehouse handles the conflict — one succeeds and the other may retry or fail. You don’t get partial-update corruption. But you do get the operational concern of what happens when concurrent DMLs conflict, which becomes important in production pipelines.
The pipeline-level equivalent of read skew is real. This is the analytical analog of the MVCC anomaly that’s worth understanding clearly.
Consider a dbt pipeline: model A reads from source table X and writes to intermediate table Y. Model B reads from Y and writes to final table Z. The whole pipeline runs over some period of time.
If source table X is changing during the pipeline run, you can end up with temporal inconsistency. Model A read X at time T1 and produced Y based on what X looked like at T1. Model B read Y at time T2. But by T2, X has changed, so Y might not reflect the current state of X. If model B also joins to X (or to something derived from X at T2), the result combines data from two different moments — exactly the read skew pattern.
The fix in analytical systems is rarely “use Serializable isolation.” It’s usually one of:
Pause upstream writes during pipeline runs (often impractical)
Snapshot the source data at pipeline start and run the entire pipeline against the snapshot (often the cleanest answer)
Accept that the pipeline produces “approximately correct” results and design downstream consumers to tolerate it
Use time-travel features to query specific snapshots explicitly
This is why the senior analytical engineer’s mental model includes asking “what happens when the source data changes during the pipeline?” — that question is the analytical analog of asking “what isolation level does this transaction need?”
What’s worth taking away
Isolation is the most complex of the ACID properties because it’s not a single guarantee but a spectrum. The classical anomalies define the spectrum: dirty reads, non-repeatable reads, phantom reads, lost updates. The SQL standard’s isolation levels are defined by which anomalies they prevent.
Modern databases mostly use MVCC instead of locking. This eliminates the contention problems of locking but introduces new anomalies — write skew and read skew — that the classical framework doesn’t fully capture. Serializable isolation in modern systems requires more than just snapshot isolation to prevent these.
For OLTP work, the practical question is always: what isolation level does this transaction need, and is it the level my database actually provides? Most code is written at the default (Read Committed), which is too weak for many use cases that assume stronger guarantees.
For analytical work, the framework translates differently. A single query gets a consistent snapshot, which is stronger than what most OLTP defaults provide. But analytical pipelines have their own version of read skew — temporal inconsistency between pipeline stages when source data changes during the run. The analog of “what isolation level does this transaction need?” is “what happens when the source data changes during the pipeline?”
Both questions matter. The OLTP version is well-documented. The analytical version is less discussed but equally real, and understanding the connection is what makes the ACID framework useful beyond the database systems where it originated.
Sources
Foundational:
Gray, J. & Reuter, A. (1993), Transaction Processing: Concepts and Techniques — the canonical reference on transaction isolation
Berenson, H. et al. (1995), A Critique of ANSI SQL Isolation Levels — the paper that identified the gaps in the SQL standard’s isolation level definitions and named the additional anomalies that locking-based vs MVCC-based implementations expose
Designing Data-Intensive Applications by Martin Kleppmann, Chapter 7 — the most accessible modern treatment, with extensive coverage of write skew and the MVCC-specific anomalies
On MVCC and snapshot isolation:
Ports, D. & Grittner, K. (2012), Serializable Snapshot Isolation in PostgreSQL — the paper describing how Postgres implements Serializable on top of snapshot isolation
On analytical systems:
🧩 Who am I
I build, explain, and teach analytical data systems — the mechanics underneath reporting, warehouses, pipelines, and the decisions that depend on them.
PipelinePatterns.co | LinkedIn | Substack | Skool Community | Support my Work



