Chapter 8: MVCC and Transaction Isolation
“MVCC: Because readers should never wait for writers.”
When multiple transactions access the database simultaneously, chaos ensues—unless the database carefully manages what each transaction can see. MVCC (Multi-Version Concurrency Control) and transaction isolation levels are the mechanisms that maintain order.
8.1 The Concurrency Problem
Consider two transactions running simultaneously:
-- Transaction A -- Transaction B
BEGIN; BEGIN;
SELECT balance FROM accounts
WHERE id = 1;
-- Returns: $1000
UPDATE accounts
SET balance = balance - 100
WHERE id = 1;
COMMIT;
SELECT balance FROM accounts
WHERE id = 1;
-- Returns: $1000 or $900?
Should Transaction A see B’s change? It depends on what guarantees we want.
8.2 ACID: The Transaction Guarantees
Transactions provide four guarantees, known as ACID:
Atomicity
All or nothing. Either all changes commit, or none do.
Transfer $100:
1. Deduct from Account A
2. Add to Account B
If crash between steps:
- Without atomicity: A lost $100, B gained nothing
- With atomicity: Transaction rolls back, both unchanged
Consistency
Database moves from one valid state to another. Constraints are maintained.
Constraint: balance >= 0
Account has $50, tries to deduct $100:
- Without consistency: Balance = -$50 (invalid)
- With consistency: Transaction rejected
Isolation
Concurrent transactions don’t interfere. Each sees a consistent view.
Transaction A reads balance
Transaction B modifies balance
Transaction A reads balance again
- Without isolation: A sees different values
- With isolation: A sees consistent view
Durability
Once committed, changes survive crashes.
COMMIT succeeds
Server crashes
Server restarts
- Without durability: Data might be lost
- With durability: Committed data is there
8.3 MVCC: The Key Insight
Multi-Version Concurrency Control maintains multiple versions of each row:
MVCC: MULTIPLE VERSIONS
Logical row: accounts WHERE id = 1
Physical storage:
┌─────────────────────────────────────────────────────────────┐
│ Version 1: balance=$1000, created=TxID 100, deleted=TxID 150│
│ Version 2: balance=$900, created=TxID 150, deleted=∞ │
└─────────────────────────────────────────────────────────────┘
Transaction 140 (started before 150):
- Sees Version 1 (created 100 ≤ 140, deleted 150 > 140)
Transaction 160 (started after 150):
- Sees Version 2 (created 150 ≤ 160, deleted ∞)
Key Benefits
- Readers don’t block writers: Readers see old version while writer modifies
- Writers don’t block readers: Writer creates new version, readers see old
- Consistent snapshots: Each transaction sees data as of its start time
8.4 How MVCC Works: PostgreSQL
Row Structure
Each row has hidden system columns:
Physical Row (tuple):
┌─────────────────────────────────────────────────────────────┐
│ xmin: Transaction that created this version (100) │
│ xmax: Transaction that deleted/updated this version (150) │
│ ctid: Current tuple ID (physical location) │
│ data: balance=$1000, name='Alice', ... │
└─────────────────────────────────────────────────────────────┘
Visibility Rules
A row version is visible to transaction T if:
xminis committed ANDxminstarted before Txmaxis empty ORxmaxis not committed ORxmaxstarted after T
Transaction 140 checks visibility of Version 1:
- xmin = 100 (committed, before 140) ✓
- xmax = 150 (not yet committed OR after 140) ✓
- VISIBLE
Transaction 160 checks visibility of Version 1:
- xmin = 100 (committed, before 160) ✓
- xmax = 150 (committed, before 160) ✗
- NOT VISIBLE (deleted)
UPDATE Creates New Version
UPDATE accounts SET balance = 900 WHERE id = 1;
Before (TxID 150):
┌────────────────────────────────────────────────────────────┐
│ xmin=100, xmax=∞, balance=$1000 │
└────────────────────────────────────────────────────────────┘
After:
┌────────────────────────────────────────────────────────────┐
│ xmin=100, xmax=150, balance=$1000 ← Old version (deleted) │
└────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────┐
│ xmin=150, xmax=∞, balance=$900 ← New version │
└────────────────────────────────────────────────────────────┘
The old row isn’t physically deleted—it’s marked as deleted by setting xmax.
VACUUM: Cleaning Up Old Versions
Old versions accumulate. VACUUM removes them when no transaction can see them:
Before VACUUM:
[xmin=100, xmax=150, balance=$1000] ← Dead, no one can see
[xmin=150, xmax=∞, balance=$900] ← Current
After VACUUM:
[xmin=150, xmax=∞, balance=$900] ← Only current remains
VACUUM is crucial for PostgreSQL performance. Without it, tables bloat with dead rows.
8.5 How MVCC Works: MySQL InnoDB
InnoDB uses a different MVCC implementation:
Undo Logs
Instead of keeping old versions in the table, InnoDB stores them in undo logs:
INNODB MVCC
Clustered Index (current data):
┌─────────────────────────────────────────────────────────────┐
│ id=1, balance=$900, TxID=150, roll_ptr=→ │
└────────────────────────────────────┬────────────────────────┘
│
▼
Undo Log:
┌─────────────────────────────────────────────────────────────┐
│ Previous version: balance=$1000, TxID=100, roll_ptr=NULL │
└─────────────────────────────────────────────────────────────┘
Read View
When a transaction starts, it gets a read view:
Read View for Transaction 140:
- m_low_limit_id: 160 (TxIDs ≥ this are invisible)
- m_up_limit_id: 100 (TxIDs < this are visible)
- m_ids: [150, 155] (active transactions at start, invisible)
To read a row, follow undo chain until finding a visible version.
Purge Thread
InnoDB’s purge thread removes old undo records when no transaction needs them.
8.6 Transaction Isolation Levels
SQL defines four isolation levels, offering different trade-offs:
ISOLATION LEVEL SPECTRUM
Less Isolation More Isolation
Faster More Correct
◄───────────────────────────────────────────────────────────►
Read Read Repeatable Serializable
Uncommitted Committed Read
Read Uncommitted
See uncommitted changes from other transactions. (Rarely used)
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- Transaction A -- Transaction B
BEGIN; BEGIN;
UPDATE accounts SET balance = 0
WHERE id = 1;
-- NOT committed yet!
SELECT balance FROM accounts
WHERE id = 1;
-- Returns: $0 (dirty read!)
ROLLBACK;
-- A saw data that never existed!
Anomaly allowed: Dirty reads
Read Committed (PostgreSQL/Oracle default)
Only see committed changes. Each statement sees latest committed data.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- Transaction A -- Transaction B
BEGIN; BEGIN;
SELECT balance FROM accounts
WHERE id = 1;
-- Returns: $1000
UPDATE accounts SET balance = 900
WHERE id = 1;
COMMIT;
SELECT balance FROM accounts
WHERE id = 1;
-- Returns: $900 (different!)
-- Same query, different result = Non-repeatable read
Anomaly allowed: Non-repeatable reads, phantoms
Repeatable Read (MySQL default)
See a snapshot as of transaction start. Same query returns same result.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- Transaction A -- Transaction B
BEGIN; BEGIN;
SELECT balance FROM accounts
WHERE id = 1;
-- Returns: $1000
UPDATE accounts SET balance = 900
WHERE id = 1;
COMMIT;
SELECT balance FROM accounts
WHERE id = 1;
-- Returns: $1000 (same as before!)
COMMIT;
Anomaly allowed: Phantom reads (in standard SQL; PostgreSQL prevents them)
Serializable
Transactions execute as if in serial order. No anomalies.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Fully isolated: equivalent to running one after another
Most restrictive but guarantees correctness.
8.7 Isolation Anomalies Explained
Dirty Read
Reading uncommitted data that may be rolled back.
T1: UPDATE row SET x = 2
T2: SELECT x -- Returns 2
T1: ROLLBACK
-- T2 read value that never existed!
Non-Repeatable Read
Same query returns different results within one transaction.
T1: SELECT x -- Returns 1
T2: UPDATE row SET x = 2
T2: COMMIT
T1: SELECT x -- Returns 2 (different!)
Phantom Read
New rows appear in repeated queries.
T1: SELECT * FROM accounts WHERE balance > 100
-- Returns: Alice, Bob
T2: INSERT INTO accounts (name, balance) VALUES ('Carol', 150)
T2: COMMIT
T1: SELECT * FROM accounts WHERE balance > 100
-- Returns: Alice, Bob, Carol (phantom!)
Write Skew
Two transactions read overlapping data, make decisions, write non-overlapping data, creating invalid state.
Constraint: At least one doctor must be on call
T1: SELECT count(*) FROM on_call -- Returns 2
T2: SELECT count(*) FROM on_call -- Returns 2
T1: DELETE FROM on_call WHERE doctor = 'Alice'
T2: DELETE FROM on_call WHERE doctor = 'Bob'
T1: COMMIT (1 doctor remains, OK)
T2: COMMIT (0 doctors remain, VIOLATES CONSTRAINT!)
Neither transaction saw the other’s delete.
8.8 Isolation Levels and Anomalies Matrix
| Isolation Level | Dirty Read | Non-Repeatable | Phantom | Write Skew |
|---|---|---|---|---|
| Read Uncommitted | Yes | Yes | Yes | Yes |
| Read Committed | No | Yes | Yes | Yes |
| Repeatable Read | No | No | Possible* | Yes |
| Serializable | No | No | No | No |
*PostgreSQL prevents phantoms at Repeatable Read; MySQL allows them
8.9 Snapshot Isolation
Many databases implement Snapshot Isolation (SI), which is between Repeatable Read and Serializable:
Snapshot Isolation Rules:
1. Transaction sees consistent snapshot from start time
2. Write conflicts detected at commit time
3. First committer wins
T1: BEGIN (snapshot at time 100)
T2: BEGIN (snapshot at time 100)
T1: UPDATE accounts SET balance = 900 WHERE id = 1
T2: UPDATE accounts SET balance = 800 WHERE id = 1
T1: COMMIT -- Succeeds
T2: COMMIT -- FAILS! Write conflict on id=1
PostgreSQL: Repeatable Read is actually Snapshot Isolation.
Difference from true Serializable: SI allows write skew anomalies.
8.10 Serializable Snapshot Isolation (SSI)
PostgreSQL’s Serializable level uses SSI (Serializable Snapshot Isolation):
SSI: DETECTING CONFLICTS
Track read/write dependencies:
T1: Read A
T2: Write A ← conflict: T2 wrote what T1 read
T1: Write B ← conflict: T1 wrote after reading T2's input
Dependency graph:
T1 ──rw──► T2 ──wr──► T1 (cycle detected!)
Result: Abort one transaction to break cycle
SSI adds overhead for tracking dependencies but provides true serializability without the locking overhead of traditional approaches.
8.11 Practical Isolation Level Selection
When to Use Read Committed
- Default for most OLTP applications
- Acceptable when slight inconsistency is OK
- When you want maximum concurrency
-- OK: User profile view (momentary inconsistency acceptable)
SELECT name, email, last_login FROM users WHERE id = ?;
When to Use Repeatable Read / Snapshot Isolation
- Reports that need consistent view of data
- Transactions reading the same data multiple times
- When non-repeatable reads would cause bugs
-- Generate consistent report
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT SUM(balance) FROM accounts;
SELECT COUNT(*) FROM accounts WHERE balance < 0;
COMMIT;
-- Both queries see same snapshot
When to Use Serializable
- Financial transactions where write skew is dangerous
- When correctness is critical
- Simpler application logic (fewer edge cases to handle)
-- On-call schedule modification
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM on_call;
DELETE FROM on_call WHERE doctor = ?;
-- Database prevents invalid state
COMMIT;
8.12 MVCC Overhead and Maintenance
Table Bloat
MVCC creates dead rows that consume space:
-- PostgreSQL: Check table bloat
SELECT
schemaname || '.' || relname as table,
n_live_tup,
n_dead_tup,
round(n_dead_tup * 100.0 / nullif(n_live_tup + n_dead_tup, 0), 2) as dead_pct
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
VACUUM Strategies
-- Manual VACUUM
VACUUM VERBOSE accounts;
-- Aggressive space reclamation
VACUUM FULL accounts; -- Warning: locks table!
-- Autovacuum configuration
ALTER TABLE accounts SET (
autovacuum_vacuum_threshold = 100,
autovacuum_vacuum_scale_factor = 0.1
);
Transaction ID Wraparound
PostgreSQL transaction IDs are 32-bit and wrap around. Long-running transactions or failed vacuums can cause issues:
-- Check transaction ID age
SELECT
datname,
age(datfrozenxid) as xid_age,
2^31 - age(datfrozenxid) as remaining
FROM pg_database
ORDER BY age(datfrozenxid) DESC;
-- If approaching 2^31, VACUUM urgently!
8.13 Read-Only Transactions
For read-only workloads, declare transactions as read-only:
BEGIN READ ONLY;
SELECT * FROM large_table;
COMMIT;
Benefits:
- Database can optimize (skip locking, use replicas)
- Prevents accidental writes
- May use older snapshot if acceptable
-- PostgreSQL: Use old snapshot if within limits
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ READ ONLY;
SET TRANSACTION SNAPSHOT '00000003-0000001A-1';
8.14 Summary
MVCC enables concurrent access without readers blocking writers:
- Multiple versions of each row coexist
- Visibility rules determine which version each transaction sees
- Isolation levels trade correctness for concurrency
- Snapshot Isolation provides repeatable reads efficiently
- SSI enables true serializability without traditional locking
- VACUUM cleans up old versions to prevent bloat
Understanding MVCC helps you choose appropriate isolation levels and diagnose concurrency issues.
What’s Next
In Chapter 9, we’ll explore locking and other concurrency control mechanisms—the locks, latches, and protocols that databases use when MVCC alone isn’t enough.
“In MVCC, every writer is a creator, and every reader sees their own private reality.”