Appendix A: Glossary of Terms

This glossary defines key terms used throughout the book.

A

ACID: The four properties of database transactions: Atomicity, Consistency, Isolation, and Durability.
Analyzer: The query processing component that validates semantic correctness, resolves names, and checks types.
ARIES: Algorithms for Recovery and Isolation Exploiting Semantics. The industry-standard crash recovery algorithm.
AST (Abstract Syntax Tree): A tree representation of parsed SQL, used for semantic analysis and optimization.

B

B-tree: A self-balancing tree data structure that maintains sorted data and allows searches, insertions, and deletions in logarithmic time. The foundation of most database indexes.
B+ tree: A variant of B-tree where all values are stored in leaf nodes, which are linked together for efficient range scans.
Bitmap Index: An index that uses bitmaps to represent which rows contain each distinct value. Efficient for low-cardinality columns.
Bloom Filter: A probabilistic data structure that can tell you if an element is definitely not in a set. Used by LSM trees to avoid unnecessary disk reads.
BRIN (Block Range Index): A compact index storing min/max values for ranges of pages. Efficient for naturally ordered large tables.
Buffer Pool: An area of memory that caches disk pages to reduce I/O. Also called buffer cache or page cache.

C

CAP Theorem: The principle that a distributed system can provide at most two of three guarantees: Consistency, Availability, and Partition tolerance.
Cardinality: The number of distinct values in a column, or the estimated number of rows a query will return.
Checkpoint: A point at which the database ensures all committed changes are written to disk, limiting recovery time.
Clustered Index: An index where the table data is physically stored in index order. InnoDB primary keys are clustered indexes.
CLR (Compensation Log Record): A WAL record written during undo operations to ensure idempotent recovery.
Column Store: A storage format where each column is stored separately, optimized for analytical queries.
Compaction: The process of merging LSM tree SSTables to reduce read amplification and reclaim space.
Covering Index: An index containing all columns needed by a query, enabling index-only scans.

D

Deadlock: A situation where two or more transactions are waiting for each other’s locks, creating a cycle where none can proceed.
Dirty Page: A page in the buffer pool that has been modified but not yet written to disk.
Double-Write Buffer: An InnoDB mechanism that writes pages to a sequential buffer before their final location, protecting against torn pages.
Durability: The ACID guarantee that committed transactions survive system failures.

E

Eventual Consistency: A consistency model where reads may return stale data, but all replicas will eventually converge to the same state.
Exclusive Lock (X Lock): A lock that prevents all other access to the locked resource.
Executor: The query processing component that runs the execution plan and returns results.

F

Fanout: The number of children a B-tree node can have. Higher fanout means shallower trees.
Fill Factor: The percentage of space to use when initially filling index pages, leaving room for future inserts.
Foreign Data Wrapper (FDW): PostgreSQL mechanism to access external data sources as if they were local tables.
Free Space Map (FSM): A data structure tracking available space in each page for efficient insert placement.
fsync: A system call that forces buffered data to be written to physical storage, ensuring durability.
Full Page Write: Writing an entire page to WAL after a checkpoint, enabling torn page recovery.

G

GIN (Generalized Inverted Index): A PostgreSQL index type for composite values like arrays, JSONB, and full-text search.
GiST (Generalized Search Tree): A PostgreSQL index type supporting custom data types and operations like geometric queries.

H

Hash Index: An index using a hash table for O(1) lookup, but not supporting range queries.
Hash Join: A join algorithm that builds a hash table from one relation and probes it with the other.
Heap: A table organization where rows are stored in no particular order. PostgreSQL uses heap storage.
Histogram: Statistics showing the distribution of values in a column, used by the query optimizer.
Hot Standby: A replica that can serve read queries while replicating from the primary.
HTAP: Hybrid Transactional/Analytical Processing. Systems handling both OLTP and OLAP workloads.

I

Index-Only Scan: A scan that retrieves all needed data from the index without accessing the heap.
Isolation Level: A transaction setting controlling what data modifications are visible. Levels include Read Uncommitted, Read Committed, Repeatable Read, and Serializable.
Intent Lock: A lock indicating intention to lock finer-grained resources, enabling hierarchical locking.

J

Join: An operation combining rows from two or more tables based on a related column.

K

Key: A value or set of values used to identify or locate data, such as a primary key or index key.

L

Latch: A lightweight lock protecting internal data structures (like B-tree pages) for very short durations.
Leader-Based Replication: A replication topology where one node (leader/primary) handles all writes and replicates to followers.
Leveled Compaction: An LSM tree compaction strategy organizing SSTables into levels with size ratios.
Lock: A mechanism preventing concurrent access to data to maintain consistency.
Lock Escalation: Converting many fine-grained locks to fewer coarse-grained locks to reduce overhead.
Logical Logging: Recording the operation performed rather than the bytes changed.
LSM Tree (Log-Structured Merge Tree): A write-optimized data structure that buffers writes in memory and periodically flushes sorted files to disk.
LSN (Log Sequence Number): A unique, monotonically increasing identifier for each WAL record.

M

MCV (Most Common Values): Statistics listing the most frequent values in a column and their frequencies.
Memtable: The in-memory component of an LSM tree where recent writes are stored.
Merge Join: A join algorithm that merges two sorted inputs.
MVCC (Multi-Version Concurrency Control): A technique maintaining multiple versions of data to allow concurrent reads and writes without blocking.

N

Nested Loop Join: A join algorithm that, for each row in the outer relation, scans the inner relation.
NVMe: A storage protocol optimized for SSDs, providing lower latency than SATA.

O

OLAP (Online Analytical Processing): Workloads characterized by complex queries over large datasets, typically aggregations and scans.
OLTP (Online Transaction Processing): Workloads characterized by many small transactions with point queries and updates.
Optimizer: The query processing component that evaluates execution strategies and chooses the best plan.

P

Page: The fixed-size unit of storage and I/O, typically 4KB, 8KB, or 16KB.
Parser: The query processing component that converts SQL text into an Abstract Syntax Tree.
Partition Tolerance: The ability of a distributed system to continue operating despite network partitions.
Partitioning (Sharding): Dividing data across multiple nodes based on a partition key.
Phantom Read: An anomaly where new rows appear in repeated queries within a transaction.
Physical Logging: Recording the exact bytes changed in storage.
PITR (Point-in-Time Recovery): Restoring a database to a specific moment by replaying WAL.
Plan Cache: Storage for prepared statement execution plans to avoid repeated planning.
Predicate Pushdown: Moving filter conditions closer to data sources for earlier elimination.
Prefetching: Reading data into cache before it’s requested, anticipating future access.
Primary Key: A column or set of columns uniquely identifying each row in a table.

Q

Query Plan: A tree of operators describing how to execute a query.
Quorum: The minimum number of nodes that must agree for a distributed operation to succeed.

R

Raft: A consensus protocol for maintaining a replicated log across distributed nodes.
Range Partitioning: Distributing data based on value ranges of the partition key.
Read Amplification: The ratio of data read from storage to data actually needed.
Read Committed: An isolation level where transactions only see committed data, but may see different data on repeated reads.
Read-Your-Writes Consistency: A guarantee that a user always sees their own recent writes.
Redo: Reapplying logged changes during recovery to restore committed data.
Repeatable Read: An isolation level where transactions see a consistent snapshot, preventing non-repeatable reads.
Replication: Copying data to multiple nodes for availability and read scalability.
Row Store: A storage format where all columns of a row are stored together, optimized for transactional queries.
RTO (Recovery Time Objective): The maximum acceptable time to restore service after a failure.
RPO (Recovery Point Objective): The maximum acceptable data loss measured in time.

S

Selectivity: The fraction of rows that match a predicate, used in cost estimation.
Sequential Scan (Seq Scan): Reading all pages of a table in order.
Serializable: The strictest isolation level, guaranteeing transactions execute as if in serial order.
Shared Lock (S Lock): A lock allowing multiple readers but preventing writers.
Sharding: See Partitioning.
Slotted Page: A page format with a slot array pointing to variable-length records.
Snapshot Isolation: A consistency level where transactions see a snapshot from their start time.
SSD (Solid State Drive): Flash-based storage with no moving parts, providing faster random I/O than HDDs.
SSI (Serializable Snapshot Isolation): A technique providing serializable isolation using snapshot isolation with conflict detection.
SSTable (Sorted String Table): An immutable, sorted file in an LSM tree containing key-value pairs.
Statistics: Data about table contents (row counts, distinct values, distributions) used by the optimizer.
Strong Consistency: A guarantee that reads always see the most recent write.

T

TID (Tuple ID): PostgreSQL’s row identifier consisting of (page number, slot number).
TOAST (The Oversized-Attribute Storage Technique): PostgreSQL’s mechanism for storing large values out-of-line.
Tombstone: A marker indicating a deleted key in an LSM tree, persisting until compaction.
Torn Page: A page partially written due to a crash, containing a mix of old and new data.
Transaction: A unit of work that is atomic, consistent, isolated, and durable.
Two-Phase Commit (2PC): A protocol for distributed transactions where participants first prepare, then commit.
Two-Phase Locking (2PL): A protocol where transactions acquire all locks before releasing any.

U

Undo: Rolling back uncommitted changes during recovery.

V

Vacuum: PostgreSQL’s process for removing dead rows and updating statistics.
Vectorized Execution: Processing data in batches (vectors) rather than row-by-row for better performance.
Visibility Map: PostgreSQL structure tracking which pages contain only visible (all-committed) tuples.

W

WAL (Write-Ahead Log): A log of changes written before data modifications, ensuring durability and enabling recovery.
Working Set: The portion of data actively being accessed, ideally fitting in the buffer pool.
Write Amplification: The ratio of data written to storage versus data logically changed.

X, Y, Z

xmin/xmax: PostgreSQL tuple header fields recording the creating and deleting transaction IDs.

Keyboard shortcuts

Database Internals: Where Your Data Actually Lives