Appendix A: Glossary of Terms
This glossary defines key terms used throughout the book.
A
- ACID
- The four properties of database transactions: Atomicity, Consistency, Isolation, and Durability.
- Analyzer
- The query processing component that validates semantic correctness, resolves names, and checks types.
- ARIES
- Algorithms for Recovery and Isolation Exploiting Semantics. The industry-standard crash recovery algorithm.
- AST (Abstract Syntax Tree)
- A tree representation of parsed SQL, used for semantic analysis and optimization.
B
- B-tree
- A self-balancing tree data structure that maintains sorted data and allows searches, insertions, and deletions in logarithmic time. The foundation of most database indexes.
- B+ tree
- A variant of B-tree where all values are stored in leaf nodes, which are linked together for efficient range scans.
- Bitmap Index
- An index that uses bitmaps to represent which rows contain each distinct value. Efficient for low-cardinality columns.
- Bloom Filter
- A probabilistic data structure that can tell you if an element is definitely not in a set. Used by LSM trees to avoid unnecessary disk reads.
- BRIN (Block Range Index)
- A compact index storing min/max values for ranges of pages. Efficient for naturally ordered large tables.
- Buffer Pool
- An area of memory that caches disk pages to reduce I/O. Also called buffer cache or page cache.
C
- CAP Theorem
- The principle that a distributed system can provide at most two of three guarantees: Consistency, Availability, and Partition tolerance.
- Cardinality
- The number of distinct values in a column, or the estimated number of rows a query will return.
- Checkpoint
- A point at which the database ensures all committed changes are written to disk, limiting recovery time.
- Clustered Index
- An index where the table data is physically stored in index order. InnoDB primary keys are clustered indexes.
- CLR (Compensation Log Record)
- A WAL record written during undo operations to ensure idempotent recovery.
- Column Store
- A storage format where each column is stored separately, optimized for analytical queries.
- Compaction
- The process of merging LSM tree SSTables to reduce read amplification and reclaim space.
- Covering Index
- An index containing all columns needed by a query, enabling index-only scans.
D
- Deadlock
- A situation where two or more transactions are waiting for each other’s locks, creating a cycle where none can proceed.
- Dirty Page
- A page in the buffer pool that has been modified but not yet written to disk.
- Double-Write Buffer
- An InnoDB mechanism that writes pages to a sequential buffer before their final location, protecting against torn pages.
- Durability
- The ACID guarantee that committed transactions survive system failures.
E
- Eventual Consistency
- A consistency model where reads may return stale data, but all replicas will eventually converge to the same state.
- Exclusive Lock (X Lock)
- A lock that prevents all other access to the locked resource.
- Executor
- The query processing component that runs the execution plan and returns results.
F
- Fanout
- The number of children a B-tree node can have. Higher fanout means shallower trees.
- Fill Factor
- The percentage of space to use when initially filling index pages, leaving room for future inserts.
- Foreign Data Wrapper (FDW)
- PostgreSQL mechanism to access external data sources as if they were local tables.
- Free Space Map (FSM)
- A data structure tracking available space in each page for efficient insert placement.
- fsync
- A system call that forces buffered data to be written to physical storage, ensuring durability.
- Full Page Write
- Writing an entire page to WAL after a checkpoint, enabling torn page recovery.
G
- GIN (Generalized Inverted Index)
- A PostgreSQL index type for composite values like arrays, JSONB, and full-text search.
- GiST (Generalized Search Tree)
- A PostgreSQL index type supporting custom data types and operations like geometric queries.
H
- Hash Index
- An index using a hash table for O(1) lookup, but not supporting range queries.
- Hash Join
- A join algorithm that builds a hash table from one relation and probes it with the other.
- Heap
- A table organization where rows are stored in no particular order. PostgreSQL uses heap storage.
- Histogram
- Statistics showing the distribution of values in a column, used by the query optimizer.
- Hot Standby
- A replica that can serve read queries while replicating from the primary.
- HTAP
- Hybrid Transactional/Analytical Processing. Systems handling both OLTP and OLAP workloads.
I
- Index-Only Scan
- A scan that retrieves all needed data from the index without accessing the heap.
- Isolation Level
- A transaction setting controlling what data modifications are visible. Levels include Read Uncommitted, Read Committed, Repeatable Read, and Serializable.
- Intent Lock
- A lock indicating intention to lock finer-grained resources, enabling hierarchical locking.
J
- Join
- An operation combining rows from two or more tables based on a related column.
K
- Key
- A value or set of values used to identify or locate data, such as a primary key or index key.
L
- Latch
- A lightweight lock protecting internal data structures (like B-tree pages) for very short durations.
- Leader-Based Replication
- A replication topology where one node (leader/primary) handles all writes and replicates to followers.
- Leveled Compaction
- An LSM tree compaction strategy organizing SSTables into levels with size ratios.
- Lock
- A mechanism preventing concurrent access to data to maintain consistency.
- Lock Escalation
- Converting many fine-grained locks to fewer coarse-grained locks to reduce overhead.
- Logical Logging
- Recording the operation performed rather than the bytes changed.
- LSM Tree (Log-Structured Merge Tree)
- A write-optimized data structure that buffers writes in memory and periodically flushes sorted files to disk.
- LSN (Log Sequence Number)
- A unique, monotonically increasing identifier for each WAL record.
M
- MCV (Most Common Values)
- Statistics listing the most frequent values in a column and their frequencies.
- Memtable
- The in-memory component of an LSM tree where recent writes are stored.
- Merge Join
- A join algorithm that merges two sorted inputs.
- MVCC (Multi-Version Concurrency Control)
- A technique maintaining multiple versions of data to allow concurrent reads and writes without blocking.
N
- Nested Loop Join
- A join algorithm that, for each row in the outer relation, scans the inner relation.
- NVMe
- A storage protocol optimized for SSDs, providing lower latency than SATA.
O
- OLAP (Online Analytical Processing)
- Workloads characterized by complex queries over large datasets, typically aggregations and scans.
- OLTP (Online Transaction Processing)
- Workloads characterized by many small transactions with point queries and updates.
- Optimizer
- The query processing component that evaluates execution strategies and chooses the best plan.
P
- Page
- The fixed-size unit of storage and I/O, typically 4KB, 8KB, or 16KB.
- Parser
- The query processing component that converts SQL text into an Abstract Syntax Tree.
- Partition Tolerance
- The ability of a distributed system to continue operating despite network partitions.
- Partitioning (Sharding)
- Dividing data across multiple nodes based on a partition key.
- Phantom Read
- An anomaly where new rows appear in repeated queries within a transaction.
- Physical Logging
- Recording the exact bytes changed in storage.
- PITR (Point-in-Time Recovery)
- Restoring a database to a specific moment by replaying WAL.
- Plan Cache
- Storage for prepared statement execution plans to avoid repeated planning.
- Predicate Pushdown
- Moving filter conditions closer to data sources for earlier elimination.
- Prefetching
- Reading data into cache before it’s requested, anticipating future access.
- Primary Key
- A column or set of columns uniquely identifying each row in a table.
Q
- Query Plan
- A tree of operators describing how to execute a query.
- Quorum
- The minimum number of nodes that must agree for a distributed operation to succeed.
R
- Raft
- A consensus protocol for maintaining a replicated log across distributed nodes.
- Range Partitioning
- Distributing data based on value ranges of the partition key.
- Read Amplification
- The ratio of data read from storage to data actually needed.
- Read Committed
- An isolation level where transactions only see committed data, but may see different data on repeated reads.
- Read-Your-Writes Consistency
- A guarantee that a user always sees their own recent writes.
- Redo
- Reapplying logged changes during recovery to restore committed data.
- Repeatable Read
- An isolation level where transactions see a consistent snapshot, preventing non-repeatable reads.
- Replication
- Copying data to multiple nodes for availability and read scalability.
- Row Store
- A storage format where all columns of a row are stored together, optimized for transactional queries.
- RTO (Recovery Time Objective)
- The maximum acceptable time to restore service after a failure.
- RPO (Recovery Point Objective)
- The maximum acceptable data loss measured in time.
S
- Selectivity
- The fraction of rows that match a predicate, used in cost estimation.
- Sequential Scan (Seq Scan)
- Reading all pages of a table in order.
- Serializable
- The strictest isolation level, guaranteeing transactions execute as if in serial order.
- A lock allowing multiple readers but preventing writers.
- See Partitioning.
- Slotted Page
- A page format with a slot array pointing to variable-length records.
- Snapshot Isolation
- A consistency level where transactions see a snapshot from their start time.
- SSD (Solid State Drive)
- Flash-based storage with no moving parts, providing faster random I/O than HDDs.
- SSI (Serializable Snapshot Isolation)
- A technique providing serializable isolation using snapshot isolation with conflict detection.
- SSTable (Sorted String Table)
- An immutable, sorted file in an LSM tree containing key-value pairs.
- Statistics
- Data about table contents (row counts, distinct values, distributions) used by the optimizer.
- Strong Consistency
- A guarantee that reads always see the most recent write.
T
- TID (Tuple ID)
- PostgreSQL’s row identifier consisting of (page number, slot number).
- TOAST (The Oversized-Attribute Storage Technique)
- PostgreSQL’s mechanism for storing large values out-of-line.
- Tombstone
- A marker indicating a deleted key in an LSM tree, persisting until compaction.
- Torn Page
- A page partially written due to a crash, containing a mix of old and new data.
- Transaction
- A unit of work that is atomic, consistent, isolated, and durable.
- Two-Phase Commit (2PC)
- A protocol for distributed transactions where participants first prepare, then commit.
- Two-Phase Locking (2PL)
- A protocol where transactions acquire all locks before releasing any.
U
- Undo
- Rolling back uncommitted changes during recovery.
V
- Vacuum
- PostgreSQL’s process for removing dead rows and updating statistics.
- Vectorized Execution
- Processing data in batches (vectors) rather than row-by-row for better performance.
- Visibility Map
- PostgreSQL structure tracking which pages contain only visible (all-committed) tuples.
W
- WAL (Write-Ahead Log)
- A log of changes written before data modifications, ensuring durability and enabling recovery.
- Working Set
- The portion of data actively being accessed, ideally fitting in the buffer pool.
- Write Amplification
- The ratio of data written to storage versus data logically changed.
X, Y, Z
- xmin/xmax
- PostgreSQL tuple header fields recording the creating and deleting transaction IDs.