Chapter 6: Hash Indexes and Specialized Structures

“The right data structure for the job makes all the difference.”

B-trees and LSM trees are general-purpose workhorses, but sometimes specialized data structures offer better performance for specific access patterns. In this chapter, we’ll explore hash indexes, bitmap indexes, and other specialized structures.


6.1 Hash Indexes

Hash indexes provide O(1) average-case lookup by key—faster than B-trees for exact matches.

How Hash Indexes Work

    Hash Function: key → bucket number

    Example: hash("alice@example.com") mod 8 = 3

    ┌─────────────────────────────────────────────────────────────┐
    │  HASH TABLE                                                  │
    │                                                              │
    │  Bucket 0: [empty]                                          │
    │  Bucket 1: [bob@... → row] → [zoe@... → row]               │
    │  Bucket 2: [empty]                                          │
    │  Bucket 3: [alice@... → row] → [mike@... → row]            │
    │  Bucket 4: [carol@... → row]                                │
    │  Bucket 5: [empty]                                          │
    │  Bucket 6: [dave@... → row]                                 │
    │  Bucket 7: [eve@... → row] → [frank@... → row]             │
    │                                                              │
    └─────────────────────────────────────────────────────────────┘

Lookup Process

    Find "alice@example.com":

    1. Compute hash: hash("alice@example.com") = 3
    2. Go to bucket 3
    3. Search chain for exact match
    4. Found! Return row pointer

    Time: O(1) average (O(n) worst case with bad hash distribution)
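
To make the bucket-and-chain mechanics concrete, here is a minimal in-memory sketch in Python. It is illustrative only (class and key names are made up for the example); real engines add latching, resizing, and on-disk page management.

    # Minimal chained hash index: a bucket array where each bucket is a
    # chain (here, a Python list) of (key, row pointer) pairs.
    class HashIndex:
        def __init__(self, num_buckets=8):
            self.buckets = [[] for _ in range(num_buckets)]

        def _bucket(self, key):
            # hash(key) mod num_buckets picks the bucket, as in the diagram above
            return hash(key) % len(self.buckets)

        def insert(self, key, row_pointer):
            self.buckets[self._bucket(key)].append((key, row_pointer))

        def lookup(self, key):
            # O(1) on average; O(n) if everything lands in one bucket
            for k, row_pointer in self.buckets[self._bucket(key)]:
                if k == key:
                    return row_pointer
            return None

    idx = HashIndex()
    idx.insert("alice@example.com", "row#42")
    print(idx.lookup("alice@example.com"))  # row#42
    print(idx.lookup("zoe@example.com"))    # None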

Hash Index Limitations

No range queries: Hash destroys key ordering

    -- Can use hash index:
    SELECT * FROM users WHERE email = 'alice@example.com';

    -- CANNOT use hash index:
    SELECT * FROM users WHERE email > 'alice@example.com';
    SELECT * FROM users WHERE email LIKE 'alice%';
    SELECT * FROM users ORDER BY email;

No prefix matching: Must match entire key

No partial index scans: Can’t stop early

On-Disk Hash Indexes

In-memory hash tables are straightforward, but on-disk hash indexes are harder:

Problem: Hash table resizing requires rewriting everything

Solution: Extendible hashing or linear hashing

    EXTENDIBLE HASHING

    Use first N bits of hash to select bucket

    Global depth = 2 (use 2 bits)

    00 ──► Bucket A: [keys with hash starting 00...]
    01 ──► Bucket B: [keys with hash starting 01...]
    10 ──► Bucket C: [keys with hash starting 10...]
    11 ──► Bucket C: [same bucket for 10 and 11]

    When Bucket A overflows:
    - Split it into A1 (000...) and A2 (001...), each with local depth 3
    - Because the new local depth (3) exceeds the global depth (2),
      double the directory; global depth becomes 3
    - Other buckets are unchanged; they simply get two directory
      entries pointing at them
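
The directory-doubling and bucket-splitting logic is easier to see in code. Below is a compact, illustrative extendible-hashing sketch in Python; bucket capacity and the bit-extraction scheme are simplifications (it indexes the directory with low-order hash bits, whereas the diagram above uses leading bits), not the behavior of any particular engine.

    # Illustrative extendible hashing sketch.
    class Bucket:
        def __init__(self, local_depth, capacity=4):
            self.local_depth = local_depth
            self.capacity = capacity
            self.items = {}                      # key -> row pointer

    class ExtendibleHash:
        def __init__(self):
            self.global_depth = 1
            self.directory = [Bucket(1), Bucket(1)]

        def _dir_index(self, key):
            # Low-order global_depth bits select the directory slot.
            return hash(key) & ((1 << self.global_depth) - 1)

        def insert(self, key, value):
            bucket = self.directory[self._dir_index(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)
            self.insert(key, value)              # retry after the split

        def _split(self, bucket):
            if bucket.local_depth == self.global_depth:
                # Directory must double; each slot gains a twin entry.
                self.directory += self.directory
                self.global_depth += 1
            bucket.local_depth += 1
            sibling = Bucket(bucket.local_depth, bucket.capacity)
            high_bit = 1 << (bucket.local_depth - 1)
            # Re-point directory entries whose new distinguishing bit is 1.
            for i, b in enumerate(self.directory):
                if b is bucket and i & high_bit:
                    self.directory[i] = sibling
            # Redistribute the overflowing bucket's entries.
            old_items, bucket.items = bucket.items, {}
            for k, v in old_items.items():
                self.directory[self._dir_index(k)].items[k] = v

        def lookup(self, key):
            return self.directory[self._dir_index(key)].items.get(key)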

6.2 Hash Indexes in Practice

PostgreSQL Hash Indexes

PostgreSQL supports hash indexes but with caveats:

-- Create hash index
CREATE INDEX idx_email_hash ON users USING hash (email);

-- Only for equality comparisons
SELECT * FROM users WHERE email = 'alice@example.com';  -- Uses index
SELECT * FROM users WHERE email > 'a';  -- Cannot use hash index

Historical note: Before PostgreSQL 10, hash indexes weren’t crash-safe (no WAL logging). Now they are, but B-trees are still usually preferred.

MySQL Memory Engine

The Memory (HEAP) storage engine supports hash indexes:

-- Memory table with hash index
CREATE TABLE cache (
    key_col VARCHAR(255),
    value_col TEXT,
    INDEX USING HASH (key_col)
) ENGINE=MEMORY;

Useful for temporary tables and caches where data is in memory anyway.

When to Use Hash Indexes

Good use cases:

  • High-cardinality exact-match lookups (UUIDs, email addresses)
  • In-memory tables/caches
  • Join keys when only equality joins are needed

Bad use cases:

  • Range queries of any kind
  • Columns used in ORDER BY
  • Low cardinality columns
  • General purpose indexing

6.3 Bloom Filters Deep Dive

We introduced bloom filters in Chapter 5. Let’s understand them more deeply.

Structure

A bloom filter is a bit array with k hash functions:

    Bloom Filter (m=16 bits, k=3 hash functions)

    Initial state (empty):
    [0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0]
     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

Inserting

    Insert "alice":
    h1("alice") = 3
    h2("alice") = 7
    h3("alice") = 11

    Set bits 3, 7, 11 to 1:
    [0][0][0][1][0][0][0][1][0][0][0][1][0][0][0][0]
     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

    Insert "bob":
    h1("bob") = 1
    h2("bob") = 7  (already set!)
    h3("bob") = 14

    [0][1][0][1][0][0][0][1][0][0][0][1][0][0][1][0]
     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

Querying

    Query "alice":
    Check bits 3, 7, 11 → All are 1 → MAYBE present

    Query "carol":
    h1("carol") = 5
    h2("carol") = 9
    h3("carol") = 11

    Check bits 5, 9, 11
    Bit 5 = 0 → DEFINITELY NOT present

    (We can stop at first 0 bit)
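
A Bloom filter needs only a bit array and a family of hash functions. The sketch below is a demo, not a production design: it derives the k hash positions by salting Python's built-in hash, which stands in for k independent hash functions.

    # Illustrative Bloom filter: m-bit array, k salted hash functions.
    class BloomFilter:
        def __init__(self, m=16, k=3):
            self.m, self.k = m, k
            self.bits = [0] * m

        def _positions(self, item):
            # k bit positions derived from (item, salt) pairs.
            return [hash((item, i)) % self.m for i in range(self.k)]

        def add(self, item):
            for pos in self._positions(item):
                self.bits[pos] = 1

        def might_contain(self, item):
            # Any 0 bit proves absence; all 1 bits mean "maybe present".
            return all(self.bits[pos] for pos in self._positions(item))

    bf = BloomFilter()
    bf.add("alice")
    bf.add("bob")
    print(bf.might_contain("alice"))  # True (maybe present)
    print(bf.might_contain("carol"))  # False, unless a false positive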

False Positive Rate

    Probability of false positive ≈ (1 - e^(-kn/m))^k

    Where:
    m = number of bits
    n = number of inserted elements
    k = number of hash functions

    Example:
    m = 10 bits per element
    k = 7 hash functions
    False positive rate ≈ 0.8%

Optimal Configuration

    Given n elements and desired false positive rate p:

    Optimal m (bits) = -n * ln(p) / (ln(2))^2
    Optimal k (hash functions) = (m/n) * ln(2)

    Example: 1 million keys, 1% false positive rate
    m ≈ 9.6 million bits ≈ 1.2 MB
    k ≈ 7 hash functions
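
These two formulas turn directly into a small sizing helper. The function name is just for this sketch; the printed numbers reproduce the worked example above (1 million keys, 1% false positives).

    import math

    def bloom_parameters(n, p):
        # m = -n * ln(p) / (ln 2)^2,   k = (m / n) * ln 2
        m = -n * math.log(p) / (math.log(2) ** 2)
        k = (m / n) * math.log(2)
        return math.ceil(m), round(k)

    m, k = bloom_parameters(1_000_000, 0.01)
    print(m)                 # ~9.6 million bits
    print(m / 8 / 1_000_000) # ~1.2 MB
    print(k)                 # 7 hash functions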

Bloom Filter Variants

  • Counting Bloom Filter: uses small counters instead of bits, which allows deletes
  • Cuckoo Filter: supports deletes, often more space-efficient
  • Quotient Filter: cache-friendly, supports merging


6.4 Bitmap Indexes

Bitmap indexes are highly efficient for low-cardinality columns.

Structure

For each distinct value, store a bitmap indicating which rows have that value:

    Table: users
    ┌────┬─────────┬────────┐
    │ id │  name   │ status │
    ├────┼─────────┼────────┤
    │ 1  │ Alice   │ active │
    │ 2  │ Bob     │ active │
    │ 3  │ Carol   │ pending│
    │ 4  │ Dave    │ active │
    │ 5  │ Eve     │ deleted│
    │ 6  │ Frank   │ pending│
    └────┴─────────┴────────┘

    Bitmap Index on status:

    "active":  [1][1][0][1][0][0]  (rows 1, 2, 4)
    "pending": [0][0][1][0][0][1]  (rows 3, 6)
    "deleted": [0][0][0][0][1][0]  (row 5)

Query Evaluation

Queries become bitwise operations:

    SELECT * FROM users WHERE status = 'active';

    → Return rows where bitmap "active" has bit = 1
    → Rows 1, 2, 4

    SELECT * FROM users WHERE status = 'active' OR status = 'pending';

    "active":  [1][1][0][1][0][0]
    "pending": [0][0][1][0][0][1]
    OR result: [1][1][1][1][0][1]

    → Rows 1, 2, 3, 4, 6

    SELECT * FROM users WHERE status = 'active' AND country = 'US';

    "status=active": [1][1][0][1][0][0]
    "country=US":    [1][0][1][0][0][1]
    AND result:      [1][0][0][0][0][0]

    → Row 1 only
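
Because each bitmap is just a sequence of bits, these predicates reduce to integer bitwise operations. A minimal sketch using Python integers as bitmaps; the status bitmaps match the example table, and the country bitmap is the hypothetical one used above.

    # Bitmaps as Python integers: bit i set means row i+1 matches the value
    # (least-significant bit = row 1).
    def bitmap(rows):
        bits = 0
        for r in rows:
            bits |= 1 << (r - 1)
        return bits

    active  = bitmap([1, 2, 4])
    pending = bitmap([3, 6])
    us      = bitmap([1, 3, 6])     # hypothetical country = 'US' bitmap

    def rows_of(bits, n_rows=6):
        return [r for r in range(1, n_rows + 1) if bits >> (r - 1) & 1]

    print(rows_of(active | pending))   # [1, 2, 3, 4, 6]  (OR of predicates)
    print(rows_of(active & us))        # [1]              (AND of predicates)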

Advantages

  • Compact for low cardinality: three n-bit bitmaps take far less space than n row pointers
  • Fast bitwise operations: Hardware-accelerated AND, OR, NOT
  • Excellent for analytics: Count queries, complex predicates

Disadvantages

  • Poor for high cardinality: 1 million distinct values = 1 million bitmaps
  • Expensive updates: Insert requires updating bitmaps
  • Space for sparse bitmaps: Mostly-zero bitmaps waste space

Bitmap Compression

Run-length encoding compresses sparse bitmaps:

    Raw bitmap:     [1][0][0][0][0][0][0][0][0][0][1]

    RLE compressed: [1 at position 0, 1 at position 10]
    Or: "1, skip 9, 1"

Word-Aligned Hybrid (WAH) and Roaring Bitmaps are common compression schemes:

    Roaring Bitmaps:
    - Divide into chunks of 65536 integers
    - For each chunk, choose representation:
      - Dense: raw bitmap (when > 4096 values)
      - Sparse: sorted array of values (when ≤ 4096 values)
      - Run: RLE (for runs of consecutive values)
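
The core idea of Roaring is a per-chunk decision. Here is a toy version of that container choice, assuming only array and bitmap containers (real Roaring also has run containers and many further optimizations):

    # Toy Roaring-style container choice: split 32-bit values into 64K chunks,
    # store each chunk as a sorted array (sparse) or a raw bitmap (dense).
    def build_containers(values):
        chunks = {}
        for v in sorted(set(values)):
            chunks.setdefault(v >> 16, []).append(v & 0xFFFF)
        containers = {}
        for high, lows in chunks.items():
            if len(lows) <= 4096:
                containers[high] = ("array", lows)       # 2 bytes per value
            else:
                bitmap = bytearray(8192)                 # 65536 bits
                for lo in lows:
                    bitmap[lo >> 3] |= 1 << (lo & 7)
                containers[high] = ("bitmap", bitmap)
        return containers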

6.5 GiST and GIN Indexes

PostgreSQL offers specialized index types for complex data.

GiST (Generalized Search Tree)

A framework for building tree-structured indexes for custom data types:

-- Full-text search
CREATE INDEX idx_content ON documents USING gist (to_tsvector('english', content));

-- Geometric queries
CREATE INDEX idx_location ON places USING gist (coordinates);

-- Range types
CREATE INDEX idx_dates ON events USING gist (date_range);

GiST supports operators:

  • Containment: @>, <@
  • Overlap: &&
  • Nearest neighbor: <->
  • Full-text search: @@

GIN (Generalized Inverted Index)

For indexing composite values (arrays, JSONB, full-text):

-- Array containment
CREATE INDEX idx_tags ON articles USING gin (tags);
SELECT * FROM articles WHERE tags @> ARRAY['database', 'index'];

-- JSONB queries
CREATE INDEX idx_data ON events USING gin (data);
SELECT * FROM events WHERE data @> '{"type": "click"}';

-- Full-text search
CREATE INDEX idx_body ON posts USING gin (to_tsvector('english', body));

GIN structure:

    Inverted Index for tags column:

    "database" → [article_ids: 1, 5, 12, 47, ...]
    "index"    → [article_ids: 1, 23, 47, 89, ...]
    "postgres" → [article_ids: 5, 12, 89, ...]

    Query: tags @> ARRAY['database', 'index']

    Intersect posting lists:
    "database": {1, 5, 12, 47}
    "index":    {1, 23, 47, 89}
    Result:     {1, 47}
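
The posting-list intersection GIN performs for tags @> ARRAY['database', 'index'] amounts to the following sketch. It is simplified: GIN actually keeps its entries and posting lists in B-tree-organized, compressed form.

    # Simplified inverted index: term -> sorted list of article ids.
    posting_lists = {
        "database": [1, 5, 12, 47],
        "index":    [1, 23, 47, 89],
        "postgres": [5, 12, 89],
    }

    def contains_all(terms):
        # Intersect the posting lists of every queried term.
        result = set(posting_lists.get(terms[0], []))
        for term in terms[1:]:
            result &= set(posting_lists.get(term, []))
        return sorted(result)

    print(contains_all(["database", "index"]))  # [1, 47]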

GiST vs GIN

    Aspect        GiST                                  GIN
    Build time    Faster                                Slower
    Index size    Smaller                               Larger
    Query speed   Slower                                Faster
    Updates       Faster                                Slower
    Best for      Geometric, ranges                     Full-text, arrays, JSONB

6.6 BRIN Indexes (Block Range INdexes)

For very large tables with naturally ordered data, BRIN indexes are extremely compact.

Concept

Store min/max values for ranges of physical pages:

    Table: time_series_data (sorted by timestamp)

    Pages 0-127:    min_ts = 2024-01-01, max_ts = 2024-01-15
    Pages 128-255:  min_ts = 2024-01-15, max_ts = 2024-01-31
    Pages 256-383:  min_ts = 2024-02-01, max_ts = 2024-02-14
    ...

    Query: WHERE timestamp BETWEEN '2024-01-20' AND '2024-01-25'

    Check BRIN:
    Pages 0-127:   max < query_min → SKIP
    Pages 128-255: range overlaps  → SCAN
    Pages 256-383: min > query_max → SKIP
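
The pruning step is nothing more than an overlap test per block range. A sketch, assuming each summary entry stores (first_page, last_page, min_value, max_value) as in the example above:

    # BRIN-style pruning: keep only page ranges whose [min, max] overlaps the query.
    ranges = [
        (0,   127, "2024-01-01", "2024-01-15"),
        (128, 255, "2024-01-15", "2024-01-31"),
        (256, 383, "2024-02-01", "2024-02-14"),
    ]

    def ranges_to_scan(query_min, query_max):
        return [(first, last) for first, last, lo, hi in ranges
                if hi >= query_min and lo <= query_max]

    print(ranges_to_scan("2024-01-20", "2024-01-25"))  # [(128, 255)]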

When BRIN Excels

    Requirements:
    1. Very large table (millions+ rows)
    2. Data naturally ordered by indexed column
    3. Queries filter by that column

    Perfect for:
    - Time-series data (ordered by time)
    - Log tables (ordered by log_time)
    - Append-only data

    Example: 1 billion row table
    B-tree index: ~30 GB
    BRIN index:   ~1 MB

Creating BRIN Indexes

CREATE INDEX idx_logs_ts ON logs USING brin (created_at);

-- With custom pages per range
CREATE INDEX idx_logs_ts ON logs USING brin (created_at)
    WITH (pages_per_range = 32);

BRIN Limitations

  • Only useful if data is physically ordered
  • False positives: must scan matching ranges
  • Not useful for randomly distributed data

6.7 Partial and Expression Indexes

Sometimes you don’t need to index everything.

Partial Indexes

Index only a subset of rows:

-- Only index active users
CREATE INDEX idx_active_users ON users (email)
    WHERE status = 'active';

-- This query uses the partial index:
SELECT * FROM users WHERE email = 'alice@example.com' AND status = 'active';

-- This query cannot use it (no status filter):
SELECT * FROM users WHERE email = 'alice@example.com';

Benefits:

  • Smaller index size
  • Faster index maintenance (fewer entries)
  • Better cache efficiency

Common patterns:

-- Unprocessed items
CREATE INDEX idx_unprocessed ON jobs (created_at) WHERE processed = false;

-- Recent data
CREATE INDEX idx_recent ON orders (customer_id)
    WHERE created_at > '2024-01-01';

-- Non-null values
CREATE INDEX idx_phone ON contacts (phone) WHERE phone IS NOT NULL;

Expression Indexes

Index the result of an expression:

-- Case-insensitive search
CREATE INDEX idx_lower_email ON users (lower(email));
SELECT * FROM users WHERE lower(email) = 'alice@example.com';

-- Date extraction
CREATE INDEX idx_order_year ON orders (extract(year FROM created_at));
SELECT * FROM orders WHERE extract(year FROM created_at) = 2024;

-- JSON field
CREATE INDEX idx_type ON events ((data->>'type'));
SELECT * FROM events WHERE data->>'type' = 'click';

The query must match the expression exactly to use the index.


6.8 Covering Indexes and Index-Only Scans

Include extra columns in the index to avoid heap lookups:

-- Regular index
CREATE INDEX idx_email ON users (email);

-- Query needs name too:
SELECT name FROM users WHERE email = 'alice@example.com';
-- Must: Look up email in index → Get row pointer → Read heap for name

-- Covering index
CREATE INDEX idx_email_name ON users (email) INCLUDE (name);
-- or
CREATE INDEX idx_email_name ON users (email, name);

-- Now: Look up email in index → Name is already there!

PostgreSQL INCLUDE Clause

-- Include columns that aren't searchable, just needed in results
CREATE INDEX idx_email ON users (email) INCLUDE (name, avatar_url);

-- The included columns aren't in the search key
-- But they're stored in leaf pages

Trade-off: Larger index, but avoids heap access for covered queries.


6.9 Choosing the Right Index

Decision Flow

    ┌─────────────────────────────────────────┐
    │ What queries will use this index?       │
    └─────────────────────────────────────────┘
                        │
           ┌────────────┼────────────┐
           ▼            ▼            ▼
    ┌──────────┐  ┌──────────┐  ┌──────────┐
    │ Equality │  │  Range   │  │ Full-text│
    │   only   │  │  scans   │  │  search  │
    └──────────┘  └──────────┘  └──────────┘
           │            │            │
           ▼            ▼            ▼
    Consider:     B-tree is      GIN for
    Hash index    usually best   tsvector
                        │
                        ▼
              ┌─────────────────┐
              │ Data naturally  │
              │ ordered by col? │
              └─────────────────┘
                   │        │
                  YES       NO
                   │        │
                   ▼        ▼
              Consider    Stick with
              BRIN        B-tree

Index Type Summary

    Index Type   Best For                               Avoid When
    B-tree       General purpose, ranges                N/A (default choice)
    Hash         Equality only, high cardinality        Need ranges/ordering
    GIN          Arrays, JSONB, full-text               Write-heavy, simple queries
    GiST         Geometric, ranges, nearest neighbor    Simple equality queries
    BRIN         Huge tables, naturally ordered         Random data distribution
    Bitmap*      Low cardinality, analytics             High cardinality, OLTP

*PostgreSQL has no persistent bitmap index type; it builds bitmaps on the fly (bitmap index scans) to combine other indexes for complex queries


6.10 Summary

Different index types serve different purposes:

  • Hash indexes: O(1) lookup but no ranges
  • Bloom filters: Probabilistic membership testing
  • Bitmap indexes: Efficient for low-cardinality analytics
  • GiST/GIN: Specialized queries (full-text, geometric, JSONB)
  • BRIN: Extremely compact for ordered data
  • Partial indexes: Index only what you query
  • Expression indexes: Index computed values
  • Covering indexes: Avoid heap lookups

Choose indexes based on your query patterns, not just data types.


What’s Next

In Chapter 7, we’ll switch from indexing to durability. We’ll explore Write-Ahead Logging (WAL)—the mechanism that ensures your data survives crashes.


“The best index is the one that turns a table scan into a direct lookup. The second best is knowing when no index is needed.”