Schema Design That Won’t Haunt You
Schema design is the decision you’ll regret the most if you get it wrong, and benefit from the most if you get it right. Unlike code, which can be refactored, schemas leave fingerprints on everything: your queries, your indexes, your migrations, your application logic, your team’s mental model of the system. A good schema is nearly invisible — it just works, naturally. A bad schema generates friction at every turn.
This chapter covers the design principles that hold up over years of production use.
Start With Normalization, Deviate Intentionally
Third Normal Form (3NF) is not just a textbook abstraction — it’s the starting point for schemas that stay maintainable. In practice, normalization means:
- Every column depends on the primary key, the whole primary key, and nothing but the primary key
- Each fact is stored in exactly one place
- Update anomalies are impossible: you change a value once and it’s changed everywhere
The practical upshot: if you’re storing a user’s email address in the users table and also in the audit_log table and also in the email_preferences table, you have three places to update when an email changes. You will eventually update two of them and miss the third. You will then spend an afternoon debugging why your system thinks a user has two different email addresses.
Normalize first. Denormalize later, only where the performance evidence demands it, and only in well-understood ways.
Common normalization failures to avoid:
Repeating groups in a single column: storing comma-separated values like "tag1,tag2,tag3" in a tags column. Use a proper junction table or a text[] array. Comma-separated values cannot be indexed, cannot be queried efficiently, and will make your future self unhappy.
Non-atomic values: storing a person’s full name in one column when you need to sort by last name. If you ever need to decompose a value, it should have been two columns.
Transitive dependencies: an orders table with customer_id, customer_name, and customer_email. If customer_name depends on customer_id rather than on the order itself, it belongs in the customers table, not the orders table. Orders should reference customers by ID.
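The corrected shape for that last example looks like this (column names are illustrative):

```sql
-- Customer facts live in exactly one place; orders carry only the reference.
CREATE TABLE customers (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT NOT NULL UNIQUE
);

CREATE TABLE orders (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id BIGINT NOT NULL REFERENCES customers(id),
    placed_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Need the customer's name on an order? Join for it:
SELECT o.id, c.name, c.email
FROM orders o
JOIN customers c ON c.id = o.customer_id;
```

When the customer's email changes, one UPDATE on customers fixes every order's view of it.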
Primary Keys
The question of what to use as a primary key is more consequential than it appears.
Serial vs. Identity
The old pattern — id SERIAL PRIMARY KEY — creates an integer sequence and sets the default. This works, but SERIAL is unofficial shorthand. The proper SQL-standard approach since PostgreSQL 10 is identity columns:
id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY
-- or
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY
GENERATED ALWAYS prevents accidental insertion of explicit values (you have to use OVERRIDING SYSTEM VALUE to override it). GENERATED BY DEFAULT allows explicit values when needed.
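A minimal sketch of the difference (widgets is an illustrative table):

```sql
CREATE TABLE widgets (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL
);

INSERT INTO widgets (name) VALUES ('anvil');       -- id assigned by the sequence

INSERT INTO widgets (id, name) VALUES (99, 'x');   -- ERROR: cannot insert a
                                                   -- non-DEFAULT value into an
                                                   -- identity column

INSERT INTO widgets (id, name)
    OVERRIDING SYSTEM VALUE VALUES (99, 'x');      -- explicit id, allowed
```

With GENERATED BY DEFAULT, the second insert would succeed silently, which is exactly the accident GENERATED ALWAYS is designed to prevent.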
Use BIGINT (8 bytes, range up to ~9.2 × 10^18) rather than INTEGER (4 bytes, range up to ~2.1 × 10^9) for anything expected to grow. Running out of integer space in a high-traffic primary key table is a terrible emergency.
UUIDs
UUIDs (UUID type, 16 bytes) are popular for several legitimate reasons:
- They can be generated client-side without a database round-trip
- They don’t leak row counts or insertion order to external parties
- They work naturally in distributed systems where multiple sources generate IDs
The traditional gen_random_uuid() (v4) generates fully random UUIDs. The problem with random UUIDs as primary keys is index fragmentation. Because v4 UUIDs are random, insertions go to random positions in the index B-tree, causing page splits and poor cache locality. This matters at scale — tables with hundreds of millions of rows and random UUID primary keys will have noticeably worse performance than sequential alternatives.
The better choice for most applications: UUIDv7.
UUIDv7 is a time-ordered UUID — the first 48 bits encode the millisecond timestamp, followed by random bits. This preserves the benefits of UUIDs (opaque, distributable) while maintaining roughly sequential insertion order, which dramatically reduces B-tree fragmentation.
In PostgreSQL 17, gen_random_uuid() still generates v4. For v7, you currently need either an extension or application-level generation:
-- Note: the uuid-ossp extension provides uuid_generate_v1/v3/v4/v5, not v7
-- For v7 inside the database, use a dedicated extension such as pg_uuidv7
-- Or generate in the application and pass it in
-- e.g., in Go: github.com/google/uuid (NewV7)
-- in Python: the uuid7 package
As of PostgreSQL 17, native UUIDv7 generation was still in progress; PostgreSQL 18 adds a native uuidv7() function.
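On versions without native support, a v7 generator can also be written in plain SQL by overlaying the millisecond timestamp and version bits onto a random v4 UUID; this is a sketch of a widely circulated approach:

```sql
-- Sketch: build a UUIDv7 from gen_random_uuid() by overwriting the first
-- 48 bits with the Unix millisecond timestamp and forcing the version nibble
-- to 0111 (7). The variant bits from v4 are already correct.
CREATE OR REPLACE FUNCTION uuid_generate_v7() RETURNS uuid AS $$
SELECT encode(
    set_bit(
        set_bit(
            overlay(uuid_send(gen_random_uuid())
                    PLACING substring(
                        int8send((extract(epoch FROM clock_timestamp()) * 1000)::bigint)
                        FROM 3)              -- last 6 bytes = 48-bit ms timestamp
                    FROM 1 FOR 6),
            52, 1),                          -- flip two bits so the version
        53, 1),                              -- nibble reads 7 instead of 4
    'hex')::uuid;
$$ LANGUAGE sql VOLATILE;
```

Verify the output starts with the current timestamp and has a 7 in the version position before relying on it; an audited extension or library is the safer choice for production.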
Practical recommendation:
- Small to medium tables with simple access patterns: BIGINT GENERATED ALWAYS AS IDENTITY
- Tables that need externally-opaque IDs or multi-source ID generation: UUIDv7
Composite Primary Keys
Junction tables for many-to-many relationships should use composite primary keys:
CREATE TABLE user_roles (
user_id BIGINT NOT NULL REFERENCES users(id),
role_id BIGINT NOT NULL REFERENCES roles(id),
granted_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (user_id, role_id)
);
Adding a surrogate id column to a junction table is usually unnecessary and adds space overhead. The composite key is natural and enforces uniqueness.
Data Types: Use the Right Ones
PostgreSQL’s type system is rich. Using the right types is not just about correctness — it’s about storage efficiency, index performance, and the queries the database can help you write.
Text Types
TEXT and VARCHAR(n) are stored identically in Postgres — there is no performance advantage to VARCHAR(n) over TEXT. CHAR(n) is padded with spaces and is almost never what you want.
Use TEXT for variable-length strings unless you have a business rule that enforces a length, in which case use a CHECK constraint:
-- Not this (VARCHAR limit has no storage advantage):
name VARCHAR(255)
-- This (with explicit constraint if needed):
name TEXT NOT NULL CHECK (length(name) BETWEEN 1 AND 255)
The one case for VARCHAR(n): it gives PostgreSQL an upper bound on the column size, which can help the planner in some cases. But this is a minor consideration compared to readability.
Numeric Types
- SMALLINT (2 bytes): -32768 to 32767. Almost never worth the savings.
- INTEGER (4 bytes): -2.1B to 2.1B. Good for most counters and IDs that won’t grow large.
- BIGINT (8 bytes): -9.2 × 10^18 to 9.2 × 10^18. Default for IDs on large tables.
- NUMERIC(p, s) / DECIMAL(p, s): Exact arbitrary-precision arithmetic. Use for money and anything where rounding matters. Slow compared to integer/float types.
- REAL (4 bytes, float): Approximate. Use for measurements where small rounding errors are acceptable.
- DOUBLE PRECISION (8 bytes, float): Approximate, larger range. Use for scientific computation.
Money: Use NUMERIC(12, 2) or NUMERIC(19, 4) depending on required precision. Never use floating-point types for monetary values — the rounding errors are real and will cause auditing nightmares. Postgres has a MONEY type but it’s locale-dependent and problematic; prefer NUMERIC.
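The floating-point drift is easy to demonstrate:

```sql
-- NUMERIC arithmetic is exact; IEEE floats accumulate binary rounding error
SELECT 0.1::numeric + 0.2::numeric AS exact_sum,  -- 0.3
       0.1::float8  + 0.2::float8  AS float_sum;  -- 0.30000000000000004
```

A penny of drift per transaction is invisible in testing and very visible at reconciliation time.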
Timestamps
Always use TIMESTAMPTZ (timestamp with time zone) rather than TIMESTAMP (without time zone).
Despite the name, TIMESTAMPTZ doesn’t actually store the timezone. It stores a UTC instant and converts to/from the session timezone on display. TIMESTAMP stores whatever you give it with no timezone context — it’s a footgun when your server timezone changes or when your team works across time zones.
-- This is what you want:
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
-- This will bite you eventually:
created_at TIMESTAMP NOT NULL DEFAULT now()
For date-only values: DATE. For time-only: TIME (without tz) or TIMETZ (with tz, rarely needed). For intervals: INTERVAL.
now() returns the transaction start time (same within a transaction). clock_timestamp() returns the actual current time and changes within a transaction. For DEFAULT values on insert, now() is almost always what you want.
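A quick way to see the difference inside one transaction:

```sql
BEGIN;
SELECT now() = clock_timestamp();  -- almost certainly false: the wall clock
                                   -- has moved past the transaction start
SELECT pg_sleep(1);
SELECT now(), clock_timestamp();   -- now() still returns the transaction start
                                   -- time; clock_timestamp() is ~1 second later
COMMIT;
```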
Booleans
Use BOOLEAN, not SMALLINT or CHAR(1). PostgreSQL’s BOOLEAN type stores TRUE, FALSE, or NULL. It’s unambiguous, indexable, and understood by every ORM.
Enumerations
For columns with a small, fixed set of values, Postgres offers ENUM types:
CREATE TYPE order_status AS ENUM ('pending', 'processing', 'shipped', 'delivered', 'cancelled');
CREATE TABLE orders (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
status order_status NOT NULL DEFAULT 'pending'
);
ENUMs are stored efficiently (4 bytes) and enforced at the type level. The drawback: adding values to an ENUM requires ALTER TYPE ... ADD VALUE, which is a DDL operation (though a fast one — it doesn’t lock the table). Removing or reordering values requires a full type rebuild.
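Adding a value to the order_status type above is a one-liner:

```sql
-- Cheap: appends a new label at a chosen position in the sort order
ALTER TYPE order_status ADD VALUE 'refunded' AFTER 'delivered';
```

Note that since PostgreSQL 12 this can run inside a transaction block, but the new value cannot be used until the transaction commits; removing or reordering values still means creating a replacement type and converting the column with a USING cast.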
An alternative is a TEXT column with a CHECK constraint:
status TEXT NOT NULL CHECK (status IN ('pending', 'processing', 'shipped', 'delivered', 'cancelled'))
This is easier to migrate (just update the constraint) but gives up the storage efficiency. For most applications, the TEXT + CHECK approach is preferable because of its migration flexibility. The storage difference is negligible unless you have billions of rows.
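Widening the allowed set is then a pair of DDL statements (the constraint name here is an assumption):

```sql
-- Swap the constraint for one with the expanded value list
ALTER TABLE orders DROP CONSTRAINT orders_status_check;
ALTER TABLE orders ADD CONSTRAINT orders_status_check
    CHECK (status IN ('pending', 'processing', 'shipped',
                      'delivered', 'cancelled', 'refunded'));
```

On a large table, add the new constraint with NOT VALID and run ALTER TABLE ... VALIDATE CONSTRAINT separately so the validation scan doesn't hold a long lock.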
Arrays
Postgres has native array support. You can have TEXT[], INTEGER[], UUID[], etc.
Arrays are useful for storing a small, ordered list of values where the list isn’t a relational entity. Tags, permissions flags, phone numbers, and small sets work well as arrays.
CREATE TABLE articles (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
title TEXT NOT NULL,
tags TEXT[] NOT NULL DEFAULT '{}'
);
-- Querying array containment:
SELECT * FROM articles WHERE tags @> ARRAY['postgres', 'performance'];
-- Index support:
CREATE INDEX idx_articles_tags ON articles USING GIN (tags);
For larger sets or when you need to query individual elements relationally (join on them, count them, etc.), a proper junction table is usually better than an array.
JSONB
JSONB is Postgres’s binary JSON type. It’s useful for storing semi-structured data — configuration, metadata, external API payloads, optional attributes that vary by record type.
Chapter 5 covers JSONB in depth. The key rule here: don’t use JSONB as a crutch to avoid schema design. “I’ll just put everything in a JSONB column” is often a sign that you haven’t thought hard enough about your data model. JSONB shines for genuinely variable data; it’s a poor substitute for columns.
Constraints: Your Data’s Last Line of Defense
Constraints are not overhead — they’re the database enforcing your business rules at the layer closest to the data. Bugs that would otherwise corrupt your data fail loudly at the constraint level.
NOT NULL
Apply NOT NULL aggressively. Null is a special value that infects every query that touches it — NULL = NULL is NULL (not true), NULL + 5 is NULL, COUNT(col) ignores NULLs. Every nullable column makes your queries more complex.
Some columns are legitimately nullable — deleted_at in a soft-delete pattern, an optional middle name, a shipping address that might not exist yet. But “nullable because I’m not sure yet” is a schema smell. Be explicit about which NULLs are intentional.
CHECK Constraints
CHECK constraints let you encode business rules in the schema:
ALTER TABLE products ADD CONSTRAINT check_price_positive CHECK (price_cents > 0);
ALTER TABLE users ADD CONSTRAINT check_email_valid CHECK (email LIKE '%@%.%');
ALTER TABLE events ADD CONSTRAINT check_end_after_start CHECK (end_at > start_at);
CHECK constraints are evaluated on every insert and update. If a constraint is expensive to evaluate, consider whether it belongs in the schema or in application code. But for simple range checks and invariant enforcement, they’re invaluable.
UNIQUE Constraints
UNIQUE constraints prevent duplicate values and create an index automatically. Use them for columns that must be unique at the business level:
ALTER TABLE users ADD CONSTRAINT users_email_unique UNIQUE (email);
ALTER TABLE products ADD CONSTRAINT products_sku_unique UNIQUE (sku);
For multi-column uniqueness:
ALTER TABLE team_members ADD CONSTRAINT team_members_unique UNIQUE (team_id, user_id);
Note that by default UNIQUE allows multiple NULLs: two NULL values do not violate a UNIQUE constraint, because NULL is not equal to NULL. If you want a column unique among non-null values but with any number of NULLs, that default behavior is exactly right. If instead you want at most one NULL, use UNIQUE NULLS NOT DISTINCT (PostgreSQL 15+), or on older versions a unique expression index restricted to the NULL rows.
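A sketch of both at-most-one-NULL options (external_id is an illustrative column):

```sql
-- PostgreSQL 15+: NULLs compare as equal for uniqueness purposes,
-- so a second NULL violates the constraint
ALTER TABLE users ADD CONSTRAINT users_external_id_unique
    UNIQUE NULLS NOT DISTINCT (external_id);

-- Older versions: a unique index on a constant expression, limited to the
-- NULL rows, permits at most one such row
CREATE UNIQUE INDEX users_external_id_one_null
    ON users ((external_id IS NULL))
    WHERE external_id IS NULL;
```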
Foreign Key Constraints
Foreign keys enforce referential integrity — a row in orders cannot reference a non-existent user_id. They also communicate the relationship structure to anyone reading the schema.
CREATE TABLE orders (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
...
);
The ON DELETE clause matters:
- NO ACTION (the default): reject the delete if child rows exist; unlike RESTRICT, the check can be deferred to the end of the transaction
- RESTRICT: reject the delete if child rows exist; checked immediately, never deferred
- CASCADE: automatically delete child rows when the parent is deleted
- SET NULL: set the foreign key column to NULL when the parent is deleted
- SET DEFAULT: set the foreign key column to its default when the parent is deleted
Choose deliberately. CASCADE is often convenient but can cause surprising bulk deletes. RESTRICT is safer but requires explicit cleanup.
Foreign key columns should almost always be indexed separately: creating a foreign key does not create an index on the referencing column (the referenced side necessarily has one already, since it must be a primary key or have a unique constraint). Missing indexes on foreign key columns cause slow deletes and joins:
CREATE INDEX idx_orders_user_id ON orders(user_id);
Naming Conventions
Consistent naming is a form of documentation. These conventions are widely used in the Postgres community:
- Tables: plural snake_case — users, orders, product_categories
- Columns: snake_case — created_at, user_id, first_name
- Primary keys: id (or <table_singular>_id for clarity in joins)
- Foreign keys: <referenced_table_singular>_id — user_id, order_id
- Timestamps: created_at, updated_at, deleted_at
- Indexes: idx_<table>_<columns> — idx_orders_user_id, idx_users_email
- Constraints: <table>_<columns>_<type> — orders_user_id_fkey, users_email_unique
- Sequences: <table>_<column>_seq — users_id_seq
Postgres folds unquoted identifiers to lowercase. Never use quoted mixed-case identifiers (like "userId") — it requires quoting everywhere and makes SQL in psql and other tools annoying.
Soft Deletes
Many applications implement soft deletes — instead of actually deleting a row, they set a deleted_at timestamp. This preserves the audit trail, allows undeletion, and avoids cascading foreign key deletes.
ALTER TABLE users ADD COLUMN deleted_at TIMESTAMPTZ;
The operational challenge with soft deletes is that every query must filter out deleted rows. Forgetting WHERE deleted_at IS NULL is a persistent bug source.
Strategies:
- Use a view that filters deleted rows, and query through the view
- Use row-level security to hide deleted rows for non-admin roles
- Use a partial index that covers only non-deleted rows:
CREATE INDEX idx_users_active ON users(email) WHERE deleted_at IS NULL
The partial index keeps queries on non-deleted rows fast without the overhead of including deleted rows in the index.
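The view strategy from the list above can be sketched like this (active_users is an illustrative name):

```sql
-- The filter lives in one place; callers can't forget it
CREATE VIEW active_users AS
    SELECT * FROM users WHERE deleted_at IS NULL;

-- Application queries go through the view:
SELECT id, email FROM active_users WHERE email = 'someone@example.com';
```

Queries through the view on an indexed column can also use the partial index, since the view's WHERE clause matches the index predicate.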
Audit Columns and Triggers
Every table that represents mutable domain objects should have created_at and updated_at columns. created_at is set on insert and never changed. updated_at is maintained automatically.
CREATE TABLE products (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name TEXT NOT NULL,
price_cents INTEGER NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Trigger to auto-update updated_at
CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = now();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trigger_products_updated_at
BEFORE UPDATE ON products
FOR EACH ROW EXECUTE FUNCTION set_updated_at();
Some ORMs maintain updated_at in application code. This is less reliable than a trigger — application code can be bypassed (direct database access, migrations, admin scripts), and a trigger ensures the column is always correct.
Partitioning
Table partitioning splits a large table into smaller child tables while presenting a unified interface. Postgres supports:
- Range partitioning: by a range of values (dates are the most common use case)
- List partitioning: by discrete values (e.g., partition by region or tenant)
- Hash partitioning: by hash of a column value (distributes rows evenly)
CREATE TABLE events (
id BIGINT NOT NULL,
event_type TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL,
payload JSONB
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2024_01 PARTITION OF events
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
Partitioning is useful when:
- You need to drop old data quickly (drop a partition instead of deleting millions of rows)
- Your queries consistently filter on the partition key (allowing partition pruning)
- The table is large enough that index scans across the entire table are slow
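The first benefit is the headline one: retiring a month of data becomes two fast statements instead of a multi-million-row DELETE (using the events partitions defined above):

```sql
-- Detach the partition (PostgreSQL 14+ also supports DETACH ... CONCURRENTLY
-- to avoid blocking queries), then drop or archive the standalone table
ALTER TABLE events DETACH PARTITION events_2024_01;
DROP TABLE events_2024_01;
```

No dead tuples, no vacuum debt, no long-held locks on the parent.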
Partitioning is not a solution for poor query performance in general. A well-indexed table of 50 million rows often performs fine without partitioning. Adding partitioning to a table that doesn’t need it adds complexity without benefit.
The pg_partman extension (Chapter 11) automates partition creation and maintenance.
Table Design Anti-patterns
Entity-attribute-value (EAV). A table like (entity_id, attribute_name, attribute_value) that stores arbitrary key-value metadata. This is almost always a mistake. EAV tables are impossible to index effectively, produce horrible query syntax, and eliminate all type safety. If you need flexible attributes, use JSONB or a proper polymorphic schema.
Polymorphic associations. A comments table with a commentable_type TEXT and commentable_id BIGINT that references different tables based on commentable_type. You can’t add a foreign key constraint, can’t do efficient joins, and the relationship semantics are implicit. Consider table inheritance, separate junction tables, or a single comments table with nullable foreign keys.
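One workable shape for the last option is nullable foreign keys plus a CHECK that exactly one target is set (posts and photos are assumed tables):

```sql
CREATE TABLE comments (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    post_id  BIGINT REFERENCES posts(id),
    photo_id BIGINT REFERENCES photos(id),
    body TEXT NOT NULL,
    -- exactly one parent must be set; real FK constraints still apply
    CHECK (num_nonnulls(post_id, photo_id) = 1)
);
```

Unlike the commentable_type pattern, this keeps referential integrity enforceable and joins index-friendly, at the cost of one nullable column per target table.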
GOD tables. A single table that stores multiple conceptually different entities with a type discriminator column, and lots of nullable columns that only apply to some types. This is almost always a schema that was designed to avoid creating new tables. Split it.
Natural keys as primary keys (with exceptions). Using a business-meaningful value (email address, ISBN, social security number) as a primary key creates coupling between business rules and physical data layout. When the “immutable” business key turns out to be mutable (and it always does eventually), migrating is expensive. Use surrogate keys (BIGINT or UUID) and enforce uniqueness of natural keys with a UNIQUE constraint.
The exception: genuinely immutable, truly unique identifiers like UPC codes or ISO country codes in lookup tables where the natural key will never change.
Designing for Migration
Every schema you design today will be migrated tomorrow. Design with that in mind:
- Prefer adding columns to changing them. Adding a nullable column is instant in modern Postgres. Changing a column’s type usually requires rewriting the table.
- Be careful with NOT NULL on new columns. Since PostgreSQL 11, adding a column with a constant DEFAULT is instant (no rewrite); before that it rewrote the table. Adding a NOT NULL column with no default fails outright on a non-empty table — either supply a default, or add the column nullable, backfill, then SET NOT NULL.
- Name indexes and constraints explicitly. Auto-generated names like orders_pkey are predictable, but custom names clarify intent.
- Use the expand/contract pattern for renames. Add the new column, backfill, make the old column nullable, stop writing to it, drop it later. See Chapter 12.
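Sketched concretely (renaming a hypothetical users.fullname to display_name):

```sql
-- Expand: add the new column alongside the old one (instant, nullable)
ALTER TABLE users ADD COLUMN display_name TEXT;

-- Backfill (batch this on large tables to avoid one huge transaction)
UPDATE users SET display_name = fullname WHERE display_name IS NULL;

-- ...deploy application code that writes both columns and reads display_name...

-- Contract: drop the old column once nothing references it
ALTER TABLE users DROP COLUMN fullname;
```

Each step is independently deployable and reversible, which is what makes the pattern safe under continuous deployment.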
Schema design is an exercise in empathy — for your future self, your teammates, and the engineers who will maintain this system after you’re gone. Make the schema communicate the domain clearly, enforce invariants aggressively, and leave room for the changes you know are coming.