Tendermint: BFT Meets the Real World
There’s a particular class of distributed systems protocol that lives primarily in academic papers, and another class that lives primarily in production. Tendermint has the unusual distinction of living in both, and the scars to prove it. Originally designed by Jae Kwon in 2014 and subsequently developed by Ethan Buchman, Tendermint (now rebranded as CometBFT) took the ideas from PBFT and DLS (the Dwork-Lynch-Stockmeyer partial synchrony framework) and forged them into something that actually runs in production across hundreds of blockchains in the Cosmos ecosystem.
The result is a protocol that makes pragmatic engineering decisions the academic papers never had to confront: What happens when your validator set changes every hour? How do you handle a state machine that takes 500ms to execute a block? What do light clients need to verify consensus without running the full protocol? These are the questions that separate a protocol from a system.
Tendermint’s consensus is often described as “PBFT-like,” which is accurate in the way that saying a house is “blueprint-like.” The general shape is there — three-phase BFT with 2f + 1 quorums out of 3f + 1 validators — but the engineering decisions, the locking mechanism, the round structure, and the integration with application logic are distinctly Tendermint’s own.
The Tendermint Consensus Protocol
Tendermint consensus operates in heights (block numbers) and rounds (attempts within a height). Each height produces one block. If the first round fails (proposer is faulty, network is slow), the protocol moves to the next round with a different proposer. Within each round, there are three steps: Propose, Prevote, and Precommit.
The protocol tolerates up to f Byzantine validators out of n = 3f + 1 total. More precisely, since Tendermint uses weighted voting, the requirement is that the total voting power of Byzantine validators be less than 1/3 of the total voting power — what matters is stake, not validator count.
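A minimal sketch of the weighted-quorum arithmetic (the has_quorum helper and the example stake splits are illustrative, not Tendermint's actual API):

```python
# Hypothetical helper illustrating the weighted-quorum rule: a vote set
# counts as a quorum only if its voting power strictly exceeds 2/3 of total.
def has_quorum(voting_powers, signers):
    """voting_powers: dict of validator id -> power; signers: ids that voted."""
    total = sum(voting_powers.values())
    signed = sum(voting_powers[v] for v in set(signers))
    # Strict inequality: exactly 2/3 is NOT enough.
    return 3 * signed > 2 * total

# Four equal-power validators (n = 3f + 1 with f = 1):
powers = {"v1": 10, "v2": 10, "v3": 10, "v4": 10}
print(has_quorum(powers, ["v1", "v2", "v3"]))   # True  (30 of 40)
print(has_quorum(powers, ["v1", "v2"]))         # False (20 of 40)

# Weighted voting: a single heavy validator can dominate the check.
weighted = {"whale": 70, "a": 10, "b": 10, "c": 10}
print(has_quorum(weighted, ["whale", "a"]))     # True  (80 of 100)
```

Note the strict inequality: with n = 3f + 1 equally weighted validators, exactly 2f votes never suffice; 2f + 1 always do.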
The Three Steps
// ============ HEIGHT h, ROUND r ============

// Step 1: PROPOSE
// The designated proposer for round r broadcasts a proposal
function proposer_step(h, r):
    if i_am_proposer(h, r):
        if has_valid_locked_block():
            // We locked on a block in a previous round — must re-propose it
            block = locked_block
            pol_round = locked_round  // Proof-of-lock round
        else:
            // Create a new block
            block = create_block(h)
            pol_round = -1
        broadcast(Proposal{
            height: h,
            round: r,
            block: block,
            pol_round: pol_round  // -1 if new, or the round we locked
        })
    // Every validator starts the timeout for the proposal
    start_timer(TIMEOUT_PROPOSE + r * TIMEOUT_DELTA)
// Step 2: PREVOTE
// Upon receiving a proposal (or timing out), each validator prevotes
function prevote_step(h, r):
    proposal = get_proposal(h, r)
    if proposal == nil:
        // No proposal received — prevote nil
        broadcast(Prevote{h, r, nil})
        return
    block = proposal.block

    // Validate the block
    if not is_valid_block(block):
        broadcast(Prevote{h, r, nil})
        return

    // Safety check: locking rules
    if locked_round != -1 and locked_block != block:
        // We're locked on a different block
        if proposal.pol_round < locked_round:
            // Proposer's proof-of-lock is from before our lock.
            // Don't vote for this — it might conflict
            broadcast(Prevote{h, r, nil})
        else:
            // Proposer has a POL from a round >= our locked round.
            // We need to see 2/3+ prevotes from that round for this block
            if has_polka(h, proposal.pol_round, block):
                // The evidence checks out — safe to prevote
                broadcast(Prevote{h, r, block_id(block)})
            else:
                broadcast(Prevote{h, r, nil})
        return

    // Not locked, or locked on the same block — vote for it
    broadcast(Prevote{h, r, block_id(block)})
// Step 3: PRECOMMIT
// Upon collecting 2/3+ prevotes, each validator precommits
function precommit_step(h, r):
    // Wait for 2/3+ prevotes
    prevotes = collect_prevotes(h, r)
    if has_two_thirds_prevotes_for(prevotes, some_block_id):
        // A "polka" — 2/3+ prevoted for this block
        block = get_block(some_block_id)
        // LOCK on this block
        locked_block = block
        locked_round = r
        broadcast(Precommit{h, r, block_id(block)})
    else if has_two_thirds_prevotes_for(prevotes, nil):
        // 2/3+ prevoted nil — no block this round.
        // Unlock (unless we have a stronger lock from a later round,
        // but this is the latest round, so unlock)
        locked_block = nil
        locked_round = -1
        broadcast(Precommit{h, r, nil})
    else:
        // Neither — timeout expired without 2/3+ for anything
        broadcast(Precommit{h, r, nil})
// COMMIT DECISION
// Upon collecting 2/3+ precommits for a block
function commit_decision(h, r):
    precommits = collect_precommits(h, r)
    if has_two_thirds_precommits_for(precommits, some_block_id):
        block = get_block(some_block_id)
        // Commit the block!
        commit_block(h, block)
        // Save the commit (the set of 2/3+ precommit signatures).
        // This becomes the "commit" that light clients can verify
        save_commit(h, precommits)
        // Move to next height
        start_height(h + 1)
    else if has_two_thirds_precommits_for(precommits, nil):
        // Round failed — try next round
        start_round(h, r + 1)
    else:
        // Timeout without 2/3+ for anything
        start_round(h, r + 1)
Message Flow: Normal Case (Happy Path)
Proposer          Validator 1       Validator 2       Validator 3
   |                   |                 |                 |
   |----PROPOSAL------>|                 |                 |
   |----PROPOSAL---------------------->  |                 |
   |----PROPOSAL-------------------------------------->    |
   |                   |                 |                 |
   |   (each validator validates the block, checks its lock)
   |                   |                 |                 |
   |<----PREVOTE------>|<----PREVOTE---->|<----PREVOTE---->|
   |                   |                 |                 |
   |   (all-to-all: each validator sends its prevote
   |    to every other validator — O(n²)!)
   |                   |                 |                 |
   |   (each independently observes 2/3+ prevotes
   |    for the block — a "polka")
   |                   |                 |                 |
   |   (each validator LOCKS on the block)
   |                   |                 |                 |
   |<---PRECOMMIT----->|<---PRECOMMIT--->|<---PRECOMMIT--->|
   |                   |                 |                 |
   |   (all-to-all again — O(n²))
   |                   |                 |                 |
   |   (each independently observes 2/3+ precommits
   |    → COMMIT the block, advance height)
Note that Tendermint’s prevote and precommit phases use all-to-all communication, just like PBFT’s prepare and commit phases. The message complexity is O(n^2) per round. Tendermint did not adopt HotStuff’s threshold-signature-based linear complexity — a deliberate engineering choice we’ll discuss later.
Why Two Voting Rounds?
Tendermint’s two voting rounds (prevote and precommit) serve the same purpose as PBFT’s prepare and commit:
- Prevote (like PBFT prepare): Establishes that 2/3+ validators agree on a specific block for this round. This creates a “polka” — proof that the network has converged on a block. The polka is used to justify locking.
- Precommit (like PBFT commit): Establishes that 2/3+ validators are locked on the block (they observed the polka and committed to it). Once 2/3+ precommits exist, the block is committed.
The two-round structure ensures that a committed block can always be recovered after a round change or crash, because enough validators are locked on it.
The Locking Mechanism: Tendermint’s Safety Heart
The locking mechanism is what prevents safety violations, and it’s both the most important and most subtle part of Tendermint. Let me walk through it carefully.
Locking Rules
- Lock on polka. When a validator sees 2/3+ prevotes for a block B in round r, it locks on (B, r). This means: “I have evidence that the network might commit B, so I’ll defend it.”
- Only prevote locked block. If a validator is locked on (B, r), it will only prevote for B in subsequent rounds, unless it sees evidence that it’s safe to unlock.
- Unlock on higher polka. If a validator sees a polka for a different block B’ in a round r’ > r, it can unlock from (B, r) and lock on (B’, r’). The higher-round polka proves that the network has moved on.
- Proposer carries proof-of-lock. When a proposer re-proposes a locked block, it includes the pol_round — the round in which it observed the polka. Other validators check this against their own locks to decide if it’s safe to vote.
Why This Works: The Safety Argument
Suppose block B is committed at height h. This means 2/3+ precommits exist for B. Each precommitting validator was locked on B. For a different block B’ to be committed at the same height, 2/3+ validators would need to prevote for B’ — but the locked validators will only prevote for B (or unlock due to a higher polka for B’). Since 2/3+ are locked on B, at most 1/3 can prevote for B’ (the non-locked ones), which isn’t enough for a polka. So B’ can’t get a polka, can’t get precommits, and can’t be committed.
The unlock-on-higher-polka rule doesn’t violate this because: if there’s a higher polka for B’, then 2/3+ prevoted for B’ in a higher round. But if B was already committed (2/3+ precommitted), then 2/3+ were locked on B. For 2/3+ to prevote B’, the intersection (at least 1/3+) would need to have been locked on B but prevoted B’. They can only do this if they saw a polka for B’ in a round > their lock round — but B was committed, meaning 2/3+ precommitted B, meaning 2/3+ locked on B, meaning no polka for B’ is possible (insufficient unlocked validators). Contradiction.
This is one of those safety arguments that’s airtight on paper and terrifying to think about at 2 AM when you’re debugging a consensus failure. The engineering challenge is ensuring that the lock state is persisted correctly across crashes, that the polka evidence is validated rigorously, and that the timing of lock/unlock operations is exactly right.
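The quorum-intersection fact the argument leans on can be checked with simple arithmetic (a toy illustration, not part of any implementation):

```python
# Quorum intersection: any two sets each holding > 2/3 of total power must
# overlap in > 1/3 of power, so with Byzantine power < 1/3 the overlap
# necessarily contains at least one honest validator.
def min_overlap(q1_power, q2_power, total_power):
    # Inclusion-exclusion lower bound: |Q1 ∩ Q2| >= |Q1| + |Q2| - total
    return q1_power + q2_power - total_power

total = 90           # total voting power
quorum = 61          # smallest power strictly greater than 2/3 of 90 (= 60)
overlap = min_overlap(quorum, quorum, total)
print(overlap)       # 32 — strictly more than total/3 (= 30)

byzantine_max = 29   # largest power strictly less than 1/3 of 90
# The overlap exceeds what Byzantine validators can cover, so an honest
# validator sits in both quorums and enforces the locking rules:
print(overlap > byzantine_max)   # True
```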
Common Implementation Bug: The Lock Persistence Problem
Here’s a bug I’ve seen in multiple Tendermint-like implementations: a validator crashes after locking on a block but before writing the lock to stable storage. It restarts, doesn’t remember the lock, and prevotes for a different block. If enough validators do this simultaneously, safety can be violated.
The fix is simple in principle: write the lock to disk before sending the precommit. In practice, this adds disk I/O latency to the critical path and introduces questions about what happens if the write is partially completed when the crash occurs. Tendermint’s production code handles this with a write-ahead log (WAL) that records every state transition before it takes effect.
ABCI: Separating Consensus from Application
One of Tendermint’s most significant architectural decisions is the Application Blockchain Interface (ABCI). ABCI is a socket-based interface that separates the consensus engine from the application logic. The consensus engine (Tendermint Core) handles peer discovery, block propagation, voting, and finality. The application (running as a separate process) handles transaction validation, state transitions, and queries.
// ABCI interface (simplified)
interface Application:
    // Called when a new transaction is received in the mempool.
    // Returns: accept/reject for mempool inclusion
    function CheckTx(tx) -> ResponseCheckTx

    // Called at the start of block processing
    function BeginBlock(header) -> ResponseBeginBlock

    // Called for each transaction in the block, in order
    function DeliverTx(tx) -> ResponseDeliverTx

    // Called at the end of block processing
    function EndBlock(height) -> ResponseEndBlock

    // Called to persist the state changes
    function Commit() -> ResponseCommit  // Returns app state hash

    // Called to query application state
    function Query(path, data) -> ResponseQuery

    // Called to get application info (including latest block height)
    function Info() -> ResponseInfo
The flow during block commitment:
Tendermint Core Application (via ABCI)
| |
| (consensus commits block B |
| at height h) |
| |
|---BeginBlock(B.header)-------->|
|<--ResponseBeginBlock-----------|
| |
|---DeliverTx(tx1)-------------->|
|<--ResponseDeliverTx------------|
|---DeliverTx(tx2)-------------->|
|<--ResponseDeliverTx------------|
| ... (for each tx in block) |
| |
|---EndBlock(h)----------------->|
|<--ResponseEndBlock-------------|
| (may include validator set |
| updates for next height!) |
| |
|---Commit()-------------------->|
|<--ResponseCommit(app_hash)----|
| |
| (app_hash is included in the |
| next block's header, creating |
| a commitment to app state) |
Why ABCI Matters
ABCI’s separation of concerns has profound implications:
- Language independence. The application can be written in any language that speaks ABCI (socket protocol or gRPC). The Cosmos SDK uses Go, but applications have been written in Rust, JavaScript, and others.
- Deterministic replay. The consensus engine guarantees that all validators deliver the same blocks in the same order. The application just needs to be deterministic: given the same sequence of blocks, produce the same state. This is the state machine replication guarantee.
- Validator set changes. The application can change the validator set via EndBlock responses. This is how proof-of-stake systems work on Tendermint: the application logic determines who the validators are, and Tendermint adjusts the consensus participant set accordingly. This is elegant in theory but adds complexity in practice — the consensus engine needs to handle validator set changes between heights while maintaining safety guarantees.
- Application-level validation. CheckTx allows the application to reject invalid transactions before they enter the mempool, and DeliverTx allows per-transaction processing. This keeps garbage out of blocks without the consensus engine needing to understand application semantics.
The downside of ABCI is latency. Every block requires multiple cross-process calls (or cross-machine calls if the application runs on a different host). For high-throughput applications, this overhead is significant. ABCI++ (introduced in CometBFT v0.38) addresses some of this by adding hooks earlier in the consensus process, allowing the application to participate in block proposal (via PrepareProposal and ProcessProposal).
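To make the deterministic-replay guarantee concrete, here is a toy in-process sketch of the ABCI method sequence. This is not the real socket/gRPC protocol, and the KVApp class is invented for illustration: a key-value app where transactions are "key=value" strings and Commit returns a deterministic hash of the state.

```python
# Toy in-process sketch of the ABCI flow (not the real wire protocol).
import hashlib

class KVApp:
    def __init__(self):
        self.state = {}    # committed state
        self.pending = {}  # changes from the block currently being executed

    def check_tx(self, tx: str) -> bool:
        # Mempool admission: accept only syntactically valid transactions
        return tx.count("=") == 1

    def begin_block(self, header: dict) -> None:
        self.pending = dict(self.state)

    def deliver_tx(self, tx: str) -> bool:
        if not self.check_tx(tx):
            return False
        key, value = tx.split("=")
        self.pending[key] = value
        return True

    def commit(self) -> str:
        # Deterministic app hash: same block sequence in, same hash out
        self.state = self.pending
        encoded = repr(sorted(self.state.items())).encode()
        return hashlib.sha256(encoded).hexdigest()

# Two replicas delivering the same block converge on the same app hash:
a, b = KVApp(), KVApp()
for app in (a, b):
    app.begin_block({"height": 1})
    app.deliver_tx("color=blue")
    app.deliver_tx("size=large")
print(a.commit() == b.commit())   # True — deterministic replay
```

The returned hash is what would flow back in ResponseCommit and land in the next block's header.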
Round Structure and Timeouts
Tendermint’s round structure is deterministic, with configurable timeouts:
// Timeout configuration
TIMEOUT_PROPOSE   = 3000ms  // Wait for proposal
TIMEOUT_PREVOTE   = 1000ms  // Wait for 2/3+ prevotes after seeing any
TIMEOUT_PRECOMMIT = 1000ms  // Wait for 2/3+ precommits after seeing any
TIMEOUT_DELTA     = 500ms   // Increment per round (for backoff)

function round_timeout(base_timeout, round):
    return base_timeout + round * TIMEOUT_DELTA

// State machine for a single height
function run_height(h):
    round = 0
    while true:
        // Propose step
        if i_am_proposer(h, round):
            propose(h, round)
        wait_for(
            received_proposal(h, round),
            timeout: round_timeout(TIMEOUT_PROPOSE, round)
        )

        // Prevote step
        do_prevote(h, round)
        wait_for(
            has_two_thirds_plus_prevotes(h, round),
            timeout: round_timeout(TIMEOUT_PREVOTE, round)
        )

        // Precommit step
        do_precommit(h, round)
        wait_for(
            has_two_thirds_plus_precommits(h, round),
            timeout: round_timeout(TIMEOUT_PRECOMMIT, round)
        )

        if committed(h):
            return  // Move to next height

        // Round failed — increment and try again
        round += 1
The increasing timeouts (via TIMEOUT_DELTA) serve as the eventual synchrony mechanism. If the network is temporarily partitioned or a proposer is slow, subsequent rounds give more time for messages to arrive. This is the same exponential-backoff-flavored approach that PBFT uses, but Tendermint makes it more explicit and tunable.
Proposer Selection
Tendermint uses a deterministic, weighted round-robin proposer selection:
function select_proposer(validators, round):
    // Each validator has a "priority" that accumulates
    // based on their voting power
    for v in validators:
        v.priority += v.voting_power

    // Select the validator with highest priority
    proposer = max(validators, key=lambda v: v.priority)

    // Decrease selected proposer's priority
    proposer.priority -= total_voting_power
    return proposer
This ensures that validators propose blocks proportional to their voting power. A validator with 10% of the stake proposes approximately 10% of blocks. The algorithm is deterministic — all validators compute the same proposer for each round — which is essential for consensus.
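A runnable sketch mirroring the pseudocode above (the stake split and tie-breaking by name are made up for illustration) shows the proportionality property directly:

```python
# Weighted round-robin proposer selection, mirroring the pseudocode above.
from collections import Counter

def select_proposer(priorities, powers):
    # Each round, every validator's priority grows by its voting power...
    for v in priorities:
        priorities[v] += powers[v]
    # ...the highest-priority validator proposes (name breaks exact ties)...
    proposer = max(priorities, key=lambda v: (priorities[v], v))
    # ...and pays back the total power, letting the others catch up.
    priorities[proposer] -= sum(powers.values())
    return proposer

powers = {"alice": 10, "bob": 30, "carol": 60}   # 10% / 30% / 60% of stake
priorities = {v: 0 for v in powers}
counts = Counter(select_proposer(priorities, powers) for _ in range(1000))
print(counts)   # carol: 600, bob: 300, alice: 100 — proportional to stake
```

The priorities return to their starting values every total_power / gcd(powers) rounds, so over a full cycle each validator proposes exactly in proportion to its stake.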
Tendermint vs PBFT vs HotStuff
Let’s compare the three BFT protocols we’ve covered.
Protocol Structure
| Aspect | PBFT | HotStuff | Tendermint |
|---|---|---|---|
| Phases | Pre-prepare, Prepare, Commit | Prepare, Pre-commit, Commit, Decide | Propose, Prevote, Precommit |
| Communication pattern | All-to-all (prepare, commit) | Star (through leader) | All-to-all (prevote, precommit) |
| Message complexity | O(n^2) per decision | O(n) per view | O(n^2) per round |
| View/round change | Separate complex protocol | Same as normal case | Same as normal case (next round) |
| Crypto | MACs + signatures | Threshold signatures | Standard signatures |
| Pipelining | No (in base protocol) | Yes (Chained HotStuff) | No (one block per height) |
Locking Mechanism
| Aspect | PBFT | HotStuff | Tendermint |
|---|---|---|---|
| When locked | After prepared certificate | After pre-commit QC | After observing polka (2/3+ prevotes) |
| Lock scope | Per sequence number | Per node in chain | Per height (across rounds) |
| Unlock condition | View change with higher certificate | Higher QC from safe_node rule | Higher polka in later round |
| Lock persistence | Must survive crashes | Must survive crashes | WAL-based persistence |
Performance Characteristics
| Metric | PBFT | HotStuff | Tendermint |
|---|---|---|---|
| Typical block time | N/A (request-based) | 0.5-2s (chained) | 1-7s (configurable) |
| Throughput (n=4) | 80K ops/s | 60K ops/s | 1K-10K TPS |
| Throughput (n=100) | <5K ops/s | 25K ops/s | 100-1K TPS |
| Finality | Immediate | Immediate | Immediate |
| Latency (LAN) | 3-10ms | 5-20ms | 1-7s (block time) |
| Latency (WAN) | 100-500ms | 300-500ms | 5-15s |
Tendermint’s throughput numbers are lower in part because it’s measuring different things: transactions per second through the full application stack (consensus + ABCI + application execution), not just consensus operations. The ABCI overhead and application execution time are significant factors. Raw consensus throughput (without application) would be higher.
Why Tendermint Didn’t Adopt Linear Complexity
A natural question: if HotStuff achieves O(n) message complexity, why does Tendermint stick with O(n^2)?
Several reasons:
- Simplicity. Tendermint uses standard digital signatures, not threshold signatures. No DKG ceremony, no complex cryptographic setup. Any validator can join with a standard key pair. This dramatically simplifies deployment and key management.
- Practical validator sets. Most Cosmos chains run with 50-175 validators. At this scale, O(n^2) is manageable — 175^2 = 30,625 messages per round, which is high but feasible with modern networking. The chains that need 1000+ validators are rare and typically use delegated staking to keep the active set small.
- Gossip-based communication. Tendermint doesn’t actually send n^2 direct messages. It uses a gossip protocol: each validator sends its vote to a subset of peers, who relay it further. This doesn’t change the theoretical complexity, but it spreads the load and works well in real networks.
- No leader bottleneck. With all-to-all communication, no single node is overloaded. In HotStuff, the leader processes all n votes and aggregates them — it does more work than any other node. In Tendermint, work is distributed evenly.
- Historical timing. Tendermint’s core protocol was designed in 2014, years before HotStuff (2018). By the time HotStuff was published, Tendermint had a large production ecosystem. Switching consensus protocols for a running network with billions of dollars at stake is… not done casually.
Light Client Verification
One of Tendermint’s most practical features is its support for light clients — clients that verify consensus without running the full protocol or storing the full state.
A Tendermint light client needs:
- A trusted block header (from genesis or a trusted source).
- The current validator set.
- Block headers and commit signatures for blocks it wants to verify.
// Light client verification
function verify_block(header, commit, trusted_validators):
    // Check that the commit contains 2/3+ voting power
    // of signatures from the validator set
    total_power = sum(v.voting_power for v in trusted_validators)
    signed_power = 0
    for sig in commit.signatures:
        validator = trusted_validators.get(sig.validator_id)
        if validator == nil:
            continue  // Unknown validator, skip
        if not verify_signature(sig, validator.public_key, header):
            continue  // Invalid signature, skip
        signed_power += validator.voting_power
    if signed_power * 3 <= total_power * 2:
        return error("insufficient voting power: need >2/3")
    return ok(header)

// Verifying a header at height h given a trusted header at height t
function verify_header_at_height(h, t, trusted_header_at_t):
    if h == t + 1:
        // Sequential verification: the trusted header commits to the
        // validator set that signs the next block
        header_h = fetch_header(h)
        commit_h = fetch_commit(h)
        validators_h = get_validators_from_header(trusted_header_at_t)
        return verify_block(header_h, commit_h, validators_h)
    else:
        // Skipping verification: can skip ahead if the validator set
        // hasn't changed too much (1/3 overlap rule)
        header_h = fetch_header(h)
        commit_h = fetch_commit(h)
        validators_h = fetch_validators(h)
        // Check that 1/3+ of the trusted validators' power signed header_h
        // (this prevents long-range attacks)
        total_trusted_power = sum(v.voting_power
                                  for v in trusted_header_at_t.validators)
        trusted_power = verify_overlap(
            commit_h, trusted_header_at_t.validators)
        if trusted_power * 3 <= total_trusted_power:
            // Not enough overlap — can't skip, must verify sequentially
            return verify_sequential(t, h)
        return verify_block(header_h, commit_h, validators_h)
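The 1/3-overlap rule for skipping verification reduces to a small power comparison. A toy sketch, with signature verification assumed to have already happened (the can_skip helper and validator names are invented for illustration):

```python
# Skipping-verification overlap rule: to trust a new header directly,
# members of the *trusted* validator set must account for more than 1/3
# of the trusted set's power among the new commit's signers.
def can_skip(trusted_powers, commit_signers):
    """trusted_powers: dict id -> power; commit_signers: ids with valid sigs."""
    total_trusted = sum(trusted_powers.values())
    overlap = sum(p for v, p in trusted_powers.items() if v in commit_signers)
    return overlap * 3 > total_trusted

trusted = {"v1": 50, "v2": 30, "v3": 20}
# v3 alone (20% of trusted power) is not enough overlap...
print(can_skip(trusted, {"v3", "new_a", "new_b"}))   # False — verify sequentially
# ...but v2 + v3 (50%) is.
print(can_skip(trusted, {"v2", "v3", "new_a"}))      # True — safe to skip ahead
```

Intuition: if more than 1/3 of the trusted set's power signed the new header, and less than 1/3 of power is Byzantine, at least one honest trusted validator vouches for it.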
The light client protocol enables:
- Mobile wallets that verify blockchain state without downloading the full chain.
- IBC (Inter-Blockchain Communication) — Cosmos’s cross-chain protocol uses light client verification to prove state on one chain to another.
- Bridges to other ecosystems.
This is something that neither PBFT nor HotStuff addresses directly. Their papers focus on the consensus protocol itself, not on how external observers verify its output. Tendermint’s light client design is a practical contribution that came from building a system that real users need to interact with.
Real-World Deployment: The Cosmos Ecosystem
As of 2025, Tendermint/CometBFT powers:
- Cosmos Hub — the central hub of the Cosmos ecosystem, with ~175 validators and billions in staked value.
- Osmosis — a decentralized exchange with ~150 validators.
- Celestia — a modular data availability layer.
- dYdX — a derivatives exchange that migrated from Ethereum to its own Cosmos chain.
- Hundreds of other chains in the Cosmos ecosystem, each with their own validator sets.
Production Lessons
Here are things we’ve learned from Tendermint’s production deployments that aren’t in any paper:
- Block time tuning is an art. The default 5-7 second block time balances finality latency against network propagation time. Chains have experimented with 1-second block times and found that it works in good network conditions but leads to frequent empty blocks and increased round failures when latency spikes. The right block time depends on your geographic distribution of validators and your tolerance for empty blocks.
- Validator infrastructure is heterogeneous. Some validators run on bare metal in data centers; others run on cloud VMs in different regions. The fastest validator might have 1ms network latency to its peers; the slowest might have 300ms. Timeout tuning must accommodate the slowest honest validator without giving Byzantine validators too much time to misbehave.
- Mempool management matters more than you think. The consensus protocol assumes transactions are available — but getting the right transactions into blocks, deduplicating across the gossip network, and handling transaction validity that changes as state changes is complex. Tendermint’s mempool has been rewritten multiple times.
- State sync is essential. A new validator joining the network can’t replay blocks from genesis (that would take weeks for a mature chain). Tendermint supports state sync: downloading a recent state snapshot and only replaying recent blocks. This requires trust in the snapshot provider, which somewhat undermines the BFT model. In practice, validators use snapshots from multiple sources and verify against the light client protocol.
- Evidence of misbehavior. Tendermint collects evidence of Byzantine behavior (double-signing, specifically) and includes it in blocks. The application can then punish (slash) the misbehaving validator. This economic incentive layer is not part of the consensus protocol per se, but it’s essential for the system’s security in a proof-of-stake setting.
- Upgrades are the hardest problem. Upgrading the consensus protocol on a running network with 150+ independent validators requires coordination that no paper describes. Cosmos chains use “governance proposals” where validators vote on an upgrade block height, and at that height, all validators simultaneously switch to the new software. When this works, it’s elegant. When a validator misses the memo, it forks off and needs to catch up.
ABCI++: The Evolution
CometBFT v0.38 introduced ABCI++ (also called ABCI 2.0), which extends the interface with new hooks:
// New ABCI++ methods
interface Application_v2 extends Application:
    // Called when a proposer is preparing a block.
    // Allows the application to reorder, add, or remove transactions
    function PrepareProposal(txs, max_bytes) -> ResponsePrepareProposal

    // Called by non-proposer validators to validate a proposed block.
    // Can reject the entire block (vote nil) or accept
    function ProcessProposal(block) -> ResponseProcessProposal

    // Called to extend the precommit vote with application data
    function ExtendVote(block) -> ResponseExtendVote

    // Called to verify another validator's vote extension
    function VerifyVoteExtension(extension) -> ResponseVerifyVoteExtension

    // Replaces BeginBlock + DeliverTx + EndBlock with a single call
    function FinalizeBlock(block) -> ResponseFinalizeBlock
These additions address real problems:
- PrepareProposal/ProcessProposal: Gives the application control over block contents. Applications can implement MEV (Maximal Extractable Value) protection, transaction ordering policies, or custom validity rules that go beyond CheckTx.
- ExtendVote/VerifyVoteExtension: Allows validators to attach application-specific data to their votes. Use cases include oracle price feeds (validators attest to off-chain data during consensus), threshold decryption (validators contribute decryption shares), and more.
- FinalizeBlock: Replaces the multi-call block execution with a single atomic call, reducing ABCI overhead.
The Practical Engineering Decisions
Let me enumerate the decisions Tendermint made that you won’t find in BFT papers but that matter enormously in production:
1. Gossip Over Direct Communication
Tendermint doesn’t maintain n^2 direct connections. Instead, each node maintains connections to a subset of peers and uses gossip to disseminate messages. The gossip protocol adds latency (messages take multiple hops) but dramatically reduces the number of connections each node must maintain.
For 150 validators, maintaining 149 direct TCP connections is feasible but adds memory and CPU overhead per connection (TLS, keepalives, etc.). Gossip with 20-40 peers is more practical and more resilient to network topology changes.
2. WAL-Based Crash Recovery
Every state transition — receiving a proposal, prevoting, locking, precommitting — is written to a write-ahead log before the action takes effect. On recovery, the WAL is replayed to restore the validator to its pre-crash state. This is conceptually simple but the details matter: the WAL must be fsynced before proceeding, which adds ~1-5ms of latency per consensus step on typical SSDs.
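A minimal sketch of the write-ahead pattern (not CometBFT's actual WAL format; the WAL class and entry shape are invented for illustration):

```python
# Write-ahead logging: record each consensus step durably BEFORE acting on
# it, and replay the log on restart to restore pre-crash state.
import json, os, tempfile

class WAL:
    def __init__(self, path):
        self.file = open(path, "a")

    def write(self, entry: dict):
        self.file.write(json.dumps(entry) + "\n")
        self.file.flush()
        os.fsync(self.file.fileno())   # durable before the action takes effect

    @staticmethod
    def replay(path):
        # On recovery, replay every logged step in order
        with open(path) as f:
            return [json.loads(line) for line in f if line.strip()]

path = os.path.join(tempfile.mkdtemp(), "consensus.wal")
wal = WAL(path)
# Log the lock BEFORE broadcasting the precommit; if we crash after this
# write, replay restores the lock and the validator cannot equivocate.
wal.write({"step": "lock", "height": 42, "round": 1, "block": "0xabc"})
recovered = WAL.replay(path)
print(recovered[-1]["step"])   # lock
```

The fsync call is the expensive part: it is what adds the ~1-5ms per step mentioned above, and skipping it reintroduces exactly the lock persistence bug described earlier.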
3. Evidence Handling
When a validator detects equivocation (a peer signed two different blocks or votes at the same height/round), it collects the conflicting signatures as evidence and broadcasts them. The evidence is included in future blocks, and the application can slash the misbehaving validator.
This creates an incentive layer that exists outside the consensus protocol. The protocol itself doesn’t need slashing to be safe — safety comes from the 2/3+ honest assumption. But slashing makes it economically irrational to be Byzantine, which is the practical argument for why the 2/3+ honest assumption holds.
4. Proposer-Based Timestamps
Tendermint originally used BFT time (median of validator-reported timestamps) for block timestamps. This was replaced with proposer-based timestamps in later versions, where the proposer sets the block time and validators reject blocks with timestamps too far from their local clocks. This simplifies the protocol and removes a subtle attack vector where Byzantine validators could skew the median time.
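The acceptance check reduces to a time-window comparison. A sketch, with the window parameters as assumed values — the production rule is CometBFT's proposer-based timestamps (PBTS), parameterized by clock precision and message delay bounds, and its exact inequalities differ slightly from this simplification:

```python
# Simplified proposer-timestamp timeliness check (parameter values assumed).
PRECISION_MS = 500    # assumed bound on clock skew between validators
MSG_DELAY_MS = 2000   # assumed worst-case proposal propagation delay

def timestamp_is_timely(proposal_ts_ms, local_receive_ts_ms):
    # Reject timestamps from the future (beyond allowed clock skew)...
    if proposal_ts_ms > local_receive_ts_ms + PRECISION_MS:
        return False
    # ...and timestamps older than skew plus network delay can explain.
    if proposal_ts_ms < local_receive_ts_ms - PRECISION_MS - MSG_DELAY_MS:
        return False
    return True

now = 1_700_000_000_000
print(timestamp_is_timely(now - 1000, now))   # True  — within the window
print(timestamp_is_timely(now + 5000, now))   # False — too far in the future
```

Validators that find a proposal's timestamp untimely prevote nil, so a Byzantine proposer cannot skew block time beyond the configured bounds.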
5. Block Size and Gas Limits
Block size limits and gas limits (maximum computational work per block) are application-level parameters, not consensus parameters. But they profoundly affect consensus performance: a block that takes 5 seconds to execute means the effective minimum block time is 5+ seconds, regardless of what the consensus timeout is configured to. This coupling between application execution time and consensus latency is a source of constant tuning.
When Tendermint Is the Right Choice
Tendermint/CometBFT makes sense when:
- You’re building an application-specific blockchain. The ABCI separation lets you write your application logic in any language while getting production-tested BFT consensus.
- You need immediate finality. Unlike Nakamoto consensus, Tendermint blocks are final once committed. No waiting for 6 confirmations or worrying about chain reorganizations.
- Your validator set is moderate-sized (10-200). Tendermint performs well in this range. Beyond 200, the O(n^2) message complexity starts to bite.
- You want the Cosmos ecosystem. IBC (Inter-Blockchain Communication), the Cosmos SDK, and a large community of validators and developers are significant assets.
- You need light client support. Tendermint’s light client protocol is mature and well-tested.
Tendermint is less ideal when:
- You need thousands of consensus participants. Use something with linear complexity.
- You’re not building a blockchain. If you just need replicated state machine with BFT, Tendermint’s blockchain-specific features (blocks, heights, ABCI) may be unnecessary overhead. Consider a general-purpose BFT library.
- You need sub-second finality. Tendermint’s block-based structure means latency is at least one block time (typically 1-7 seconds).
- You don’t need BFT. If your replicas are trusted, Raft is simpler, faster, and more appropriate.
The Legacy
Tendermint’s lasting contribution isn’t just the consensus protocol — it’s the demonstration that BFT consensus can be productized, deployed at scale, and maintained by an ecosystem of independent operators. The academic BFT community produced brilliant protocols. Tendermint proved they could be turned into infrastructure that handles billions of dollars in value.
The protocol itself is a pragmatic blend of PBFT’s ideas with practical engineering: standard signatures instead of threshold signatures, gossip instead of direct communication, ABCI instead of tightly coupled state machines, WAL-based recovery instead of assumed reliability. These choices sacrifice theoretical optimality for operational simplicity, and the thriving Cosmos ecosystem suggests that’s the right tradeoff for many applications.
CometBFT continues to evolve, and the lessons learned from operating hundreds of chains inform each iteration. The gap between a BFT paper and a BFT production system is still enormous, but Tendermint has done more than any other project to bridge it.