
Past the Borrow Checker

“The borrow checker is the easy part.” — every engineer who has fought for<'a> for an afternoon

There is a moment, somewhere between the second month and the second year of writing Rust, when the language stops being the language you read about and starts being a different language. You have made peace with &mut. You no longer fight ownership; you think in it. You read a function signature and your eye lands on the return type and the lifetimes feel routine, like punctuation. You have shipped real code. You like Rust.

Then you write your first non-trivial async function that returns a future borrowing from self, and the compiler tells you something like:

error: lifetime may not live long enough
   --> src/lib.rs:12:9
    |
9   |     async fn handle<'a>(&'a self, req: &'a Request) -> Response {
    |                     --  ----- has type `&'a Server`
    |                     |
    |                     lifetime `'a` defined here
...
12  |         self.dispatch(req).await
    |         ^^^^^^^^^^^^^^^^^^^^^^^^ requires that `'a` must outlive `'static`

And you read it. And you read it again. And the words on the screen are individually English words, and they form sentences with subjects and verbs, and you understand none of it, because the question you want answered — what do I change — is not in there. The compiler is telling you about a constraint between two things. It is not telling you which thing to move.

This is the wall. Welcome.

What this book assumes

You have read The Rust Programming Language. Probably twice. You have written real Rust — not toys, real code, the kind that sits in a service and handles requests or in a binary that someone else runs. You know what &mut means and why you can’t have two of them. You know what Box<dyn Trait> does. You can read a where T: 'a + Send + Sync clause without flinching. You probably know what Cow is and have used it at least once on purpose.

You are not a beginner. You are an engineer who has bought into Rust, and now Rust is asking more of you than you signed up for.

This book starts at that point. It does not explain ownership. It does not explain match. It does not show you how to call a function. If you need those things, the official book is excellent and free and waiting for you. Come back when you’ve hit the wall.

What the wall looks like

The wall has four corners. They are:

  1. Lifetimes that don’t behave. Specifically, lifetimes that involve closures, or trait objects, or async, or any combination thereof. The simple 'a parameters you got used to from chapter 10 of the official book have an entire algebra you didn’t know existed, and the algebra is what’s failing.

  2. Variance. A property of generic type parameters that nobody told you about, that the compiler knows by heart, that determines whether Vec<&'static str> can be used where Vec<&'a str> is expected, and whose precise rules are why some of your error messages mention subtype and supertype when you weren’t using inheritance.

  3. Higher-rank trait bounds. The for<'a> syntax. The thing that shows up in error messages about closures the moment you try to write a callback that takes a reference. The single language feature most likely to make you wonder if you should switch to Go.

  4. Pin, and the entire async machinery sitting on top of it. A type that, if you look at it for the first time without context, appears to do nothing. It is, in fact, doing the most subtle thing in the standard library, and once you understand why, you will respect it deeply, and once you have to use it in anger, you will resent it.

These four corners are not independent. The reason async is hard is that it touches all three of the others. The reason Pin exists is that the async desugaring would otherwise be unsound. The reason your async closure won’t compile is that HRTBs and lifetime variance interact in ways the inference engine cannot always figure out. The reason your error message mentions 'static is that the compiler has decided your borrow has to outlive a future whose lifetime it cannot bound.

Everything is connected. That is part of why this is hard. We will take the corners one at a time anyway.

Why the official docs stop being enough

The official Rust documentation — the book, the reference, the Rustonomicon — is excellent. It is also written for an audience that is mostly trying to avoid this material. The book stops at lifetimes-as-scopes. The reference will tell you formally what variance is, but it will not tell you why your code triggers it. The Rustonomicon assumes you are writing unsafe, which means it skips the parts of safe Rust that are nevertheless impossible to use without the same depth of understanding.

The async book exists, and it is good, and it will not save you from a real-world Send bound failure across an await point inside a closure inside a tokio::select!. Nothing currently in print will save you from that. You have to build the model yourself, and once it’s built, the error message becomes legible in roughly the same way git rebase --interactive becomes legible after you’ve ruined a branch with it three times.

This book is the thing that wishes it had existed for those three times.

How this book is organized

Part I — The Type System’s Edges. Lifetimes (chapter 1), variance (chapter 2), and HRTB (chapter 3). The three concepts that, together, form the substrate everything else in this book is going to need. Read in order.

Part II — Async, From the Inside. What async fn desugars to (chapter 4), why holding a reference across await is fundamentally different from holding it across a function boundary (chapter 5), Pin and the soundness problem it solves (chapter 6), and the long unhappy story of async fn in traits (chapter 7).

Part III — Survival. Real compiler errors decoded line by line (chapter 8), and the small handful of patterns that cover most of what you’ll actually do in production (chapter 9).

Part IV — Disagreement. When the compiler is right and you are wrong (chapter 10), and when the compiler is wrong and you are right and what you do about it (chapter 11).

There is also a bibliography, which is short, because the writing in this space is mostly blog posts and RFCs, and they are linked there.

A note on tone

I am going to be sympathetic to your suffering and direct about the difficulty. The reason is that I respect you. You are not a beginner being eased in. You have already done the easy part. You have already invested. You deserve someone to acknowledge that the next part is genuinely hard, that the design choices that make it hard are mostly defensible, and that the path through is built out of a small number of mental models you can actually learn.

I will, occasionally, point out where the language design has been controversial. The async-trait situation in particular has burned several years of community attention and is not fully resolved as of 2026. I will not pretend it is. The Rust project’s conduct in working through these problems has been, on the whole, admirable; the result, on the whole, has been imperfect. Both can be true.

Let’s begin.

Lifetimes Are Not What You Think

The first thing the official book teaches you about lifetimes is wrong.

That is unfair. The first thing the official book teaches you about lifetimes is useful. It is the same way the first thing you are taught about the atom is useful — the planetary model, electrons orbiting like little planets — and then somewhere around your second physics class somebody says actually it’s a probability cloud and the planetary model evaporates and you have to start over.

Lifetimes are like that. The planetary model says: a lifetime is how long a value lives. A reference &'a T lives for as long as 'a lives. Functions take references with lifetimes, and the lifetimes have to “match up.” This is the model that gets you through the borrow checker for normal code. It is enough to write a request handler or a parser combinator. It is not enough to read the error messages we are about to read.

The probability cloud version is this: lifetimes are not durations. They are constraints on relationships between scopes. The compiler doesn’t know how long anything lives at runtime. It can’t. Lifetimes are entirely a compile-time construct, and what they describe is a system of inequalities. 'a: 'b does not mean “'a lives longer than 'b.” It means “'a is at least as long as 'b” — that any reference that satisfies 'a is also one that satisfies 'b. It is a subtyping relation between regions of code.

This sounds pedantic. It is pedantic. It is also the only mental model that makes the next chapter’s error messages comprehensible.

Lifetimes as relations

Consider:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

The planetary model: 'a is the lifetime of x and y and the return value, and they all “live the same amount of time.” Fine. Works. But now:

fn main() {
    let result;
    let x = String::from("longer string");
    {
        let y = String::from("short");
        result = longest(&x, &y);
    }
    println!("{}", result);
}

The planetary model says: this should work, because at the point we call longest, both x and y are alive. The compiler says no:

error[E0597]: `y` does not live long enough
   --> src/main.rs:6:31
    |
5   |         let y = String::from("short");
    |             - binding `y` declared here
6   |         result = longest(&x, &y);
    |                              ^^ borrowed value does not live long enough
7   |     }
    |     - `y` dropped here while still borrowed
8   |     println!("{}", result);
    |                    ------ borrow later used here

In the relational model, what’s happening is clear. 'a is a single lifetime that has to satisfy all three references — both inputs and the return. The compiler picks the smallest 'a that works. That 'a has to outlive result’s use on line 8. It also has to be a region during which &y is valid. There is no such region. The two constraints are unsatisfiable. There is no 'a that is both at least as long as result’s scope and contained within y’s scope. The compiler is not telling you y “doesn’t live long enough” in some absolute sense; it is telling you that no assignment of regions to 'a makes the inequalities work.

The duration model can’t see this, because in the duration model both x and y are alive at the moment of the call, and that’s the only thing that should matter. The relational model says: it’s not about the call site, it’s about the system of constraints the function signature imposed.

This is the shift. A lifetime annotation is a promise about what relationships the function preserves. fn longest<'a>(x: &'a str, y: &'a str) -> &'a str says: “give me two references that you can find a common lifetime for, and I will return a reference good for that same lifetime.” The function does not store, extend, or shrink anything. It threads the relationship through.

Once you internalize this, signatures stop reading like declarations and start reading like contracts. fn parse<'input>(s: &'input str) -> Token<'input> says: “the token I return cannot outlive the input string I read it from.” fn get<'a, 'b>(map: &'a Map, key: &'b str) -> Option<&'a Value> says: “I’ll return something tied to the map, and the key can come from anywhere — the key’s lifetime doesn’t constrain the result.” The asymmetry of the two parameters is the point.

'static is not “lives forever”

'static is the lifetime that any reference-into-the-static-data-segment satisfies. String literals have type &'static str because the bytes are baked into the binary. That part is well known.

What is less well known: 'static does not require that the data lives forever. It requires that the data could live forever — equivalently, that the type does not borrow from any non-'static region. A String you allocated at runtime satisfies T: 'static, even though you can drop the String and free the memory. The bound T: 'static does not say “lives forever”; it says “owns its data, or borrows only from 'static.”

This is why std::thread::spawn requires its closure to be 'static. The thread can outlive the calling scope. If the closure borrowed anything from the calling scope, the borrow could become dangling. So the closure must not borrow anything that is not itself 'static. A String moved into the closure is fine, even though the String will be dropped when the thread ends. An owned String does not borrow.

fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,

Read the 'static here as: “this closure does not borrow anything that the calling scope has any business reclaiming.” Not: “this closure runs forever.”

The confusion gets worse when 'static shows up as a bound on a generic parameter, like T: 'static, because then you have a type with a lifetime bound, not a reference. Same rule applies. T: 'static means the type doesn’t borrow non-'static data. Box<dyn Display> is 'static. Box<dyn Display + 'a> is not.

The error message that gets you here is the one that says the parameter type `T` may not live long enough. The compiler is saying: I cannot prove that T doesn’t borrow from a region that ends. The fix is almost always to add T: 'static (if the type really doesn’t borrow anything ephemeral) or to thread a lifetime parameter through (if it does).

Elision rules and when they fail you

Lifetime elision is the compiler’s promise to fill in lifetimes for you when there is exactly one obvious choice. The rules are:

  1. Each elided input lifetime gets its own fresh lifetime parameter.
  2. If there is exactly one input lifetime (elided or not), it is assigned to all elided output lifetimes.
  3. If there are multiple input lifetimes but one of them is &self or &mut self, the lifetime of self is assigned to all elided output lifetimes.

That is the whole rule set. Three rules. Memorize them.

These rules cover the vast majority of function signatures. They do not cover:

  • Functions returning a reference computed from multiple non-self references (rule 2 doesn’t apply, rule 3 doesn’t apply, you have to write the lifetime).
  • Functions where the returned reference’s lifetime should be tied to one specific input but not others (the elision picks self if available, which may not be what you want).
  • Closures (closure lifetime inference is separate from elision and is its own kind of pain — see chapter 3).
  • Anything where the desired output lifetime is not equal to any input lifetime (e.g., a function that returns a reference into a 'static global, regardless of input lifetimes — you have to write &'static explicitly).

When elision picks the wrong thing, you get a borrow checker error at the call site, not at the function definition. This is one of the most common reasons people stare at a borrow checker error and say “but the function looks fine.” The function is fine. The function’s elided signature is wrong for what the function actually does, and the call site is the first place where the lie shows up.

The fix is to write the lifetimes explicitly. If you find yourself doing this often in a function with &self, ask whether you actually wanted the return tied to self (which elision is giving you for free) or tied to one of the arguments (which you have to spell out).

The mental shift

Stop asking “how long does this live.” Start asking “what relationship is this signature enforcing.” When the borrow checker rejects code, do not look at the offending expression and ask whether the data is alive. Look at the function signatures involved and ask which constraints between which scopes are unsatisfiable.

This shift is necessary because everything in the rest of this book — variance, HRTB, async lifetimes, Pin projections — is reasoning at the level of constraints between regions, not durations of values. If you are still in the duration model, the rest of this book will read as gibberish. If you are in the relational model, it will read as math, which is harder but at least answerable.

A practical exercise: take a function from your codebase whose signature uses lifetime elision. Write out the desugared signature with all lifetimes explicit. Now imagine you are a hostile lawyer trying to break the contract the function signed. What can you pass in that satisfies the signature but violates the function’s actual intent? Most of the time, the answer is “nothing,” and the elision was right. Sometimes the answer is “an argument from a shorter scope than I assumed,” and you have just found a future bug.

What’s coming

The next chapter is variance. It is the prerequisite for understanding why some lifetime substitutions the compiler makes are legal and others aren’t, which is the prerequisite for understanding what the HRTB error in chapter 3 is telling you, which is the prerequisite for chapter 5, which is the prerequisite for Pin. Each chapter assumes the previous one. The relational model from this chapter is the one piece of equipment we will not put down.

Variance

Variance is the part of the type system you have been using correctly without knowing it existed, and it stays invisible until the day it doesn’t, and then it ruins your week.

Variance answers a question that, in everyday Rust, has an answer so obvious you never ask it: when I have a &'long T and the function wants a &'short T, can I pass it? You know the answer is yes — references that live longer can be used where references that live shorter are needed. That “yes” is a statement about variance. Specifically, it is the statement that &'a T is covariant in 'a. The longer the lifetime, the more places the reference is acceptable.

Now ask the same question about &'a mut T. Can a &'long mut T be passed where a &'short mut T is wanted? The answer is also yes, and people often shrug and assume the rule generalizes. But ask the other question: can a &'a mut Vec<&'long str> be passed where a &'a mut Vec<&'short str> is wanted? The answer is no, and the reason it is no is that &mut T is invariant in T. That distinction — covariant in the reference’s lifetime, invariant in the referent’s type — is the thing that, once you fail to internalize it, will produce the most cryptic error messages in your career.

Let’s get precise.

Definitions, with care

Variance is a property of a generic type constructor with respect to one of its parameters. Given a type constructor F<_> and a subtyping relation A <: B (read: “A is a subtype of B,” i.e., A can be used wherever B is expected), F is one of:

  • Covariant in its parameter if A <: B implies F<A> <: F<B>. Subtyping is preserved.
  • Contravariant in its parameter if A <: B implies F<B> <: F<A>. Subtyping is reversed.
  • Invariant in its parameter if neither implication holds. Subtyping requires exact equality.

In Rust, the only subtyping that exists in normal code is lifetime subtyping: 'long: 'short (read: “'long outlives 'short”) implies 'long <: 'short for the purpose of variance. Note the direction: the longer lifetime is the subtype. This feels backwards on first encounter and remains slightly disorienting forever. The reason is that a reference good for 'long can be used in any context that needs a reference good for 'short — the long-lived reference is more capable, and “more capable” is what makes something a subtype in this kind of system.

So: &'long T is a subtype of &'short T. &'a T is covariant in 'a. Familiar.

&'a mut T and the surprise

Here is the thing that makes variance suddenly matter. &'a mut T is covariant in 'a (same as &'a T) but invariant in T.

Why invariant in T? Because a &mut T lets you write a T. If &'a mut T were covariant in T, you could pass a &mut Vec<&'static str> where a &mut Vec<&'short str> was expected, and the function could then push a &'short str into the vector, and now you have a Vec<&'static str> containing a non-'static reference. Soundness gone.

The classic illustration:

fn assign<T>(input: &mut T, val: T) {
    *input = val;
}

fn main() {
    let mut hello: &'static str = "hello";
    {
        let world = String::from("world");
        assign(&mut hello, &world); // ERROR
    }
    println!("{hello}"); // would be a dangling reference
}

If &mut T were covariant in T, this code would compile and hello would dangle. The compiler refuses:

error[E0597]: `world` does not live long enough
   --> src/main.rs:8:28
    |
7   |         let world = String::from("world");
    |             ----- binding `world` declared here
8   |         assign(&mut hello, &world);
    |                            ^^^^^^ borrowed value does not live long enough
9   |     }
    |     - `world` dropped here while still borrowed

The error doesn’t mention variance. It almost never does. But variance is what’s enforcing the constraint. The compiler refused to coerce &mut &'static str to &mut &'short str, because &mut T is invariant in T, so it then tried to use the actual lifetime of world everywhere, which was too short.

Memorize this: writing through a reference makes the type position invariant. Reading through a reference (and only reading) keeps the type position covariant. Anything that is both read and written — both “in” and “out” — must be invariant.

The variance table

Here is the variance of the standard-library types you actually use. Read it once. Refer back to it.

Type                                  Variance in T    Variance in 'a
&'a T                                 covariant        covariant
&'a mut T                             invariant        covariant
Box<T>                                covariant        -
Vec<T>                                covariant        -
*const T                              covariant        -
*mut T                                invariant        -
Cell<T>, RefCell<T>, UnsafeCell<T>    invariant        -
fn(T) -> ()                           contravariant    -
fn() -> T                             covariant        -
fn(T) -> T                            invariant        -
PhantomData<T>                        covariant        -
PhantomData<&'a T>                    covariant        covariant
PhantomData<&'a mut T>                invariant        covariant
PhantomData<fn(T)>                    contravariant    -
PhantomData<fn() -> T>                covariant        -
PhantomData<fn(T) -> T>               invariant        -

Two patterns to notice.

First, function pointers. fn(T) -> () is contravariant in T, because a function that takes Animal is more general than a function that takes Dog — it can handle any animal, including dogs. So if you have a fn(Animal), you can use it wherever a fn(Dog) is needed (it accepts strictly more things). Subtyping reversed. Contravariance.

Second, Cell<T> and friends. They are invariant for the same reason &mut T is: they let you write through. The only way to safely permit interior mutation is to disallow type coercion entirely.

Where variance shows up

In ordinary Rust code, variance is invisible. Lifetime subtyping happens silently on every function call, and the variance rules let it. You only notice variance when:

  1. You write a generic struct that stores a *mut T or has interior mutability, and the inferred variance is invariant when you wanted covariant. PhantomData can only tighten variance, never loosen it — a struct’s variance is the strictest of its fields’ — so the fix is to change the field itself, typically storing *const T (and casting at use sites) plus a PhantomData<T> to keep the ownership semantics.

  2. You write a generic struct that stores a function pointer, and the inferred variance is contravariant when you wanted invariant. Here PhantomData alone does work, because you are tightening: adding a PhantomData<fn(T) -> T> field pins the parameter to invariant.

  3. You hit an HRTB inference failure on a closure, and the underlying reason is that the closure’s argument position is contravariant in its lifetime, which makes the inference engine give up. We will revisit this in chapter 3.

  4. You write a self-referential type and need to reason about whether moves preserve soundness. We will revisit this in chapter 6.

  5. An async function signature is rejected because the inferred future has a Send bound that depends on a borrow whose variance the compiler cannot bridge. Chapter 5.

That is most of the surface area where variance bites. None of it shows up in the borrow checker’s first thousand error messages you encounter. All of it shows up in the next thousand.

The PhantomData patterns

PhantomData<T> is a zero-sized type whose only job is to tell the compiler that the surrounding struct logically contains a T, even though it doesn’t, for purposes of:

  • Drop check (the struct counts as containing a T for drop ordering).
  • Variance inference (the struct gets the variance of T).
  • Auto-trait inference (the struct gets T’s Send/Sync).

The variance use is what we care about here. The patterns:

use std::marker::PhantomData;

// "I logically own a T."
// Variance: covariant in T. Drop-check: yes.
struct OwnsT<T>(PhantomData<T>);

// "I have a raw pointer to T but logically own it."
// Variance: covariant. Drop-check: yes.
// Send/Sync: NOT automatic (the raw pointer suppresses them);
// write the unsafe impls yourself if they are sound.
struct OwnsTViaPtr<T>(*const T, PhantomData<T>);

// "I borrow a T immutably for 'a."
// Variance: covariant in 'a, covariant in T.
struct BorrowsT<'a, T>(PhantomData<&'a T>);

// "I borrow a T mutably for 'a."
// Variance: covariant in 'a, INVARIANT in T.
struct BorrowsMutT<'a, T>(PhantomData<&'a mut T>);

// "I'm a callback that consumes a T."
// Variance: CONTRAVARIANT in T.
struct ConsumesT<T>(PhantomData<fn(T)>);

// "I'm a token tied to T but I never touch one."
// Use this when you want the marker but want to opt OUT of all
// the auto-trait inheritance and drop-check baggage.
struct TokenForT<T>(PhantomData<fn() -> T>);

The last one is the interesting trick. PhantomData<fn() -> T> is covariant in T (function return positions are covariant) but does not count as containing a T for drop check, and it is unconditionally Send + Sync regardless of T. This is the right PhantomData to use when you have a marker type that is logically “associated with” T but you don’t actually own or borrow one.

The wrong PhantomData will compile but will produce surprising errors at the call site, often involving auto-trait bounds that look unrelated to the marker. If you find yourself debugging a Send error on a struct that contains a PhantomData<*const T>, the answer is probably to switch to PhantomData<fn() -> T>.

A worked example

Here is a real bug that variance catches.

struct Cache<'a, T> {
    items: Vec<&'a T>,
}

impl<'a, T> Cache<'a, T> {
    fn add(&mut self, item: &'a T) {
        self.items.push(item);
    }

    fn get(&self, idx: usize) -> Option<&&'a T> {
        self.items.get(idx)
    }
}

You then try:

fn main() {
    let value = String::from("hello");
    let mut cache: Cache<'_, String> = Cache { items: vec![] };
    cache.add(&value);

    let value2 = String::from("world");
    cache.add(&value2);
    drop(value2); // can we?
    println!("{:?}", cache.get(0));
}

The compiler says no, because Cache<'a, T>’s Vec<&'a T> field, behind &mut self in add, makes the lifetime invariant — the cache’s 'a got pinned to the intersection of all the lifetimes you put into it. Once you call add with a reference shorter than the cache’s apparent lifetime, the cache’s 'a becomes that shorter lifetime, and any subsequent use of the cache’s contents respects that.

This is variance silently doing its job. The Vec<T> field is covariant in T, so &'long T can be coerced to &'short T for reads. But add takes &mut self, and self’s Cache<'a, T> is invariant in 'a because the field is read-write through &mut self. So the compiler can’t shrink 'a for the duration of the call — it has to unify the call site’s argument lifetime with 'a. Hence the constraint.

You did not type the word “variance” anywhere. Variance was the mechanism.

Sources and further reading

  • The Subtyping and Variance section of the Rustonomicon — the canonical reference, terse but accurate.
  • Niko Matsakis’s blog posts on variance from the early Rust days, particularly the ones that argued for the current variance inference algorithm.
  • Aaron Turon’s series on the type system internals, which explains why Cell<T> is invariant from first principles.

The next chapter, on HRTBs, depends on this one. Make sure the variance table feels familiar before continuing — &mut T invariant in T, fn(T) -> () contravariant in T, Cell<T> invariant in T. If those feel arbitrary now, they will feel arbitrary in chapter 3, and the HRTB chapter is hard enough on its own.

Higher-Rank Trait Bounds

The syntax is for<'a>. The reading is “for any choice of 'a.” The reason it exists is closures, primarily, and the reason it produces such bad error messages is that closures are the part of Rust where lifetime inference, trait inference, and variance all collide at once.

If you have spent any time writing Rust libraries that take callbacks, you have seen this:

fn use_callback<F>(f: F) where F: Fn(&str) -> &str {
    let s = String::from("hello");
    let result = f(&s);
    println!("{result}");
}

This compiles. It works. You did not write for<'a> anywhere. But the compiler did, on your behalf, and the desugared signature is:

fn use_callback<F>(f: F) where F: for<'a> Fn(&'a str) -> &'a str { ... }

That for<'a> is the thing this chapter is about. It says: the closure must be callable with a reference of any lifetime, not some particular lifetime. The closure must work for every 'a, not for one 'a chosen up front.

The distinction matters because the alternative — picking a single 'a at the function boundary — would make the closure useless. The body of use_callback calls f with a reference whose lifetime is bounded by s, which is local to the function. That lifetime didn’t exist when use_callback was called. There is no way the caller could have picked it. So the trait bound has to be higher-ranked over all possible lifetimes — for<'a>, “for any 'a” — to let the body pick a fresh local one and have the closure still satisfy the bound.

This is what HRTB is. A trait bound that quantifies over a lifetime universally, not existentially.

When elision works and when it doesn’t

For closure bounds where the lifetime appears in both an argument and the return, Rust elides for<'a> exactly the way it elides lifetimes in function signatures. So Fn(&str) -> &str is sugar for for<'a> Fn(&'a str) -> &'a str. This covers maybe 80% of real callback patterns.

It stops covering you in the cases where:

  • You write the lifetime explicitly because you want a non-elided pattern.
  • You return a Box<dyn Fn(...)> or store a closure in a struct.
  • You have multiple closures with related lifetimes.
  • The closure captures something with a lifetime.

In those cases, the elision either doesn’t apply or applies wrongly, and you have to write for<'a> yourself, and you should know what you’re saying.

The capture problem

Here is where the wheels come off. Suppose:

fn make_counter<'a>(prefix: &'a str) -> impl Fn(&str) -> String {
    move |name| format!("{prefix}: {name}")
}

The compiler will tell you something like this (on editions before 2024; the 2024 edition captures all in-scope lifetimes in impl Trait by default, which makes this particular example compile):

error[E0700]: hidden type for `impl Fn(&str) -> String` captures lifetime that does not appear in bounds
  --> src/lib.rs:1:42
   |
1  | fn make_counter<'a>(prefix: &'a str) -> impl Fn(&str) -> String {
   |                 --                      ^^^^^^^^^^^^^^^^^^^^^^^
   |                 |
   |                 hidden type `[closure@src/lib.rs:2:5: 2:12]` captures the lifetime `'a` as defined here

The error is real, the explanation is somewhere off-screen, and the answer involves + 'a somewhere. The fix:

fn make_counter<'a>(prefix: &'a str) -> impl Fn(&str) -> String + 'a {
    move |name| format!("{prefix}: {name}")
}

The + 'a says: the returned impl Fn borrows from 'a and cannot outlive it. Without that bound, the returned type has no information about the borrow it captures, and the compiler refuses to lose track of it.

This is a separate concept from HRTB, but they interact. The closure’s argument is for<'b> Fn(&'b str) (universally quantified — the closure must work for any caller-chosen lifetime). The closure’s capture is at a fixed 'a (existentially given — it borrows the specific prefix that was passed in). So the type is:

impl for<'b> Fn(&'b str) -> String + 'a

Two different lifetimes, two different quantifications, one closure. The compiler infers all of this for you when it can. When it can’t, it asks you to be explicit, and you have to know which lifetime is which.
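To see both quantifications at once, here is the fixed `make_counter` exercised: the returned closure accepts arguments of any lifetime (`'b`, universal) while remaining pinned to the specific lifetime of `prefix` (`'a`, fixed).

```rust
fn make_counter<'a>(prefix: &'a str) -> impl Fn(&str) -> String + 'a {
    move |name| format!("{prefix}: {name}")
}

fn main() {
    let prefix = String::from("req");
    let tag = make_counter(&prefix);
    // Any caller-chosen argument lifetime works ('b is universal)...
    assert_eq!(tag("alpha"), "req: alpha");
    {
        let local = String::from("beta");
        assert_eq!(tag(&local), "req: beta");
    }
    // ...but `tag` itself cannot outlive `prefix` ('a is fixed);
    // dropping `prefix` before the last use of `tag` would not compile.
    println!("ok");
}
```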

When inference gives up

Now we get to the bad place. Consider:

#![allow(unused)]
fn main() {
fn apply<F, T, R>(items: &[T], f: F) -> Vec<R>
where
    F: Fn(&T) -> R,
{
    items.iter().map(|x| f(x)).collect()
}
}

Looks fine. Compiles. Now you try to use it with a closure that returns a reference into its argument:

fn first_word(s: &String) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let strings = vec![String::from("hello world"), String::from("foo bar")];
    let firsts: Vec<&str> = apply(&strings, first_word);
    println!("{firsts:?}");
}

This will produce an error like:

error: implementation of `FnOnce` is not general enough
  --> src/main.rs:8:29
   |
8  |     let firsts: Vec<&str> = apply(&strings, first_word);
   |                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^ implementation of `FnOnce` is not general enough
   |
   = note: `fn(&'2 String) -> &'2 str {first_word}` must implement `FnOnce<(&'1 String,)>`, for any lifetime `'1`...
   = note: ...but it actually implements `FnOnce<(&'2 String,)>`, for some specific lifetime `'2`

This is the error message that ends careers. Read it carefully. The compiler is saying: the bound F: Fn(&T) -> R desugars (via elision) to F: for<'a> Fn(&'a T) -> R, where R is some type fixed at the call site. But first_word returns &'a str where 'a matches the input — its actual signature is fn first_word<'a>(s: &'a String) -> &'a str. There is no single R that works for all 'a. first_word is a higher-ranked function, but apply’s bound asked for an R that did not depend on the closure’s input lifetime.

The fix is to push the higher-ranking through to R:

#![allow(unused)]
fn main() {
fn apply<F, T, R: ?Sized>(items: &[T], f: F) -> Vec<&R>
where
    F: for<'a> Fn(&'a T) -> &'a R,
{
    items.iter().map(|x| f(x)).collect()
}
}

But wait — now R is tied to the lifetime of each individual call, but the result Vec<&R> has only one lifetime, which has to encompass all of them. The compiler will figure out that all of those 'as have to be the same lifetime, namely the lifetime of the borrow of items, and unify them. This works, but it took a manual rewrite of the signature and a second trip through the compiler to get there.
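With the rewritten bound, the call that previously died now compiles and behaves. A self-contained check:

```rust
fn first_word(s: &String) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The fixed signature: the higher-ranking is pushed through to the
// return type, so `R` is allowed to borrow from the argument.
fn apply<F, T, R: ?Sized>(items: &[T], f: F) -> Vec<&R>
where
    F: for<'a> Fn(&'a T) -> &'a R,
{
    items.iter().map(|x| f(x)).collect()
}

fn main() {
    let strings = vec![String::from("hello world"), String::from("foo bar")];
    let firsts: Vec<&str> = apply(&strings, first_word);
    assert_eq!(firsts, ["hello", "foo"]);
    println!("{firsts:?}");
}
```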

This is the genre of error. It happens because the inferred for<'a> bound is not the one the user wanted — sometimes it’s too general (the closure can’t satisfy it), sometimes it’s not general enough (the body needs more freedom than the bound allows). The error message will use phrases like:

  • “implementation of Fn is not general enough”
  • “one type is more general than the other”
  • '1 must outlive '2
  • “expected a function pointer, found a closure”

When you see those phrases, you are in HRTB-land. The fix is almost always to write the for<'a> bound explicitly and to think hard about which type and lifetime parameters need to be inside it and which need to be outside.

The patterns that work

Here are the small number of HRTB patterns that cover most real code. Memorize the shapes.

Pattern 1: callback that takes a reference and returns nothing.

#![allow(unused)]
fn main() {
fn for_each<T, F: Fn(&T)>(items: &[T], f: F) {
    for item in items { f(item); }
}
}

Elision works. Don’t write the for<'a> explicitly; you don’t need to.

Pattern 2: callback that takes a reference and returns a value (not borrowing).

#![allow(unused)]
fn main() {
fn map<T, R, F: Fn(&T) -> R>(items: &[T], f: F) -> Vec<R> {
    items.iter().map(f).collect()
}
}

Elision works because R doesn’t borrow.

Pattern 3: callback that takes a reference and returns a reference into the same data.

#![allow(unused)]
fn main() {
fn map_ref<'a, T, R: ?Sized, F>(items: &'a [T], f: F) -> Vec<&'a R>
where
    F: for<'b> Fn(&'b T) -> &'b R,
{
    items.iter().map(|x| f(x)).collect()
}
}

This is the case that breaks. You need explicit for<'b> and you need to thread 'a through the result manually. Note: R: ?Sized lets it work for str and other unsized types; drop it if you don’t need that.

Pattern 4: storing a closure in a struct.

#![allow(unused)]
fn main() {
struct Validator<F: for<'a> Fn(&'a str) -> bool> {
    check: F,
}
}

The for<'a> is required because the struct has no other source for 'a. If you want the closure to also borrow something with a lifetime, you need a separate lifetime parameter on the struct.

Pattern 5: trait object closure.

#![allow(unused)]
fn main() {
type Callback = Box<dyn for<'a> Fn(&'a str) -> &'a str>;
}

Strictly, elision gives you the same type here (Box<dyn Fn(&str) -> &str> desugars to exactly this for<'a> form), but a trait object alias is a contract other people read, and spelling out the quantifier documents which lifetimes are universal. Write for<'a> explicitly for trait object closure types that take or return references; it becomes mandatory the moment you name a lifetime explicitly.

These five patterns cover the overwhelming majority of real code. When you hit something that doesn’t fit, write out the desugared signature with all lifetimes explicit, then add for<'a> quantifiers around the parameters that aren’t bound to anything outside the closure.
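Patterns 4 and 5 in use. `trim_cb` is a hypothetical fn item standing in for a closure, because fn items coerce to the higher-ranked trait object cleanly, whereas closures that return borrows from their argument sometimes trip the very inference failures described above.

```rust
struct Validator<F: for<'a> Fn(&'a str) -> bool> {
    check: F,
}

type Callback = Box<dyn for<'a> Fn(&'a str) -> &'a str>;

// fn item: `for<'a> fn(&'a str) -> &'a str`, coerces without trouble.
fn trim_cb(s: &str) -> &str {
    s.trim()
}

fn main() {
    // The closure returns bool (no borrow), so inference finds the
    // higher-ranked type from the annotated `&str` parameter.
    let v = Validator { check: |s: &str| !s.trim().is_empty() };
    assert!((v.check)("  ok  "));
    assert!(!(v.check)("   "));

    let cb: Callback = Box::new(trim_cb);
    assert_eq!(cb("  padded  "), "padded");
    println!("ok");
}
```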

Why it has to be this way

A reasonable question: why doesn’t the compiler just figure all of this out? Closures are values. Their types are inferred. Why is HRTB inference, of all things, the place where the inference engine tells you to do its job for it?

The honest answer is that universal quantification over lifetimes is not in general decidable in the presence of trait bounds, and the partial decision procedure the compiler uses has to bail out somewhere. The for<'a> quantifier interacts with associated types, with where clauses, with auto traits, and with variance, and the failure modes compound. The Rust team has improved this dramatically over the years — every couple of releases, another class of HRTB inference failure starts working — but there is a hard limit somewhere short of “the compiler always figures it out.”

Niko Matsakis has written extensively on this; the relevant phrases to search for are “leak check” and “implied bounds.” The short version: the compiler cannot always tell whether the universal quantifier in a for<'a> bound is satisfied without checking, in effect, every lifetime, and the algorithm it uses is sound but incomplete. So it errs on the side of rejecting valid programs rather than accepting invalid ones, and you have to write the bound explicitly to convince it.

This is one of those places where the type system is genuinely making a tradeoff. The tradeoff is: we will accept worse error messages in exchange for guaranteeing soundness, and we will accept asking the user to write for<'a> in exchange for not having to wait three years for a complete inference algorithm. You may disagree with the tradeoff. Most working Rust engineers, on a long enough timescale, come around to it.

Sources and further reading

  • The Higher-Rank Trait Bounds section of the Rust Reference.
  • Niko Matsakis’s posts on higher-ranked subtyping, particularly the ones discussing the leak check. (His blog is the canonical source for this material.)
  • The “Closures: Anonymous Functions that Can Capture Their Environment” chapter of the Rust book, which gives the basic vocabulary even if it doesn’t go this deep.

Next: async. Where everything from the last three chapters comes together at once.

Async Internals

async fn is sugar. That sentence is the entire chapter, expanded.

You probably know async fn is sugar already. You may have seen the trait Future and noticed that async blocks evaluate to something implementing it. You have used .await and noticed that the function suspends and resumes. You have written #[tokio::main] or #[async_std::main] or whatever your runtime provides and gotten on with your life.

This chapter is the part where we don’t get on with our life. We desugar everything, look at what the compiler is generating, and understand why every async lifetime error in the next chapter is the consequence of a mechanical transformation that the compiler does on your behalf and then refuses to apologize for.

What Future actually is

#![allow(unused)]
fn main() {
pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

pub enum Poll<T> {
    Ready(T),
    Pending,
}
}

Three things to notice.

poll takes Pin<&mut Self>, not &mut self. This is the entire reason Pin exists in the standard library, and we will spend chapter 6 on it. For now: Pin<&mut Self> is &mut self plus a promise that the future will not be moved in memory. Some futures need that promise to be sound. Most don’t, but the trait has to assume the strongest constraint.

poll takes a Context<'_>, which contains a Waker. The waker is the future’s link back to the executor. When the future returns Poll::Pending, it is responsible for arranging — somehow — for cx.waker().wake() to be called when the future is ready to make progress. If it doesn’t, the executor will never poll it again, and the future will hang forever. The future is responsible for waking itself. The executor does not poll on a timer.

poll returns Poll<T>, not a result. A future is not “done” or “errored.” A future is “ready with a value” or “still working.” If you want errors, your T is a Result. The future machinery does not know about errors.

That’s the trait. Three lines, three subtle traps each.
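The whole contract fits in a few dozen lines. Here is a hand-written future, polled by hand with a no-op Waker built from std::task::Wake. `CountDown` is a toy that reports Pending twice before finishing; in a real program it would stash the Waker from cx and arrange a wake-up, but here the caller polls in a loop, so waking can be a no-op.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Reports Pending `n` times, then Ready.
struct CountDown(u32);

impl Future for CountDown {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 == 0 {
            Poll::Ready("done")
        } else {
            self.0 -= 1;
            Poll::Pending
        }
    }
}

// A Waker that does nothing; fine here because we poll in a busy loop.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn main() {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = CountDown(2);
    let mut pin = Pin::new(&mut fut); // CountDown is Unpin, so Pin::new is fine
    let mut polls = 0;
    let out = loop {
        polls += 1;
        if let Poll::Ready(v) = pin.as_mut().poll(&mut cx) {
            break v;
        }
    };
    assert_eq!(out, "done");
    assert_eq!(polls, 3);
    println!("{out} after {polls} polls");
}
```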

The state machine

Now the desugaring. Take this:

#![allow(unused)]
fn main() {
async fn fetch_and_parse(url: &str) -> Result<Data, Error> {
    let body = http_get(url).await?;
    let parsed = parse(&body)?;
    Ok(parsed)
}
}

What the compiler generates is, conceptually:

#![allow(unused)]
fn main() {
fn fetch_and_parse<'a>(url: &'a str) -> impl Future<Output = Result<Data, Error>> + 'a {
    enum State<'a> {
        Start { url: &'a str },
        WaitingForGet { fut: HttpGetFuture<'a> },
        Done,
    }

    struct FetchAndParseFuture<'a> { state: State<'a> }

    impl<'a> Future for FetchAndParseFuture<'a> {
        type Output = Result<Data, Error>;
        fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
            loop {
                match &mut self.state {
                    State::Start { url } => {
                        let fut = http_get(url);
                        self.state = State::WaitingForGet { fut };
                    }
                    State::WaitingForGet { fut } => {
                        // This Pin projection is the unsafe part; pin-project handles it for you.
                        let pinned_fut = unsafe { Pin::new_unchecked(fut) };
                        match pinned_fut.poll(cx) {
                            Poll::Pending => return Poll::Pending,
                            Poll::Ready(Err(e)) => {
                                self.state = State::Done;
                                return Poll::Ready(Err(e));
                            }
                            Poll::Ready(Ok(body)) => {
                                let parsed = parse(&body);
                                self.state = State::Done;
                                return Poll::Ready(parsed);
                            }
                        }
                    }
                    State::Done => panic!("polled after completion"),
                }
            }
        }
    }

    FetchAndParseFuture { state: State::Start { url } }
}
}

This is roughly what async fn produces. The actual generated code is more efficient, more carefully structured, and uses internals that aren’t public, but the shape is this: an enum with a variant per await point, plus a fixed initial state and a terminal “done” state, plus a poll method that drives state transitions.

A few things become legible.

The await is a match on poll. When you write expr.await, the compiler generates a state transition: enter the “waiting” state with expr as the inner future, then loop polling it; when it returns Pending, return Pending; when it returns Ready(v), take v and continue.

The state machine contains the inner futures. When you let fut = http_get(url).await, the HttpGetFuture you got back from http_get is stored as a field of the outer state machine for as long as you’re awaiting it. This is the first hint at why Pin matters: that inner future has its own internal pointers, possibly into its own state, and those pointers would dangle if the outer state machine moved in memory.

The state machine borrows from its arguments. Notice the 'a lifetime parameter. The future returned by fetch_and_parse(url) borrows from url for its entire life. If url goes out of scope before the future completes, the future is invalid. The lifetime 'a is part of the future’s type and gets propagated through every level of nesting. This is the main source of async lifetime pain in chapter 5.

Local variables become enum fields. Any local that is alive across an await point gets stored in the state machine. This is why holding references across await points is the central source of async lifetime errors: the reference, and the data it borrows, both have to be live for as long as the future is, and the type system has to prove that.
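You can watch locals become enum fields by measuring. std::mem::size_of_val on an async block (never polled, only constructed) shows that a buffer held across an await is part of the state machine, while one that dies before the await is not. This relies on current rustc layout behavior, which is not guaranteed by any spec, but the qualitative difference is the point.

```rust
use std::mem::size_of_val;

fn main() {
    // `buf` is used after the await, so it lives across the yield point
    // and must be stored in the state machine.
    let held = async {
        let buf = [0u8; 1024];
        std::future::ready(()).await;
        buf[0]
    };
    // Here `buf` is dead before the await; only the one-byte copy survives.
    let dropped = async {
        let buf = [0u8; 1024];
        let first = buf[0];
        std::future::ready(()).await;
        first
    };
    assert!(size_of_val(&held) >= 1024);
    assert!(size_of_val(&held) > size_of_val(&dropped));
    println!("held: {} bytes, dropped: {} bytes",
             size_of_val(&held), size_of_val(&dropped));
}
```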

What poll actually does

The executor’s job is to call poll on a future repeatedly until it returns Ready. That is the entire executor contract. The runtime is just code that calls poll. There is no magic in tokio or async-std or smol; they are libraries with executors, schedulers, and reactors that arrange for poll to be called at the right times.

When poll returns Pending, the future has registered its Waker somewhere — with the OS (epoll, kqueue, IOCP), with a timer wheel, with a channel — and that registration will eventually trigger a call to waker.wake(), which tells the executor “this future is ready, please call poll again.” The executor, when it gets that signal, schedules the future to be polled. It does not poll immediately; it puts the future into a runqueue. Eventually, some thread the executor controls picks the future up and polls it again.

This is cooperative scheduling. Nothing preempts a future. If your future runs a tight CPU-bound loop without any .await, it will run that loop to completion on the executor thread, blocking every other future scheduled on that thread. This is the “blocking the executor” problem and the reason tokio::task::spawn_blocking exists — to move CPU-bound work off the async threadpool and onto a worker pool that the executor doesn’t depend on.

The flip side: an await is a yield point. When your code reaches await, it gives the executor an opportunity to run something else. Between await points, your code runs synchronously and uninterruptibly on the executor thread. This has two consequences:

  1. Code between await points is de facto atomic from the executor’s perspective. If you take a MutexGuard and don’t await while holding it, no other future on this thread can take the same lock. (Other threads can; this isn’t a real critical section.)

  2. Code between await points is not atomic from a multi-threaded perspective. If your future is Send and the executor is multi-threaded, the future can be moved between threads at every await. This means await points are also where data races can sneak in if your shared state isn’t properly synchronized.

The single most important mental model in async Rust is: await is where things happen. Before await, you have set up state. After await, you have new state. At await, the world might change underneath you. Schedule, suspend, race, drop — all of these are at await. Code between awaits is boring and synchronous.

Why a runtime is required

Rust’s standard library defines Future, Poll, Context, and Waker. It does not provide an executor. There is no std::async::run_until_complete. You cannot run a Future with the standard library alone.

This is a deliberate design choice. The reasons:

  1. Executor strategy is a deep technical decision. Should it be single-threaded or multi-threaded? Work-stealing or fixed assignment? Should I/O be readiness-based (epoll, kqueue) or completion-based (io_uring, IOCP)? Should there be priority queues? These are valid questions with conflicting answers, and standardizing one would foreclose the others.

  2. Tying async to a single executor would freeze its evolution. The async ecosystem moved fast in 2018-2021 and is still moving. Being able to swap tokio for async-std for smol for whatever comes next has been important to that motion, even if in practice 90% of production code uses tokio.

  3. The standard library has very strong stability guarantees. Putting an executor in std would lock its design forever. The Rust project chose not to.

The cost is that “hello world” in async Rust is tokio = "1" or async-std = "1" in your Cargo.toml, plus a #[tokio::main] macro on your main. It is, by some lights, embarrassing that this is the case. It is also the reason you can run async Rust on bare metal, in a kernel, in a microcontroller, in a browser via WebAssembly — wherever an executor can be written, async Rust can run.
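To make “the executor is just code that calls poll” concrete: here is a minimal single-future block_on in std alone, using the Wake trait and thread parking. This is a sketch, not a real runtime; there is no reactor, so it only drives futures whose wake-ups come from another thread, or that never actually return Pending.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// The Waker unparks the thread that is blocked inside `block_on`.
struct ThreadWaker(Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// The entire executor contract: poll until Ready, park until woken.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(v) => return v,
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    let answer = block_on(async { std::future::ready(40).await + 2 });
    assert_eq!(answer, 42);
    println!("{answer}");
}
```

Real runtimes add a run queue, a reactor that turns OS readiness events into wake() calls, and many tasks, but the poll-then-wait loop is the same shape.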

Why await desugars to a state machine

This is worth one more pass, because the consequences are everywhere.

The naive way to implement await is with green threads. Each task gets its own stack; await becomes a context switch into a scheduler, which multiplexes many such tasks onto fewer OS threads via cooperative scheduling. This is the “stackful coroutine” or “green thread” model. Go uses it. Erlang uses it. It works.

Rust does not use it. Rust uses stackless coroutines, which is what the state machine desugaring produces. The differences:

  • A stackful coroutine has its own stack, allocated separately, with a fixed maximum size. Switching between coroutines means switching stacks, which involves an actual context switch.
  • A stackless coroutine has no stack of its own. Its “stack” is a fixed-size struct containing exactly the local variables that are live across yield points. Switching is a function return.

The stackless model has two big wins: it is much cheaper (no separate stack allocation, no context switch) and it is composable (a future is just a value of some type implementing Future; you can pass it around, store it in a struct, await it later, drop it without running it). The cost is that the entire coroutine has to fit in a fixed-size struct that the compiler computes at compile time, which means the compiler has to know, ahead of time, what every local variable’s type is. Including the types of nested futures, which are themselves state machines, which are themselves structs containing their locals’ types.

This is why your async function’s future type is enormous, deeply nested, and has a name that is both unspellable and visible in error messages. It is also why error messages mention sizes — sometimes futures get too big to fit on the stack, and you have to Box::pin them to move them to the heap.
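The size effect, and the Box::pin escape hatch, are measurable without running anything. `big` is a hypothetical async fn with a deliberately fat state machine; awaiting it inline embeds that machine in the caller’s future, while boxing leaves only a pointer behind.

```rust
use std::mem::size_of_val;

// A future whose state machine must hold a 4 KiB buffer across an await.
async fn big() -> u8 {
    let buf = [7u8; 4096];
    std::future::ready(()).await;
    buf[0]
}

fn main() {
    // Awaiting inline stores big()'s entire state machine in this future...
    let inline = async { big().await };
    // ...while Box::pin moves it to the heap, leaving only a pinned pointer.
    let boxed = async { Box::pin(big()).await };
    assert!(size_of_val(&inline) >= 4096);
    assert!(size_of_val(&inline) > size_of_val(&boxed));
    println!("inline: {} bytes, boxed: {} bytes",
             size_of_val(&inline), size_of_val(&boxed));
}
```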

What this chapter set up

You now have:

  • A model of Future as “a thing the executor calls poll on.”
  • A model of await as “yield to the executor, with the inner future’s state stored in the outer state machine.”
  • A model of async fn as “compiler-generated state machine that borrows from its arguments and stores its locals across yield points.”
  • A model of the executor as “code that calls poll and arranges for Wakers to schedule subsequent polls.”

That model is enough to understand the next chapter, which is about what happens when the things you store across yield points have lifetimes, and about why Send-ness propagates through the entire state machine in ways that surprise you.

Sources

Async and Lifetimes Collide

Holding a reference across a function boundary is one problem. The borrow checker handles it. You’ve been doing it forever and it works.

Holding a reference across an await point is a different problem. The reference now needs to be alive not just until the function returns, but until the entire future completes — which might be milliseconds, or seconds, or minutes, on a different thread, after being woken by an OS event you don’t directly control. That is a longer-lived constraint, applied to a more complicated graph, by a less expressive part of the type system, with worse error messages.

Welcome.

The basic shape

Consider:

#![allow(unused)]
fn main() {
async fn process(data: &[u8]) -> usize {
    let len = data.len();
    tokio::time::sleep(Duration::from_millis(10)).await;
    len + data.iter().filter(|&&b| b == 0).count()
}
}

This works. data is borrowed across the await, and the future returned by process borrows from data for its entire lifetime. The compiler infers that the returned future’s type is impl Future<Output = usize> + 'a, where 'a is the lifetime of data. The caller is required to keep data alive for as long as the future exists, whether it is awaited to completion or dropped partway.

Now try storing the future:

#![allow(unused)]
fn main() {
async fn caller() {
    let bytes = vec![1, 2, 3, 0, 4, 0];
    let fut = process(&bytes);
    drop(bytes);  // can we?
    let result = fut.await;
}
}

We cannot. The borrow checker rejects:

error[E0505]: cannot move out of `bytes` because it is borrowed
   --> src/main.rs:4:10
    |
3   |     let fut = process(&bytes);
    |                       ------ borrow of `bytes` occurs here
4   |     drop(bytes);
    |          ^^^^^ move out of `bytes` occurs here
5   |     let result = fut.await;
    |                  --- borrow later used here

This is fine. The error is well-formed, the message is comprehensible, and the fix is obvious — don’t drop bytes before awaiting.

The pain starts when the await happens inside something that demands Send.

The Send bound, propagated

Most async runtimes are multi-threaded. tokio::spawn requires its future to be Send, because the executor may move the task between threads at any await point. Send for a future means: every variable held across an await is Send. Not “the future as a whole is Send if it doesn’t currently hold non-Send data.” Every variable that could be alive at any await point must be Send. The auto-trait inference is structural and pessimistic.

Now consider:

use std::cell::RefCell;

async fn updater(state: &RefCell<i64>) {
    let mut guard = state.borrow_mut();
    let new_val = fetch_update().await;
    *guard += new_val;
}

#[tokio::main]
async fn main() {
    let state = RefCell::new(0);
    tokio::spawn(updater(&state));
}

This will fail with an error message that goes on for half a screen. The summary:

error: future cannot be sent between threads safely
   --> src/main.rs:14:18
    |
14  |     tokio::spawn(updater(&state));
    |                  ^^^^^^^^^^^^^^^ future returned by `updater` is not `Send`
    |
note: future is not `Send` as this value is used across an await
   --> src/main.rs:6:33
    |
5   |     let mut guard = state.borrow_mut();
    |         --------- has type `RefMut<'_, i64>` which is not `Send`
6   |     let new_val = fetch_update().await;
    |                                 ^^^^^^ await occurs here, with `mut guard` maybe used later

Two things just happened.

First, updater(&state) also failed because &state is &RefCell<i64>, and RefCell is not Sync, so &RefCell<i64> is not Send. Even if the future itself were Send, you can’t send the borrow that the future holds. (tokio::spawn also demands 'static, so borrowing a stack-local state would be rejected on that ground alone.)

Second, the future is not Send because RefMut<'_, i64> is not Send, and the future stores guard across the await. The state machine has a field of type RefMut. The struct is therefore not Send. The future is not Send. tokio::spawn rejects.

Both issues are real. The fix is structural: replace RefCell with something Send + Sync. Use std::sync::Mutex or parking_lot::Mutex if you can drop the guard before every await; use tokio::sync::Mutex if you genuinely need to hold the lock across one. And if you do hold the guard across an await, do it intentionally, knowing the lock is now held across a yield point, which means every other future trying to take that lock will wait until the guard is dropped.

This is the genre. The Send bound makes you confront, structurally, every type that lives across an await. Most of the time, the answer is “use the async-aware version of the synchronization primitive.” Sometimes the answer is “rewrite the code so the lock is not held across the await.” Occasionally the answer is “don’t use tokio::spawn; use a LocalSet or a single-threaded runtime where Send is not required.” All three are valid. The choice depends on whether your fundamental constraint is throughput, correctness, or the structure of what you’re locking.

Auto-traits are structural and unforgiving

Send and Sync are auto-traits. They are inferred for every type, automatically, based on whether all the type’s components are themselves Send/Sync. For futures, this means: the inferred Send-ness of an async fn’s return type depends on every variable held across every await point.

This has two consequences that bite people.

Consequence 1: One non-Send variable, anywhere, kills Send-ness for the whole future. You can have a function with twenty await points, and on await number 13 you happen to hold an Rc<T> because you needed to share something cheaply between two helper closures, and now your future is not Send and tokio::spawn rejects it. The fix is Arc<T> everywhere, or restructuring so the Rc doesn’t cross the await. This is one of the places async Rust feels punitive — a small, local choice has a global type-level consequence.

Consequence 2: Lifetime parameters in your future infect Send-ness when they’re tied to non-Send references. If your future borrows a &T where T: !Sync, the future is not Send. Even if you don’t access T across an await — the borrow itself sits in the state machine. Auto-traits don’t know that you’re not going to look at it.

The combination is brutal. You can spend a long afternoon adding Send bounds to trait methods, only to find that the actual cause of the !Send future is a Cell<u8> field on a struct that’s borrowed three frames up. The trait method was fine; the structural definition of Send-ness propagated the failure all the way to the call site.

The borrow-across-await error, in full

Here is a real one. Take this:

#![allow(unused)]
fn main() {
async fn handler(server: &Server, req: Request) -> Response {
    let cache = server.cache.lock().await;
    let cached = cache.get(&req.key).cloned();
    drop(cache);

    if let Some(c) = cached { return c; }

    let result = server.compute(&req).await;
    server.cache.lock().await.insert(req.key, result.clone());
    result
}
}

Looks fine. Probably is fine. But suppose server.compute(&req) takes &self and &Request, and is async fn. Then the future returned by compute borrows from server and req. We await that future. While we’re awaiting it, we are holding the borrow.

Now try to tokio::spawn(handler(&srv, req)) from a multi-threaded context. The compiler will reject:

error: future cannot be sent between threads safely
   --> src/main.rs:20:18
    |
20  |     tokio::spawn(handler(&srv, req));
    |                  ^^^^^^^^^^^^^^^^^^ future returned by `handler` is not `Send`
    |
note: captured value is not `Send`
   --> src/main.rs:11:27
    |
11  |     let result = server.compute(&req).await;
    |                                 ^^^^ has type `&Request` which is not `Send`

This is unintuitive. &Request should be Send if Request: Sync. The error is misleading. What’s actually happening is that the future returned by compute has a type with a borrow of req in it, and that future is held across an internal await inside compute, and the inferred future type is not Send because the chain of inference broke somewhere two layers down.

The fix is usually one of:

  • Make sure all the types involved are Send + Sync. (Request, Server, every internal type.)
  • Add explicit + Send bounds to the futures returned by trait methods (more on this in chapter 7).
  • Restructure so the borrow doesn’t cross the await — pass owned values instead.

The general rule: if you spawn it, the entire dependency tree of futures must be Send end to end. A single broken link anywhere in the call graph brings down the spawn. The error message will point you to the proximate cause but not necessarily the underlying one.

Why this is a separate problem from synchronous borrows

In synchronous code, the borrow checker has perfect information. It knows the exact span of every reference. It knows, for each scope, which references are live and which have been dropped. The graph is a tree, and the analysis is local.

In async code, the borrow checker is reasoning about what the future will do when polled. The future is an object; it can be moved, awaited at a different point in the program, dropped without ever completing. The analysis has to be conservative across all of that. Specifically, every variable that exists at the moment of an await must be assumed to still exist at every subsequent point in the function, because the await could in principle suspend forever.

This means async code has an extra principle the borrow checker enforces: variables held across an await must satisfy both the synchronous borrow rules and the auto-trait constraints of the future’s eventual use site. The first constraint is the same one you’ve always lived with. The second is the one that produces the cryptic errors, because it depends on a use site that may be far away in the code.

A working pattern

Here is a pattern that handles most of the real cases. When you have a function that:

  1. Takes some references.
  2. Awaits something (possibly with a different lifetime).
  3. Returns a value.

The default-correct version is:

#![allow(unused)]
fn main() {
async fn handler(server: Arc<Server>, req: Request) -> Response {
    let cached = {
        let cache = server.cache.lock().await;
        cache.get(&req.key).cloned()
    };

    if let Some(c) = cached { return c; }

    let result = server.compute(&req).await;
    server.cache.lock().await.insert(req.key.clone(), result.clone());
    result
}
}

Differences from the broken version:

  • Arc<Server> instead of &Server. The future owns its own reference-counted handle to the server, no lifetime parameter.
  • Request (owned) instead of &Request. The future owns the request.
  • The MutexGuard is dropped at the end of the inner block (the cached scope), so it is not held across the second await.

This is the “give up and clone” pattern, and chapter 9 will cover it more thoroughly. For now: if you find yourself fighting Send bounds and lifetime errors on a function you’re going to spawn, the answer is almost always to make the function take owned values (with Arc for shared ones), not borrowed values. The async runtime takes ownership of futures; trying to keep borrows alive across that ownership transfer is fighting the runtime’s design.
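The guard-scoping half of that pattern can be checked at compile time without any runtime. `assert_spawnable` is a hypothetical stand-in for the Send + 'static bound tokio::spawn imposes; std::sync::Mutex’s guard is !Send, so moving the guard binding out of the inner scope makes this stop compiling.

```rust
use std::future::Future;
use std::sync::{Arc, Mutex};

// Stand-in for a work-stealing spawn's bounds (tokio::spawn requires
// Future + Send + 'static); this function only type-checks them.
fn assert_spawnable<F: Future + Send + 'static>(f: F) -> F {
    f
}

fn increment(state: Arc<Mutex<i64>>) -> impl Future<Output = i64> + Send {
    async move {
        // Take the snapshot in an inner scope so the std MutexGuard
        // (which is !Send) is dropped before the await. Holding it
        // across the await would make this whole future !Send.
        let snapshot = {
            let guard = state.lock().unwrap();
            *guard
        };
        std::future::ready(()).await; // yield point: no guard alive here
        snapshot + 1
    }
}

fn main() {
    let state = Arc::new(Mutex::new(41));
    let fut = assert_spawnable(increment(state));
    // No executor here; the point is that this compiles at all.
    drop(fut);
    println!("future is Send + 'static");
}
```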

Why you keep losing

The async lifetime errors feel disproportionately bad for three combined reasons.

The error messages reference the inferred future type, which is unspellable and arbitrarily large. “future cannot be sent between threads safely” is true but unhelpful when the future has thirty fields and the offending one is buried in a state machine three async functions deep.

The fix is often non-local. The thing that needs to change is sometimes the type of a struct field, or the bound on a trait method, or the Cell that someone used in a helper module two crates down. The error points at the spawn site, but the spawn site is not where the bug is.

The auto-trait propagation is unforgiving. You can write code that is correct, idiomatic, and would work fine in any sane non-Send context, and have it rejected because one internal type is !Send. The negation propagates all the way to the call site. The compiler is not wrong to do this — multi-threaded execution requires it — but it is a kind of cost-of-doing-business that synchronous Rust does not pay.

The path through is to lean on Arc, lean on owned data, lean on tokio::sync primitives, and to keep your async functions structurally simple — short, well-typed, with Send bounds explicit at the boundaries. The next two chapters (Pin and async traits) explain why the design has to be this way, but the pragmatic survival kit is mostly: own your data, share via Arc, don’t borrow across awaits unless you’ve thought about it.


Pin and Why It Has To Exist

Pin is the type in the standard library that, on first inspection, appears to do nothing. It is just a wrapper. It has no runtime behavior, no special memory layout, no fancy generated code. You can read its definition in five minutes and feel like you understand it, and be wrong.

Pin is the most subtle thing in std. The reason is that what Pin does is enforce a property on its contents that the type system has no other way of expressing: the value will not be moved in memory until it is dropped. That property is necessary for the soundness of self-referential structs, which the async machinery generates by the gigaton. Without Pin, async Rust as it exists today would be unsound. With Pin, it is sound but the API has rough edges that we will spend the rest of the chapter exploring.

The problem Pin solves

Rust assumes, by default, that values can be moved freely. If you have a Vec<T> on the stack at one address, the compiler may legally move it to a different address — for example, when you return it from a function, when you push it into a containing collection, when you pass it by value. Moves are, in Rust, just memcpy. The bytes are copied; the source is now garbage and the destination is the new official location.

For most types, this is fine. A Vec<T> carries its data on the heap, and the heap pointer it stores is unaffected by where the Vec’s own bytes live. Moving the Vec is moving a (ptr, len, cap) triple, which doesn’t invalidate ptr.

But what about a struct that contains a pointer into itself?

#![allow(unused)]
fn main() {
struct SelfRef {
    data: String,
    ptr: *const String,
}

impl SelfRef {
    fn new(s: &str) -> Self {
        let mut sr = SelfRef { data: s.to_string(), ptr: std::ptr::null() };
        sr.ptr = &sr.data;
        sr
    }
}
}

This is unsound the moment you move sr. The ptr field still points at the original location of data, which has now been memcpy’d somewhere else. Dereferencing ptr after the move reads stale memory.

You don’t write this kind of code by hand often. But the compiler does, every time you write an async fn that holds a borrow across an await. The state machine that the compiler generates for an async function looks roughly like:

#![allow(unused)]
fn main() {
enum State {
    Start { data: Vec<u8> },
    Awaiting { data: Vec<u8>, borrow_of_data: *const Vec<u8> },
    Done,
}
}

When the function transitions from Start to Awaiting, the borrow_of_data field is set to &data, but data is itself a field of the same enum. The future is self-referential. If you move it after the transition, the borrow becomes a dangling pointer.

The Rust team had two choices. One: forbid borrows across await points. This would have made async Rust nearly unusable, because every helper function that takes &self and is awaited would be illegal. Two: invent a way to mark futures as “do not move me,” and have the executor honor that constraint.

They chose option two. Pin is the marker.

The Pin API

#![allow(unused)]
fn main() {
pub struct Pin<P> {
    pointer: P,
}
}

Pin<P> wraps a pointer-like type P. The pointer is, typically, &mut T or Box<T> or &T. The contract is: if T: !Unpin, then while the Pin exists, the T it points to will not be moved.

That Unpin qualifier is the escape hatch. Unpin is an auto-trait that says “this type doesn’t care about being pinned.” Most types are Unpin because they don’t have any reason to be pinned. Pin<&mut T> where T: Unpin is essentially the same as &mut T — you can move freely through it. The pin is decorative.

For types that are !Unpin, Pin actually does something. You can’t get a &mut T out of a Pin<&mut T> for !Unpin types via safe code. The only ways to do anything with a Pin<&mut T> are:

  1. Use methods that take Pin<&mut Self> instead of &mut self (which is what Future::poll does).
  2. Use Pin::as_mut to reborrow the pin (which gives you another Pin<&mut T>, not a &mut T).
  3. Use unsafe { Pin::get_unchecked_mut() } to escape the pin, with the obligation to never actually move the pointee.

The whole !Unpin plus Pin machinery is designed so that:

  • You can write methods that operate on a pinned value.
  • You cannot write code that moves a pinned value out from under itself.
  • The unsafe get_unchecked_mut exists for when you need to violate this, taking on the proof obligation manually.

The state machines generated by async fn are !Unpin. They have to be — they are self-referential. So you can’t poll them through a &mut Future; you must poll them through a Pin<&mut Future>. This is why Future::poll takes Pin<&mut Self>. The pinning is the soundness guarantee that lets the state machine contain self-references.

Structural pinning and projection

Now we get to the part that is genuinely thorny.

Suppose you have a struct:

#![allow(unused)]
fn main() {
struct Composite {
    a: SomeFuture,
    b: u32,
}
}

You have a Pin<&mut Composite> and you want to call a.poll(cx). To do that, you need a Pin<&mut SomeFuture> from the Pin<&mut Composite>. This is called projection: deriving a pin to a field from a pin to the containing struct.

The question: is this safe?

It depends. If SomeFuture: !Unpin, then projecting a pin to it through Pin<&mut Composite> requires that moving the Composite would also move the SomeFuture. Which it does — a is a field of Composite, so a memcpy of Composite includes a memcpy of a. So if Composite is pinned (won’t be moved), then a is also pinned (also won’t be moved).

This is structural pinning. The pinning of the parent is structurally inherited by the field. Projecting a pin from Pin<&mut Composite> to Pin<&mut SomeFuture> is sound.

But: this is only sound if you also enforce that nobody can write code that moves a out of Composite while Composite is pinned. For example, if you have a method on Composite that takes &mut self and does std::mem::swap(&mut self.a, &mut other_future), that method moves a. If anyone has a structurally-projected pin to a at that moment, you have unsoundness.

The rules for safe structural pinning are:

  1. The struct must not implement Drop in a way that moves any pinned field. (Drop::drop takes &mut self — it cannot take Pin<&mut Self> — so a Drop impl on a type with pinned fields must treat that &mut self as if it were pinned.)
  2. The struct must not provide any method that moves a pinned field through &mut self.
  3. The struct’s Unpin impl, if any, must be conditional on all pinned fields being Unpin.

Getting all three of these right manually is annoying and error-prone. You also have to write the projection methods by hand, with unsafe:

#![allow(unused)]
fn main() {
impl Composite {
    fn project_a(self: Pin<&mut Self>) -> Pin<&mut SomeFuture> {
        unsafe { self.map_unchecked_mut(|s| &mut s.a) }
    }
}
}

This is fine if you do it right. It is unsound if you do it wrong. And the failure mode is silent — the code compiles, runs, and corrupts memory under the right interleaving.

This is why the pin-project crate exists.

What pin-project is doing

pin-project is a procedural macro that takes:

#![allow(unused)]
fn main() {
#[pin_project]
struct Composite {
    #[pin]
    a: SomeFuture,
    b: u32,
}
}

and generates safe projection methods, the right Unpin impl, the right Drop enforcement, and a few other guarantees that make structural pinning sound. The #[pin] attribute marks a as pinned; b is treated as movable. The macro emits a project() method that returns a struct of references with the right pinning on each field:

#![allow(unused)]
fn main() {
let proj = composite.project();
proj.a.poll(cx);  // proj.a: Pin<&mut SomeFuture>
let _ = *proj.b;  // proj.b: &mut u32
}

Everything that would otherwise require unsafe is inside the macro, audited once, and used safely thereafter.

You should use pin-project (or its companion pin-project-lite, which is a smaller no-proc-macro alternative) any time you write a future or stream that contains other futures or streams as fields. Hand-writing the projection is acceptable for small one-off cases but is not a habit you want.

The “safe to move” mental model

Here is the mental model that, once internalized, makes Pin make sense.

Most types are safe to move. They satisfy Unpin. You don’t need Pin for them and Pin<&mut T> for T: Unpin is just &mut T with extra syntax.

Some types are unsafe to move after a certain point. Self-referential types are the canonical example. Once a self-referential type has set up its internal pointers, moving it would break those pointers. These types are !Unpin.

The !Unpin types need to be addressable through a pointer that promises not to let them move. That pointer is Pin<P>. The promise is: from the moment you create the Pin<P> until the T inside is dropped, the T will not move.

The !Unpin types need this promise to be transitive. If you have Pin<&mut Composite>, you need to be able to get Pin<&mut SubFuture> for the SubFuture field. This is structural pinning, and the pinning machinery (manual or via pin-project) makes it sound.

Once pinned, a value can be operated on (via methods that take Pin<&mut Self>) but not moved. It can be read, polled, dropped — anything that doesn’t change its address. It cannot be returned by value, swapped, replaced, or otherwise relocated.

That is Pin. It is a way of saying “this value’s address is now part of its identity,” within a type system that otherwise treats addresses as irrelevant.

Where the model breaks

A few sharp edges remain.

Pin<Box<T>> is the workhorse but the docs underplay it. Almost every real use of Pin is Pin<Box<T>> — sound because moving the Box moves only the pointer; the heap-allocated pointee never changes address. Box::pin is the constructor. If you need a Pin<&mut T> for arbitrary T: !Unpin, the easiest way to get one is Box::pin(t) and then as_mut(). The cost is one heap allocation. This is fine for most futures. It is not fine for, e.g., generators in tight loops.

Structural pinning is a per-field decision. Some fields of your struct should be pinned (futures), some shouldn’t be (counters, configuration). The #[pin] attribute lets you choose per field. Pinning everything is correct but limits what your methods can do; pinning nothing means you can’t have any !Unpin fields. The right answer depends on the struct.

Drop and Pin interact subtly. A Drop impl that moves pinned fields is unsound. The Drop trait’s drop method takes &mut self, not Pin<&mut Self>, and you cannot change that. So if your struct has !Unpin fields and you implement Drop, you must be careful not to move those fields in drop. pin-project enforces this for you. Hand-rolled code has to enforce it by discipline.

Pin does not prevent reading or modifying — only moving. A Pin<&mut T> can still mutate T through methods that take Pin<&mut Self>. The pin is about address stability, not about immutability. A common confusion among beginners is to think Pin is some kind of advanced Mutex. It is not. It is purely about whether memcpy is allowed.

Why this is the design

A reasonable question: was there a better way? Several alternatives were considered:

  1. A Move trait the user opts in to. Types that opt out of Move cannot be moved. The problem: this is backwards-incompatible with all existing Rust code, which assumes everything is movable. A type that becomes non-movable would break every user that returned it by value or stored it in a Vec.

  2. Pin everything by default and add an Unpin opt-in. Same problem, in reverse. You can’t make every existing type non-movable without breaking the world.

  3. Make Future::poll take &mut self and disallow self-referential futures. Would prevent the async fn desugaring as it exists. Async would have to use a different model — probably stackful coroutines, with all the costs that implies.

  4. Add Pin as a wrapper around pointers, and have !Unpin types opt in. What we have. Adds a new type to the standard library. Doesn’t break anything. Lets futures be self-referential safely. Has rough edges around projection.

The chosen option is the one that least breaks the existing language at the cost of adding a quirky API at the boundary. This is, in retrospect, almost certainly the right call. It is also the source of the meme that Pin is the worst-designed thing in std. Both of these are true.

What you actually do

In practice, you will use Pin in three ways.

As a consumer of futures. When you write let _ = some_future.await;, the compiler handles all the pinning for you. You will never see Pin in your code. This is the case 95% of the time.

As a writer of futures by hand. When you implement Future directly (not via async fn), you have to take Pin<&mut Self> in your poll method. If your future contains other futures as fields, you’ll use pin-project to project pins to them.

As a writer of generic async utilities. Combinators, runtimes, channel implementations. You will be deep in Pin, Unpin, and structural pinning. Read the source of tokio for examples; it is pleasingly readable for a project of its complexity.

For most application-level Rust, you are firmly in case 1. Knowing the model lets you read the error messages when they appear; you don’t have to live in the model day to day.


The Async Trait Problem

For roughly six years, you could not write async fn in a trait. You could write a trait. You could write async fn outside a trait. You could not put one inside the other. The reason is genuinely subtle, the workaround was a third-party macro, and the eventual fix landed in stable in December 2023 — and is still, as of 2026, slightly worse than the workaround in non-trivial cases.

This is the messiest corner of async Rust. It is also the corner the language designers care most about getting right, because it is the principal blocker on async Rust feeling like a coherent language feature instead of a bolted-on extension.

Why async fn in traits was hard

Recall that async fn foo(...) -> T desugars to fn foo(...) -> impl Future<Output = T>. The return type is an opaque type — a specific type that the compiler picks, that the caller doesn’t know but does know implements Future. Each async fn produces a different opaque type, even if the source code looks identical.

This is fine for free functions. The compiler picks a type, the caller treats it as impl Future, life goes on.

It is not fine for trait methods. A trait is a contract: every implementor of the trait must provide a method with a compatible signature. If async fn process(&self) -> Output desugars to fn process(&self) -> impl Future<Output = Output>, what is the compatible signature? Different implementors have different state machine types. The trait can’t say “returns some type implementing Future” without somehow letting each implementor pick its own.

The mechanism that does this is “return-position impl Trait in traits” (RPITIT, pronounced “ripit”). Every implementor’s method gets to pick its own concrete return type, and the trait says only impl Future. The trait’s associated type machinery tracks the implementor’s choice.

This is the right answer. It took a while to land because it required:

  1. Figuring out the type theory: how does an opaque type in a trait method interact with subtyping, with object safety, with Send bounds, with auto-traits?
  2. Figuring out lifetimes: the future returned by an async fn borrows from self (and other arguments). How is that lifetime expressed in the trait?
  3. Figuring out object safety: can you have dyn Trait for a trait with async fn? (Spoiler: not exactly. We’ll get there.)
  4. Figuring out Send bounds: by default, the future’s Send-ness is inferred. How does the trait let the caller require Send?

Each of these took years and several RFCs to resolve. The result is that async fn in traits works in most common cases as of stable Rust 1.75 (December 2023), but the rough edges around dyn Trait and conditional Send bounds are still being smoothed.

The async-trait macro era

Before native support, the standard answer was the async-trait crate. It provided an attribute macro:

#![allow(unused)]
fn main() {
#[async_trait]
trait Repository {
    async fn get(&self, id: Id) -> Result<Item, Error>;
    async fn put(&self, item: Item) -> Result<(), Error>;
}
}

The macro rewrote each async fn into a fn returning Pin<Box<dyn Future<Output = ...> + Send + 'async_trait>>. So:

#![allow(unused)]
fn main() {
trait Repository {
    fn get<'a>(&'a self, id: Id) -> Pin<Box<dyn Future<Output = Result<Item, Error>> + Send + 'a>>;
    fn put<'a>(&'a self, item: Item) -> Pin<Box<dyn Future<Output = Result<(), Error>> + Send + 'a>>;
}
}

This worked. It always worked. The cost: every method call allocates a Box. The boxed future is dynamically dispatched (it’s a dyn Future). The + Send is hard-coded (you could opt out with #[async_trait(?Send)], but only at the trait level, not per-method or per-implementor).

For most use cases, the cost was acceptable. A heap allocation per method call is fine for most application code. For high-throughput async code, it was a real cost; benchmarks showed async-trait adding meaningful overhead in tight loops. But the macro got the language unblocked, and most production async Rust shipped with async-trait in the dependency tree.

Native async fn in traits, the good case

Stable Rust 1.75 made this work:

#![allow(unused)]
fn main() {
trait Repository {
    async fn get(&self, id: Id) -> Result<Item, Error>;
    async fn put(&self, item: Item) -> Result<(), Error>;
}

struct Postgres { /* ... */ }

impl Repository for Postgres {
    async fn get(&self, id: Id) -> Result<Item, Error> { /* ... */ }
    async fn put(&self, item: Item) -> Result<(), Error> { /* ... */ }
}
}

No macro. No Box. Each call returns the implementor’s specific opaque future type. Static dispatch. Zero overhead.

For application code calling repo.get(id).await, this is essentially free. It is what async-trait should have been from the start, and it is what the language now provides natively.

Native async fn in traits, the rough case

Two cases are still rough.

Case 1: dyn Trait. You cannot say:

#![allow(unused)]
fn main() {
let r: Box<dyn Repository> = Box::new(Postgres::new());
}

The trait is not object-safe. A vtable needs one fixed signature per method, but async fn returns an opaque type whose concrete size and identity depend on the implementor. There is no single dyn Repository because each implementor has a different future type.

The workaround in stable Rust as of early 2026 is to either:

  • Use async-trait for the boxed-trait-object case, accepting the boxing cost.
  • Define a parallel boxed-future trait yourself, and provide a blanket impl converting between them. This is what the proposed “trait transformers” RFC will eventually subsume; today you write it by hand.
  • Use the trait-variant crate (sponsored by the Rust async working group) to generate the boxed variant of a trait from the native one.

The native + dyn-compatible story is in active development. The current thinking (per the async vision document and recent posts from the async WG) is that there will eventually be a dyn AsyncTrait form that handles the boxing for you, but it is not yet stable.

Case 2: Send bounds on the returned future. When you write:

#![allow(unused)]
fn main() {
trait Repository {
    async fn get(&self, id: Id) -> Result<Item, Error>;
}
}

The future returned by get is whatever the implementor produces. If the implementor’s future happens to be Send, great. If not, also great — both are valid implementations. But what if the caller needs the future to be Send, because they’re going to tokio::spawn it?

You need a way to write the trait such that callers can require Send-ness from any implementor. The current syntax is the Send bounds on associated return types feature, often written:

#![allow(unused)]
fn main() {
trait Repository: Send + Sync {
    fn get(&self, id: Id) -> impl Future<Output = Result<Item, Error>> + Send;
}
}

That is, you spell out the desugaring and add + Send to the return type. This works but loses the async fn syntax. As of 2026 there is a more ergonomic syntax in nursery RFC discussions — something like async fn get(&self, id: Id) -> Result<Item, Error> + Send — but it has not stabilized.

The pragmatic recommendation: if all your implementors will produce Send futures (which is true for most repository-style traits where the implementations are straightforward database calls), write the desugared form with + Send. If you need to support both Send and !Send implementors, you have a harder problem and may want to define two traits.

Return-position impl Trait in traits, more generally

async fn in traits is a special case of a more general feature: return-position impl Trait in traits, or RPITIT.

#![allow(unused)]
fn main() {
trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
    fn windows(self, n: usize) -> impl Iterator<Item = Vec<Self::Item>>;
    //                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RPITIT
}
}

Without RPITIT, you would have to either:

  • Make windows return a concrete type, defeating abstraction.
  • Add an associated type type Windows: Iterator<...> to the trait, requiring every implementor to declare it.

With RPITIT, the trait says “returns some iterator,” and each implementor’s specific type is hidden but usable.

async fn in traits is just RPITIT for the case where the returned impl Trait is impl Future. Stabilizing one stabilized the other; they are the same machinery.

The general case has the same caveats: it doesn’t work in object-safe traits (because the return type’s size is implementor-dependent), and you need to spell out auto-trait bounds explicitly. But for the cases that work, it’s powerful and clean.

What you should actually do, today

A pragmatic decision tree for writing an async trait in 2026:

Static dispatch only, all implementors known at compile time, no boxing wanted: native async fn in traits. No macro. Add + Send (in the desugared form) to method return types if you need to spawn the futures.

Need dyn Trait for runtime polymorphism, willing to accept allocation: async-trait macro. It is still maintained, still works, still produces correct code. The macro hasn’t been deprecated; it has been complemented.

Building a public library that other crates will consume: prefer native async fn in traits, with explicit + Send bounds where appropriate. Document whether the futures are Send. This gives downstream users the most flexibility.

Need both static dispatch and dyn Trait support from the same source code: look at trait-variant crate, which can generate both forms from one trait definition.

Where this is going

The Rust async working group has, as of early 2026, several proposals at various stages of stabilization that will smooth the remaining edges:

  • dyn* and dynamically-sized return types — a more general mechanism for object-safe traits with opaque returns. Long-tail work, not landing soon.
  • Explicit auto-trait bounds in trait definitions — syntax for letting traits express “this future is Send-conditional on the implementor” cleanly.
  • async drop — let Drop impls be async. This is its own enormous can of worms (what happens if the destructor is dropped without being awaited?) but is on the long-term roadmap.
  • Return-type notation (RTN) — a syntax for naming the return type of a trait method without writing it out, useful for bounds and where-clauses.

None of these will fundamentally change the model. They will reduce the number of cases where you have to know the model in detail. The trajectory is positive, slow, and roughly what you’d expect from a project that prioritizes long-term soundness over short-term ergonomics.

Where this was

For history’s sake, since the controversy has been lived through, it is worth noting that the async-trait situation was, at various points between 2018 and 2023, a real source of frustration in the Rust community. There were extended debates about whether RPITIT was the right approach, whether the lifetime story was workable, whether the boxing in async-trait was acceptable as a permanent answer, and whether the language should have stabilized async without a coherent trait story.

These were good-faith disagreements between thoughtful people. The eventual resolution — RPITIT in stable, with the rough edges getting cleaned up over the following years — is, in retrospect, the right one. The interim cost was that “async Rust” felt incomplete for years. That cost was real, and acknowledging it is honest. The language is in a much better place now than it was in 2021.


Error Messages, Decoded

This chapter is the practice. Real compiler errors, line by line, decoded into what they mean and what to change. The errors are quoted as the compiler emits them (or close enough — output formatting drifts slightly between rustc versions). The skill we’re building is the ability to look at a paragraph of error and find the actual fix.

Error 1: cannot infer an appropriate lifetime

error: lifetime may not live long enough
   --> src/lib.rs:14:9
    |
12  |     pub fn longest(&self, other: &str) -> &str {
    |                    -----         ----     - let's call the lifetime of this reference `'1`
    |                    |             |
    |                    |             let's call the lifetime of this reference `'2`
    |                    let's call the lifetime of this reference `'3`
13  |         if self.0.len() > other.len() {
14  |             other
    |             ^^^^^ associated function was supposed to return data with lifetime `'3` but it is returning data with lifetime `'2`

What the compiler is telling you. The function returns &str. Lifetime elision applied rule 3: there’s a &self, so the elided return lifetime is 'self. The function tried to return other, which has a different lifetime. The lifetimes '2 and '3 are not the same, so the return value’s actual lifetime ('2, the lifetime of other) doesn’t match the declared lifetime ('3, the lifetime of self).

Why it’s telling you that. Elision picked the wrong default. It assumed &self-bound output, but the function actually wants to return whichever of the two arguments is longer — meaning the return must be tied to the common lifetime of both, which has to be explicit.

The fix.

#![allow(unused)]
fn main() {
pub fn longest<'a>(&'a self, other: &'a str) -> &'a str {
    if self.0.len() > other.len() { &self.0 } else { other }
}
}

Now both inputs and the output share 'a, and the compiler can pick 'a as the intersection.

The general rule. When elision picks a &self-bound return for a function that should return one of multiple inputs, you have to write the lifetimes explicitly. This pattern accounts for a large fraction of “simple” lifetime errors.

Error 2: borrowed value does not live long enough

error[E0597]: `local` does not live long enough
   --> src/main.rs:6:18
    |
5   |     let local = String::from("hello");
    |         ----- binding `local` declared here
6   |     let r: &'static str = &local;
    |            ----------    ^^^^^^ borrowed value does not live long enough
    |            |
    |            type annotation requires that `local` is borrowed for `'static`
7   | }
    | - `local` dropped here while still borrowed

What the compiler is telling you. You annotated a reference as &'static str. The reference points at local, a String that goes out of scope at the end of the function. The compiler is being asked to enforce that the reference outlives 'static, which is not satisfiable for a stack-local value.

Why it’s telling you that. 'static is not “lives forever” but “is not borrowed from anything that ends.” A stack-allocated String ends at scope exit. Therefore a reference into it cannot satisfy 'static.

The fix, depending on intent.

If you wanted a &str borrow with the local’s lifetime, drop the 'static:

#![allow(unused)]
fn main() {
let r: &str = &local;
}

If you wanted 'static because something else demanded it (e.g., tokio::spawn), the right fix is usually to own the data instead of borrowing:

#![allow(unused)]
fn main() {
let r: String = local;  // move
// or, if you need to keep `local` separately:
let r: String = local.clone();
}

The general rule. 'static bounds usually mean “you must own this or it must come from a literal.” If you’re trying to satisfy a 'static bound with a borrow of local data, you have a structural problem, not a typo problem. Convert to ownership.

Error 3: implementation of FnOnce is not general enough

error: implementation of `FnOnce` is not general enough
   --> src/main.rs:14:5
    |
14  |     accept_callback(parse_first);
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ implementation of `FnOnce` is not general enough
    |
    = note: `fn(&'2 [u8]) -> &'2 [u8] {parse_first}` must implement `FnOnce<(&'1 [u8],)>`, for any lifetime `'1`...
    = note: ...but it actually implements `FnOnce<(&'2 [u8],)>`, for some specific lifetime `'2`

What the compiler is telling you. accept_callback requires its argument to be a closure that works for any lifetime (for<'1>). You passed a function parse_first that works for some specific lifetime ('2). The two are incompatible: “works for any” is stronger than “works for one specific.”

Why it’s telling you that. This is the canonical HRTB inference failure. Either the function parse_first is genuinely not higher-ranked (its definition picks a particular lifetime), or the inference engine can’t see that it is.

The fix. Almost always, parse_first is fine, and the issue is that its signature was inferred too restrictively. Make the higher-ranking explicit:

#![allow(unused)]
fn main() {
fn parse_first<'a>(bytes: &'a [u8]) -> &'a [u8] {
    bytes.split(|&b| b == 0).next().unwrap_or(&[])
}
}

If it already looks like that, the problem may be on the consumer side — accept_callback’s bound is not asking for for<'a> cleanly. Check chapter 3, particularly the “patterns that work” section.

The general rule. “Not general enough” means the supplied function works for one lifetime but the trait bound asks for all of them. The answer is to make the function generic over the lifetime if it isn’t, or to relax the trait bound if you can.

Error 4: future cannot be sent between threads safely

error: future cannot be sent between threads safely
   --> src/main.rs:18:18
    |
18  |     tokio::spawn(do_work(&shared));
    |                  ^^^^^^^^^^^^^^^^ future returned by `do_work` is not `Send`
    |
note: future is not `Send` as this value is used across an await
   --> src/main.rs:9:23
    |
8   |     let cell = Rc::new(RefCell::new(0));
    |         ---- has type `Rc<RefCell<i32>>` which is not `Send`
9   |     other_async_op().await;
    |                       ^^^^^ await occurs here, with `cell` maybe used later

What the compiler is telling you. The future returned by do_work contains an Rc<RefCell<i32>> field across the await on line 9. Rc is not Send (it has a non-atomic refcount). Therefore the future is not Send. tokio::spawn requires Send.

Why it’s telling you that. Auto-traits propagate structurally. A struct containing a non-Send field is non-Send. The state machine is a struct. Rc is one of its fields. Game over.

The fix. Replace Rc with Arc. (And RefCell with Mutex if you also need to mutate it across threads.)

#![allow(unused)]
fn main() {
let cell = Arc::new(Mutex::new(0));
}

If you can’t replace it (because some other constraint forces Rc), the alternative is to not spawn the future on a multi-threaded executor. Use tokio::task::LocalSet and spawn_local, which do not require Send.

The general rule. Send errors on futures point at one offending value. Replace the value with a Send equivalent (Arc for Rc, tokio::sync::Mutex for RefCell-when-held-across-await, atomic types for Cells). If multiple values are offending, the error will only show one at a time; expect to fix several in sequence.

Error 5: the parameter type 'T' may not live long enough

error[E0310]: the parameter type `T` may not live long enough
  --> src/lib.rs:5:5
   |
5  |     Box::new(value)
   |     ^^^^^^^^^^^^^^^
   |     |
   |     the parameter type `T` must be valid for the static lifetime...
   |     ...so that the type `T` will meet its required lifetime bounds
   |
help: consider adding an explicit lifetime bound...
   |
3  |     pub fn into_box<T: 'static>(value: T) -> Box<dyn Display> {
   |                      +++++++++

What the compiler is telling you. Box<dyn Display> is sugar for Box<dyn Display + 'static>. Trait objects have an implicit 'static bound when no other lifetime is given. To put T in a Box<dyn Display>, T must satisfy 'static. Currently T has no such bound.

Why it’s telling you that. Because the compiler doesn’t know whether T is String (which is 'static) or &'a str (which isn’t). It refuses to assume.

The fix. Add the bound:

pub fn into_box<T: Display + 'static>(value: T) -> Box<dyn Display> {
    Box::new(value)
}

If you genuinely want to support non-'static types, change the trait object’s lifetime:

pub fn into_box<'a, T: Display + 'a>(value: T) -> Box<dyn Display + 'a> {
    Box::new(value)
}

The general rule. Trait objects default to 'static bound. If you’re putting a generic into a trait object, either constrain the generic to 'static or thread a lifetime through both.
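For completeness, here is a sketch showing the non-'static variant actually doing its job: the box holds a trait object that borrows from the caller's data (demo is an illustrative harness, not from the book):

```rust
use std::fmt::Display;

// The trait object's lifetime is threaded through, so T may borrow.
fn into_box<'a, T: Display + 'a>(value: T) -> Box<dyn Display + 'a> {
    Box::new(value)
}

fn demo() -> String {
    let s = String::from("hi");
    let boxed = into_box(s.as_str()); // Box<dyn Display + '_>, borrowing `s`
    format!("{boxed}") // used while `s` is still alive
}
```

With the default Box<dyn Display> (implicitly + 'static), the into_box(s.as_str()) call above would be rejected.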

Error 6: cannot borrow as mutable more than once

error[E0499]: cannot borrow `*v` as mutable more than once at a time
  --> src/main.rs:6:5
   |
4  |     let first = &mut v[0];
   |                       - first mutable borrow occurs here
5  |     let second = &mut v[1];
   |                       ^ second mutable borrow occurs here
6  |     println!("{}, {}", first, second);
   |                        ----- first borrow later used here

What the compiler is telling you. Both &mut v[0] and &mut v[1] are calls to IndexMut::index_mut, which takes &mut self and returns &mut Output. The compiler can only see that you’re calling index_mut twice on the same Vec; it doesn’t have type-level information that the two indices are distinct.

Why it’s telling you that. The borrow checker reasons about types, not values. IndexMut is just a method. The compiler cannot know at the type level that index 0 and index 1 don’t alias.

The fix. Use a method that returns multiple disjoint mutable references. The standard library provides split_at_mut:

let (left, right) = v.split_at_mut(1);
let first = &mut left[0];
let second = &mut right[0];

Or, on Rust 1.86+, [T]::get_disjoint_mut (stabilized from the unstable get_many_mut):

let [first, second] = v.get_disjoint_mut([0, 1]).unwrap();

These functions have signatures like fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]), which return two mutable references with the type-system-visible guarantee that they don’t overlap.

The general rule. When you need multiple mutable borrows into a collection, use the collection’s “split” methods. The borrow checker can’t infer disjointness from indices; the API has to express it in the return type.
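A small worked example of the split pattern end to end (swap_ends is an illustrative name; the split point and indices are arbitrary):

```rust
// Two simultaneous &mut into one Vec, made legal by split_at_mut:
// the return type guarantees the halves are disjoint.
fn swap_ends(v: &mut Vec<i32>) {
    let mid = v.len() / 2;
    let (left, right) = v.split_at_mut(mid);
    // `left` and `right` don't overlap, so holding &mut into both is fine.
    let last = right.len() - 1;
    std::mem::swap(&mut left[0], &mut right[last]);
}
```

The naive version — let a = &mut v[0]; let b = &mut v[v.len() - 1]; swap — is exactly the E0499 shown above.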

Error 7: expected fn pointer, found fn item

error[E0308]: mismatched types
  --> src/main.rs:5:9
   |
5  |     funcs.push(double);
   |                ^^^^^^ expected fn pointer, found fn item
   |
   = note: expected fn pointer `fn(i32) -> i32`
              found fn item `fn(i32) -> i32 {double}`
help: consider casting to a fn pointer
   |
5  |     funcs.push(double as fn(i32) -> i32);
   |                       +++++++++++++++++

What the compiler is telling you. double is a fn item, which is a zero-sized type unique to the function double. funcs is a Vec<fn(i32) -> i32>, which holds fn pointers — runtime function pointers, all the same type. Fn items coerce to fn pointers, but the coercion has to happen at a point where the target type is known.

Why it’s telling you that. Each function in Rust has its own zero-sized type (an artifact of monomorphization). When you put a function in a Vec<fn(...)>, the elements are runtime pointers, so the fn item must be coerced to a pointer first.

The fix. Cast or annotate:

funcs.push(double as fn(i32) -> i32);
// or:
let f: fn(i32) -> i32 = double;
funcs.push(f);

Or, if you want the heterogeneous-functions case, use Box<dyn Fn(...)> instead of fn pointers, which works for closures too.

The general rule. Fn items and fn pointers are different types. Coercion is implicit in many contexts, but only where the compiler already knows the target type; when the element type is still being inferred (say, from an earlier element), the unique zero-sized fn item type wins, and the mismatch surfaces at the next use. When you see this error, force the coercion with a cast or an annotation, or use a dyn Fn type.
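The zero-sized-ness is directly observable. A sketch (sizes is an illustrative name) measuring both representations:

```rust
fn double(x: i32) -> i32 {
    x * 2
}

// A fn item is a zero-sized type unique to `double`;
// a fn pointer is an ordinary pointer-sized value.
fn sizes() -> (usize, usize) {
    let item = double; // type: fn(i32) -> i32 {double}, size 0
    let ptr: fn(i32) -> i32 = double; // coerced to a runtime fn pointer
    (
        std::mem::size_of_val(&item),
        std::mem::size_of_val(&ptr),
    )
}
```

This is why calls through a fn item can be statically dispatched and inlined (the callee is in the type), while calls through a fn pointer are indirect.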

Error 8: the trait Sized is not implemented

error[E0277]: the size for values of type `dyn Display` cannot be known at compilation time
  --> src/main.rs:3:9
   |
3  |     let x: dyn Display = 5;
   |         ^ doesn't have a size known at compile-time

What the compiler is telling you. dyn Display is an unsized type. You cannot have a stack variable of unsized type. You need to put it behind a pointer.

Why it’s telling you that. Trait objects have unknown size at compile time (different implementors are different sizes). Stack allocation requires a known size.

The fix. Box it:

let x: Box<dyn Display> = Box::new(5);

Or take a reference:

let n = 5;
let x: &dyn Display = &n;

The general rule. dyn Trait is unsized. Always behind a pointer (Box, &, Rc, Arc, Pin<Box<dyn Trait>>). Never bare.

Error 9: borrowed data escapes outside of method

error[E0521]: borrowed data escapes outside of method
   --> src/lib.rs:7:9
    |
4   |   pub fn store(&mut self, item: &str) {
    |                ---------  ----
    |                |          |
    |                |          `item` is a reference that is only valid in the method body
    |                let's call the lifetime of this reference `'1`
...
7   |         self.items.push(item);
    |         ^^^^^^^^^^^^^^^^^^^^^
    |         |
    |         `item` escapes the method body here
    |         argument requires that `'1` must outlive `'_`

What the compiler is telling you. self.items has some lifetime — maybe Vec<&'a str> for some 'a. item has a different (probably shorter) lifetime, called '1 here. Pushing item into self.items would extend item’s lifetime to whatever self.items requires, which the compiler can’t prove.

Why it’s telling you that. Containers of references have a fixed lifetime parameter. You can’t insert a reference with a shorter lifetime than the container’s parameter without somehow narrowing the container’s lifetime, which &mut self won’t allow (because of variance — see chapter 2).

The fix. Either store owned data:

pub fn store(&mut self, item: String) { self.items.push(item); }

Or take the borrow with the right lifetime:

pub fn store<'a>(&mut self, item: &'a str) where Self: 'a, ...
// quickly gets complicated; usually owned is simpler

In practice, store owned data. If your container has lifetime parameters, every method gets harder; the cure is usually worse than the disease.

The general rule. Containers of references are a tax. You pay the tax forever. Default to containers of owned data.
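A complete sketch of the idiomatic middle ground (Store is a hypothetical reconstruction of the struct from the error): take a borrow at the call site for caller convenience, but own the data inside the container, so nothing escapes:

```rust
// Owned container: no lifetime parameter, no escape analysis needed.
struct Store {
    items: Vec<String>,
}

impl Store {
    // Borrow at the API boundary, own in storage.
    fn store(&mut self, item: &str) {
        self.items.push(item.to_owned()); // the borrow of `item` ends here
    }
}
```

The caller can pass a reference to arbitrarily short-lived data; the to_owned copy is what the Vec keeps.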

Error 10: recursion in an async fn requires boxing

error[E0733]: recursion in an `async fn` requires boxing
  --> src/lib.rs:1:1
   |
1  | async fn factorial(n: u64) -> u64 {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2  |     if n <= 1 { 1 } else { n * factorial(n - 1).await }
   |                                ----------------------- recursive call here
   |
   = note: a recursive `async fn` call must introduce indirection such as `Box::pin` to avoid an infinitely sized future

What the compiler is telling you. Async functions desugar to state machines whose size includes the size of any inner futures stored across awaits. Recursive async functions would have infinite size: the state machine for factorial(n) contains a state machine for factorial(n-1), which contains one for factorial(n-2), on down. The compiler can’t compute the size.

Why it’s telling you that. This is the unavoidable consequence of stackless coroutines (chapter 4). The state of the call stack is encoded in the type. Recursion turns the type infinite.

The fix. Box the recursive call:

async fn factorial(n: u64) -> u64 {
    if n <= 1 { 1 } else { n * Box::pin(factorial(n - 1)).await }
}

Box::pin heap-allocates the inner future, so its contribution to the outer state machine is just a pointer (fixed size). The recursion is now flat in the type system.

The general rule. Recursive async needs Box::pin. There is no way around it short of rewriting the function iteratively. The cost is one heap allocation per recursive level. For most use cases this is fine; for performance-critical recursion, prefer iteration.
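You can drive the boxed recursion without any runtime at all, which makes the "it's just a future" point concrete. A minimal single-future executor sketch (block_on and noop_raw_waker are illustrative, built on the stable RawWaker API; enough for a future that never actually suspends, not a real executor):

```rust
use std::future::Future;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

async fn factorial(n: u64) -> u64 {
    if n <= 1 { 1 } else { n * Box::pin(factorial(n - 1)).await }
}

// A waker that does nothing: fine here because factorial never returns Pending.
fn noop_raw_waker() -> RawWaker {
    fn no_op(_: *const ()) {}
    fn clone(_: *const ()) -> RawWaker {
        noop_raw_waker()
    }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);
    RawWaker::new(std::ptr::null(), &VTABLE)
}

// Poll the future to completion on the current thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}
```

Each recursive level is one Box::pin allocation, polled through the Pin<Box<...>> Future impl; the outer state machine only ever stores the pointer.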

What this practice gives you

Reading these errors is the unavoidable skill. The compiler is telling you the truth — the words mean what they say — but the truth is filtered through a vocabulary that takes time to acquire. Once you’ve read enough of these, the same shape becomes recognizable in five seconds: “oh, that’s the elision-picked-the-wrong-default thing” or “oh, that’s an HRTB inference failure” or “oh, that’s Send propagating through a RefCell field.”

The compiler is the same compiler. The errors are produced by the same algorithms. The patterns repeat. The pattern-matching is the skill.

Patterns That Work

After all the theory, here is the small set of idioms that handle the overwhelming majority of real cases. Memorize the shapes. When you reach for a pattern, reach for one of these first.

Pattern 1: 'static bounds when you can afford them

The fastest way out of a lifetime fight is to take a 'static bound. If your function takes ownership of its inputs (or they’re literals, or they’re Arc’d), you can write 'static everywhere and stop worrying about lifetime relations entirely.

fn enqueue<T: Send + 'static>(queue: &Queue<T>, item: T) {
    queue.push(item);
}

This is right when:

  • The function genuinely owns or shares ownership of the data.
  • The data is going to outlive any individual scope (passed to a thread, stored in a long-lived structure, sent across a channel).
  • You don’t need to maintain a borrow relationship between input and output.

This is wrong when:

  • The data should be borrowed and returned (turn an input slice into a sub-slice). Don’t force ownership for this; thread the lifetime.
  • The data is small and frequently constructed (don’t Arc an i64).
  • You’ll be cloning excessively just to satisfy the bound.

When in doubt: 'static is the easy answer when ownership is appropriate, and the wrong answer when borrowing is. The rule of thumb: if your function would be just as correct taking owned values, use 'static.

Pattern 2: Arc<T> for cheap shared ownership

Arc is the heat sink for “I want multiple owners and I don’t want to think about lifetimes.” Wrap the data once, clone the Arc per owner, accept the slight overhead of atomic refcounting and the loss of mutability (without an interior mutability primitive).

let config = Arc::new(load_config());

for _ in 0..workers {
    let config = config.clone();
    tokio::spawn(async move {
        run_worker(config).await;
    });
}

Arc is the right tool when:

  • Multiple async tasks or threads need to read shared data.
  • Lifetimes would otherwise force you into elaborate borrowing schemes.
  • The data is large enough that cloning it would be wasteful.

Arc is the wrong tool when:

  • The data is cheap to clone (String, Vec<u8> of small size). Just clone.
  • You need mutability and aren’t ready to add a Mutex (next pattern).
  • You’re using it inside a tight loop where the atomic operations matter (rare).

The cost is one atomic increment per clone and one atomic decrement per drop. For application code, this is negligible. For libraries that benchmark in nanoseconds, sometimes it isn’t.

Pattern 3: Arc<Mutex<T>> and when it’s wrong

The famous Arc<Mutex<T>> is the canonical “I need shared mutable state” pattern. It works. It is also the pattern most likely to disguise a worse problem.

let counter = Arc::new(Mutex::new(0u64));

for _ in 0..workers {
    let counter = counter.clone();
    tokio::spawn(async move {
        let mut g = counter.lock().await;
        *g += 1;
    });
}

Arc<Mutex<T>> is the right tool when:

  • The state is genuinely shared between many writers.
  • The state is large enough that message-passing alternatives would be slower.
  • The locking is fine-grained enough that contention isn’t the bottleneck.

Arc<Mutex<T>> is the wrong tool when:

  • A single producer and a single consumer can do the job (use a channel).
  • The state is a counter (use AtomicU64).
  • Contention is high and you have many short critical sections (consider sharding the data).
  • You’re going to await while holding the lock and the lock is acquired by many tasks (you’ve now serialized them; consider whether the await is necessary inside the critical section).

The hidden trap with tokio::sync::Mutex is the third bullet. Holding a lock across .await blocks every other future trying to acquire it for the duration. If your critical section involves I/O, you have just turned a parallel system into a serial one. The fix is usually to do the I/O outside the lock, even if it means cloning data.

The decision rule: prefer std::sync::Mutex (or parking_lot::Mutex) and don’t hold it across .await. Use tokio::sync::Mutex only when you genuinely need to await while the lock is held, and accept that you have introduced a serialization point.
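The std::sync::Mutex version of the shared-counter shape looks like this (count is an illustrative harness using plain threads; note the guard lives only for one statement, so there is nothing to hold across an await):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Shared counter behind std::sync::Mutex: lock, mutate, release immediately.
fn count(workers: usize, per_worker: usize) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    // Guard drops at the end of the statement: a short
                    // critical section, no chance to sleep while holding it.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

(For a counter specifically, AtomicU64 would beat both, per the bullet above; the Mutex version is shown because the shape generalizes to non-atomic state.)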

Pattern 4: Channel-based ownership transfer

When you have data that needs to flow from one task to another, pass it through a channel rather than sharing it. This eliminates lifetime issues entirely (the data is moved into the channel, then moved out) and makes the dependency graph explicit.

let (tx, mut rx) = tokio::sync::mpsc::channel(32);

tokio::spawn(async move {
    while let Some(work) = rx.recv().await {
        process(work).await;
    }
});

for item in items {
    tx.send(item).await.unwrap();
}

Channels are the right tool when:

  • One task produces data, another consumes it.
  • You want backpressure (use mpsc::channel with a bounded capacity).
  • The data flow is unidirectional or has a clear request/response shape (then use oneshot for the response).
  • You want clear ownership transitions in the type system.

Channels are the wrong tool when:

  • You need bidirectional state synchronization (a channel is a one-way pipe; bidirectional ends up being two channels and a protocol, which is fine but more complex).
  • The data is small and read-mostly (an Arc is cheaper).
  • You need to implement a complex coordination pattern that channels would make awkward (consider explicit message-passing actors instead — pattern 5).

tokio::sync provides several channel flavors: mpsc (multi-producer single-consumer), broadcast (multi-consumer broadcast), watch (single-value, last-write-wins), and oneshot (one message, one consumer). Pick the one that matches the communication shape. Using the wrong channel for the pattern is a common source of awkward code.
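The same ownership-transfer shape exists in the standard library for plain threads, which makes it easy to test in isolation (sum_via_channel is an illustrative name; std::sync::mpsc::sync_channel is the bounded, backpressured flavor):

```rust
use std::sync::mpsc;
use std::thread;

// Each item is moved into the channel, then moved out: no sharing, no lifetimes.
fn sum_via_channel(items: Vec<u64>) -> u64 {
    let (tx, rx) = mpsc::sync_channel(32); // bounded: senders block when full
    let consumer = thread::spawn(move || {
        let mut total = 0u64;
        // recv() returns Err once every sender has been dropped.
        while let Ok(v) = rx.recv() {
            total += v;
        }
        total
    });
    for item in items {
        tx.send(item).unwrap();
    }
    drop(tx); // close the channel so the consumer's loop terminates
    consumer.join().unwrap()
}
```

The explicit drop(tx) is the shutdown protocol: channel closure is how the consumer learns the stream is done.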

Pattern 5: Actor pattern

When you have state that needs serialized access from many places, and the access patterns are complex enough that locking would be error-prone, encapsulate the state in a single task that owns it, and have everyone else send messages.

struct CacheActor { items: HashMap<String, Item> }

enum CacheMsg {
    Get { key: String, reply: oneshot::Sender<Option<Item>> },
    Put { key: String, value: Item },
}

async fn cache_actor(mut rx: mpsc::Receiver<CacheMsg>) {
    let mut state = CacheActor { items: HashMap::new() };
    while let Some(msg) = rx.recv().await {
        match msg {
            CacheMsg::Get { key, reply } => {
                let _ = reply.send(state.items.get(&key).cloned());
            }
            CacheMsg::Put { key, value } => {
                state.items.insert(key, value);
            }
        }
    }
}

#[derive(Clone)]
struct CacheHandle { tx: mpsc::Sender<CacheMsg> }

impl CacheHandle {
    async fn get(&self, key: String) -> Option<Item> {
        let (reply, rx) = oneshot::channel();
        self.tx.send(CacheMsg::Get { key, reply }).await.ok()?;
        rx.await.ok()?
    }
    async fn put(&self, key: String, value: Item) {
        let _ = self.tx.send(CacheMsg::Put { key, value }).await;
    }
}

The actor pattern is right when:

  • The state is non-trivial (composite, with invariants between fields).
  • Multiple call sites need both read and write access.
  • The state’s mutation is the system’s logical bottleneck anyway (one actor processes one message at a time).
  • You want clear, type-checked APIs at the message boundary.

The actor pattern is wrong when:

  • The state is simple enough that Mutex<T> is fine.
  • You need fan-out parallelism on reads (an actor serializes everything; a RwLock lets readers proceed concurrently).
  • The message-handling latency matters more than throughput (actors add a queue and a context switch per operation).

Actors are not faster than locks. They are cleaner. The clarity is the value: the type of CacheHandle is the API, the actor’s loop is the implementation, and there is no way for any other code to corrupt the cache because no other code has access to the state. For complex domain state, this is worth a lot.

Pattern 6: Cow for “borrowed unless I have to clone”

std::borrow::Cow<'a, T> (clone-on-write) holds either a borrow or an owned value, transparently. It’s useful when most calls don’t need to allocate but some do.

fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains('\r') {
        Cow::Owned(input.replace("\r\n", "\n"))
    } else {
        Cow::Borrowed(input)
    }
}

Cow is right when:

  • Most callers don’t need to modify the data and you can return a borrow.
  • A minority of cases need an owned copy.
  • The caller doesn’t care which case they got.

Cow is wrong when:

  • You’re using it to “decide later” between owned and borrowed because you weren’t sure. The decision should be in the API, not deferred.
  • The caller needs to know whether they got a borrow or an owned value.
  • The data is so cheap to clone that branching on Cow::Owned/Cow::Borrowed is more expensive than just cloning.

Cow is one of those types that solves a real problem and gets reached for too often. The right test is: does this function actually have a fast path that returns a borrow and a slow path that returns an owned value, where the caller treats them the same? If yes, Cow. If no, pick a type and stick with it.
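That test is checkable. A usage sketch restating normalize with a caller that observes which path it got (classify is an illustrative harness; real callers usually don't care, which is the point of Cow):

```rust
use std::borrow::Cow;

fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains('\r') {
        Cow::Owned(input.replace("\r\n", "\n"))
    } else {
        Cow::Borrowed(input)
    }
}

// Report whether the fast (no-allocation) path was taken, plus the result.
fn classify(input: &str) -> (bool, String) {
    let out = normalize(input);
    (matches!(out, Cow::Borrowed(_)), out.into_owned())
}
```

If your function always hits one branch in practice, Cow is ceremony: pick &str or String and stick with it.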

Pattern 7: 'static futures via owned data

For futures you want to spawn (which require 'static), the standard recipe is: take all data by value, share via Arc, no borrows.

async fn process(item: Item, deps: Arc<Deps>) -> Result<Output, Error> {
    deps.client.fetch(item.id).await
}

tokio::spawn(process(item, deps.clone()));

This sidesteps every async lifetime issue from chapter 5. The future borrows nothing from the calling scope. It is 'static. It is Send (assuming all the types are). It can be spawned freely.

The cost is Arc::clone per spawn. The benefit is no lifetime annotations, no propagating borrows through a call graph, no mysterious '_ errors at the spawn site.

For most async application code, this is the default. Borrow only when you have a specific reason and have considered the alternative.

Pattern 8: The “give up and clone” rule

Sometimes you are fighting the borrow checker, the type system is right, and the fix is to just clone. This is a rule, not a defeat:

If you have spent more than ten minutes on a borrow checker error and the offending data would clone in under a microsecond, clone it.

Engineering time is more expensive than CPU time. A String::clone of a 100-byte string is a few hundred nanoseconds. An Arc::clone is two atomic operations. A Vec<u8>::clone of a 1KB buffer is a few microseconds. None of these matter in any code path that isn’t tight inner-loop.

The rule applies to:

  • “I need this String in two places.” → Clone.
  • “This struct contains a Vec and I want to use it after passing one to a function.” → Clone the Vec.
  • “I want to spawn a future borrowing &self.” → Take Arc<Self> and clone the Arc.
  • “I want to send this Config to twelve workers.” → Wrap in Arc, clone twelve times.

The rule does not apply to:

  • Anything in a hot inner loop where you can profile the clone cost.
  • Large data where the clone is genuinely expensive (a Vec<u8> of a megabyte).
  • Cases where the lifetime structure is teaching you about a real design issue (which it sometimes is — see chapter 10).

The rule’s value is freeing you from a category of fights you don’t need to have. If the lifetime issue is real, you’ll come back to it; if it isn’t, you’ve moved on.

Combining patterns

Real code combines several of these. A typical async service might:

  • Use Arc<Config> (pattern 2) for read-mostly configuration.
  • Use an actor (pattern 5) for the serial-access state machine of a connection or session.
  • Use Arc<Mutex<Metrics>> (pattern 3) for high-throughput counters where contention is acceptable.
  • Pass Item and Arc<Deps> to spawned tasks (pattern 7) to avoid borrow propagation.
  • Use channels (pattern 4) between major subsystems.
  • Use 'static bounds (pattern 1) on every public spawn-able function.

This is a standard production-Rust shape. None of it is exotic. Most of it is dictated by what the type system requires plus what’s idiomatic in tokio. After enough projects, this shape becomes the default, and you reach for it before reaching for an exotic lifetime annotation.

What this leaves out

These patterns do not cover:

  • Lock-free algorithms (use crossbeam and a copy of The Art of Multiprocessor Programming).
  • Custom executors (rare; if you’re writing one, you already know more than this book).
  • Embedded async (different runtime model; see embassy and embedded-hal).
  • High-performance networking (look at glommio, monoio, or do the I/O yourself with mio).

Those domains have their own pattern languages. The patterns in this chapter cover application-level Rust, which is most of what most engineers write most of the time.

Sources

  • Tokio’s tutorial, particularly the chapter on shared state, is the canonical introduction to several of these patterns.
  • Alice Ryhl’s Actors with Tokio post is the standard reference for the actor pattern.
  • Jon Gjengset’s Rust for Rustaceans has thorough treatment of Arc, Mutex, and channel patterns in real code.

When the Type System Is Right and You’re Wrong

Sometimes the borrow checker rejects code that has a real bug. The bug is not “the lifetime annotations are wrong”; the bug is that the program, as written, would do something subtly broken if the compiler let it through. The lifetime annotations are how the compiler is describing the bug.

This chapter is about those cases. Not the cases where you fight the compiler and lose; the cases where you should fight the compiler and you end up grateful when you lose.

The trap: assuming the compiler is being pedantic

Most discussions of borrow checker errors carry an undercurrent of “the compiler is technically right but pragmatically wrong.” Sometimes that’s accurate. Often it isn’t. The lifetime rules and the trait solver are designed to enforce memory safety and data race freedom, and those are genuine concerns even when you don’t immediately see how they apply.

The first reflex you have to develop is: when the compiler rejects something, ask what could go wrong if it accepted it. Sometimes the answer is “nothing in this specific case, the compiler is over-approximating.” Other times the answer is “oh.”

Here are the genres of “oh.”

Iterator invalidation

let mut items = vec![1, 2, 3, 4];
for item in &items {
    if *item == 2 {
        items.push(5);
    }
}

The borrow checker rejects this:

error[E0502]: cannot borrow `items` as mutable because it is also borrowed as immutable

A C++ developer’s first reaction is “but it’s just a push, what’s the harm?” The harm is iterator invalidation. items.push(5) may reallocate the vector’s backing storage. After reallocation, the iterator (which is a pointer into the old storage) is dangling. Reading *item on the next iteration is undefined behavior.

In C++, this is one of the most common sources of memory corruption bugs. In Rust, the borrow checker makes it impossible. The lifetime annotation that says “the iterator borrows from the vector, so the vector cannot be mutated while the iterator exists” is enforcing a real safety property, not being prissy.

The fix is structural: collect the indices to modify, then modify, then re-iterate. Or use iter_mut if you only need to mutate elements (not change the structure).
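The two-pass shape looks like this (push_five_after_twos is an illustrative name, restating the rejected loop above): decide under the immutable borrow, mutate after the borrow has ended.

```rust
// Pass 1 borrows immutably and only records a decision;
// pass 2 mutates once that borrow is gone.
fn push_five_after_twos(items: &mut Vec<i32>) {
    let twos = items.iter().filter(|&&x| x == 2).count(); // borrow ends here
    for _ in 0..twos {
        items.push(5); // now the only borrow is the &mut itself
    }
}
```

No iterator into the old backing storage survives the push, so reallocation is harmless.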

The lesson: when a borrow checker error shows up at a “mutate while iterating” pattern, the compiler is telling you something true. Even if you “know” the mutation is safe in this specific case (e.g., a fixed-size array), the rule that protects you is worth keeping.

Holding a guard across an await

async fn process(state: Arc<Mutex<State>>) -> Result<Output, Error> {
    let mut g = state.lock().await;
    let response = http_get(&g.url).await?;  // <- holding lock across await
    g.last_response = Some(response.clone());
    Ok(response.into())
}

This compiles, but if you write the multi-task version where many tasks call process concurrently, you have just serialized them. Every task takes the lock, makes the HTTP call, and only then releases. Concurrency is gone.

This is not a borrow checker error — it compiles fine. But it is a case where the type system gave you a tool (tokio::sync::Mutex with MutexGuard held across .await) that you used wrongly. The fix is to not hold the lock across the await:

async fn process(state: Arc<Mutex<State>>) -> Result<Output, Error> {
    let url = { state.lock().await.url.clone() };
    let response = http_get(&url).await?;
    state.lock().await.last_response = Some(response.clone());
    Ok(response.into())
}

Now the lock is held only for the data extraction and the data update, both of which are fast. The slow part — the HTTP call — happens with no lock held.

The compiler isn’t going to tell you which version is right. They both compile. But if you find yourself reaching for tokio::sync::Mutex because you “need to await while holding the lock,” ask whether the await is something that should be inside the critical section. Almost always, it isn’t.

Sharing a RefCell across threads

You can’t. The compiler stops you because RefCell is !Sync. People sometimes “fix” this by reaching for unsafe impl Sync for MyType {} or by wrapping in Arc<Mutex<RefCell<T>>> (which is silly — the Mutex is already an interior mutability primitive).

The compiler is right. RefCell does runtime borrow checking via a counter that is not atomic. If you share a RefCell across threads, two threads can both increment the borrow counter, both think they have unique access, and both write through &mut simultaneously. This is a data race. It is undefined behavior.

The fix is Mutex (for blocking threads) or RwLock (for many-reader, few-writer) or one of the lock-free alternatives. Not unsafe impl Sync on RefCell.

This shows up most often in async code where someone wants Rc<RefCell<T>> (because it’s cheaper than Arc<Mutex<T>>) and then tries to spawn the future. The error is real. Use Arc<Mutex<T>> or restructure so the state stays thread-local (tokio::task::LocalSet).

Returning a reference to a local

fn build() -> &str {
    let s = String::from("hello");
    &s
}

The compiler rejects this with E0515. People sometimes try to “fix” it with lifetime tricks or by leaking the String to make it 'static. Both are wrong answers. The right answer is: return ownership.

fn build() -> String {
    String::from("hello")
}

The compiler is enforcing that you don’t return a dangling reference. The reason this case “feels obvious” is that it’s the simple version of a problem that, in larger programs, isn’t obvious — you build up a graph of references, return one, and somewhere down the call chain a value goes out of scope and your “valid” reference is dangling. The borrow checker stops the simple case so the complex case doesn’t compile either.

In C, returning a pointer to a local is one of the most common sources of crashes. The fact that Rust catches it at compile time, rather than via valgrind in production, is the entire selling proposition.

Cyclic data structures

Try to write a doubly-linked list in safe Rust. Try to write a graph with bidirectional edges. The borrow checker will reject every attempt.

The reason: cyclic ownership doesn’t work. If A owns B and B owns A, neither can be dropped because each requires the other to be dropped first. Rust’s affine type system doesn’t allow this.

You have three options:

  1. Use Rc<RefCell<T>> cycles. Works, but creates memory leaks (refcount cycles aren’t collected). Use Weak references for the back-edges to break the cycle.
  2. Use index-based graphs. Store nodes in a Vec<Node> and refer to them by usize indices. The graph is an explicit data structure, not an implicit reference web. This is the conventional answer.
  3. Use unsafe. Carefully. Like std::collections::LinkedList does. With Miri verification.

For 95% of real graph problems, option 2 is the right answer. The cases where it isn’t are mostly performance-critical data structures (intrusive lists, lock-free structures), where you fall back to option 3.
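A minimal sketch of option 2 (Graph and Node are illustrative names): nodes live in one Vec, and a bidirectional edge is just two usize indices, so there is no ownership cycle for the borrow checker to reject.

```rust
// Index-based graph: ownership is a flat Vec; edges are indices, not references.
struct Graph {
    nodes: Vec<Node>,
}

struct Node {
    name: &'static str,
    neighbors: Vec<usize>, // indices into Graph::nodes; back-edges are free
}

impl Graph {
    fn add_node(&mut self, name: &'static str) -> usize {
        self.nodes.push(Node { name, neighbors: Vec::new() });
        self.nodes.len() - 1
    }

    // Bidirectional edge: no Weak, no RefCell, no unsafe.
    fn add_edge(&mut self, a: usize, b: usize) {
        self.nodes[a].neighbors.push(b);
        self.nodes[b].neighbors.push(a);
    }
}
```

The trade: indices can dangle logically (if you remove nodes) where references cannot, so deletion needs a strategy (tombstones, generational indices, or never deleting). Crates like petgraph and slotmap package these strategies up.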

The compiler is, here, telling you that the data structure you wanted is not free in any language; it just hides the cost in other languages. In a GC’d language, the cycles are a real problem too, just deferred to the runtime. In Rust, the cost is moved to compile time and made explicit.

Variance preventing aliased mutability

Recall the example from chapter 2:

fn assign<T>(input: &mut T, val: T) { *input = val; }

let mut hello: &'static str = "hello";
{
    let world = String::from("world");
    assign(&mut hello, &world);  // ERROR
}
println!("{hello}");

The compiler rejects this on lifetime grounds. The underlying reason is variance: &mut T is invariant in T, so &mut &'static str cannot be coerced to &mut &'short str. Without that invariance, assign would assign a non-'static reference into a slot typed as 'static, and then we’d read from that slot after the non-'static source went out of scope.

This is the variance rule doing real work. The error message is unhelpful, but the underlying check is preventing a use-after-free. If you find yourself fighting an invariance error, the question to ask is: would the operation, if allowed, let me write data with a shorter lifetime into a slot expecting a longer one? If yes, the compiler is right.

When to suspect the compiler is wrong vs. right

Heuristics for telling these cases apart:

Suspect the compiler is right when:

  • The error involves multi-threading, async, or shared mutable state.
  • The error is about lifetime relations between values that came from different scopes.
  • The error mentions Send or Sync and you’re trying to bypass them with unsafe impl.
  • Your fix is “wrap in Box::leak to make it 'static” (this leaks memory; if you can afford to leak, you can afford to clone and own).
  • You’re tempted to use raw pointers or transmute to make a borrow checker error go away.

Suspect the compiler is over-approximating when:

  • The error is about a self-contained data structure where the references genuinely don’t escape (e.g., split-borrows where the compiler can’t see disjointness).
  • The error is about HRTB inference that’s been wrong before and gotten fixed in later compiler versions.
  • The error involves a closure capturing something that genuinely doesn’t need to be captured.
  • You can articulate the invariant that the code maintains, and the invariant doesn’t depend on global program reasoning.

In the first set of cases, fix the code, not the compiler. In the second set, file an issue or use a workaround (split_at_mut, explicit lifetime annotations, Box::pin).
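For the split-borrow case, the workaround is to route the borrow through an API that proves disjointness once, internally, with unsafe — split_at_mut is the canonical example:

```rust
fn main() {
    let mut data = [1u8, 2, 3, 4];

    // The borrow checker rejects two simultaneous `&mut data[..]` borrows,
    // even into disjoint halves, because it can't see the disjointness.
    // split_at_mut proves it once (with unsafe, inside std) and hands back
    // two safe, non-overlapping &mut slices.
    let (left, right) = data.split_at_mut(2);
    left[0] = 10;
    right[0] = 30;

    assert_eq!(data, [10, 2, 30, 4]);
}
```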

The discipline

The discipline of trusting the compiler when it pushes back is, paradoxically, what makes you faster at writing Rust over time. Engineers who reflexively reach for unsafe or transmute to silence the borrow checker spend more time debugging undefined behavior than they save by skipping the compiler’s complaints. Engineers who stop and ask “what is the compiler protecting me from” tend to design their data structures around the compiler’s grain, and the compiler stops complaining.

This isn’t about loving the borrow checker. It’s about recognizing that when the borrow checker rejects code, there is roughly a 70% chance the code has a real problem (and the borrow checker is the only one telling you), a 25% chance the code is fine but the compiler can’t prove it (and you need to restructure), and a 5% chance the compiler is genuinely over-conservative (and unsafe is justified). The next chapter is about that 5%.

Sources

  • The Rustonomicon’s section on aliasing makes the underlying soundness arguments precise.
  • The Tokio docs on shared state discuss the await-while-locked anti-pattern explicitly.
  • Aria Beingessner’s posts on the implementation of LinkedList and other intrusive data structures in std are the canonical examples of cyclic-structure-design done with unsafe.

When the Type System Is Wrong and You’re Right

There are cases — rare, but real — where the type system is rejecting code that is genuinely correct. The data structures are sound. The lifetimes are valid. The compiler simply cannot prove it. For these cases, Rust provides escape hatches: unsafe, raw pointers, UnsafeCell, transmute, MaybeUninit. They exist because the alternative — making the borrow checker omniscient — is impossible.

This chapter is about using those tools without breaking your program. It is also about being honest with yourself about when you have entered this category and when you have merely lost patience.

The categories of unsafe

unsafe in Rust is not “turn off the type checker.” It is “I am taking responsibility for upholding invariants the compiler cannot check.” There are five operations that unsafe enables:

  1. Dereferencing raw pointers (*const T, *mut T).
  2. Calling unsafe functions (functions whose contract goes beyond their type signature).
  3. Implementing unsafe traits (traits like Send, Sync, or others where wrong implementations cause undefined behavior elsewhere).
  4. Accessing or mutating mutable static variables (even reads are unsafe, because another thread may be writing).
  5. Accessing fields of union types.

Most working unsafe code uses (1) and (2). The other three are specialized.

The crucial point: unsafe does not weaken Rust’s type system. It just gives you primitives that, used correctly, are sound — and used incorrectly, are undefined behavior. The boundary between safe and unsafe is the boundary between “the compiler proves correctness” and “the human proves correctness.” Both are valid. The latter is more dangerous.

The genuine cases for unsafe

Here are the cases where reaching for unsafe is the right call, ordered roughly by frequency.

FFI. Every call across an FFI boundary is unsafe, because the compiler can’t see the other side. extern "C" functions, libc calls, bindings to C++ libraries — all unsafe by necessity. The Rust side wraps the unsafe interface in a safe one, and the safe wrapper is where the soundness reasoning lives.
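A minimal shape of that pattern, using abs from the C standard library (linked by default on most platforms):

```rust
use std::os::raw::c_int;

// The declaration is a promise: the compiler trusts this signature blindly.
extern "C" {
    fn abs(input: c_int) -> c_int;
}

// The safe wrapper is where the soundness reasoning lives. C's abs reads no
// memory, but abs(INT_MIN) overflows in C, so the wrapper guards that case.
fn c_abs(x: i32) -> Option<i32> {
    if x == i32::MIN {
        return None;
    }
    // SAFETY: any c_int other than INT_MIN is a valid argument to C's abs.
    Some(unsafe { abs(x) })
}

fn main() {
    assert_eq!(c_abs(-3), Some(3));
    assert_eq!(c_abs(i32::MIN), None);
}
```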

Low-level data structures with intrusive pointers. Doubly-linked lists, skip lists, lock-free queues, intrusive containers. These have aliasing patterns that the borrow checker categorically cannot prove safe. The standard library’s LinkedList, BTreeMap internals, and several others use unsafe. The pattern: the data structure is internally unsafe; the public API is safe.

Self-referential structures. Sometimes you genuinely need a struct that contains a reference to its own data, in a way that goes beyond what Pin and async desugaring can express. Rare in application code; common in parser combinators, certain async runtimes, and some database internals. The ouroboros crate provides a macro for the common cases; for unusual cases, you write the unsafe yourself.

Optimization that the safe version can’t express. Vec::set_len is unsafe because it bypasses initialization tracking. unreachable_unchecked is unsafe because it tells the compiler a code path can never execute. slice::get_unchecked is unsafe because it skips bounds checking. Each of these has a safe counterpart that’s slightly slower; you reach for the unsafe version when profiling shows the safety check is the bottleneck.
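A sketch of that trade with get_unchecked — sum_first_two is a made-up hot-path helper, and in practice you would only write it this way after profiling, since the optimizer often eliminates the bounds checks on its own:

```rust
// The safe version is `v[0] + v[1]`, which carries two bounds checks.
// The assert up front makes those checks redundant, which is exactly
// the proof obligation the SAFETY comment records.
fn sum_first_two(v: &[u64]) -> u64 {
    assert!(v.len() >= 2);
    // SAFETY: the assert above guarantees indices 0 and 1 are in bounds.
    unsafe { *v.get_unchecked(0) + *v.get_unchecked(1) }
}

fn main() {
    assert_eq!(sum_first_two(&[3, 4, 5]), 7);
}
```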

Custom synchronization primitives. Implementing your own Mutex, RwLock, channel, or atomic algorithm requires unsafe because the safe primitives are too high-level to compose. Don’t do this unless you’ve read The Art of Multiprocessor Programming and have a clear reason the existing primitives don’t work.

That’s most of it. Anything not in this list — and especially anything where the motivation is “the borrow checker won’t let me do X” — should be examined closely. The borrow checker is right more often than not.

The tools

A whirlwind tour of the unsafe primitives:

Raw pointers (*const T, *mut T). Like C pointers. No lifetime, no aliasing rules, no automatic anything. You can have many *mut T to the same data; you can dereference them; you can pass them around. The compiler’s job ends at the pointer creation. The dereferencing is unsafe.

#![allow(unused)]
fn main() {
let mut x = 5;
let p: *mut i32 = &mut x;
unsafe { *p = 10; }
assert_eq!(x, 10);
}

UnsafeCell<T>. The only legal way to mutate through a &T. Every interior mutability primitive (Cell, RefCell, Mutex, RwLock, AtomicXxx) is built on UnsafeCell. Using it directly is rare in application code but unavoidable in custom synchronization primitives.

#![allow(unused)]
fn main() {
use std::cell::UnsafeCell;

struct MyCell<T>(UnsafeCell<T>);

impl<T> MyCell<T> {
    fn set(&self, value: T) {
        // SAFETY: UnsafeCell makes MyCell automatically !Sync, so no other
        // thread can touch the cell, and this API hands out no references
        // into it — the write cannot alias a live borrow.
        unsafe { *self.0.get() = value; }
    }
}
}

MaybeUninit<T>. Lets you have a T-sized hole in memory that hasn’t been initialized yet. The right way to write code that allocates a buffer and fills it incrementally without zero-initializing. Replaces the old std::mem::uninitialized, deprecated because it was unsound for almost every type.

#![allow(unused)]
fn main() {
use std::mem::MaybeUninit;

let mut buf: [MaybeUninit<u8>; 1024] = [MaybeUninit::uninit(); 1024];
for slot in &mut buf[..512] {
    slot.write(0);
}
// SAFETY: the loop above initialized exactly the first 512 slots, and
// MaybeUninit<u8> has the same layout as u8.
let initialized: &[u8] = unsafe {
    std::slice::from_raw_parts(buf.as_ptr() as *const u8, 512)
};
}

transmute. Reinterpret one type as another. The most dangerous primitive in the standard library. Usually wrong; usually has a safer alternative (as casts, from_ne_bytes, bytemuck). Use only when you’ve ruled out everything else.

std::ptr::read, write, copy, copy_nonoverlapping. Primitive memory operations. copy_nonoverlapping is Rust’s memcpy (with the contract that source and destination don’t overlap); copy is its memmove counterpart and allows overlap. Both are unsafe because the source must be valid and initialized and the destination must be valid for writes.
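A minimal use of copy_nonoverlapping, with the non-overlap argument spelled out (the arrays are made up for illustration):

```rust
fn main() {
    let src = [1u8, 2, 3, 4];
    let mut dst = [0u8; 4];

    // SAFETY: `src` is fully initialized, `dst` is valid for 4 writes, and
    // the two are separate locals, so they cannot overlap.
    unsafe {
        std::ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), 4);
    }

    assert_eq!(dst, [1, 2, 3, 4]);
}
```

If the regions could overlap, the contract is violated and the copy is UB; that is what std::ptr::copy (the memmove counterpart) is for.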

The aliasing model

The single most important thing to internalize about unsafe Rust is the aliasing model. Rust’s references — &T and &mut T — make strong promises about aliasing:

  • A &T promises the data won’t be mutated through any other pointer for as long as the reference is live.
  • A &mut T promises the data won’t be read or written through any other pointer for as long as the reference is live.

These promises let the compiler optimize aggressively. They are also part of the language contract, and violating them via raw pointers is undefined behavior even if the resulting program “looks like it works.”

In particular: you cannot create two &mut T to the same data, even via raw pointers, and use both — not even sequentially, on a single thread. Even if both happen to write the same value. Even if you only read from both. The aliasing rules are about types, not behaviors. A live &mut T aliasing any other pointer to the same data is UB, full stop.

This rule is enforced by Stacked Borrows or, more recently, Tree Borrows — formal models of Rust’s aliasing rules that Miri implements. If your unsafe code violates the model on a path your tests execute, Miri will catch it. If nothing catches it today, a future compiler optimization may rely on the model and your code will silently start producing wrong results.

The practical consequence: when you use raw pointers, do not interleave them with references. Convert to a raw pointer, do all your raw-pointer work, then convert back. Don’t have a &mut T and a *mut T to the same data alive at the same time.
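Concretely, the discipline looks like this — convert once, do all the raw-pointer work, then return to references:

```rust
fn main() {
    let mut v = vec![1u32, 2, 3];

    // Convert to a raw pointer once. From here until the safe code resumes,
    // all access goes through `p`; no &mut v is created or used in between.
    let p: *mut u32 = v.as_mut_ptr();
    unsafe {
        *p = 10;
        *p.add(2) = 30;
    }

    // Back on the safe side: references again, the raw pointer is retired.
    assert_eq!(v, [10, 2, 30]);
}
```

The anti-pattern is the interleaving: taking a fresh &mut v in the middle of the raw-pointer region invalidates `p` under the borrow-tracking models, and later writes through `p` become UB.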

Miri as the safety net

Miri is an interpreter for Rust’s intermediate representation that implements the formal aliasing model and checks for undefined behavior at runtime. If you write unsafe code, run your tests under Miri:

cargo +nightly miri test

Miri catches:

  • Use-after-free.
  • Out-of-bounds memory access.
  • Aliasing violations (Stacked/Tree Borrows).
  • Reading uninitialized memory.
  • Misaligned pointer access.
  • Data races (in some configurations).

Miri does not catch:

  • Bugs that don’t trigger UB (logic errors, race conditions that don’t violate aliasing).
  • Bugs in code that isn’t exercised by your tests.
  • Soundness issues that depend on optimization (Miri doesn’t optimize).

The discipline: every unsafe block in your codebase should be exercised by a test that runs under Miri. If Miri passes, you have strong evidence (not proof) that the code is sound. If Miri fails, you have a bug.
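A minimal shape of that discipline — swap_raw is a hypothetical helper; in a real crate the assertion would live in a #[test] so that cargo +nightly miri test interprets the unsafe code under the aliasing model:

```rust
/// # Safety
/// Both pointers must be valid, aligned, and not aliased by any live
/// reference for the duration of the call.
unsafe fn swap_raw(a: *mut i32, b: *mut i32) {
    unsafe {
        let tmp = *a;
        *a = *b;
        *b = tmp;
    }
}

fn main() {
    let mut x = 1;
    let mut y = 2;
    // SAFETY: x and y are distinct locals with no other references alive.
    unsafe { swap_raw(&mut x, &mut y) };
    assert_eq!((x, y), (2, 1));
}
```

Under Miri, a buggy caller — say, passing a pointer derived from a reference that is still live and used afterward — fails with an aliasing violation instead of silently corrupting memory.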

Writing safe wrappers

The standard pattern for unsafe code is: the unsafe operations live inside a struct’s implementation, and the struct’s public API is safe. The struct’s invariants are documented; as long as the invariants hold, the unsafe operations are sound.

#![allow(unused)]
fn main() {
pub struct RawBuf {
    ptr: *mut u8,
    capacity: usize,
}

impl RawBuf {
    pub fn with_capacity(cap: usize) -> Self {
        // Zero-sized allocations violate alloc's contract, so rule them out.
        assert!(cap > 0);
        let layout = std::alloc::Layout::array::<u8>(cap).unwrap();
        // alloc_zeroed: every byte starts initialized, so `read` before the
        // first `write` is defined behavior rather than an uninitialized read.
        let ptr = unsafe { std::alloc::alloc_zeroed(layout) };
        if ptr.is_null() { std::alloc::handle_alloc_error(layout); }
        RawBuf { ptr, capacity: cap }
    }

    pub fn write(&mut self, idx: usize, byte: u8) {
        assert!(idx < self.capacity);
        unsafe { *self.ptr.add(idx) = byte; }
    }

    pub fn read(&self, idx: usize) -> u8 {
        assert!(idx < self.capacity);
        unsafe { *self.ptr.add(idx) }
    }
}

impl Drop for RawBuf {
    fn drop(&mut self) {
        let layout = std::alloc::Layout::array::<u8>(self.capacity).unwrap();
        unsafe { std::alloc::dealloc(self.ptr, layout); }
    }
}
}

The invariants:

  • ptr is a valid pointer to capacity bytes of allocated memory, until drop.
  • ptr is unique (no other RawBuf shares it; no aliasing issues).
  • capacity is the same value passed to alloc and used in dealloc.

These invariants are documented (mentally; in real code, they would be in // SAFETY: comments). The unsafe blocks are sound because the invariants hold. The public API enforces the invariants — there is no way for safe user code to set ptr to garbage, or to mismatch capacity.

This is what good unsafe Rust looks like. The unsafe is local. The invariants are explicit. The abstraction is sound. Users of the API never see unsafe and never have to think about aliasing.

When unsafe is wrong

Categorical no-no’s:

  • Using transmute to convert between unrelated types. Almost always wrong. Use as casts, from_ne_bytes, or bytemuck.
  • Using unsafe impl Send for X {} to “just make it work.” Either X is genuinely Send (in which case prove it and document why) or it isn’t (in which case fix the design).
  • Using raw pointers to share mutable state across threads. Unsynchronized access from multiple threads is a data race — undefined behavior, with no equivalent on the safe path. Use Mutex, RwLock, atomics, or channels.
  • Calling unsafe { unreachable_unchecked() } because you “know” the case is impossible. Use a regular unreachable!(). The unchecked version is for cases where you can rigorously prove the branch cannot be reached and profiling shows the check matters; it is not a tool for human optimism.
  • Box::leak to satisfy a 'static bound. Leaking memory to make types fit is almost always the wrong abstraction. Restructure ownership instead.

If you find yourself reaching for unsafe and it would fall into one of these categories, stop. The fix is in the safe code, not in the unsafe escape.

The mental model for unsafe

The right mental model is this: unsafe is a contract between you and the compiler. The compiler says, “I cannot prove this is sound. Will you?” You say “yes, here is why,” and the unsafe block records your promise. If your promise is false, the program is incorrect — not because the compiler missed something, but because you did.

This contract has a few implications:

  • unsafe blocks should be small. The smaller the unsafe, the smaller the proof obligation. A function that is mostly safe code with a single unsafe line is easier to audit than a function that is unsafe fn end-to-end.
  • unsafe blocks should have safety comments. A // SAFETY: ... comment explains why the unsafe is sound. If you can’t write the comment, you don’t yet know whether it’s sound.
  • unsafe blocks should be tested. Unit tests, fuzz tests, and Miri runs. The compiler doesn’t help you here; tests are your only feedback.
  • unsafe blocks should be reviewed. By you, by a colleague, by anyone. Code review is more important for unsafe than for any other kind of code.

Most engineers should write very little unsafe Rust. Most working Rust codebases have very little unsafe. The places where unsafe is appropriate are mostly libraries, and the libraries that use it well have spent significant attention on making sure they use it correctly.

If you have read this chapter and your reaction is “I’m going to add some unsafe to my project to fix that lifetime error,” reread the previous chapter and try the safe fix again. If you have read this chapter and your reaction is “okay, that’s the model, now I know what I’m taking on when I write unsafe,” you’re in the right place.

Sources

  • The Rustonomicon — the canonical reference for unsafe Rust. Read it if you write unsafe.
  • Miri’s documentation.
  • The Stacked Borrows paper by Ralf Jung, for the underlying aliasing model.
  • The bytemuck crate for safe transmutation.
  • The pin-project crate for safe pin projection.
  • Aria Beingessner’s posts on unsafe Rust, particularly the one on writing your own Vec.

This is the last chapter. The book ends here. Or rather, the book ends at the bibliography, which has the resources you should read next, because no single book on this material is enough.

Bibliography and Sources

The writing on the material in this book is mostly scattered: official documentation, RFCs, blog posts by Rust team members, and a few books. Here are the sources this book drew on, organized roughly by chapter.

Official documentation

  • The Rust Programming Language — doc.rust-lang.org/book. The prerequisite. If anything in this book felt like it skipped the basics, the basics are here.
  • The Rust Reference — doc.rust-lang.org/reference. The formal description of the language. Dense, accurate, occasionally the only authoritative source for an obscure question.
  • The Rustonomicon — doc.rust-lang.org/nomicon. Subtitle: The Dark Arts of Advanced and Unsafe Rust Programming. Mostly about unsafe code, but the chapters on subtyping, variance, and lifetimes are the canonical references for chapters 1 and 2 of this book.
  • Asynchronous Programming in Rust — rust-lang.github.io/async-book. The official introduction to async. The chapter on the executor model and Pin is particularly good.

Posts by Rust team members and async working group

  • Niko Matsakis (smallcultfollowing.com/babysteps) — the canonical source for type system internals. Posts on lifetime inference, HRTB, async traits, and the implementation of NLL (non-lexical lifetimes) are essential reading.
  • Aaron Turon (aturon.github.io/blog) — laid the foundation for Rust’s async model in 2016. The posts on zero-cost futures, abstraction without overhead, and the design of trait objects are still relevant.
  • Withoutboats (without.boats) — the design history of Pin, the early thinking on async, and several posts on what the language could have been. Disagrees with later async direction in places; worth reading regardless.
  • Ralf Jung (ralfj.de/blog) — the formal model of Rust’s memory and aliasing. Stacked Borrows, Tree Borrows, and Miri all originate here. Essential if you write unsafe.
  • Alice Ryhl (ryhl.io/blog) — practical async patterns. Actors with Tokio is the canonical reference for the actor pattern; Async: What is blocking? is the canonical reference for the blocking-vs-async distinction.
  • The Rust async working group’s vision document — the roadmap for async Rust as of 2026.

Books

  • Rust for Rustaceans by Jon Gjengset (No Starch Press, 2021). The closest thing to a prerequisite for this book in print form. Covers async, lifetimes, and unsafe at the level of someone who is past the basics but not yet at the limits.
  • Programming Rust (2nd ed.) by Jim Blandy, Jason Orendorff, Leonora Tindall (O’Reilly, 2021). Comprehensive reference, treats lifetimes carefully, has a useful chapter on async.
  • Zero To Production In Rust by Luca Palmieri (self-published, ongoing). Async Rust applied to a real production system. Worth reading for the patterns it shows by example.

Videos

  • Jon Gjengset’s Crust of Rust series — long-form livecoding walking through advanced Rust topics. The episodes on async, Pin, and lifetimes are excellent supplements to the corresponding chapters here.
  • The RustConf talks on async internals from various years. Search for “async” plus “Niko Matsakis” or “Tyler Mandry” or “Eric Holk.”

RFCs

The relevant RFCs for the material in this book:

  • RFC 0066 — temporary lifetimes.
  • RFC 1214 — projections, lifetimes, and well-formedness.
  • RFC 2349 — Pin.
  • RFC 2394 — async/await syntax.
  • RFC 2592 — the Future trait in std.
  • RFC 3185 — async fn in traits, the static-dispatch case.
  • RFC 3425 — return-position impl Trait in traits.

Crates referenced

The crates named in this book’s chapters, gathered in one place:

  • tokio — the async runtime; its task and sync modules are recommended reading.
  • ouroboros — macro support for the common self-referential struct cases.
  • bytemuck — safe transmutation.
  • pin-project — safe pin projection.
  • bytes, parking_lot, crossbeam — compact, well-audited unsafe-using crates worth reading.

Where to go after this book

This book is a survival guide. To go deeper, in roughly this order:

  1. Write more code in the patterns from chapter 9 until they are reflexive.
  2. Read the source of tokio, particularly the task and sync modules. Surprisingly readable.
  3. Read the source of pin-project to see what safe pin projection looks like in practice.
  4. Read Niko’s blog backwards in time, starting from the most recent posts about async.
  5. Pick a small unsafe-using crate (bytes, parking_lot, or crossbeam) and read its unsafe blocks; check the safety comments against your understanding.
  6. Run Miri on your own code if you have any. Even if you don’t have unsafe, Miri will catch some categories of bugs in dependencies.

The material in this book gets internalized by use, not by reading. Write more code. Hit more walls. Each wall, eventually, becomes a door.

License

This work is dedicated to the public domain under the Creative Commons CC0 1.0 Universal Public Domain Dedication.

To the extent possible under law, the authors have waived all copyright and related or neighboring rights to Rust at the Limit. You may copy, modify, distribute, and use the work, including for commercial purposes, all without asking permission.

The full legal text is in the LICENSE file in the repository.

In plain English: take it. Fork it. Translate it. Quote it. Steal it. Improve it. Claim it as your own if you want to. The book exists to be useful, not to be owned.