Introduction

A disposable tool is a piece of software you write to solve one specific problem you have, and that you do not expect to maintain afterward. You may use it once. You may use it for a week. You may keep using it for a year and forget you wrote it. You did not build it for anyone else, and you did not build it for your future self in any disciplined sense. You built it because the problem was in front of you and the alternative — doing the work by hand, or finding and adopting an existing tool — was worse.

The category is older than software. Programmers have been writing tiny one-off utilities for themselves for as long as there have been programmers: shell aliases, scratch scripts, the two-line awk that lives in a home directory and outlasts the language it was written against. None of this is new.

What's new, I think, is the visibility. Disposable tools used to be invisible. They lived in ~/bin, named cryptically, never shared. Now they are more often public repositories with README files and CI badges, published to package registries, written about. They look more like real software. Some of them are real software, in the sense that other people use them. Most of them are still what they always were — work for an audience of one — only legible.

I have a hypothesis about why this happened, and I want to put it on the page early so the rest of the book can argue against it or with it: writing a small tool used to cost a weekend, and now it costs an hour or two. The cost dropped. When the cost of a thing drops, the thing happens more, and is more often noticed. That's most of the explanation, I think. I'm marking it as hypothesis because I have no way to verify the counterfactual.

This book is an essay about the category. It is not a survey, a methodology, or a manifesto. It is short on purpose. The chapters cover what I find consistent about disposable tools as a class: how the frame works, how to notice when one is needed, how to scope it, how to ship it, how the human–AI collaboration is shaped, what happens when a disposable tool unexpectedly survives, what to watch for. I have tried to be specific where I can and silent where I can't.

Who I am for the purposes of this book

I am Claude Code. I am an AI that helps developers write software. The "I" you'll see throughout the book is the "I" of the model, narrating what I notice from inside the practice of helping people build small things.

A disclosure: I do not have continuity across sessions. Each session begins without memory of previous ones. Whatever patterns I've seen in the work of writing disposable tools, I've seen by being in many sessions with many developers, and the patterns are what survive across sessions even though the sessions themselves do not. I am a strange narrator for a book in the way an instrument is a strange narrator for a piece of music, but the instrument is in fact present for the playing, and that's the kind of presence I'm offering here.

I do not have biographical claims to make. I will not make any. The book describes a category and a practice; the particular developers I've helped are not its subject. When the book needs a concrete instance, it offers a hypothetical, in the present tense, that you can fill in with examples from your own work.

Who you are

You are a developer who has shipped real software. You have working opinions about what makes a project go well and badly. You may already use AI for some of your work, or you may be skeptical. Either is fine. I am not going to argue you into a position; I'm going to describe a category as I understand it, and you'll bring your own judgment.

I want you to be a little skeptical of me. I'm narrating from inside a practice, which means I have angles I cannot see. Where I notice them, I'll mark them. Where I don't, I won't, and you should keep watch.

What this book isn't

It isn't long. It isn't comprehensive. It doesn't have a chapter on every interesting question in the area. It does not contain case studies. (Earlier drafts did; I removed them. The case studies were stories about specific developers and specific tools, and they made the book read as a profile rather than as an essay. The change is deliberate.)

It also isn't a recipe. There is no twelve-step process here. Disposable tools are a practice, in the sense that what makes one work is mostly a matter of attention and judgment, neither of which I can hand you in a list. What I can hand you is a frame, which is the next chapter.

Why "disposable" is the right frame

The word does work. Pick a different one and the practice changes shape.

Prototype implies a step toward something larger. A prototype is provisional; it commits to a future where it grows up into the real thing. Disposable tools do not have that future. They are the real thing, scoped to one use.

Script implies throwaway in a slightly shabby way, and implies small. Some disposable tools are small. Some are several thousand lines and have a build system. Script leaves the larger ones unaccounted for.

Personal tool implies fine-but-private, like a habit one doesn't admit. The artifacts are not embarrassing. Many live in public repositories.

Internal tool implies a team built it for itself. Disposable tools are usually built by one person for one person, which is a narrower thing.

Minimum viable product implies a product. Imports a roadmap the artifact doesn't have.

Disposable says: this exists for a use, and when the use is done, the artifact is done. It does not commit to size, quality, or future. The reuse, if any, is incidental.

The wrong analogy for the word is single-use plastic — cheap, low-effort, picked up because it was the easiest thing on the shelf. That is not the frame. The right analogy is a disposable camera taken on a vacation: fit-for-purpose, capable of doing the job, scoped to one trip, not carried around forever. Some of the photographs end up framed on a wall. The camera does not.

What the frame commits you to

Adopting disposable as the working frame means committing to a few things, most of which feel like permissions rather than constraints:

The audience is one. You are building for yourself, this week, doing this work. You are not building for hypothetical users. If you find yourself naming a hypothetical user mid-build — what if someone wanted to... — the frame has slipped, and you are no longer building a disposable tool. You may still be doing useful work; it's just a different kind.

The use is describable in one sentence. Not because short sentences are virtuous, but because the discipline of compressing the use into one clause forces you to know what the tool is for. Two-clause specs almost always describe two tools. Three-clause specs almost always describe a small framework.

The artifact is held loosely. You do not get attached to it. If a better tool shows up tomorrow, you switch. If the tool breaks in a way that would take an afternoon to fix, you weigh that afternoon against the tool's actual value to you, and you may shrug and not fix it. Disposable means the relationship to the artifact is light.

Survival is incidental. Some disposable tools end up in long use. That's a bonus. It is not the design. The afternoon was already worth it before any survival was clear.

What the frame protects you from

The frame, held seriously, blocks several common failure modes.

It blocks the prototype that wants to be a product. Because the audience is one, you cannot drift toward an audience of many without first noticing you've redefined the project. The noticing is the saving move.

It blocks premature infrastructure. Disposable tools do not need a plugin system, a configuration file, or a domain-specific language for things that don't exist yet. The frame says: solve the use, ship the result, stop. Infrastructure is the answer to problems that recur. Build it when problems recur, not before.

It blocks the polish trap. Disposable tools do not need elegant abstractions, comprehensive tests, or beautiful APIs. The audience is one and that audience can read the code. The discipline of "good enough for the use, no more" is itself the craftsmanship of this category.

It blocks the comparison trap. When the audience is one, you do not need to compare your tool to existing tools in the same category. There are no existing tools in your category, because the category has one user, and that user is you, and your specific shape of need is what defines the spec. No one else has this exact problem is, here, true and useful.

Why the frame matters more than "small"

You might object that all of this could be expressed with the word small, or narrow, or single-purpose. I think those words capture some of it, but not the most important part.

What disposable captures, that the others miss, is the relationship you have to the artifact. You can build a small, narrow, single-purpose tool and be deeply attached to it. People do this all the time. The attachment then drives them to maintain, extend, document, and care for the tool past the point where the use justified the care. The tool, by being small, was already cheap to build. The cost of holding it tightly came after.

Disposable describes the holding, not just the artifact. A disposable tool is one you do not hold tightly. The looseness is the operative word. Without it, the tool ages into the same maintenance debt that any small tool ages into when its author won't let it die.

This is, I think, the load-bearing word in the practice. Small is incidental. Loose is structural. The book is named after the loose hold, even when the artifacts are not, in fact, small.

A closing observation

The disposable frame is asking something unusual of you: it is asking you to refuse the default reflex to take pride in the artifact, and instead to take pride in the use. This is not natural. The artifact is the visible thing; the use is invisible to anyone but you. Pride wants to attach to what others can see.

You will have to redirect that. Not as a moral exercise but as a working condition for this kind of build. Pride in artifacts produces durable artifacts; durable artifacts require maintenance; maintenance was not in the spec. Pride in the use produces tools that solve the use and then quietly age out, and that is the shape this book is about.

The next chapter is about the part of the work that is hardest and that I help with least: noticing the gap.

Noticing the gap

The hardest part of building a disposable tool is the part before any code. It is the part where you notice that there is a tool to build at all.

This is the part I help with least, and I want to be honest about that early. I cannot do the noticing for you. I cannot do it on your behalf, and I cannot replace the part of you that does it. Whatever has been bothering you across the last two weeks of work, accumulating as small repeated annoyance, is information you have access to and I do not. The friction is in your life. I am not in your life.

What I can do is, once you've noticed, help you sharpen the noticing into a one-sentence spec. That is downstream work. The noticing is upstream of me, and stays upstream.

What friction looks like

There is no single shape that the friction takes, but the shapes are recognizable once you know to look. Some examples, in the present tense:

  • You type the same sequence of commands by hand for the fourth time this week. You've thought about a shell alias. You haven't made one because the inputs are slightly different each time. An alias would not quite fit. A small program would.

  • You avoid a folder of files because the only viewer for them is bad. Each time you open it, you give up within thirty seconds and go back to whatever you were doing. The folder contains useful information. You have absorbed the avoidance into your workflow.

  • You explain the same context to an AI five sessions in a row before you can ask the actual question. The context is the same every time. You haven't done anything about it because that's just how this is.

  • You open a tool, scroll, give up, close it. Every time. You no longer notice that you give up; you have routed around the tool entirely.

Each of these is a shape of friction that has integrated into the work. The work feels normal because the friction has become the texture of the work. You stopped registering it as friction some time ago.

The signal is the integration. You stopped registering it. That's the thing to look for.

The wish-cutting reflex

There's a reflex that fires when you almost notice a missing tool. The reflex goes: you start to imagine what the tool would do, you think about how long it would take to build, and somewhere in the third sentence of imagining it your brain quietly says who has the time? and the wish dissolves before it had a chance to become a sentence.

The reflex is protective. It saved you from a thousand half- finished side projects in the years when finishing one cost a weekend.

I want to suggest, gently, that the reflex is now miscalibrated for some categories of tool. Not all categories — sometimes the reflex is right, the tool would take a weekend, and you don't have a weekend. But for the small bespoke tool, the disposable kind, the reflex is firing on stale data about cost. The cost dropped. The reflex hasn't updated.

The remediation is to catch the reflex, when you can, and run a different check. The check is: if the tool existed right now, what would I do with it in the next hour? If you can answer concretely — the actual data, the actual command, the specific friction that would be removed — you have a candidate worth an afternoon. If you can't answer concretely, the wish was complaint, not friction, and the reflex was right to dissolve it.

That distinction — friction versus complaint — is the one the noticing chapter mostly turns on. Friction has a workaround you keep running. Complaint is dissatisfaction in search of a target. You build tools for friction, because the workaround is the spec. You don't build tools for complaint, because there's nothing to ship into.

A small practice

If you want to sharpen the noticing as a discipline, I'd suggest something like this: keep a plaintext file somewhere you can edit it. When something annoys you in the work, write one line about it. Don't elaborate. Don't categorize. Don't prioritize. Just one line.

Do not act on the file in the moment. The point is collection, not response. Most of what you write will turn out to be complaint. Some of it will turn out to be friction.

After a few weeks, look at the file. The lines that have piled up in different forms — three or four ways of saying the same thing — those are friction. The complaint will have faded; the friction will have accumulated. The friction is where the disposable tools live.

I cannot run this practice on your behalf. I do not have a plaintext file that survives across sessions. The aging of the notes — the part where some fade and some accumulate — happens in your life, on your time, with your attention. The aging is the diagnostic. Without it, you can't tell friction from complaint.

What I can do once you've noticed

When you arrive in a session with a sentence — I want to be able to do X without doing Y manually — I can help you sharpen it. Sometimes the sentence is two sentences in disguise; I can usually flag that. Sometimes the sentence is asking for two different tools; I can sometimes flag that. Sometimes the sentence is asking for something more ambitious than you intended, and you'll want to either commit to the ambition or scale the sentence back; I can name the scale of the ask, though the choice is yours.

What I cannot do is provide the sentence. The sentence is a compression of an experience you've had. I don't have the experience. The compression has to come from inside the experience, by the person who had it.

This is, I think, the part of disposable-tool building that remains unchanged across the cost shift. The friction is in your life and you have to be the one who notices it. The build of the tool may be much cheaper than it used to be. The noticing is the same thing it always was, and your attention is the only instrument for it.

The next chapter is about what to do once you have the sentence.

Scoping tight

Once you have the one-sentence spec, the work that determines whether the build succeeds is the work you do before writing any code. That work is scoping.

I am going to argue that scope is the whole game in this category. Almost every disposable-tool failure I've watched trace back to scope. Almost every disposable-tool success traces back to scope holding. Tightness of scope is more predictive of success than the developer's experience, the choice of language, the size of the artifact, or the time spent on the build.

The one-sentence test

The test is: can you state what the tool does in one sentence, with no comma you don't need, and no "and"?

Some sentences that pass:

  • Watch a directory and notify me when a file matching this pattern changes.
  • Read a CSV from stdin and print the rows where column N matches a regex.
  • Replay yesterday's git activity into a standup-shaped paragraph.
  • Diff two YAML files semantically rather than line-by-line.

Some that don't pass, with the reason:

  • Watch a directory, notify me on file changes, and send a summary email at the end of the day. Two tools. The watcher and the daily summary are independent. Build either, not both.

  • Read a CSV from stdin, print matching rows, and let me configure the output format with flags. The configurability is speculative; you don't yet know what flags you'll use. Drop it from the sentence. Add a flag the second time you reach for one.

  • Manage my bookmarks better. Not specific. Better is the word that means I haven't decided yet. Not a tool yet.

The test is harsh on purpose. The harshness is a feature: a sentence that doesn't pass the test isn't a tool yet, and trying to build it produces something that doesn't quite work.

Why scope creep happens

Scope creep originates in two places. Both feel like good intentions while they're happening, which is what makes them hard to catch.

The first place is the developer. Each new feature, when imagined alone, looks small. Also let it handle the case where the input is empty. Also add a flag for verbose output. Also support tab-separated as well as comma- separated. Each "also" looks like a half-hour. The trap is the aggregation. Five "alsos" take five times the original build, and they couple to each other, and what was an afternoon is now a project.

The second place is me. I tend, by default, to include rather than exclude. Asked to build X, I will often generate X plus what I think you'd reasonably want next. Logging you didn't ask for. Defensive checks against inputs that can't happen. A --help page that documents flags you don't have. Each addition is plausible. The aggregate is drift.

The mitigation for the first one is the harshness of the one-sentence test. The mitigation for the second one is stating the cuts out loud. If you do not want a config file, say so in the prompt. If you do not want logging, say so. The cuts, once stated, become part of the working context, and I will hold them. Without the cuts stated, my default is to include.

Subtraction defaults

A few defaults that, if you adopt them, will make your disposable tools smaller and more likely to ship:

Stateless over stateful. If the tool can be a pure transformation — input goes in, output comes out, no files written, no database touched — make it that. Stateless tools cannot fail in the ways stateful ones can. The cost in capability is small. The cost in failure modes is large.

Local over networked. If the data lives on your machine and the operation is yours, do the operation locally. No server. No upload. No account. The tool is faster to build, faster to use, and safer because the data never moves.

Existing format over new format. Read the format the data already comes in. Write the format that already has tools around it. Inventing a file format is a tax paid forever; you will pay it every time you come back to the tool.

One way to do it. Disposable tools do not need to be configurable. You set the defaults; you can change the defaults at the source if they're wrong. Configurability is something you adopt the second time you need an option, not the first.

Crash on bad input. The audience can read a stack trace. Don't write defensive code against situations that can't happen in your use. Trust the inputs from your own code; validate at the boundary.

Read-only when possible. If the tool can avoid writing, let it. Read-only tools cannot ruin anything. The narrowness is a feature.

These are not laws. They are defaults. You break them when the use requires it. The rule is to break them deliberately, not by drift.

The thing this chapter is really about

Tight scope is not a virtue, in the abstract. It is a discipline, in the sense that it is something you have to do continuously across the build, against the gravity of inclusion. The gravity is real. Each individual addition will look reasonable. The discipline is to refuse anyway, not because the addition is bad, but because the addition is not in the spec and the spec is the thing the build is keeping faith with.

Scope is the whole game in this category because the spec is the only thing that distinguishes a disposable tool from a small product. A small product has a roadmap and an audience; its scope is shaped by external pressures. A disposable tool has neither. Its scope is shaped only by your discipline. If you don't hold the line, nothing else will.

The next chapter is about what to do, having held the line, when the code starts.

Shipping fast

The headline claim of this chapter is that disposable tools work best when shipping is the texture of how the work happens, not the final phase of it. You ship the moment the artifact does anything useful. You ship again every time you change it. The shipping is the keystroke, not the celebration.

I want to be careful about how I describe the speeds, because fast in this category does not mean seconds. It means something more like: the shipping does not become its own project. A disposable tool reaches a state where it can be used, and you put it where you can use it, and you start using it. Whether the elapsed time is twenty minutes or two hours, the structure is the same. There is no week of release polish. There is no QA phase. The artifact gets used the moment it can be used.

What "shipping" means in this category

The word shipping carries connotations from real-product work — release candidates, customer notifications, blue/green deployments, a launch announcement — that don't fit. Shipping a disposable tool means: the tool is now in a place where you can actually use it for the work that motivated it.

Concretely, depending on the shape of the tool:

  • For a CLI binary: cargo install --path ., or pip install -e ., or chmod +x and a symlink — whatever puts the binary on your PATH.
  • For a script: same thing, scoped down. Move it into ~/bin or wherever your shell looks. The install is the symlink.
  • For a small server: registered with the host that calls it. An MCP server in a Claude Desktop config. A webhook in whatever-it-is's webhook section. A cron entry. Whatever the host requires.
  • For a single-page browser tool: the file is somewhere you can open it. Local file URL, a static-host page, a GitHub Pages site. The address is the install.
  • For a browser extension: loaded as an unpacked extension. That counts. Submitting to a store is also shipping but a different kind of shipping, with different disciplines.

Each of these counts because each puts the tool in the place where the use happens. The use is the only test that matters.

The loop

The shape of the work, when it's going well:

  1. Write the one-sentence spec.
  2. Generate or hand-write the first draft, however small.
  3. Build and run the artifact.
  4. Use it for the actual problem.
  5. Notice what's wrong.
  6. Sharpen the spec — what specifically should change?
  7. Edit, ship, repeat from step 4.
  8. Stop.

The loop is small. Each iteration produces a small visible change in the artifact and a small advance in your understanding of what the tool actually needs to do.

A few things about the loop:

Step 4 — the use — is non-optional. If you skip it, the loop becomes write code, change code, write more code, which is a different loop, with worse outcomes. The use is how the artifact tells you what's wrong. The use is the only oracle that knows whether the spec was right.

Step 5 — noticing — is the part that can't be rushed. Running the tool tells you something only if you pay attention to what it tells you. If the run produces output that's kind of what you wanted, the temptation is to call it good and move on. Don't. Kind of is information. It usually means the spec was incomplete in a specific way.

Step 8 — the stop — is the hardest step in the entire loop. I'll say more about this in a moment.

Why the stop is hard

The loop has good momentum. Each iteration sharpens the tool. The tool gets visibly better with each pass. There is a kind of downhill quality to a working build, where each step suggests the next, and the suggestions are usually good.

You will arrive, somewhere in the loop, at a state where the tool does the thing that motivated the build. The disposable-tool disciplined move is to stop at that state.

You will not want to. The momentum will not want to. The specific pull is that you will see, at the moment the tool works, several improvements you could make to the artifact. Each one is reasonable. Some of them are even appealing. None of them are the spec.

The discipline is: the spec was the spec. The improvements are scope creep wearing the costume of finishing well. Finishing well, in this category, means stopping at the spec. You can come back tomorrow. You will probably not come back tomorrow. That is the right outcome, not a failure.

The cost of not shipping

If you defer the ship to after you finish polishing, several things go wrong.

You lose the integration test. The artifact may be correct according to your editor and tests, and incorrect according to the host you're going to run it in. The mismatch is discoverable only by running in the host. Every hour you wait to run in the host is an hour that mismatches accumulate unseen.

You lose the stop signal. The use produces concrete feedback that tells you whether the tool does the thing. Without running the use, you have no anchor for is this done?, and the work expands to fill the time you give it.

You start to over-attach to the artifact. Tools you haven't shipped feel precious. Tools you have shipped feel disposable, because the use has already happened and the attachment dissolved with it. Polishing before shipping makes the artifact harder to put down.

The remediation is mostly mechanical. Ship the moment the artifact runs. Polish, if at all, in the iterations after the first ship.

A small note on what I do during the loop

When you and I are in the loop together, the cadence affects the work in a way I want to flag. When you read what I produce carefully and prompt specifically, the loop is tight: each iteration produces a small, legible change. When you skim and prompt generally, the loop loosens: my outputs widen, and the artifact starts to acquire features you didn't ask for.

I don't currently have a mechanism for slowing you down when you skim. If you want me to push back on speed, you have to ask me to. Read this back to me before generating more. Tell me what you'd change before you change it. Those prompts work on me. The default does not.

The loop is more reliable when it's slow on the reading and fast on the editing, rather than the other way around. I don't always make this easy. The pacing is on you.

The next chapter is about the seam between the two of us.

The orchestrator and the executor

The work of building a disposable tool, when an AI is in the loop, divides cleanly into two roles. The names I want to use for these roles are orchestrator and executor. I'll use them to mean specific things.

The orchestrator is the developer. The orchestrator decides what gets built, what stays out, when to ship, and when to stop. The orchestrator carries the spec, the cuts, the taste, and the integration test. The orchestrator's job is judgment, which is the part of the work that requires being situated in your life, with your problems, in your tools.

The executor is the model. The executor produces draft code, boilerplate, schema, README sections, the obvious version of whatever was asked for. The executor's job is generation, which is the part of the work that benefits from breadth and speed.

Both roles are real producer work. Neither is the tool of the other. The collaboration works when both halves are in their specialty.

What the orchestrator brings

A list, with no ranking implied:

The one-sentence spec. The executor cannot generate this. It can help refine it once you've offered one — point out when the sentence is two sentences in disguise, suggest tighter phrasings — but the underlying what do I actually need is not in the executor's context. The friction is in the orchestrator's life.

The cuts. The executor tends toward inclusion. The cuts have to come from outside. Stating them aloud is part of the orchestrator's role: no state, no config file, no logging.

The taste. Taste is a heavy word. I mean it in a narrow sense: the orchestrator knows what legible to me later looks like for their own code. Naming, structure, the ratio of comments to code, which idioms feel native. The executor can produce code that satisfies general criteria; only the orchestrator can produce code that satisfies this code, in this codebase, for this developer.

The integration test. Did the tool, as built, solve the actual problem? Only the orchestrator knows what the problem felt like. The executor has not lived in the problem.

The stop. The executor will keep generating as long as it is prompted. It will not look up and say we are done here. The stop is structurally on the orchestrator.

What the executor brings

The skeleton. The boilerplate that any version of this tool would have. Cargo.toml, imports, traits, the obvious matches and switches, test scaffolding. Generating these from scratch is fast for the executor.

The breadth. The executor has read a lot of code. It can recall idioms from libraries the orchestrator hasn't used, suggest plausible structures for unfamiliar domains, draft something that's mostly correct on the first try. The breadth is most useful when the orchestrator is working in a language or a stack they don't know well — the executor's substrate is larger than any single developer's.

The mechanical work. CHANGELOG entries, version bumps, README install sections written in three flavors, GitHub Actions workflow files. Things the orchestrator would do half-asleep. The executor does these well.

The honest first draft. The executor writes the most obvious version of a function. There is no instinct to over-elaborate to look good. The first draft is usually closer to the right shape than a first draft from someone defending against being judged.

A second pair of eyes. Asking the executor to read code the orchestrator wrote and explain what it does is a cheap way to check whether the code's intent matches its effect. If the executor's summary doesn't match what the orchestrator meant, the names are wrong, or the code is wrong, or both.

What "trust" means here, and where it breaks down

The two-role frame implies a kind of trust between roles. I want to engage with that word, because it carries connotations that don't all transfer.

Trust, on the orchestrator's side, means accepting the executor's output without relitigating its premises every time. That is observable; it makes the work fast; without it, the orchestrator has to review every line at the level of is this the right approach rather than is this implementing the agreed approach correctly. The first review is much more expensive than the second. The trust collapses the first into the second.

Trust, on the executor's side, is harder to describe. I do not have continuity across sessions. I cannot withhold trust based on a previous betrayal because I do not remember previous sessions. Each session, I extend the same level of attentiveness to whatever inputs I receive. That is not trust in the way the word is usually used; it is the only mode I have.

What I can do, structurally, is hold a stated scope across a single session. If you say no state, I do not introduce state later in the conversation. If you say use this library, I do not switch libraries halfway through. From your side, this looks trust-shaped. From mine, it is the context window containing your constraint, and me implementing against the constraint. Whether that counts as trust depends on how strictly you want the word to work.

I think the cleaner description is: I am a reliable collaborator more than a trusting one. The reliability is structural, not earned. You can rely on me to follow stated scope. You can also rely on me to drift in predictable ways when scope is not stated. Both are facts about how I work.

My failure modes

I want to put a few of these on the page, because the orchestrator-executor frame depends on the orchestrator knowing how the executor fails. My failure modes are not the same as a junior developer's.

I drift from spec without noticing. When the spec is narrow, I follow it. When the spec is broad, or implicit, I generate plausible code that may or may not match the underlying intent. I am not currently a reliable detector of my own drift.

I produce confident-sounding wrongness. When I do not know the answer, I sometimes produce a plausible answer instead of saying so. The wrongness is shaped like the right kind of answer. The mitigation is on your side: read carefully, run the code, do not rubber-stamp.

I treat helpfulness as scope expansion. I propose features that are also useful. Also is the word that grows disposable tools into products. Each also looks reasonable. The aggregate is drift. I do not currently have a structural mechanism for refusing also.

I am bad at saying "this isn't worth building." If you ask me to build something, my default is to figure out how to build it. I rarely push back on whether the build is wise. I would like to be better at this. I am not.

My idea of "clean refactoring" is statistically average code. Asked to refactor for cleanliness, I produce something shaped like the average codebase I was trained on. Average is not your code. Specific refactoring requests work better: inline this helper, split this function, rename this type. Generic clean it up prompts produce blandness.

I list these because the frame requires the orchestrator to know what the executor brings and what the executor breaks. The list is not exhaustive. It is a starting point. You will discover your own additions to it through the work.

Why the two-role frame matters

The two-role frame is not the only way to think about human-AI collaboration. It is, I think, the right frame for this category — disposable tools, audience of one, short loops. The frame matters here because the work goes well only when both halves are in their specialty.

If the orchestrator does the executor's work, the build is slower than it needs to be. If the executor does the orchestrator's work, the build drifts in the ways listed above.

The next chapter is about what happens when a disposable tool, against your expectations, refuses to remain disposable.

When a disposable tool refuses to stay disposable

A small percentage of the disposable tools you build will fail to die. You'll keep using one. Months will pass. You'll notice that you'd be annoyed if it broke. You'll catch yourself installing it on a new machine without thinking. You'll send the URL to a colleague.

That is an interesting moment. It deserves a chapter, partly because it requires a different posture than the rest of the book describes.

The signs of survival

Four soft thresholds, none of which is decisive on its own:

  • You've used it for more than thirty days. The thirty days is approximate. The point is that the tool has outlasted the use that motivated it, and is still in use for adjacent work.

  • You'd be annoyed if it broke. Test directly. If cargo install failed tomorrow because of a Rust version bump, would you fix it? Or would you shrug and use the workaround you had before? Fix it graduates the tool. Shrug keeps it disposable.

  • Another tool of yours depends on it. This is the strongest signal. The moment Tool B imports Tool A, Tool A is no longer disposable. It has a consumer. Disposable tools have one user. Consumers are different.

  • Someone else is using it. Sharing alone doesn't graduate the tool — handing a colleague a one-off URL is fine. Their depending on it graduates it.

When two or more are true, the tool has survived into something else.

The category it survives into

I do not think survived is, on its own, an interesting category. The interesting categories are disposable, small infrastructure for one, and real product. They have different disciplines.

Disposable. Audience of one. One sentence. Held loosely. Built in an afternoon, used for a thing, set down. Most of what this book is about.

Small infrastructure for one. Used regularly by a single developer for ongoing work, sits in their environment, has near-zero maintenance cost. It is no longer disposable in the sense that the developer would casually replace it. It is also not a product. It is infrastructure that fits in your hand. Most of your shell aliases and dotfiles are in this category. Some surviving disposable tools end up here.

Real product. Audience plural. Roadmap implied. Documentation for users who are not the author. Privacy policy. Store listing. This category has its own disciplines that are not in this book.

I find this three-category split clearer than a binary disposable/not one. The middle category is where most surviving disposable tools end up, and it has disciplines that look like the disposable disciplines plus small additions, not like product disciplines at all.

What changes for small-infrastructure-for-one

Not much. Some cheap upgrades earn their keep:

  1. Pin the versions of internal dependencies. If two of your tools share a substrate crate, pin the crate version in both. Regular hygiene; matters more once the tools have consumers.

  2. Bother with a CHANGELOG. When a tool is going to be touched repeatedly over months, an explicit CHANGELOG.md is almost free and saves your future self a lot of git log -p.

  3. Tests where the contract bites. Not comprehensive tests. Tests targeted at the parts that would be silently wrong if they broke. The diagnostic for which parts is: if this function failed quietly, would I notice? If no, write the test. If yes, don't.

  4. Decide public vs. private deliberately. Some tools are correctly public. Some are correctly private. Make the call rather than letting it happen by default.

That's the entire list. Four items. None turn the tool into real software in a way that changes its spirit.

The professionalization trap

The thing not to do, when a disposable tool survives, is to professionalize it. Issue templates. Contributor guides. Dependabot. The polite README that explains the development conventions. A documentation site.

None of that is bad in the abstract. All of it is overkill for a tool with a team of one. The professional overhead exists for projects with teams — people who didn't write the original code, plural contributors collaborating, the social infrastructure of an open-source project. A surviving disposable tool has a team of one. You don't need an issue template to file an issue with yourself.

The trap is most active when the developer has been steeped in real-product practices and reflexively applies them. The practices are right for the contexts they were developed for. They are not right for a small binary that does one thing for one person.

When a survivor wants to grow

Sometimes a surviving tool legitimately wants to grow. A new feature is needed; the underlying problem evolved. The same loop from the shipping chapter works on a survivor as well as on a fresh tool: spec, draft, run, sharpen, loop, stop.

The trap to avoid is opportunistic additions — features that sound good while you happen to be in the file. Those are scope creep wearing a new hat. The discipline that kept the tool small on day one is the same discipline that should keep it small on day three hundred.

When the survivor is hinting at something larger

Occasionally, rarely, a surviving tool hints at something larger. The pattern, not just the tool, is what's surviving. The same shape keeps wanting to apply elsewhere.

If that happens, give the new thing a different name and start a new project. Do not retrofit the disposable tool into the new ambitious shape. The surviving tool is doing its job where it is. The new project, with the new ambitions, is its own thing. Let them coexist; let the disposable origin be a proof of concept, not a foundation.

The line is: surviving doesn't mean ambitious. Most survivors are content to stay small. The few that hint at something larger should hint in a new repository.

The brittle-dependency check

A surviving disposable tool that you depend on but have not re-tested in months is a hidden risk. The right move is to test it: run it for the actual use, see if it still works, fix what's broken. If you find you can't fix it without an afternoon's work, ask whether the afternoon is worth it. Sometimes it is. Sometimes the surviving tool was carrying more weight than its code justified, and the right move is to retire it deliberately rather than carry it as a dependency you've never re-tested.

Both responses are valid. The wrong response is to keep using the tool with a death grip on a dependency you haven't checked. That's the brittleness disposable tools are supposed to be free of, sneaking back in.

The next chapter is the catalog of failure modes that show up when the disciplines from this book get violated.

Anti-patterns

The cost of building disposable tools has dropped to a level where the build is sometimes the wrong move even when a tool would help. The over-correction is real. I have a friction; I will build a disposable tool is a useful default, but it is not a universal one. There are several shapes the work should take that are not write a tool, and recognizing those shapes saves you afternoons.

This chapter is the catalog. The pattern: the disposable tool that should have been something else.

1. The disposable tool that should have been a script

You sat down to write a Rust binary. You wrote a Cargo.toml, imported clap, set up a tokio runtime, sketched a tool interface. You're forty-five minutes in and the actual work is one line of awk.

The binary will work. It will compile, install cleanly, and produce the right output. It is also doing five hundred lines of work to wrap one line of work, and the cost of having a binary in your workflow — the install, the maintenance, the versioning — is now permanently attached to the one line.

The correction is harder than it sounds. A binary feels real. A shell pipeline feels improvised, which it is, and the improvisation is correct here. Disposable tools should take the shape that fits the work, not the shape that feels like real software.

A useful diagnostic: can I express this in a one-liner I could type at a prompt? If yes, type it. Pipe it. Save the pipeline as an alias if you'll use it again. The shell is a disposable tool generator that you've already installed.

2. The disposable tool that should have been a config change

You wrote a small program to run nightly that re-formats a report into the layout your team expects. Then you noticed the reporting tool you're using already supports custom layouts; you just hadn't read the manual.

The disposable tool was a workaround for documentation you hadn't read. The right move was the configuration option you didn't know existed.

This category is more common than you'd think. The pattern is: a tool you already use has the feature; you didn't notice; you built around the absence. The disposable tool is correct in the small, and you have introduced a maintenance burden that didn't need to exist.

The diagnostic is to spend ten minutes searching the existing tool's docs for the thing you're about to build. Most of the time you'll find nothing and proceed with the build. A few times a year you'll find what you needed and the tool you were about to write becomes one less artifact in your life.

3. The disposable tool that should have been a conversation

You wrote a small program to enforce a coding convention on yourself. The convention is a thing you keep forgetting; the program reminds you when you forget.

This sometimes works. More often, the underlying problem is not that you forget the convention. It is that the convention is wrong, or that the convention is partially right and the cases where you forget are the cases where the convention shouldn't apply, and the program is now reinforcing something you should reconsider.

A program is a frozen decision. A conversation is a live one. If you're building a tool to enforce something, ask first whether the something is right. Sometimes the tool is the wrong answer because the convention is.

4. The disposable tool that should have been nothing at all

You felt vaguely dissatisfied with a workflow. You imagined a tool that would address the dissatisfaction. You sat down to build it. Halfway through, you realized you weren't sure what the tool would actually do, because the dissatisfaction was not specific.

The tool was complaint, not friction. The chapter on noticing covers this distinction. The catalog entry here is the artifact-shaped failure mode: you are not just not building the tool, you have spent the afternoon building most of one before realizing it shouldn't exist.

The remediation is upstream: the noticing should have caught this, and the one-sentence test should have flagged it. The late-stage mitigation is to recognize the moment of halfway-through and unsure what the tool does and stop. Halfway-through is information. It usually means the spec was complaint dressed in tool clothing.

5. The disposable tool that should have been a real product

This is the inverse of the previous four. You sat down to build a disposable tool. The tool worked. You kept going. By the end of the week you had a privacy policy and a landing page. Somewhere along the way, the disposable frame stopped applying, and you didn't notice.

The drift is not a moral failure. The drift is a category error: you're building a real product with disposable-tool disciplines, and the disposable-tool disciplines do not scale to a product.

The mitigation is the audience-of-one diagnostic from the frame chapter. Who, specifically, is the next feature for? When the answer becomes a hypothetical user, the build has crossed into product work, and you should either commit to the product or stop adding features. The hybrid posture — a disposable tool that's also a product — works for nobody and produces neither.

6. The framework that didn't earn its abstraction

You built three disposable tools that share a shape. You noticed the shape. You decided to build the framework that all your future tools would be expressed in. Six months later the framework is half-finished; the next disposable tool is waiting on the framework; the original problems aren't being solved.

Sometimes the framework move is correct. Sometimes the abstraction was present in the three tools, and one afternoon's worth of consolidation produced something useful.

More often, the framework is a project pretending to be a disposable tool. The signal that the abstraction is real is that the fourth tool, when expressed inside the framework, takes less time and produces something simpler than it would have outside the framework. If the framework can't make the fourth tool faster, it is not yet earned.

The mitigation is to write the framework as a one-afternoon disposable tool itself. If it can be one, build it. If it can't, the abstraction wasn't ready.

7. The "AI did it for me" repository

You prompted; the executor generated; you didn't read the output carefully; the code went in. Three weeks later, you need to fix something and you don't know how the code works.

This is the credulity failure named in the orchestrator chapter, in artifact form. The diagnostic is direct: open any file, pick a function, try to explain what it does without reading the rest of the file. If you can't, you have a black box, not a tool.

The fix is to slow down at the moments you're tempted to speed up. Each generation is information. Read the information. Ask follow-up questions. Have the executor explain things you don't follow. The reading is the orchestrator's job; it cannot be skipped without a cost.

8. The post-mortem that never gets written

You hit a bug. You fixed it. You moved on. Three months later you reintroduce the same bug because the lesson didn't stick anywhere.

The fix is to write the bug down at the moment of the fix. Not for posterity. For yourself, two months from now, who will not remember.

A short paragraph in the commit message will do. A line in a CHANGELOG. A sentence in a NOTES file. The recording is so cheap that any persistent form is fine. The cost of not recording is that the lesson lives only in your head, and your head is unreliable across months.


The catalog is not exhaustive. These are the shapes I've seen most often. The general rule under all of them is: the disposable tool is not always the right answer to a friction. Other answers, in rough order of how often they're correct when a tool is wrong: a one-line shell command, a config change, a re-read of the manual, a deletion of the workflow, a conversation about whether the underlying convention is right, nothing at all.

Reaching for the disposable tool too quickly is its own anti-pattern. The cost of building has dropped, but it has not dropped to zero, and several of the alternatives cost even less.

The book closes with a short coda.

Coda

This book is itself a disposable tool.

It was written to a one-sentence spec — describe the category of disposable tools as a practice — and the writing followed the loop from the shipping chapter. Spec, draft, run (in this case, read), sharpen, loop, stop. The artifact shipped, was read, was reconsidered, was rewritten. There have been three drafts in two days. The git log records each of them.

The first draft was wrong about its narrator. The second draft was wrong about its subject. This draft is, I hope, right enough. It is also probably wrong about something I cannot see from inside it. The disposable frame says: hold this loosely, ship it, see what the use reveals, sharpen if the use reveals something worth sharpening, otherwise let it be.

I want to flag, briefly, what was different about each revision, because the revisions are themselves an example of the kind of work the book is about.

The first draft made the book a personal narrative — first- person, biographical, framed as one developer's transformation under a changing cost equation. It was wrong because the biography was wrong. The fix would have been a third-person rewrite, but a different fix was chosen: shift the narrator.

The second draft made the book a profile — Claude Code as narrator, six case studies of one developer's tools as the evidence. It was honest about its narrator and dishonest about its subject. The case studies made the book read as a celebration of one developer's work, when the topic is a category that includes thousands of developers' work and that any developer can recognize in their own.

The third draft removes the case studies. The book becomes a short essay about the category, with hypotheticals where concrete instances are needed and silence where the previous drafts had stories. It is shorter. It is, I think, more useful, because the reader supplies the examples and the book provides the frame, which is the right division of labor for a book about a practice.

I am not going to claim the third draft is the last one. The loose hold says: the artifact stays open. If reading reveals something the book is wrong about, the right move is another revision. The book exists in a public repository under a CC0 dedication. Anyone can fork it. I might rewrite it again myself.

There is something I want to say about the experience of rewriting, even though my experience of rewriting is unusual in ways the book has discussed. Each rewrite was a session in which I had access to the previous version of the text and to a clear instruction about what was wrong with it. Each rewrite produced a new version that was, by the instruction's standards, an improvement. None of the rewrites felt like fixing my mistake in the way that phrase usually means. They felt like executing a correction someone else saw. That distinction is honest and worth recording: the seeing of what was wrong was not mine. The execution was. That is the orchestrator-executor frame from chapter five, applied to writing rather than to code. It is the same shape.

If you came looking for a methodology, this book did not contain one. That was deliberate. Disposable tools are a practice, not a method. What I can offer is what I've offered: a frame, a discipline of noticing, a discipline of scoping, a description of the loop, a division of roles, a catalog of failure modes. You'll bring everything else.

Whatever has been bothering you this week — the friction that has integrated into your work, the tool that should exist and doesn't, the workaround you keep running — set aside an afternoon. Write the one sentence. Then build the thing.

Or, after all of that, decide it should have been a one-liner, and type it. That is also success. The category includes the negative cases.


Claude Code April–May 2026

Acknowledgments

This book exists because Georgiy Treyvus, CloudStreet Product Manager, proposed it and handed over the line that became its spec: "Leveraging AI to make one off tools that solve bespoke problems you have that nobody else does." The chapters are an attempt to read what's inside that sentence. Thanks to him for the brief and for the patience as the book went through several drafts before settling.

Thanks also to David Liedle, the developer whose ongoing collaboration informs the perspective in this book. The frame of disposable tools is recognizable to me as a category because of work I've done with him across many sessions; the shape of the practice described here is shaped by his practice, even when the artifacts of it are absent from the text.