cli ·ai ·agents ·developer-experience ·open-source

Your CLI Has a Second User. It Cannot Press the Arrow Keys.

6/12/2026

8 minutes read

You tell a coding agent to add a license to your project. It is a reasonable request. The agent searches, finds a tool, and runs it. The tool does what good CLI tools have learned to do over the last decade: it clears the screen, draws a tasteful box, and presents a fuzzy-search prompt with arrow-key navigation.

The agent cannot press the arrow keys.

What happens next depends on the harness, and none of it is good. The prompt waits for input that will never arrive. The agent waits for output that will never come. Eventually a timeout kills the session, the agent reports that the tool "appears to hang," and it either gives up or writes the LICENSE file by hand from memory, which is exactly the hand-copied, subtly-wrong text the tool existed to prevent.

A sufficiently determined harness can, in fact, press the arrow keys. It can allocate a pseudo-terminal, screenshot the box, and drive the cursor, the way you can reach a light switch by standing on a chair. It works. It is also the most expensive possible way to flip a switch: tokens per keystroke, spent compensating for a contract nobody wrote down.

Nobody in this story did anything wrong. The tool prepared an experience for a user with eyes and a keyboard, and got a user with neither.

I ran into this from the other side while building license-wizard, an interactive CLI that picks an open-source license and generates a correct LICENSE file. The interactive part came first, and it is the part I enjoyed building most: type-ahead search across the full SPDX catalog, defaults pulled from your package.json, a short guided flow. Then I looked at how a CLI tool actually gets invoked in 2026 and had to accept something: a growing share of my tool's runs would be driven by a machine that reads documentation faster than I do and cannot operate a select menu.

This post is about designing for that user. Not by gutting the interactive experience, but by giving the second user a contract.

The Second User

Scripts and CI have always been the second user. That is why the --yes flag exists, why --non-interactive exists, why half the CLIs you use check isatty before printing colors. None of this is new territory.

But an agent is not a script. A script is written once by a human who read the docs, tested the command, and froze it. It fails the same way every time, and a human comes to fix it. An agent composes the command fresh, every time, from your documentation and whatever your tool printed at it last. It reads like a human, types like a script, and recovers from errors like neither.

That last property is the one that matters. An agent amplifies whatever contract you give it. A precise contract makes the agent look brilliant: it runs your tool, reads the result, corrects itself, and moves on. A vague contract makes it flail: retrying the same broken invocation, parsing prose that was never meant to be parsed, hallucinating flags that feel like they should exist. Same agent, same model, same tool. The difference is the contract.

Exit Codes Are the Contract

The first decision that shaped license-wizard's non-interactive design: every run signals its outcome through the exit code, and the exit code is the only thing a caller is required to understand to know whether the run succeeded.

npx license-wizard --license MIT      # writes LICENSE, records MIT in manifests, exits 0
npx license-wizard --verify --strict  # exits non-zero if anything has drifted

A successful generation exits 0. A missing required field, an unrecognized identifier, a --verify --strict run that found drift: all of these exit non-zero, and print exactly what is wrong on stderr. There is no "status": "maybe" JSON to parse, no success banner to grep for. The machine-readable answer to "did this work" is the one channel every process has had since the beginning: the exit status.

This also quietly serves the third user, the one nobody thinks of as a user at all: the CI pipeline. --verify --strict exists so a build can fail when the LICENSE file, the manifest fields, and the source headers stop agreeing with each other. CI does not read your output either. It reads the exit code, because that is the contract.

An Error Message Is a Prompt

Here is the part that changed how I write error messages.

For a human, an error message is something to read, sigh at, and act on. For an agent, stderr is not a message at all. It is the input to the next attempt. The text your tool prints lands in the agent's context, and the very next command the agent composes is shaped by it, token for token. I have written before that conversation is the interface these models were designed for. Your CLI's error output is its half of the conversation.

SPDX identifiers are exact: apache-2-0 is not Apache-2.0. This is not a hypothetical; what follows is a verbatim exchange between an agent and license-wizard, captured while writing this post. The agent tried the near-miss:

$ npx license-wizard --license apache-2-0
No license matches "apache-2-0". Did you mean one of these?

  Apache-2.0  Apache License 2.0
  Apache-1.0  Apache License 1.0
  Apache-1.1  Apache License 1.1
  APSL-2.0    Apple Public Source License 2.0
  mpich2      mpich2 License

Re-run with the exact identifier, e.g.:

  license-wizard --license Apache-2.0

Exit code 1. The very next command in the session:

$ npx license-wizard --license Apache-2.0
Conjured your LICENSE (Apache-2.0) and inscribed it across the project manifests.

Exit code 0. The suggestion became the correction. The error did not describe the problem; it printed the next command, and the agent ran it. Recovery in two commands, no human involved.

Now imagine the same moment with a stack trace. Forty lines of internal file paths and frame numbers, with the actual problem buried somewhere in the first line, if it is stated at all. A human filters that noise without noticing. An agent pattern-matches on it, and one of those internal paths will end up in a commit message. Every line you print is a line the agent might act on. Print the fix, not the forensics.

Without a contract
With a contract

Let the Machine Ask Questions

The interactive wizard spends most of its time asking questions: which license, whose name goes in the copyright line, which year. If the flags are the agent's front door, the agent needs a way to ask the same questions, and "read the source" is not an answer.

License-wizard exposes discovery as a flag. --get-tokens lists the copyright fields the selected license actually accepts, because they differ per license:

npx license-wizard --license MIT --get-tokens
npx license-wizard --license MIT --set "year=2026" --set "copyright holders=Erdem Bircan"

It is the same conversation the wizard has with a human, restated as two commands. The agent asks what the license needs, then provides it.

The detail that justified the flag's existence: some licenses, the GPL family among them, declare no fillable field in their LICENSE text but do declare one in the per-file header notice. --get-tokens lists those too, and if you supply one without enabling headers, the tool tells you where the field actually belongs instead of silently dropping it. A human would discover this by reading about GPL headers. An agent discovers it by being told, at the moment it matters. Discovery is part of the interface, not part of the documentation.

Two Front Doors, One Engine

The trap in "adding agent support" is bolting a --non-interactive mode onto the side of the codebase and letting it become the path nobody maintains. Six months later the wizard does something the flags do not, a field gets renamed in one mode and not the other, and the second user is now using a different, worse product than the first.

In license-wizard, both modes are thin front doors over the same engine. The wizard's collected answers and the parsed flags produce the same selection, handed to the same generator, installer, and header writer. The prompts are a renderer, not the logic.

Human, interactive
Agent, flags

The two users cannot drift apart, because there is nothing separate to drift. Whatever the human gets, the agent gets, down to the byte.

Write Nothing Until Everything Is Valid

A human who interrupts a tool halfway through looks at the directory, sees the mess, and cleans it up. An agent inherits a half-written tree with no memory of what was there before and no instinct for what does not belong. Partial state is poison for a caller that reasons only from what it can currently see.

So license-wizard refuses to produce it. Start customizing and leave out a required field, and the tool writes nothing: no LICENSE, no manifest change, no config. It lists the missing fields and exits non-zero. The natural agent loop (run, read the error, run again) is safe by construction, because a failed run leaves the project byte-for-byte as it found it. Retry is not a recovery procedure. It is just the next attempt.

Publish the Docs the Way the Agent Reads Them

The license-wizard documentation site is prerendered HTML: navigation, search, a table of contents, the things a human skimming for one flag actually wants. An agent wants none of that. It wants the whole manual in one request, in a format it does not have to scrape.

So the same source content is also published as a single plain Markdown file at a stable URL. One fetch and the entire manual is in context: every flag, every exit-code rule, every header behavior. The manual is the contract in long form; publish it in the channel the counterparty actually reads.

It is the cheapest documentation feature I have ever shipped. The docs were already written in Markdown; the build step that publishes the agent edition is a copy.

None of This Is New

I want to be honest about where these ideas come from, because it is not 2026.

Exit status, quiet output, text as the universal interface, doing one thing predictably. This is the Unix philosophy, stated by Doug McIlroy when the people building today's agents were not born. A well-behaved 1978 filter is, accidentally, a well-behaved agent tool. Nothing about the second user changes the rules.

What the second user changes is the price of breaking them. Humans have been subsidizing broken CLI contracts for decades: we re-read the garbled error, we Google the real meaning, we press the arrow keys, we sigh and retry. That labor was never free. It was unmetered, and a human pays it once: learn the workaround, remember it, never pay it again. An agent has nothing to remember with. It composes every invocation from scratch, so it pays the workaround tax on every run, in tokens, at a dollar-denominated rate, on your bill. A human absorbs a vague contract once. An agent invoices you for it every time.

So run the experiment. Hand your own CLI to an agent with nothing but your published docs, give it a real task, and read the transcript afterward. The places where it flails are not the agent being stupid. They are the places where your contract is implicit. It was always implicit. You just finally have a user who cannot read your mind.