Mario Zechner, creator of the Pi coding agent, delivers a conference talk in three acts: (1) why he stopped using Claude Code and built Pi, (2) how AI-generated spam ("clankers") is destroying open source, and (3) a passionate argument for slowing down and writing critical code by hand. A raw, opinionated, and deeply technical talk about agent harness design, context control, and the compounding error problem of agent-generated code.
Started using Claude Code in April 2025. Initially simple, predictable, fit his workflow. But problems emerged:
Looked at alternatives: Amp and Factory's Droid ("the Porsche and Lamborghini"), and OpenCode (brilliant team, but he found issues: tool-output pruning that "lobotomizes the model", LSP integration that checks for errors after every edit and confuses the model, individual messages stored as separate JSON files, and a CORS security issue).
Discovered Terminal Bench, a benchmark whose minimal harness gives the model only one capability: sending keystrokes to a tmux session. No file tools, no sub-agents. Yet that harness scored among the highest, often beating native harnesses.
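To make the "keystrokes only" idea concrete, here is a minimal sketch of what such a single tool could look like. It is not the benchmark's actual harness code: the names `Runner`, `sendKeysCommands`, and `typeAndObserve` are hypothetical, and only the tmux subcommands (`send-keys -l`, `capture-pane -p`) are real.

```typescript
// Hypothetical sketch: the model's only "tool" is typing into a tmux session.
// `Runner` abstracts process execution so the logic can be tested without tmux.
type Runner = (cmd: string, args: string[]) => string;

// Build the tmux invocations that type `text` literally, then press Enter.
function sendKeysCommands(session: string, text: string): string[][] {
  return [
    ["send-keys", "-t", session, "-l", text], // -l: send the text literally
    ["send-keys", "-t", session, "Enter"],    // submit the line
  ];
}

// Type a command, then read back what the pane currently shows.
function typeAndObserve(run: Runner, session: string, text: string): string {
  for (const args of sendKeysCommands(session, text)) {
    run("tmux", args);
  }
  // capture-pane -p prints the visible pane contents to stdout.
  return run("tmux", ["capture-pane", "-p", "-t", session]);
}
```

In a real setup the `Runner` would wrap a process-spawning call, and the harness would likely wait for the command to finish before capturing the pane. The point of the sketch is only that the whole tool surface fits in two tmux subcommands.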
Stripped everything down, built minimal but extensible core. Agent can modify itself.
Four packages: AI (provider abstraction), Agent Core (while loop + tool calling), TUI (bespoke framework from game dev background — doesn't flicker), Coding Agent.
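The "while loop + tool calling" description of Agent Core can be sketched in a few lines. This is an illustrative sketch, not Pi's source: the `Message`, `Model`, and `Tool` types and the `runAgent` function are assumptions made for the example, and the model is a plain synchronous function so the loop can run without a provider.

```typescript
// Hypothetical types; a real agent core would talk to a provider API.
type ToolCall = { name: string; args: Record<string, string> };
type Message =
  | { role: "user" | "assistant"; text: string; toolCalls?: ToolCall[] }
  | { role: "tool"; name: string; text: string };
type Model = (history: Message[]) => { text: string; toolCalls: ToolCall[] };
type Tool = (args: Record<string, string>) => string;

// The core loop: ask the model, run any requested tools, feed results back,
// and stop once the model answers without requesting a tool.
function runAgent(
  model: Model,
  tools: Record<string, Tool>,
  prompt: string,
  maxTurns = 20, // safety cap so a confused model cannot loop forever
): Message[] {
  const history: Message[] = [{ role: "user", text: prompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = model(history);
    history.push({ role: "assistant", text: reply.text, toolCalls: reply.toolCalls });
    if (reply.toolCalls.length === 0) break; // no tool requested: the model is done
    for (const call of reply.toolCalls) {
      const tool = tools[call.name];
      const text = tool ? tool(call.args) : `unknown tool: ${call.name}`;
      history.push({ role: "tool", name: call.name, text });
    }
  }
  return history;
}
```

Everything else in a harness (TUI, compaction, extensions) sits around this loop rather than inside it, which is roughly the split the four packages suggest.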
Pi's system prompt: [shows nearly empty prompt]. "That's it." Models are reinforcement-trained on coding agents — they don't need 10,000 tokens telling them they're a coding agent. They know.
Four tools only: read, write, edit, bash. ▶ 7:01
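The talk does not spell out how the edit tool works. A common design in coding agents is exact-string replacement that refuses ambiguous matches; the sketch below assumes that design, and `applyEdit` and `EditResult` are hypothetical names, not Pi's API.

```typescript
// Hypothetical sketch of an "edit" tool as exact-string replacement.
type EditResult = { ok: true; content: string } | { ok: false; error: string };

// Replace `oldText` with `newText`, but only if `oldText` occurs exactly once.
// Rejecting zero or multiple matches forces the model to be unambiguous.
function applyEdit(content: string, oldText: string, newText: string): EditResult {
  if (oldText === "") return { ok: false, error: "oldText must not be empty" };
  const first = content.indexOf(oldText);
  if (first === -1) return { ok: false, error: "oldText not found" };
  if (content.indexOf(oldText, first + 1) !== -1) {
    return { ok: false, error: "oldText matches more than once" };
  }
  const updated = content.slice(0, first) + newText + content.slice(first + oldText.length);
  return { ok: true, content: updated };
}
```

Returning an error string instead of throwing lets the agent loop hand the message straight back to the model, which can then retry with more surrounding context.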
YOLO by default — no approval dialogs. "My security needs are different than yours." Instead gives you rope to build your own security. ▶ 7:27
Sub-agents, plan mode, MCP — NOT built in. You ask Pi to build them as extensions based on your needs.
Extensions are TypeScript modules. Extension API hooks into everything: tools, slash commands, events, session state, custom compaction, custom providers, full tool control.
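The shape of such an extension can be illustrated with a toy host. Everything here is hypothetical: `ExtensionHost`, `Host`, `registerCommand`, `on`, and the `tool_call` event name are invented for the sketch and are not Pi's real extension API, which the talk does not show.

```typescript
// Hypothetical host interface; names are illustrative, not Pi's real API.
type Handler = (payload: string) => void;
interface ExtensionHost {
  registerCommand(name: string, run: (args: string) => string): void;
  on(event: string, handler: Handler): void;
}
// An extension is just a function that receives the host and hooks into it.
type Extension = (host: ExtensionHost) => void;

class Host implements ExtensionHost {
  private commands = new Map<string, (args: string) => string>();
  private handlers = new Map<string, Handler[]>();

  registerCommand(name: string, run: (args: string) => string): void {
    this.commands.set(name, run);
  }
  on(event: string, handler: Handler): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }
  load(ext: Extension): void {
    ext(this);
  }
  emit(event: string, payload: string): void {
    for (const handler of this.handlers.get(event) ?? []) handler(payload);
  }
  // Dispatch a line such as "/toolcount" to the matching slash command.
  runCommand(line: string): string {
    const [name, ...rest] = line.replace(/^\//, "").split(" ");
    const cmd = this.commands.get(name);
    return cmd ? cmd(rest.join(" ")) : `unknown command: /${name}`;
  }
}

// Example extension: count tool calls and expose the count as a slash command.
const toolCounter: Extension = (host) => {
  let count = 0;
  host.on("tool_call", () => { count++; });
  host.registerCommand("toolcount", () => `tool calls so far: ${count}`);
};
```

Because an extension in this sketch is a plain function over a small interface, reloading it amounts to discarding the old registrations and calling the function again.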
Packaged via NPM or GitHub — "we don't need another silo called a marketplace. We already have package managers."
Everything hot-reloads during sessions (game dev philosophy: low iteration time).
/byTheWay feature: someone rebuilt it in 5 minutes as a Pi extension with more features. "How do you build a Pi extension? You don't. You tell Pi to build it for you." ▶ 10:01
Terminal Bench: Pi scored 6th place — before even having compaction.
"Clankers" (AI-generated spam) are destroying open source. TilDraw closed their issue tracker. OpenCode flooded. Pi's tracker filled with garbage from OpenCode instances using Pi as agent core without users knowing.
The core argument: "Everything's broken."
"Our product's been 100% built by agents." — "Yes, we know it sucks now. Congratulations."
Agents compound errors ("booboos") with zero learning, no bottlenecks, and delayed pain (for you). Visualization: 1 human → manageable errors. 1 agent → more errors. 10 agents → exponential errors. ▶ 13:00
"But I have a review agent!" — "Let me introduce you to the wonderful world of the ouroboros." Doesn't work.
They learned from the internet (90% garbage code). Every decision is local. They add abstractions, duplication, backwards compatibility everywhere. "Enterprise-grade complexity within 2 weeks with just two humans and 10 agents."
"But my detailed spec." — "A sufficiently detailed spec is a program." Blanks in specs get filled with internet garbage.
Humans vs agents: humans are fallible BUT they learn, they're bottlenecks (limited booboos per day), and they feel pain. Pain triggers action (quit, blame someone, or refactor). "Agents will happily keep shitting into your code base."
Good agent tasks:
Pattern: agent works → you evaluate → take what's reasonable (most isn't) → finalize.