Simple Pi Subagents — Deep Dive

Eero Alvar · ~13:42

Overview

This video presents a minimal, extensible sub-agents framework for Pi (a coding agent). The author explains the philosophy behind giving AI coding agents their own sub-agents, demonstrates three agent types (Scout, Researcher, Worker), and shows nested sub-agent spawning up to 6 levels deep.

1 The Problem — Context Bloat ▶ 0:00 ▶ 1:30

Core problem: as task complexity grows, agents read too many files during planning and exploration, bloating the context window into the "dumb zone" before execution can begin. Most file reading is exploratory — finding where things are, how pieces connect. Full file contents don't need to stay in context.

Solution: give your agent its own agents to outsource mechanical, context-heavy work. Bonus: since this work is mechanical, you can use cheaper models (Haiku instead of Opus/Sonnet), saving money while keeping the master agent's context lean and effective.
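The context-isolation idea above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Context`, `explore_with_subagent`, the file names, and the word-count stand-in for tokens are all hypothetical and are not part of Pi or the extension shown in the video.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A toy context window: just a list of text chunks."""
    chunks: list[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.chunks.append(text)

    def size(self) -> int:
        # Crude stand-in for a token count: whitespace-separated words.
        return sum(len(chunk.split()) for chunk in self.chunks)


def explore_with_subagent(files: dict[str, str], query: str) -> tuple[str, int]:
    """Hypothetical sub-agent run: reads every file into its OWN context
    and hands back only a short summary plus its own context size."""
    sub = Context()
    sub.add(query)
    for path, body in files.items():
        sub.add(f"{path}\n{body}")
    matches = sorted(path for path, body in files.items() if query in body)
    summary = f"'{query}' appears in: {', '.join(matches) or 'no files'}"
    return summary, sub.size()


# Made-up file contents, padded so the files are large relative to the summary.
files = {
    "app/audio.py": "def remove_silence(samples): ... " * 200,
    "app/export.py": "def write_fcpxml(clips): ... " * 200,
    "README.md": "Project notes " * 200,
}

master = Context()
summary, subagent_size = explore_with_subagent(files, "remove_silence")
master.add(summary)  # only the summary reaches the master's context

print(summary)
print(f"sub-agent context: {subagent_size} words, master context: {master.size()} words")
```

The sub-agent's context absorbs the full file contents, while the master's context grows only by the one-line summary.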

2 Design Philosophy — Three Values ▶ 2:02 ▶ 3:25

Capability — Sub-agents should be as capable as needed for their role
Observability — Full visibility into what every agent is doing at all times, especially important when sub-agents spawn their own sub-agents. Two reasons: improving agent prompts/functionality, and maintaining a feeling of control
Extensibility — Two layers: (a) trivial to add/modify agents via markdown files, (b) the extension itself is minimal and easy to hack on

3 Agent Architecture — Markdown-Defined Agents ▶ 3:06 ▶ 5:11

Each agent is entirely defined by a markdown file with YAML front matter:

Name, description
Tools available
Available sub-agents (controls spawn depth)
Model and thinking level
System prompt in the markdown body
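To make the file layout concrete, here is a minimal sketch of what such an agent file might look like and how it could be split into fields. The key names (`tools`, `subagents`, `model`, `thinking`), their values, and the `parse_agent` helper are assumptions for illustration; the actual front matter keys used by the extension may differ.

```python
# Hypothetical agent file; key names and values are illustrative only.
AGENT_FILE = """\
---
name: scout
description: Read-only file system exploration
tools: read, grep, find, ls
subagents:
model: haiku
thinking: low
---
You are a scout. Explore the file system and report what you find.
"""


def parse_agent(text: str) -> dict:
    """Split a markdown agent file into front matter fields and a system prompt.

    Handles only flat `key: value` lines with comma-separated lists;
    a real implementation would more likely use a proper YAML parser.
    """
    _, front, body = text.split("---\n", 2)
    agent: dict = {}
    for line in front.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        agent[key.strip()] = value.strip()
    # Turn the comma-separated fields into lists (empty field -> empty list).
    for list_key in ("tools", "subagents"):
        raw = agent.get(list_key, "")
        agent[list_key] = [item.strip() for item in raw.split(",") if item.strip()]
    agent["system_prompt"] = body.strip()
    return agent


scout = parse_agent(AGENT_FILE)
print(scout["name"], scout["tools"], scout["subagents"])
```

An empty `subagents` field here means the agent cannot spawn anything, which is one way a definition could control spawn depth.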

Agents are auto-discovered by the extension. Three ship by default:

Scout (Haiku) — File system exploration: read, grep, find, ls. Safe because it physically can't modify anything.
Researcher (Sonnet) — Web research: web search + web fetch. Needs more intelligence for synthesis.
Worker (full capabilities) — Same tools as master agent + its own sub-agents tool. Has a safer bash tool to prevent destructive operations. Can spawn Scouts and Researchers (not Workers by default), creating an effective max depth of 3.
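The depth figure can be checked with a small calculation over the spawn permissions. This sketch assumes the master agent counts as level 1 and can spawn all three agent types; the `chain_depth` helper and the dictionary layout are hypothetical, not the extension's own data model.

```python
import math

# Which sub-agents each agent may spawn, per the defaults described above.
AGENTS = {
    "scout": [],
    "researcher": [],
    "worker": ["scout", "researcher"],
}
# Assumption: the master agent can spawn any of the three.
MASTER_SUBAGENTS = ["scout", "researcher", "worker"]


def chain_depth(children: list[str], agents: dict[str, list[str]],
                path: tuple[str, ...] = ()) -> float:
    """Longest spawn chain starting at an agent with these children,
    counting that agent itself as level 1. Returns math.inf if an agent
    can (directly or indirectly) spawn itself, since depth is then unbounded."""
    deepest = 0.0
    for child in children:
        if child in path:
            return math.inf
        deepest = max(deepest, chain_depth(agents[child], agents, path + (child,)))
    return 1 + deepest


default_depth = chain_depth(MASTER_SUBAGENTS, AGENTS)

# Variant: allow workers to spawn workers.
recursive_agents = {**AGENTS, "worker": ["scout", "researcher", "worker"]}
recursive_depth = chain_depth(MASTER_SUBAGENTS, recursive_agents)

print(default_depth, recursive_depth)
```

With the defaults the longest chain is master, then Worker, then Scout or Researcher. Once Worker appears in its own sub-agent list the permissions form a cycle, so nothing in the configuration bounds the depth.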

4 Live Demo — Scout & Researcher ▶ 5:12 ▶ 7:12

Demonstration of the Pi interface with the sub-agents extension:

Prose thinking streamed live as the agent works
Compact tool call view (one per line), expandable with Ctrl+O
Each sub-agent shows its own status bar: token metrics, cache reads/writes, total cost, context window meter
Scout demo: Haiku explored the filesystem, read many files — 50k tokens for ~15 cents
Researcher demo: searched the web for latest AI/Pi news — consumed 70k tokens while keeping the master agent lean
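As a quick sanity check on the Scout figures, the implied blended rate can be worked out directly. This is plain arithmetic on the two numbers quoted above; it treats the cost as exactly 15 cents and ignores the split between input, output, and cached tokens, which the summary does not give.

```python
tokens = 50_000   # tokens consumed by the Scout run, as quoted above
cost_usd = 0.15   # approximate cost quoted above

# Blended cost per million tokens implied by those two numbers.
rate_per_million = cost_usd / tokens * 1_000_000
print(f"about ${rate_per_million:.2f} per million tokens")
```

That works out to roughly $3 per million tokens across the whole run.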

5 Live Demo — Worker Sub-Agent ▶ 7:13 ▶ 10:07

Real task: build a web UI (FastAPI + React) for an app that takes audio, removes silences, and outputs FCPXML.

Worker received the plan, delegated phases to sub-workers
Workers dispatched in parallel for efficiency
A worker spawned its own Scout (nested sub-agent) — displayed with nested indentation
Scout returned info → Worker implemented code → caught and fixed a bug
Final result: working UI that produced a valid FCPXML file importable into video editing software
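The parallel dispatch pattern in this demo can be sketched with `asyncio`. Everything here is a stand-in: `run_worker` simulates a worker session with a short sleep, and the phase names are invented for illustration, since the summary does not list the actual phases.

```python
import asyncio


async def run_worker(phase: str) -> str:
    """Hypothetical stand-in for one worker sub-agent session."""
    await asyncio.sleep(0.01)  # simulate the worker doing its phase
    return f"done: {phase}"


async def delegate(phases: list[str]) -> list[str]:
    # gather() runs all workers concurrently and returns their results
    # in the same order as the phases were given.
    return await asyncio.gather(*(run_worker(phase) for phase in phases))


# Invented phase names, for illustration only.
phases = ["FastAPI backend", "React frontend", "FCPXML export wiring"]
results = asyncio.run(delegate(phases))

for line in results:
    print(line)
```

Running the phases concurrently only makes sense when they do not depend on each other's output; dependent phases would have to be awaited in sequence instead.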

"Use worker sub-agents. Cut the planning to phases and delegate each phase to a worker sub-agent to actually build it out. Don't write the code yourself. Delegate everything to workers."

6 Stress Test — 6 Levels Deep ▶ 10:08 ▶ 12:27

Experiment: allowing workers to spawn their own workers by adding worker to the sub-agents front matter field.

Prompted the agent to spawn workers inside workers, 6 levels deep
Each level reported in, nicely nested in the UI
All levels resolved their tool calls successfully
Conclusion: in theory there is no cap on sub-agent depth
The observability design (tool calls + prose thinking) proved sufficient even at deep nesting
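The nested display described here amounts to a depth-first walk over a tree of runs, indenting one step per level. The sketch below assumes a hypothetical `Run` record and a made-up status label; it is not the extension's actual rendering code.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """Hypothetical record of one sub-agent run and the runs it spawned."""
    agent: str
    status: str
    children: list["Run"] = field(default_factory=list)


def render(run: Run, indent: int = 0) -> list[str]:
    """Return one line per run, indented two spaces per nesting level."""
    lines = [f"{'  ' * indent}{run.agent} [{run.status}]"]
    for child in run.children:
        lines.extend(render(child, indent + 1))
    return lines


def nest(depth: int) -> Run:
    """Build a chain of workers `depth` levels deep, each spawning the next."""
    run = Run(f"worker L{depth}", "ok")
    for level in range(depth - 1, 0, -1):
        run = Run(f"worker L{level}", "ok", [run])
    return run


lines = render(nest(6))
print("\n".join(lines))
```

Each level appears on its own line, one indent step deeper than its parent, which keeps a 6-level chain readable at a glance.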

7 Limitations & Future Work ▶ 12:27 ▶ 13:42

Current limitation: sub-agents are NOT interactive; you can't intervene in a running sub-agent session
Desired feature: sub-agents should have their own "ask user question" tool for interactivity
Showcased another developer's implementation built on a terminal multiplexer (tmux), where each sub-agent spawns in its own interactive terminal pane
Author prefers not to use tmux but acknowledges the approach is "really cool"

🔑 Key Takeaways

Context bloat is the primary barrier to complex, long-running AI agent tasks
Sub-agents solve this by isolating context-heavy work (exploration, research) in separate processes
Cheaper models can handle mechanical sub-tasks, reducing costs significantly
Markdown-based agent definitions make the system trivially extensible
Observability (live thinking, token metrics, nested indentation) is essential for debugging and trust
The delegation pattern (plan → phase → worker) mirrors human team management
Nesting depth has no hard cap (6 levels were tested), but practical setups stay at 2–3 levels
Interactivity (human-in-the-loop for sub-agents) is the key missing feature

⏱ Timestamp Index

▶ 0:00 Context Bloat Problem
▶ 2:02 Design Philosophy
▶ 3:06 Agent Architecture
▶ 5:12 Scout & Researcher Demo
▶ 7:13 Worker Sub-Agent Demo
▶ 10:08 6-Level Stress Test
▶ 12:27 Limitations & Future Work