Harness Engineering: What Separates Top Agentic Engineers Right Now

⏱ ~17 min 🎤 Cole Medin 🏷 Harness Engineering · Agentic Coding · AI Layer · RALF Loop · Orchestration

Overview

Cole Medin breaks down harness engineering, the 2026 evolution of context engineering. The point is not just to feed context to an LLM but to build the entire wrapper around the model: rules, skills, hooks, sub-agents, and orchestration layers. The video covers two key dimensions: the AI layer within a single coding agent session, and the more powerful practice of orchestrating multiple agent sessions into automated workflows (like the RALF loop). Most importantly, Cole argues that harness engineering is a mindset: every model mistake is an opportunity to improve your harness, not a reason to wait for the next model version.

1 What Is Harness Engineering? ▶ 0:00

Cole opens by noting that "harness engineering" is rapidly becoming the next big buzzword in the AI space for 2026, much like "context engineering" was for 2025. But as with its predecessor, people are already throwing the term around without truly understanding what it means.

The core question he poses: is this skill, or even mindset, worth learning? His answer is an emphatic yes. The video promises a breakdown in under 15 minutes, with concrete examples and demos to make it tangible.

Harness engineering is the next evolution of context engineering, and it's already becoming a buzzword people use without understanding it.

2 Defining the Agent Wrapper ▶ 0:47

At its core, harness engineering is about building the wrapper around the model. Any AI agent is the combination of two things:

  • The LLM, which does the reasoning
  • The harness, which supplies the context, tools, and processes around it

While the video focuses primarily on AI coding assistants, Cole emphasizes that the concept carries over to any agent you build, for any purpose.

There are two distinct parts of harness engineering:

  • The AI layer you build within a single coding agent session
  • The orchestration of multiple agent sessions into automated workflows

An agent = LLM + Harness. The harness is everything you build around the model, and it's the part you actually control.
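As a rough illustration of the "LLM + harness" split, here is a minimal Python sketch. Every name in it (Harness, Agent, echo_model) is hypothetical and invented for this example; it is not code from the video or from any coding agent. It only shows the shape of the idea: the model is a function from prompt to text, and the harness decides what goes into that prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for a model call: takes a prompt, returns text.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: rules (context) and tools (actions)."""
    rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def build_prompt(self, task: str) -> str:
        # Context injection: the rules and tool names are placed ahead of the task.
        rule_text = "\n".join(f"- {rule}" for rule in self.rules)
        tool_text = ", ".join(sorted(self.tools)) or "none"
        return f"Rules:\n{rule_text}\nTools: {tool_text}\nTask: {task}"


@dataclass
class Agent:
    """An agent is just a model plus the harness wrapped around it."""
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        return self.model(self.harness.build_prompt(task))


def echo_model(prompt: str) -> str:
    # Fake model so the sketch runs without any API key or network access.
    return f"[model saw {len(prompt.splitlines())} lines]"
```

Swapping echo_model for a real model call would leave the Harness untouched, which is the sense in which the harness is the part you control.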

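3 The Three Layers of AI ▶ 1:55

Cole presents a layered diagram of the architecture. From the inside out:

  1. The LLM at the core
  2. The coding agent tool (Claude Code, Codex), which acts as the first harness
  3. Your AI layer, the outermost wrapper that you build and evolve

The six components that make up your AI layer:

  1. Global rules
  2. Skills
  3. MCP servers
  4. Codebase searching (LSP, knowledge graphs)
  5. Hooks
  6. Sub-agents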

"We take for granted all of the capabilities that AI coding assistants give to the model out of the box. An LLM by itself doesn't have any way to access a file system or run any commands."

4 Harness vs. Context Engineering ▶ 4:39

Cole addresses the elephant in the room: isn't this just context engineering? The answer is "yes, to an extent", and that's exactly why it's becoming a buzzword. Most people don't understand the true evolution.

He outlines two key distinctions, referencing a Martin Fowler article:

The article breaks the harness down into: context injection, actions (tools, MCPs), persistence, observability, and control. The first four are essentially context engineering. The fifth, control, is what makes harness engineering genuinely new.

Context engineering ⊂ Harness engineering. The "control" layer (orchestrating sessions and RALF loops) is the true evolution.

5 The Harness Engineering Mindset ▶ 5:46

Beyond being a skill, harness engineering is fundamentally a mindset reframe. Cole quotes from an article by Addy Osmani:

"There's a pattern I watch engineers fall into. The agent does something dumb, the engineer blames the model, and the blame gets filed under 'wait for the next version.'"

This is the anti-pattern: Claude messes up → "let's wait for Opus 5." GPT fails → "let's wait for GPT-6." Cole admits he's tempted to think this way too. But the harness engineering mindset rejects that default.

Instead, the approach is what Cole calls "system evolution":

  • A convention gets missed? Add it to your rules.
  • The agent runs a destructive command? Add a hook.
  • Each session should leave your harness better than before.

This creates a virtuous cycle. You become the human steering the system: you feed forward principles and context to guide generation, and you add sensors (hooks, review agents, skills) for feedback and self-correction, evolving your AI layer over time.

"Every mistake becomes an opportunity to improve your harness. You're taking ownership and improving the performance of your coding agent over time with the AI layer that you control."

6 Components of the AI Layer: Companion Repo ▶ 9:25

Cole walks through his companion repo (harness-engineering-demo), which provides a concrete, reusable AI layer template. Here are the key components:

Rules (Foundation): global rules that record the conventions the agent must follow.

Skills (Workflows): reusable workflows such as the plan and implement skills, each run in its own session.

Hooks (Underused Power): pre-tool-use security checks, stop validation that forces tests to pass, and post-edit linting.

The repo also includes instructions for running a basic PIT (Plan → Implement → Test) workflow manually: plan with the plan skill, iterate, produce a markdown document, then hand it off to the implement skill in a separate session.
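The handoff between the planning and implementation sessions can be pictured as one session writing a markdown file and a fresh session reading it. The sketch below is only an illustration under that assumption: plan_session and implement_session are hypothetical stand-ins for real agent sessions, and the plan format (a checkbox list) is invented for the example rather than taken from the repo.

```python
from pathlib import Path


def plan_session(requirement: str, plan_path: Path) -> None:
    """Stand-in for a planning session: its only output is the plan artifact."""
    plan_path.write_text(
        f"# Plan\n\nRequirement: {requirement}\n\n- [ ] Step 1\n- [ ] Step 2\n",
        encoding="utf-8",
    )


def implement_session(plan_path: Path) -> str:
    """Stand-in for a fresh session whose only context is the plan document."""
    plan = plan_path.read_text(encoding="utf-8")
    # Count the unchecked steps the implementer would have to work through.
    steps = [line for line in plan.splitlines() if line.startswith("- [ ]")]
    return f"implemented {len(steps)} step(s) from {plan_path.name}"
```

Because the second function receives nothing but the file path, none of the planning conversation leaks into the implementation session, which keeps that session focused.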

"Hooks are honestly pretty underused. I love using hooks for security (pre-tool-use), stop validation (force tests to pass), and post-edit linting."
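To make two of those hook types concrete, here is a minimal Python sketch of the decision logic only. It assumes nothing about how a particular coding agent registers or invokes hooks, since that differs per tool; the function names and the deny-list patterns are hypothetical and would need tuning for a real project.

```python
import re
from typing import Tuple

# Illustrative deny-list (an assumption for this sketch, not a complete policy).
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-(rf|fr)\b",              # recursive force delete
    r"\bgit\s+push\s+.*--force\b",     # force push
    r"\bdrop\s+(table|database)\b",    # destructive SQL
]


def pre_tool_use(command: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a shell command the agent wants to run."""
    for pattern in DESTRUCTIVE_PATTERNS:
        if re.search(pattern, command, flags=re.IGNORECASE):
            return False, f"blocked by pattern: {pattern}"
    return True, "ok"


def stop_validation(test_exit_code: int) -> bool:
    """Allow the agent to declare the task done only if the test run succeeded."""
    return test_exit_code == 0
```

A wrapper script would call pre_tool_use before executing a command and stop_validation after running the test suite; how the result is reported back to the agent is left out because it is tool-specific.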

7 Orchestrating Coding Agent Sessions ▶ 12:52

This is what Cole calls the "peak evolution of harness engineering", the real power move. The core idea: never hand a massive PRD to a single agent session. Split the work across separate sessions, each with one focused task, and pass artifacts between them.

The example harness workflow Cole describes:

  1. Explore: one agent explores the implementation from a user requirement
  2. Plan: one agent writes the plan
  3. Implement: one agent handles the implementation
  4. Review (parallel): multiple review agents run simultaneously, each with a different focus:
    • Security review agent
    • Correctness review agent
    • Simplicity review agent
  5. Decision gate: if all reviews pass → create the PR. Otherwise → iterate back to implementation.

You can do all of this manually (opening separate Claude Code sessions, copying plan documents between them), but the real power of harness engineering is automating the entire pipeline.
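An automated version of the explore, plan, implement, review, and decision-gate workflow might look roughly like the following Python sketch. It is written under several assumptions: run_session is a hypothetical callable standing in for "start a coding agent session with this prompt and return its output", a review counts as passed when its output starts with "PASS", and the max_iterations cap is added here as a safety limit rather than taken from the video.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical signature: a session takes a prompt and returns its output text.
Session = Callable[[str], str]

REVIEW_FOCUSES = ["security", "correctness", "simplicity"]


def run_pipeline(requirement: str, run_session: Session,
                 max_iterations: int = 3) -> Dict[str, object]:
    """Explore, plan, then loop implement -> parallel review until all pass."""
    findings = run_session(f"explore: {requirement}")
    plan = run_session(f"plan: {findings}")
    feedback: List[str] = []
    for iteration in range(1, max_iterations + 1):
        impl = run_session(f"implement: {plan}\nfeedback: {feedback}")
        # Each review agent gets one focus; the three run concurrently.
        with ThreadPoolExecutor(max_workers=len(REVIEW_FOCUSES)) as pool:
            reviews = list(pool.map(
                lambda focus: run_session(f"review[{focus}]: {impl}"),
                REVIEW_FOCUSES,
            ))
        feedback = [review for review in reviews if not review.startswith("PASS")]
        if not feedback:
            # Decision gate: every review passed, so the PR can be created.
            return {"status": "create_pr", "iterations": iteration}
    return {"status": "needs_human", "iterations": max_iterations}


def fake_session(prompt: str) -> str:
    """Deterministic stand-in: the security review fails the first attempt."""
    if prompt.startswith("implement:"):
        return "impl v2" if "FAIL" in prompt else "impl v1"
    if prompt.startswith("review[security]") and prompt.endswith("impl v1"):
        return "FAIL security: missing input validation"
    if prompt.startswith("review["):
        return "PASS"
    return f"notes for {prompt.split(':', 1)[0]}"
```

With fake_session, the first implementation fails the security review, the feedback is passed into a second implementation session, and the gate opens on the second iteration. Each call to run_session receives only the artifact it needs, which is the focused-task principle from the section above.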

"If you send too much into the LLM at once, it is going to fall flat on its face. Give each coding agent a very focused task."

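8 Automating with the RALF Loop ▶ 14:34

The RALF loop (usually written "Ralph loop", created by Geoffrey Huntley) is one of the first and most influential examples of an agent harness that automates multi-session orchestration, and Cole walks through it.

How it works: a simple script splits a large scope of work into tasks and runs coding agent sessions over them iteratively.

The loop mechanics: the script keeps starting new sessions and exits only when all specs are met and validation passes.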
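The loop described in this section can be sketched in a few lines of Python. This is an illustration of the general pattern, not Huntley's actual script: task_loop, run_session, and validate are hypothetical names, tasks are handled one at a time in order, and the max_iterations cap is an assumption added so the sketch cannot spin forever.

```python
from typing import Callable, List


def task_loop(tasks: List[str],
              run_session: Callable[[str], None],
              validate: Callable[[str], bool],
              max_iterations: int = 10) -> List[str]:
    """Run one fresh session per iteration until every task validates."""
    remaining = list(tasks)
    log: List[str] = []
    for _ in range(max_iterations):
        if not remaining:
            break  # all tasks validated: the loop's only normal exit
        task = remaining[0]
        run_session(task)  # a new, focused session for a single task
        if validate(task):
            remaining.pop(0)
            log.append(f"done: {task}")
        else:
            log.append(f"retry: {task}")  # same task again next iteration
    if remaining:
        raise RuntimeError(f"gave up with tasks left: {remaining}")
    return log
```

The important property is that validation, not the agent's own claim of being finished, decides when a task leaves the queue.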

Cole also mentions Archon, his open-source harness builder, as the easiest way to get started building custom harnesses tailored to your exact process and software development lifecycle.

"We are using many coding agent sessions to keep each one very focused, but also we're automating it so we don't have to babysit our coding agent. This really is the future of agentic engineering."

🎯 Key Takeaways

  1. Harness = Everything Around the Model: Any agent is the combination of the LLM (reasoning) plus the harness (context, tools, processes). The harness is the part that matters most and the part you control.
  2. Three Layers of AI Agents: The LLM at the core, the coding agent tool (Claude Code, Codex) as the first harness, and your AI layer as the ultimate wrapper that you build and evolve.
  3. Six AI Layer Components: Global rules, skills, MCP servers, codebase searching (LSP/knowledge graphs), hooks, and sub-agents. Every process you inject goes through one of these six channels.
  4. Context Engineering ⊂ Harness Engineering: Most of the harness IS context engineering. The new element is "control": orchestrating sessions, RALF loops, and sub-agents.
  5. Mindset Over Skill: The harness engineering mindset rejects "blame the model, wait for the next version." Instead, every mistake becomes an opportunity to improve your harness.
  6. System Evolution: Treat your AI layer as a living system. Convention missed? Add it to rules. Destructive command? Add a hook. Each session should leave your harness better than before.
  7. Separate Sessions for Plan, Implement, Validate: Keep each coding agent session focused and token-efficient by using separate skills/sessions for planning, implementation, and validation, with artifact handoffs between them.
  8. Hooks Are Underused: Pre-tool-use security hooks, stop validation hooks (force tests to pass before "done"), and post-edit lint hooks are powerful, but most engineers don't use them enough.
  9. Don't Overwhelm a Single Session: Never hand a massive PRD to one agent session. No matter how good your AI layer is, too much context will make the LLM "fall flat on its face."
  10. Orchestrate Multiple Sessions: The peak of harness engineering is automating pipelines where focused agent sessions handle explore → plan → implement → review → PR creation.
  11. The RALF Loop Pattern: A simple script that splits a large scope of work into tasks, runs coding agent sessions iteratively, and only exits when all specs are met and validation passes.
  12. Building Harnesses Is the Future: As models and tools get more powerful, the competitive advantage shifts to who builds the best harness, the orchestration layer that makes agents reliably tackle larger scopes of work.

🔗 Resources & Links

⏱ Timestamp Index

▶ 0:00 What is Harness Engineering?
▶ 0:47 Defining the Agent Wrapper
▶ 1:55 The Three Layers of AI
▶ 4:39 Harness vs. Context Engineering
▶ 5:46 The Harness Engineering Mindset
▶ 7:45 Sponsor: Google Cloud Agents CLI
▶ 9:25 Components of the AI Layer
▶ 12:52 Orchestrating Coding Agent Sessions
▶ 14:34 Automating with the RALF Loop
▶ 16:34 Final Thoughts