Multi-Agent Teams

AI agents reading files one at a time build mental models from fragments. On a 200-file repo, they never see architecture - they see one street sign at a time and try to navigate a city. They create utilities that already exist, refactor functions without knowing their callers, and write exports that are never wired up.

This is the visibility problem. Pharaoh fixes it with a knowledge graph.

But there's a second problem: a single agent is grading its own homework. It writes code, checks it, moves on. Technical debt migrates to the end of the project. By the time review happens, the cleanup is massive.

Multi-agent teams fix both problems. Different agents specialize in different tasks. Pharaoh's skill library gives each agent the architectural context it needs to do its job.

The wiring problem

AI agents have a systematic failure mode that doesn't show up in tests. They create dead exports: functions that compile and pass their tests but are never called from any entry point. The code sits there, unreachable, adding complexity.

Pharaoh's dead code detector (get_unused_code) finds this pattern in nearly every AI-assisted codebase. The fix is an iron law built into pharaoh:plan: every new export in a plan must have a declared caller. If a function has no caller, it's not part of the plan.
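
The rule can be pictured as a simple check over a plan. The sketch below is illustrative only: PlannedExport, unwired_exports, and the sample export names are hypothetical, and this is not how pharaoh:plan stores or validates plans internally.

```python
from dataclasses import dataclass, field


@dataclass
class PlannedExport:
    """One new export declared in a plan (hypothetical structure)."""
    name: str
    callers: list[str] = field(default_factory=list)


def unwired_exports(plan: list[PlannedExport]) -> list[str]:
    """Return the names of planned exports that declare no caller."""
    return [export.name for export in plan if not export.callers]


# Illustrative plan: one wired export, one export with no declared caller.
plan = [
    PlannedExport("parse_config", callers=["cli.main"]),
    PlannedExport("format_report"),
]

violations = unwired_exports(plan)
if violations:
    print("Plan rejected, exports without callers:", ", ".join(violations))
```

Rejecting the plan at this stage is cheaper than finding the dead export later with get_unused_code.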

Adversarial review

One agent builds, a different agent reviews. Different model, different provider, different blind spots. They go back and forth through a structured handoff until the reviewer approves.

Why different providers? Claude might miss what GPT catches, and GPT might miss what Claude catches. Cross-model review covers more surface area than any single model reviewing its own output.

The mechanism is simple. A handoff file sits between builder and reviewer. Builder commits, writes status: pending-qa. Reviewer reads the diff, writes findings or status: approved. A Stop hook blocks the coordinator from exiting until the loop resolves.
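
A minimal sketch of the exit check, assuming the handoff file is plain text containing lines of the form "status: <value>". The file layout, the function names, and the sample contents are assumptions made for illustration; the actual format is defined by the playbook configs.

```python
import re
from typing import Optional

# Assumed format: a line such as "status: pending-qa" or "status: approved".
STATUS_LINE = re.compile(r"^status:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def read_status(handoff_text: str) -> Optional[str]:
    """Return the most recent status value in the handoff text, or None."""
    matches = STATUS_LINE.findall(handoff_text)
    return matches[-1] if matches else None


def should_block_exit(handoff_text: str) -> bool:
    """Block until the reviewer has approved.

    Assumption: a missing status counts as unresolved, so it also blocks.
    """
    return read_status(handoff_text) != "approved"


# Illustrative handoff contents, before and after the reviewer signs off.
pending = "commit: abc123\nstatus: pending-qa\n"
approved = pending + "findings: none\nstatus: approved\n"

print(should_block_exit(pending))   # True
print(should_block_exit(approved))  # False
```

Reading the last status line rather than the first lets builder and reviewer append to the same file across several rounds.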

When the builder uses pharaoh:plan and the reviewer uses pharaoh:review, both agents reason from the same architectural data. The builder is creative. The reviewer is skeptical. The graph keeps both honest.

Playbook: Code Review Team

Three agents. A coordinator triages PRs and delegates structural analysis to a Pharaoh specialist and code-level review to a reviewer.

| Agent | Role | Pharaoh Skills |
|---|---|---|
| Coordinator | Triage PRs, synthesize verdict | pharaoh (core) |
| Pharaoh Specialist | Blast radius, regression risk, wiring, spec alignment | pharaoh, pharaoh:review |
| Reviewer | Logic, style, tests, security | pharaoh (core) |

All three agents are read-only - none can modify code during review.

The workflow:

  1. Coordinator reads the PR diff.

  2. Coordinator delegates to Pharaoh Specialist: "Analyze structural impact of these changes."

  3. Coordinator delegates to Reviewer: "Review code changes for logic errors and security."

  4. Both report back.

  5. Coordinator synthesizes a verdict: SHIP / SHIP WITH CHANGES / BLOCK.

Auto-block triggers (any of these = BLOCK):

  • Unreachable exports (new code with zero callers)

  • New circular dependencies

  • HIGH regression risk without test coverage

  • Vision spec violations
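
The verdict logic can be sketched as follows. The four auto-block conditions come from the list above; the ReviewFindings structure and the rule that any non-blocking finding yields SHIP WITH CHANGES are assumptions for illustration, not the playbook's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class ReviewFindings:
    """Combined reports from both agents (hypothetical structure)."""
    unreachable_exports: int = 0
    new_circular_dependencies: int = 0
    high_risk_untested: bool = False  # HIGH regression risk without test coverage
    vision_spec_violations: int = 0
    requested_changes: int = 0        # non-blocking findings (assumed field)


def synthesize_verdict(findings: ReviewFindings) -> str:
    """Return SHIP, SHIP WITH CHANGES, or BLOCK."""
    auto_block = (
        findings.unreachable_exports > 0
        or findings.new_circular_dependencies > 0
        or findings.high_risk_untested
        or findings.vision_spec_violations > 0
    )
    if auto_block:
        return "BLOCK"
    # Assumption: any remaining finding downgrades SHIP to SHIP WITH CHANGES.
    if findings.requested_changes > 0:
        return "SHIP WITH CHANGES"
    return "SHIP"


print(synthesize_verdict(ReviewFindings(unreachable_exports=1)))  # BLOCK
```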

Full config: code-review-team playbook

Playbook: Feature Development

Three agents with a TDD workflow. No code is written until the plan passes adversarial review.

| Agent | Role | Pharaoh Skills |
|---|---|---|
| Planner | Architecture recon, approach selection, wired plan | pharaoh, pharaoh:plan, pharaoh:brainstorm |
| Tester | Write failing tests first (TDD) | pharaoh, pharaoh:tdd, pharaoh:verify |
| Coder | Minimal code to pass tests | pharaoh, pharaoh:execute, pharaoh:wiring |

The workflow:

  1. Planner runs pharaoh:plan - recon, analysis, approach selection, step-by-step plan with wiring declarations.

  2. For each step: Tester writes a failing test. Coder implements minimal code to pass it.

  3. After implementation, Coder runs pharaoh:wiring to verify all exports are connected.

  4. Planner does a final review: does the implementation match the plan?
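
Step 2 depends on a red-then-green gate: the test must fail before the Coder acts and pass afterwards. A toy sketch of that gate, where run_tdd_step, the in-memory codebase dict, and the slugify example are all hypothetical and stand in for real test runs and commits:

```python
from typing import Callable


def run_tdd_step(test: Callable[[], bool], implement: Callable[[], None]) -> None:
    """Enforce red-then-green for one plan step."""
    if test():
        raise RuntimeError("test passed before implementation; it pins no new behavior")
    implement()
    if not test():
        raise RuntimeError("test still fails after implementation")


# Toy stand-in for a repository: a dict of named functions.
codebase = {}


def slugify_test() -> bool:
    """Tester's role: passes only once slugify exists and behaves as planned."""
    fn = codebase.get("slugify")
    return fn is not None and fn("Hello World") == "hello-world"


def implement_slugify() -> None:
    """Coder's role: the minimal code that makes the test pass."""
    codebase["slugify"] = lambda s: s.lower().replace(" ", "-")


run_tdd_step(slugify_test, implement_slugify)
```

Checking that the test fails first guards against tests that pass trivially and therefore verify nothing.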

The planner is read-only. It designs but never writes code. Add a cross-model reviewer between the coder and a QA agent and you get TDD with architectural awareness and adversarial review on every commit.

Full config: feature-development playbook

Playbook: Codebase Onboarding

A single agent for rapid architecture orientation. Uses only free-tier Pharaoh tools.

The workflow:

  1. get_codebase_map - full module landscape.

  2. get_module_context on the 3 largest modules.

  3. search_functions for entry points ("main", "init", "start", "handler").

  4. get_blast_radius on the most-connected module.

  5. query_dependencies between the two most-connected modules.

Output: module map, entry points, core data flow, key functions to read first.
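
The five steps can be sketched as one function. The tool names come from the workflow above; everything else is an assumption made for illustration: call_tool stands in for whatever MCP client your framework provides, and the argument names ("module", "query", "target", "source") and the shape of the codebase map are hypothetical, so check the Pharaoh tool schemas before relying on them.

```python
from typing import Any, Callable

ToolCall = Callable[[str, dict], Any]  # (tool name, arguments) -> result


def onboard(call_tool: ToolCall) -> dict:
    """Run the onboarding sequence and collect the results."""
    codebase_map = call_tool("get_codebase_map", {})
    # Assumed shape: {"modules": [{"name", "size", "connections"}, ...]}
    modules = codebase_map["modules"]
    if not modules:
        raise ValueError("codebase map contains no modules")

    largest = sorted(modules, key=lambda m: m["size"], reverse=True)[:3]
    contexts = [call_tool("get_module_context", {"module": m["name"]}) for m in largest]

    entry_points = [
        call_tool("search_functions", {"query": q})
        for q in ("main", "init", "start", "handler")
    ]

    ranked = sorted(modules, key=lambda m: m["connections"], reverse=True)
    hub = ranked[0]["name"]
    blast_radius = call_tool("get_blast_radius", {"target": hub})

    dependencies = None
    if len(ranked) > 1:
        dependencies = call_tool(
            "query_dependencies", {"source": hub, "target": ranked[1]["name"]}
        )

    return {
        "map": codebase_map,
        "contexts": contexts,
        "entry_points": entry_points,
        "blast_radius": blast_radius,
        "dependencies": dependencies,
    }


# Stub client that records calls, so the sequence can be tried without a server.
calls = []


def fake_call_tool(name: str, args: dict) -> Any:
    calls.append(name)
    if name == "get_codebase_map":
        return {"modules": [
            {"name": "api", "size": 40, "connections": 9},
            {"name": "db", "size": 25, "connections": 7},
            {"name": "cli", "size": 10, "connections": 2},
            {"name": "utils", "size": 5, "connections": 4},
        ]}
    return {"tool": name, "args": args}


summary = onboard(fake_call_tool)
print(calls)
```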

This is useful for bootstrapping any agent that needs to understand a repo before working on it. Point the onboarder at the repo first, then hand the summary to your implementation agents.

Full config: codebase-onboarding playbook

Playbook: Tech Debt Sprint

Two agents. The auditor diagnoses, the fixer implements.

| Agent | Role | Pharaoh Skills |
|---|---|---|
| Auditor | Full codebase audit, A-F grading, prioritized actions | pharaoh, pharaoh:health, pharaoh:debt, pharaoh:audit-tests |
| Fixer | Implements cleanup with blast-radius verification | pharaoh, pharaoh:refactor, pharaoh:verify, pharaoh:wiring |

The workflow:

  1. Auditor runs pharaoh:health - grades the codebase A-F, finds top 5 risks and tech debt hotspots.

  2. Auditor runs pharaoh:debt - categorizes findings: DELETE, CONSOLIDATE, DOCUMENT, STABILIZE, TEST.

  3. Auditor hands highest-priority items to Fixer.

  4. Fixer runs pharaoh:refactor before every change (blast radius check), then pharaoh:verify after.

  5. Auditor re-runs health check - did the grade improve?
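
The handoff in step 3 and the comparison in step 5 can be sketched as below. The five categories come from step 2; the Finding structure, its priority field, the batch size, and the assumption that the grade scale is A, B, C, D, F with no E are all hypothetical.

```python
from dataclasses import dataclass

GRADES = "FDCBA"  # worst to best, so a higher index means a better grade
CATEGORIES = ("DELETE", "CONSOLIDATE", "DOCUMENT", "STABILIZE", "TEST")


@dataclass
class Finding:
    """One audit finding (hypothetical structure)."""
    target: str
    category: str
    priority: int  # assumed convention: lower number = more urgent


def next_batch(findings: list[Finding], size: int = 3) -> list[Finding]:
    """Pick the most urgent findings to hand to the Fixer."""
    unknown = sorted({f.category for f in findings if f.category not in CATEGORIES})
    if unknown:
        raise ValueError(f"unknown categories: {unknown}")
    return sorted(findings, key=lambda f: f.priority)[:size]


def grade_improved(before: str, after: str) -> bool:
    """True if the re-run health check produced a strictly better grade."""
    return GRADES.index(after) > GRADES.index(before)


# Illustrative findings; the targets are made up.
findings = [
    Finding("legacy/export.py", "DELETE", priority=2),
    Finding("auth/session.py", "TEST", priority=1),
    Finding("utils/strings.py", "CONSOLIDATE", priority=3),
]

print([f.target for f in next_batch(findings, size=2)])
print(grade_improved("C", "B"))  # True
```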

The auditor is read-only. It diagnoses but never modifies. Add a weekly heartbeat and the auditor runs automatically, reporting the grade and top risks.

Full config: tech-debt-sprint playbook

Setting up a multi-agent team

All playbooks follow the same pattern:

  1. Install Pharaoh skills:

  2. Copy the openclaw.json snippet from the playbook into your config.

  3. Copy the per-agent AGENTS.md files into your workspace directory.

  4. Connect Pharaoh to your repo (install the GitHub App or use the upload path).

  5. Run it:

Each playbook includes the complete gateway configuration, agent workspace files, and MCP server setup. Copy, adjust to your repo, run.

Tips

  • Start with a single agent + Pharaoh before adding multi-agent complexity. One agent with pharaoh:plan and pharaoh:review already catches most architectural issues.

  • Use different model providers for builder and reviewer agents. Same-model review has correlated blind spots.

  • The onboarding playbook is useful as a first step for any other playbook - run it to prime the agents before starting implementation or review work.

  • All Pharaoh skills are platform-agnostic. The playbooks show OpenClaw configs, but the skills work with any agent framework that supports MCP.
