# Multi-Agent Teams

AI agents reading files one at a time build mental models from fragments. On a 200-file repo, they never see architecture - they see one street sign at a time and try to navigate a city. They create utilities that already exist, refactor functions without knowing their callers, and write exports that are never wired up.

This is the visibility problem. Pharaoh fixes it with a knowledge graph.

But there's a second problem: a single agent is grading its own homework. It writes code, checks it, moves on. Technical debt migrates to the end of the project. By the time review happens, the cleanup is massive.

Multi-agent teams backed by Pharaoh address both problems. Different agents specialize in different tasks, so no agent reviews its own work, and Pharaoh's skill library gives each agent the architectural context it needs to do its job.

## The wiring problem

AI agents have a systematic failure mode that doesn't show up in tests. They create dead exports - functions that compile and pass tests but are never called from any entry point. The code sits there, unreachable, adding complexity.

Pharaoh's dead code detector (`get_unused_code`) finds this pattern in nearly every AI-assisted codebase. The fix is an iron law built into `pharaoh:plan`: every new export in a plan must have a declared caller. If a function has no caller, it's not part of the plan.
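
The rule can be pictured as a simple check over a plan. This is a minimal sketch, not how `pharaoh:plan` is implemented: the `PlannedExport` structure and the function names in the example are hypothetical, and a real plan's format may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlannedExport:
    """One new export in a plan (hypothetical structure, for illustration)."""
    name: str
    callers: List[str] = field(default_factory=list)


def unwired_exports(plan: List[PlannedExport]) -> List[str]:
    """Return the names of planned exports that declare no caller."""
    return [export.name for export in plan if not export.callers]


plan = [
    PlannedExport("send_renewal_email", callers=["handle_subscription_renewed"]),
    PlannedExport("format_currency"),  # no declared caller, so it fails the rule
]

print(unwired_exports(plan))  # prints ['format_currency']
```

Anything the check returns either gets a declared caller or comes out of the plan.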

## Adversarial review

One agent builds, a different agent reviews. Different model, different provider, different blind spots. They go back and forth through a structured handoff until the reviewer approves.

Why different providers? Each model has its own blind spots. Claude might miss what GPT catches. GPT might miss what Claude catches. Cross-model review covers more surface area than any single model reviewing its own output.

The mechanism is simple. A handoff file sits between builder and reviewer. Builder commits, writes `status: pending-qa`. Reviewer reads the diff, writes findings or `status: approved`. A Stop hook blocks the coordinator from exiting until the loop resolves.
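
As a rough sketch of the gate, assume the handoff file is plain text containing a `status:` line. That file layout and the function names below are assumptions for illustration; the actual handoff format and hook wiring come from the playbook config.

```python
import re
from typing import Optional


def read_status(handoff_text: str) -> Optional[str]:
    """Extract the value of the first `status:` line, or None if there is none."""
    match = re.search(r"^status:[ \t]*(\S+)[ \t]*$", handoff_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def may_exit(handoff_text: str) -> bool:
    """Stop-hook style check: allow exit only once the reviewer has approved."""
    return read_status(handoff_text) == "approved"


# Builder has committed and is waiting on review (hypothetical file contents).
pending = "commit: abc123\nstatus: pending-qa\n"
approved = "commit: abc123\nstatus: approved\n"

print(may_exit(pending), may_exit(approved))
```

A hook built on this idea would keep the coordinator running while the status is anything other than `approved`, including when the file has no status at all.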

When the builder uses `pharaoh:plan` and the reviewer uses `pharaoh:review`, both agents reason from the same architectural data. The builder is creative. The reviewer is skeptical. The graph keeps both honest.

## Playbook: Code Review Team

Three agents. A coordinator triages PRs and delegates structural analysis to a Pharaoh specialist and code-level review to a reviewer.

| Agent              | Role                                                  | Pharaoh Skills              |
| ------------------ | ----------------------------------------------------- | --------------------------- |
| Coordinator        | Triage PRs, synthesize verdict                        | `pharaoh` (core)            |
| Pharaoh Specialist | Blast radius, regression risk, wiring, spec alignment | `pharaoh`, `pharaoh:review` |
| Reviewer           | Logic, style, tests, security                         | `pharaoh` (core)            |

All three agents are read-only - none can modify code during review.

**The workflow:**

1. Coordinator reads the PR diff.
2. Coordinator delegates to Pharaoh Specialist: "Analyze structural impact of these changes."
3. Coordinator delegates to Reviewer: "Review code changes for logic errors and security."
4. Both report back.
5. Coordinator synthesizes a verdict: SHIP / SHIP WITH CHANGES / BLOCK.

**Auto-block triggers** (any of these = BLOCK):

* Unreachable exports (new code with zero callers)
* New circular dependencies
* HIGH regression risk without test coverage
* Vision spec violations
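
The verdict logic might look like the sketch below. The `StructuralReport` fields are a hypothetical shape for the specialist's output. Only `HIGH` is named above, so any other risk level, and the mapping of reviewer findings to SHIP WITH CHANGES, are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StructuralReport:
    """Hypothetical summary of the Pharaoh Specialist's findings."""
    unreachable_exports: List[str]
    new_circular_dependencies: List[str]
    regression_risk: str  # "HIGH" triggers a block when tests are missing
    has_test_coverage: bool
    vision_spec_violations: List[str]


def auto_block_reasons(report: StructuralReport) -> List[str]:
    """List every auto-block trigger that fired for this report."""
    reasons = []
    if report.unreachable_exports:
        reasons.append("unreachable exports")
    if report.new_circular_dependencies:
        reasons.append("new circular dependencies")
    if report.regression_risk == "HIGH" and not report.has_test_coverage:
        reasons.append("HIGH regression risk without test coverage")
    if report.vision_spec_violations:
        reasons.append("vision spec violations")
    return reasons


def synthesize_verdict(report: StructuralReport, reviewer_findings: List[str]) -> str:
    """Any auto-block trigger wins; otherwise reviewer findings decide."""
    if auto_block_reasons(report):
        return "BLOCK"
    return "SHIP WITH CHANGES" if reviewer_findings else "SHIP"
```

Returning the list of reasons, rather than a bare boolean, lets the coordinator explain a BLOCK in its verdict.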

Full config: [code-review-team playbook](https://github.com/Pharaoh-so/pharaoh/tree/main/docs/openclaw/playbooks/code-review-team.md)

## Playbook: Feature Development

Three agents with a TDD workflow. No code is written until the plan passes adversarial review.

| Agent   | Role                                               | Pharaoh Skills                                  |
| ------- | -------------------------------------------------- | ----------------------------------------------- |
| Planner | Architecture recon, approach selection, wired plan | `pharaoh`, `pharaoh:plan`, `pharaoh:brainstorm` |
| Tester  | Write failing tests first (TDD)                    | `pharaoh`, `pharaoh:tdd`, `pharaoh:verify`      |
| Coder   | Minimal code to pass tests                         | `pharaoh`, `pharaoh:execute`, `pharaoh:wiring`  |

**The workflow:**

1. Planner runs `pharaoh:plan` - recon, analysis, approach selection, step-by-step plan with wiring declarations.
2. For each step: Tester writes a failing test. Coder implements minimal code to pass it.
3. After implementation, Coder runs `pharaoh:wiring` to verify all exports are connected.
4. Planner does a final review: does the implementation match the plan?

The planner is read-only. It designs but never writes code. Add a cross-model reviewer between the coder and a QA agent and you get TDD with architectural awareness and adversarial review on every commit.

Full config: [feature-development playbook](https://github.com/Pharaoh-so/pharaoh/tree/main/docs/openclaw/playbooks/feature-development.md)

## Playbook: Codebase Onboarding

A single agent for rapid architecture orientation. Uses only free-tier Pharaoh tools.

**The workflow:**

1. `get_codebase_map` - full module landscape.
2. `get_module_context` on the three largest modules.
3. `search_functions` for entry points ("main", "init", "start", "handler").
4. `get_blast_radius` on the most-connected module.
5. `query_dependencies` between the two most-connected modules.

**Output:** module map, entry points, core data flow, key functions to read first.
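
Steps 2 through 5 depend on what step 1 returns. The sketch below shows one way to derive the follow-up calls from a module map. The tool names come from the workflow above, but the map shape (`size`, `connections`) and every argument key are hypothetical, not Pharaoh's actual tool schemas.

```python
from typing import Dict, List, Tuple

Call = Tuple[str, Dict[str, str]]


def plan_followup_calls(modules: Dict[str, Dict[str, int]]) -> List[Call]:
    """Build steps 2-5 from a module map of name -> {"size", "connections"}.

    Expects at least two modules. Argument keys are placeholders.
    """
    by_size = sorted(modules, key=lambda m: modules[m]["size"], reverse=True)
    by_links = sorted(modules, key=lambda m: modules[m]["connections"], reverse=True)

    calls: List[Call] = []
    # Step 2: context for the three largest modules.
    calls += [("get_module_context", {"module": name}) for name in by_size[:3]]
    # Step 3: look for likely entry points.
    calls += [("search_functions", {"query": q}) for q in ("main", "init", "start", "handler")]
    # Step 4: blast radius of the most-connected module.
    calls.append(("get_blast_radius", {"target": by_links[0]}))
    # Step 5: dependencies between the two most-connected modules.
    calls.append(("query_dependencies", {"source": by_links[0], "target": by_links[1]}))
    return calls


modules = {
    "auth": {"size": 40, "connections": 9},
    "billing": {"size": 120, "connections": 14},
    "api": {"size": 85, "connections": 21},
    "utils": {"size": 15, "connections": 30},
}
for tool, args in plan_followup_calls(modules):
    print(tool, args)
```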

This is useful for bootstrapping any agent that needs to understand a repo before working on it. Point the onboarder at the repo first, then hand the summary to your implementation agents.

Full config: [codebase-onboarding playbook](https://github.com/Pharaoh-so/pharaoh/tree/main/docs/openclaw/playbooks/codebase-onboarding.md)

## Playbook: Tech Debt Sprint

Two agents. The auditor diagnoses, the fixer implements.

| Agent   | Role                                                  | Pharaoh Skills                                                     |
| ------- | ----------------------------------------------------- | ------------------------------------------------------------------ |
| Auditor | Full codebase audit, A-F grading, prioritized actions | `pharaoh`, `pharaoh:health`, `pharaoh:debt`, `pharaoh:audit-tests` |
| Fixer   | Implements cleanup with blast-radius verification     | `pharaoh`, `pharaoh:refactor`, `pharaoh:verify`, `pharaoh:wiring`  |

**The workflow:**

1. Auditor runs `pharaoh:health` - grades the codebase A-F, finds top 5 risks and tech debt hotspots.
2. Auditor runs `pharaoh:debt` - categorizes findings: DELETE, CONSOLIDATE, DOCUMENT, STABILIZE, TEST.
3. Auditor hands highest-priority items to Fixer.
4. Fixer runs `pharaoh:refactor` before every change (blast radius check), then `pharaoh:verify` after.
5. Auditor re-runs health check - did the grade improve?

The auditor is read-only. It diagnoses but never modifies. Add a weekly heartbeat and the auditor runs automatically, reporting the grade and top risks.
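
The handoff and the final check can be sketched in a few lines. The category names are taken from the workflow above. The finding format, the example targets, and the assumption that grades order as A through F with A best are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

CATEGORIES = ("DELETE", "CONSOLIDATE", "DOCUMENT", "STABILIZE", "TEST")
GRADE_ORDER = "ABCDEF"  # assumed ordering, best grade first


def group_by_category(findings: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (category, target) findings so the Fixer can work one bucket at a time."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for category, target in findings:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        grouped[category].append(target)
    return dict(grouped)


def grade_improved(before: str, after: str) -> bool:
    """True when the re-run health check reports a strictly better grade."""
    return GRADE_ORDER.index(after) < GRADE_ORDER.index(before)


# Hypothetical audit output.
findings = [
    ("DELETE", "legacy_export_helper"),
    ("TEST", "billing.retry_payment"),
    ("DELETE", "unused_formatters"),
]
print(group_by_category(findings))
print(grade_improved("D", "B"))
```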

Full config: [tech-debt-sprint playbook](https://github.com/Pharaoh-so/pharaoh/tree/main/docs/openclaw/playbooks/tech-debt-sprint.md)

## Setting up a multi-agent team

All playbooks follow the same pattern:

1. Install Pharaoh skills:

   ```bash
   npx @pharaoh-so/mcp --install-skills
   ```
2. Copy the `openclaw.json` snippet from the playbook into your config.
3. Copy the per-agent `AGENTS.md` files into your workspace directory.
4. Connect Pharaoh to your repo (install the GitHub App or use the upload path).
5. Run it:

   ```
   @planner Build a user notification system that sends emails on subscription renewal
   ```

Each playbook includes the complete gateway configuration, agent workspace files, and MCP server setup. Copy, adjust to your repo, run.

## Tips

* Start with a single agent plus Pharaoh before adding multi-agent complexity. One agent with `pharaoh:plan` and `pharaoh:review` already catches most architectural issues.
* Use different model providers for builder and reviewer agents. Same-model review has correlated blind spots.
* The onboarding playbook is useful as a first step for any other playbook - run it to prime the agents before starting implementation or review work.
* All Pharaoh skills are platform-agnostic. The playbooks show OpenClaw configs, but the skills work with any agent framework that supports MCP.

