
# How Pharaoh Works

Pharaoh turns codebases into queryable knowledge graphs. Three stages: parse, store, query.

## 1. Parse

When you install the Pharaoh GitHub App, it clones your repository using read-only installation tokens and parses every file with [tree-sitter](https://tree-sitter.github.io/) — the same parser used by GitHub's syntax highlighting, Neovim, and Helix.

Tree-sitter extracts structural metadata:

* **Functions:** name, signature, parameters, return type, JSDoc, complexity score, line range
* **Imports and exports:** which modules each file imports from, and which symbols it exports
* **Call chains:** which functions call which other functions
* **Endpoints:** HTTP routes, their methods, and handler files
* **Cron jobs:** scheduled tasks and their handlers
* **Environment variables:** which env vars each function uses

Currently supported: **TypeScript** and **Python**. The parser is [open source](https://github.com/Pharaoh-so/pharaoh-parser) — you can audit exactly what gets extracted.

Source code is read during parsing and immediately discarded. Only structural metadata is stored.
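
To make "structural metadata" concrete, here is a minimal sketch of the idea. It uses Python's standard-library `ast` module as a stand-in for tree-sitter, so it is not Pharaoh's parser; the `FunctionMeta` fields and the `extract_functions` helper are hypothetical names chosen for illustration.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class FunctionMeta:
    """Structural metadata for one function; no body text is kept."""
    name: str
    params: list[str]
    line_range: tuple[int, int]
    calls: list[str] = field(default_factory=list)
    env_vars: list[str] = field(default_factory=list)


def extract_functions(source: str) -> list[FunctionMeta]:
    """Walk the syntax tree and keep names, ranges, calls, and env var reads."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        meta = FunctionMeta(
            name=node.name,
            params=[a.arg for a in node.args.args],
            line_range=(node.lineno, node.end_lineno),
        )
        for child in ast.walk(node):
            # Direct calls to a plain name, e.g. send(...)
            if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                meta.calls.append(child.func.id)
            # Reads of the form os.environ["NAME"] (Python 3.9+ AST shape)
            if (
                isinstance(child, ast.Subscript)
                and isinstance(child.value, ast.Attribute)
                and child.value.attr == "environ"
                and isinstance(child.slice, ast.Constant)
            ):
                meta.env_vars.append(child.slice.value)
        results.append(meta)
    return results


SAMPLE = '''
import os

def charge(amount):
    key = os.environ["STRIPE_KEY"]
    return send(amount, key)
'''

functions = extract_functions(SAMPLE)
```

After `extract_functions` returns, the `SAMPLE` string can be dropped: the `FunctionMeta` records hold the name, parameters, line range, callees, and env var names, but none of the function's body text.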

## 2. Store

The extracted metadata is written to a [Neo4j](https://neo4j.com/) knowledge graph as nodes and relationships:

* **Nodes:** Repo, Module, File, Function, Endpoint, CronHandler, EnvVar
* **Relationships:** CONTAINS, IMPORTS, EXPORTS, CALLS, DEPENDS\_ON, USES\_ENV

This graph captures the architecture — every connection, dependency, and entry point — without storing any source code.
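
As a rough illustration of why a graph suits this data, the sketch below models a few nodes and relationships as plain triples and answers a typical architectural question: which functions are affected if one function changes. The node labels and relationship types come from the lists above; the file and function names, the `Label:name` id format, and the `transitive_callers` helper are assumptions, and a real deployment would run the equivalent query against Neo4j rather than in memory.

```python
from collections import defaultdict, deque

# (source, relationship, target) triples. Names and ids are made up.
EDGES = [
    ("File:billing.py", "CONTAINS", "Function:charge"),
    ("File:billing.py", "CONTAINS", "Function:refund"),
    ("File:api.py", "CONTAINS", "Function:checkout"),
    ("File:api.py", "IMPORTS", "File:billing.py"),
    ("Function:checkout", "CALLS", "Function:charge"),
    ("Function:charge", "CALLS", "Function:send"),
    ("Function:refund", "CALLS", "Function:send"),
    ("Function:charge", "USES_ENV", "EnvVar:STRIPE_KEY"),
]


def transitive_callers(edges, target):
    """Return every function that reaches `target` through CALLS edges."""
    callers_of = defaultdict(set)
    for src, rel, dst in edges:
        if rel == "CALLS":
            callers_of[dst].add(src)

    # Breadth-first walk backwards along CALLS edges.
    seen, queue = set(), deque([target])
    while queue:
        current = queue.popleft()
        for caller in callers_of[current]:
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


affected = transitive_callers(EDGES, "Function:send")
```

Here `affected` contains `charge`, `refund`, and `checkout`: the two direct callers of `send`, plus `checkout`, which reaches it through `charge`. Answering the same question from source files would mean reading every file that might contain a call.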

### What's in the graph

Function names, file paths, module boundaries, dependency edges, complexity scores, export signatures, endpoint routes, cron schedules, env var references, function body hashes (for duplication detection).

### What's NOT in the graph

Source code. Function bodies. Variable values. Comments (except JSDoc). String literals. Business logic. The graph is a table of contents, not the book.
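
Storing a body hash rather than the body is what lets duplication detection work without keeping source. The page does not say which hash or normalization Pharaoh uses, so the sketch below assumes SHA-256 over whitespace-collapsed text purely for illustration; `body_hash` is a hypothetical helper.

```python
import hashlib


def body_hash(body: str) -> str:
    """Hash a function body after collapsing whitespace differences."""
    # Assumption: normalization is whitespace-only; a real parser may do more.
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


a = body_hash("return price * (1 + rate)")
b = body_hash("return   price * (1 + rate)\n")
c = body_hash("return price * (1 - rate)")

same_logic = a == b       # formatting differences hash identically
different_logic = a != c  # a one-character change yields a different hash
```

Two functions with equal hashes are duplication candidates, and the digest cannot be reversed into the original body.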

## 3. Query

Your AI tool connects to Pharaoh via [MCP](https://modelcontextprotocol.io/) (Model Context Protocol) — the same protocol Claude Code, Cursor, and other AI tools use for external tool integration.

When your AI agent needs architectural context — before writing code, before refactoring, before reviewing a PR — it calls Pharaoh's MCP tools automatically. The tools query the Neo4j graph and return structured results in \~2K tokens instead of the 40K+ tokens it would take to read files manually.
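
For a sense of what a tool call looks like on the wire, MCP frames tool invocations as JSON-RPC 2.0 requests using the `tools/call` method. The sketch below builds one such request. The tool name `get_blast_radius` and its arguments are hypothetical and are not taken from Pharaoh's actual tool list.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1,
    "get_blast_radius",
    {"repo": "acme/shop", "function": "charge"},
)
payload = json.dumps(request)
```

In practice the AI client constructs and sends these messages itself; nothing here needs to be written by hand.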

The graph is queried live on every tool call. No caching across requests. The graph updates automatically via webhooks on every push to your default branch.
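
The "push to your default branch" condition can be sketched as follows. GitHub's push webhook payload carries the pushed `ref` and the repository's `default_branch`, and deliveries are signed with an HMAC-SHA256 digest in the `X-Hub-Signature-256` header. The function names are hypothetical, and this is an illustration of the check, not Pharaoh's handler.

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, body: bytes, header: str) -> bool:
    """Check GitHub's X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, header)


def should_reparse(payload: dict) -> bool:
    """True only for pushes to the repository's default branch."""
    default_ref = "refs/heads/" + payload["repository"]["default_branch"]
    return payload.get("ref") == default_ref


push_to_main = {"ref": "refs/heads/main", "repository": {"default_branch": "main"}}
push_to_feature = {"ref": "refs/heads/feature", "repository": {"default_branch": "main"}}

reparse_main = should_reparse(push_to_main)
reparse_feature = should_reparse(push_to_feature)
```

Under this check, pushes to feature branches leave the graph untouched, so it always describes the default branch.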

## The pipeline

```
GitHub App (push webhook)
  → Clone repo (read-only, installation tokens)
  → Parse with tree-sitter (extract structure, discard source)
  → Write to Neo4j (nodes + relationships)
  → MCP tools query the graph
  → AI agent gets architectural context in ~2K tokens
```

No config files. No per-repo setup. No maintenance. Install the GitHub App and the pipeline runs automatically.

