How to Build a PR Review Agent

This is how I built a PR review agent that does not rely on one giant prompt.

The hard part was not getting an LLM to comment on a diff. The hard part was making sure it did not miss the file that actually mattered.

The constraint stack

-> PR has 3,000 lines changed

-> Full diff is 180k tokens

-> Model context is only 128k tokens

-> Worst bug is in file 37

-> Model never sees file 37
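
The stack above is easy to reproduce. Here is a minimal sketch, assuming the diff is fed to the model in file order until the window is full. The file names and the flat 4,000 tokens per file are made up for illustration; only the 180k and 128k totals come from the numbers above.

```python
def files_seen_by_truncation(file_tokens, context_limit):
    """Return the files that fit when a diff is fed in order until the window is full."""
    seen, used = [], 0
    for name, tokens in file_tokens:
        if used + tokens > context_limit:
            break  # everything from here on is silently dropped
        seen.append(name)
        used += tokens
    return seen


# Hypothetical PR: 45 files at 4,000 tokens each, 180,000 tokens in total.
pr = [(f"file_{i}", 4_000) for i in range(1, 46)]
seen = files_seen_by_truncation(pr, context_limit=128_000)

print(len(seen))          # 32
print("file_37" in seen)  # False
```

Under these assumptions the model reads files 1 through 32 and stops. File 37 is never in the window, and nothing in the output says so.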

"How do you put a PR inside an LLM?"

That is the wrong question. The real one:

"How do you design a review system that does not miss the file that matters?"

The correct design

The correct design is not a prompt. It’s a pipeline.

Diff parser → Change graph → Risk scorer → Retrieval engine → Multi-pass reviewers → External memory → Coverage tracker → Final verifier

That pipeline decides what the model inspects next. Each stage feeds the next. Coverage is explicit. Context is budgeted per reviewer, not per PR. Risk drives ordering.
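
To make "risk drives ordering" and "context is budgeted per reviewer" concrete, here is a rough sketch of a batch planner. It assumes a risk scorer has already attached a score to each file. `ChangedFile`, `plan_batches`, the paths, and the scores are all hypothetical names and values, not part of any real tool.

```python
from dataclasses import dataclass


@dataclass
class ChangedFile:
    path: str
    tokens: int   # estimated size of this file's diff
    risk: float   # hypothetical score from the risk scorer, higher = riskier


def plan_batches(files, budget_per_reviewer):
    """Group files into batches that each fit one reviewer's context budget,
    visiting the riskiest files first."""
    batches, current, used = [], [], 0
    for f in sorted(files, key=lambda f: f.risk, reverse=True):
        if f.tokens > budget_per_reviewer:
            # A single oversized file has to be split upstream (e.g. by hunk).
            raise ValueError(f"{f.path} needs splitting before review")
        if used + f.tokens > budget_per_reviewer:
            batches.append(current)
            current, used = [], 0
        current.append(f)
        used += f.tokens
    if current:
        batches.append(current)
    return batches


files = [
    ChangedFile("docs/readme.md", 3_000, 0.1),
    ChangedFile("auth/session.py", 5_000, 0.9),
    ChangedFile("billing/charge.py", 4_000, 0.8),
    ChangedFile("ui/button.css", 2_000, 0.2),
]
batches = plan_batches(files, budget_per_reviewer=8_000)
for i, batch in enumerate(batches):
    print(i, [f.path for f in batch])
```

With an 8,000-token budget this yields three batches: `auth/session.py` alone, then `billing/charge.py` with `ui/button.css`, then `docs/readme.md`. The riskiest file gets reviewed first regardless of where it sits in the diff.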

You don’t fight the model’s context limit. You work within it by breaking the review into pieces, routing each piece to a specialized reviewer, and tracking what’s been seen.
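
Routing and tracking can be sketched in a few lines as well. This is an illustration under assumptions, not the actual implementation: the `route` rules, the reviewer names, and the `CoverageTracker` class are invented for the example. The point is that "what has been seen" lives in explicit state that a final verifier can check.

```python
class CoverageTracker:
    """Records which changed files have been reviewed and by whom."""

    def __init__(self, paths):
        self.pending = set(paths)
        self.reviewed = {}  # path -> reviewer name

    def mark(self, path, reviewer):
        self.pending.discard(path)
        self.reviewed[path] = reviewer

    def unreviewed(self):
        return sorted(self.pending)


def route(path):
    """Pick a specialized reviewer from the path. These rules are illustrative only."""
    if path.startswith("auth/"):
        return "security"
    if path.endswith(".sql"):
        return "database"
    return "general"


paths = ["auth/login.py", "db/migrate.sql", "app/views.py"]
tracker = CoverageTracker(paths)
for path in paths[:2]:  # pretend the third review has not run yet
    tracker.mark(path, route(path))

print(tracker.reviewed)      # {'auth/login.py': 'security', 'db/migrate.sql': 'database'}
print(tracker.unreviewed())  # ['app/views.py']
```

A final verifier in this setup would refuse to sign off while `unreviewed()` is non-empty, which is what turns a missed file from a silent failure into a visible one.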

That is how you review a 180k-token PR with a 128k-token model.