Code Review Tools and Workflows That Actually Work
Code review is the single highest-leverage quality practice most teams have. It's also the one most teams do poorly -- reviews sit for days, PRs are too large to meaningfully review, and the tooling is either misconfigured or underused. This guide covers the tools and workflows that make code review fast, thorough, and sustainable.
GitHub PR Workflows: The Foundation
Most teams use GitHub pull requests for code review. Out of the box, GitHub's review workflow is decent but permissive. You need to configure it to actually enforce standards.
Branch Protection Rules
Branch protection is the minimum. Go to Settings > Branches > Add branch protection rule for main. The settings that matter most:
- Require approvals (at least 1, 2 for critical repos)
- Dismiss stale approvals when new commits are pushed -- without this, someone can get approval, push completely different code, and merge
- Require review from Code Owners -- pairs with CODEOWNERS (see below)
- Require status checks to pass -- gate merges on CI
- Require linear history -- forces squash or rebase merges, keeps main clean
You can also manage this programmatically:
# The protection endpoint expects a JSON body with all four top-level keys
# (restrictions is required, even if null), so pass it via --input
gh api repos/{owner}/{repo}/branches/main/protection --method PUT --input - <<'EOF'
{
  "required_pull_request_reviews": {"required_approving_review_count": 1, "dismiss_stale_reviews": true, "require_code_owner_reviews": true},
  "required_status_checks": {"strict": true, "contexts": ["ci/build", "ci/test"]},
  "enforce_admins": true,
  "restrictions": null
}
EOF
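To confirm what actually ended up enforced, you can read the same endpoint back; the --jq filter below is just one way to trim the response:
# Read back the active rules for main
gh api repos/{owner}/{repo}/branches/main/protection \
  --jq '{reviews: .required_pull_request_reviews, checks: .required_status_checks}'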
CODEOWNERS
The CODEOWNERS file assigns review responsibility automatically. When a PR touches files matching a pattern, the specified owners are added as required reviewers.
# .github/CODEOWNERS
* @myorg/core-team
/src/components/ @myorg/frontend
/src/api/ @myorg/backend
/src/db/ @myorg/backend
/infra/ @myorg/devops
/src/auth/ @myorg/security
CODEOWNERS is powerful but has a sharp edge: if a PR touches files across many ownership boundaries, it needs approval from every team. Keep ownership boundaries reasonable and avoid overly granular rules.
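GitHub also exposes a CODEOWNERS validation endpoint, which catches syntax errors and rules pointing at teams that don't exist; a quick sketch, run from inside the repo (field names per the REST response):
# List problems GitHub found in the CODEOWNERS file on the default branch
gh api repos/{owner}/{repo}/codeowners/errors \
  --jq '.errors[] | "line \(.line): \(.message)"'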
Required Status Checks
Pair branch protection with CI that runs on every PR:
# .github/workflows/ci.yml
name: CI
on:
  pull_request:
    branches: [main]
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v2
      - run: bun install
      - run: bun run check # formatting + linting
      - run: bun run typecheck
      - run: bun test
Mark these as required status checks in branch protection. PRs cannot merge until all checks pass.
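On the contributor side, gh can watch those same checks and queue a merge for when they go green (a sketch; --auto assumes auto-merge is enabled in the repo settings):
# Watch the required checks on the current branch's PR
gh pr checks --watch
# Queue an auto-merge (squash) that fires once checks pass and approvals are in
gh pr merge --auto --squash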
Keeping PRs Small
The single most impactful thing you can do for code review quality is keep PRs small. Research from Google and Microsoft consistently shows that review quality drops off a cliff above 200-400 lines of changes. Large PRs get rubber-stamped because reviewers can't hold the full context in their heads.
Guidelines that work:
- Aim for under 300 lines of diff (excluding generated files, tests, and lock files).
- One logical change per PR. If you're touching auth and also reformatting CSS, split them.
- Use feature flags to merge incomplete features incrementally.
- Refactoring goes in its own PR, separate from behavior changes. Mixing the two makes both harder to review.
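One way to keep yourself honest about the line budget is a quick local check before opening the PR; the exclude pathspecs below are illustrative and will vary by project:
# Rough diff size against main, ignoring lock files and generated output
git diff --shortstat main...HEAD -- . ':(exclude)*.lock' ':(exclude)dist'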
Stacking PRs (covered below) is the best technique for keeping PRs small without blocking your workflow.
Draft PRs and When to Use Them
GitHub's draft PR feature signals "this isn't ready for formal review." Use them for:
- Early feedback: Get architectural input before building out the details.
- CI validation: Run CI checks on work-in-progress without requesting review.
- Visibility: Show your team what you're working on without creating review pressure.
- Self-review: Review your own diff in the GitHub UI and clean up before marking ready.
# Create a draft PR from the CLI
gh pr create --draft --title "feat: add user notifications" --body "WIP -- architecture for notifications system. Looking for feedback on the event model."
# Mark ready for review when done
gh pr ready
Don't leave draft PRs open indefinitely. If a draft sits for more than a few days, close it.
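gh can surface drafts that have been sitting around, which makes the cleanup pass easy (a sketch using standard gh pr list JSON fields):
# List open draft PRs, oldest activity first
gh pr list --draft --json number,title,updatedAt \
  --jq 'sort_by(.updatedAt) | .[] | "#\(.number)  \(.updatedAt)  \(.title)"'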
Graphite and Stacked Diffs
The biggest friction in PR workflows is sequential dependency. You open PR #1, wait for review, then start PR #2 that builds on it. If PR #1 needs changes, you're stuck rebasing.
Stacked diffs solve this by letting you build a chain of dependent PRs and manage them as a unit. Graphite is the most popular tool for this on GitHub.
How Stacked Diffs Work
Instead of one large PR, you create a stack:
main
  <- PR #1: add user model (50 lines)
      <- PR #2: add user API endpoints (80 lines)
          <- PR #3: add user UI (120 lines)
Each PR is small and reviewable on its own. Reviewers see only the incremental diff for each layer. When PR #1 merges, the rest of the stack automatically rebases.
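Under the hood, a stacked PR is just a PR whose base branch is the branch below it. You can set that up by hand with gh (a sketch, assuming both branches are already pushed); Graphite's value is automating the creation, restacking, and retargeting:
# PR #2 targets the branch for PR #1 instead of main,
# so reviewers only see the incremental diff
gh pr create --base feat/user-model --head feat/user-api \
  --title "feat: add user API endpoints"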
Graphite Setup
# Install Graphite CLI
npm install -g @withgraphite/graphite-cli@stable
# Initialize in your repo
gt repo init
# Authenticate the CLI (token comes from the Graphite web app)
gt auth --token <graphite-cli-token>
Graphite Workflow
# Start a new stack from main
gt branch create feat/user-model
# ... make changes, commit ...
gt stack submit # creates PR #1
# Continue building on top
gt branch create feat/user-api
# ... make changes, commit ...
gt stack submit # creates PR #2, stacked on PR #1
# Continue further
gt branch create feat/user-ui
# ... make changes, commit ...
gt stack submit # creates PR #3, stacked on PR #2
If a reviewer requests changes to PR #1:
# Go back to the first branch
gt checkout feat/user-model
# Make changes, commit
gt commit amend
# Restack everything above it
gt stack restack
# Re-submit all PRs in the stack
gt stack submit
Graphite handles the rebasing automatically. Without it, you'd be manually rebasing each branch in the chain -- error-prone and tedious.
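For contrast, here is roughly what the manual restack looks like after amending the bottom branch; the <old-*-tip> placeholders are each branch's tip before it was rewritten, which you'd have to note down or fish out of the reflog:
# Replay only the commits unique to each upper branch onto its rewritten parent
git rebase --onto feat/user-model <old-model-tip> feat/user-api
git rebase --onto feat/user-api <old-api-tip> feat/user-ui
# Force-push every rewritten branch so the PRs update
git push --force-with-lease origin feat/user-api feat/user-ui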
Graphite also offers a web dashboard (app.graphite.dev) that lays out each stack with its dependency chain visible -- much better than GitHub's native PR view for stacks.
Review Platforms Compared
| Feature | GitHub PRs | Graphite | Reviewable | Gerrit |
|---|---|---|---|---|
| Stacked diffs | No (manual) | Yes (core feature) | No | Yes (native) |
| Inline comments | Yes | Yes (GitHub-based) | Yes (enhanced) | Yes |
| Review tracking | Basic | Good | Excellent | Excellent |
| File-level review status | Partial ("Viewed" checkbox) | No | Yes (mark files reviewed) | Yes |
| CI integration | Native | GitHub Actions | GitHub Actions | Jenkins/custom |
| Self-hosted option | Enterprise | No | No | Yes (open source) |
| Price (team) | Free / $4/user | Free / $30/user | Free for open source | Free (self-hosted) |
| Learning curve | Low | Low | Medium | High |
GitHub native works for most teams. Its main weakness is shallow review-state tracking -- the per-file "Viewed" checkbox is personal and resets when a file changes, so in a 30-file PR with several rounds of pushes it's easy to lose track of what you've already covered.
Graphite is the best option for stacked diffs. Seamless GitHub integration, plus auto-merge and merge queue features.
Reviewable has the most powerful review UI -- tracks reviewed files, shows only new changes since your last review, and has sophisticated comment resolution. Layers on top of GitHub PRs.
Gerrit is the gold standard for large-scale review (Android, Chromium, Go) but requires self-hosting and has a steep learning curve.
Recommendation: Start with GitHub PRs and branch protection. Add Graphite for stacked diffs. Consider Reviewable if you need better file-level tracking.
AI-Assisted Code Review
AI review tools have matured rapidly. They're best at catching things humans miss or find tedious -- bugs in edge cases, inconsistent error handling, security issues, and style violations.
GitHub Copilot Code Review
GitHub's native AI review, available on paid Copilot plans.
- Triggered automatically on PR creation or manually via @copilot in a review comment.
- Leaves inline comments with suggestions that can be committed directly.
- Understands repository context and custom coding guidelines.
- Best at: catching bugs, suggesting performance improvements, flagging security concerns.
Enable it in your repository settings under Copilot > Code Review.
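If your plan supports repository custom instructions, Copilot reads them from .github/copilot-instructions.md, which is a reasonable place to encode review priorities. A minimal sketch (the guidance itself is an example; adjust it to your codebase):
<!-- .github/copilot-instructions.md -->
When reviewing pull requests:
- Prioritize correctness, security, and error handling over style.
- Flag any new dependency and ask whether it is justified.
- Call out changed behavior that has no test coverage.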
CodeRabbit
CodeRabbit is a dedicated AI review tool that integrates with GitHub and GitLab.
# .coderabbit.yaml
language: en
tone_instructions: "Be concise. Focus on bugs and security issues, not style."
reviews:
  auto_review:
    enabled: true
  path_instructions:
    - path: "src/api/**"
      instructions: "Check for proper error handling and input validation"
    - path: "src/db/**"
      instructions: "Watch for N+1 queries and missing indexes"
tone_instructions: "Be concise. Focus on bugs and security issues, not style."
CodeRabbit provides:
- Automatic review on every PR with a detailed summary
- Configurable review focus per directory
- Interactive chat -- reply to its comments to ask follow-up questions
- Learnable -- it adapts to your team's patterns over time
Where AI Review Helps (and Doesn't)
AI review is good at catching bugs (off-by-one errors, null risks, race conditions), security issues (injection, XSS, hardcoded secrets), consistency violations, and documentation gaps. It's bad at architecture decisions, business logic correctness, over-engineering assessment, and anything that requires team context.
Recommendation: Use AI review as a supplement, not a replacement. Let it handle tedious checks so human reviewers focus on architecture and design. CodeRabbit is the best standalone option; Copilot is best if you already pay for it.
Review Checklists
A lightweight review checklist keeps reviews consistent and prevents common oversights. Don't make it a 50-item bureaucratic nightmare -- keep it short and focused on what matters.
Put this in a PR template so it runs automatically:
<!-- .github/pull_request_template.md -->
## What
<!-- What does this PR do? -->
## Why
<!-- Why is this change needed? Link to issue. -->
## Testing
<!-- How did you test this? -->
## Author Self-Check
- [ ] PR description explains **why**, not just what
- [ ] Diff is under 400 lines (excluding generated files)
- [ ] Tests cover the new behavior
- [ ] No TODO comments without linked issues
## Reviewer Checklist
- [ ] Does the change do what the description says?
- [ ] Are edge cases and error paths handled?
- [ ] Any security concerns (user input, auth, data exposure)?
- [ ] Are new dependencies justified?
Common Anti-Patterns
"Looks good to me" reviews. If every review gets approved in under two minutes, your reviews aren't doing anything.
Mega-PRs. A 2000-line PR is not a code review -- it's a formality. Break it up or use stacked diffs.
Review ping-pong. If a PR goes back and forth more than twice, get on a call. Text-based review is terrible for resolving disagreements.
Blocking on style. If your linter isn't catching it, it doesn't matter enough to block a PR.
Skipping "trivial" changes. Config changes still deserve a look. Some of the worst outages come from "trivial" changes.
Recommendations
- Start with GitHub branch protection and CODEOWNERS. These are free and give you 80% of the value. Require at least one approval, dismiss stale reviews, and gate on CI.
- Keep PRs small. Under 300 lines. Use stacked diffs if needed. This is the highest-impact change you can make.
- Add AI review. CodeRabbit for standalone, Copilot if you already have it. Let AI handle the boring stuff.
- Use PR templates. A short self-check list catches issues before review even starts.
- Try Graphite if you do stacked diffs. The tooling has gotten good enough that stacking is practical for everyday use.
- Don't over-process. A two-person startup doesn't need CODEOWNERS and mandatory checklists. Scale your review process to your team size.