Code Review Tools and Workflows That Actually Work
Code review is the single highest-leverage quality practice most teams have. It's also the one most teams do poorly -- reviews sit for days, PRs are too large to meaningfully review, and the tooling is either misconfigured or underused. This guide covers the tools and workflows that make code review fast, thorough, and sustainable.
GitHub PR Workflows: The Foundation
Most teams use GitHub pull requests for code review. Out of the box, GitHub's review workflow is decent but permissive. You need to configure it to actually enforce standards.
Branch Protection Rules
Branch protection is the minimum. Go to Settings > Branches > Add branch protection rule for main. The settings that matter most:
- Require approvals (at least 1, 2 for critical repos)
- Dismiss stale approvals when new commits are pushed -- without this, someone can get approval, push completely different code, and merge
- Require review from Code Owners -- pairs with CODEOWNERS (see below)
- Require status checks to pass -- gate merges on CI
- Require linear history -- forces squash or rebase merges, keeps main clean
You can also manage this programmatically:
# The protection endpoint expects a JSON body with all four top-level keys
# (restrictions is required, even if null), so pass it via --input
gh api repos/{owner}/{repo}/branches/main/protection --method PUT --input - <<'EOF'
{
  "required_pull_request_reviews": {"required_approving_review_count": 1, "dismiss_stale_reviews": true, "require_code_owner_reviews": true},
  "required_status_checks": {"strict": true, "contexts": ["ci/build", "ci/test"]},
  "enforce_admins": true,
  "restrictions": null
}
EOF
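To confirm what actually ended up enforced, you can read the same endpoint back; the --jq filter below is just one way to trim the response:
# Read back the active rules for main
gh api repos/{owner}/{repo}/branches/main/protection \
  --jq '{reviews: .required_pull_request_reviews, checks: .required_status_checks}'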
CODEOWNERS
The CODEOWNERS file assigns review responsibility automatically. When a PR touches files matching a pattern, the specified owners are added as required reviewers.
# .github/CODEOWNERS
* @myorg/core-team
/src/components/ @myorg/frontend
/src/api/ @myorg/backend
/src/db/ @myorg/backend
/infra/ @myorg/devops
/src/auth/ @myorg/security
CODEOWNERS is powerful but has a sharp edge: if a PR touches files across many ownership boundaries, it needs approval from every team. Keep ownership boundaries reasonable and avoid overly granular rules.
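GitHub also exposes a CODEOWNERS validation endpoint, which catches syntax errors and rules pointing at teams that don't exist; a quick sketch, run from inside the repo (field names per the REST response):
# List problems GitHub found in the CODEOWNERS file on the default branch
gh api repos/{owner}/{repo}/codeowners/errors \
  --jq '.errors[] | "line \(.line): \(.message)"'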
Required Status Checks
Pair branch protection with CI that runs on every PR:
# .github/workflows/ci.yml
name: CI
on:
  pull_request:
    branches: [main]
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v2
      - run: bun install
      - run: bun run check # formatting + linting
      - run: bun run typecheck
      - run: bun test
Mark these as required status checks in branch protection. PRs cannot merge until all checks pass.
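On the contributor side, gh can watch those same checks and queue a merge for when they go green (a sketch; --auto assumes auto-merge is enabled in the repo settings):
# Watch the required checks on the current branch's PR
gh pr checks --watch
# Queue an auto-merge (squash) that fires once checks pass and approvals are in
gh pr merge --auto --squash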
Keeping PRs Small
The single most impactful thing you can do for code review quality is keep PRs small. Research from Google and Microsoft consistently shows that review quality drops off a cliff above 200-400 lines of changes. Large PRs get rubber-stamped because reviewers can't hold the full context in their heads.
Guidelines that work:
- Aim for under 300 lines of diff (excluding generated files, tests, and lock files).
- One logical change per PR. If you're touching auth and also reformatting CSS, split them.
- Use feature flags to merge incomplete features incrementally.
- Refactoring goes in its own PR, separate from behavior changes. Mixing the two makes both harder to review.
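One way to keep yourself honest about the line budget is a quick local check before opening the PR; the exclude pathspecs below are illustrative and will vary by project:
# Rough diff size against main, ignoring lock files and generated output
git diff --shortstat main...HEAD -- . ':(exclude)*.lock' ':(exclude)dist'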
Stacking PRs (covered below) is the best technique for keeping PRs small without blocking your workflow.
Draft PRs and When to Use Them
GitHub's draft PR feature signals "this isn't ready for formal review." Use them for:
- Early feedback: Get architectural input before building out the details.
- CI validation: Run CI checks on work-in-progress without requesting review.
- Visibility: Show your team what you're working on without creating review pressure.
- Self-review: Review your own diff in the GitHub UI and clean up before marking ready.
# Create a draft PR from the CLI
gh pr create --draft --title "feat: add user notifications" --body "WIP -- architecture for notifications system. Looking for feedback on the event model."
# Mark ready for review when done
gh pr ready
Don't leave draft PRs open indefinitely. If a draft sits for more than a few days, close it.
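gh can surface drafts that have been sitting around, which makes the cleanup pass easy (a sketch using standard gh pr list JSON fields):
# List open draft PRs, oldest activity first
gh pr list --draft --json number,title,updatedAt \
  --jq 'sort_by(.updatedAt) | .[] | "#\(.number)  \(.updatedAt)  \(.title)"'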
Graphite and Stacked Diffs
The biggest friction in PR workflows is sequential dependency. You open PR #1, wait for review, then start PR #2 that builds on it. If PR #1 needs changes, you're stuck rebasing.
Stacked diffs solve this by letting you build a chain of dependent PRs and manage them as a unit. Graphite is the most popular tool for this on GitHub.
How Stacked Diffs Work
Instead of one large PR, you create a stack:
main
  <- PR #1: add user model (50 lines)
      <- PR #2: add user API endpoints (80 lines)
          <- PR #3: add user UI (120 lines)
Each PR is small and reviewable on its own. Reviewers see only the incremental diff for each layer. When PR #1 merges, the rest of the stack automatically rebases.
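Under the hood, a stacked PR is just a PR whose base branch is the branch below it. You can set that up by hand with gh (a sketch, assuming both branches are already pushed); Graphite's value is automating the creation, restacking, and retargeting:
# PR #2 targets the branch for PR #1 instead of main,
# so reviewers only see the incremental diff
gh pr create --base feat/user-model --head feat/user-api \
  --title "feat: add user API endpoints"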
Graphite Setup
# Install Graphite CLI
npm install -g @withgraphite/graphite-cli@stable
# Initialize in your repo
gt repo init
# Authenticate the CLI (token comes from the Graphite web app)
gt auth --token <graphite-cli-token>
Graphite Workflow
# Start a new stack from main
gt branch create feat/user-model
# ... make changes, commit ...
gt stack submit # creates PR #1
# Continue building on top
gt branch create feat/user-api
# ... make changes, commit ...
gt stack submit # creates PR #2, stacked on PR #1
# Continue further
gt branch create feat/user-ui
# ... make changes, commit ...
gt stack submit # creates PR #3, stacked on PR #2
If a reviewer requests changes to PR #1:
# Go back to the first branch
gt checkout feat/user-model
# Make changes, commit
gt commit amend
# Restack everything above it
gt stack restack
# Re-submit all PRs in the stack
gt stack submit
Graphite handles the rebasing automatically. Without it, you'd be manually rebasing each branch in the chain -- error-prone and tedious.
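For contrast, here is roughly what the manual restack looks like after amending the bottom branch; the <old-*-tip> placeholders are each branch's tip before it was rewritten, which you'd have to note down or fish out of the reflog:
# Replay only the commits unique to each upper branch onto its rewritten parent
git rebase --onto feat/user-model <old-model-tip> feat/user-api
git rebase --onto feat/user-api <old-api-tip> feat/user-ui
# Force-push every rewritten branch so the PRs update
git push --force-with-lease origin feat/user-api feat/user-ui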
Graphite also offers a web dashboard (app.graphite.dev) that lays out each stack with its dependency chain visible -- much better than GitHub's native PR view for stacks.
Review Platforms Compared
| Feature | GitHub PRs | Graphite | Reviewable | Gerrit |
|---|---|---|---|---|
| Stacked diffs | No (manual) | Yes (core feature) | No | Yes (native) |
| Inline comments | Yes | Yes (GitHub-based) | Yes (enhanced) | Yes |
| Review tracking | Basic | Good | Excellent | Excellent |
| File-level review status | Partial ("Viewed" checkbox) | No | Yes (mark files reviewed) | Yes |
| CI integration | Native | GitHub Actions | GitHub Actions | Jenkins/custom |
| Self-hosted option | Enterprise | No | No | Yes (open source) |
| Price (team) | Free / $4/user | Free / $30/user | Free for open source | Free (self-hosted) |
| Learning curve | Low | Low | Medium | High |
GitHub native works for most teams. Its main weakness is shallow review-state tracking -- the per-file "Viewed" checkbox is personal and resets when a file changes, so in a 30-file PR with several rounds of pushes it's easy to lose track of what you've already covered.
Graphite is the best option for stacked diffs. Seamless GitHub integration, plus auto-merge and merge queue features.
Reviewable has the most powerful review UI -- tracks reviewed files, shows only new changes since your last review, and has sophisticated comment resolution. Layers on top of GitHub PRs.
Gerrit is the gold standard for large-scale review (Android, Chromium, Go) but requires self-hosting and has a steep learning curve.
Recommendation: Start with GitHub PRs and branch protection. Add Graphite for stacked diffs. Consider Reviewable if you need better file-level tracking.
AI-Assisted Code Review
AI review tools have matured rapidly. They're best at catching things humans miss or find tedious -- bugs in edge cases, inconsistent error handling, security issues, and style violations.
GitHub Copilot Code Review
GitHub's native AI review, available on paid Copilot plans.
- Triggered automatically on PR creation or manually via @copilot in a review comment.
- Leaves inline comments with suggestions that can be committed directly.
- Understands repository context and custom coding guidelines.
- Best at: catching bugs, suggesting performance improvements, flagging security concerns.
Enable it in your repository settings under Copilot > Code Review.
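If your plan supports repository custom instructions, Copilot reads them from .github/copilot-instructions.md, which is a reasonable place to encode review priorities. A minimal sketch (the guidance itself is an example; adjust it to your codebase):
<!-- .github/copilot-instructions.md -->
When reviewing pull requests:
- Prioritize correctness, security, and error handling over style.
- Flag any new dependency and ask whether it is justified.
- Call out changed behavior that has no test coverage.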
CodeRabbit
CodeRabbit is a dedicated AI review tool that integrates with GitHub and GitLab.
# .coderabbit.yaml
language: en
tone_instructions: "Be concise. Focus on bugs and security issues, not style."
reviews:
  auto_review:
    enabled: true
  path_instructions:
    - path: "src/api/**"
      instructions: "Check for proper error handling and input validation"
    - path: "src/db/**"
      instructions: "Watch for N+1 queries and missing indexes"
tone_instructions: "Be concise. Focus on bugs and security issues, not style."
CodeRabbit provides:
- Automatic review on every PR with a detailed summary
- Configurable review focus per directory
- Interactive chat -- reply to its comments to ask follow-up questions
- Learnable -- it adapts to your team's patterns over time
Where AI Review Helps (and Doesn't)
AI review is good at catching bugs (off-by-one errors, null risks, race conditions), security issues (injection, XSS, hardcoded secrets), consistency violations, and documentation gaps. It's bad at architecture decisions, business logic correctness, over-engineering assessment, and anything that requires team context.
Recommendation: Use AI review as a supplement, not a replacement. Let it handle tedious checks so human reviewers focus on architecture and design. CodeRabbit is the best standalone option; Copilot is best if you already pay for it.
Review Checklists
A lightweight review checklist keeps reviews consistent and prevents common oversights. Don't make it a 50-item bureaucratic nightmare -- keep it short and focused on what matters.
Put this in a PR template so it runs automatically:
<!-- .github/pull_request_template.md -->
## What
<!-- What does this PR do? -->
## Why
<!-- Why is this change needed? Link to issue. -->
## Testing
<!-- How did you test this? -->
## Author Self-Check
- [ ] PR description explains **why**, not just what
- [ ] Diff is under 400 lines (excluding generated files)
- [ ] Tests cover the new behavior
- [ ] No TODO comments without linked issues
## Reviewer Checklist
- [ ] Does the change do what the description says?
- [ ] Are edge cases and error paths handled?
- [ ] Any security concerns (user input, auth, data exposure)?
- [ ] Are new dependencies justified?
Common Anti-Patterns
"Looks good to me" reviews. If every review gets approved in under two minutes, your reviews aren't doing anything.
Mega-PRs. A 2000-line PR is not a code review -- it's a formality. Break it up or use stacked diffs.
Review ping-pong. If a PR goes back and forth more than twice, get on a call. Text-based review is terrible for resolving disagreements.
Blocking on style. If your linter isn't catching it, it doesn't matter enough to block a PR.
Skipping "trivial" changes. Config changes still deserve a look. Some of the worst outages come from "trivial" changes.
Recommendations
- Start with GitHub branch protection and CODEOWNERS. These are free and give you 80% of the value. Require at least one approval, dismiss stale reviews, and gate on CI.
- Keep PRs small. Under 300 lines. Use stacked diffs if needed. This is the highest-impact change you can make.
- Add AI review. CodeRabbit for standalone, Copilot if you already have it. Let AI handle the boring stuff.
- Use PR templates. A short self-check list catches issues before review even starts.
- Try Graphite if you do stacked diffs. The tooling has gotten good enough that stacking is practical for everyday use.
- Don't over-process. A two-person startup doesn't need CODEOWNERS and mandatory checklists. Scale your review process to your team size.