Engineers typically spend 20-30% of their time reviewing code. It's one of the most important things they do: catching bugs before production, maintaining code quality, and spreading knowledge across the team. Yet most interview processes never assess this skill at all.

After implementing code review interviews at more than 40 companies through SmithSpektrum, I've found them to be one of the highest-signal, lowest-anxiety assessment formats available. They reveal technical judgment, communication style, and collaboration skills in a realistic setting that feels like actual work[^1].

Here's the complete playbook.

Why Code Review Interviews Work

Traditional interviews test whether candidates can produce code under artificial pressure. Code review interviews test whether candidates can evaluate code—which is often more predictive of senior-level success.

What code review interviews reveal:

Technical judgment: What issues do they prioritize? Do they spot the critical bug or get lost in style nitpicks?

Communication: How do they give feedback? Constructive or condescending? Specific or vague?

Collaboration: Would you want to work with someone who reviews code this way?

Experience: Do they know the production gotchas that only come from years of shipping software?

Learning orientation: Do they ask good questions about context before assuming they know everything?

Compare this to other formats:

| Format | Realism | Candidate Anxiety | Signal Quality | Time |
| --- | --- | --- | --- | --- |
| Whiteboard coding | Low | High | Medium | 45-60 min |
| Take-home | Medium | Medium | Medium-High | 2-4 hours |
| Live coding | Medium | High | Medium | 45-60 min |
| Code review | High | Low | High | 30-45 min |
| System design | Medium | Medium | High | 45-60 min |

Code review interviews are realistic (engineers actually do this), low-anxiety (no one watches you code in silence), and high-signal (they reveal things whiteboarding can't).

Creating the Sample PR

The quality of your code review interview depends entirely on the PR you create. Here's what works.

Size matters: Aim for 100-300 lines of code. Too short and there's not enough to discuss. Too long and the interview becomes a slog.

Language match: Use a language the candidate will actually use on the job. Code review skills transfer across languages, but you'll get better signal if they're comfortable with the syntax.

Intentional issues: Include 5-8 problems of varying severity and subtlety. Create a mix of issues they should definitely catch, should probably catch, and would be nice to catch.

Realistic context: Write a PR description that gives enough context to understand what the code is trying to do without overwhelming detail.

Issue Categories to Include

Every sample PR should include issues from these categories:

Critical (should definitely catch): Null pointer risks, SQL injection, race conditions, off-by-one errors, authentication bypasses. If they miss these, that's concerning.

High priority (should probably catch): N+1 queries, missing error handling, security issues, performance problems at scale. Missing some is fine; missing all is a red flag.

Medium priority (good to catch): Edge case handling, poor naming, missing tests, boundary conditions. These show thoroughness.

Low priority (nice to catch): Style issues, minor formatting, documentation gaps. Spending too much time here is actually a negative signal.

Language-Specific Examples

For JavaScript/TypeScript, consider issues like type coercion bugs (== vs ===), missing await on promises, event listener memory leaks, or XSS through innerHTML.

For Python, look at mutable default arguments, SQL injection through f-strings, bare except clauses, or unclosed resources.
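To make those Python issues concrete, here is a sketch of a deliberately buggy snippet you could plant in a sample PR. All names here are hypothetical, and the comments mark each planted bug (you would omit them in the actual PR, of course):

```python
import sqlite3

def add_tag(tag, tags=[]):
    # BUG (classic Python gotcha): the mutable default argument is shared
    # across calls, so tags silently accumulate between unrelated invocations.
    tags.append(tag)
    return tags

def find_user(conn, username):
    # BUG (critical): f-string interpolation into SQL allows injection;
    # a parameterized query should be used instead.
    query = f"SELECT id FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchone()

def load_config(path):
    try:
        f = open(path)  # BUG: the file handle is never closed on success
        return f.read()
    except:  # BUG: bare except also swallows KeyboardInterrupt and SystemExit
        return None
```

A candidate who flags the injection first, the mutable default second, and the bare except last is demonstrating exactly the prioritization you want to see.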

For Java, include null pointer paths, unclosed connections, non-thread-safe collection usage, or catching overly generic Exception.

Running the Interview

Structure the 45-minute session carefully.

In the first five minutes, introduce the format and share the PR link. Tell the candidate this is collaborative, they can ask questions, and you're interested in their process.

In the next five minutes, explain what the PR is trying to do. Give business context. Answer questions about requirements.

For 25 minutes, let them review. Observe without reacting. Take notes on what they find, in what order, and how they'd communicate feedback.

In the final 10 minutes, discuss trade-offs. Ask which findings are most important, what additional context they'd want, and whether they'd approve the PR.

What to Watch For During Review

Order of findings: Do they start with style nitpicks or go straight to substantive issues? The latter is better.

Questions asked: Candidates who ask about context before jumping to conclusions show maturity.

Communication style: Would the PR author feel attacked or supported by this feedback?

Reasoning depth: Can they explain why something is problematic, not just that it is?

Questions to Ask After Review

"If you could only leave three comments, which would they be?" tests prioritization.

"What would you do differently if you wrote this?" tests alternative approaches.

"What additional testing would you want before this ships?" tests quality mindset.

"Would you approve this PR? Under what conditions?" tests decision-making.

Evaluation Rubric

Score candidates across four dimensions:

Technical Detection (40% of overall score)

| Score | Criteria |
| --- | --- |
| 4 | Found all critical issues plus several moderate ones |
| 3 | Found all critical issues, missed some moderate ones |
| 2 | Missed one critical issue |
| 1 | Missed multiple critical issues |

Prioritization (25%)

| Score | Criteria |
| --- | --- |
| 4 | Focused on critical issues first, appropriate time on style |
| 3 | Generally good prioritization, minor misses |
| 2 | Spent too much time on minor issues |
| 1 | Fixated on style, missed substance |

Communication (25%)

| Score | Criteria |
| --- | --- |
| 4 | Constructive, specific, actionable feedback |
| 3 | Clear feedback, mostly constructive |
| 2 | Vague or overly critical |
| 1 | Harsh, unconstructive, or unclear |

Process (10%)

| Score | Criteria |
| --- | --- |
| 4 | Systematic approach, asked appropriate questions |
| 3 | Reasonable process |
| 2 | Disorganized or random approach |
| 1 | No clear process |
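Combining the four dimensions is a simple weighted average. As a minimal sketch (the dictionary keys and function name are illustrative, not part of any standard tooling):

```python
# Dimension weights from the rubric above; each dimension is scored 1-4.
WEIGHTS = {
    "technical_detection": 0.40,
    "prioritization": 0.25,
    "communication": 0.25,
    "process": 0.10,
}

def overall_score(scores: dict) -> float:
    """Weighted average of the four rubric dimensions (each 1-4)."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the four rubric dimensions")
    return sum(WEIGHTS[dim] * value for dim, value in scores.items())
```

For example, a candidate scoring 4 on detection, 3 on prioritization and communication, and 2 on process lands at 3.3 overall.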

Red Flags

Certain patterns reliably indicate problems:

Only finds style issues. Missing substantive bugs while commenting extensively on formatting suggests superficial technical review.

Harsh or condescending feedback. "This is terrible" or "obviously wrong" signals poor collaboration style.

Never asks context questions. Assumes rather than learns—will cause problems in real code reviews.

Can't explain why something is problematic. Pattern matching without understanding means they'll miss similar issues with different surface patterns.

Wants to rewrite everything. Perfectionism that would create friction in actual work.

Misses obvious security issues. A dangerous blind spot for any production role.

Adapting for Different Levels

For senior and staff candidates, add architecture-level questions. Ask how they'd structure this differently. Probe on system design implications and performance at scale.

For junior candidates, use simpler code with fewer hidden issues. Focus more on learning approach than on catching everything. Provide more context upfront.

For remote implementations, have the candidate share their screen. Take detailed notes since you can't rely on memory. Ensure clear communication about timing and have a backup channel ready for technical issues.

Calibration for Consistency

Use the same PR for all candidates to enable fair comparison. Train interviewers on expected findings so they know what "good" looks like. Have interviewers score independently before debrief to prevent anchoring.

Create a tracking document for expected findings with expected detection rates:

| Issue | Severity | Should Find |
| --- | --- | --- |
| SQL injection in query | Critical | 90% of candidates |
| Null pointer line 45 | Critical | 85% |
| Missing error handling | High | 75% |
| N+1 query pattern | High | 70% |
| Edge case: empty input | Medium | 60% |
| Variable naming | Low | 50% |

If candidates consistently find issues at rates different from your expectations, recalibrate the expectations or adjust the PR's difficulty.
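That calibration check is easy to automate once you track per-issue find counts. A sketch, where the function names and the 15% tolerance are my own illustrative choices:

```python
def detection_rates(findings: dict, num_candidates: int) -> dict:
    """Map each issue to the fraction of candidates who found it.

    `findings` maps issue name -> count of candidates who flagged it.
    """
    return {issue: count / num_candidates for issue, count in findings.items()}

def flag_miscalibrated(observed: dict, expected: dict, tolerance: float = 0.15) -> list:
    """Return issues whose observed detection rate deviates from the
    expected rate by more than `tolerance` (absolute difference)."""
    return [
        issue
        for issue, exp in expected.items()
        if abs(observed.get(issue, 0.0) - exp) > tolerance
    ]
```

If, say, only 30% of candidates find the N+1 query you expected 70% to catch, the issue is flagged and you can decide whether to make it more prominent or lower the expected rate.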

Integration with Other Interviews

Code review interviews complement rather than replace other formats. A recommended interview loop:

| Interview | Primary Focus |
| --- | --- |
| Code review | Technical judgment, communication |
| Live coding or take-home | Implementation ability |
| System design | Architecture thinking |
| Behavioral | Collaboration patterns, past experience |

When you see signals in code review, validate them against other interviews. Strong technical judgment in code review should appear in system design discussions. Poor communication style should show up in behavioral examples.


The most common objection I hear is "it doesn't test if they can actually write code." True—pair it with a take-home or live coding exercise. But dismissing code review interviews misses that for senior engineers, judgment about code matters as much as producing it. The senior engineer who spots a critical bug in review prevents more production issues than the one who writes fast but doesn't catch problems.

Code review interviews reveal who engineers become when the pressure is off and the work is real.


References

[^1]: SmithSpektrum interview format analysis, 40+ company implementations, 2021-2026.

[^2]: Google Engineering Practices, "How to do a code review," 2023.

[^3]: Fowler, Martin. "Code Review Guidelines," martinfowler.com, 2020.

[^4]: Thoughtworks Technology Radar, "Interview Techniques," 2024.


Want help implementing code review interviews? Contact SmithSpektrum for customized interview design and training.


Author: Irvan Smith, Founder & Managing Director at SmithSpektrum