
Running Effective Architecture Reviews

Conduct architecture reviews that catch problems early without becoming bureaucratic bottlenecks. Covers review triggers, lightweight RFC processes, decision frameworks, review checklists, and scaling reviews across multiple teams.

Architecture reviews exist to catch expensive mistakes before they become production problems. A missed scaling bottleneck caught in review costs a whiteboard session. The same bottleneck caught in production costs an incident, a rewrite, and three months of rework.

But reviews can also become bureaucratic gates that slow teams to a crawl. The goal is finding the right balance: enough review to catch systemic risks, not so much that every feature needs a committee to approve.


When to Trigger a Review

Not every change needs an architecture review. Reviews should be triggered by risk, not by habit.

Trigger Criteria

A review is warranted when a change:

  • Introduces a new service or removes an existing one
  • Changes a public API contract (breaking or non-breaking)
  • Adds a new data store or significantly changes schema
  • Crosses a security boundary (authentication, authorization, encryption)
  • Affects data retention or compliance (GDPR, SOC2, HIPAA)
  • Introduces a new external dependency (third-party API, vendor SDK)
  • Changes deployment topology (new region, new cloud provider, new network path)
  • Has a blast radius affecting more than two teams

What Does NOT Need a Review

  • Bug fixes within a single service
  • Refactoring that does not change interfaces
  • Adding tests or improving documentation
  • Upgrading dependencies (unless major version with breaking changes)
  • Feature work within established patterns
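The trigger criteria above can be encoded so that tooling, not habit, decides when a review is opened. The sketch below is a minimal illustration; the field names and the two-team blast-radius threshold come from the list above, but the `Change` structure itself is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Change:
    """Illustrative description of a proposed change; field names are hypothetical."""
    new_or_removed_service: bool = False
    public_api_contract_change: bool = False
    new_data_store_or_schema_change: bool = False
    crosses_security_boundary: bool = False
    affects_retention_or_compliance: bool = False
    new_external_dependency: bool = False
    deployment_topology_change: bool = False
    teams_in_blast_radius: int = 1

def needs_architecture_review(change: Change) -> bool:
    """True if any trigger criterion applies; risk-based, not habit-based."""
    return any([
        change.new_or_removed_service,
        change.public_api_contract_change,
        change.new_data_store_or_schema_change,
        change.crosses_security_boundary,
        change.affects_retention_or_compliance,
        change.new_external_dependency,
        change.deployment_topology_change,
        change.teams_in_blast_radius > 2,  # "more than two teams"
    ])

# A bug fix within a single service triggers nothing:
print(needs_architecture_review(Change()))  # False
# Adding a new data store does:
print(needs_architecture_review(Change(new_data_store_or_schema_change=True)))  # True
```

Running a check like this in CI keeps the "what does not need a review" list honest: the default path ships without ceremony.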

The Lightweight RFC Process

A Request for Comments (RFC) is a short document that proposes a technical approach and invites feedback before implementation begins.

RFC Template

# RFC: [Title]

**Author**: [Name]  
**Date**: [Date]  
**Status**: Draft / In Review / Accepted / Rejected  

## Problem Statement
What are we solving? [2-3 sentences]

## Proposed Solution
How do we solve it? [Architecture diagram + explanation]

## Alternatives Considered
What else did we consider and why did we reject it?

## Risks and Mitigations
What could go wrong? How do we handle it?

## Data Model Changes
Any schema changes, new tables, new fields?

## API Changes
Any new or modified endpoints?

## Rollout Plan
How will this be deployed? Canary? Feature flag? Big bang?

## Open Questions
What do you need input on?

RFC Review Workflow

  1. Author writes the RFC in a shared doc or Git PR
  2. Reviewers are automatically assigned based on affected systems (CODEOWNERS-style)
  3. Review period: 3-5 business days (not weeks)
  4. Synchronous review meeting only if async comments are insufficient
  5. Decision recorded in the document: Accepted, Accepted with modifications, or Rejected with rationale
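Step 2's CODEOWNERS-style assignment can be a simple mapping from affected system to default reviewers. This is a sketch under assumptions: the system names and reviewer handles are invented for illustration.

```python
# CODEOWNERS-style mapping from affected system to default reviewers.
# Systems and handles below are hypothetical examples.
OWNERS = {
    "payments": ["@alice", "@bob"],
    "orders": ["@carol"],
    "platform": ["@dave", "@erin"],
}

def assign_reviewers(affected_systems):
    """Union of the owners of every system the RFC touches, deduplicated."""
    reviewers = set()
    for system in affected_systems:
        reviewers.update(OWNERS.get(system, []))
    return sorted(reviewers)

print(assign_reviewers(["payments", "orders"]))  # ['@alice', '@bob', '@carol']
```

The point is that reviewer selection is deterministic and automatic, so the 3-5 day clock starts the moment the RFC is posted.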

Scaling RFCs

For organizations with 5+ teams:

  • Lightweight RFCs (within one team’s domain): 2-3 reviewers, 3-day review, async only
  • Cross-team RFCs (affecting multiple domains): 4-6 reviewers, 5-day review, one sync meeting
  • Platform RFCs (affecting all teams): Platform team + stakeholders, 7-day review, dedicated meeting
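The three tiers above amount to a small lookup table. A minimal sketch, using only the values stated in the list (the classification rule is an assumption about how a team might route RFCs):

```python
# Review parameters per RFC tier, taken from the list above.
RFC_TIERS = {
    "lightweight": {"reviewers": (2, 3), "review_days": 3, "sync_meeting": False},
    "cross_team":  {"reviewers": (4, 6), "review_days": 5, "sync_meeting": True},
    # Platform RFCs: platform team + stakeholders rather than a fixed count.
    "platform":    {"reviewers": None,   "review_days": 7, "sync_meeting": True},
}

def classify_rfc(domains_affected: int, affects_all_teams: bool) -> str:
    """Route an RFC to a tier: platform > cross-team > lightweight."""
    if affects_all_teams:
        return "platform"
    if domains_affected > 1:
        return "cross_team"
    return "lightweight"

print(classify_rfc(1, False))  # lightweight
print(classify_rfc(3, False))  # cross_team
```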

The Review Meeting

When a synchronous meeting is needed, structure it to be productive:

Before the Meeting

  • All reviewers have read the RFC (enforce this — cancel if they have not)
  • Author identifies 2-3 specific questions for the group
  • Time-boxed to 45 minutes

During the Meeting

0:00 - 0:05  Author summarizes the proposal (summarize, do not read the doc aloud)
0:05 - 0:10  Clarifying questions only
0:10 - 0:30  Discussion of open questions and concerns
0:30 - 0:40  Decision: go / no-go / modify
0:40 - 0:45  Action items and next steps

The Review Checklist

Use a standard checklist to ensure consistency:

  • Failure modes: What happens when this fails? Is there a fallback?
  • Scale: Will this work at 10x current load? 100x?
  • Data consistency: Are there race conditions? Eventual consistency issues?
  • Security: Authentication, authorization, data encryption, input validation
  • Observability: Can we detect problems? Are there metrics, logs, alerts?
  • Reversibility: Can we roll this back without data loss?
  • Dependencies: Are we coupling to services that have different SLAs?
  • Cost: What is the infrastructure cost at projected scale?
  • Team impact: Does this require knowledge from someone not on the team?

Decision Framework

When reviewers disagree, use a structured framework:

The DACI Model

  • Driver: Owns the RFC and final decision (usually the author’s team lead)
  • Approver: Has veto power (usually the architect or VP of Engineering)
  • Contributors: Provide input and expertise
  • Informed: Need to know the outcome but do not participate in the decision

Tie-Breaking Rules

  1. Prefer reversible decisions over irreversible ones
  2. Prefer boring technology over novel technology
  3. When two options are equally good, go with the one that has less operational burden
  4. If no decision is clearly best, set a time limit and choose — analysis paralysis is worse than a suboptimal choice

Architecture Decision Records

Every review should produce an ADR that captures:

## ADR-023: Use Event Sourcing for Order History

**Status**: Accepted  
**Decision Group**: Backend Architects  
**Date**: 2026-02-15  

### Context
Order history requires audit logging, replay capability, and temporal queries.

### Decision
Implement event sourcing for the order domain using Kafka as the event store.

### Consequences
- (+) Full audit trail for compliance
- (+) Temporal queries ("what was the order state at 3pm?")
- (-) Higher complexity in read-path (CQRS required)
- (-) Team needs training on event-sourcing patterns

ADRs are immutable. If a decision is superseded, a new ADR references the old one:

**Supersedes**: ADR-023  
**Reason**: Event sourcing complexity exceeded benefits for our scale
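Because ADRs are immutable and link forward via the Supersedes header, the set of currently active decisions can be computed from the documents themselves. A sketch under assumptions: it parses the `**Supersedes**: ADR-NNN` line in the format shown above; the example ADR texts are abbreviated.

```python
import re

def active_adrs(adr_texts: dict) -> set:
    """adr_texts maps an ADR id (e.g. 'ADR-023') to its full markdown text.
    An ADR is active unless some other ADR declares it superseded."""
    superseded = set()
    for text in adr_texts.values():
        for adr_id in re.findall(r"\*\*Supersedes\*\*:\s*(ADR-\d+)", text):
            superseded.add(adr_id)
    return set(adr_texts) - superseded

# Abbreviated example documents in the template format above:
docs = {
    "ADR-023": "## ADR-023: Use Event Sourcing for Order History\n**Status**: Accepted",
    "ADR-031": "## ADR-031: Replace Event Sourcing\n**Supersedes**: ADR-023",
}
print(active_adrs(docs))  # {'ADR-031'}
```

Checks like this can run in CI so that "no follow-through" (see the anti-patterns below) is caught mechanically, not by memory.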

Anti-Patterns

| Anti-Pattern | Impact | Fix |
| --- | --- | --- |
| Reviewing every change | Teams cannot ship | Define trigger criteria, trust teams |
| Ivory tower reviews | Reviewers disconnected from reality | Include implementers in the review |
| No follow-through | Decisions made but not enforced | ADRs in the codebase, automated checks |
| Design by committee | Lowest-common-denominator decisions | Clear DACI roles with one final decision-maker |
| Reviewing after implementation | Too late to change course | Review at design phase, not PR phase |

Architecture reviews are an investment. Done well, they prevent weeks of rework and reduce incidents. Done poorly, they add weeks to every project and teach teams to route around the process. The key is calibrating the review intensity to the risk of the change.

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
