
Infrastructure Testing: Validating Before You Deploy

Test infrastructure changes before they reach production. Covers Terraform plan analysis, policy-as-code with OPA, integration testing for IaC, chaos engineering for infrastructure, and building confidence in infrastructure changes.

Infrastructure code deserves the same testing rigor as application code. A misconfigured security group, an incorrect IAM policy, or a missing resource tag can cause outages, security breaches, or compliance violations. Testing infrastructure changes before deployment catches these issues when the cost of fixing them is minutes, not hours.


The Testing Pyramid for Infrastructure

          ╱╲
         ╱  ╲       Chaos / Integration Tests
        ╱    ╲      (real cloud resources, slow, expensive)
       ╱──────╲
      ╱        ╲    Policy Tests
     ╱          ╲   (OPA/Sentinel, fast, comprehensive)
    ╱────────────╲
   ╱              ╲  Static Analysis / Linting
  ╱________________╲ (tflint, checkov, fastest)

Layer 1: Static Analysis

Catch syntax errors, deprecated resources, and security misconfigurations without deploying anything:

# Terraform linting
tflint --recursive

# Security scanning
checkov -d .
trivy config .

# Format validation
terraform fmt -check -recursive

Layer 2: Policy-as-Code

Enforce organizational rules before apply:

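package main

import rego.v1

# OPA policy: All S3 buckets must have encryption.
# Note: this checks the inline block on aws_s3_bucket. With AWS provider v4+,
# encryption is normally configured through the separate
# aws_s3_bucket_server_side_encryption_configuration resource; adapt as needed.
deny contains msg if {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket"
    resource.change.after != null  # skip resources that are being deleted
    not resource.change.after.server_side_encryption_configuration
    msg := sprintf("S3 bucket '%s' must have encryption enabled", [resource.address])
}

# All resources must have required tags
deny contains msg if {
    resource := input.resource_changes[_]
    resource.change.after != null  # skip resources that are being deleted
    required_tags := {"Environment", "Team", "CostCenter"}
    provided_tags := {tag | resource.change.after.tags[tag]}
    missing := required_tags - provided_tags
    count(missing) > 0
    msg := sprintf("Resource '%s' missing tags: %v", [resource.address, missing])
}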
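To make the data shape these rules operate on concrete, here is a minimal Python sketch that mirrors the required-tags rule against a hand-written plan fragment. It is not a substitute for OPA. The function name `missing_tag_violations` and the sample resources are illustrative assumptions; only the `resource_changes[].change.after` layout comes from Terraform's JSON plan output.

```python
REQUIRED_TAGS = {"Environment", "Team", "CostCenter"}


def missing_tag_violations(plan: dict) -> list[str]:
    """Return one message per resource that lacks a required tag."""
    messages = []
    for rc in plan.get("resource_changes", []):
        after = rc["change"].get("after")
        if after is None:
            # Resource is being deleted; there is nothing to tag.
            continue
        provided = set((after.get("tags") or {}).keys())
        missing = REQUIRED_TAGS - provided
        if missing:
            messages.append(
                f"Resource '{rc['address']}' missing tags: {sorted(missing)}"
            )
    return messages


# Hand-written fragment in the shape of `terraform show -json` output (illustrative data).
sample_plan = {
    "resource_changes": [
        {
            "address": "aws_s3_bucket.logs",
            "change": {
                "actions": ["create"],
                "after": {"tags": {"Environment": "test", "Team": "platform"}},
            },
        },
        {
            "address": "aws_s3_bucket.old",
            "change": {"actions": ["delete"], "after": None},
        },
    ]
}

for message in missing_tag_violations(sample_plan):
    print(message)
```

Walking through a small fixture like this is also a cheap way to decide what a policy should do with edge cases such as deletions or resources with no `tags` attribute at all.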

Layer 3: Integration Tests

Deploy to ephemeral environments and validate:

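// Terratest example
package test

import (
    "testing"

    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestVPCModule(t *testing.T) {
    terraformOptions := &terraform.Options{
        TerraformDir: "../modules/vpc",
        Vars: map[string]interface{}{
            "environment": "test",
            "cidr_block":  "10.99.0.0/16",
        },
    }

    // Register cleanup first so resources are destroyed even if apply or an assertion fails
    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    // The module must expose a non-empty VPC ID
    vpcID := terraform.Output(t, terraformOptions, "vpc_id")
    assert.NotEmpty(t, vpcID)

    // The module is expected to create exactly three subnets
    subnets := terraform.OutputList(t, terraformOptions, "subnet_ids")
    assert.Len(t, subnets, 3)
}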
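The same kind of assertion can be run against the JSON printed by `terraform output -json` from any test harness. The sketch below uses Python rather than Go and is only an illustration: `vpc_id` and `subnet_ids` match the outputs used above, while `vpc_cidr` and `subnet_cidrs` are hypothetical outputs assumed here to show a CIDR containment check.

```python
import ipaddress
import json


def check_vpc_outputs(outputs: dict, expected_subnets: int = 3) -> list[str]:
    """Return a list of failures; an empty list means the outputs look sane."""
    failures = []
    # `terraform output -json` wraps each output in an object with a "value" key.
    vpc_id = outputs.get("vpc_id", {}).get("value")
    if not vpc_id:
        failures.append("vpc_id is empty")

    subnet_ids = outputs.get("subnet_ids", {}).get("value") or []
    if len(subnet_ids) != expected_subnets:
        failures.append(f"expected {expected_subnets} subnets, got {len(subnet_ids)}")

    # Hypothetical outputs: vpc_cidr and subnet_cidrs are assumptions for this sketch.
    vpc_cidr = outputs.get("vpc_cidr", {}).get("value")
    if vpc_cidr:
        vpc_net = ipaddress.ip_network(vpc_cidr)
        for cidr in outputs.get("subnet_cidrs", {}).get("value") or []:
            if not ipaddress.ip_network(cidr).subnet_of(vpc_net):
                failures.append(f"subnet {cidr} is outside {vpc_cidr}")
    return failures


# Sample data standing in for real command output, trimmed to the "value" field.
sample_outputs = json.loads("""
{
  "vpc_id": {"value": "vpc-0abc123"},
  "subnet_ids": {"value": ["subnet-a", "subnet-b", "subnet-c"]},
  "vpc_cidr": {"value": "10.99.0.0/16"},
  "subnet_cidrs": {"value": ["10.99.1.0/24", "10.99.2.0/24", "10.100.3.0/24"]}
}
""")

for failure in check_vpc_outputs(sample_outputs):
    print(failure)
```

Returning a list of failures instead of raising on the first one means a single test run reports every problem with the deployed module.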

Terraform Plan Analysis

Automated Plan Review

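import json
import sys

def analyze_plan(plan_file):
    with open(plan_file) as f:
        plan = json.load(f)

    # resource_changes can be absent when the plan contains no resources
    changes = plan.get('resource_changes', [])

    def actions(change):
        return change['change']['actions']

    # A replacement carries both 'delete' and 'create' in one action list,
    # so match the other categories exactly to avoid double counting
    replaces = [c for c in changes if 'delete' in actions(c) and 'create' in actions(c)]
    creates = [c for c in changes if actions(c) == ['create']]
    updates = [c for c in changes if actions(c) == ['update']]
    deletes = [c for c in changes if actions(c) == ['delete']]

    print(f"Plan: {len(creates)} to create, {len(updates)} to update, "
          f"{len(deletes)} to destroy, {len(replaces)} to replace")

    # Alert on destructive changes
    if deletes:
        print(f"WARNING: {len(deletes)} resources will be DESTROYED")
        for d in deletes:
            print(f"  - {d['address']}")

    # Alert on replacement (destroy + create)
    if replaces:
        print(f"CRITICAL: {len(replaces)} resources will be REPLACED")
        for r in replaces:
            print(f"  - {r['address']}")

if __name__ == '__main__':
    # Usage: python analyze_plan.py plan.json
    analyze_plan(sys.argv[1])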
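Terraform reports a replacement as a two-element action list (`["delete", "create"]` or `["create", "delete"]`), so each change needs to be labeled exactly once to keep the counts honest. A minimal sketch of one way to do that follows; the function names `classify`, `summarize`, and `is_destructive` are illustrative, and the sample plan is hand-written.

```python
from collections import Counter


def classify(actions: list[str]) -> str:
    """Collapse a plan `actions` list into a single label."""
    if "delete" in actions and "create" in actions:
        return "replace"  # covers delete-then-create and create-then-delete
    return actions[0] if actions else "no-op"


def summarize(plan: dict) -> Counter:
    """Count resource changes per label, ignoring no-ops and data source reads."""
    labels = (
        classify(rc["change"]["actions"])
        for rc in plan.get("resource_changes", [])
    )
    return Counter(label for label in labels if label not in ("no-op", "read"))


def is_destructive(summary: Counter) -> bool:
    """True when the plan destroys or replaces at least one resource."""
    return summary["delete"] > 0 or summary["replace"] > 0


# Hand-written fragment in the shape of `terraform show -json` output (illustrative data).
sample_plan = {
    "resource_changes": [
        {"address": "aws_vpc.main", "change": {"actions": ["no-op"]}},
        {"address": "aws_subnet.a", "change": {"actions": ["create"]}},
        {"address": "aws_instance.web", "change": {"actions": ["delete", "create"]}},
        {"address": "aws_s3_bucket.old", "change": {"actions": ["delete"]}},
    ]
}

summary = summarize(sample_plan)
print(dict(summary), "destructive" if is_destructive(summary) else "safe")
```

A CI step could exit non-zero when `is_destructive` returns true and require an explicit approval before the apply stage runs.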

CI Pipeline Integration

jobs:
  plan:
    steps:
      - run: terraform init
      - run: terraform plan -out=tfplan
      - run: terraform show -json tfplan > plan.json
      - run: python analyze_plan.py plan.json
      - run: conftest test plan.json --policy policies/
      - run: checkov -f plan.json

Drift Detection

Infrastructure drift occurs when real-world resources differ from the Terraform state:

# Detect drift
terraform plan -detailed-exitcode
# Exit code 0: No changes
# Exit code 1: Error
# Exit code 2: Changes detected (drift, provided the configuration has not changed since the last apply)
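When the check is driven from a script, it helps to turn the three exit codes into named states so that an error is never mistaken for drift. A minimal sketch, assuming `terraform` is on the PATH and the working directory is already initialized; `PlanStatus`, `interpret_exit_code`, and `check_drift` are illustrative names.

```python
import subprocess
from enum import Enum


class PlanStatus(Enum):
    CLEAN = 0    # no changes
    ERROR = 1    # plan failed
    CHANGES = 2  # plan succeeded and found differences


def interpret_exit_code(code: int) -> PlanStatus:
    """Map a `terraform plan -detailed-exitcode` exit code to a status."""
    try:
        return PlanStatus(code)
    except ValueError:
        # Anything outside 0/1/2 (for example, killed by a signal) counts as an error.
        return PlanStatus.ERROR


def check_drift(workdir: str) -> PlanStatus:
    """Run the plan in `workdir` and report its status."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return interpret_exit_code(result.returncode)


# The mapping can be exercised without Terraform installed.
print(interpret_exit_code(2))
```

Calling `check_drift(".")` from a scheduled job then gives a single value to branch on: alert on `CHANGES`, fail the job on `ERROR`, and stay quiet on `CLEAN`.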

Automated Drift Monitoring

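# Run daily via cron (pseudo-config; adapt the keys to your CI system)
- name: Drift Detection
  schedule: "0 6 * * *"
  steps:
    - run: |
        # Capture the exit code instead of letting the step fail on 2
        set +e
        terraform plan -detailed-exitcode -no-color > drift-report.txt
        exit_code=$?
        set -e
        if [ "$exit_code" -eq 2 ]; then
          # notify_slack is a placeholder for your own alerting command
          notify_slack "Infrastructure drift detected" drift-report.txt
        elif [ "$exit_code" -ne 0 ]; then
          exit "$exit_code"
        fi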

Anti-Patterns

Anti-Pattern                  | Consequence                       | Fix
No plan review                | Unexpected destructive changes    | Automated plan analysis in CI
Manual infrastructure changes | State drift, “who changed this?”  | All changes through IaC + CI
No policy enforcement         | Security/compliance violations    | OPA/Sentinel policies in pipeline
Testing in production only    | Outages from untested changes     | Ephemeral test environments
No drift detection            | Reality diverges from code        | Daily drift scans with alerting

Infrastructure testing is not optional. Every terraform apply without prior testing is a deployment to production without tests, and the blast radius of an infrastructure change is typically larger than that of an application change.

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
