Multi-Cloud Architecture
Design and operate workloads across multiple cloud providers. Covers abstraction layers, data replication, identity federation, cost management, disaster recovery, and the patterns that deliver multi-cloud value without drowning in complexity.
Multi-cloud means deliberately running production workloads on two or more cloud providers. The reasons are compelling: avoiding vendor lock-in, using best-of-breed services, meeting data sovereignty requirements, and improving resilience. The cost is significant: added complexity, operational overhead, and the temptation to build only to the lowest common denominator of features that every provider supports.
Multi-Cloud Strategies
Strategy 1: Best-of-Breed
  Use each cloud for what it does best
  Example: GCP for data/ML, AWS for compute, Azure for enterprise workloads
  Pro: Optimal service selection
  Con: Multiple skill sets, operational complexity
Strategy 2: Portability
  Abstract cloud-specific services behind portable interfaces
  Example: Kubernetes everywhere, Terraform for infra
  Pro: Avoid lock-in, move workloads freely
  Con: Lose cloud-native advantages; risk of lowest-common-denominator design
Strategy 3: Disaster Recovery
  Primary on Cloud A, failover on Cloud B
  Pro: True cloud-level resilience
  Con: Expensive; cross-cloud data sync adds complexity
Strategy 4: Regulatory
  Different clouds for different regions/regulations
  Example: AWS for US, sovereign cloud for EU
  Pro: Compliance-driven, clear boundaries
  Con: Fragmented operations
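The disaster recovery strategy (primary on Cloud A, failover on Cloud B) comes down to a priority-ordered choice between equivalent deployments. The following is a minimal sketch of that routing decision only. The `Endpoint` type, the URLs, and the health-check callback are hypothetical names made up for illustration, and a real setup would normally delegate this to DNS or a global load balancer.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Endpoint:
    """One deployment of the same service in one cloud (hypothetical names)."""
    cloud: str
    url: str


def pick_endpoint(
    endpoints: Sequence[Endpoint],
    is_healthy: Callable[[Endpoint], bool],
) -> Endpoint:
    """Return the first healthy endpoint in priority order (primary first)."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    raise RuntimeError("no healthy endpoint in any cloud")


# Priority order encodes the strategy: primary on Cloud A, failover on Cloud B.
ENDPOINTS = [
    Endpoint("cloud-a", "https://orders.cloud-a.example.com"),
    Endpoint("cloud-b", "https://orders.cloud-b.example.com"),
]

if __name__ == "__main__":
    # Simulate an outage of the primary cloud.
    down = {"cloud-a"}
    chosen = pick_endpoint(ENDPOINTS, lambda e: e.cloud not in down)
    print(chosen.cloud)
```

The sketch covers only request routing. The failover target is useful only if its data is recent enough, which is the cross-cloud sync cost listed as the con of this strategy.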
Abstraction Layers
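# Terraform: one language and workflow for multi-cloud infrastructure
# Same concepts, but each resource type is still provider-specific
# Names, zone, and AMI ID below are example values
# AWS
resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.medium"
}
# GCP
resource "google_compute_instance" "app" {
  name         = "app"
  machine_type = "e2-medium"
  zone         = "us-central1-a"
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }
  network_interface {
    network = "default"
  }
}
# Azure (abridged: name, resource_group_name, location, admin_username,
# network_interface_ids, os_disk, and an SSH key or password are also required)
resource "azurerm_linux_virtual_machine" "app" {
  size = "Standard_B2s"
  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }
}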
# Kubernetes: Workload portability
# The same Deployment manifest works on EKS, GKE, and AKS
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 3
  selector: # required in apps/v1
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service # must match the selector
    spec:
      containers:
        - name: order-service
          image: registry.company.com/order-service:v1.2.3
          # Same container image, any cloud
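At the application level, "abstracting cloud-specific services behind portable interfaces" usually means that business code depends on a small interface and each provider gets its own adapter. The following is a minimal sketch under that assumption. `ObjectStore`, `InMemoryStore`, `make_store`, and `archive_order` are hypothetical names, and the in-memory backend stands in for real provider SDK adapters, which are deliberately not shown.

```python
from abc import ABC, abstractmethod
from typing import Dict


class ObjectStore(ABC):
    """Portable interface; application code depends only on this."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryStore(ObjectStore):
    """Stand-in backend. A real adapter would wrap one provider's SDK here."""

    def __init__(self, provider: str) -> None:
        self.provider = provider
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


SUPPORTED_PROVIDERS = {"aws", "gcp", "azure"}


def make_store(provider: str) -> ObjectStore:
    """Choose the backend from configuration, not from application code."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider}")
    return InMemoryStore(provider)


def archive_order(store: ObjectStore, order_id: str, payload: bytes) -> str:
    """Business logic that never mentions a specific cloud."""
    key = f"orders/{order_id}.json"
    store.put(key, payload)
    return key


if __name__ == "__main__":
    store = make_store("gcp")
    key = archive_order(store, "42", b'{"status": "paid"}')
    print(key, store.get(key))
```

The interface exposes only what every backend can do, which is exactly the lowest-common-denominator trade-off: provider-specific features stay out of reach unless the interface grows to cover them.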
Data Replication
Challenge: Data must be consistent across clouds
Patterns:
  Active-Passive:
    Primary write in Cloud A → Async replicate to Cloud B
    RPO: Minutes (some data loss possible during failover)
  Active-Active:
    Write to both clouds simultaneously
    Conflict resolution required (last-write-wins, CRDTs)
  Event-Driven:
    Changes published as events → consumed by both clouds
    Eventually consistent, flexible
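The two conflict-resolution options named for active-active writes behave quite differently, which a small sketch can make concrete. Below, a last-write-wins register keeps whichever write has the later timestamp, and a grow-only counter (one of the simplest CRDTs) merges without losing any increments. The class names and the replica-ID tie-breaker are illustrative assumptions, not a prescribed design.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class LWWRegister:
    """Last-write-wins: the write with the highest timestamp survives a merge."""
    value: str
    timestamp: float
    replica: str  # tie-breaker so both clouds pick the same winner

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        return max(self, other, key=lambda r: (r.timestamp, r.replica))


@dataclass
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merged by element-wise max."""
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, replica: str, amount: int = 1) -> None:
        self.counts[replica] = self.counts.get(replica, 0) + amount

    def merge(self, other: "GCounter") -> "GCounter":
        replicas = self.counts.keys() | other.counts.keys()
        return GCounter(
            {r: max(self.counts.get(r, 0), other.counts.get(r, 0)) for r in replicas}
        )

    @property
    def value(self) -> int:
        return sum(self.counts.values())


if __name__ == "__main__":
    a = LWWRegister("shipped", 100.0, "cloud-a")
    b = LWWRegister("cancelled", 105.0, "cloud-b")
    print(a.merge(b).value)  # the earlier write is discarded

    views_a, views_b = GCounter(), GCounter()
    views_a.increment("cloud-a", 3)
    views_b.increment("cloud-b", 2)
    print(views_a.merge(views_b).value)  # both clouds' increments are kept
```

Last-write-wins is simple but silently drops the losing write and depends on reasonably synchronized clocks. CRDTs avoid that loss, but only for data types that have a well-defined merge.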
Tool options:
- Database-native replication (PostgreSQL logical replication)
- Change Data Capture (Debezium → Kafka → both clouds)
- Object storage sync (rclone, cloud-native replication)
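With event-driven replication or Change Data Capture, consumers in each cloud may see events more than once or out of order, so the apply step is usually made idempotent. The following is a minimal sketch of that idea, assuming each change event carries a per-key version that increases at the source, starting from 1. `ChangeEvent`, `Replica`, and `replicate` are hypothetical names and do not correspond to the Debezium or Kafka APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    key: str
    version: int          # assumed to increase per key at the source, from 1
    value: Optional[str]  # None represents a delete


@dataclass
class Replica:
    """The copy of the data held in one cloud."""
    name: str
    rows: Dict[str, str] = field(default_factory=dict)
    versions: Dict[str, int] = field(default_factory=dict)

    def apply(self, event: ChangeEvent) -> bool:
        """Apply an event; return False if it is stale or a duplicate."""
        if event.version <= self.versions.get(event.key, 0):
            return False
        self.versions[event.key] = event.version
        if event.value is None:
            self.rows.pop(event.key, None)
        else:
            self.rows[event.key] = event.value
        return True


def replicate(events: Iterable[ChangeEvent], replica: Replica) -> int:
    """Feed events to a replica and return how many changed its state."""
    return sum(replica.apply(event) for event in events)


if __name__ == "__main__":
    events = [
        ChangeEvent("order-1", 1, "created"),
        ChangeEvent("order-1", 2, "paid"),
        ChangeEvent("order-2", 1, "created"),
        ChangeEvent("order-2", 2, None),
    ]
    cloud_a, cloud_b = Replica("cloud-a"), Replica("cloud-b")
    replicate(events, cloud_a)
    # Cloud B sees the same events reversed and then duplicated.
    replicate(list(reversed(events)) + events, cloud_b)
    print(cloud_a.rows == cloud_b.rows)
```

Because stale and duplicate events are ignored, both replicas converge on the same rows regardless of delivery order, which is the practical meaning of "eventually consistent" here.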
Identity Federation
# Single identity across clouds
identity_federation:
  identity_provider: "Okta"
  aws:
    method: SAML 2.0
    roles: mapped_from_okta_groups
  gcp:
    # Workforce Identity Federation covers human users;
    # Workload Identity Federation is the counterpart for service workloads
    method: Workforce Identity Federation
    pools: mapped_from_okta_groups
  azure:
    method: Microsoft Entra ID (formerly Azure AD) SSO
    roles: mapped_from_okta_groups
result:
  - Single login for all clouds
  - Consistent RBAC across providers
  - Centralized audit trail
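"Consistent RBAC across providers" in practice means one table that maps identity-provider groups to the equivalent role in each cloud. The following is a minimal sketch of such a mapping. The group names, the AWS account ID and role names, and the `roles_for` helper are hypothetical. The GCP basic roles and Azure built-in roles are used only as familiar examples, and production setups would normally prefer narrower roles.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping from identity-provider groups to per-cloud roles.
GROUP_ROLE_MAP: Dict[str, Dict[str, str]] = {
    "platform-admins": {
        "aws": "arn:aws:iam::123456789012:role/PlatformAdmin",
        "gcp": "roles/owner",
        "azure": "Owner",
    },
    "developers": {
        "aws": "arn:aws:iam::123456789012:role/Developer",
        "gcp": "roles/editor",
        "azure": "Contributor",
    },
    "auditors": {
        "aws": "arn:aws:iam::123456789012:role/ReadOnly",
        "gcp": "roles/viewer",
        "azure": "Reader",
    },
}


def roles_for(groups: Iterable[str], cloud: str) -> List[str]:
    """Resolve a user's groups to roles in one cloud.

    Unknown groups, or groups with no entry for the cloud, grant nothing.
    """
    roles = {
        GROUP_ROLE_MAP[group][cloud]
        for group in groups
        if cloud in GROUP_ROLE_MAP.get(group, {})
    }
    return sorted(roles)


if __name__ == "__main__":
    user_groups = ["developers", "auditors"]
    for cloud in ("aws", "gcp", "azure"):
        print(cloud, roles_for(user_groups, cloud))
```

Keeping the table in one reviewed place is what makes the RBAC consistent. Each cloud's federation configuration is then generated from it or checked against it.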
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Multi-cloud “just because” | Complexity without benefit | Clear business justification required |
| Lowest common denominator | Lose all cloud-native value | Use cloud-native where it matters |
| No shared identity | Different credentials per cloud | Identity federation (Okta, Microsoft Entra ID, formerly Azure AD) |
| Manual cross-cloud operations | Inconsistent, error-prone | Unified IaC (Terraform), unified CI/CD |
| Ignoring data gravity | Moving data between clouds is expensive | Co-locate compute with data |
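
The data gravity row is easier to weigh with a rough number attached. The following is a back-of-the-envelope sketch of monthly cross-cloud transfer cost. The per-GB price is an illustrative assumption, not a quoted rate from any provider, and `monthly_egress_cost` is a hypothetical helper.

```python
# Illustrative assumption only; check each provider's current egress pricing.
ASSUMED_PRICE_PER_GB_USD = 0.09


def monthly_egress_cost(
    gb_per_day: float,
    price_per_gb: float = ASSUMED_PRICE_PER_GB_USD,
    days: int = 30,
) -> float:
    """Estimate the monthly cost of moving data out of one cloud, in USD."""
    if gb_per_day < 0 or price_per_gb < 0 or days < 0:
        raise ValueError("inputs must be non-negative")
    return round(gb_per_day * days * price_per_gb, 2)


if __name__ == "__main__":
    # A job in Cloud B that reads 500 GB per day from storage in Cloud A.
    print(monthly_egress_cost(500))
    # The same job co-located with its data transfers nothing across clouds.
    print(monthly_egress_cost(0))
```

Under the assumed rate, 500 GB per day comes to roughly 1,350 USD per month for transfer alone, before any latency impact. That is why co-locating compute with data is the suggested fix.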
Multi-cloud is a strategy, not a religion. Use it where it provides clear business value, such as compliance, resilience, or best-of-breed services. Do not use it for its own sake.