How to Manage Multi-Cloud Architecture
Run workloads across AWS, Azure, and GCP without drowning in complexity. Covers service mapping, identity federation, networking, cost management, and governance.
Multi-cloud is already the norm: industry surveys commonly put adoption at around 89% of enterprises. Most didn’t choose it. It happened through acquisitions, team preferences, product-driven platform choices (Microsoft 365 pulling workloads toward Azure, a data analytics team standardizing on GCP, legacy infrastructure on AWS), and the desire to avoid single-vendor lock-in. The goal is not to make multi-cloud perfect; it is to make it manageable, secure, and cost-efficient.
Multi-cloud done badly is worse than single-cloud done well. Every additional cloud adds operational overhead: separate IAM systems, different networking models, distinct monitoring tools, and multiplied on-call complexity. This guide covers the systematic approach to making multi-cloud work.
When Multi-Cloud Makes Sense (and When It Doesn’t)
| Scenario | Multi-Cloud? | Rationale |
|---|---|---|
| Acquisition brought in a second cloud | ✅ Yes (unavoidable) | Migrate incrementally, don’t rush |
| Best-of-breed services (BigQuery + Azure AD) | ✅ Yes (justified) | Use each cloud’s strongest capability |
| Vendor lock-in avoidance (theoretical) | ❌ Usually not worth it | The operational overhead exceeds the lock-in risk for most organizations |
| Regulatory/data sovereignty | ✅ Yes (required) | Some data must reside in specific regions or providers |
| Disaster recovery across providers | ⚠️ Maybe | Cross-cloud DR is complex; same-cloud multi-region is usually sufficient |
| “We want to be cloud-agnostic” | ❌ Anti-pattern | Lowest-common-denominator architecture sacrifices cloud-native advantages |
Service Mapping Across Clouds
Start by building a Rosetta Stone: a mapping of equivalent services across your clouds. This lets team members trained on one cloud reason about the others.
| Capability | AWS | Azure | GCP |
|---|---|---|---|
| Compute (VMs) | EC2 | Virtual Machines | Compute Engine |
| Containers | ECS / EKS | AKS | GKE |
| Serverless | Lambda | Azure Functions | Cloud Functions |
| Object Storage | S3 | Blob Storage | Cloud Storage |
| Relational DB | RDS (Aurora) | Azure SQL / PostgreSQL | Cloud SQL / AlloyDB |
| NoSQL | DynamoDB | Cosmos DB | Firestore / Bigtable |
| Data Warehouse | Redshift | Synapse | BigQuery |
| Streaming | Kinesis | Event Hubs | Pub/Sub + Dataflow |
| AI/ML | SageMaker | Azure Machine Learning / Azure AI Studio | Vertex AI |
| CDN | CloudFront | Azure Front Door | Cloud CDN |
| DNS | Route 53 | Azure DNS | Cloud DNS |
| IAM | IAM (policies + roles) | Entra ID (Azure AD) + Azure RBAC | Cloud IAM |
| Monitoring | CloudWatch | Azure Monitor | Cloud Monitoring |
| Key Management | KMS | Key Vault | Cloud KMS |
| Cost Management | Cost Explorer | Cost Management | Billing |
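The mapping can also be kept as data so that runbooks and internal tooling can look up equivalents. A minimal sketch, assuming a hand-maintained dictionary that covers a few rows of the table; `SERVICE_MAP` and `equivalent` are illustrative names, not part of any cloud SDK:

```python
# Hypothetical lookup table built from a few rows of the service mapping.
SERVICE_MAP = {
    "object_storage": {"aws": "S3", "azure": "Blob Storage", "gcp": "Cloud Storage"},
    "data_warehouse": {"aws": "Redshift", "azure": "Synapse", "gcp": "BigQuery"},
    "key_management": {"aws": "KMS", "azure": "Key Vault", "gcp": "Cloud KMS"},
}


def equivalent(service: str, source: str, target: str) -> str:
    """Return the target cloud's counterpart of a service named on the source cloud."""
    for row in SERVICE_MAP.values():
        # Case-insensitive match on the source cloud's service name.
        if row.get(source, "").lower() == service.lower():
            return row[target]
    raise KeyError(f"{service!r} is not mapped for {source!r}")


print(equivalent("S3", "aws", "gcp"))  # Cloud Storage
```

Keeping the map in version control next to your IaC makes it easy to extend as teams adopt new services.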
Step 1: Unified Identity
The most critical multi-cloud decision is identity federation. Without a single identity provider, you manage separate credentials per cloud — a security and operational nightmare.
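```bash
# Federate identity across clouds using a single IdP (Azure AD / Entra ID)

# AWS: register Azure AD as a SAML identity provider
aws iam create-saml-provider \
  --saml-metadata-document file://azure-ad-metadata.xml \
  --name AzureAD

# GCP: create a workload identity pool for tokens issued by Azure AD
gcloud iam workload-identity-pools create azure-pool \
  --location="global" \
  --display-name="Azure AD Pool"

# GCP: add Azure AD as an OIDC provider in that pool.
# Replace {tenant-id} with your tenant; the issuer must match the "iss" claim of the tokens you exchange.
# --attribute-mapping is required; mapping the token subject is the minimal choice.
gcloud iam workload-identity-pools providers create-oidc azure-provider \
  --workload-identity-pool="azure-pool" \
  --location="global" \
  --issuer-uri="https://login.microsoftonline.com/{tenant-id}/v2.0" \
  --allowed-audiences="api://gcp-federation" \
  --attribute-mapping="google.subject=assertion.sub"
```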
Identity Federation Best Practices
| Practice | Implementation | Why |
|---|---|---|
| Single IdP for all clouds | Azure AD / Okta / OneLogin as authoritative source | One place to manage users, groups, MFA policies |
| No local cloud accounts | Disable AWS IAM user creation, use federated roles only | Prevents credential sprawl |
| Consistent RBAC naming | Same role names across clouds (e.g., “platform-admin”, “developer”) | Reduces confusion, simplifies auditing |
| Centralized MFA | MFA enforced at IdP level, not per-cloud | Consistent security posture |
| Service-to-service identity | Workload Identity Federation (no static keys) | Short-lived tokens, no key rotation burden |
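Consistent RBAC naming is easy to state and easy to let drift. A minimal sketch of a drift check, assuming you can export the set of role names from each cloud; the inventory below is made-up sample data and `role_drift` is an illustrative helper:

```python
# Hypothetical inventory: role names exported from each cloud's IAM.
ROLES_BY_CLOUD = {
    "aws": {"platform-admin", "developer", "readonly"},
    "azure": {"platform-admin", "developer"},
    "gcp": {"platform-admin", "developer", "read-only"},
}


def role_drift(roles_by_cloud: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each role name to the clouds where that exact name is missing."""
    all_roles = set().union(*roles_by_cloud.values())
    drift = {}
    for role in sorted(all_roles):
        missing = sorted(c for c, roles in roles_by_cloud.items() if role not in roles)
        if missing:
            drift[role] = missing
    return drift


for role, clouds in role_drift(ROLES_BY_CLOUD).items():
    print(f"{role}: missing in {', '.join(clouds)}")
```

In this sample, `readonly` and `read-only` both show up as drift, which is the kind of near-duplicate naming that complicates audits.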
Step 2: Cross-Cloud Networking
```text
┌─────────────┐     VPN / Interconnect     ┌─────────────┐
│     AWS     │◄──────────────────────────►│    Azure    │
│     VPC     │   (encrypted, dedicated)   │    VNet     │
│ 10.1.0.0/16 │                            │ 10.2.0.0/16 │
└──────┬──────┘                            └──────┬──────┘
       │                                          │
       │            VPN / Interconnect            │
       │    ┌───────────────────────────────┐     │
       └───►│              GCP              │◄────┘
            │        VPC 10.3.0.0/16        │
            └───────────────────────────────┘
```
Network Design Rules
| Rule | Why | Common Mistake |
|---|---|---|
| Non-overlapping CIDR ranges | Routes must be unambiguous across clouds | Using 10.0.0.0/16 everywhere — causes routing conflicts |
| Consistent security groups/NSGs | Same security policy everywhere | Different firewall rules per cloud — inconsistent posture |
| Centralized DNS | Single namespace resolution across clouds | Split DNS causing resolution failures |
| Encrypted transit (IPsec/WireGuard) | Data protection for inter-cloud traffic | Assuming cloud provider backbone is sufficient |
| Bandwidth monitoring and alerting | Egress costs are the #1 hidden multi-cloud cost | Discovering $50K/month egress bill after the fact |
| Dedicated interconnect for high-volume | VPN throughput is limited (roughly 1.25 Gbps per tunnel on AWS Site-to-Site VPN; limits differ on other providers) | VPN for production database replication: saturated and unreliable |
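The non-overlapping CIDR rule can be checked mechanically before any network is created. A minimal sketch using Python's standard `ipaddress` module; the network names and the `find_overlaps` helper are illustrative, and the ranges are the ones from the diagram:

```python
import ipaddress
from itertools import combinations


def find_overlaps(networks: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of named networks whose CIDR ranges overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in networks.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(parsed.items(), 2)
        if net_a.overlaps(net_b)
    ]


planned = {
    "aws-vpc": "10.1.0.0/16",
    "azure-vnet": "10.2.0.0/16",
    "gcp-vpc": "10.3.0.0/16",
}
print(find_overlaps(planned))  # [] because the three ranges are disjoint
```

Running a check like this in the IaC pipeline catches the "10.0.0.0/16 everywhere" mistake before it becomes a routing conflict.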
Egress Cost Reality
Cross-cloud data transfer is expensive and often overlooked. The figures below are approximate list prices; actual rates vary by region, volume tier, and date:
| Transfer Type | AWS | Azure | GCP |
|---|---|---|---|
| Intra-region (same cloud) | Free or $0.01/GB | Free or $0.01/GB | Free (same zone) or $0.01/GB (cross-zone) |
| Cross-region (same cloud) | $0.02/GB | $0.02/GB | $0.01/GB |
| Cross-cloud (internet egress) | $0.09/GB | $0.087/GB | $0.12/GB |
| Dedicated interconnect | $0.02/GB | $0.02/GB | $0.02/GB |
Example: a database replication stream sending 1 TB/day (about 30,000 GB/month) from AWS to GCP costs ~$2,700/month via internet egress at $0.09/GB vs. ~$600/month at the $0.02/GB dedicated interconnect rate, before port and circuit fees. At multi-cloud scale, interconnect pays for itself quickly.
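The arithmetic behind that example is simple enough to script for your own traffic volumes. A minimal sketch, assuming 1 TB = 1,000 GB, a 30-day month, and the per-GB rates from the table; it ignores port fees and volume discounts, and the names are illustrative:

```python
# Per-GB rates taken from the egress table (approximate list prices).
INTERNET_EGRESS_AWS = 0.09
DEDICATED_INTERCONNECT = 0.02


def monthly_transfer_cost(gb_per_day: float, rate_per_gb: float, days: int = 30) -> float:
    """Monthly data transfer cost in dollars, rounded to cents."""
    return round(gb_per_day * days * rate_per_gb, 2)


internet = monthly_transfer_cost(1_000, INTERNET_EGRESS_AWS)
interconnect = monthly_transfer_cost(1_000, DEDICATED_INTERCONNECT)
print(f"internet: ${internet:,.0f}/month, interconnect: ${interconnect:,.0f}/month")
```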
Step 3: Multi-Cloud Cost Management
Without unified cost visibility, you cannot optimize spending across clouds. Build a cross-cloud cost dashboard.
```python
# Unified cost view across clouds.
# The get_* and detect_* helpers are placeholders to implement per provider. Each cost
# helper is expected to return {"total": float, "by_service": [{"service": ..., "cost": ...}]}.
def monthly_cost_report():
    aws_cost = get_aws_cost_explorer()    # boto3
    azure_cost = get_azure_cost_mgmt()    # azure.mgmt.costmanagement
    gcp_cost = get_gcp_billing()          # google.cloud.billing

    total = {
        "AWS": aws_cost["total"],
        "Azure": azure_cost["total"],
        "GCP": gcp_cost["total"],
    }
    total["Grand Total"] = sum(total.values())

    # Top 20 cost drivers across all clouds
    by_service = sorted(
        aws_cost["by_service"] + azure_cost["by_service"] + gcp_cost["by_service"],
        key=lambda x: x["cost"],
        reverse=True,
    )[:20]

    # Identify optimization opportunities
    idle_resources = detect_idle_resources()  # Unused VMs, unattached disks
    rightsizing = get_rightsizing_recommendations()

    return {
        "totals": total,
        "top_services": by_service,
        "idle_resources": idle_resources,
        "rightsizing": rightsizing,
    }
```
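The report function reads `["total"]` and `["by_service"]` from every provider helper, so each helper has to convert its provider's raw billing output into that common shape. A minimal sketch of that normalization step, assuming the raw data has already been reduced to `(service, cost)` pairs; `normalize` is a hypothetical helper, not an SDK call:

```python
def normalize(cloud: str, rows: list[tuple[str, float]]) -> dict:
    """Collapse raw (service, cost) rows into the shape the cost report expects."""
    costs: dict[str, float] = {}
    for service, cost in rows:
        # Sum repeated line items for the same service.
        costs[service] = costs.get(service, 0.0) + cost
    by_service = [
        {"cloud": cloud, "service": service, "cost": round(cost, 2)}
        for service, cost in costs.items()
    ]
    return {"total": round(sum(item["cost"] for item in by_service), 2), "by_service": by_service}


sample = normalize("AWS", [("EC2", 100.0), ("S3", 20.5), ("EC2", 50.0)])
print(sample["total"])  # 170.5
```

Tagging each row with its cloud keeps the merged top-services list readable when two providers have similarly named services.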
Multi-Cloud FinOps Tools
| Tool | Clouds | Strength | Cost |
|---|---|---|---|
| CloudHealth (VMware) | AWS, Azure, GCP | Enterprise governance | $$$ |
| Spot.io (NetApp) | AWS, Azure, GCP | Automated savings | Usage-based |
| Infracost | All (via IaC) | Pre-deployment cost estimation | Free/OSS |
| OpenCost (CNCF) | Any K8s | Kubernetes cost allocation | Free/OSS |
| Custom dashboards | All | Full control | Engineering time |
Step 4: Workload Placement Strategy
Not every workload belongs on every cloud. Place workloads where each cloud has a genuine advantage.
| Workload Type | Best Cloud | Reason |
|---|---|---|
| .NET / D365 / Power Platform | Azure | Native ecosystem, licensing discounts |
| Data analytics (warehouse) | GCP (BigQuery) | Price-performance, serverless scaling |
| ML training (GPU intensive) | AWS or GCP | GPU availability, spot instance pricing |
| Kubernetes (managed) | GCP (GKE) | Widely regarded as the most mature managed K8s service; GKE Autopilot removes node management |
| Serverless (event-driven) | AWS (Lambda) | Most mature, largest trigger ecosystem |
| Microsoft 365 integration | Azure | Native SSO, Graph API, Purview |
| General compute (VM workloads) | Compare pricing | Use reserved instances on primary, spot on secondary |
| Edge / IoT | AWS (Greengrass) or Azure (IoT Hub) | Depends on device ecosystem |
Step 5: Governance and Standards
| Standard | Implementation | Enforced By |
|---|---|---|
| Tagging policy | Consistent tags across all clouds (Environment, Owner, CostCenter, Application) | Policy-as-code (OPA, Azure Policy, AWS SCPs) |
| Naming conventions | {env}-{app}-{service}-{region} format | Linting in IaC pipeline |
| Security baseline | CIS benchmarks per cloud, applied automatically | Prowler (AWS), Microsoft Defender for Cloud (Azure), Security Command Center (GCP) |
| Change management | All changes via IaC (Terraform/Pulumi), no console clicks | Branch protection + CI/CD enforcement |
| Incident response | Unified runbooks covering all clouds | On-call tooling (PagerDuty, Opsgenie) |
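The tagging and naming standards lend themselves to a small lint step in the IaC pipeline. A minimal sketch, assuming lowercase alphanumeric name segments with a region that may contain hyphens (for example `us-east-1`); the character rules and the `lint_resource` helper are assumptions, while the required tags and the name format come from the table:

```python
import re

REQUIRED_TAGS = {"Environment", "Owner", "CostCenter", "Application"}

# {env}-{app}-{service}-{region}; the allowed characters are an assumption.
NAME_PATTERN = re.compile(
    r"^(?P<env>[a-z0-9]+)-(?P<app>[a-z0-9]+)-(?P<service>[a-z0-9]+)-(?P<region>[a-z0-9-]+)$"
)


def lint_resource(name: str, tags: dict[str, str]) -> list[str]:
    """Return a list of policy violations for one resource (empty if compliant)."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append(f"name {name!r} does not match {{env}}-{{app}}-{{service}}-{{region}}")
    missing = sorted(REQUIRED_TAGS - set(tags))
    if missing:
        problems.append("missing tags: " + ", ".join(missing))
    return problems


print(lint_resource("Billing_API", {"Owner": "payments-team"}))
```

A check like this complements, rather than replaces, the provider-side policy engines listed in the table, since it fails fast in CI before anything is deployed.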
Multi-Cloud Checklist
- Identity federated through single IdP (Azure AD / Okta)
- Cross-cloud networking established (VPN/Interconnect with encryption)
- Non-overlapping CIDR ranges across all VPCs/VNets
- Unified cost dashboard deployed with optimization recommendations
- Workload placement strategy documented and followed
- Infrastructure-as-Code (Terraform/Pulumi) for all clouds
- Centralized logging and monitoring (Datadog, Grafana, or cloud-native)
- Security policies consistent across providers (CIS benchmarks)
- Egress costs monitored with alerts for unexpected spikes
- DR strategy tested across cloud boundaries
- Tagging policy enforced consistently across all clouds
- On-call team cross-trained on all active cloud platforms
:::note[Source]
This guide is derived from operational intelligence at Garnet Grid Consulting. For cloud architecture consulting, visit garnetgrid.com.
:::