
Service Mesh Architecture

Implement a service mesh to manage service-to-service communication with zero application code changes. Covers sidecar proxies, mTLS, traffic management, observability, and deciding whether a service mesh is worth the operational complexity.

As microservices architectures grow beyond 10-20 services, the cross-cutting concerns — mutual TLS, retries, circuit breaking, observability — become too complex to implement in every service individually. A service mesh extracts these concerns into infrastructure, providing them uniformly to every service through sidecar proxies.


How a Service Mesh Works

Service A Pod                        Service B Pod
┌──────────────────────┐            ┌──────────────────────┐
│ Application          │            │ Application          │
│ (no mesh awareness)  │            │ (no mesh awareness)  │
│        ↓             │            │        ↑             │
│ Sidecar Proxy        │───────────▶│ Sidecar Proxy        │
│ (Envoy)              │  mTLS      │ (Envoy)              │
└──────────────────────┘            └──────────────────────┘
           ↑                                   ↑
           └────── Control Plane ──────────────┘
                   (Istio/Linkerd)

The sidecar proxy intercepts all inbound and outbound traffic. The application sends plain HTTP; the sidecar handles TLS, retries, load balancing, and telemetry transparently.
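
For a concrete picture of how the sidecar ends up in the pod: with Istio, a common approach is to label a namespace so the control plane injects the proxy into every pod created there. A minimal sketch, assuming Istio's automatic sidecar injection is installed; the namespace name `shop` is hypothetical.

```yaml
# Hypothetical namespace; pods created here get an Envoy sidecar injected.
apiVersion: v1
kind: Namespace
metadata:
  name: shop                  # assumed name, not from this guide
  labels:
    istio-injection: enabled  # label read by Istio's sidecar injector webhook
```

Injection happens at pod creation, so pods that already exist typically pick up the sidecar only after they are restarted.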


Core Capabilities

Mutual TLS (mTLS)

Encrypt and mutually authenticate all service-to-service traffic without application changes, a building block for zero-trust networking:

# Istio: Enable strict mTLS for the entire mesh
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT  # All traffic must be mTLS
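
Switching the whole mesh to STRICT in one step can break workloads that still receive plaintext traffic from clients without sidecars. One way to stage the rollout is a namespace-scoped policy that temporarily overrides the mesh-wide default. A minimal sketch, assuming a namespace named `legacy` (hypothetical) that is still being migrated:

```yaml
# Namespace-scoped override: workloads in `legacy` accept both mTLS and plaintext
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: legacy      # hypothetical namespace still being migrated
spec:
  mtls:
    mode: PERMISSIVE     # temporary; remove once every client has a sidecar
```

Deleting this override once migration is complete returns the namespace to the mesh-wide STRICT policy.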

Traffic Management

# Canary deployment: 90% to v1, 10% to v2
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - route:
        - destination:
            host: order-service
            subset: v1
          weight: 90
        - destination:
            host: order-service
            subset: v2
          weight: 10
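
The `v1` and `v2` subsets referenced in the route are not defined by the VirtualService itself; in Istio they come from a DestinationRule for the same host. A minimal sketch, assuming the two Deployments carry a `version` pod label (the label key and values are assumptions, not taken from this guide):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  subsets:
    - name: v1
      labels:
        version: v1   # assumed pod label on the stable Deployment
    - name: v2
      labels:
        version: v2   # assumed pod label on the canary Deployment
```

Shifting more traffic to the canary is then a matter of editing the two weights in the VirtualService, which conventionally add up to 100.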

Retry and Timeout Policies

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-service
spec:
  hosts:
    - payment-service
  http:
    - route:
        - destination:
            host: payment-service
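      retries:
        attempts: 3           # retries allowed in addition to the initial try
        perTryTimeout: 2s     # deadline for each individual try
        retryOn: "5xx,connect-failure,retriable-4xx"  # Envoy retry conditions
      timeout: 10s            # overall deadline across all tries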
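
Two details of this policy are worth checking before copying it. As the Istio reference describes it, `attempts` counts retries on top of the initial try, so the worst case here is roughly 4 tries × 2s = 8s plus backoff, which still fits under the 10s overall timeout. Second, retrying on `5xx` can repeat a request that the server already processed, which is risky for non-idempotent calls such as charging a card. A more conservative alternative, sketched under the assumption that payment requests are not idempotent, retries only when the request never reached the application:

```yaml
# Alternative to the policy above (apply one or the other, not both)
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-service
spec:
  hosts:
    - payment-service
  http:
    - route:
        - destination:
            host: payment-service
      retries:
        attempts: 2
        perTryTimeout: 2s
        # Envoy retry conditions that fire before the upstream handles the request
        retryOn: "connect-failure,refused-stream"
      timeout: 10s
```

The specific numbers are illustrative; the design choice is to limit retries to failures where a duplicate side effect is not possible.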

Circuit Breaking

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payment-service
spec:
  host: payment-service
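  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100            # cap on HTTP/1.1 and TCP connections to the destination
      http:
        h2UpgradePolicy: DEFAULT       # follow the mesh-wide HTTP/2 upgrade setting
        http1MaxPendingRequests: 100   # requests allowed to queue while waiting for a connection
        http2MaxRequests: 1000         # cap on concurrently active requests
    outlierDetection:
      consecutive5xxErrors: 5          # eject a host after 5 consecutive 5xx responses
      interval: 30s                    # how often hosts are evaluated for ejection
      baseEjectionTime: 30s            # minimum ejection duration
      maxEjectionPercent: 50           # never eject more than half of the hosts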

Mesh Selection

| Feature | Istio | Linkerd | Consul Connect |
|---|---|---|---|
| Complexity | High | Low | Medium |
| Resource overhead | ~100 MB per sidecar | ~20 MB per sidecar | Medium |
| mTLS | Yes | Yes | Yes |
| Traffic management | Advanced | Basic | Medium |
| Multi-cluster | Yes | Yes | Yes |
| Best for | Complex requirements | Simplicity, performance | HashiCorp ecosystem |

Observability

Because every request passes through a sidecar, a service mesh emits uniform telemetry without per-service instrumentation:

Metrics (per-service, per-endpoint):
  - Request rate, error rate, latency (RED metrics)
  - Connection count, bytes transferred
  - Retry count, circuit breaker trips

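Traces:
  - Spans generated automatically at each sidecar
  - Full distributed traces across mesh services, provided each application forwards the incoming trace headers (for example traceparent or the B3 headers) on its outbound calls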

Access Logs:
  - Structured logs for every request
  - Source, destination, duration, status code, response size
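
Not all of this is necessarily switched on out of the box; in Istio, for example, Envoy access logs are typically disabled in the default installation. One way to enable them mesh-wide is the Telemetry API. A minimal sketch, assuming an Istio version that serves `telemetry.istio.io/v1alpha1` and ships the built-in `envoy` log provider; the resource name `mesh-default` is an arbitrary choice:

```yaml
# Enable Envoy access logging for every workload in the mesh
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default        # arbitrary name chosen for this example
  namespace: istio-system   # root namespace, so the policy applies mesh-wide
spec:
  accessLogging:
    - providers:
        - name: envoy       # built-in provider that writes to the sidecar's stdout
```

Creating the same resource in an application namespace instead would scope the logging to that namespace only.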

Anti-Patterns

| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Mesh for < 10 services | Operational overhead exceeds benefit | Use simple retry libraries instead |
| Not monitoring sidecar resources | Memory/CPU overhead ignored | Monitor and set sidecar resource limits |
| mTLS in permissive mode forever | False sense of security | Set to strict after testing |
| Overly aggressive retries | Amplify failures during outages | Retry budgets, exponential backoff |
| Mesh as a substitute for good design | Infrastructure cannot fix bad architecture | Fix service boundaries first |
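
For the sidecar resource anti-pattern, Istio lets you set the proxy's requests and limits per workload through pod annotations. A minimal sketch, assuming Istio sidecar injection; the Deployment, the image, and all resource values are illustrative placeholders rather than recommendations:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
      annotations:
        # Istio injector annotations; values are illustrative only
        sidecar.istio.io/proxyCPU: "100m"           # sidecar CPU request
        sidecar.istio.io/proxyMemory: "128Mi"       # sidecar memory request
        sidecar.istio.io/proxyCPULimit: "500m"      # sidecar CPU limit
        sidecar.istio.io/proxyMemoryLimit: "256Mi"  # sidecar memory limit
    spec:
      containers:
        - name: app
          image: registry.example.com/order-service:1.0.0  # hypothetical image
```

Sizing these values from observed sidecar usage, rather than guessing, is what turns the limit into a useful guardrail.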

A service mesh is infrastructure for infrastructure. It is most valuable when the alternative is implementing the same cross-cutting concerns in 50 different services in 5 different languages.

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
