Identity and Access Management Architecture
Design IAM systems that balance security with usability. Covers authentication protocols, authorization models, identity federation, session management, API key patterns, machine-to-machine auth, and the IAM architecture decisions that protect without paralyzing.
IAM determines who can access what resources under which conditions. It is the front door to every system, the authorization check on every API call, and the audit trail for every action. Getting IAM wrong means either a security breach (too permissive) or a usability nightmare (too restrictive).
Authentication vs Authorization
Authentication (AuthN): "Who are you?"
Proves identity via credentials
Methods: Password, MFA, SSO, certificates, biometrics
Authorization (AuthZ): "What are you allowed to do?"
Determines permissions based on identity
Models: RBAC, ABAC, PBAC, ReBAC
Authentication Protocols
| Protocol | Use Case | Token Type |
|---|---|---|
| OAuth 2.0 | Delegated authorization | Access token (JWT or opaque) |
| OIDC | Authentication + user info | ID token (JWT) |
| SAML 2.0 | Enterprise SSO | XML assertion |
| mTLS | Service-to-service | X.509 certificate |
| API Keys | Machine access, public APIs | Static key |
| WebAuthn/FIDO2 | Passwordless, phishing-resistant | Public key credential |
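For the OIDC row, a relying party checks the ID token's claims after verifying its signature. The sketch below covers only the claim checks (issuer, audience, expiry) on an already verified claims dictionary. The function name, the `leeway` parameter, and the example issuer and client ID are illustrative assumptions, and signature verification is deliberately left to a JWT library.

```python
import time


def validate_id_token_claims(claims, expected_issuer, expected_audience,
                             now=None, leeway=30):
    """Return a list of problems found in signature-verified ID token claims.

    Assumes the JWT signature was already verified elsewhere; this only
    checks the iss, aud, and exp claims.
    """
    now = time.time() if now is None else now
    problems = []

    # The issuer must match the identity provider we trust
    if claims.get("iss") != expected_issuer:
        problems.append("issuer mismatch")

    # aud may be a single string or a list of strings
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_audience not in audiences:
        problems.append("audience mismatch")

    # exp is seconds since the epoch; allow a small clock-skew leeway
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp + leeway:
        problems.append("token expired or missing exp")

    return problems


# Hypothetical values for illustration
claims = {"iss": "https://auth.company.com/", "aud": "orders-web", "exp": 1000}
ok = validate_id_token_claims(claims, "https://auth.company.com/", "orders-web", now=900)
```

An empty list means the claims passed. Returning the reasons rather than a bare boolean keeps rejected logins explainable in the audit trail.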
Authorization Models
RBAC (Role-Based)
roles:
  viewer:
    permissions: [read:orders, read:products]
  editor:
    permissions: [read:orders, write:orders, read:products, write:products]
  admin:
    permissions: [read:*, write:*, delete:*, manage:users]
assignments:
  alice: [admin]
  bob: [editor]
  carol: [viewer]
# Simple to audit, but coarse: every per-resource or per-team exception needs its own role.
# With 1000 users and 50 resources, fine-grained rules quickly cause role explosion.
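The role definitions above can be evaluated with a few lines of code. This is a minimal sketch that mirrors the roles and assignments shown. The `is_allowed` helper and the use of glob matching for `*` are assumptions about how the wildcard permissions are meant to be interpreted.

```python
from fnmatch import fnmatchcase

# Same roles and assignments as the configuration above
ROLES = {
    "viewer": ["read:orders", "read:products"],
    "editor": ["read:orders", "write:orders", "read:products", "write:products"],
    "admin": ["read:*", "write:*", "delete:*", "manage:users"],
}
ASSIGNMENTS = {"alice": ["admin"], "bob": ["editor"], "carol": ["viewer"]}


def is_allowed(user, permission):
    """Return True if any role assigned to the user grants the permission."""
    for role in ASSIGNMENTS.get(user, []):
        for pattern in ROLES.get(role, []):
            # "read:*" matches "read:orders", "read:products", and so on
            if fnmatchcase(permission, pattern):
                return True
    # Unknown users and unknown roles are denied by default
    return False
```

Here `is_allowed("carol", "write:orders")` is denied while `is_allowed("alice", "delete:products")` is granted through the `delete:*` wildcard.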
ABAC (Attribute-Based)
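# Policy: "Users can edit orders in their own department during business hours"
# role_permissions is an illustrative mapping; load it from your policy store in practice
role_permissions = {"editor": {"read", "edit"}, "admin": {"read", "edit", "delete"}}

def check_access(user, action, resource, context):
    policies = [
        # Department match
        lambda: user.department == resource.department,
        # Action allowed for role (unknown roles get no permissions)
        lambda: action in role_permissions.get(user.role, set()),
        # Business hours, 09:00 to 17:00; admins are exempt
        lambda: 9 <= context.current_hour < 17 or user.role == "admin",
        # Not suspended
        lambda: not user.is_suspended,
    ]
    # Every policy must pass; all() stops at the first failure
    return all(policy() for policy in policies)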
ReBAC (Relationship-Based)
# Google Zanzibar-style model (used at Google; Airbnb and others run similar systems)
# "Can user:alice view document:123?"
# Relationships:
document:123#owner@user:alice # Alice owns doc 123
document:123#viewer@group:engineering # Engineering can view doc 123
group:engineering#member@user:bob # Bob is in engineering
# Check: user:bob → member of group:engineering → viewer of document:123
# Result: Bob can view doc 123 (through group membership)
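The check above is a graph walk over relationship tuples. The sketch below stores the three tuples from the example and resolves group membership recursively. It is a simplification under stated assumptions: a `group:` subject is treated as "every member of that group" (Zanzibar itself writes this as `group:engineering#member`), and there is no rule that owners are also viewers.

```python
# (object, relation, subject) tuples from the example above
TUPLES = {
    ("document:123", "owner", "user:alice"),
    ("document:123", "viewer", "group:engineering"),
    ("group:engineering", "member", "user:bob"),
}


def check(obj, relation, user, tuples=TUPLES, _seen=None):
    """Return True if user holds relation on obj, directly or via a group."""
    seen = set() if _seen is None else _seen
    if (obj, relation) in seen:
        return False  # Already explored; guards against membership cycles
    seen.add((obj, relation))

    for t_obj, t_rel, subject in tuples:
        if (t_obj, t_rel) != (obj, relation):
            continue
        if subject == user:
            return True  # Direct relationship
        # Assumption: a group subject grants the relation to all its members
        if subject.startswith("group:") and check(subject, "member", user, tuples, seen):
            return True
    return False
```

`check("document:123", "viewer", "user:bob")` succeeds through the group, while Alice is only an owner here because nothing in these tuples says that owners are also viewers.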
Session Management
# Secure session configuration
session_config = {
    "cookie": {
        "name": "__session",
        "httpOnly": True,          # Not readable from JavaScript; limits theft via XSS
        "secure": True,            # Sent over HTTPS only
        "sameSite": "Strict",      # Not sent on cross-site requests (CSRF protection)
        "maxAge": 3600,            # 1 hour, in seconds
        "domain": ".company.com",  # Shared with all subdomains; omit for a host-only cookie
        "path": "/",
    },
    "store": "redis",              # Server-side session store
    "id_length": 128,              # Bits of entropy in the session ID
    "rotation": True,              # New session ID after login (prevents session fixation)
    "absolute_timeout": 28800,     # 8 hours max, in seconds
    "idle_timeout": 3600,          # 1 hour inactive, in seconds
}
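The ID length, rotation, and timeout settings above translate into a small amount of server-side logic. The following is a sketch using only the standard library. The function names and the in-memory dict standing in for a session record are assumptions, not part of any particular framework.

```python
import secrets
import time

ID_BYTES = 16             # 16 bytes = 128 bits, matching "id_length": 128
IDLE_TIMEOUT = 3600       # Seconds, matching "idle_timeout"
ABSOLUTE_TIMEOUT = 28800  # Seconds, matching "absolute_timeout"


def new_session(now=None):
    """Create a session record with a cryptographically random ID."""
    now = time.time() if now is None else now
    return {"id": secrets.token_urlsafe(ID_BYTES), "created_at": now, "last_seen": now}


def is_expired(session, now=None):
    """A session expires when idle too long or past its absolute lifetime."""
    now = time.time() if now is None else now
    return (now - session["last_seen"] > IDLE_TIMEOUT
            or now - session["created_at"] > ABSOLUTE_TIMEOUT)


def rotate_id(session):
    """Issue a fresh ID (for example right after login), keeping the timestamps."""
    return {**session, "id": secrets.token_urlsafe(ID_BYTES)}
```

The absolute timeout is measured from `created_at`, which rotation does not reset, so an active user is still forced to re-authenticate after 8 hours.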
Machine-to-Machine Auth
# OAuth 2.0 Client Credentials Flow
import os
import time

import requests

def get_service_token():
    """Request a token and return the full JSON response (access_token, expires_in, ...)."""
    response = requests.post(
        "https://auth.company.com/oauth/token",
        data={
            "grant_type": "client_credentials",
            "client_id": os.environ["SERVICE_CLIENT_ID"],
            "client_secret": os.environ["SERVICE_CLIENT_SECRET"],
            "audience": "https://api.company.com",
            "scope": "orders:read orders:write",
        },
        timeout=10,  # Never block indefinitely on the auth server
    )
    response.raise_for_status()  # Surface 4xx/5xx instead of failing on a missing key
    return response.json()

# Token caching with refresh
class ServiceTokenManager:
    def __init__(self):
        self._token = None
        self._expires_at = 0

    def get_token(self):
        if time.time() > self._expires_at - 60:  # Refresh 60s early
            self._refresh()
        return self._token

    def _refresh(self):
        # Needs the whole response: the token and its lifetime
        result = get_service_token()
        self._token = result["access_token"]
        self._expires_at = time.time() + result["expires_in"]
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Homegrown auth | Security vulnerabilities, maintenance burden | Use established protocols (OIDC, OAuth) |
| Long-lived tokens | Compromised token = persistent access | Short TTL + refresh tokens |
| No MFA for admin accounts | Single point of compromise | MFA required for privileged access |
| Shared service accounts | No attribution, no audit trail | Individual machine identities |
| Hardcoded API keys | Rotation requires a code change and redeploy; keys leak through source control | Secret manager with rotation |
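The same care applies on the issuing side of API keys: a service that hands out keys can store only a hash of each key, so a database leak does not expose usable credentials. This is a minimal sketch using the standard library. The `sk` prefix and both function names are illustrative assumptions, and a plain SHA-256 digest is used on the assumption that keys are long random values rather than human-chosen passwords.

```python
import hashlib
import hmac
import secrets


def generate_api_key(prefix="sk"):
    """Return (key, digest). Show the key to the client once; store only the digest."""
    key = f"{prefix}_{secrets.token_urlsafe(32)}"  # 256 bits of randomness
    digest = hashlib.sha256(key.encode()).hexdigest()
    return key, digest


def verify_api_key(presented, stored_digest):
    """Hash the presented key and compare it to the stored digest in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)


key, digest = generate_api_key()
```

Rotation then means issuing a new key, storing its digest next to the old one for a grace period, and deleting the old digest afterwards.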
IAM is the foundation of security. Every other control, from network segmentation to audit logging, assumes that identities are verified and permissions are enforced correctly. If IAM is broken, none of those controls can be trusted.