# Real-Time Streaming with Kafka: Architecture Guide
Design production Kafka architectures. Covers topic design, partitioning, consumer groups, exactly-once semantics, Kafka Connect, and operational best practices.
Apache Kafka is the backbone of real-time data infrastructure at companies processing millions of events per second. Netflix, LinkedIn, Uber, and thousands of enterprises rely on Kafka for event streaming, log aggregation, CDC (Change Data Capture), and real-time analytics.
But Kafka is infrastructure, not a solution — a misconfigured topic, a lagging consumer group, or a bad partitioning strategy can turn your real-time pipeline into a real-time bottleneck. This guide covers topic design, producer and consumer patterns, Kafka Connect, monitoring, and the operational knowledge needed to run Kafka in production without surprises.
## Core Concepts
```
Producers → Topics (partitioned, replicated) → Consumer Groups → Downstream Systems
                        │
                        ├── Partition 0: [msg1, msg4, msg7, ...] ── Consumer A
                        ├── Partition 1: [msg2, msg5, msg8, ...] ── Consumer B
                        └── Partition 2: [msg3, msg6, msg9, ...] ── Consumer C
```
| Concept | Description | Key Detail |
|---|---|---|
| Topic | Named stream of records (like a table) | Immutable append-only log |
| Partition | Ordered, immutable sequence within a topic | Unit of parallelism |
| Producer | Publishes records to topics | Controls partition assignment via key |
| Consumer Group | Set of consumers sharing work across partitions | Each partition assigned to exactly one consumer |
| Offset | Position of a consumer within a partition | Consumers track their own position |
| Broker | Kafka server that stores and serves data | Typically 3+ brokers per cluster |
| Replication Factor | Number of copies per partition | RF=3 means 3 brokers have the data |
| ISR (In-Sync Replicas) | Replicas that are caught up with the leader | Dropped ISR = potential data loss risk |
## When to Use Kafka vs Alternatives
| Use Case | Kafka | SQS/SNS | RabbitMQ | Redis Streams |
|---|---|---|---|---|
| High-throughput event streaming | ✅ Best | ❌ | ⚠️ | ⚠️ |
| Log aggregation | ✅ Best | ❌ | ❌ | ❌ |
| Simple task queue | ⚠️ Overkill | ✅ Best | ✅ | ✅ |
| CDC (Change Data Capture) | ✅ Best | ❌ | ❌ | ❌ |
| Request/reply pattern | ❌ Wrong tool | ⚠️ | ✅ Best | ✅ |
| Event replay/reprocessing | ✅ Best | ❌ | ❌ | ⚠️ |
## Topic Design

### Naming Convention

A consistent naming convention prevents confusion as topics multiply:

```
<domain>.<entity>.<event-type>
```
Examples:

```
orders.created           ← New order placed
orders.shipped           ← Order shipped
payments.processed       ← Payment completed
payments.failed          ← Payment failed
users.profile-updated    ← User profile changed
inventory.stock-changed  ← Stock level update
analytics.page-viewed    ← Page view event
cdc.public.orders        ← CDC from PostgreSQL orders table
```
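A small guard in CI or a topic-provisioning script can enforce the convention. The regex below is an assumption of this guide's convention, not a Kafka rule — it loosely accepts the two- and three-segment forms shown above:

```python
import re

# 2-3 dot-separated segments of lowercase words/hyphens, per this guide's convention.
TOPIC_NAME = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*(?:\.[a-z0-9]+(?:-[a-z0-9]+)*){1,2}$")

def is_valid_topic(name: str) -> bool:
    """Return True if the topic name follows the <domain>.<entity>.<event-type> convention."""
    return bool(TOPIC_NAME.match(name))
```

Rejecting names like `Orders.Created` or a bare `orders` at provisioning time is far cheaper than renaming a topic that consumers already depend on.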
### Partitioning Strategy
Partitioning determines parallelism and ordering. Choose your partition key carefully — you cannot change it later without creating a new topic.
```python
# Partition by entity ID — all events for an entity go to the same partition
producer.produce(
    topic="orders.created",
    key=order.customer_id.encode(),   # Partition key
    value=json.dumps(order).encode()
)
# This ensures:
# 1. All orders for customer_123 go to the SAME partition
# 2. Events for customer_123 are processed in ORDER
# 3. Different customers are distributed across partitions for parallelism
```
| Partition Count | Throughput | Ordering | When to Use |
|---|---|---|---|
| 1 | Low, but total ordering | Global order guaranteed | Strict sequential processing |
| 6-12 | Good for most workloads | Per-key ordering | Default starting point |
| 50-100 | High throughput | Per-key ordering | High-volume production |
| 500+ | Maximum throughput | Per-key ordering | Extreme scale (Netflix-level) |
Rule: Start with 6-12 partitions. You can increase partitions later, but be aware that increasing partitions redistributes keys — messages for the same key may land on different partitions after the change.
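The key-redistribution caveat can be illustrated with a toy partitioner. This sketch uses CRC32 for determinism — Kafka's default partitioner actually uses murmur2 — but the modulo effect is the same:

```python
from zlib import crc32

def toy_partition(key: bytes, num_partitions: int) -> int:
    """Illustrative only: Kafka's default partitioner uses murmur2, not CRC32."""
    return crc32(key) % num_partitions

keys = [f"customer_{i}".encode() for i in range(100)]

before = {k: toy_partition(k, 6) for k in keys}    # original topic: 6 partitions
after = {k: toy_partition(k, 12) for k in keys}    # topic expanded to 12 partitions

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys map to a different partition after expansion")
```

Any key that moves loses its per-key ordering guarantee across the boundary, which is why high-volume teams often over-provision partitions up front rather than expand later.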
## Producer Patterns

```python
import json
import logging

from confluent_kafka import Producer

log = logging.getLogger(__name__)

conf = {
    'bootstrap.servers': 'kafka-1:9092,kafka-2:9092,kafka-3:9092',
    'acks': 'all',                # Wait for all ISR replicas to acknowledge
    'enable.idempotence': True,   # Exactly-once per partition (prevents duplicates on retry)
    'max.in.flight.requests.per.connection': 5,  # Max allowed with idempotence
    'retries': 2147483647,        # Effectively infinite retries (idempotence handles dedup)
    'linger.ms': 5,               # Wait up to 5 ms to batch messages
    'batch.size': 32768,          # 32 KB batch size
    'compression.type': 'lz4',    # LZ4 for best speed; zstd for best ratio
}
producer = Producer(conf)

def on_delivery(err, msg):
    if err:
        log.error(f"Delivery failed for {msg.key()}: {err}")
        # Alert, retry, or send to a dead letter topic
    else:
        log.debug(f"Delivered to {msg.topic()}[{msg.partition()}] @ offset {msg.offset()}")

producer.produce(
    topic='orders.created',
    key=order_id.encode(),
    value=json.dumps(event).encode(),
    callback=on_delivery
)
producer.poll(0)   # Serve delivery callbacks
producer.flush()   # Block until all outstanding messages are delivered
```
### Producer Configuration Guide

| Setting | Default | Recommended | Why |
|---|---|---|---|
| `acks` | `1` | `all` | Prevents data loss on broker failure |
| `enable.idempotence` | `false` | `true` | Prevents duplicate messages on retry |
| `compression.type` | `none` | `lz4` or `zstd` | 40-60% bandwidth reduction |
| `linger.ms` | `0` | `5`-`50` | Batching improves throughput dramatically |
| `batch.size` | `16384` | `32768`-`65536` | Larger batches = fewer requests |
## Consumer Patterns

```python
import json

from confluent_kafka import Consumer

conf = {
    'bootstrap.servers': 'kafka-1:9092,kafka-2:9092,kafka-3:9092',
    'group.id': 'order-processor',
    'auto.offset.reset': 'earliest',   # Start from beginning if no offset is stored
    'enable.auto.commit': False,       # Manual commit for at-least-once processing
    'max.poll.interval.ms': 300000,    # 5 min max processing time per batch
    'session.timeout.ms': 45000,       # 45 s heartbeat timeout
    'fetch.min.bytes': 1024,           # Wait for at least 1 KB to reduce requests
}
consumer = Consumer(conf)
consumer.subscribe(['orders.created'])

while True:
    msg = consumer.poll(1.0)
    if msg is None:
        continue
    if msg.error():
        handle_error(msg.error())
        continue
    try:
        event = json.loads(msg.value())
        process_order(event)
        consumer.commit(msg)   # Commit ONLY after successful processing
    except RetryableError as e:
        log.warning(f"Retryable error, will retry: {e}")
        # Don't commit — message will be reprocessed
    except Exception as e:
        send_to_dlq(msg, error=str(e))   # Dead letter queue for poison pills
        consumer.commit(msg)             # Commit to advance past the bad message
```
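The `send_to_dlq` helper used above is left undefined. One possible shape — the `.dlq` topic suffix and `dlq.*` header names are conventions assumed here, not Kafka built-ins, and the producer is passed in explicitly — is to republish the failed record with error metadata in headers:

```python
import time

def dlq_headers(source_topic, partition, offset, error):
    """Error-metadata headers for a dead-lettered record (header names are our convention)."""
    return [
        ('dlq.error', error.encode()),
        ('dlq.source.topic', source_topic.encode()),
        ('dlq.source.partition', str(partition).encode()),
        ('dlq.source.offset', str(offset).encode()),
        ('dlq.timestamp', str(int(time.time())).encode()),
    ]

def send_to_dlq(producer, msg, error):
    """Republish a poison-pill message to a parallel <topic>.dlq topic."""
    producer.produce(
        topic=f"{msg.topic()}.dlq",   # e.g. orders.created -> orders.created.dlq
        key=msg.key(),
        value=msg.value(),
        headers=dlq_headers(msg.topic(), msg.partition(), msg.offset(), error),
    )
    producer.flush()
```

Keeping the original key and value untouched means the DLQ topic can be reprocessed later with the same consumer code once the bug is fixed.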
### Consumer Group Scaling
| Consumers in Group | Partitions | Assignment | Notes |
|---|---|---|---|
| 3 consumers | 6 partitions | 2 partitions each | Balanced |
| 6 consumers | 6 partitions | 1 partition each | Maximum parallelism |
| 9 consumers | 6 partitions | 6 active, 3 idle | Wasted resources — max consumers = partition count |
## Kafka Connect

Pre-built connectors for moving data in and out of Kafka without writing custom code. Use Debezium for CDC (capturing database changes as Kafka events).

```json
{
  "name": "postgres-source",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "db.internal",
    "database.port": "5432",
    "database.dbname": "orders_db",
    "database.user": "kafka_connect",
    "topic.prefix": "cdc",
    "table.include.list": "public.orders,public.order_items",
    "slot.name": "kafka_connect_slot",
    "publication.name": "kafka_connect_pub",
    "transforms": "route",
    "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.route.regex": "cdc.public.(.*)",
    "transforms.route.replacement": "orders.$1"
  }
}
```
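Connectors are registered by POSTing JSON like the above to the Connect REST API (`POST /connectors`). A minimal sketch — `connect.internal:8083` is a placeholder for your Connect worker address, and the trimmed config dict is illustrative:

```python
import json
import urllib.request

def register_connector(connect_url: str, connector_config: dict) -> urllib.request.Request:
    """Build the POST request that registers a connector with the Connect REST API."""
    return urllib.request.Request(
        url=f"{connect_url}/connectors",
        data=json.dumps(connector_config).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = register_connector(
    "http://connect.internal:8083",
    {"name": "postgres-source",
     "config": {"connector.class": "io.debezium.connector.postgresql.PostgresConnector"}},
)
# urllib.request.urlopen(req) would submit it; HTTP 201 means the connector was accepted.
```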
| Direction | Popular Connectors | Use Case |
|---|---|---|
| Source → Kafka | Debezium (CDC) | Real-time database replication |
| Source → Kafka | JDBC Source | Periodic database polling |
| Source → Kafka | S3 Source | File ingestion |
| Kafka → Sink | Elasticsearch | Search indexing |
| Kafka → Sink | S3 Sink | Data lake ingestion |
| Kafka → Sink | Snowflake/BigQuery | Data warehouse loading |
| Kafka → Sink | JDBC Sink | Database synchronization |
## Operational Best Practices

### Monitoring
| Metric | Alert Threshold | Severity | Tool |
|---|---|---|---|
| Consumer lag | > 10,000 messages | Warning | Burrow, Kafka Exporter |
| Consumer lag | > 100,000 messages | Critical | Burrow, Kafka Exporter |
| Under-replicated partitions | > 0 for > 5 min | Critical | JMX / Prometheus |
| Request latency (p99) | > 100ms | Warning | JMX / Prometheus |
| Disk usage per broker | > 75% | Warning | Node exporter |
| ISR (In-Sync Replicas) shrink | Any shrink event | Critical | JMX / Prometheus |
| Active controller count | != 1 | Critical | JMX / Prometheus |
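Consumer lag is the gap between a partition's latest offset (the high watermark) and the group's committed offset. The arithmetic is trivial; a sketch with the core computation as a pure function and the live `confluent_kafka` calls (which assume a reachable broker) shown as comments:

```python
def partition_lag(high_watermark: int, committed_offset: int) -> int:
    """Lag = newest offset in the partition minus the group's committed position."""
    if committed_offset < 0:      # No committed offset yet (e.g. OFFSET_INVALID)
        return high_watermark
    return high_watermark - committed_offset

# Live usage sketch (requires a broker):
# from confluent_kafka import Consumer, TopicPartition
# consumer = Consumer({'bootstrap.servers': '...', 'group.id': 'order-processor'})
# tp = TopicPartition('orders.created', 0)
# committed = consumer.committed([tp])[0].offset
# low, high = consumer.get_watermark_offsets(tp)
# print(partition_lag(high, committed))
```

In practice Burrow or Kafka Exporter computes this per partition and exposes it to Prometheus, so alerts fire on the thresholds in the table above rather than on ad-hoc scripts.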
### Retention & Storage

```properties
# Topic-level configuration
retention.ms=604800000        # 7 days (default for event topics)
retention.bytes=10737418240   # 10 GB per partition (cap for cost control)
cleanup.policy=delete         # Delete old segments after retention
# cleanup.policy=compact      # Keep latest value per key (changelog/state)
segment.ms=3600000            # Roll segments every hour
min.insync.replicas=2         # At least 2 replicas must acknowledge (with acks=all)
```
### Schema Management

Always use a schema registry (Confluent Schema Registry or Apicurio) with Avro or Protobuf schemas. Without it, a producer change can break every downstream consumer.

```
# Schema evolution rules
# BACKWARD compatible: new schema can read old data (add fields with defaults)
# FORWARD compatible: old schema can read new data (add optional fields)
# FULL compatible: both backward and forward (safest, recommended)
```
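For example, a BACKWARD-compatible change to an Avro schema adds the new field with a default, so readers on the new schema can still decode records written with the old one. The field names below are illustrative:

```python
# Avro record schema, v1
order_v1 = {
    "type": "record", "name": "OrderCreated",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "customer_id", "type": "string"},
    ],
}

# BACKWARD-compatible v2: the added field carries a default,
# so v2 readers fill it in when decoding v1 records.
order_v2 = {
    "type": "record", "name": "OrderCreated",
    "fields": order_v1["fields"] + [
        {"name": "currency", "type": "string", "default": "USD"},
    ],
}
```

Adding the same field without a `default` would break backward compatibility, and a registry configured with `BACKWARD` (or `FULL`) mode would reject the registration.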
## Checklist

- Topic naming convention established and documented
- Partition count set based on throughput needs (start 6-12)
- Replication factor ≥ 3 for all production topics
- `min.insync.replicas=2` configured
- Producer idempotence enabled (`enable.idempotence=true`)
- Consumer manual offset commit (not auto-commit)
- Dead letter queue (DLQ) configured for failed messages
- Consumer lag monitoring active with alerts
- Retention policy configured per topic (time + size limits)
- Schema registry deployed with compatibility mode set
- Broker disk space monitored with 75% alert threshold
- Cluster has at least 3 brokers for fault tolerance
:::note[Source]
This guide is derived from operational intelligence at Garnet Grid Consulting. For data engineering consulting, visit garnetgrid.com.
:::