Reverse ETL Patterns
Push transformed warehouse data back into operational tools. Covers reverse ETL architecture, common destinations, sync strategies, data activation workflows, and the patterns that close the loop between analytics and operations.
Reverse ETL takes data from your data warehouse and pushes it back into operational systems — CRMs, marketing platforms, customer support tools, and product applications. It closes the gap between analytical insights and operational action.
Why Reverse ETL
Traditional ETL:
Operational Systems → Data Warehouse → Dashboards
Result: Insights trapped in dashboards
Reverse ETL:
Operational Systems → Data Warehouse → Operational Systems
Result: Insights drive automated actions
Examples:
- Warehouse identifies high-value customers → Sync to CRM
- ML model scores churn risk → Push to customer success tool
- Product analytics segments → Sync to marketing platform
- Usage data → Billing system for usage-based pricing
Architecture
Data Warehouse (source of truth for analytics)
├── Customer 360 table
├── Product usage aggregates
├── ML model predictions
└── Revenue analytics
↓
Reverse ETL Tool (Census, Hightouch, Polytoml)
├── Model: SQL query or dbt model defining the data
├── Mapping: Warehouse column → Destination field
├── Sync mode: Full refresh, incremental, mirror
└── Schedule: Real-time, hourly, daily
↓
Destinations
├── Salesforce (lead scoring, customer data)
├── HubSpot (marketing segments, lifecycle stage)
├── Intercom (product usage, support context)
├── Stripe (billing metadata)
└── Slack (alerts, notifications)
Sync Strategies
Incremental Sync
-- Only sync records changed since last sync
SELECT
customer_id,
name,
email,
lifetime_value,
churn_score,
product_tier,
last_activity_at
FROM analytics.customer_360
WHERE updated_at > '{{ last_sync_timestamp }}'
Mirror Sync
-- Full state of the dataset, handles deletes
-- Reverse ETL tool compares against destination
-- Adds new records, updates changed, removes deleted
SELECT
segment_id,
customer_id,
segment_name,
entered_at
FROM analytics.customer_segments
WHERE segment_name = 'high_value_at_risk'
Data Activation Workflows
# Lead scoring workflow
model:
name: "enriched_leads"
query: |
SELECT
l.email,
l.company,
CASE
WHEN u.page_views > 100 AND u.trial_days_remaining < 7 THEN 'hot'
WHEN u.page_views > 50 THEN 'warm'
ELSE 'cold'
END as lead_score,
u.features_used,
u.last_active_at
FROM warehouse.leads l
JOIN warehouse.product_usage u ON l.user_id = u.user_id
destination: salesforce
mapping:
email: Email
company: Company
lead_score: Lead_Score__c
features_used: Features_Used__c
last_active_at: Last_Product_Activity__c
schedule: every_hour
sync_mode: incremental
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Syncing raw data | Destination polluted with noise | Sync curated, purpose-built models |
| No deduplication | Duplicate records in CRM | Deduplicate in warehouse before sync |
| Real-time sync for batch use cases | Unnecessary complexity and cost | Match sync frequency to business need |
| No monitoring | Silent sync failures | Alert on sync errors, row count anomalies |
| Bi-directional sync without conflict resolution | Data ping-pong between systems | One system is source of truth per field |
Reverse ETL turns the warehouse from a reporting tool into an operational engine. The insight is only valuable if it reaches the person or system that can act on it.