Recommendation Systems
Build recommendation engines that surface relevant content, products, and experiences. Covers collaborative filtering, content-based filtering, hybrid approaches, evaluation metrics, the cold start problem, and the patterns that power personalized recommendations at scale.
Recommendation systems predict what a user will want to see or buy next. Widely cited industry estimates credit them with roughly 35% of Amazon’s revenue, 80% of Netflix viewing, and 60% of YouTube watch time. The difference between a good and a great recommendation system is the difference between suggesting “more shoes” and suggesting the exact shoe the user will love.
Recommendation Approaches
Collaborative Filtering:
  "Users who liked X also liked Y"
  Based on user behavior patterns
  No need to understand content

Content-Based Filtering:
  "You liked action movies, here's another action movie"
  Based on item attributes and user preferences
  Works for new items (no cold start for items)

Hybrid:
  Combine collaborative + content-based
  Best of both approaches
  Most production systems are hybrid

Knowledge-Based:
  Rules and constraints
  "Budget < $100, size M, color blue"
  Best for high-consideration purchases
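
To make the content-based and hybrid approaches above concrete, here is a minimal sketch. It assumes items are described by binary genre flags and that a collaborative score per item is already available; the feature matrix, the score values, and the blending weight `alpha` are all made up for illustration.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between the rows of a feature matrix."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normalized = features / np.where(norms == 0, 1.0, norms)
    return normalized @ normalized.T


def content_scores(item_features, liked_items):
    """Score every item by its mean similarity to the items a user liked."""
    sims = cosine_similarity_matrix(item_features)
    return sims[:, liked_items].mean(axis=1)


def hybrid_scores(collab, content, alpha=0.7):
    """Weighted blend; alpha is the weight on the collaborative score."""
    return alpha * collab + (1 - alpha) * content


# Hypothetical genre flags per item: [action, comedy, drama]
item_features = np.array([
    [1, 0, 0],  # item 0: action
    [1, 0, 1],  # item 1: action + drama
    [0, 1, 0],  # item 2: comedy
    [0, 0, 1],  # item 3: drama
], dtype=float)

content = content_scores(item_features, liked_items=[0])
collab = np.array([0.2, 0.4, 0.9, 0.1])  # made-up collaborative scores
blended = hybrid_scores(collab, content, alpha=0.5)
```

A weighted sum is only one way to build a hybrid. Production systems may instead feed both signals into a learned ranking model, so treat `alpha` as a placeholder for whatever combination rule you validate.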
Collaborative Filtering
# User-Item Matrix
#          Movie1  Movie2  Movie3  Movie4
# Alice:     5       3       ?       1
# Bob:       4       ?       4       1
# Carol:     1       1       5       5
# Dave:      ?       3       4       4
#
# Alice hasn't rated Movie3
# Similar users (Bob) rated Movie3: 4
# Prediction for Alice on Movie3: ~3.5-4.0
import numpy as np
from scipy.sparse.linalg import svds


class MatrixFactorization:
    def __init__(self, n_factors=50):
        self.n_factors = n_factors

    def fit(self, ratings_matrix):
        """Factor rating matrix into user and item latent factors.

        Assumes a dense float array in which np.nan marks unrated items.
        """
        ratings = np.asarray(ratings_matrix, dtype=float)
        self.rated = ~np.isnan(ratings)
        # Center ratings on each user's mean over the items they rated;
        # unrated entries become 0, i.e. they are imputed with the user mean
        self.user_means = np.nanmean(ratings, axis=1)
        centered = np.where(self.rated, ratings - self.user_means.reshape(-1, 1), 0.0)
        # Truncated SVD; svds requires k < min(n_users, n_items)
        k = min(self.n_factors, min(centered.shape) - 1)
        U, sigma, Vt = svds(centered, k=k)
        self.user_factors = U  # (n_users, k)
        self.item_factors = (np.diag(sigma) @ Vt).T  # (n_items, k)
        return self

    def get_rated_items(self, user_id):
        """Indices of items the user has already rated."""
        return np.flatnonzero(self.rated[user_id])

    def predict(self, user_id, item_id):
        """Predict rating for user-item pair."""
        prediction = (
            self.user_means[user_id] +
            self.user_factors[user_id] @ self.item_factors[item_id]
        )
        return float(np.clip(prediction, 1, 5))

    def recommend(self, user_id, n=10):
        """Get top-N recommendations for a user."""
        scores = (
            self.user_means[user_id] +
            self.user_factors[user_id] @ self.item_factors.T
        )
        # Exclude already-rated items
        already_rated = self.get_rated_items(user_id)
        scores[already_rated] = -np.inf
        top_indices = np.argsort(scores)[::-1][:n]
        return top_indices, scores[top_indices]
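
The "similar users" reasoning in the matrix comments can also be written out directly as neighbourhood-based (user-based) collaborative filtering, which is a different technique from the SVD class above. The sketch below is illustrative only: it assumes `np.nan` marks missing ratings, uses cosine similarity over co-rated movies, and averages the ratings of the `k=2` most similar users. Both the similarity measure and `k` are arbitrary choices made here.

```python
import numpy as np

# Ratings from the matrix above; np.nan marks an unrated movie.
ratings = np.array([
    [5, 3, np.nan, 1],   # Alice
    [4, np.nan, 4, 1],   # Bob
    [1, 1, 5, 5],        # Carol
    [np.nan, 3, 4, 4],   # Dave
])


def cosine_on_overlap(a, b):
    """Cosine similarity computed only over items both users rated."""
    both = ~np.isnan(a) & ~np.isnan(b)
    if not both.any():
        return 0.0
    denom = np.linalg.norm(a[both]) * np.linalg.norm(b[both])
    return float(a[both] @ b[both] / denom) if denom > 0 else 0.0


def predict_user_based(ratings, user, item, k=2):
    """Similarity-weighted average of the k most similar users' ratings."""
    neighbours = []
    for other in range(ratings.shape[0]):
        if other == user or np.isnan(ratings[other, item]):
            continue
        sim = cosine_on_overlap(ratings[user], ratings[other])
        neighbours.append((sim, ratings[other, item]))
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = [(sim, rating) for sim, rating in neighbours[:k] if sim > 0]
    if not top:
        # No usable neighbours: fall back to the user's own mean rating
        return float(np.nanmean(ratings[user]))
    weights = np.array([sim for sim, _ in top])
    values = np.array([rating for _, rating in top])
    return float(weights @ values / weights.sum())


alice_movie3 = predict_user_based(ratings, user=0, item=2)
```

With these choices Alice's two nearest neighbours are Bob and Dave, who both rated Movie3 a 4, so the prediction comes out at 4. A different similarity measure or neighbourhood size would shift the number.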
Evaluation Metrics
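# Offline evaluation
# The metric helpers (precision_at_k, ndcg_at_k, average_pairwise_distance, ...)
# are assumed to be defined elsewhere.
def evaluate_recommendations(model, test_data, recommendations, all_items, all_users):
    """Compute offline metrics; `recommendations` maps user_id -> list of item ids."""
    recommended_items = {item for recs in recommendations.values() for item in recs}
    users_with_recs = {user for user, recs in recommendations.items() if recs}
    metrics = {
        # Ranking quality
        "precision@10": precision_at_k(model, test_data, k=10),
        "recall@10": recall_at_k(model, test_data, k=10),
        "ndcg@10": ndcg_at_k(model, test_data, k=10),
        "map@10": mean_average_precision(model, test_data, k=10),
        # Coverage
        "catalog_coverage": len(recommended_items) / len(all_items),
        "user_coverage": len(users_with_recs) / len(all_users),
        # Diversity
        "intra_list_diversity": average_pairwise_distance(recommendations),
        # Novelty
        "novelty": average_self_information(recommendations),
    }
    return metrics
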
# Online evaluation (A/B test)
online_metrics = {
    "ctr": "Click-through rate on recommendations",
    "conversion_rate": "Purchase rate from recommendations",
    "revenue_per_user": "Revenue attributed to recommendations",
    "engagement_time": "Time spent with recommended content",
    "diversity_of_consumption": "Breadth of categories consumed",
}
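
The ranking metrics named in the offline evaluation are left undefined above. As a rough sketch, per-user versions might look like the following. Note the assumptions: these take one ranked list and a set of held-out relevant items, which is a different signature from the `(model, test_data, k)` calls above, and relevance is treated as binary.

```python
import math


def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k slots filled with relevant items."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k=10):
    """Fraction of the relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k=10):
    """Binary-relevance NDCG: DCG of the list divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Made-up example: five ranked recommendations, three held-out relevant items
recommended = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "z"}
p = precision_at_k(recommended, relevant, k=5)
r = recall_at_k(recommended, relevant, k=5)
n = ndcg_at_k(recommended, relevant, k=5)
```

To get dataset-level numbers, average each metric over all test users. Precision here divides by `k` even when fewer than `k` items are returned, which is one common convention but not the only one.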
Cold Start Problem
New user cold start:
  Problem: No history, can't find similar users
  Solutions:
    1. Ask preferences during onboarding
    2. Recommend popular/trending items
    3. Use demographic-based recommendations
    4. Content-based using first few interactions

New item cold start:
  Problem: No user interactions with new item
  Solutions:
    1. Content-based features (genre, description)
    2. Explore/exploit: show to random users
    3. Editorial curation for launch period
    4. Similar item features from existing catalog
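
One way to wire up the "recommend popular items" fallback for new users is sketched below. The interaction log format (a list of `(user, item)` pairs), the `personalized_fn` callback, and the `min_interactions` threshold are all hypothetical names introduced for this example.

```python
from collections import Counter


def popular_items(interactions, n=10):
    """Most-interacted-with items across all users, most popular first."""
    counts = Counter(item for _, item in interactions)
    return [item for item, _ in counts.most_common(n)]


def recommend_with_fallback(user_id, interactions, personalized_fn,
                            n=10, min_interactions=5):
    """Use popularity until the user has enough history to personalize."""
    history = {item for user, item in interactions if user == user_id}
    if len(history) < min_interactions:
        # Cold start: over-fetch so that filtering seen items still leaves n
        candidates = popular_items(interactions, n=n + len(history))
    else:
        candidates = personalized_fn(user_id)
    return [item for item in candidates if item not in history][:n]


# Made-up interaction log
interactions = [
    ("u1", "a"), ("u2", "a"), ("u3", "a"),
    ("u1", "b"), ("u2", "b"),
    ("u3", "c"),
]
new_user_recs = recommend_with_fallback("new", interactions, lambda u: ["x"], n=2)
```

The threshold is a tuning knob, not a known constant. In practice you would pick it by checking at what history length the personalized model starts to beat the popularity baseline.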
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Recommending only popular items | No personalization; the long tail is never surfaced | Blend popularity with personalized and diverse candidates |
| No exploration | Users never see new/niche items | Explore/exploit (epsilon-greedy, Thompson sampling) |
| Offline metrics only | Model optimizes for proxy, not real engagement | A/B test recommendations |
| Training on implicit feedback only | Biased toward easy-to-measure actions | Combine implicit + explicit signals |
| No freshness | Stale recommendations | Time-decay, recency weighting |
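
As an illustration of the explore/exploit fix in the table, here is a minimal Beta-Bernoulli Thompson sampling sketch. It assumes a small fixed set of items and binary click feedback. The click-through rates in the simulation are invented, and the class name is a placeholder rather than any library's API.

```python
import numpy as np


class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over a fixed set of items."""

    def __init__(self, n_items, seed=0):
        self.successes = np.ones(n_items)  # Beta prior alpha = 1
        self.failures = np.ones(n_items)   # Beta prior beta = 1
        self.rng = np.random.default_rng(seed)

    def select(self):
        """Sample a plausible CTR per item and show the best-looking one."""
        samples = self.rng.beta(self.successes, self.failures)
        return int(np.argmax(samples))

    def update(self, item, clicked):
        """Record one impression and whether it was clicked."""
        if clicked:
            self.successes[item] += 1
        else:
            self.failures[item] += 1


# Simulation with made-up click-through rates
true_ctr = np.array([0.05, 0.10, 0.50])
sampler = ThompsonSampler(n_items=3, seed=42)
env = np.random.default_rng(7)
shown = np.zeros(3, dtype=int)
for _ in range(2000):
    item = sampler.select()
    clicked = env.random() < true_ctr[item]
    sampler.update(item, clicked)
    shown[item] += 1
```

Early on, every item gets impressions because the posteriors are wide. As evidence accumulates, traffic concentrates on the item with the highest observed click rate while uncertain items still get an occasional showing.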
Recommendations are not just about accuracy. Diversity, novelty, and serendipity are what keep users engaged long-term.
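
The diversity and novelty measures listed under the evaluation metrics can be made concrete in a few lines. This sketch assumes each item has a set of categorical features (for Jaccard distance) and a known popularity share (the fraction of users who interacted with it). The function names and example data are placeholders.

```python
import math
from itertools import combinations


def jaccard_distance(a, b):
    """1 minus the overlap ratio of two feature sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def intra_list_diversity(items, features):
    """Mean pairwise Jaccard distance between the recommended items."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    total = sum(jaccard_distance(features[a], features[b]) for a, b in pairs)
    return total / len(pairs)


def novelty(items, popularity):
    """Mean self-information, -log2 p(item); rarer items score higher."""
    return sum(-math.log2(popularity[item]) for item in items) / len(items)


# Made-up catalog data
features = {"a": {"action"}, "b": {"action", "drama"}, "c": {"comedy"}}
popularity = {"a": 0.5, "b": 0.25, "c": 0.125}

diversity_score = intra_list_diversity(["a", "b", "c"], features)
novelty_score = novelty(["a", "b", "c"], popularity)
```

Serendipity is harder to pin down because it depends on what the user would have found on their own, so it is usually assessed through online experiments rather than a single offline formula.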