
Time Series Forecasting in Production

End-to-end time series forecasting for production systems. Covers classical methods, modern ML approaches, forecast evaluation, and building automated forecasting pipelines.

Time series forecasting powers demand planning, capacity management, financial projections, and anomaly detection. Yet most forecasting projects fail in production — not because the model is wrong, but because the pipeline around it is fragile. A production forecasting system needs automated retraining, forecast evaluation, uncertainty quantification, and graceful handling of missing data and regime changes.


Method Selection Guide

| Data Characteristics | Recommended Method | When It Fails |
|---|---|---|
| Strong seasonality, stable trend | SARIMA / ETS | Regime changes, multiple seasonalities |
| Multiple seasonalities (daily + weekly) | Prophet / MSTL | Complex non-linear patterns |
| Many related series | LightGBM / XGBoost | Short series (< 100 points) |
| Long history, complex patterns | N-BEATS / TFT | Limited data, need for interpretability |
| Hierarchical data | Reconciliation + any base model | Incoherent base forecasts |

The practical rule: Start with a seasonal naive baseline (repeat last year’s pattern). If you can’t beat seasonal naive, your model isn’t learning anything useful. Then try ETS and Prophet. Only reach for deep learning if simpler methods plateau and you have > 1,000 data points per series.
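The seasonal naive baseline takes only a few lines. A minimal sketch, assuming daily data with weekly seasonality (the names `seasonal_naive` and `season_length` are illustrative):

```python
import numpy as np
import pandas as pd

def seasonal_naive(series: pd.Series, horizon: int, season_length: int = 7) -> np.ndarray:
    """Forecast by repeating the last full seasonal cycle."""
    last_cycle = series.to_numpy()[-season_length:]
    reps = int(np.ceil(horizon / season_length))
    return np.tile(last_cycle, reps)[:horizon]

# Daily data with a weekly pattern, repeated for 4 weeks
history = pd.Series(np.tile([10, 12, 15, 14, 13, 20, 22], 4))
forecast = seasonal_naive(history, horizon=10)
```

Any candidate model should beat this on your backtests before it earns a place in the pipeline.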


Production Pipeline Architecture

Data Ingestion → Preprocessing → Feature Engineering → 
Model Training → Forecast Generation → Evaluation → 
Reconciliation → Storage → API / Dashboard
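One way to wire these stages together is as a chain of functions over a pandas Series. A minimal sketch (`run_pipeline` and the lambda stages are illustrative, not part of any framework):

```python
from typing import Callable

import pandas as pd

Stage = Callable[[pd.Series], pd.Series]

def run_pipeline(raw: pd.Series, stages: list[Stage]) -> pd.Series:
    """Pass the series through each stage in order."""
    out = raw
    for stage in stages:
        out = stage(out)
    return out

# Illustrative stages: regularize frequency, then fill short gaps
idx = pd.date_range("2024-01-01", periods=5, freq="D").delete(2)  # drop one day
raw = pd.Series([1.0, 2.0, 4.0, 5.0], index=idx)
clean = run_pipeline(raw, [lambda s: s.asfreq("D"), lambda s: s.interpolate(limit=2)])
```

Keeping each stage a pure Series-to-Series function makes individual steps easy to unit test and swap out.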

Preprocessing for Real-World Data

Real time series data is messy: missing values, outliers, calendar effects, and structural breaks.

import pandas as pd

class TimeSeriesPreprocessor:
    def __init__(self, freq='D'):
        self.freq = freq

    def process(self, series: pd.Series) -> pd.Series:
        # 1. Ensure regular frequency (inserts NaN for missing periods)
        series = series.asfreq(self.freq)

        # 2. Interpolate missing values (max 3 consecutive)
        series = series.interpolate(method='time', limit=3)

        # 3. Detect and clip outliers (IQR method, 3x fence)
        q1, q3 = series.quantile([0.25, 0.75])
        iqr = q3 - q1
        lower, upper = q1 - 3 * iqr, q3 + 3 * iqr
        series = series.clip(lower, upper)

        # 4. Fill remaining gaps with the day-of-week average
        if series.isna().any():
            seasonal_avg = series.groupby(series.index.dayofweek).transform('mean')
            series = series.fillna(seasonal_avg)

        return series

Forecast Evaluation

Metrics That Matter

| Metric | Definition | Use When |
|---|---|---|
| MAPE | Mean Absolute Percentage Error | Comparing across different scales |
| RMSSE | Root Mean Squared Scaled Error | M5/M6 competition standard |
| WAPE | Weighted Absolute Percentage Error | Aggregated business reporting |
| Coverage | % of actuals within prediction interval | Evaluating uncertainty |

Never use MAPE alone. It’s undefined when actuals are zero and asymmetric: over-forecasts are penalized more heavily than under-forecasts of the same size, so optimizing MAPE biases models toward under-forecasting. Use WAPE for business reporting and RMSSE for model comparison.
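Both recommended metrics are a few lines of NumPy. A sketch, where `rmsse` scales by the in-sample one-step naive error following the M5 convention:

```python
import numpy as np

def wape(actual: np.ndarray, forecast: np.ndarray) -> float:
    """Total absolute error over total actuals (tolerates zeros in individual periods)."""
    return float(np.sum(np.abs(actual - forecast)) / np.sum(np.abs(actual)))

def rmsse(actual: np.ndarray, forecast: np.ndarray, train: np.ndarray) -> float:
    """Test RMSE scaled by the in-sample one-step naive error."""
    naive_mse = np.mean(np.diff(train) ** 2)
    return float(np.sqrt(np.mean((actual - forecast) ** 2) / naive_mse))

train = np.array([10.0, 12.0, 11.0, 13.0])
actual = np.array([14.0, 12.0])
forecast = np.array([13.0, 13.0])
```

An RMSSE below 1 means the model beats the naive forecast on the training scale; above 1 means it doesn’t.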

Backtesting Protocol

import pandas as pd
from sklearn.metrics import mean_absolute_percentage_error

def time_series_cv(series, model_fn, n_splits=5, horizon=30, gap=7):
    """Walk-forward cross-validation with a gap between train and test."""
    results = []

    for i in range(n_splits):
        # Training: all data up to the cutoff
        cutoff = len(series) - (n_splits - i) * horizon - gap
        train = series.iloc[:cutoff]

        # Gap: skip `gap` periods to simulate production data delay
        test_start = cutoff + gap
        test = series.iloc[test_start:test_start + horizon]

        # Fit on the training window, forecast the test horizon
        model = model_fn(train)
        forecast = model.predict(horizon)

        results.append({
            'fold': i,
            'mape': mean_absolute_percentage_error(test, forecast),
            # prediction_interval_coverage: assumed defined elsewhere
            'coverage': prediction_interval_coverage(test, forecast)
        })

    return pd.DataFrame(results)

The gap parameter simulates the real-world delay between data availability and forecast generation. If your data pipeline has a 2-day lag, set gap=2. Omitting the gap overestimates accuracy.


Uncertainty Quantification

Point forecasts are almost always wrong. Prediction intervals communicate the range of likely outcomes, which is far more useful for decision-making.

Methods:

  1. Parametric: Assume residuals follow a distribution (normal, Student-t)
  2. Bootstrap: Resample residuals and regenerate forecasts
  3. Quantile regression: Directly model quantiles (10th, 50th, 90th)
  4. Conformal prediction: Distribution-free, guaranteed coverage
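Split conformal prediction, the least familiar of the four, is short to sketch. This assumes a held-out calibration set with point predictions; `conformal_interval` is an illustrative name:

```python
import numpy as np

def conformal_interval(cal_actual, cal_pred, new_pred, alpha=0.1):
    """Split conformal: widen new predictions by the (1 - alpha) quantile
    of absolute calibration residuals (~90% coverage for alpha=0.1)."""
    scores = np.abs(np.asarray(cal_actual) - np.asarray(cal_pred))
    n = len(scores)
    # Finite-sample corrected quantile level
    q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    new_pred = np.asarray(new_pred)
    return new_pred - q, new_pred + q

# Example: constant calibration residuals of 0.5
cal_actual = np.arange(20.0)
cal_pred = cal_actual + 0.5
lower, upper = conformal_interval(cal_actual, cal_pred, np.array([10.0]))
```

The coverage guarantee assumes exchangeable residuals; for strongly autocorrelated series, compute calibration residuals from a walk-forward backtest rather than a random split.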

For business use: Report the 10th, 50th, and 90th percentile forecasts. The 10th percentile is your conservative plan, the 50th is the expected outcome, and the 90th is the optimistic scenario. This maps cleanly to “worst case / base case / best case” planning.


Monitoring and Retraining

Forecast Monitoring

Track these metrics in production:

  1. Forecast accuracy decay: Does accuracy degrade over the forecast horizon?
  2. Bias detection: Are forecasts systematically high or low?
  3. Coverage calibration: Do 90% prediction intervals actually contain 90% of actuals?
  4. Concept drift: Has the underlying data distribution changed?
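The bias and coverage checks reduce to a few lines each. A sketch (function names are illustrative):

```python
import numpy as np

def forecast_bias(actual, forecast):
    """Mean error: positive means systematic over-forecasting."""
    return float(np.mean(np.asarray(forecast) - np.asarray(actual)))

def interval_coverage(actual, lower, upper):
    """Fraction of actuals falling inside the prediction interval."""
    a = np.asarray(actual)
    return float(np.mean((a >= np.asarray(lower)) & (a <= np.asarray(upper))))

bias = forecast_bias([10, 12, 14], [11, 13, 15])  # consistently 1 unit high
cov = interval_coverage([10, 12, 14], [9, 13, 13], [11, 14, 15])
```

Track both as rolling windows: a bias that drifts away from zero, or a 90% interval whose empirical coverage falls well below 0.9, is an early warning before aggregate accuracy visibly degrades.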

Retraining Triggers

  • Scheduled: Weekly or monthly, depending on data velocity
  • Performance-based: Retrain when rolling MAPE exceeds threshold
  • Event-driven: After known structural changes (new product launch, policy change)
  • Drift-detected: When statistical drift tests fail (PSI > 0.2)
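The PSI drift test above can be sketched as follows (an illustrative implementation, binning by reference-sample deciles):

```python
import numpy as np

def psi(expected: np.ndarray, observed: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between a reference and a recent sample.
    Rule of thumb: < 0.1 stable, 0.1-0.2 moderate shift, > 0.2 significant drift."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf          # catch out-of-range values
    e = np.histogram(expected, edges)[0] / len(expected)
    o = np.histogram(observed, edges)[0] / len(observed)
    e, o = np.clip(e, 1e-6, None), np.clip(o, 1e-6, None)  # avoid log(0)
    return float(np.sum((o - e) * np.log(o / e)))
```

Quantile-based bins keep the reference distribution evenly weighted, so the index responds to shifts anywhere in the distribution rather than only at the mode.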

The best forecasting systems aren’t the ones with the most sophisticated models — they’re the ones that detect when their models are wrong and adapt automatically.

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
