Explainable AI (XAI) Methods
Make machine learning model decisions interpretable and transparent. Covers SHAP values, LIME explanations, feature importance, model-agnostic methods, and the patterns that bridge the gap between model accuracy and human understanding.
A model that predicts a loan should be denied but cannot explain why is useless in regulated industries — and dangerous in unregulated ones. Explainable AI (XAI) provides the tools to understand WHY a model made a specific decision. This is not just about compliance — it is about building trust, debugging models, and catching bias before it causes harm.
The Explainability Spectrum
Black Box ◄──────────────────────────► Glass Box
Deep Neural Networks → Random Forest → Decision Tree → Linear Regression
(least interpretable)                                (most interpretable)
The tradeoff:
More complex model → typically higher accuracy → harder to explain
Simpler model → typically lower accuracy → easier to explain
XAI closes this gap:
Complex model + explanation method = accuracy + interpretability
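For contrast with the explainer-based methods below, here is what the glass-box end of the spectrum looks like: the model is its own explanation. A minimal sketch, assuming the same X_train, y_train, and feature_names used in the later examples:
from sklearn.linear_model import LogisticRegression

# Glass-box end of the spectrum: the fitted weights ARE the explanation.
# Assumes X_train, y_train, and feature_names exist, as in the examples below.
linear_model = LogisticRegression(max_iter=1000)
linear_model.fit(X_train, y_train)

# One weight per feature, readable directly, no explainer needed
for name, coef in zip(feature_names, linear_model.coef_[0]):
    print(f"{name}: {coef:+.3f}")
The methods below exist precisely because you often cannot afford this simplicity and still want the same kind of per-feature readout.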
SHAP (SHapley Additive exPlanations)
import shap
import xgboost as xgb
# Train a model (X_train, y_train: tabular features and labels)
model = xgb.XGBClassifier()
model.fit(X_train, y_train)
# Create SHAP explainer
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)
# Global explanation: Which features matter most overall?
# shap.summary_plot(shap_values, X_test)
# Output: Ranked list of features by importance across all predictions
# Local explanation: Why THIS specific prediction?
# For a single loan application:
single_prediction = shap_values[0]
# Feature contributions:
# income: +0.35 (high income → approved)
# credit_score: +0.28 (good credit → approved)
# debt_ratio: -0.45 (high debt → denied)
# employment: +0.12 (stable employment → approved)
# ─────────────────────
# Base value: 0.50 (average prediction)
# Final: 0.80 (likely approved)
# SHAP tells you:
# "This loan was approved primarily because of high income (+0.35)
# and good credit score (+0.28), despite a high debt ratio (-0.45)"
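A useful property: SHAP values are additive by construction, so you can sanity-check an explanation by reconstructing the prediction from its parts. A quick check, assuming the default tree explainer for XGBoost, where the decomposition lives in log-odds (margin) space rather than probability space:
import numpy as np

# Additivity check: base value + sum of contributions = model's raw output.
# For XGBoost, SHAP explains the margin (log-odds), so compare against
# predict(..., output_margin=True) rather than predict_proba.
row = shap_values[0]
reconstructed = row.base_values + row.values.sum()
margin = model.predict(X_test.iloc[[0]], output_margin=True)[0]
assert np.isclose(reconstructed, margin, atol=1e-3)
If this check fails, you are comparing against the wrong output space, which is a common source of confusion when reading SHAP plots.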
LIME (Local Interpretable Model-Agnostic Explanations)
from lime.lime_tabular import LimeTabularExplainer
# LIME works differently from SHAP:
# 1. Take a single prediction
# 2. Generate similar samples (perturb input)
# 3. Fit a simple model (linear) on the neighborhood
# 4. The simple model's weights ARE the explanation
explainer = LimeTabularExplainer(
training_data=X_train.values,
feature_names=feature_names,
class_names=["denied", "approved"],
mode="classification",
)
# Explain a single prediction
explanation = explainer.explain_instance(
data_row=X_test.iloc[42].values,
predict_fn=model.predict_proba,
num_features=10,
)
# Output:
# credit_score > 720: +0.31 (toward approved)
# debt_to_income < 0.3: +0.25 (toward approved)
# years_employed > 5: +0.18 (toward approved)
# previous_defaults = 0: +0.15 (toward approved)
# loan_amount > 50000: -0.08 (toward denied)
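The four numbered steps above are simple enough to sketch by hand. The following is a didactic approximation of the idea, not LIME's actual implementation (the real library also discretizes features and runs feature selection); lime_sketch is a made-up helper name:
import numpy as np
from sklearn.linear_model import Ridge

def lime_sketch(x, predict_fn, X_train, n_samples=5000):
    # 1. Perturb: sample a neighborhood around the row x
    scale = X_train.values.std(axis=0)
    scale = np.where(scale == 0, 1.0, scale)  # guard constant features
    neighborhood = x + np.random.randn(n_samples, len(x)) * scale
    # 2. Query the black-box model on the perturbed samples
    labels = predict_fn(neighborhood)[:, 1]  # P(approved)
    # 3. Weight samples by proximity to x (RBF kernel)
    distances = np.linalg.norm((neighborhood - x) / scale, axis=1)
    weights = np.exp(-(distances ** 2) / 2)
    # 4. Fit a weighted linear surrogate; its coefficients are the explanation
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(neighborhood, labels, sample_weight=weights)
    return surrogate.coef_

local_weights = lime_sketch(X_test.iloc[42].values, model.predict_proba, X_train)
With the real library, you read the fitted explanation via explanation.as_list(), which returns the (rule, weight) pairs shown in the output above.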
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Skip explanations for “accuracy” | Biased models deployed without detection | Mandatory explanations for high-stakes decisions |
| Explain only in development | Users and regulators need explanations too | Real-time explanations in production |
| Trust explanation blindly | Explanations can be misleading | Validate explanations against domain knowledge |
| Same explanation for all audiences | Data scientists ≠ business users | Tailor explanations: technical vs. natural language |
| Post-hoc explanation only | Model may be fundamentally uninterpretable | Consider inherently interpretable models first |
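As a concrete version of the "real-time explanations in production" fix: return the top contributions with every score. A hypothetical sketch reusing the model and SHAP explainer from above (predict_with_explanation and the response shape are invented for illustration):
def predict_with_explanation(applicant_row, top_k=3):
    # Score and explain in the same call so every decision ships with its
    # reasons (hypothetical helper, not a library API)
    proba = model.predict_proba(applicant_row)[0, 1]
    exp = explainer(applicant_row)[0]
    ranked = sorted(
        zip(feature_names, exp.values),
        key=lambda pair: abs(pair[1]),
        reverse=True,
    )
    return {
        "approval_probability": float(proba),
        "top_factors": [
            {"feature": name, "contribution": float(value)}
            for name, value in ranked[:top_k]
        ],
    }

# e.g. predict_with_explanation(X_test.iloc[[42]])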
Explainability is not a feature — it is a responsibility. Every model that affects human outcomes should be explainable to the degree required by its impact. Medical diagnoses need more explanation than movie recommendations.