Explainable AI (XAI) Methods
Make machine learning model decisions interpretable and transparent. Covers SHAP values, LIME explanations, feature importance, model-agnostic methods, and the patterns that bridge the gap between model accuracy and human understanding.
A model that predicts a loan should be denied but cannot explain why is useless in regulated industries — and dangerous in unregulated ones. Explainable AI (XAI) provides the tools to understand WHY a model made a specific decision. This is not just about compliance — it is about building trust, debugging models, and catching bias before it causes harm.
The Explainability Spectrum
Black Box ◄──────────────────────────► Glass Box
Deep Neural Networks → Random Forest → Decision Tree → Linear Regression
(least interpretable)                                (most interpretable)
The tradeoff:
More complex model → typically higher accuracy → harder to explain
Simpler model → typically lower accuracy → easier to explain
XAI closes this gap:
Complex model + explanation method = accuracy + interpretability
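For contrast with the explainer-based methods below, here is what the glass-box end of the spectrum looks like: the model is its own explanation. A minimal sketch, assuming the same X_train, y_train, and feature_names used in the later examples:
from sklearn.linear_model import LogisticRegression

# Glass-box end of the spectrum: the fitted weights ARE the explanation.
# Assumes X_train, y_train, and feature_names exist, as in the examples below.
linear_model = LogisticRegression(max_iter=1000)
linear_model.fit(X_train, y_train)

# One weight per feature, readable directly, no explainer needed
for name, coef in zip(feature_names, linear_model.coef_[0]):
    print(f"{name}: {coef:+.3f}")
The methods below exist precisely because you often cannot afford this simplicity and still want the same kind of per-feature readout.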
SHAP (SHapley Additive exPlanations)
import shap
import xgboost as xgb
# Train a model (X_train, y_train: tabular features and labels)
model = xgb.XGBClassifier()
model.fit(X_train, y_train)
# Create SHAP explainer
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)
# Global explanation: Which features matter most overall?
# shap.summary_plot(shap_values, X_test)
# Output: Ranked list of features by importance across all predictions
# Local explanation: Why THIS specific prediction?
# For a single loan application:
single_prediction = shap_values[0]
# Feature contributions:
# income: +0.35 (high income → approved)
# credit_score: +0.28 (good credit → approved)
# debt_ratio: -0.45 (high debt → denied)
# employment: +0.12 (stable employment → approved)
# ─────────────────────
# Base value: 0.50 (average prediction)
# Final: 0.80 (likely approved)
# SHAP tells you:
# "This loan was approved primarily because of high income (+0.35)
# and good credit score (+0.28), despite a high debt ratio (-0.45)"
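A useful property: SHAP values are additive by construction, so you can sanity-check an explanation by reconstructing the prediction from its parts. A quick check, assuming the default tree explainer for XGBoost, where the decomposition lives in log-odds (margin) space rather than probability space:
import numpy as np

# Additivity check: base value + sum of contributions = model's raw output.
# For XGBoost, SHAP explains the margin (log-odds), so compare against
# predict(..., output_margin=True) rather than predict_proba.
row = shap_values[0]
reconstructed = row.base_values + row.values.sum()
margin = model.predict(X_test.iloc[[0]], output_margin=True)[0]
assert np.isclose(reconstructed, margin, atol=1e-3)
If this check fails, you are comparing against the wrong output space, which is a common source of confusion when reading SHAP plots.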
LIME (Local Interpretable Model-Agnostic Explanations)
from lime.lime_tabular import LimeTabularExplainer
# LIME works differently from SHAP:
# 1. Take a single prediction
# 2. Generate similar samples (perturb input)
# 3. Fit a simple model (linear) on the neighborhood
# 4. The simple model's weights ARE the explanation
explainer = LimeTabularExplainer(
training_data=X_train.values,
feature_names=feature_names,
class_names=["denied", "approved"],
mode="classification",
)
# Explain a single prediction
explanation = explainer.explain_instance(
data_row=X_test.iloc[42].values,
predict_fn=model.predict_proba,
num_features=10,
)
# Output:
# credit_score > 720: +0.31 (toward approved)
# debt_to_income < 0.3: +0.25 (toward approved)
# years_employed > 5: +0.18 (toward approved)
# previous_defaults = 0: +0.15 (toward approved)
# loan_amount > 50000: -0.08 (toward denied)
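The four numbered steps above are simple enough to sketch by hand. The following is a didactic approximation of the idea, not LIME's actual implementation (the real library also discretizes features and runs feature selection); lime_sketch is a made-up helper name:
import numpy as np
from sklearn.linear_model import Ridge

def lime_sketch(x, predict_fn, X_train, n_samples=5000):
    # 1. Perturb: sample a neighborhood around the row x
    scale = X_train.values.std(axis=0)
    scale = np.where(scale == 0, 1.0, scale)  # guard constant features
    neighborhood = x + np.random.randn(n_samples, len(x)) * scale
    # 2. Query the black-box model on the perturbed samples
    labels = predict_fn(neighborhood)[:, 1]  # P(approved)
    # 3. Weight samples by proximity to x (RBF kernel)
    distances = np.linalg.norm((neighborhood - x) / scale, axis=1)
    weights = np.exp(-(distances ** 2) / 2)
    # 4. Fit a weighted linear surrogate; its coefficients are the explanation
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(neighborhood, labels, sample_weight=weights)
    return surrogate.coef_

local_weights = lime_sketch(X_test.iloc[42].values, model.predict_proba, X_train)
With the real library, you read the fitted explanation via explanation.as_list(), which returns the (rule, weight) pairs shown in the output above.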
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Skip explanations for “accuracy” | Biased models deployed without detection | Mandatory explanations for high-stakes decisions |
| Explain only in development | Users and regulators need explanations too | Real-time explanations in production |
| Trust explanation blindly | Explanations can be misleading | Validate explanations against domain knowledge |
| Same explanation for all audiences | Data scientists ≠ business users | Tailor explanations: technical vs. natural language |
| Post-hoc explanation only | Model may be fundamentally uninterpretable | Consider inherently interpretable models first |
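As a concrete version of the "real-time explanations in production" fix: return the top contributions with every score. A hypothetical sketch reusing the model and SHAP explainer from above (predict_with_explanation and the response shape are invented for illustration):
def predict_with_explanation(applicant_row, top_k=3):
    # Score and explain in the same call so every decision ships with its
    # reasons (hypothetical helper, not a library API)
    proba = model.predict_proba(applicant_row)[0, 1]
    exp = explainer(applicant_row)[0]
    ranked = sorted(
        zip(feature_names, exp.values),
        key=lambda pair: abs(pair[1]),
        reverse=True,
    )
    return {
        "approval_probability": float(proba),
        "top_factors": [
            {"feature": name, "contribution": float(value)}
            for name, value in ranked[:top_k]
        ],
    }

# e.g. predict_with_explanation(X_test.iloc[[42]])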
Explainability is not a feature — it is a responsibility. Every model that affects human outcomes should be explainable to the degree required by its impact. Medical diagnoses need more explanation than movie recommendations.