Explaining RandomForestRegressor Predictions with SHAP Values

Code introduction


This function generates a random three-feature dataset, fits a RandomForestRegressor to it, and then uses the SHAP library's TreeExplainer to compute SHAP values for the model's predictions and draw a summary plot of them.


Technology Stack : numpy, pandas, sklearn.ensemble.RandomForestRegressor, shap

Code Type : Function

Code Difficulty : Intermediate


                
                    
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
import shap

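def predict_and_explain(input_features=None):
    """Fit a random forest on synthetic data and return SHAP values.

    If input_features (a DataFrame with columns feature1..feature3) is given,
    those rows are explained; otherwise the training data is explained.
    """
    feature_names = ['feature1', 'feature2', 'feature3']

    # Generate a random dataset (nothing is loaded from disk)
    data = pd.DataFrame({name: np.random.rand(100) for name in feature_names})

    # Define the target variable: an interaction term plus a linear term
    target = data['feature1'] * data['feature2'] + data['feature3']

    # Fit a RandomForestRegressor
    model = RandomForestRegressor()
    model.fit(data, target)

    # Choose the rows to explain
    X = data if input_features is None else input_features[feature_names]

    # Generate SHAP values, one row per sample and one column per feature
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)

    # Summarize the SHAP values in a plot
    shap.summary_plot(shap_values, X)

    return shap_values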