Random Forest SHAP Explainer and Summary Plot Generation

Code introduction


This function generates a synthetic classification dataset with scikit-learn's make_classification and trains a random forest classifier on it. It then builds a shap.TreeExplainer for the trained model, computes SHAP values, and draws a SHAP summary plot to visualize them. The function returns the explainer.


Technology Stack : SHAP, NumPy, Pandas, scikit-learn

Code Type : SHAP Explainer and Summary Plot

Code Difficulty : Intermediate


                
                    
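def random_shap_function():
    import shap
    import numpy as np
    import pandas as pd
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # Generate a synthetic dataset and name the columns so the plot is readable
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])

    # Train a random forest classifier
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X, y)

    # Create a SHAP explainer for the tree ensemble
    explainer = shap.TreeExplainer(clf)

    # Compute SHAP values for a subset of rows. summary_plot needs a 2-D
    # (rows x features) matrix that matches the feature matrix passed to it,
    # so a single 1-D instance such as X[0] is not enough.
    X_sample = X.iloc[:200]
    shap_values = explainer.shap_values(X_sample)

    # For classifiers, the return type depends on the SHAP version: either a
    # list with one array per class, or one (rows, features, classes) array.
    # Keep the values for the positive class (class 1) in both cases.
    if isinstance(shap_values, list):
        shap_values = shap_values[1]
    elif np.ndim(shap_values) == 3:
        shap_values = shap_values[:, :, 1]

    # Create a SHAP summary plot
    shap.summary_plot(shap_values, X_sample)

    return explainer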
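
To make concrete what the values in a SHAP summary plot represent, the following minimal sketch computes exact Shapley values by brute force for a tiny toy model, using only the Python standard library. It does not use the SHAP library or its API: toy_model, shapley_values, and the all-zero baseline are hypothetical names and choices made for illustration. TreeExplainer relies on a much more efficient tree-specific algorithm, so this only illustrates the underlying idea that each feature gets a share of the gap between the prediction and a baseline output.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    # Hypothetical model: one additive term plus an interaction
    # between features 1 and 2.
    return 2.0 * x[0] + x[1] * x[2] + 1.0


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every coalition of features.

    Features outside a coalition are replaced by their baseline value
    (an assumption of this sketch, one of several possible conventions).
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)

base_value = toy_model(baseline)  # model output at the baseline
prediction = toy_model(x)         # model output for the explained instance

print("Shapley values:", phi)
print("base value + sum of contributions:", base_value + sum(phi))
print("model prediction:", prediction)
```

In this toy case the contributions come out to about 2.0, 3.0, and 3.0: feature 0 gets its full additive effect, and the interaction term is split evenly between features 1 and 2. The contributions sum to the difference between the prediction and the baseline output, which is the same additivity property that SHAP values for the random forest satisfy relative to the explainer's expected value.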