This code defines a function that trains a tree-based model, computes SHAP values for the training and test sets with shap.TreeExplainer, and returns them as DataFrames together with the model's score on the test set (accuracy for scikit-learn classifiers).
Technology Stack : SHAP, NumPy, Pandas, scikit-learn
Code Type : Function
Code Difficulty : Intermediate
import numpy as np
import pandas as pd
import shap
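
def analyze_model_performance(model, X_train, y_train, X_test, y_test):
    # Train the model (TreeExplainer requires a tree-based estimator)
    model.fit(X_train, y_train)
    explainer = shap.TreeExplainer(model)

    def mean_abs_shap(X):
        # X must be a DataFrame so that feature names are available.
        # Depending on the model and SHAP version, shap_values returns a 2-D
        # array, a list of per-class arrays, or a 3-D array
        # (samples, features, classes).
        values = explainer.shap_values(X)
        if isinstance(values, list):
            values = np.stack(values, axis=-1)
        # Mean absolute SHAP value per feature: average over samples,
        # then over classes if a class axis is present
        importance = np.abs(np.asarray(values)).mean(axis=0)
        if importance.ndim > 1:
            importance = importance.mean(axis=1)
        return pd.DataFrame(importance, index=X.columns, columns=['SHAP Values'])

    # Summarize SHAP values for the training set and the test set
    train_shap_df = mean_abs_shap(X_train)
    test_shap_df = mean_abs_shap(X_test)
    # model.score returns accuracy for classifiers and R^2 for regressors
    test_accuracy = model.score(X_test, y_test)
    return train_shap_df, test_shap_df, test_accuracy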
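
The return value of explainer.shap_values does not always have the same shape: depending on the model and the SHAP version it can be a 2-D array (samples, features), a list of such arrays (one per class), or a 3-D array (samples, features, classes). The sketch below uses made-up arrays rather than a real explainer to show one way of reducing any of these to a single mean absolute SHAP value per feature. The helper name summarize_shap and the feature names are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd


def summarize_shap(values, feature_names):
    """Hypothetical helper: reduce SHAP values to one mean |SHAP| per feature.

    Accepts a 2-D array (samples, features), a list of such arrays (one per
    class), or a 3-D array (samples, features, classes).
    """
    if isinstance(values, list):
        # Stack per-class arrays into shape (samples, features, classes)
        values = np.stack(values, axis=-1)
    magnitudes = np.abs(np.asarray(values, dtype=float))
    importance = magnitudes.mean(axis=0)      # average over samples
    if importance.ndim == 2:
        importance = importance.mean(axis=1)  # average over classes
    return pd.DataFrame(importance, index=feature_names, columns=["SHAP Values"])


# Synthetic SHAP values for 2 samples and 3 features (made-up numbers)
features = ["f0", "f1", "f2"]
single_output = np.array([[1.0, -2.0, 0.0],
                          [3.0, 2.0, 0.0]])
per_class_list = [single_output, -single_output]  # list of per-class arrays
stacked = np.stack(per_class_list, axis=-1)        # shape (2, 3, 2)

summary_2d = summarize_shap(single_output, features)
summary_list = summarize_shap(per_class_list, features)
summary_3d = summarize_shap(stacked, features)
print(summary_2d)
```

Taking the absolute value before averaging keeps positive and negative contributions from cancelling, so the result reads as a feature-importance ranking rather than a signed effect.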