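This function trains an XGBoost classifier on the given dataset, ranks the features by importance, and returns the names and importance scores of the top num_features features. Despite the function name, the selection is deterministic, not random.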
Technology Stack : XGBoost, NumPy
Code Type : Function
Code Difficulty : Intermediate
import xgboost as xgb
import numpy as np


def random_xgb_feature_importance(data, label, num_features=5):
    """
    Train an XGBoost classifier and return its most important features.

    Despite the function name, nothing here is random: features are ranked
    by importance and the num_features highest-scoring ones are returned.
    data must be a pandas DataFrame, because names are read from data.columns.
    """
    # Initialize the XGBoost classifier
    xgb_clf = xgb.XGBClassifier(use_label_encoder=False, eval_metric='mlogloss')
    # Fit the classifier to the data
    xgb_clf.fit(data, label)
    # Get the feature importances (one score per column)
    importances = xgb_clf.feature_importances_
    # Sort indices by importance, highest first, and keep the top num_features
    indices = np.argsort(importances)[::-1][:num_features]
    selected_features = [data.columns[i] for i in indices]
    # Return the selected feature names and their importance scores
    return selected_features, importances[indices]
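
The ranking step in the function above can be tried on its own, without training a model. The following is a minimal sketch using only NumPy; the helper name top_k_indices, the feature names, and the importance values are all hypothetical and chosen for illustration, not taken from a fitted classifier.

```python
import numpy as np


def top_k_indices(importances, k):
    """Return the indices of the k largest scores, highest first."""
    # np.argsort sorts ascending, so reverse it to get descending order.
    order = np.argsort(importances)[::-1]
    return order[:k]


# Hypothetical feature names and importance scores, for illustration only.
feature_names = ["age", "income", "height", "weight"]
importances = np.array([0.10, 0.55, 0.05, 0.30])

idx = top_k_indices(importances, k=2)
top_features = [feature_names[i] for i in idx]
top_scores = importances[idx]
print(top_features, top_scores)
```

Running the sketch twice gives the same result, which shows that the selection depends only on the scores and involves no random sampling. If k is larger than the number of features, the slice simply returns all of them.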