Hi, how do I interpret the Feature Explainability Graph after running feature importance?
To calculate feature importance, PyRasgo first builds a gradient boosted tree using catboost on 80% of the data, then calculates SHAP values on that same 80%. A SHAP value is calculated for each feature of each observation and represents the amount that feature pushes the prediction for that observation away from the expected prediction over the entire dataset.
The feature importance value is the mean absolute SHAP value for that feature across all observations. While this shows overall importance, it loses some of the observation-level information that SHAP contains.
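The aggregation step is just a column-wise mean of absolute values. A minimal sketch with a hypothetical SHAP matrix (rows are observations, columns are features):

```python
import numpy as np

# Hypothetical SHAP values for 3 observations and 2 features.
shap_values = np.array([
    [ 0.5, -0.1],
    [-0.7,  0.2],
    [ 0.6, -0.3],
])

# Feature importance = mean |SHAP| per feature (per column).
importance = np.abs(shap_values).mean(axis=0)
print(importance)  # [0.6 0.2]
```

Note how feature 0's positive and negative contributions would partly cancel under a plain mean; the absolute value prevents that, but the sign information is lost, which is the per-observation detail the graph restores.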
The Feature Explainability Graph shows, for each observation, the feature value on the horizontal axis and the corresponding SHAP value on the vertical axis. Instead of showing each observation as a single point, PyRasgo bins both the feature values and the SHAP values and shows you which bins contain the most observations. Red bins have negative SHAP values and green bins have positive SHAP values. The intensity of a bin shows the frequency of occurrences: solid bins have more observations falling within their feature and SHAP value ranges, while more transparent bins have fewer.
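The binning itself can be sketched with a 2D histogram (the bin count and toy data here are assumptions, not PyRasgo's actual settings): feature values along one axis, SHAP values along the other, with each cell's count playing the role of the bin's intensity and the sign of its SHAP range determining red versus green.

```python
import numpy as np

# Toy data: SHAP values roughly track the feature value.
rng = np.random.default_rng(1)
feature = rng.normal(size=1000)
shap = feature + rng.normal(scale=0.3, size=1000)

# Bin feature values (x) and SHAP values (y) into a 10x10 grid.
counts, x_edges, y_edges = np.histogram2d(feature, shap, bins=10)

# Intensity: normalize counts so the fullest bin is fully opaque.
alpha = counts / counts.max()

# Color: sign of each bin's SHAP midpoint (positive -> green, negative -> red).
y_centers = (y_edges[:-1] + y_edges[1:]) / 2
is_positive = y_centers > 0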
Can I share the feature importance plots from PyRasgo?