Feature Ranking

Feature ranking evaluates single features or pairs of features with metrics that score them on a [-1, 1] or [0, 1] scale, so that the features can be ranked. Two types of ranking are currently supported:

  • 1-D Rank: Considers one feature at a time and plots the relative rank of each feature on a bar chart. The default algorithm is Shapiro-Wilk.

  • 2-D Rank: Considers pairs of features and visualizes the ranks in the lower-left triangle of a feature co-occurrence matrix.

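import matplotlib
from sklearn.datasets import load_iris as load_data
from sklearn_evaluation.plot import Rank1D, Rank2D

# Use a larger figure and font so the feature labels stay readable
matplotlib.rcParams["figure.figsize"] = (7, 7)
matplotlib.rcParams["font.size"] = 18
# Load the iris data as a feature matrix X and a target vector y
X, y = load_data(return_X_y=True)
# Feature names, in the same order as the columns of X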
features = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]

Rank 1D

rank1d = Rank1D(features=features)
rank1d.feature_ranks(X)
<Axes: title={'center': 'Shapiro Ranking of 4 Features'}>
[Figure: bar chart, "Shapiro Ranking of 4 Features"]
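
To make the 1-D scores more concrete, the sketch below computes a Shapiro-Wilk statistic for each iris feature directly with SciPy. It is an illustration of the metric only, under the assumption that the default 1-D ranking scores each column independently; it is not taken from the Rank1D source, and the `scores` dictionary is a name introduced here for the example.

```python
from scipy.stats import shapiro
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)
features = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]

# Assumption: one Shapiro-Wilk W statistic per column.
# W lies in [0, 1]; values closer to 1 mean the column looks more normal.
scores = {}
for i, name in enumerate(features):
    w_statistic, _p_value = shapiro(X[:, i])
    scores[name] = float(w_statistic)

# Print the features from highest to lowest W
for name, w in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {w:.3f}")
```

Because W stays within [0, 1], these scores fit the [0, 1] scale mentioned at the top of this page and can be compared directly across features.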

Rank 2D

rank2d = Rank2D(features=features)
rank2d.feature_ranks(X)
<Axes: title={'center': 'Pearson Ranking of 4 Features'}>
[Figure: lower-left triangle matrix, "Pearson Ranking of 4 Features"]
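
The pairwise scores can be approximated by hand with NumPy. The sketch below computes the Pearson correlation for every pair of iris features and keeps only the lower-left triangle, which is the part the description above says is visualized. It is an assumption-laden illustration rather than the Rank2D implementation; in particular, masking the diagonal as well as the upper triangle is a choice made here, and `lower` is a name introduced for the example.

```python
import numpy as np
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)

# Pearson correlation between every pair of columns (features).
# rowvar=False tells NumPy that each column is a variable.
corr = np.corrcoef(X, rowvar=False)

# Assumption: hide the diagonal and the upper triangle, since the matrix
# is symmetric and a feature always correlates perfectly with itself.
hidden = np.triu(np.ones_like(corr, dtype=bool), k=0)
lower = np.where(hidden, np.nan, corr)

# Show the remaining pairwise scores, each in [-1, 1]
print(np.round(lower, 2))
```

Pearson coefficients range over [-1, 1], which matches the other scale mentioned at the top of this page: values near 1 or -1 indicate a strong linear relationship between two features, and values near 0 indicate a weak one.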