Report: Evaluation
New in version 0.11.4.
We use different metrics to estimate a machine learning model's performance and to understand its strengths and weaknesses. In this guide, we'll show you how to easily generate a report with everything you need in one place using our evaluate_model function.

We'll use the heart disease dataset; the code below downloads it.

Download the data
import urllib.request
import pandas as pd
urllib.request.urlretrieve(
    "https://raw.githubusercontent.com/sharmaroshan/"
    + "Heart-UCI-Dataset/master/heart.csv",
    filename="heart.csv",
)
data = pd.read_csv("heart.csv")
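Before preparing the data, it can help to take a quick look at what was loaded. A minimal sketch (the inline sample below is illustrative, not the real heart.csv contents):

```python
import io

import pandas as pd

# Tiny inline sample mimicking a few columns of heart.csv
# (hypothetical values, for illustration only).
csv = io.StringIO(
    "age,sex,fbs,target\n"
    "63,1,1,1\n"
    "37,1,0,1\n"
    "41,0,0,0\n"
)
data = pd.read_csv(csv)

# Shape and target distribution are the first things to check.
print(data.shape)
print(data["fbs"].value_counts())
```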
Prepare the data
from sklearn.model_selection import train_test_split
column = "fbs"
X = data.drop(column, axis=1)
y = data[column]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2023
)
Define the model
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)
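Once the forest is fitted, it also exposes feature_importances_, which can hint at which columns drive the predictions. A minimal sketch on synthetic data (the data and seed are assumptions for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X_train = rng.normal(size=(100, 5))
# Make the target depend only on the first feature.
y_train = (X_train[:, 0] > 0).astype(int)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Importances sum to 1; the informative feature should dominate.
importances = model.feature_importances_
print(importances.round(3))
```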
Evaluate the model
from sklearn_evaluation.report import evaluate_model
report = evaluate_model(model, y_test, y_pred, y_score=y_score)
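If you want to sanity-check the headline numbers the report shows, you can compute the same metrics directly with scikit-learn. A hedged sketch on synthetic predictions (labels and scores below are made up, not the heart-disease results):

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
# Synthetic stand-ins for the test labels and predicted probabilities.
y_test = rng.integers(0, 2, size=61)
y_score = rng.random((61, 2))
y_score = y_score / y_score.sum(axis=1, keepdims=True)  # rows sum to 1
y_pred = y_score.argmax(axis=1)

acc = accuracy_score(y_test, y_pred)
auc = roc_auc_score(y_test, y_score[:, 1])
print(f"accuracy={acc:.3f}, auc={auc:.3f}")
```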
Embed the report
report
Model evaluation - RandomForestClassifier

balance
- Your test set is highly imbalanced
- If you need help understanding these stats, send us a message on Slack

accuracy
- Accuracy is 0.9016393442622951
- Please note your model is unbalanced, so high accuracy could be misleading

auc
- Area under curve is low for class 0
- If you need help understanding these stats, send us a message on Slack
- Number of classes: 1
- AUC (ROC) is: 0.49464285714285716

general stats
Save report as HTML
report.save("report.html")