Experiment tracking#
SQLiteTracker provides a powerful and flexible way to track computational (e.g., Machine Learning) experiments using a SQLite database. It lets you use SQL as the query language, giving you a powerful tool for experiment comparison, and it comes with plotting features to compare plots side-by-side and to combine plots for better comparison.
Read more about the motivations in our blog post, and check out the HN discussion.
This tutorial walks you through the features with a Machine Learning use case; however, the tracker is generic enough to be used in any other domain.
from pathlib import Path
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# delete our example database, if any
db = Path("my_experiments.db")
if db.exists():
    db.unlink()
from sklearn_evaluation import SQLiteTracker
tracker = SQLiteTracker("my_experiments.db")
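The tracker stores everything in a regular SQLite file, so any SQLite client can inspect it. As a quick sanity check, here is a minimal sketch using the standard library to list the tables in the file (their names are an internal detail of SQLiteTracker and may change between versions):
import sqlite3

# open the file the tracker just created and list its tables
# (the table names are an implementation detail, shown only for illustration)
conn = sqlite3.connect("my_experiments.db")
print(conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall())
conn.close()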
X, y = datasets.make_classification(200, 10, n_informative=5, class_sep=0.65)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)
models = [RandomForestClassifier(), LogisticRegression(), DecisionTreeClassifier()]
Training and logging models#
for m in models:
    model = type(m).__name__
    print(f"Fitting {model}")
    experiment = tracker.new_experiment()

    m.fit(X_train, y_train)
    y_pred = m.predict(X_test)
    acc = accuracy_score(y_test, y_pred)

    # log a dictionary with log_dict
    experiment.log_dict({"accuracy": acc, "model": model, **m.get_params()})
Fitting RandomForestClassifier
Fitting LogisticRegression
Fitting DecisionTreeClassifier
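log_dict is convenient when you already have all the values in a dictionary. The sketch below assumes the experiment object also exposes a log(key, value) method for logging values one at a time (check the user guide for the exact API of your installed version):
# hypothetical single-value logging; verify that experiment.log exists
# in your installed version of sklearn-evaluation
extra = RandomForestClassifier(n_estimators=200)
extra.fit(X_train, y_train)

experiment = tracker.new_experiment()
experiment.log("model", type(extra).__name__)
experiment.log("accuracy", accuracy_score(y_test, extra.predict(X_test)))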
Displaying latest experiments#
Display the tracker object to show the latest experiments:
tracker
SQLiteTracker
| uuid | created | parameters | comment |
|---|---|---|---|
| 7838a86d | 2023-03-20 14:22:00 | {"accuracy": 0.5757575757575758, "model": "DecisionTreeClassifier", "ccp_alpha": 0.0, "class_weight": null, "criterion": "gini", "max_depth": null, "max_features": null, "max_leaf_nodes": null, "min_impurity_decrease": 0.0, "min_samples_leaf": 1, "min_samples_split": 2, "min_weight_fraction_leaf": 0.0, "random_state": null, "splitter": "best"} | |
| 967f9125 | 2023-03-20 14:21:59 | {"accuracy": 0.7121212121212122, "model": "RandomForestClassifier", "bootstrap": true, "ccp_alpha": 0.0, "class_weight": null, "criterion": "gini", "max_depth": null, "max_features": "sqrt", "max_leaf_nodes": null, "max_samples": null, "min_impurity_decrease": 0.0, "min_samples_leaf": 1, "min_samples_split": 2, "min_weight_fraction_leaf": 0.0, "n_estimators": 100, "n_jobs": null, "oob_score": false, "random_state": null, "verbose": 0, "warm_start": false} | |
| 15d673ec | 2023-03-20 14:21:59 | {"accuracy": 0.7575757575757576, "model": "LogisticRegression", "C": 1.0, "class_weight": null, "dual": false, "fit_intercept": true, "intercept_scaling": 1, "l1_ratio": null, "max_iter": 100, "multi_class": "auto", "n_jobs": null, "penalty": "l2", "random_state": null, "solver": "lbfgs", "tol": 0.0001, "verbose": 0, "warm_start": false} | |
(Most recent experiments)
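Since everything lives in SQLite, you can compare experiments with plain SQL. The sketch below is a minimal example: it assumes the data is stored in a table named experiments (an implementation detail) with the uuid, created, parameters, and comment columns shown above, and that parameters is stored as JSON, so SQLite's json_extract can pull out individual values:
# query experiments with SQL; the table name and JSON storage are assumptions,
# see the user guide for the exact schema and query API
results = tracker.query(
    """
    SELECT
        uuid,
        json_extract(parameters, '$.model') AS model,
        json_extract(parameters, '$.accuracy') AS accuracy
    FROM experiments
    ORDER BY accuracy DESC
    """
)
results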
Tip
Click here to see the detailed user guide.