Classification - Hands-On Machine Learning Book

Ziad Tamim / April 23, 2025

Classification, Machine Learning, MNIST

This project is from the book "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition".

This project explores the fundamentals of supervised machine learning by building a digit classifier using the MNIST dataset, a widely used benchmark in the field. The goal was to implement and evaluate various classification strategies, starting with binary classification and extending to multiclass, multilabel, and multioutput tasks. Using Scikit-Learn, I have:

Trained a binary classifier to detect the digit “5” using an SGDClassifier, analyzed its performance using confusion matrices, precision, recall, and F1 scores, and visualized the precision-recall and ROC curves.
Compared classifiers like SGD and Random Forest, demonstrating the importance of precision-recall trade-offs and using threshold tuning to optimize for different metrics.
Extended the model to multiclass classification, training models that predict digits from 0 to 9 using One-vs-One and One-vs-Rest strategies with SVM and SGD.
Performed in-depth error analysis using normalized confusion matrices and visual inspection of misclassified digits to identify model weaknesses.
Implemented multilabel classification (e.g., large vs. small digits, odd vs. even) and a multioutput classification system to denoise handwritten digit images using KNeighborsClassifier.
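
To make the first two steps above more concrete, here is a minimal sketch of the binary "5 detector" and the threshold tuning idea. It is not the exact code from the project: as an assumption, it uses Scikit-Learn's small bundled digits dataset (load_digits, 8x8 images) in place of MNIST so it runs offline, and the 90% precision target and variable names are only illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import (confusion_matrix, precision_recall_curve,
                             precision_score, recall_score)
from sklearn.model_selection import cross_val_predict, train_test_split

# Assumption: load_digits (8x8 images bundled with Scikit-Learn) stands in
# for MNIST so the example needs no download.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Binary target: True for every 5, False for every other digit.
y_train_5 = (y_train == 5)

sgd_clf = SGDClassifier(random_state=42)

# Out-of-fold decision scores, so each score comes from a model that
# never saw that sample during training.
y_scores = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3,
                             method="decision_function")

# The classifier's default decision rule is "score > 0".
y_pred_default = y_scores > 0
cm = confusion_matrix(y_train_5, y_pred_default)
print("Confusion matrix at the default threshold:\n", cm)

# Precision and recall for every candidate threshold.
precisions, recalls, thresholds = precision_recall_curve(y_train_5, y_scores)

# Illustrative target: the lowest threshold that reaches 90% precision.
target_precision = 0.90
reached = precisions[:-1] >= target_precision
if not reached.any():
    raise ValueError("No threshold reaches the target precision")
threshold_90 = thresholds[int(np.argmax(reached))]

y_pred_90 = y_scores >= threshold_90
print("Precision:", precision_score(y_train_5, y_pred_90))
print("Recall:   ", recall_score(y_train_5, y_pred_90))
```

Raising the threshold generally trades recall for precision, which is why the choice depends on which kind of error matters more for the task at hand.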

This project served as a foundational hands-on exercise in building, evaluating, and improving classification systems, while demonstrating the value of choosing the right metrics and understanding classifier behavior through visualization and error analysis.

Links