A machine learning perception system that classifies breast tumors as benign or malignant using morphological features from digitized biopsy images, achieving up to 98.2% test accuracy.
Early breast cancer diagnosis relies heavily on expert interpretation of biopsy images. Subtle differences in cell morphology—such as nucleus size, boundary irregularity, and concavity—can indicate whether a tumor is benign or malignant.
This project explores how far a machine learning system can go using only hand‑crafted morphological features, without full image‑based deep learning, to support high‑accuracy classification aligned with clinical intuition.
The system is built on the UCI Breast Cancer Wisconsin dataset, which contains morphological measurements extracted from digitized fine‑needle aspiration (FNA) biopsy images.
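As a minimal sketch of what this data looks like, the snippet below loads the copy of the Wisconsin Diagnostic dataset that ships with scikit-learn. Using scikit-learn's bundled loader instead of the files in data/ is an assumption made for illustration; the project's own loading code may differ.

```python
from sklearn.datasets import load_breast_cancer

# Assumption: scikit-learn's bundled copy stands in for the files in data/.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target  # in this copy, 0 = malignant and 1 = benign

print(X.shape)                     # (n_samples, n_features)
print(list(data.target_names))     # class labels in encoding order
print(y.value_counts().to_dict())  # class balance
```

Each of the ten underlying nucleus measurements appears three times in the feature table: as a mean, a standard error, and a "worst" value.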
The project evaluates multiple supervised learning models on the same feature set to compare performance, robustness, and sensitivity to feature scaling.
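One way such a comparison could be set up is sketched below. The three model families, their hyperparameters, the 80/20 stratified split, and the random seed are all illustrative assumptions, not the project's documented configuration. Each model is trained once on raw features and once behind a `StandardScaler`, so the gap between the two scores gives a rough view of sensitivity to scaling.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hypothetical model line-up; factories return a fresh estimator each call.
candidates = {
    "logistic_regression": lambda: LogisticRegression(max_iter=5000),
    "svm_rbf": lambda: SVC(kernel="rbf"),
    "random_forest": lambda: RandomForestClassifier(n_estimators=200, random_state=42),
}

results = {}
for name, make_model in candidates.items():
    # Same model family, trained without and with feature standardization.
    raw = make_model().fit(X_train, y_train)
    scaled = make_pipeline(StandardScaler(), make_model()).fit(X_train, y_train)
    results[name] = {
        "raw": raw.score(X_test, y_test),
        "scaled": scaled.score(X_test, y_test),
    }

for name, scores in results.items():
    print(f"{name:20s} raw={scores['raw']:.3f} scaled={scores['scaled']:.3f}")
```

Wrapping the scaler and the model in one pipeline means the scaler is fitted on the training split only, which keeps test-set statistics out of training.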
The best‑performing models reach up to 98.2% accuracy on the test set, demonstrating that carefully engineered morphological features can support highly reliable classification of breast tumors.
Analysis shows that the “worst” values of certain features (in this dataset, the mean of the three largest measurements per image rather than a strict maximum) are especially predictive of malignancy.
These findings align with clinical expectations: more irregular, larger, and more concave nuclei are more likely to be malignant.
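A simple way to check this kind of claim is to rank features by how strongly they correlate with the malignant label. The sketch below uses plain Pearson correlation on scikit-learn's bundled copy of the dataset; both the choice of correlation as the ranking criterion and the variable names are assumptions, and the project's own analysis may use a different method.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, names = data.data, data.feature_names
# In scikit-learn's copy, target 0 means malignant; build a 1.0/0.0 indicator.
malignant = (data.target == 0).astype(float)

# Pearson correlation between each feature column and the malignant indicator.
Xc = X - X.mean(axis=0)
mc = malignant - malignant.mean()
corr = (Xc * mc[:, None]).sum(axis=0) / (
    np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((mc ** 2).sum())
)

# Rank by absolute correlation and keep the five strongest features.
order = np.argsort(-np.abs(corr))
top = [(str(names[i]), float(corr[i])) for i in order[:5]]
for name, r in top:
    print(f"{name:25s} r={r:+.2f}")
```

Correlation only captures linear, one-feature-at-a-time relationships, so it is a quick sanity check and not a substitute for model-based importance measures.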
The repository is structured for clarity and experimentation:
data/ - dataset files
models/ - trained model and scaler artefacts
results/ - evaluation outputs
src/ - training and prediction scripts
notebooks/ - exploratory analysis

From medical morphology to swarm behaviour, my work focuses on building ML systems that turn raw structure into reliable, decision‑ready signals.