Breast Cancer Perception

High‑accuracy morphological classification with multi‑model ML.

A machine learning perception system that classifies breast tumors as benign or malignant using morphological features from digitized biopsy images, achieving up to 98.2% test accuracy.

Supervised learning · Medical imaging features · Multi‑model evaluation

Problem

Encoding clinical morphology into a perception system.

Early breast cancer diagnosis relies heavily on expert interpretation of biopsy images. Subtle differences in cell morphology—such as nucleus size, boundary irregularity, and concavity—can indicate whether a tumor is benign or malignant.

This project explores how far a machine learning system can go using only hand‑crafted morphological features, without full image‑based deep learning, to support high‑accuracy classification aligned with clinical intuition.

Dataset

Morphological features from digitized biopsies.

The system is built on the UCI Breast Cancer Wisconsin (Diagnostic) dataset, which contains morphological measurements of cell nuclei extracted from digitized images of fine‑needle aspirate (FNA) samples of breast masses.

569 labeled samples (357 benign, 212 malignant)
30 continuous morphological features: the mean, standard error, and “worst” value of 10 nucleus measurements
Features derived from cell nuclei shape, size, and texture
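As a quick sanity check on the figures above, the sketch below loads the copy of this dataset that ships with scikit-learn and prints its shape and class balance. Using `load_breast_cancer` is an assumption made for illustration; the project's own data/ files may be loaded differently.

```python
from collections import Counter

from sklearn.datasets import load_breast_cancer

# Assumption: scikit-learn's bundled copy stands in for the project's data/ files.
data = load_breast_cancer()
X, y = data.data, data.target

# In this copy, label 0 is malignant and label 1 is benign.
class_counts = Counter(str(data.target_names[label]) for label in y)

print(X.shape)                      # (samples, features)
print(dict(class_counts))           # samples per class
print(list(data.feature_names[:3])) # first few feature names
```

Note the label encoding: in the scikit-learn copy, malignant is 0 and benign is 1, which is easy to get backwards when reporting sensitivity.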

Models

A multi‑model pipeline for robust classification.

The project evaluates multiple supervised learning models on the same feature set to compare performance, robustness, and sensitivity to feature scaling.

Logistic Regression
Support Vector Machine (RBF kernel)
Random Forest Classifier
Multi‑Layer Perceptron (Neural Network)
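A minimal sketch of such a comparison follows, assuming scikit-learn, an 80/20 stratified split, and near-default hyperparameters. None of these settings are stated on this page, so the numbers it prints are not expected to match the reported results exactly.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Assumed split: 80/20, stratified by class. The project's actual split is not stated.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Illustrative hyperparameters, not the project's tuned values.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM (RBF kernel)": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=42),
}

scores = {}
for name, model in models.items():
    # Keeping the scaler inside the pipeline means it is fit on training data only.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    scores[name] = pipeline.score(X_test, y_test)

for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Scaling matters most for the SVM, logistic regression, and MLP; the random forest is largely insensitive to it, so wrapping every model in the same pipeline is a convenience that keeps the comparison uniform.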

Results

High accuracy aligned with clinical intuition.

The best‑performing models reach up to 98.2% accuracy on the held‑out test set, suggesting that carefully engineered morphological features alone can support reliable benign/malignant classification.
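Accuracy is the fraction of test samples classified correctly. In a diagnostic setting it is often read alongside sensitivity (malignant cases caught) and specificity (benign cases cleared). The helper below is a hypothetical illustration with toy labels, not the project's evaluation code or its results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Toy labels for illustration only (1 = malignant, 0 = benign).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case one malignant sample is missed, so accuracy is 0.9 while sensitivity is only 0.75: the kind of gap a single accuracy figure can hide.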

Feature Insights

Worst‑case morphology as a key signal.

Analysis shows that the “worst” values of certain features (in this dataset, the mean of the three largest measurements per image) are especially predictive of malignancy:

Worst perimeter: nucleus boundary length, which grows with both size and contour irregularity
Worst area: size of the largest nuclei in the sample
Worst concave points: number of concave portions of the nuclear contour

These findings align with clinical expectations: more irregular, larger, and more concave nuclei are more likely to be malignant.
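One simple way to probe which features carry the most signal is to rank them by the absolute correlation between each feature and a malignant indicator. The page does not say which analysis was used, so the sketch below is an assumed stand-in (scikit-learn's copy of the data, plain Pearson correlation) rather than the project's method.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, y = data.data, data.target

# scikit-learn encodes malignant as 0; flip it so that 1 means malignant.
malignant = (y == 0).astype(float)

# Pearson correlation of each feature column with the malignant indicator.
X_centered = X - X.mean(axis=0)
m_centered = malignant - malignant.mean()
corr = (X_centered.T @ m_centered) / (
    np.sqrt((X_centered ** 2).sum(axis=0)) * np.sqrt((m_centered ** 2).sum())
)

# Rank features by absolute correlation, strongest first.
ranking = sorted(
    zip(data.feature_names, corr), key=lambda pair: abs(pair[1]), reverse=True
)
top_features = [str(name) for name, _ in ranking[:10]]

for name, r in ranking[:10]:
    print(f"{name:<22s} r = {r:+.3f}")
```

Univariate correlation ignores interactions and the strong redundancy among size features (radius, perimeter, area), so it is best treated as a first look; model-based importances or permutation tests would be natural follow-ups.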

Implementation

A reusable perception module for experimentation.

The repository is structured for clarity and experimentation:

data/ — dataset files
models/ — trained model and scaler artefacts
results/ — evaluation outputs
src/ — training and prediction scripts
notebooks/ — exploratory analysis
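To illustrate how model and scaler artefacts might move between a training script and a prediction script, here is a minimal sketch using joblib. The file names `model.joblib` and `scaler.joblib` and both helper functions are hypothetical; the repository's actual scripts and formats are not shown on this page.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Fit on all data purely to have something to save; a real run would use a train split.
scaler = StandardScaler().fit(X)
model = LogisticRegression(max_iter=1000).fit(scaler.transform(X), y)


def save_artifacts(model_dir, model, scaler):
    """Write the model and scaler to model_dir (hypothetical file names)."""
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, model_dir / "model.joblib")
    joblib.dump(scaler, model_dir / "scaler.joblib")


def load_and_predict(model_dir, samples):
    """Reload both artefacts and classify raw (unscaled) feature rows."""
    model_dir = Path(model_dir)
    loaded_model = joblib.load(model_dir / "model.joblib")
    loaded_scaler = joblib.load(model_dir / "scaler.joblib")
    return loaded_model.predict(loaded_scaler.transform(samples))


# A temporary directory stands in for models/ so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    save_artifacts(Path(tmp) / "models", model, scaler)
    reloaded_preds = load_and_predict(Path(tmp) / "models", X[:5])

direct_preds = model.predict(scaler.transform(X[:5]))
print(reloaded_preds, direct_preds)
```

Saving the scaler next to the model matters because predictions are only valid when new samples are scaled with the statistics learned at training time.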

Perception systems for real decisions.

From medical morphology to swarm behaviour, my work focuses on building ML systems that turn raw structure into reliable, decision‑ready signals.
