● Academic research

Lung Cancer Histopathology: Four CNNs compared on LC25000

Four convolutional architectures evaluated head-to-head on the LC25000 lung histopathology subset — Modified AlexNet, Modified CNN, EfficientNetB4, and DenseNet-121.

Year: 2024

Role: Research project · pipeline + evaluation

Stack: TensorFlow · Keras · OpenCV · NumPy

Status: Academic research

Academic research. Not intended for clinical diagnosis or medical use.

● The problem

What it had to solve.

Histopathology slide classification is sensitive to small architectural choices — filter sizes, depth, batch normalisation placement. The brief was to evaluate four architectures on a single split, including two built from scratch (a Modified CNN and a Modified AlexNet) and two ImageNet-pretrained backbones (EfficientNetB4, DenseNet-121). The interesting question wasn't 'can we hit state-of-the-art' but 'how do small modifications and transfer learning compare when everything else is held constant?'

● The methodology

Pipeline, end-to-end.

Inputs were resized to 64×64 and normalised, then fed into four CNN architectures evaluated on a single shared train/val/test split. Modified AlexNet produced the headline result; a Modified CNN (also trained from scratch), EfficientNetB4, and DenseNet-121 made up the rest of the comparison set.

● Results

What the comparison actually said.

Comparative results

All four models were evaluated on the same held-out test split of LC25000 (lung subset, ~2,100 test images, 700 per class). Modified AlexNet led at 99.8%; EfficientNetB4 and DenseNet-121 followed within 0.2 percentage points, and the Modified CNN trailed the leader by about 3 points.

Modified AlexNet

Test accuracy: 99.8%

EfficientNetB4

Test accuracy: 99.7%

DenseNet-121

Test accuracy: 99.6%

Modified CNN

Test accuracy: 96.7%
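To put those percentages in terms of images, the sketch below converts each reported accuracy into an approximate misclassification count on the ~2,100-image test split. The accuracies are rounded to one decimal place and the split size is itself approximate, so the counts are estimates rather than figures taken from the project's confusion matrices.

```python
# Approximate number of misclassified test images implied by each reported
# accuracy. Assumes a test split of exactly 2,100 images (the write-up says
# "~2,100"), so treat the counts as rough estimates.
TEST_SET_SIZE = 2100

REPORTED_ACCURACY = {
    "Modified AlexNet": 99.8,
    "EfficientNetB4": 99.7,
    "DenseNet-121": 99.6,
    "Modified CNN": 96.7,
}


def approx_errors(accuracy_percent, n=TEST_SET_SIZE):
    """Return the rounded count of wrong predictions at a given accuracy."""
    return round(n * (100.0 - accuracy_percent) / 100.0)


error_counts = {name: approx_errors(acc) for name, acc in REPORTED_ACCURACY.items()}
# error_counts == {"Modified AlexNet": 4, "EfficientNetB4": 6,
#                  "DenseNet-121": 8, "Modified CNN": 69}
```

On a split this size each 0.1 percentage point is roughly two images, so the gap between the top three models comes down to a handful of slides.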

● Approach

Choices that shaped the comparison.

01 / 04

Held the split fixed across all models

Same pre-defined train / validation / test folders used for every architecture. Comparing accuracies on different splits silently rewards luck — the comparison only meant something with the split frozen.
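One way to check that a split really is frozen is to fingerprint the folder contents before each training run. The sketch below is illustrative rather than the project's actual code: the folder names `train`, `val`, and `test` and both helper functions are assumptions.

```python
import hashlib
from pathlib import Path

# Hypothetical split folder names; adjust to match the real layout.
SPLITS = ("train", "val", "test")


def split_manifest(root):
    """Map each split name to the sorted relative paths of its files."""
    root = Path(root)
    manifest = {}
    for split in SPLITS:
        manifest[split] = sorted(
            p.relative_to(root).as_posix()
            for p in (root / split).rglob("*")
            if p.is_file()
        )
    return manifest


def manifest_fingerprint(manifest):
    """Hash the manifest so two runs can confirm they saw the same split."""
    digest = hashlib.sha256()
    for split in sorted(manifest):
        digest.update(split.encode("utf-8"))
        for name in manifest[split]:
            digest.update(b"\0" + name.encode("utf-8"))
    return digest.hexdigest()
```

Logging the fingerprint next to each model's metrics makes it easy to see later that all four runs used the same files in the same folders.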

02 / 04

Same input footprint for every model

64×64 normalised inputs across all four architectures so every model trained on the same pixel budget. Light enough to iterate quickly on academic compute; honest enough that the comparison stayed apples-to-apples.

03 / 04

Reported per-class metrics, not just top-line accuracy

Precision, recall, and F1 per class, plus confusion matrices and ROC curves. The architectures agreed on the easy classes and diverged on the boundary cases — that's where any honest discussion has to start.
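As a reminder of how those per-class numbers fall out of a confusion matrix, here is a small dependency-free sketch. Classes are represented as integer labels 0 to 2 purely for illustration; the helper names are hypothetical and this is not the project's evaluation code.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_report(matrix):
    """Return precision, recall, and F1 for each class in the matrix."""
    n = len(matrix)
    report = []
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                        # true k, predicted other
        fp = sum(matrix[r][k] for r in range(n)) - tp   # predicted k, true other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report.append({"precision": precision, "recall": recall, "f1": f1})
    return report
```

Two models with the same overall accuracy can have very different rows in this report, which is why the per-class view is the more useful basis for comparison.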

04 / 04

Trained-from-scratch alongside transfer learning

Modified CNN and Modified AlexNet were trained from scratch; EfficientNetB4 and DenseNet-121 used frozen ImageNet backbones with a small trainable head. Keeping both styles in the comparison meant the results could speak to architectural modifications and to transfer learning, rather than to only one of them.
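The frozen-backbone setup can be sketched in Keras roughly as follows, using DenseNet-121 as the example. The head shown here (global average pooling, dropout, one softmax layer), the dropout rate, the optimiser settings, and the function name are all assumptions for illustration; the write-up only says the head was small and trainable, and any backbone-specific input preprocessing is left out.

```python
import tensorflow as tf


def build_frozen_backbone_classifier(num_classes=3,
                                     input_shape=(64, 64, 3),
                                     weights="imagenet"):
    """DenseNet-121 feature extractor with a small trainable head.

    The head layout and hyperparameters are illustrative assumptions.
    """
    backbone = tf.keras.applications.DenseNet121(
        include_top=False, weights=weights, input_shape=input_shape
    )
    backbone.trainable = False  # freeze every backbone layer

    inputs = tf.keras.Input(shape=input_shape)
    # training=False keeps batch-norm layers in inference mode.
    x = backbone(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Swapping `DenseNet121` for `tf.keras.applications.EfficientNetB4` gives the other transfer-learning variant under the same assumptions; only the head's kernel and bias are updated during training.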

● What I learned

The honest part.

The big lesson was epistemic: clean datasets like LC25000 make it easy to land a respectable accuracy number, and much harder to say something honest about where one architecture is actually better than another. Evaluating every model under identical preprocessing and split conditions, and reporting per-class behaviour rather than just top-line accuracy, turned out to matter more than any single architectural tweak.

● Current status

Where this lives now.

Completed as an academic research project at NSU. A conference-style paper, final report, and poster were produced. Code and notebooks archived; not actively developed.
