New Deep Learning Model Predicts Cancer Aggressiveness From Routine Pathology Slides

by Christos Evangelou, MSc, PhD – Medical Writer and Editor

In a new study, researchers at the University of Naples and the University of Molise developed a deep learning model to predict Ki-67 immunohistochemical staining from hematoxylin-eosin (H&E)-stained images of oral cancer samples. After training the model on H&E and Ki-67 image pairs, researchers used it to predict Ki-67 in routine slides.1

The model showed promise for obtaining Ki-67 information directly from H&E slides without additional staining, reducing both consumable use and turnaround times. Although further tuning on more diverse datasets is needed to improve its generalizability, the model could help prioritize cases in digital pathology workflows.

“We demonstrated that digital pathology could provide accurate and reproducible prediction and quantification of Ki-67 expression in tumor tissues of oral cancer samples,” said Francesco Merolla, MD, PhD, associate professor at the University of Molise and the corresponding author of this study.

“Digital pathology could facilitate the standardization and automation of Ki-67 assessment in clinical practice and research,” he added.

The report was published in the Journal of Pathology Informatics.


Determining the expression levels of the proliferation marker Ki-67 provides valuable information about the aggressiveness and likely prognosis of many cancer types, including oral squamous cell carcinoma (OSCC). However, assessing Ki-67 levels using standard tissue staining methods increases pathology laboratory workloads.

“Our study aimed to investigate the potential of generative networks to generate synthetic immunohistochemical stains and their reproducibility by means of quantitative analysis of Ki-67 expression,” said Dr. Merolla.

To overcome the need for immunohistochemical staining of tissue samples to determine Ki-67 expression in tumor tissues, researchers trained a deep neural network that could directly predict patterns of Ki-67 staining using only standard H&E images as input.1 By bypassing additional slide preparation and staining, AI-assisted computational pathology techniques could reduce costs and turnaround times while still extracting key prognostic biomarkers.

Data Generation and Model Design

The researchers obtained digitized whole slide images from 175 archived formalin-fixed paraffin-embedded oral cancer specimens, prepared initially with H&E staining for standard histopathological analysis.1 After scanning and destaining the slides, they immunostained them for Ki-67 expression. This allowed researchers to generate a dataset of 349 OSCC tissue cores. They then carefully aligned the H&E and Ki-67 images pixel-by-pixel to obtain input and expected output pairs for deep learning.

“Through a process of destaining and a consequent restaining of the slides, we could perfectly align matching cores with pixel-wise precision,” explained Dr. Merolla.

The researchers split the 349 extracted tissue microarray cores into 60% for model training, 10% for testing model performance iteratively during training, and 30% for final model validation.1
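A 60/10/30 split of this kind can be sketched in a few lines of Python. This is only an illustration: the article does not say how cores were assigned to each subset, so the random shuffle, the seed, and the function name `split_cores` are assumptions rather than the authors' procedure.

```python
import random


def split_cores(core_ids, fractions=(0.6, 0.1, 0.3), seed=0):
    """Shuffle core IDs and split them into train/test/validation subsets.

    Hypothetical helper: a simple random split is assumed here; the study
    may have assigned cores differently (for example, per patient).
    """
    ids = list(core_ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    n_train = round(fractions[0] * len(ids))
    n_test = round(fractions[1] * len(ids))
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    validation = ids[n_train + n_test:]  # remainder goes to validation
    return train, test, validation


train, test, validation = split_cores(range(349))
print(len(train), len(test), len(validation))
```

With 349 cores, this rounding yields 209 training, 35 test, and 105 validation cores. In practice, cores from the same patient would usually be kept in the same subset to avoid leakage between training and validation.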

The team then trained a generative adversarial network model called pix2pix to learn associations between H&E and Ki-67 patterns in a supervised fashion. This training process allowed the model to predict synthetic immunohistochemical Ki-67 stains directly from H&E input slides.1
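For readers unfamiliar with pix2pix, the training objective described in the original pix2pix paper combines an adversarial term with a pixel-wise L1 term. In the notation below, x is an H&E image, y is the aligned real Ki-67 image, z is the generator's noise input, G is the generator, and D is the discriminator. The article does not report the loss weighting or other settings used in this study, so λ is left symbolic.

```latex
\mathcal{L}_{\mathrm{cGAN}}(G, D) =
  \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

\mathcal{L}_{L1}(G) =
  \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_{1}\big]

G^{*} = \arg\min_{G}\max_{D}\;
  \mathcal{L}_{\mathrm{cGAN}}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
```

The adversarial term pushes the generated stain to look realistic, while the L1 term keeps it close to the pixel-aligned real stain, which is why precise H&E and Ki-67 alignment matters for this kind of model.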

The Deep Learning Model Generates Realistic Synthetic Staining Images

In a proof-of-concept study, the researchers found that the deep learning model could generate realistic Ki-67 immunohistochemistry stains directly from H&E tissue slides of OSCC samples without requiring additional staining or processing.1

In a double-blinded test, two pathologists could discern the AI-generated Ki-67 images from real immunostains in only around 55% of cases, suggesting that synthetic Ki-67 images demonstrated realistic tissue patterns.1 Quantitative analysis of the validation dataset showed that predicted Ki-67 positivity was moderately correlated with actual Ki-67 expression (R² = 0.56, P < 0.001).
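As a rough illustration of how such a correlation is quantified, the sketch below computes the squared Pearson correlation between paired real and predicted Ki-67 positivity values. It assumes that the reported R² comes from a simple linear relationship between the two measurements. The function name `r_squared` and the sample percentages are hypothetical and are not data from the study.

```python
def r_squared(real, predicted):
    """Squared Pearson correlation between two paired sequences."""
    if len(real) != len(predicted) or len(real) < 2:
        raise ValueError("need two paired sequences with at least 2 values")
    mean_x = sum(real) / len(real)
    mean_y = sum(predicted) / len(predicted)
    # Sums of squared deviations and of cross-products
    sxx = sum((x - mean_x) ** 2 for x in real)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(real, predicted))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy ** 2) / (sxx * syy)


# Hypothetical per-core Ki-67 positivity (%), for illustration only
real_ki67 = [4.0, 9.5, 12.0, 21.0, 33.0]
synthetic_ki67 = [6.0, 7.0, 15.5, 17.0, 29.0]
print(round(r_squared(real_ki67, synthetic_ki67), 2))
```

A value of 1 would mean the synthetic stains reproduce the real positivity perfectly up to a linear rescaling. The reported 0.56 means that a little over half of the variance in real Ki-67 positivity is explained by the prediction.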

“Our model generated realistic images that were undetectable by trained pathologists, and the quality of the images was confirmed by the automatic analysis performed with QuPath,” noted Dr. Merolla.

The Deep Learning Model Extracts Prognostic Information From Routine H&E Pathology Images

The proposed AI approach could extract critical prognostic information about cancer proliferation rates and patient outcomes from routine pathology slides. The model showed good accuracy (74%–88%) against key diagnostic Ki-67 expression category thresholds indicating high versus low proliferation rates and, hence, cancer aggressiveness.1 Categorical analysis using Ki-67 cutoffs of 5%, 10%, and 15% showed positive-predictive values of 79.3%, 77.8%, and 75.0%, respectively.
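The categorical analysis can be sketched as follows: each core is labeled high or low proliferation at a given cutoff, and the labels from the synthetic stain are compared with those from the real stain. This is a minimal sketch under stated assumptions. Treating values at or above the cutoff as "high" is an assumption, and the function name `categorical_metrics` and the sample values are hypothetical, not taken from the study.

```python
def categorical_metrics(real, predicted, cutoff):
    """Accuracy and positive-predictive value at one Ki-67 cutoff (%).

    A core counts as "high proliferation" when its positivity is at or
    above the cutoff (assumed convention).
    """
    tp = fp = tn = fn = 0
    for r, p in zip(real, predicted):
        real_high, pred_high = r >= cutoff, p >= cutoff
        if pred_high and real_high:
            tp += 1
        elif pred_high and not real_high:
            fp += 1
        elif not pred_high and real_high:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else None,
        # PPV: share of predicted-high cores that are truly high
        "ppv": tp / (tp + fp) if (tp + fp) else None,
    }


# Hypothetical per-core Ki-67 positivity (%), for illustration only
real_ki67 = [2, 8, 12, 20, 30]
synthetic_ki67 = [4, 12, 9, 18, 25]
for cutoff in (5, 10, 15):
    print(cutoff, categorical_metrics(real_ki67, synthetic_ki67, cutoff))
```

Reporting the metrics at several cutoffs, as the study does, shows how sensitive the agreement is to where the high-versus-low threshold is placed.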

Implications: Reduced Costs and Delays

Because preparing, staining, and analyzing additional slides to assess Ki-67 substantially increases pathology workloads, the ability to computationally predict Ki-67 expression from standard H&E slides with deep learning could considerably improve efficiency.

The findings of this study demonstrate the feasibility of using deep learning to generate synthetic but realistic surrogate Ki-67 data directly from routine H&E slide scans, extracting critical prognostic cancer biomarkers without additional slide preparation or staining.

“By avoiding the consumable reagents, technical personnel time, and workflow delays involved in additional immunohistochemistry, computational assessment of cancer proliferation rates and likely aggressiveness could aid pathologists and clinical decision making while reducing costs,” added Dr. Merolla.

Future Directions

Despite demonstrating promising performance and useful potential clinical applications, the model requires further validation across more diverse oral cancer samples beyond this initial institutional dataset. Testing for generalizability across cancer types and laboratories is also required to establish the robustness of the model in different settings.

“Our study proves the feasibility of the application of generative AI in OSCC, but extensive studies are necessary to evaluate the reliability in different tissues and clinical practice. This will be the focus of future studies,” said Dr. Merolla.

Contrast enhancement and architecture tuning could further improve synthetic Ki-67 accuracy, and model training on digital pathology data flows could one day allow automated and rapid computational prediction of cancer proliferation rates upon slide scanning.

With digital and computational innovation transforming modern pathology, creative AI systems can automate tedious manual analyses to improve clinician productivity. This study outlines a pragmatic deep learning approach that could extract more value from routine histopathological workflows and improve their efficiency through smarter use of existing tissue images.

“The application of generative and predictive models represents the fourth revolution in anatomical pathology and represents the first step to the real digitization of anatomical pathology,” concluded Francesco Martino, the first author of the study.

The study was funded by POR Campania FESR. The scientific director was Prof. Stefania Staibano, Director of the Pathology Unit, University of Naples “Federico II”.


  1. Martino F, Ilardi G, Varricchio S, et al. A deep learning model to predict Ki-67 positivity in oral squamous cell carcinoma. J Pathol Inform. 2023;15:100354. Published 2023 Nov 22. doi:10.1016/j.jpi.2023.100354
