by Christos Evangelou, MSc, PhD – Medical Writer and Editor
Manual scoring of protein biomarkers in tumor tissues by pathologists has been the gold standard for assessing cancer progression and prognosis. However, manual analysis of tissues stained with immunohistochemistry (IHC) is labor intensive and time consuming when large numbers of samples need analysis.
In a recent study, researchers at the University of Illinois Chicago trained and validated a machine learning algorithm to automatically score protein expression in digitized images of lung cancer tissues. The algorithm accurately scored the expression levels of PRMT6, a protein that predicts lung cancer prognosis, providing a high level of agreement with pathologists’ scores.1 Their findings demonstrate the feasibility of using artificial intelligence (AI) to efficiently extract valuable molecular data from lung cancer tissues.
“The findings of our study demonstrated that training algorithms for AI-based scoring can successfully replicate the level of accuracy of pathologists’ manual scoring of PRMT6 expression in lung cancer tissues,” said Associate Professor Sage J. Kim, PhD, who was the lead investigator of this study. “AI-based approaches to IHC scoring can be a reliable and efficient way to deal with a large amount of tissue samples,” she added.
Dr. Kim noted, however, that the key to improving the accuracy of machines in scoring protein expression is the pathologist’s input and adjustment to the initial AI scoring results.
“One important adjustment was in the tissue segmentation to cancer tissues versus non-cancer tissues to avoid errors from scoring non-cancer tissues or missing cancer cells,” she said.
The report was published in Cancers.
Importance of Biomarkers for Lung Cancer
Lung cancer remains the leading cause of cancer mortality in the US, with over 140,000 deaths annually.1 Despite recent progress in immunotherapy and targeted therapies for lung cancer, not all patients benefit from these new treatments. The mechanisms underlying the differences in treatment responses among patients and the development of treatment resistance are not fully understood.
Proteins involved in lung cancer progression may contribute to the development of resistance to therapies and could be useful biomarkers for predicting treatment response. One such protein is PRMT6 (protein arginine methyltransferase 6), an enzyme that modifies other proteins and is associated with poor outcomes in patients with lung cancer.1 However, manual scoring of PRMT6 expression across many samples is time consuming, limiting the wide clinical adoption of PRMT6 as a biomarker.
Improving the Speed of Biomarker Scoring
Conventionally, pathologists manually score the expression of protein biomarkers in tissues stained with IHC, producing an immunoreactive score known as the H-score.
However, this process is time consuming, particularly for studies of epigenetic mechanisms and social determinants of health, which take a population health approach and require analyzing large numbers of lung cancer tissue sections.
“Analyzing PRMT6 the traditional way poses a bottleneck for processing large cohorts to uncover complex risk factors for cancer,” said Dr. Kim. “We wanted to develop an efficient scoring method for bigger data-driven studies. Thus, in this study, using HALO software, we optimized the machine learning method for scoring PRMT6 expression on immunohistochemically stained lung cancer tissue.”
To improve the speed of biomarker scoring, the team trained and validated a machine learning algorithm for PRMT6 scoring in digitized images of IHC-stained lung cancer tissue samples.
Samples were manually annotated to identify regions of interest and to exclude artifacts.1 Using a drawing tool to outline and label regions as ground truth, an experienced non-pathologist researcher trained the MiniNet classifier in HALO to segment tissue into tumor, non-tumor, and background regions.
“Our findings showed that tissue segmentation to cancer vs. non-cancer tissues was the most critical parameter that required training and adjustment of the algorithm to prevent scoring non-cancer tissues or ignoring relevant cancer cells,” Dr. Kim explained.
The pretrained HALO AI Nuclei Segmenter network embedded in the Multiplex IHC module was used with minimal additional training to segment nuclei in all regions of interest.1 Multiplex IHC algorithms were then used to deconvolve hematoxylin and DAB and quantify DAB intensity in each cell.
“While training these networks, visual review of segmentation accuracy, real-time cross entropy readout, and validation metrics were used to determine segmentation error and the need for additional training examples,” Dr. Kim noted.
Digital scoring was first performed on eight samples that the pathologist carefully selected to encompass both typical and atypical histologic patterns. Under the supervision of a pathologist, a non-pathologist researcher manually annotated tumor regions on whole-slide images using Aperio ImageScope annotation software.
Image regions were annotated to represent three user-defined classes (carcinoma, stroma, and background) for automated tissue segmentation.1 These image regions served as input parameters for histologic pattern recognition training software to generate a training set.
The effectiveness of tissue segmentation was enhanced by feedback from one of the pathologists who manually scored the sections to improve the segmentation of cancer cells against stromal and immune cells.1
“Based on this input, training regions were edited by the AI researcher to improve the AI classifier,” Dr. Kim said.
She explained that this procedure of iteratively modifying annotations and re-executing the training algorithm was repeated until the classification reached its optimal state, which was visually confirmed by the pathologist.1 With feedback from the pathologists, thresholds were visually set to distinguish negative, weak, moderate, and strong stains.
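Once such thresholds are fixed, assigning a staining category to a cell is a simple lookup. The Python sketch below shows one way to do it. The cutoff values are placeholders, since the article does not report the thresholds the team chose, and the function name is hypothetical.

```python
from bisect import bisect_right

# Placeholder optical-density cutoffs separating the four categories.
# The study set its thresholds visually with pathologist feedback and the
# article does not report them; these numbers are for illustration only.
THRESHOLDS = (0.10, 0.25, 0.45)
LABELS = ("negative", "weak", "moderate", "strong")


def intensity_category(dab_od, thresholds=THRESHOLDS):
    """Return 0, 1, 2, or 3 for a cell's mean DAB optical density.

    A value equal to a cutoff is placed in the higher category.
    """
    return bisect_right(thresholds, dab_od)


# Made-up per-cell DAB values
for value in (0.05, 0.18, 0.30, 0.90):
    print(value, LABELS[intensity_category(value)])
```

Keeping the cutoffs in a single tuple mirrors the workflow described here: the thresholds are tuned once on the training samples and then reused unchanged for every other sample.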
The adjusted thresholds based on the training set were then used across all samples. For each sample, the H-score was calculated by multiplying each staining intensity (0, 1+, 2+, 3+) by the percentage of cells (0%–100%) in that intensity category and summing the products, giving a score from 0 to 300. When performance on the initial eight samples reached a satisfactory level, the AI algorithm was applied to the full set of 33 samples.
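The H-score arithmetic is easy to reproduce. The short Python sketch below works through it for a made-up sample. It illustrates the formula only and is not the HALO implementation; the function name and the cell counts are hypothetical.

```python
def h_score(cell_counts):
    """Compute an H-score from per-intensity cell counts.

    cell_counts maps an intensity category (0, 1, 2, 3) to the number of
    cells assigned to it. Hypothetical helper, not part of HALO.
    """
    total = sum(cell_counts.values())
    if total == 0:
        raise ValueError("no cells to score")
    # Sum over categories of: intensity x percentage of cells in that category.
    return sum(intensity * 100.0 * count / total
               for intensity, count in cell_counts.items())


# Made-up example: 1,000 tumor cells, of which 40% are negative,
# 30% weak (1+), 20% moderate (2+), and 10% strong (3+).
example = {0: 400, 1: 300, 2: 200, 3: 100}
print(h_score(example))  # 1*30 + 2*20 + 3*10 = 100.0
```

A sample in which every cell stains strongly reaches the maximum of 300, and a sample with no positive cells scores 0.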
After training the algorithm with pathologist input to distinguish cancerous regions, the HALO PRMT6 scores showed striking agreement with the pathologists’ immunoreactive scores, with a correlation coefficient of 0.88.
In addition, the intraclass correlation coefficient was 0.95, and the scale reliability coefficient was 0.96, demonstrating strong inter-rater agreement.
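Readers who want to run a similar agreement check on their own paired scores can start from the Python sketch below. It computes the Pearson correlation coefficient and Cronbach's alpha using only the standard library. Treating the reported "scale reliability coefficient" as Cronbach's alpha is an assumption, the intraclass correlation is left out because the article does not say which form was used, and the scores shown are made up.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(*raters):
    """Cronbach's alpha, treating each rater as one item of a scale."""
    k = len(raters)
    # Total score per sample, summed across raters.
    totals = [sum(scores) for scores in zip(*raters)]
    item_variance = sum(variance(r) for r in raters)
    return k / (k - 1) * (1 - item_variance / variance(totals))


# Made-up H-scores for five samples (not data from the study)
pathologist = [40, 95, 150, 210, 270]
algorithm = [55, 90, 160, 200, 280]

print(round(pearson_r(pathologist, algorithm), 3))
print(round(cronbach_alpha(pathologist, algorithm), 3))
```

Both statistics approach 1 as the two sets of scores move together, but neither detects a constant offset between raters, which is one reason studies often report an intraclass correlation alongside them.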
“We successfully optimized a machine learning algorithm for scoring PRMT6 expression in lung cancer that matches the degree of accuracy of scoring by pathologists,” Dr. Kim noted.
Despite these encouraging findings, the authors emphasized that AI scoring should not replace pathologists. Rather, human-machine collaboration, whereby pathologists train algorithms to recognize challenging tumor features, can combine the accuracy of human interpretation with the scalability of automation.
One limitation of the study was its small sample size; thus, further validation of the algorithm in large datasets is needed. In addition, fragmented or irregular growth patterns in tumor tissues can influence the pattern or intensity of PRMT6 staining.
“Digital scoring may be affected by artifacts such as nonspecific signals or altered tissue morphology when tissues are inadequately preserved or if antigenic retrieval or antibody dilution is suboptimal,” Dr. Kim cautioned. Future optimization should focus on enhancing the algorithm’s performance on heterogeneous tumor regions.
Although further improvements are needed, this study provides a framework for the precise quantification of protein biomarkers in lung cancer using AI. The authors stated that the method could be applied to samples from large cohorts to uncover relationships between PRMT6 levels and social determinants of health or clinicopathological outcomes.
“Once validated, automated scoring can become the standard for big data research and unlock population-level insights into social epigenetic research linking social exposure and biophysical consequences,” stated Dr. Kim. “We envision this benefiting underserved minorities most impacted by lung cancer disparities.”
1. Mahmoud AM, Brister E, David O, et al. Machine Learning for Digital Scoring of PRMT6 in Immunohistochemical Labeled Lung Cancer. Cancers (Basel). 2023;15(18). doi:10.3390/cancers15184582