Representation learning, also known as feature learning, is a new and rapidly expanding set of machine learning techniques with numerous promising applications in digital pathology.
Graph-based models have emerged as powerful tools for solving complex image classification problems and accurately simulating biological systems, including cellular and tissue interactions.1
Additionally, graph-based models can enhance pathology-specific interpretability and human-machine co-learning. Therefore, representation learning using graph-based models has the potential to improve digital pathology workflows and transform diagnostic pathology.1
Emerging applications of graph data representations in digital pathology were systematically and extensively reviewed by Ahmedt-Aristizabal et al.1 in their article published in the journal Computerized Medical Imaging and Graphics.
Graph convolutional networks versus traditional convolutional neural networks for computational pathology
Typically, whole-slide images are divided into patches for further analysis. However, feature learning using traditional convolutional neural networks (CNNs) treats each patch independently and ignores the relationships between patches. In contrast, representation learning using graph convolutional networks (GCNs) still learns patch-wise features but also aggregates complex neighborhood information, yielding relation-aware representations of the slide.1
This characteristic of graph-based models enables them to uncover tissue composition and the spatial relationship between cells, as well as capture global tissue micro-architecture and geometrical and topological properties of tissues.1
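As a concrete illustration of this relation-aware behavior, the sketch below implements a single GCN propagation step in plain NumPy, using the widely used symmetric normalized-adjacency rule; the patch graph, features, and weights are toy values for illustration, not anything taken from the survey.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One GCN propagation step: each patch's new representation mixes
    in its neighbors' features via the normalized adjacency matrix."""
    a_hat = adj + np.eye(adj.shape[0])               # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt         # symmetric normalization
    return np.maximum(a_norm @ feats @ weight, 0.0)  # ReLU activation

# Four patches in a chain (0-1-2-3); a patch-wise CNN would embed each
# independently, whereas the GCN layer lets neighbors exchange information.
adj = np.array([[0., 1., 0., 0.],
                [1., 0., 1., 0.],
                [0., 1., 0., 1.],
                [0., 0., 1., 0.]])
feats = np.eye(4)    # one-hot patch features (toy values)
weight = np.eye(4)   # identity weights, to make the mixing visible
out = gcn_layer(adj, feats, weight)
```

After one propagation step, each patch's representation already incorporates its immediate neighbors, something a patch-independent CNN embedding cannot do; stacking further layers widens this neighborhood.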
Using graph-based deep learning to uncover tissue composition in biopsy slides
During graph-based deep learning for digital pathology, histopathological images of patients are “translated” into one or more graphs through multiple steps. The graphs retain spatial and contextual information of cells and tissues.1
After stain normalization, nuclei, image patches, tissue regions, or other image entities are detected, and their features (e.g., shape, size, orientation, and intensity) and spatial or semantic relationships are defined. The graph representation encodes these entities as nodes (carrying their features as node attributes) and their relationships as edges, and is processed using GNNs (e.g., graph pooling and node- or graph-level prediction) to generate graph models. Graph attention mechanisms, post-hoc graph explainers, and other interpretability methods can then be used to interpret the resulting graph-based models.1
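The graph-construction steps above can be sketched in a few lines; the nucleus centroids, per-nucleus features, and the k-nearest-neighbor edge rule below are illustrative assumptions rather than the specific pipeline described in the survey.

```python
import numpy as np

def build_cell_graph(centroids, features, k=2):
    """Connect each detected nucleus (node) to its k nearest
    neighbors; edges carry the Euclidean distance between nuclei."""
    n = len(centroids)
    # pairwise Euclidean distances between nucleus centroids
    diff = centroids[:, None, :] - centroids[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    edges = []
    for i in range(n):
        # nearest neighbors of i, skipping i itself (index 0 after sorting)
        for j in np.argsort(dist[i])[1:k + 1]:
            edges.append((i, int(j), float(dist[i, j])))
    return {"nodes": features, "edges": edges}

# Illustrative nuclei: (x, y) centroids plus a per-nucleus feature
# vector (e.g., area and mean intensity from the detection step).
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
features = np.array([[30.0, 0.8], [28.0, 0.7], [35.0, 0.9], [60.0, 0.4]])
graph = build_cell_graph(centroids, features, k=2)
```

The resulting node features and distance-weighted edges are exactly what a GNN then consumes to produce node- or graph-level predictions.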
Graph-structured image analysis can effectively detect objects, reveal relationships between objects, classify tissues, and assign a class to each pixel of the image. As graphs capture relationships between variables, graph-based models can be used to encode relational information between interacting objects or variables.1
Graph representations with potential applications in digital pathology include cell graphs, patch graphs, tissue graphs, and hierarchical cell-tissue representations.2
Clinical applications of graph-based models in digital pathology
Tissue structure and composition play a pivotal role in disease diagnosis based on histopathological analysis of biopsy slides. Therefore, by uncovering tissue composition and capturing complex cellular interactions within tissues, graph-based data representations can prove valuable tools in diagnostic pathology, especially in oncology.1
Emerging clinical applications of graph data representations in digital pathology include tumor localization, classification, and staging. Furthermore, GCNs, alone or in combination with CNNs, can be used to extract morphological features during multimodal fusion analysis for patient risk-stratification.3
Most emerging clinical applications of graph-based deep learning methods in digital pathology involve the analysis of biopsy slides from patients with breast cancer, colorectal cancer, and prostate cancer.3
Anand et al.4 used GCNs to classify graph representations of whole-slide images of breast cancer tissues. Nuclei detection was achieved using pre-trained CNNs, and nuclear morphology and gland formation information were included as vertex features and edge attributes, respectively. Supervised GCN training was conducted using cell graph representations of each tissue image. GCN provided high accuracy in classifying breast tissues as malignant or non-malignant.
Because graph models may be difficult to interpret, Jaume et al.5 developed quantitative measures based on cellular properties to increase the interpretability of cell graph representations for breast cancer subtyping. Histology images from patients with breast cancer were transformed into cell graphs, and various post-hoc graph explainers were used to generate an interpretation for each entity graph. Graph interpretation using the graph explainer GraphGrad-CAM++ resulted in the highest agreement in breast cancer subtyping between the GCN and the pathologists.
In a recent study, Raju et al.6 developed a graph attention multi-instance learning framework to predict tumor, node, metastasis (TNM) staging based on the spatial relationship between tumor cells and other tissue regions within colorectal cancer tissues. This framework improved the accuracy of TNM stage prediction.
In a similar approach, Zhao et al.7 developed a GCN multiple-instance learning framework with a feature selection strategy to predict lymph node metastasis in patients with colon adenocarcinoma. The proposed framework outperformed CNN-based and attention-based multiple-instance learning models in predicting lymph node metastasis.
The Gleason score is based on the architecture of prostate cancer tissues and can predict the aggressiveness of the disease. Wang et al.8 established a weakly supervised GCN pipeline to predict Gleason scores and risk-stratify patients with prostate cancer based on the spatial distribution of the glands.
Cell graph representations were generated for each image; in these graphs, nodes represented the nuclei, and edges represented the distance between neighboring nuclei. Morphological, spatial, and textural features were extracted, and a GCN using a self-supervised technique was employed to calculate attention scores to identify high-risk patients.8
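A minimal sketch of the attention-scoring idea, assuming simple softmax attention pooling over per-nucleus embeddings; the embeddings and attention weights below are toy values, not Wang et al.'s trained parameters.

```python
import numpy as np

def attention_pool(node_embeds, w_attn):
    """Score each node (nucleus) with an attention vector, then
    aggregate node embeddings into one graph-level representation."""
    logits = node_embeds @ w_attn          # one scalar score per node
    attn = np.exp(logits - logits.max())
    attn = attn / attn.sum()               # softmax over all nodes
    graph_embed = attn @ node_embeds       # attention-weighted average
    return graph_embed, attn

# Illustrative embeddings for five nuclei; the attention weights reveal
# which nuclei drive the graph-level (patient-level) prediction.
node_embeds = np.array([[0.1, 0.2],
                        [0.0, 0.1],
                        [0.9, 0.8],   # an atypical, high-scoring nucleus
                        [0.2, 0.1],
                        [0.1, 0.0]])
w_attn = np.array([1.0, 1.0])  # stand-in for learned attention parameters
graph_embed, attn = attention_pool(node_embeds, w_attn)
```

In a trained pipeline the attention parameters are learned from outcome labels; here they simply flag the nucleus whose embedding best matches the attention vector, which is how attention scores can highlight high-risk regions.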
Challenges and future perspectives
Proof-of-concept studies have confirmed the ability of graph-based deep learning to reveal phenotypical and topological characteristics of tissues, and GNN models may help pathologists detect and classify tumors and risk-stratify patients.1
Although GNN models appear to outperform traditional CNNs and other conventional deep learning methods, their clinical validation has not been as extensive. The application and performance of graph-based models when used in a clinical setting for disease diagnosis or patient risk-stratification need to be validated in large cohort studies.1
Automated estimation of graph structure with the desired properties from histopathology images remains one of the key challenges limiting the clinical implementation of GNN models.
The explainability and interpretability of graph models remain low compared with traditional CNNs. Moreover, graph-based models are computationally complex, and it might be challenging to determine which GNN architecture is the most appropriate for certain clinical applications.1
Additional challenges include the uncertainty regarding the generalizability of GNN findings in heterogeneous populations of patients with different types of cancer and the lack of human-in-the-loop systems for detecting and correcting algorithmic errors.
To learn more about the uses of graph-based deep learning for computational histopathology, please read the article by Ahmedt-Aristizabal et al., A survey on graph-based deep learning for computational histopathology, Computerized Medical Imaging and Graphics 95, 102027 (2022).
1. Ahmedt-Aristizabal D, Armin MA, Denman S, Fookes C, Petersson L. A survey on graph-based deep learning for computational histopathology. Comput Med Imaging Graph. 2022;95:102027. doi:10.1016/j.compmedimag.2021.102027
2. Pati P, Jaume G, Foncubierta-Rodríguez A, et al. Hierarchical graph representations in digital pathology. Med Image Anal. 2022;75:102264. doi:10.1016/j.media.2021.102264
3. Chen RJ, Lu MY, Wang J, et al. Pathomic fusion: an integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis. IEEE Trans Med Imaging. 2020. doi:10.1109/TMI.2020.3021387
4. Anand D, Gadiya S, Sethi A. Histographs: graphs in histopathology. In: Tomaszewski JE, Ward AD, eds. Medical Imaging 2020: Digital Pathology. Vol 11320. SPIE; 2020:150-155. doi:10.1117/12.2550114
5. Jaume G, Pati P, Bozorgtabar B, et al. Quantifying explainers of graph neural networks in computational pathology. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2021. doi:10.1109/CVPR46437.2021.00801
6. Raju A, Yao J, Haq MM, Jonnagaddala J, Huang J. Graph attention multi-instance learning for accurate colorectal cancer staging. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2020:529-539.
7. Zhao Y, Yang F, Fang Y, et al. Predicting lymph node metastasis using histopathological images based on multiple instance learning with deep graph convolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020:4837-4846.
8. Wang J, Chen RJ, Lu MY, Baras A, Mahmood F. Weakly supervised prostate TMA classification via graph convolutional networks. In: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI). IEEE; 2020:239-243.
Christos Evangelou, PhD
Christos received his Master's in Cancer Biology from Heidelberg University and his PhD from the University of Manchester. After working as a scientist in cancer research for ten years, Christos decided to switch gears and start a career as a medical writer and editor. He is passionate about communicating science and translating complex findings into clear messages for the scientific community and the wider public.