Let’s use deep learning to do things that our brains are not tuned to do. Let’s assist pathology in that way, rather than trying to create a virtual pathologist.
Interview with Dr. Mark Zarella
Director of Digital Pathology at Johns Hopkins Medicine, Baltimore, Maryland, United States
BIOSKETCH: Dr. Mark Zarella is focused on the deployment of digital pathology in clinical research and the development and analysis of novel techniques in imaging, diagnostics and artificial intelligence. He received his undergraduate degree in physics at the University of Massachusetts and a Ph.D. in Neuroscience at the State University of New York in 2011. His research was on cortical networks of the visual system, of which imaging and computation were important components. He carries that expertise forward to digital pathology, inspired heavily by human vision and methods in vision research to develop and explain computer vision and decision networks.
Dr. Zarella joined the Johns Hopkins faculty in 2020. Prior to joining Johns Hopkins, he was the technical director of Pathology Imaging and Informatics at Drexel University College of Medicine. He serves as a member of the board of directors of the Digital Pathology Association (DPA), a member of the College of American Pathologists (CAP) Digital and Computational Pathology Committee and has contributed to several white papers on the topics of whole slide imaging and computational pathology.
Interview by Jonathon Tunstall – 8 July 2021
Published – 13 October 2021
JT – Dr. Zarella, how does an electrical engineer become a director of pathology informatics?
MZ – That was a topsy-turvy road, and it was quite a big transition going from a strictly engineering discipline to a biological one. I started moving in that direction when I began studying neuroscience – neurophysiology in macaque monkeys, to be precise. That was mainly wet lab work with quite a bit of imaging. My PhD mentor at the time was a pioneer in optical imaging in the neuroscience domain, and I stayed in neuroscience for eight or nine years, gaining experience in the imaging techniques. After that I was lucky enough to get a faculty position in pathology at Drexel College of Medicine with the promise that I would do imaging, which was of course my interest. Even though I was jumping from neuroscience to pathology, it was still within the imaging domain and the work still had a heavy component of engineering. I was hired by a pathologist, Fernando Garcia, who was the AP director at the time, and he was really one of the pioneers of digital pathology. He had been using whole-slide imaging clinically since the late 90s, and he really got the ball rolling for me in digital pathology.
JT – So Dr. Garcia was doing this early work by himself at Drexel?
MZ – Yes, he started in the late 90s with an older system and then eventually moved to an Aperio scanner, but it was a pretty mature digital pathology infrastructure by the time I moved to Drexel. He needed someone with an engineering background as not all pathologists can necessarily write code and develop their own instruments. I think it was a really great fit for both of us and I was lucky enough to learn from someone who had been doing digital pathology for years and years.
JT – I can see the link from electrical engineering into neuroscience because there are electrical processes going on in the brain as well.
MZ – Yes. Early on, my focus was on signal processing and from an electrical engineering standpoint I was doing a lot of analog circuit design. Moving into the brain was not such a big leap as the brain is just a big signal processor. Initially, I was looking specifically at the cerebral cortex and doing a lot of computationally minded studies and from there I jumped over to digital pathology. They do sound like very different disciplines, but in reality, they are very interrelated and from an investigative standpoint, they use the same skillsets. Many of my students that work with me today come from electrical engineering backgrounds.
JT – I do think that digital pathology attracts people from multiple backgrounds, and it makes this a very interesting domain to work in being surrounded by such a diversity of different talents.
MZ – I agree, and true success in this field occurs when you can harness all that diversity. Some of the roadblocks that we face happen because we have only one viewpoint. We often have pathologists without an engineering background or, conversely, engineers without the pathology domain knowledge. I know it's a little clichéd to say, but I think that diversity and the collaborative approach are really essential in this area. You do of course find pathologists who have both skillsets, pathologists with extensive engineering skills for example, and they are real gems in this field.
JT – Tell me then how digital pathology is applied in studying neuroscience? Do neuroscience researchers build their own image analysis applications, or do they use commercial software packages?
MZ – I’ve really been out of the neuroscience world since I moved to Drexel in 2012 and I haven’t gone back in that direction, so I can’t say for sure. What I can say, however, is that a lot of what we were doing in neuroscience 10 years ago, from an imaging perspective, was in my view a bit of a precursor to some of what we are seeing in digital pathology now. That is especially true of in vivo and ex vivo microscopy. For example, if you are interrogating brain circuits, or using just optical imaging, as we were, to understand functional connectivity in the brain, then you can apply all those same technologies and the same computational and optical approaches to digital pathology. Whether you are talking about tissue in vivo or excised tissue, it’s really the same sort of approach. So, I think these two disparate fields, which often don’t talk to each other or don’t collaborate, can learn a lot from one another.
JT – Do you think the digital pathology community recognises that there is something to be learned from some of the in vivo / ex vivo techniques, or from other fields in general, or are we currently too focused on AI and pushing out the latest and greatest new algorithms?
MZ – I think there is big potential, and a lot of groups have actually realised that. Let me give you a couple of examples. You just mentioned one area that is really ripe for that commonality, and that is AI. You can apply AI algorithms to in vivo imaging; there are some differences but also a lot of similarities. When we talk about imaging, data repositories, and all the regulatory issues that we normally encounter in digital pathology, all of those same things exist for in vivo microscopy as well. This is why I believe there are a lot of lessons to be learned on both sides, and there are some people (and I consider myself one of those people) who are fluent in both.
JT – You said that you have now moved away from neuroscience. Do you still conduct any neuroscience-related research?
MZ – Yes. One area of research that I’m really interested in is gaze tracking, particularly with respect to pathology. We have looked at how pathologists, both experts and novices, view whole slide images, and we have developed some computational models that come right out of the neuroscience literature. These are designed to parse out the different components of learning, risk and all the cognitive factors that go into making a diagnosis. A lot of that is very neuroscience-based.
JT – So, looking at the way an individual interacts with the slide and how the visual cortex perceives the image?
MZ – Yes, exactly. There is a lot to learn from the pattern of eye movements, not just in terms of what people are looking at, but how they are looking at things. Whole slide imaging is also a little unique, in that viewers move pretty seamlessly through different magnifications with the scroll wheel on their mouse. As people do that, we can gain a few insights, and this has been studied and published by other groups as well. For example, novices and residents who are really starting to get the hang of looking at these slides tend to look at higher magnification images, whereas experts are extracting more information at lower magnification, lower resolution. Of course, the experts are also going faster and getting more answers correct. We published a model on this about two years ago, discussing how people can potentially operate in a kind of super-resolution manner and glean a lot of information from low magnification images. Our hypothesis is that experts have learned over time that it is more efficient, and they have learned to extract as much information as possible from the low magnification images and the larger fields of view. The trainees usually don’t have those skills yet.
JT – So you are saying that you can build a model to distinguish between the observational behaviour of experts and the trainees?
MZ – I gave a talk a couple of years ago in the UK at the Digital Pathology Congress where I presented a model as a potential explanation for this observation. The model is called the drift diffusion model, and the idea is that as you are making a decision, you are accumulating evidence. You continue to accumulate evidence and, at some point, you reach a threshold where you execute the decision. Usually, this presents as a binary decision, malignant versus benign or A versus B. We have applied this mathematical model, and one of its strengths is that by looking at behaviour, eye tracking etc., you can parse out the different cognitive factors that go into a person’s performance. We see that there are visual factors, cognitive factors, and risk factors. For example, some people are more careful; they want to ensure they have more evidence before making a decision, and this would be a strategic rather than a cognitive factor. There are usually about four or five factors that you can parse out using this model, and this can tell us a few things, such as how trainees are progressing in their training. If someone is too conservative, their risk aversion may be prominent in this model; that might be something they could work on, or it could be an inherent trait. It doesn’t necessarily mean that they are taking longer examining the slide because they don’t know what they are looking at.
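To make the evidence-accumulation idea concrete, here is a minimal Python sketch of a single drift diffusion trial. It is an illustration only: the drift rate, noise level and threshold are arbitrary placeholder values, not parameters from Dr. Zarella's published model.

```python
import numpy as np

def drift_diffusion_trial(drift=0.15, noise=1.0, threshold=3.0,
                          dt=0.01, max_time=10.0, rng=None):
    """Simulate one decision: accumulate noisy evidence until a boundary is hit.

    Returns the choice ("A" if the upper boundary is reached, "B" otherwise)
    and the decision time in seconds. All parameter values are illustrative:
      drift     - average rate of evidence favouring choice A
      noise     - standard deviation of moment-to-moment fluctuations
      threshold - accumulated evidence needed to commit to a decision
    """
    rng = rng or np.random.default_rng()
    evidence, t = 0.0, 0.0
    while abs(evidence) < threshold and t < max_time:
        evidence += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return ("A" if evidence >= threshold else "B"), t

# A more "expert-like" observer could be modelled with a higher drift rate
# (faster evidence extraction) or a lower threshold (less risk aversion),
# both of which shorten decision times in this toy simulation.
choices_and_times = [drift_diffusion_trial() for _ in range(1000)]
```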
JT – I guess the model could also be used to teach the trainees the traits of the expert pathologist?
MZ – I think so, and perhaps if coupled with a whole slide imaging system, this could become an educational platform. There is the potential (although pathologists wouldn’t necessarily welcome it) to have an algorithm that looks over your shoulder as you are viewing images in a diagnostic setting. Certainly, we can envision a few applications, and we need to get creative about how to do it in a way that is useful in pathology. Education is probably the most obvious area of application.
JT – So we have been discussing your research here. Tell me a little about your day-to-day work at Johns Hopkins and how digital pathology is being applied in your routine workload.
MZ – At Johns Hopkins I manage the operational aspects of digital pathology. We are growing our scanning considerably at the moment with investments from the hospital. We are currently engaged in increasing our scanning of consults by 100-fold. We get a lot of consults, but also a lot of patients are coming to us from other hospitals, and keeping a digital copy of the slides is very useful for a number of reasons. We are also currently focussing on building large digital slide research repositories, with the idea that we can build AI tools or some sort of analytics that will help us with a lot of different research questions. Building a data set like this from the ground up leads to a lot of possibilities. Those have been my focus areas from early on: increasing our scanning, building these research repositories, and then linking these repositories, through interfaces, with some efficient methods of annotation.
JT – So, are you building your own analytics, rather than buying commercial packages?
MZ – We are doing a little of both. We do have some commercial packages. From a research perspective we have groups in biomedical engineering, computing and other engineering disciplines who are very interested in AI generally and who can help to build the analytics and handle the datasets. We have other groups within pathology who are working on AI-based projects which are still very much at the research end. They will very much benefit from these repositories.
Another wing of my research is in explainable AI. Those efforts are not focused so much on the development of AI but on trying to understand what that AI is doing. Deep networks can be viewed as a black box, and there is a lot of advantage, in my view, in understanding what they are doing. There is a lot of non-standardization in pathology, as you know, and we have to be able to predict when an algorithm may not be appropriate for a particular dataset, lab, or workflow. It’s important to know that in advance. We can’t take an algorithm, even if it has been validated in 5 different places, and assume it’s going to work in the 6th place. Explainable AI will help us understand the limitations of the algorithm, and in my experience that is always the first question that pathologists ask when you present an algorithm to them and say, ‘this works.’ They will say, ‘well how does it work? Is it looking at the same visual features that I am looking at?’ As an example, we are looking at a few Gleason grading algorithms at the moment and many of the pathologists I’ve presented these to have asked, ‘is it looking at cells or morphology or architecture?’ and my answer is ‘I don’t know.’ We currently have a project on Gleason grading and another on breast cancer grading where we developed the algorithm and it’s not behaving as expected. It doesn’t appear to be using the image features that you would expect it to be using. We’re hoping to publish this study soon.
JT – So is the choice of features leading to false negatives? That’s what you want to avoid, isn’t it, when using that type of algorithm?
MZ – Well I have two things to say about that. Firstly, I hope that most developers are cognisant of that imbalance and are tuning their algorithms appropriately. I think most people would agree that we would rather have false positives than false negatives in most AI algorithms, and it’s pretty standard to tune your algorithm. Secondly, I think explainable AI is useful in that respect too. Some of the early publications have focussed not so much on the underpinnings of the algorithm but on providing examples to the pathologist. They may have an algorithm and, say, one million areas of analysis, and then want to understand how the algorithm is working a little bit better. So, they provide a pathologist with the 100 highest scoring patches, or the 100 lowest scoring patches, and the pathologist can visually examine the differences between them and try to glean for him or herself what the algorithm may be doing. That method can be somewhat presumptuous, as you can never establish causality with such an approach, but it is still useful, I think. Pathologists also appreciate this type of interaction because they can get a feel for what the algorithm might be doing.
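As a rough illustration of that patch-review approach, the following hypothetical Python snippet ranks analysed patches by an algorithm's score and pulls out the highest- and lowest-scoring examples for a pathologist to inspect. The identifiers and scores here are placeholders, not part of any specific published method.

```python
import numpy as np

def select_extreme_patches(patch_ids, scores, k=100):
    """Return the k highest- and k lowest-scoring patches for visual review.

    patch_ids - identifiers (e.g. file names or slide coordinates) of analysed patches
    scores    - the algorithm's per-patch scores, same length as patch_ids
    """
    order = np.argsort(scores)                    # indices sorted by ascending score
    lowest = [patch_ids[i] for i in order[:k]]    # least "suspicious" regions
    highest = [patch_ids[i] for i in order[-k:]]  # most "suspicious" regions
    return highest, lowest

# Hypothetical usage: one million analysed regions, review only the extremes.
ids = [f"patch_{i:07d}" for i in range(1_000_000)]
scores = np.random.rand(1_000_000)                # stand-in for model outputs
top_100, bottom_100 = select_extreme_patches(ids, scores, k=100)
```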
JT – The price for the utility of the algorithm is tolerance of a certain number of false negatives, 0.5% or 2% or 5%, whatever that number is. There has to be that tolerance to use these algorithms, doesn’t there?
MZ – There does, and I think you always want to compare that rate with what pathologists are currently doing. If current pathologist false negatives exceed those of the AI tool, well, the AI tool is still demonstrating some benefit. A lot of people think about this in terms of QA (I don’t really like that term), but that is about the AI tool checking on the pathologist or, vice versa, the pathologist checking on the AI algorithm. That doesn’t necessarily improve efficiency, but having that double check helps to reduce the false negatives, and if that’s your goal, then the AI is still useful.
JT – When I consider quality control of algorithms, I can’t help thinking that there are so many variabilities from one lab to another, particularly in staining intensity and the quality of slide preparation. Doesn’t that mean that an algorithm should really be trained and used only in a single lab, and may not, in fact, be transferable from one lab to another?
MZ – I don’t necessarily agree with that. I think a lot of modern AI approaches can make some assumptions about what variability might look like. We can actually measure that variability. You mentioned staining, for example. We can mimic that kind of variability during training, even if we only have simple datasets. There are a couple of approaches to that. The first is to use some sort of color normalization, where you would take any future test dataset, or a dataset that you might apply your algorithm to, and normalize it to try to mitigate some of those issues with stain variability. The second approach is to go in the opposite direction with a data augmentation strategy. There have been a couple of papers on this, and it involves introducing color variability into an otherwise very homogeneous dataset as a data augmentation procedure. A lot of people think of data augmentation as a method just to increase your sample size, but really it is a method to prevent a particular feature from being harnessed by the algorithm. For example, if you don’t want your algorithm to focus on staining, then synthetically change your staining by recoloring it. There have been some methods to do this which have been published, and they have worked pretty well, so I think this is one of the ways to mitigate some of this variability. Of course, the key to all of that is understanding the variability in the first place. There are a lot of different sources of variability, and it’s not just down to staining or staining intensity. If you can clearly understand all the features of the variability being captured in the image analysis domain, then this type of data augmentation method can really help to cancel them out.
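As an illustration of the stain-augmentation strategy described above, here is a minimal, hypothetical sketch that randomly perturbs the stain channels of an H&E training patch. The colour-deconvolution routines come from scikit-image, and the perturbation ranges are arbitrary placeholders rather than values from any particular paper.

```python
import numpy as np
from skimage.color import rgb2hed, hed2rgb

def augment_stain(rgb_image, sigma=0.05, bias=0.02, rng=None):
    """Randomly perturb the stain concentrations of an H&E image.

    Deconvolve the RGB image into haematoxylin/eosin/DAB channels, scale and
    shift each channel slightly, then reconstruct RGB. Applied during training
    so a model cannot latch onto lab-specific staining intensity.
    """
    rng = rng or np.random.default_rng()
    hed = rgb2hed(rgb_image)                         # stain-concentration space
    scale = 1.0 + rng.uniform(-sigma, sigma, size=3)
    shift = rng.uniform(-bias, bias, size=3)
    augmented = hed * scale + shift
    return np.clip(hed2rgb(augmented), 0.0, 1.0)     # back to RGB in [0, 1]

# Hypothetical usage on a random "tile": in practice this would be applied to
# each training patch, possibly alongside a colour-normalization baseline.
tile = np.random.rand(256, 256, 3)
augmented_tile = augment_stain(tile)
```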
JT – I wonder then how good these algorithms will become in the future? Do you think we will see widespread screening algorithms that screen slides before the pathologist looks at them and then present the cancer cases to him or her? Does the role of the pathologist change?
MZ – I think it has to. I think it will be for the better, but getting everyone on board with that notion now, at such an early stage, is a bit tricky. I do think that ultimately people are going to start embracing AI. I think they are going to understand that it will make their jobs easier, and I think they will also comprehend that people like us, who develop AI algorithms, do understand the shortcomings. We are not intending that this technology completely replace the pathologist. We just want to augment what they do. The analogy I like to use is that airplanes, as I understand it, can fly themselves, but we still have pilots. They get a lot of help from computers, perhaps computers do the whole job, but the general flying public still want a pilot up front. I liken that to where we currently are in pathology. I think people will still want a pathologist overseeing everything, but I think people will also be very comfortable with the idea that computers are helping them to do their job.
JT – I like that analogy, it’s very true, and it brings another question to mind concerning the fact that many labs are not digital at all. I wonder if in the future, there will be two separate tracks in pathology, traditional pathology and digital pathology, and if so, how does that work? Will the traditional labs be outcompeted by the digital labs?
MZ – That’s an interesting question. My inclination is to say that they wouldn’t be. They would perhaps be operating less efficiently, but that doesn’t necessarily put them out of business, for example if those labs are serving an underrepresented area where there isn’t a large academic medical center accessible to their patients. So, I think the two approaches can co-exist, but I don’t know exactly what that will look like.
JT – I’m thinking more of young pathologists coming into this science in 10 years’ time. Do they want to work in a traditional microscopy environment or a digital environment? I suspect it will be digital, and so does it then become hard to source people to work in the traditional environment?
MZ – I’ve seen that happen actually. When I left Drexel, it was because Hahnemann Hospital closed. We had been using quantitative IHC at Drexel for many years, and some of the young pathologists had only ever done that. The ones that did their residency there and became faculty did not know any other way. When people left and were interviewed by labs that didn’t use digital pathology, they just weren’t used to that non-digital workflow. I got a lot of questions from my former colleagues afterwards. How do we implement whole slide imaging and quantitative IHC? Of course, it’s not trivial, it requires investment and a bit of know-how, but clearly, they wanted to work in the digital environment. So, I think people who have grown up with digital pathology will want to continue working digitally; but they will also realise that it may not be feasible at their institute, and then they’ll make do.
JT – I guess we’ll see prices come down for both the scanners and the AI.
MZ – I think prices for scanners are already coming down. For the image analysis and AI, I wouldn’t say that commercial options are getting any cheaper but on the other hand we are now seeing free open-source tools entering the market. Interestingly, some of the image management systems are also now building integrations with these open-source applications. It’s quite possible that in ten years’ time, image analysis, at least basic image analysis, could be free.
JT – I wonder where this all goes in the future. It seems from what you are saying that your view of the future is a synergy between man and machine, rather than the AI completely taking over?
MZ – I think that is a true statement. I don’t think we will be able to replace pathologists anytime soon. A pathologist has a brain and a set of expertise that cannot possibly be captured by current AI tools. Deep learning in its current state is still pretty rudimentary, and so-called complex neural networks simply can’t compete with the complexity of the human brain. Having said that, there are certain things that the algorithms can be tuned to do quite quickly, more quickly than the human brain. I think the key is to understand the strengths of deep learning relative to the human brain and not just recapitulate what the brain is doing. Let’s use deep learning to do things that our brains are not tuned to do. Let’s assist pathology in that way, rather than trying to create a virtual pathologist.
JT – Dr. Zarella, we’ll leave it there. Thank you for your time today.