The advance of machines into medical diagnostics continues. Machines learn from a large dataset of images and a linked conclusion: information on whether cancer, diabetic retinopathy, or some other condition is present. It should come as no surprise that, given no information on how humans reach their conclusions, the machines found their own path, one hidden from our understanding. This opacity, and our inability to have faith in conclusions we cannot understand, represents a profound stumbling block to the adoption of these tools. The authors of a study in Radiology are taking a new, well, really old, approach to explanation.
Back in the day, before the cloud and, yes, even before Google, say when IBM roamed the earth, early investigators of artificial intelligence worked on Expert Systems. The idea was that by modeling the judgment of experts we could replicate their thought processes; machines would mimic experts in a more rapid and quantifiable way. But it turns out that experts have a difficult time expressing their judgment: tacit or implicit knowledge, our “gut instinct” built upon experience, is hard to articulate. Additionally, expert knowledge is full of jargon, and the gist and context of an expert’s words need to be translated for the modelers. These systems have largely been abandoned, but the researchers have returned to that old-school approach, modeling the experts, in a much more effective way.
Reports dictated by radiologists provide a unique opportunity to teach machines because they combine how we speak and write (natural language) within a structured report. The dictation and transcription of radiology reports have been automated for some time and were designed in conjunction with radiologists, the experts providing a standard structure and templated responses – a window into their thinking. For example, a radiologist may, after looking at a chest X-ray, simply dictate “normal,” and the dictation system will record a default paragraph detailing, in a structured way, all the components the radiologist considered in calling the study normal.
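To make that templating concrete, here is a small, purely hypothetical sketch in Python of how a one-word dictation might expand into a structured default report; the section names and wording are illustrative assumptions, not any vendor’s actual template.

```python
# Hypothetical sketch: a one-word dictation ("normal") expanding into a
# structured default report. Section names and wording are illustrative
# assumptions, not an actual dictation-system template.
NORMAL_CHEST_XRAY_TEMPLATE = {
    "lungs": "Clear bilaterally. No focal consolidation, effusion, or pneumothorax.",
    "heart": "Normal cardiac silhouette.",
    "mediastinum": "No mediastinal widening or mass.",
    "bones": "No acute osseous abnormality.",
    "impression": "No acute cardiopulmonary process.",
}

def expand_dictation(dictated_word: str) -> dict:
    """Expand a templated one-word dictation into its structured default paragraph."""
    if dictated_word.strip().lower() == "normal":
        return NORMAL_CHEST_XRAY_TEMPLATE
    # Anything other than the template keyword is dictated finding by finding.
    raise ValueError(f"No template for dictation: {dictated_word!r}")

print(expand_dictation("Normal")["impression"])
```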
Using radiology reports to capture experts’ thinking is a beautiful insight. By abstracting thousands of reports, machines can be given crowd-sourced, structured descriptions of images, descriptions that could form the basis for explanations we can more readily comprehend. But first, computers need to learn the radiologists’ structured approach, the subject of this study.
Using about 96,000 radiologist reports on computed tomography (CT) scans of the head, the researchers created an algorithm to identify the pertinent findings in CT scans – a taxonomy of what radiologists felt was relevant and why. They manually labeled 1,004 reports to identify the salient features for the algorithm to find, and then applied a series of natural language processing (NLP) algorithms to characterize those findings.
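As a rough sketch of what that manual labeling step might produce, consider the following; the finding spans and the pertinent/critical flags are illustrative assumptions, not the study’s actual annotation schema.

```python
# Hypothetical sketch of one manually labeled report. The finding spans and
# the pertinent/critical flags are illustrative assumptions, not the study's
# actual annotation schema.
labeled_report = {
    "report_text": (
        "There is a 9 mm acute left subdural hematoma. No midline shift. "
        "Ventricles and sulci are normal in size and configuration."
    ),
    "findings": [
        {"span": "acute left subdural hematoma", "pertinent": True, "critical": True},
        {"span": "No midline shift", "pertinent": True, "critical": False},
    ],
}

# A collection of such hand-labeled reports (1,004 in the study) becomes the
# training material from which the NLP models learn to recognize the same
# findings in the remaining, unlabeled reports.
training_set = [labeled_report]  # ...plus the rest of the labeled reports
```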
These NLP algorithms do simple stuff like word counts, or more sophisticated things like predicting the next word or phrase in a sentence. The result restructures radiology reports, the experts’ thought process, into an algorithm. The dream of expert systems, to mimic experts, has been accomplished, just not in the way it was originally conceived back in the day. The algorithm was then set loose on the remaining roughly 95,000 reports, where it identified 90.25% of all pertinent and 92.59% of all critical CT findings. In the next phase of their work, they plan to apply the algorithmically derived descriptions to the CT images themselves.
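For the flavor of such a pipeline, here is a minimal sketch using word-count features and a simple classifier; it is an illustrative stand-in built on toy data, not the models the authors actually used.

```python
# Minimal sketch of the kind of pipeline described above: word-count
# ("bag of words") features feeding a simple classifier. Toy data and label
# meanings are assumptions; this is a stand-in, not the authors' models.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy sentences standing in for the hand-annotated reports:
# 1 = contains a critical finding, 0 = does not.
sentences = [
    "acute subdural hematoma with midline shift",
    "no acute intracranial hemorrhage",
    "large territorial infarct with mass effect",
    "ventricles and sulci are normal in size",
]
labels = [1, 0, 1, 0]

# CountVectorizer supplies the "simple stuff like word counts"; the
# classifier learns which counts signal a critical finding.
model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(sentences, labels)

# The trained model is then "set loose" on unlabeled report text.
print(model.predict(["subdural hematoma is again noted"]))   # likely 1 on this toy data
print(model.predict(["no hemorrhage; ventricles normal"]))   # likely 0 on this toy data
```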
If the next phase is successful, we may have machines that can express the why of their decision making in terms that physicians can more readily accept and defend. It also points to a pathway for adapting other semi-structured medical reports to machine learning in a way that makes use of the hard-won implicit knowledge of experts. For expert systems, everything old is new again.
Disclaimer: Some of the work in the paper was contributed by researchers at Verily Life Sciences. I am a proud family member of a Verily employee.
Source: Natural Language–Based Machine Learning Models for the Annotation of Clinical Radiology Reports, Radiology, DOI: 10.1148/radiol.2018171093