Diagnosis of Acute Poisoning using explainable artificial intelligence.
Academic Article
abstract
INTRODUCTION: Medical toxicology is the clinical specialty that treats the toxic effects of substances, for example, an overdose, a medication error, or a scorpion sting. As in other medical specialties, the volume of toxicological knowledge and research has outstripped the ability of the individual clinician to master it entirely and stay current with it. Applying machine learning/artificial intelligence (ML/AI) techniques to medical toxicology is challenging because initial treatment decisions are often based on a few pieces of textual data and rely heavily on experience and prior knowledge. Moreover, ML/AI techniques often do not represent knowledge in a way that is transparent to the physician, which raises barriers to usability. Logic-based systems are more transparent, but they often generalize poorly and require expert curation to implement and maintain.
METHODS: We constructed a probabilistic logic network to model how a toxicologist recognizes a toxidrome, using only physical exam findings. Our approach transparently mimics the knowledge representation and decision-making of practicing clinicians. We created a library of 300 synthetic cases of varying clinical complexity. Each case contained 5 physical exam findings drawn from a mixture of 1 or 2 toxidromes. We used this library to evaluate the performance of our probabilistic logic network, dubbed Tak, against 2 medical toxicologists and a decision tree model, and to assess its ability to recover the actual diagnosis.
RESULTS: The inter-rater reliability between Tak and the consensus of human raters was κ = 0.8432 for straightforward cases, 0.4396 for moderately complex cases, and 0.3331 for challenging cases. The inter-rater reliability between the decision tree classifier and the consensus of human raters was κ = 0.2522 for straightforward cases, 0.1963 for moderately complex cases, and 0.0331 for challenging cases.
CONCLUSIONS: Tak performs comparably to humans on straightforward and intermediate-difficulty cases, but is outperformed by humans on challenging clinical cases. Tak outperforms a decision tree classifier at all levels of difficulty. Our results are a proof of concept that, in a restricted domain, probabilistic logic networks can perform medical reasoning comparably to humans.
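
The κ values reported in the abstract are inter-rater reliability scores. The abstract does not say which variant of κ was used, so the following is only a minimal sketch, assuming the standard two-rater Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `cohen_kappa` and the toxidrome labels are hypothetical illustrations, not the study's code or data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters who labelled the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: fraction of cases where the two labels match.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal label frequencies,
    # summed over labels (a label missing from one rater contributes 0).
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels for six cases (NOT data from the study).
model_labels = ["opioid", "opioid", "cholinergic",
                "anticholinergic", "sympathomimetic", "opioid"]
human_labels = ["opioid", "opioid", "cholinergic",
                "sympathomimetic", "sympathomimetic", "cholinergic"]

kappa = cohen_kappa(model_labels, human_labels)
print(f"kappa = {kappa:.4f}")  # kappa = 0.5385
```

In this toy example the raters agree on 4 of 6 cases (p_o ≈ 0.667) while chance agreement is p_e ≈ 0.278, giving κ = 7/13 ≈ 0.54. Because κ discounts agreement that would occur by chance, it can be much lower than raw percent agreement.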