Identifying Nonpatient Authors of Patient Portal Secure Messages in Oncology: A Proof-of-Concept Demonstration of Natural Language Processing Methods. Academic Article

Overview

abstract

  • PURPOSE: Patient portal secure messages are not always authored by the patient account holder. Understanding who authored the message is particularly important in an oncology setting where symptom reporting is crucial to patient treatment. Natural language processing has the potential to detect messages not authored by the patient automatically. METHODS: Patient portal secure messages from the Memorial Sloan Kettering Cancer Center were retrieved and manually annotated as a predicted unregistered proxy (ie, not written by the patient) or a presumed patient. After randomly splitting the annotated messages into training and test sets in a 70:30 ratio, a bag-of-words approach was used to extract features and then a Least Absolute Shrinkage and Selection Operator (LASSO) model was trained and used for classification. RESULTS: Portal secure messages (n = 2,000) were randomly selected from unique patient accounts and manually annotated. We excluded 335 messages from the data set as the annotators could not determine if they were written by a patient or proxy. Using the remaining 1,665 messages, a LASSO model was developed that achieved an area under the curve of 0.932 and an area under the precision-recall curve of 0.748. The sensitivity (the proportion of predicted unregistered proxy-authored messages correctly classified) and specificity (the proportion of presumed patient-authored messages correctly classified) were 0.681 and 0.960, respectively. CONCLUSION: Our work demonstrates the feasibility of using unstructured, heterogeneous patient portal secure messages to determine portal secure message authorship. Identifying patient authorship in real time can improve patient portal account security and can be used to improve the quality of the information extracted from the patient portal, such as patient-reported outcomes.
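
The preprocessing and evaluation steps named in the abstract (bag-of-words features, a random 70:30 split, and sensitivity and specificity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the whitespace tokenization, the fixed seed, and the toy labels are all assumptions made for illustration, and the LASSO model itself is not fitted here.

```python
import random
from collections import Counter


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens (assumed tokenization)."""
    return Counter(text.lower().split())


def split_70_30(items, seed=0):
    """Shuffle a copy of items and split it 70:30 into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = (7 * len(shuffled)) // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity).

    Label 1 = predicted unregistered proxy, label 0 = presumed patient.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical message text, invented for this example.
    print(bag_of_words("My mother has a fever and my mother needs a refill"))

    train, test = split_70_30(range(10))
    print(len(train), len(test))

    # Hypothetical labels and predictions, not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    print(sensitivity_specificity(y_true, y_pred))
```

In a fuller pipeline, the token counts would be assembled into a document-term matrix and passed to an L1-penalized classifier, and the metrics would be computed on the held-out 30% only.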

publication date

  • December 1, 2022

Research

keywords

  • Natural Language Processing
  • Patient Portals

Identity

PubMed Central ID

  • PMC10476725

Scopus Document Identifier

  • 85144597229

Digital Object Identifier (DOI)

  • 10.1200/CCI.22.00071

PubMed ID

  • 36542818

Additional Document Info

volume

  • 6