Simulants: Synthetic Clinical Trial Data via Subject-Level Privacy-Preserving Synthesis. Academic Article

Overview

abstract

  • Clinical trials capture high-quality data for millions of patients each year, yet these data are largely unavailable for research beyond the scope of any individual trial because of regulatory, intellectual property, and patient privacy barriers. Synthetic clinical trial data that capture the analytical properties of the source data could provide significant value for research and drug development by making insights widely available while protecting the privacy of the participants. We present a method, "Simulants," for generating research-grade synthetic clinical trial data from a real data source. We compared the fidelity and privacy preservation performance of Simulants to that of state-of-the-art deep learning synthesizers and found that Simulants performed better on clinical trial data, as assessed both by established metrics and by critical clinical features. We also demonstrate how Simulants' privacy settings may be configured to conform to specific privacy policies governing data sharing.

publication date

  • April 29, 2023

Research

keywords

  • Confidentiality
  • Privacy

Identity

PubMed Central ID

  • PMC10148292

PubMed ID

  • 37128411

Additional Document Info

volume

  • 2022