Objective: Survival analysis is the cornerstone of many healthcare applications in which the "survival" probability (eg, time free from a certain disease, time to death) of a group of patients is computed to guide clinical decisions. It is widely used in biomedical research and healthcare applications. However, frequent sharing of exact survival curves may reveal information about the individual patients, as an adversary may infer the presence of a person of interest as a participant of a study or of a particular group. Therefore, it is imperative to develop methods to protect patient privacy in survival analysis.
Materials and methods: We develop a framework based on the formal model of differential privacy, which provides provable privacy protection against a knowledgeable adversary. We show the performance of privacy-protecting solutions for the widely used Kaplan-Meier nonparametric survival model.
Results: We empirically evaluated the usefulness of our privacy-protecting framework and the reduced privacy risk for a popular epidemiology dataset and a synthetic dataset. Results show that our methods significantly reduce the privacy risk when compared with their nonprivate counterparts, while retaining the utility of the survival curves.
Discussion: The proposed framework demonstrates the feasibility of conducting privacy-protecting survival analyses. We discuss future research directions to further enhance the usefulness of our proposed solutions in biomedical research applications.
Conclusion: The results suggest that our proposed privacy-protection methods provide strong privacy protections while preserving the usefulness of survival analyses.
Keywords: Kaplan-Meier; actuarial; data privacy; data sharing; survival analysis.
© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.