Objective: Preprints have had a prominent role in the swift scientific response to COVID-19. Two years into the pandemic, we investigated how much preprints had contributed to timely data sharing by analyzing the lag time from preprint posting to journal publication.
Results: To estimate the median number of days between the date a manuscript was posted as a preprint and the date of its publication in a scientific journal, we analyzed preprints posted from January 1, 2020, to December 31, 2021, in the NIH iSearch COVID-19 Portfolio database and performed a Kaplan-Meier (KM) survival analysis using a non-mixture parametric cure model. Of the 39,243 preprints in our analysis, 7,712 (20%) were published in a journal, after a median lag of 178 days (95% CI: 175-181). Most of the published preprints were posted on the bioRxiv (29%) or medRxiv (65%) servers, which allow authors to choose a subject category when posting. Of the 20,698 preprints posted on these two servers, 7,358 (36%) were published within the study interval. Approximately half of those categorized as biochemistry, biophysics, or genomics were published, compared with 29% of those categorized as epidemiology and 26% of those categorized as bioinformatics.
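As a rough illustration of the Kaplan-Meier step only, the sketch below estimates a survival curve and median lag from right-censored data, where a preprint not yet published by the end of follow-up counts as censored. It is not the authors' code and does not implement the non-mixture cure model; the function names and the toy lag values are hypothetical.

```python
from typing import List, Optional, Tuple


def kaplan_meier(durations: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    durations: days from preprint posting to publication or end of follow-up.
    events: 1 if the preprint was published at that time, 0 if censored.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        published = 0  # events observed at time t
        leaving = 0    # all subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            published += data[i][1]
            leaving += 1
            i += 1
        if published > 0:
            # Product-limit update: S(t) = S(t-) * (1 - d / n)
            survival *= 1 - published / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def median_time(curve: List[Tuple[float, float]]) -> Optional[float]:
    """Smallest time at which the survival estimate is at or below 0.5."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # curve never reaches 0.5 (heavy censoring)


# Hypothetical toy data: lag in days; 0 marks a preprint still unpublished.
lags = [30, 60, 90, 120, 150, 200]
published_flags = [1, 1, 0, 1, 1, 0]
km_curve = kaplan_meier(lags, published_flags)
print(median_time(km_curve))
```

When most observations are censored, the estimated curve may never fall to 0.5, in which case `median_time` returns `None`.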
Keywords: COVID-19; Data availability; Journal publication; Preprint; Publication time.
© 2022. The Author(s).