Background: The lack of machine-interpretable representations of consent permissions precludes development of tools that act upon permissions across information ecosystems, at scale.
Objectives: To report the process, results, and lessons learned from annotating permissions in clinical consent forms.
Methods: We conducted a retrospective analysis of clinical consent forms. We developed an annotation scheme following the MAMA (Model-Annotate-Model-Annotate) cycle and evaluated interannotator agreement (IAA) using observed agreement (A_o), weighted kappa (κ_w), and Krippendorff's α.
Results: The final dataset included 6,399 sentences from 134 clinical consent forms. All three annotators agreed completely on 5,871 sentences: 211 identified as permission-sentences and 5,660 identified as not permission-sentences (A_o = 0.944, Krippendorff's α = 0.599). These values reflect moderate to substantial IAA. Although permission-sentences share a set of common words and structure, disagreements among annotators are largely explained by lexical variability and ambiguity in sentence meaning.
Conclusion: Our findings point to the complexity of identifying permission-sentences within clinical consent forms. We present our results in light of lessons learned, which may serve as a launching point for developing tools for automated permission extraction.
Thieme. All rights reserved.
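For readers unfamiliar with the agreement measures named in the Methods, the following is a minimal sketch of how observed agreement and Krippendorff's α can be computed for binary permission-sentence labels from several annotators. It is not the authors' code. It assumes that observed agreement is taken as the mean pairwise agreement, that labels are nominal (1 = permission-sentence, 0 = not), and that no labels are missing; the function names and the toy labels are hypothetical. Weighted kappa is omitted.

```python
from collections import Counter
from itertools import combinations, permutations


def mean_pairwise_agreement(labels):
    """Fraction of annotator pairs that agree, averaged over all sentences.

    labels: list of tuples, one tuple per sentence, one label per annotator.
    Assumption: observed agreement is defined as mean pairwise agreement.
    """
    n_annotators = len(labels[0])
    pairs = list(combinations(range(n_annotators), 2))
    agreeing = sum(row[i] == row[j] for row in labels for i, j in pairs)
    return agreeing / (len(labels) * len(pairs))


def krippendorff_alpha_nominal(labels):
    """Krippendorff's alpha for nominal labels with no missing values."""
    # Coincidence matrix: each ordered pair of labels within a sentence
    # contributes 1 / (m - 1), where m is the number of labels on that sentence.
    coincidences = Counter()
    for row in labels:
        m = len(row)
        if m < 2:
            continue  # a sentence with a single label carries no pair information
        for i, j in permutations(range(m), 2):
            coincidences[(row[i], row[j])] += 1 / (m - 1)

    # Marginal totals per category and the overall number of pairable labels.
    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    # Both terms omit the common factor 1/n, which cancels in the ratio.
    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n - 1)
    if expected == 0:
        return 1.0  # only one category was ever used; treated here as full agreement
    return 1 - observed / expected


# Hypothetical toy labels for six sentences and three annotators.
toy_labels = [(1, 1, 1), (0, 0, 0), (0, 0, 0), (1, 0, 0), (0, 0, 0), (1, 1, 0)]
print(mean_pairwise_agreement(toy_labels))
print(krippendorff_alpha_nominal(toy_labels))
```

Because α corrects for the agreement expected by chance, a dataset dominated by one class (here, sentences that are not permission-sentences) can show high observed agreement alongside a noticeably lower α.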