Objectives: To perform a requirements analysis of the barriers to conducting research linking of primary care, genetic and cancer data.
Methods: We extended our initial data-centric approach to include socio-cultural and business requirements. We created reference models of core data requirements common to most studies using unified modelling language (UML), dataflow diagrams (DFD) and business process modelling notation (BPMN). We conducted a stakeholder analysis and constructed DFD and UML diagrams for use cases based on simulated research studies. We used research output as a sensitivity analysis.
Results: Differences between the reference model and use cases identified study specific data requirements. The stakeholder analysis identified: tensions, changes in specification, some indifference from data providers and enthusiastic informaticians urging inclusion of socio-cultural context. We identified requirements to collect information at three levels: micro- data items, which need to be semantically interoperable, meso- the medical record and data extraction, and macro- the health system and socio-cultural issues. BPMN clarified complex business requirements among data providers and vendors; and additional geographical requirements for patients to be represented in both linked datasets. High quality research output was the norm for most repositories.
Conclusions: Reference models provide high-level schemata of the core data requirements. However, business requirements' modelling identifies stakeholder issues and identifies what needs to be addressed to enable participation.