Background: Fetal echocardiography is widely available, but normative data are not robust. In this pilot study, the authors evaluated (1) the feasibility of prespecified measurements in a normal fetal echocardiogram to inform study design and (2) measurement variability to assign thresholds of clinical significance and guide analyses in larger fetal echocardiographic Z score initiatives.
Methods: Images from predefined gestational age groups (16-20, >20-24, >24-28, and >28-32 weeks) were retrospectively analyzed. Fetal echocardiography expert raters attended online group training and then independently analyzed 72 fetal studies (18 per age group) in a fully crossed design of 53 variables; each observer repeated measures for 12 fetuses. Kruskal-Wallis tests were used to compare measurements across centers and age groups. Coefficients of variation (CoVs) were calculated at the subject level for each measurement as the ratio of SD to mean. Intraclass correlation coefficients were used to quantify inter- and intrarater reliability. Cohen's d > 0.8 was used to define clinically important differences. Measurements were plotted against gestational age, biparietal diameter, and femur length.
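The two summary statistics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the example ratings, the use of the sample (n - 1) SD, and the pooled-SD form of Cohen's d are all assumptions made for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CoV for one subject: SD of the raters' values divided by their mean.

    Assumption: sample SD (n - 1 denominator); the abstract does not specify.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CoV is undefined")
    return stdev(values) / m


def cohens_d(group_a, group_b):
    """Cohen's d using a pooled SD (one common definition; assumed here)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical ratings of one measurement (mm) by three raters per fetus.
ratings = {
    "fetus_01": [5.0, 5.5, 6.0],
    "fetus_02": [7.0, 7.0, 7.0],
}
cov_by_subject = {fetus: coefficient_of_variation(v) for fetus, v in ratings.items()}

# Hypothetical comparison of the same measurement between two groups.
d = cohens_d([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
large_effect = abs(d) > 0.8  # threshold stated in the Methods
```

In this sketch a CoV is obtained per fetus and per measurement, which matches the subject-level calculation described above; how those values were then aggregated across fetuses is not stated in the abstract.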
Results: Expert raters completed each set of measurements in a mean of 23 ± 9 min/fetus. Missingness ranged from 0% to 29%. CoVs were similar across age groups for all variables (P > .05) except ductus arteriosus mean velocity and left ventricular ejection time, which were both higher at older gestational age. CoVs were >15% for right ventricular systolic and diastolic widths despite fair to good repeatability (intraclass correlation coefficient > 0.5); ductal velocities and two-dimensional measures, left ventricular short-axis dimensions, and isovolumic times all had high CoVs and high interobserver variability despite good to excellent intraobserver agreement (intraclass correlation coefficient > 0.6). CoVs did not improve when ratios (e.g., tricuspid/mitral annulus) were used instead of linear measurements. Overall, 27 variables had acceptable inter- and intraobserver repeatability, while 14 had excessive variability between readers despite good intraobserver agreement.
Conclusions: There is considerable variability in fetal echocardiographic quantification in clinical practice that may affect the design of multicenter fetal echocardiographic Z score studies, and not all measurements may be feasible for standard normalization. As missingness was substantial, a prospective design will be needed. Data from this pilot study may aid in the calculation of sample sizes and inform thresholds for distinguishing clinically significant from statistically significant effects.
Keywords: Congenital heart disease; Fetal echocardiography; Prenatal diagnosis; Standardization.
Copyright © 2023 American Society of Echocardiography. Published by Elsevier Inc. All rights reserved.