Inter-method reliability of brainstem volume segmentation algorithms in preschoolers with ASD

Bosco, Paolo and Giuliano, Alessia and Delafield-Butt, Jonathan and Muratori, Filippo and Calderoni, Sara and Retico, Alessandra (2017) Inter-method reliability of brainstem volume segmentation algorithms in preschoolers with ASD. In: Organization for Human Brain Mapping (OHBM) Annual Meeting 2017, 2017-06-25 - 2017-06-29, Vancouver Convention Centre. (In Press)

Accepted Author Manuscript



Introduction: The brainstem has a potential role in the pathophysiology of Autism Spectrum Disorders (ASD) (Roger, 2013). In particular, alterations in brainstem volume and their relationship with sensory/motor abnormalities have been suggested (Trevarthen & Delafield-Butt, 2013). However, findings on volume alterations in subjects with ASD relative to matched controls are controversial in both adult and child cohorts (Hardan, 2001; Piven, 1992; Kleiman, 1992). Moreover, how much the choice of automated method contributes to the variability of brainstem volume measurements is still unclear.

Methods: T1-weighted MRI brain scans of a cohort of 80 preschoolers (20 male controls, 20 male subjects with ASD, 20 female controls, 20 female subjects with ASD; mean age of controls 49 months, std 12 months; mean age of ASD subjects 49 months, std 14 months) were processed with three different automated methods to measure the brainstem volume: Freesurfer 5.3 (Fischl, 2002), FSL-FIRST (Patenaude, 2011) and ANTs (Avants, 2011). For each method, an analysis of variance was then carried out, taking into account gender and total brain volume, to investigate potential brainstem volume differences between controls and ASD subjects. A direct comparison of the brainstem volume assessments in native space was then performed to assess inter-method reliability (correlation was measured with the Pearson coefficient), and Dice similarity indexes were calculated to evaluate the segmentation agreement across methods.

Results: The brainstem volume measurements are reported in scatter plots in Fig. 1 to show the agreement in terms of volume (in mm³) between different methods. The color represents the Dice similarity index (range 0-1, where 1 means total agreement) of the brainstem segmentations obtained by the methods under investigation. Fig. 2 shows four examples of brainstem segmentations obtained with the different methods, in sagittal view (segmentations are shown in red, green and blue for Freesurfer, FSL-FIRST and ANTs, respectively). The Pearson correlation coefficient between the FSL-FIRST and Freesurfer brainstem volume assessments was 0.27 (p-value=0.02). It was 0.51 (p-value<0.001) between FSL-FIRST and ANTs, and 0.87 (p-value<0.001) between ANTs and Freesurfer. The Dice similarity index was 0.68 (std 0.12) between the ANTs and FSL-FIRST segmentations, 0.81 (std 0.13) between ANTs and Freesurfer, and 0.60 (std 0.17) between FSL-FIRST and Freesurfer. The analysis of variance found no significant difference in brainstem volume between controls and ASD subjects for any of the three segmentation methods (p-value>0.05).

Conclusions: The inter-method reliability of automated algorithms for brainstem volume assessment is limited (the mean Dice similarity index barely reaches 0.8 in just one out of 3 comparisons). Inconsistencies across previous studies on the brainstem and, more generally, the lack of evidence for brain biomarkers in ASD may in part result from this poor agreement in the extraction of structural features with different methods. Inter-method brainstem volume differences can be attributed to varying definitions of the brainstem structure, the use of different templates (e.g. in our study only ANTs processed the brain scans using an age-specific brain template) and the varying effects of imaging artifacts and acquisition settings. This study suggests that research on brain structure alterations should cross-validate findings across multiple methods before providing biological interpretations.
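
As an illustration of the two agreement measures used above, the following minimal Python sketch computes a Dice similarity index between two binary segmentation masks and a Pearson coefficient between two sets of volume measurements. It is not the authors' pipeline: the abstract does not state which software was used for these calculations, and the function names (`dice_index`, `volume_mm3`, `pearson_r`), the toy masks and the 1 mm isotropic voxel size are all assumptions made for the example. It also assumes both masks have already been resampled onto the same voxel grid.

```python
import numpy as np


def dice_index(mask_a, mask_b):
    """Dice similarity index of two binary masks (0 = no overlap, 1 = total agreement)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("masks must be defined on the same voxel grid")
    total = a.sum() + b.sum()
    if total == 0:
        # Convention chosen for this sketch: two empty masks are in full agreement.
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)


def volume_mm3(mask, voxel_size_mm):
    """Volume of a binary mask: number of labelled voxels times the voxel volume."""
    return float(np.asarray(mask, dtype=bool).sum() * np.prod(voxel_size_mm))


def pearson_r(x, y):
    """Pearson correlation coefficient between two sequences of measurements."""
    return float(np.corrcoef(x, y)[0, 1])


# Toy data (hypothetical): two overlapping box-shaped "segmentations".
mask_a = np.zeros((4, 4, 4), dtype=bool)
mask_a[1:3, 1:3, 1:3] = True  # 8 voxels
mask_b = np.zeros((4, 4, 4), dtype=bool)
mask_b[1:3, 1:3, 1:4] = True  # 12 voxels, fully containing mask_a

dice = dice_index(mask_a, mask_b)              # 2 * 8 / (8 + 12) = 0.8
vol_a = volume_mm3(mask_a, (1.0, 1.0, 1.0))    # 8 voxels of 1 mm³ each
r = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])  # perfectly linear toy volumes
```

In the toy example one mask is entirely contained in the other, yet the Dice index is only 0.8. Note also that the two measures capture different things: volumes from two methods can correlate well even if the segmented regions overlap poorly, which is one reason to report both.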