Toward a connectivity gradient-based framework for reproducible biomarker discovery
Hong SJ., Xu T., Nikolaidis A., Smallwood J., Margulies DS., Bernhardt B., Vogelstein J., Milham MP.
Despite myriad demonstrations of feasibility, the high dimensionality of fMRI data remains a critical barrier to its utility for reproducible biomarker discovery. Recent efforts to address this challenge have capitalized on dimensionality reduction techniques applied to resting-state fMRI, identifying principal components of intrinsic connectivity that describe smooth transitions across different cortical systems, so-called “connectivity gradients”. These gradients recapitulate neurocognitively meaningful organizational principles that are present in both human and primate brains, and also appear to differ among individuals and clinical populations. Here, we provide a critical assessment of the suitability of connectivity gradients for biomarker discovery. Using the Human Connectome Project (discovery subsample, n = 209; two replication subsamples, n = 209 each) and the Midnight Scan Club (n = 9), we tested three key biomarker traits of functional gradients: reliability, reproducibility, and predictive validity. In doing so, we systematically assessed the effects of three analytical settings: i) dimensionality reduction algorithm (linear vs. non-linear methods), ii) input data type (raw time series, or thresholded or unthresholded functional connectivity), and iii) amount of data (resting-state fMRI time-series length). We found that the reproducibility of functional gradients across algorithms and subsamples is generally higher for gradients that explain more variance in whole-brain connectivity data, as well as for those with higher reliability. Notably, among the analytical settings, linear dimensionality reduction (principal component analysis in our study), more conservatively thresholded functional connectivity (e.g., 95–97%), and longer time-series data (≥20 min) were found to be the preferable conditions for obtaining higher reliability.
Gradients with higher reliability predicted unseen phenotypic scores with higher accuracy, highlighting reliability as a critical prerequisite for validity. Importantly, prediction accuracy with connectivity gradients exceeded that observed with more traditional edge-based connectivity measures, suggesting the added value of a low-dimensional, multivariate gradient approach. Finally, the present work highlights the importance and benefits of systematically exploring the parameter space of new imaging methods before widespread deployment.
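To make the analytical settings above concrete, the following is a minimal sketch of one way to derive gradients from a single time series: compute functional connectivity, threshold it, and apply principal component analysis. It is an illustration under stated assumptions, not the pipeline used in the study. The function name `connectivity_gradients`, its parameters, the row-wise percentile thresholding, and the use of Pearson correlation are all assumptions introduced here; the abstract specifies only the threshold range (e.g., 95–97%) and the choice of PCA.

```python
import numpy as np


def connectivity_gradients(ts, threshold_pct=95.0, n_components=3):
    """Hypothetical helper: PCA gradients from thresholded connectivity.

    ts            : array of shape (timepoints, regions)
    threshold_pct : row-wise percentile below which connections are zeroed
                    (assumed interpretation of a "95%" threshold)
    n_components  : number of gradients to return
    """
    # Functional connectivity as region-by-region Pearson correlation
    # (assumption: the abstract does not name the connectivity measure).
    fc = np.corrcoef(ts, rowvar=False)

    # Row-wise thresholding: for each region, keep only its strongest
    # connections. The diagonal (self-correlation of 1) is always kept.
    cutoff = np.percentile(fc, threshold_pct, axis=1, keepdims=True)
    fc_thr = np.where(fc >= cutoff, fc, 0.0)

    # PCA via SVD of the column-centered matrix.
    centered = fc_thr - fc_thr.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)

    # Component scores per region (the "gradients") and the fraction of
    # variance each component explains.
    gradients = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return gradients, explained[:n_components]


# Usage on synthetic data (random numbers stand in for real fMRI).
rng = np.random.default_rng(0)
synthetic_ts = rng.standard_normal((200, 50))
gradients, explained = connectivity_gradients(synthetic_ts)
```

Each column of `gradients` assigns one value per region, and `explained` gives the share of variance captured by each component, which is the quantity the abstract relates to reproducibility. Swapping the SVD step for a non-linear embedding, or skipping the thresholding step, would correspond to the other analytical settings compared in the study.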