Abstract

Scientific progress depends on the ability to compare competing theories and thereby build consensus among existing hypotheses or generate new ones. We argue that data aggregation (i.e., combining data across studies or research groups) is an important tool in this process for neuroscience. A key prerequisite is the ability to directly compare fMRI results across studies. In this paper, we discuss how an observed effect size in an fMRI data analysis can be transformed into a standardized effect size, and we demonstrate how standardized effect sizes enable direct comparison and data aggregation across studies. We also discuss how key parameters in the design of an fMRI experiment, such as the number of scans and the sample size, influence the statistical properties of standardized effect sizes. In the second part of the paper, we give an overview of two approaches to aggregating fMRI results across studies. The first extends the two-level general linear model typically used in individual fMRI studies with a third level. This approach requires the group-level parameter estimates from each study, together with their estimated variances and meta-data. Unfortunately, there is a risk of unit mismatches when the primary studies use different scales to measure the BOLD response. To circumvent this problem, the second approach aggregates (unitless) standardized effect sizes, which can be derived from summary statistics. We discuss a general model to aggregate these and different approaches to dealing with between-study heterogeneity. Finally, we hope to further promote the use of standardized effect sizes in fMRI research.
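
As a minimal illustration of aggregating standardized effect sizes derived from summary statistics, the sketch below converts group-level t-statistics into Hedges' g and pools them with a random-effects model. It rests on several assumptions that are not taken from the paper: each study is assumed to report a one-sample group-level t-test, the small-sample correction and the variance of g use common large-sample approximations, and between-study heterogeneity is estimated with the DerSimonian-Laird moment estimator, which is only one of several possible choices. The function names and the input values are hypothetical and serve illustration only.

```python
import math


def hedges_g_one_sample(t, n):
    """Return (g, var_g) for a one-sample group-level t-statistic with n subjects.

    Assumption: one-sample design; the correction factor and the variance
    are common approximations, not exact results.
    """
    df = n - 1
    d = t / math.sqrt(n)               # Cohen's d for a one-sample t-test
    j = 1.0 - 3.0 / (4.0 * df - 1.0)   # approximate small-sample correction
    g = j * d                          # Hedges' g
    var_g = j ** 2 * (1.0 / n + d ** 2 / (2.0 * n))  # approximate variance
    return g, var_g


def random_effects(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    Returns (pooled effect, standard error, between-study variance tau2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]   # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * gi for wi, gi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Moment estimate of between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * gi for wi, gi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical (t-statistic, sample size) pairs for the same voxel or region
# in three studies; these numbers are made up for illustration.
studies = [(3.2, 20), (2.1, 16), (4.5, 30)]
converted = [hedges_g_one_sample(t, n) for t, n in studies]
gs = [g for g, _ in converted]
vs = [v for _, v in converted]
pooled, se, tau2 = random_effects(gs, vs)
print(f"pooled g = {pooled:.3f}, SE = {se:.3f}, tau^2 = {tau2:.3f}")
```

Because g is unitless, the pooled estimate does not depend on the scale each primary study used to measure the BOLD response. When tau2 is estimated as zero, the random-effects weights reduce to the fixed-effect inverse-variance weights.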