Current statistical inference methods for task-fMRI suffer from two fundamental limitations. First, the focus is solely on detection of non-zero signal or signal change, a problem that is exacerbated for large scale studies (e.g. UK Biobank, N=40,000+) where the 'null hypothesis fallacy' causes even trivial effects to be determined as significant. Second, for any sample size, widely used cluster inference methods only indicate regions where a null hypothesis can be rejected, without providing any notion of spatial uncertainty about the activation. In this work, we address these issues by developing spatial Confidence Sets (CSs) on clusters found in thresholded Cohen's d effect size images. We produce an upper and lower CS to make confidence statements about brain regions where Cohen's d effect sizes have exceeded and fallen short of a non-zero threshold, respectively. The CSs convey information about the magnitude and reliability of effect sizes that is usually given separately in a t-statistic and effect estimate map. We expand the theory developed in our previous work on CSs for %BOLD change effect maps (Bowring et al., 2019) using recent results from the bootstrapping literature. By assessing the empirical coverage with 2D and 3D Monte Carlo simulations resembling fMRI data, we find our method is accurate in sample sizes as low as N=60. We compute Cohen's d CSs for the Human Connectome Project working memory task-fMRI data, illustrating the brain regions with a reliable Cohen's d response for a given threshold. By comparing the CSs with results obtained from a traditional statistical voxelwise inference, we highlight the improvement in activation localization that can be gained with the Confidence Sets.