Abstract |
Selection bias arises when the effect of selecting variables or models is ignored in subsequent statistical analyses, that is, when “double dipping” into the data is not taken into account in assessing statistical evidence. Eighty years ago, the prominent statistician and mathematical economist Harold Hotelling drew attention to this issue. In recent years, there has been a concerted effort to address the problem, giving rise to the nascent field of post-selection inference. In this talk I will review several post-selection inference problems: large-scale case-control studies, canonical correlation analysis in high dimensions, and screening high-dimensional predictors of survival outcomes. |