What statistical analysis of observational performance data can tell us and what it cannot: the case of Dutee Chand vs IAAF vs AFI

By Simon Franklin, Jonathan Ospina Betancurt @JonathanOspinaB and Silvia Camporesi @silviacamporesi

How can performance data resolve the arbitration of sensitive matters in the world of sports? In the absence of experimental data (i.e. clinical trials), researchers must build an argument based on associations in observational data. This data is often not widely available. Below, we’ll have a look at this using the Dutee Chand vs IAAF vs AFI case.

Bermon and Garnier (1) use correlations between free testosterone (fT) and athletic performance across 21 women’s athletic disciplines (events) to claim that women with high fT have a performance advantage in a subset of five of those events. However, the application of statistical techniques and the interpretation of results in such studies are neither neutral nor standardized. Researchers face many choices in the analysis of such data, and independent researchers may not reach the same conclusions from the same set of statistical associations.

Our reanalysis [forthcoming, Franklin et al BJSM 2018] of the available data presented by Bermon and Garnier suggests, at the very least, that further analysis is required to establish the claims made in the paper. The authors report that the advantage in athletic performance conferred by higher testosterone falls in the range of 1.8–4.5%. But this range is determined by presenting only the five largest (significant) coefficient estimates. At least some of these estimates are likely to be false discoveries, given that more than 20 tests were performed without any correction for multiple hypothesis testing.
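To illustrate why the number of tests matters, here is a minimal sketch of the family-wise error rate. It assumes a conventional significance level of 0.05 and fully independent tests, neither of which is stated in the original paper, and the function names are ours. It is an illustration of the general problem, not a reproduction of our reanalysis.

```python
def familywise_error_rate(n_tests: int, alpha: float) -> float:
    """Chance of at least one false positive among n_tests independent
    tests when every null hypothesis is in fact true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# 21 events, as in the women's data discussed above.
n_events = 21
# Assumption: a conventional 5% significance level.
alpha = 0.05

fwer = familywise_error_rate(n_events, alpha)
expected_false_positives = n_events * alpha
threshold = bonferroni_threshold(n_events, alpha)

print(f"Chance of at least one false positive: {fwer:.3f}")
print(f"Expected number of false positives:    {expected_false_positives:.2f}")
print(f"Bonferroni per-test threshold:         {threshold:.4f}")
```

Under these assumptions, the chance of at least one spurious "significant" result across 21 tests is roughly two in three, and about one false positive is expected even if testosterone had no effect in any event.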

Unfortunately, without publicly available raw data, it is not possible to perform all the desired robustness checks. In lieu of access to such data, we performed a Fisher’s combination test using the p-values calculated from the published data. With this test we are unable to reject the global null hypothesis that all of the individual null hypotheses are true; that is, the pattern of p-values is consistent with there being no advantage to high-fT women in any one of the events. In simpler terms: it is possible that the correlations presented in the paper (even the largest ones) occurred simply by chance. A reader of the main results should therefore reasonably conclude that the effect of high testosterone may be considerably lower than concluded in the original paper.
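For readers unfamiliar with the method, the following is a minimal sketch of a Fisher's combination test using only the Python standard library. The p-values in the example are hypothetical placeholders, not the values from the published data, and the function name is ours. The test assumes the individual p-values are independent. It relies on the fact that, under the global null, the statistic -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom, whose upper tail has a closed form when the degrees of freedom are even.

```python
import math


def fisher_combined_test(p_values):
    """Fisher's combination test for k independent p-values.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i)
    and combined_p is the upper-tail probability of a chi-square
    distribution with 2k degrees of freedom.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    statistic = -2.0 * sum(math.log(p) for p in p_values)

    # For even degrees of freedom 2k, the chi-square survival function is
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    half = statistic / 2.0
    term = 1.0   # the j = 0 term
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical p-values for illustration only (NOT the published ones).
example_p_values = [0.03, 0.20, 0.45, 0.61, 0.08, 0.90, 0.33]
statistic, combined_p = fisher_combined_test(example_p_values)
print(f"Fisher statistic: {statistic:.3f}")
print(f"Combined p-value: {combined_p:.4f}")
```

A large combined p-value means the global null cannot be rejected: the set of individual p-values, taken together, looks like what chance alone could produce.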

In light of our reanalysis, we argue first that the raw data used in studies with direct implications for real-world outcomes should be made publicly available for other researchers to analyse. Second, we conclude that estimated correlations should be interpreted with great caution, and that their interpretation should be referred to independent statisticians.

While we do not claim to play the role of such an independent statistical arbitrator in this case (especially since we have not had access to the raw data), our statistical analysis already allows one to conclude that, without further analysis, the article by Bermon and Garnier (2017) does not meet the standard of proof set by the Court of Arbitration for Sport (CAS). In this scenario, independent analysis is required.

Simon Franklin is a Postdoctoral Research Economist at the London School of Economics and received his PhD in Economics from the University of Oxford. His primary research interests include urban labour markets and housing policy, though he has a keen interest in elite athletic performance, and methodological questions related to statistical inference more generally. 

Jonathan Ospina Betancurt (@JonathanOspinaB) is a lecturer in Sport Science & Physical Activity at Universidad Isabel I, Spain. His most recent research examines sex differences in elite performance and hyperandrogenic athletes. His fields of research are DSD and transgender athletes, and ethics and values in sport. He is a JHSE Associate Editor.

Dr Silvia Camporesi (@silviacamporesi) is a lecturer in Bioethics & Society at King’s College London, where she directs the MSc in Bioethics & Society. Her latest book, “Bioethics, Genetics and Sport”, co-authored with Mike McNamee, is forthcoming from Routledge in March 2018.


  1. Bermon S, Garnier P. Serum androgen levels and their relation to performance in track and field: mass spectrometry results from 2127 observations in male and female elite athletes. 
