SNP chips perform poorly for detecting very rare genetic variants

We are both geneticists interested in finding and understanding diseases caused by rare genetic variants. Genetic testing for these rare diseases can have profound clinical impact. For example, women with a rare pathogenic variant in the BRCA1 or BRCA2 genes may be advised to undergo bilateral prophylactic mastectomy.

A few years ago, we heard anecdotes from clinicians about patients whose false positive results had been used to schedule unnecessary invasive medical procedures. False positive results from SNP chips for very rare genetic variants had also started to appear in the published literature. SNP chips are DNA microarrays originally designed to assay common genetic variants across the genome, meaning those present in more than 1 in 100 individuals. They have been used successfully in genome-wide association studies and consumer genomic testing for over a decade. However, they are increasingly being used to assay thousands of rare genetic variants, which are much harder to genotype accurately with SNP chip technology. The result is an increase in false positives, i.e. reported variants that are not actually present in the individual.

Although the challenges of reliably detecting rare variants using SNP chips were known within the genetics community, we were surprised to find that no systematic evaluation of the assay's performance had been published. Because SNP chips are such a widely used and high-performing assay for common genetic variants, it also surprised us that their poorer performance for rare variants was not well appreciated in the wider research or medical communities. Luckily, we had recently received both SNP chip and genome-wide DNA sequencing data on 50,000 individuals through the UK Biobank, a population cohort of adult volunteers from across the UK. This large dataset allowed us to systematically investigate the performance of SNP chips across millions of genetic variants with a wide range of frequencies, down to those present in fewer than 1 in 50,000 individuals.

As expected, we found that although the SNP chips performed very well for most (common) variants, they performed much worse for rare variants, with low positive predictive values of around 16% for very rare variants in UK Biobank. In other words, only about one in six of the very rare variants reported by the chip was actually present in the individual. Knowing that this result would likely vary with the design and manufacture of different SNP chips, as well as the quality of the sequencing data, we replicated our analysis in a small group of individuals who had shared their consumer genomics data online via the Personal Genome Project. We found that nearly every individual had at least one rare disease-causing variant that was falsely detected by the SNP chip. Our research suggests that a very rare, disease-causing variant detected using a SNP chip is more likely to be wrong than right. Although some consumer genomics companies perform sequencing to validate important results before releasing them to consumers, most consumers also download their “raw” SNP chip data for secondary analysis, and these raw data still contain the erroneous results.
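
To illustrate why rarity alone can drag the positive predictive value (PPV) down, here is a minimal sketch in Python. It applies Bayes' rule to a single assay with fixed error rates. The sensitivity and specificity values, and the function name, are hypothetical and chosen only for illustration; they are not taken from our study or from any particular SNP chip.

```python
def positive_predictive_value(carrier_frequency, sensitivity, specificity):
    """Fraction of positive calls that are true carriers (Bayes' rule)."""
    true_positives = sensitivity * carrier_frequency
    false_positives = (1.0 - specificity) * (1.0 - carrier_frequency)
    return true_positives / (true_positives + false_positives)


# Hypothetical error rates, assumed for illustration only (not study values).
ASSUMED_SENSITIVITY = 0.99
ASSUMED_SPECIFICITY = 0.9999

# The same assay applied to variants of decreasing frequency.
for one_in in (100, 1_000, 10_000, 50_000):
    ppv = positive_predictive_value(
        1 / one_in, ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY
    )
    print(f"variant carried by 1 in {one_in:>6,}: PPV = {ppv:.1%}")
```

Under these assumed error rates, the same assay that is right almost every time for a variant carried by 1 in 100 people drops below a PPV of one half once the variant is carried by about 1 in 10,000. The error rate of the assay has not changed; there are simply far fewer true carriers for the false positives to be compared against.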

The implications of our findings are very simple: SNP chips perform poorly for detecting very rare genetic variants and the results should not be used in clinical practice without validation.

Caroline Wright is Professor in Genomic Medicine, University of Exeter. Twitter: @carolinefwright

Michael Weedon is Associate Professor in Bioinformatics and Human Genetics, University of Exeter. Twitter: @mnweedon

Competing interests: see research paper.