Embracing the Human Family: Consumer Genomics is the Future of Genomics Research
We’re all in this together and we are all a human family.
23andMe recently published its massive study on how genetics affects people's susceptibility and response to COVID-19. This study identified many genetic markers associated with COVID-19 severity. In other words, the aggregate effect of all these genetic markers has a lot to do with how severe your COVID-19 symptoms might be if you were infected.
What makes this study special is not that it's about COVID-19 or even that it involved genetics. What makes this study truly of note is the size of its research sample and the speed at which it was collected. When it comes to genetics, as with a few other things, size really does matter.
23andMe has over 10 million customers – meaning over 10 million partially sequenced genomes. About 80% of those customers participate in its research program, where they occasionally answer online surveys. The sheer size of 23andMe's customer base meant its researchers could recruit participants in the same geographic areas where the COVID-19 outbreak was spreading across the country.
23andMe recruited 1.05 million people for this study in only 4 months.
Let's put that in the context of some of the major sources of genomic data used in genomic research:
UK Biobank: 500,000 participants from the UK
1000 Genomes Project: 2,504 individuals from 26 populations
Academia is too slow to get it done. Anyone with any experience in academic research knows that collecting data – especially data involving human beings – requires tons of paperwork, approval committees, and red tape. And that's before you actually begin recruiting people. There are good and bad reasons for this slow approval, but the result is that getting enough people for a study is a huge undertaking. There's a reason that so many studies are based on easy-to-access research participants like college psychology students and doctors' patient pools. It's no surprise – and indeed it is often expected – that data collection alone might take months or years to complete. Recruiting 1.05 million people for a study in 4 months during a pandemic is as astounding as it is unheard of.
In genomics research, bigger sample sizes lead to more and better findings on the connections between our DNA and a variety of human conditions and traits. You can see in the plot below that as sample sizes have increased over the years, we have gotten more and more studies with more and more interesting discoveries.
Why does sample size matter? Because genomics researchers use an analysis approach called a genome-wide association study, or GWAS for short. In this kind of study, the trait of interest – for example, COVID-19 severity, obesity, or height – is not associated with a single gene (such as a "fat" gene or "genius" gene). Instead, lots and lots of genetic markers each contribute small additive effects to the overall trait of interest. When sample sizes get bigger, it becomes easier to detect these small effects. This insight – that conditions or traits may be the sum of many small effects of many genes – combined with increased sample sizes due to the decreased cost of DNA sequencing, is why we see more genetics studies in the news. Researchers can make breakthroughs that were impossible in the past.
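To make the sample-size point concrete, here is a rough back-of-the-envelope sketch. It is not how 23andMe or any other group ran its analysis. It estimates the chance of detecting a single genetic marker at the sample sizes mentioned above. Two things are assumptions for illustration only: the marker is taken to explain 0.01% of the variation in a trait, and the test statistic is treated as approximately normal. The significance cutoff of 5e-8 is the conventional genome-wide threshold used in GWAS. The function name gwas_power is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

# Conventional genome-wide significance threshold used in GWAS.
GENOME_WIDE_ALPHA = 5e-8


def gwas_power(n, variance_explained, alpha=GENOME_WIDE_ALPHA):
    """Approximate power to detect one marker in a sample of n people.

    Assumes (for illustration) that the test statistic is roughly normal
    with mean sqrt(n * variance_explained) and standard deviation 1.
    """
    std_normal = NormalDist()
    # Two-sided critical value at the chosen significance level.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic: grows with the square root of n.
    signal = sqrt(n * variance_explained)
    # Probability that the statistic lands beyond either critical value.
    upper_tail = 1 - std_normal.cdf(z_crit - signal)
    lower_tail = std_normal.cdf(-z_crit - signal)
    return upper_tail + lower_tail


if __name__ == "__main__":
    # Hypothetical marker explaining 0.01% of trait variation.
    effect = 0.0001
    # Sample sizes quoted in this article.
    cohorts = {
        "1000 Genomes Project": 2_504,
        "UK Biobank": 500_000,
        "23andMe COVID-19 study": 1_050_000,
    }
    for name, n in cohorts.items():
        print(f"{name:>24}: n = {n:>9,}  power = {gwas_power(n, effect):.6f}")
```

Under these assumptions, a marker that is practically invisible in a sample of a few thousand people becomes very likely to be found once the sample reaches the hundreds of thousands. Real studies involve many more complications, so treat the numbers as a feel for the trend rather than a prediction.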
The largest consumer genomic databases are owned by AncestryDNA, 23andMe, and MyHeritage. The largest by far is AncestryDNA, which contains over 15 million genomes, followed by 23andMe at over 10 million. DNAGeek tracks the database sizes of the major genealogical testing companies and shows the increasing size of genomic databases over the last decade:
As of mid-2019, that's about 30 million genomes – it's surely bigger now. The example of 23andMe's COVID-19 study shows what’s possible. We know that many diseases have a genetic component to susceptibility and response, such as sickle cell anemia, HIV, and now COVID-19. But what about other diseases? What about those lucky people who never seem to get sick – are there genetic markers for that kind of strength? And why limit ourselves to disease? Are there genetic factors for people who seem to roll with the punches, whether it be endless lockdown or workplace stress?
The point of research should not be to generate publications for well-credentialed researchers or help Big Pharma make more drugs – beneficial as those may be. The point is to make life better. The 23andMe study showed that people who carry certain genetic markers are more susceptible to severe COVID-19 symptoms. While researchers are trying to understand the mechanisms of those markers, you and I have more pressing concerns: if you have these genetic markers, how would you change your behavior? Do you focus on reducing your non-genetic risk, such as losing those extra 20 pounds? Do you find a job that lets you work from home indefinitely?
When we released Covid Forecaster back in December 2020, our algorithm told our users whether they had any of the genetic markers known at the time to be associated with severe COVID-19 symptoms. We have been humbled by some of the feedback that we've received about our app. People have told us that they made sure to take extra safety precautions. Some have used their genetics as a factor in their decision to get one of the COVID-19 vaccines.
But what if we take things one step further? What if you could reach out to 50,000 of your closest genetic friends and ask what they would do, are already doing, or have done to mitigate their genetic risks or play up their genetic strengths? As researchers investigate the inner workings of DNA, imagine a world where we can complement or extend that research by crowdsourcing knowledge from the very people living with those genomes. Now that would be human research.