Impact of natural selection on global patterns of genetic variation, and association with clinical phenotypes, at genes involved in SARS-CoV-2 infection.
Human genomic diversity has been shaped by both ancient and ongoing challenges from viruses. The current coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has had a devastating impact on population health. However, genetic diversity and evolutionary forces impacting host genes related to SARS-CoV-2 infection are not well understood. We investigated global patterns of genetic variation and signatures of natural selection at host genes relevant to SARS-CoV-2 infection (angiotensin converting enzyme 2 [ACE2], transmembrane protease serine 2 [TMPRSS2], dipeptidyl peptidase 4 [DPP4], and lymphocyte antigen 6 complex locus E [LY6E]). We analyzed data from 2,012 ethnically diverse Africans and 15,977 individuals of European and African ancestry with electronic health records and integrated with global data from the 1000 Genomes Project. At ACE2, we identified 41 nonsynonymous variants that were rare in most populations, several of which impact protein function. However, three nonsynonymous variants (rs138390800, rs147311723, and rs145437639) were common among central African hunter-gatherers from Cameroon (minor allele frequency 0.083 to 0.164) and are on haplotypes that exhibit signatures of positive selection. We identify signatures of selection impacting variation at regulatory regions influencing ACE2 expression in multiple African populations. At TMPRSS2, we identified 13 amino acid changes that are adaptive and specific to the human lineage compared with the chimpanzee genome. Genetic variants that are targets of natural selection are associated with clinical phenotypes common in patients with COVID-19. Our study provides insights into global variation at host genes related to SARS-CoV-2 infection, which have been shaped by natural selection in some populations, possibly due to prior viral infections.