The Genomic Landscape of the Peruvian Andes
WILEY. 2019: 176
View details for Web of Science ID 000458409602127
Association study in African-admixed populations across the Americas recapitulates asthma risk loci in non-African populations.
2019; 10 (1): 880
Asthma is a complex disease with striking disparities across racial and ethnic groups. Despite its relatively high burden, representation of individuals of African ancestry in asthma genome-wide association studies (GWAS) has been inadequate, and true associations in these underrepresented minority groups have been inconclusive. We report the results of a genome-wide meta-analysis from the Consortium on Asthma among African Ancestry Populations (CAAPA; 7009 asthma cases, 7645 controls). We find strong evidence for association at four previously reported asthma loci whose discovery was driven largely by non-African populations, including the chromosome 17q12-q21 locus and the chr12q13 region, a novel (and not previously replicated) asthma locus recently identified by the Trans-National Asthma Genetic Consortium (TAGC). An additional seven loci reported by TAGC show marginal evidence for association in CAAPA. We also identify two novel loci (8p23 and 8q24) that may be specific to asthma risk in African ancestry populations.
View details for PubMedID 30787307
Association study in African-admixed populations across the Americas recapitulates asthma risk loci in non-African populations
MOSBY-ELSEVIER. 2019: AB296
Multi-Ancestry Genome-Wide Association Study of Spontaneous Clearance of Hepatitis C Virus.
BACKGROUND & AIMS: Spontaneous clearance of hepatitis C virus (HCV) occurs in approximately 30% of infected persons and less often in populations of African ancestry. Variants in major histocompatibility complex (MHC) and in interferon lambda genes are associated with spontaneous HCV clearance but there have been few studies of these variants in persons of African ancestry. We performed a dense multi-ancestry genome-wide association study of spontaneous clearance of HCV, focusing on individuals of African ancestry. METHODS: We performed genotype analyses of 4423 people from 3 ancestry groups: 2201 persons of African ancestry (445 with HCV clearance and 1756 with HCV persistence), 1739 persons of European ancestry (701 with HCV clearance and 1036 with HCV persistence), and 486 multi-ancestry Hispanic persons (173 with HCV clearance and 313 with HCV persistence). Samples were genotyped using Illumina arrays and statistically imputed to 1000 Genomes Project. For each ancestry group, the association of single nucleotide polymorphisms with HCV clearance was tested by log-additive analysis and then a meta-analysis was performed. RESULTS: In the meta-analysis, significant associations with HCV clearance were confirmed at the interferon lambda gene locus IFNL4-IFNL3 (19q13.2; P = 5.99 × 10^-50) and the MHC locus 6p21.32 (P = 1.15 × 10^-21). We also associated HCV clearance with polymorphisms in the G-protein-coupled receptor 158 gene (GPR158) at 10p12.1 (P = 1.80 × 10^-07). These 3 loci had independent, additive effects on HCV clearance, and account for 6.8% and 5.9% of the variance of HCV clearance in persons of European and African ancestry, respectively. Persons of African or European ancestry carrying all 6 variants were 24-fold and 11-fold, respectively, more likely to clear HCV infection compared to individuals carrying none or 1 of the clearance-associated variants. CONCLUSIONS: In a meta-analysis of data from 3 studies, we found variants in MHC genes, IFNL4-IFNL3, and GPR158 to increase odds of HCV clearance in patients of European and African ancestry. These findings could increase our understanding of immune response to and clearance of HCV infection.
View details for PubMedID 30593799
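The abstract above says per-ancestry association results were combined in a meta-analysis but does not name the method. As an illustration only, the sketch below shows one common choice, an inverse-variance-weighted fixed-effect meta-analysis of per-group log odds ratios. The function name and the input values are hypothetical and are not taken from the study.

```python
import math

def fixed_effect_meta(betas, ses):
    """Combine per-group log odds ratios with inverse-variance weights.

    Assumption: a fixed-effect model; the study may have used another method.
    """
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

if __name__ == "__main__":
    # Hypothetical per-ancestry estimates (log odds ratio, standard error).
    beta, se, z, p = fixed_effect_meta([0.42, 0.55, 0.30], [0.10, 0.08, 0.20])
    print(f"pooled OR = {math.exp(beta):.2f}, z = {z:.2f}, p = {p:.2e}")
```

Groups with smaller standard errors receive more weight, so a large, precisely estimated group dominates the pooled estimate.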
Standardized biogeographic grouping system for annotating populations in pharmacogenetic research.
Clinical pharmacology and therapeutics
The varying frequencies of pharmacogenetic alleles between populations have important implications for the impact of these alleles in different populations. Current population grouping methods to communicate these patterns are insufficient as they are inconsistent and fail to reflect the global distribution of genetic variability. To facilitate and standardize the reporting of variability in pharmacogenetic allele frequencies, we present seven geographically-defined groups: American, Central/South Asian, East Asian, European, Near Eastern, Oceanian, and Sub-Saharan African, and two admixed groups: African American/Afro-Caribbean and Latino. These nine groups are defined by global autosomal genetic structure and based on data from large-scale sequencing initiatives. We recognize that broadly grouping global populations is an oversimplification of human diversity and does not capture complex social and cultural identity. However, these groups meet a key need in pharmacogenetics research by enabling consistent communication of the scale of variability in global allele frequencies and are now used by PharmGKB.
View details for PubMedID 30506572
Imputation-Aware Tag SNP Selection To Improve Power for Large-Scale, Multi-ethnic Association Studies.
G3 (Bethesda, Md.)
The emergence of very large cohorts in genomic research has facilitated a focus on genotype-imputation strategies to power rare variant association. These strategies have benefited from improvements in imputation methods and association tests; however, little attention has been paid to ways in which array design can increase rare variant association power. Therefore, we developed a novel framework to select tag SNPs using the reference panel of 26 populations from Phase 3 of the 1000 Genomes Project. We evaluate tag SNP performance via mean imputed r^2 at untyped sites using leave-one-out internal validation and standard imputation methods, rather than pairwise linkage disequilibrium. Moving beyond pairwise metrics allows us to account for haplotype diversity across the genome to improve imputation accuracy and demonstrates population-specific biases from pairwise estimates. We also examine array design strategies that contrast multi-ethnic cohorts versus single populations, and show a boost in performance for the former can be obtained by prioritizing tag SNPs that contribute information across multiple populations simultaneously. Using our framework, we demonstrate increased imputation accuracy for rare variants (frequency<1%) by 0.5-3.1% for an array of one million sites and 0.7-7.1% for an array of 500,000 sites, depending on the population. Finally, we show how recent explosive growth in non-African populations means tag SNPs capture on average 30% fewer other variants than in African populations. The unified framework presented here will enable investigators to make informed decisions for the design of new arrays, and help empower the next phase of rare variant association for global health.
View details for PubMedID 30131328
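The "mean imputed r^2" metric mentioned in the abstract above can be illustrated with a small sketch. It assumes r^2 is the squared Pearson correlation between true genotypes (coded 0/1/2) and imputed dosages at each untyped site, averaged over sites. The function names and data are hypothetical, and the authors' pipeline may compute the metric differently.

```python
def squared_correlation(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        # Assumption: monomorphic sites are scored as 0 rather than skipped.
        return 0.0
    return cov * cov / (var_t * var_d)

def mean_imputed_r2(sites):
    """Average r^2 over a list of (true_genotypes, imputed_dosages) pairs."""
    return sum(squared_correlation(t, d) for t, d in sites) / len(sites)

if __name__ == "__main__":
    # Hypothetical data for two untyped sites in five individuals.
    sites = [
        ([0, 1, 2, 1, 0], [0.1, 0.9, 1.8, 1.2, 0.0]),
        ([0, 0, 1, 0, 2], [0.0, 0.3, 0.7, 0.1, 1.6]),
    ]
    print(round(mean_imputed_r2(sites), 3))
```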
Ancient genomes from North Africa evidence prehistoric migrations to the Maghreb from both the Levant and Europe.
Proceedings of the National Academy of Sciences of the United States of America
2018; 115 (26): 6774-79
The extent to which prehistoric migrations of farmers influenced the genetic pool of western North Africans remains unclear. Archaeological evidence suggests that the Neolithization process may have happened through the adoption of innovations by local Epipaleolithic communities or by demic diffusion from the Eastern Mediterranean shores or Iberia. Here, we present an analysis of individuals' genome sequences from Early and Late Neolithic sites in Morocco and from Early Neolithic individuals from southern Iberia. We show that Early Neolithic Moroccans (5,000 BCE) are similar to Later Stone Age individuals from the same region and possess an endemic element retained in present-day Maghrebi populations, confirming a long-term genetic continuity in the region. This scenario is consistent with Early Neolithic traditions in North Africa deriving from Epipaleolithic communities that adopted certain agricultural techniques from neighboring populations. Among Eurasian ancient populations, Early Neolithic Moroccans are distantly related to Levantine Natufian hunter-gatherers (9,000 BCE) and Pre-Pottery Neolithic farmers (6,500 BCE). Late Neolithic (3,000 BCE) Moroccans, in contrast, share an Iberian component, supporting theories of trans-Gibraltar gene flow and indicating that Neolithization of North Africa involved both the movement of ideas and people. Lastly, the southern Iberian Early Neolithic samples share the same genetic composition as the Cardial Mediterranean Neolithic culture that reached Iberia 5,500 BCE. The cultural and genetic similarities between Iberian and North African Neolithic traditions further reinforce the model of an Iberian migration into the Maghreb.
View details for PubMedID 29895688
GENETIC LINKS BETWEEN SYMPTOMATIC ENTAMOEBA HISTOLYTICA INFECTION AND INFLAMMATORY BOWEL DISEASE
AMER SOC TROP MED & HYGIENE. 2017: 388
View details for Web of Science ID 000412851502759
Genetic identification of a common collagen disease in Puerto Ricans via identity-by-descent mapping in a health system
Achieving confidence in the causality of a disease locus is a complex task that often requires supporting data from both statistical genetics and clinical genomics. Here we describe a combined approach to identify and characterize a genetic disorder that leverages distantly related patients in a health system and population-scale mapping. We utilize genomic data to uncover components of distant pedigrees, in the absence of recorded pedigree information, in the multi-ethnic BioMe biobank in New York City. By linking to medical records, we discover a locus associated with both elevated genetic relatedness and extreme short stature. We link the gene, COL27A1, with a little-known genetic disease, previously thought to be rare and recessive. We demonstrate that disease manifests in both heterozygotes and homozygotes, indicating a common collagen disorder impacting up to 2% of individuals of Puerto Rican ancestry, leading to a better understanding of the continuum of complex and Mendelian disease.
View details for PubMedID 28895531
Identifying tagging SNPs for African specific genetic variation from the African Diaspora Genome
A primary goal of The Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) is to develop an 'African Diaspora Power Chip' (ADPC), a genotyping array consisting of tagging SNPs, useful in comprehensively identifying African specific genetic variation. This array is designed based on the novel variation identified in 642 CAAPA samples of African ancestry with high coverage whole genome sequence data (~30× depth). This novel variation extends the pattern of variation catalogued in the 1000 Genomes and Exome Sequencing Projects to a spectrum of populations representing the wide range of West African genomic diversity. These individuals from CAAPA also comprise a large swath of the African Diaspora population and incorporate historical genetic diversity covering nearly the entire Atlantic coast of the Americas. Here we show the results of designing and producing such a microchip array. This novel array covers African specific variation far better than other commercially available arrays, and will enable better GWAS analyses for researchers with individuals of African descent in their study populations. A recent study cataloging variation in continental African populations suggests this type of African-specific genotyping array is both necessary and valuable for facilitating large-scale GWAS in populations of African ancestry.
View details for DOI 10.1038/srep46398
View details for Web of Science ID 000399985900001
View details for PubMedID 28429804
Human Demographic History Impacts Genetic Risk Prediction across Diverse Populations
AMERICAN JOURNAL OF HUMAN GENETICS
2017; 100 (4): 635-649
The vast majority of genome-wide association studies (GWASs) are performed in Europeans, and their transferability to other populations is dependent on many factors (e.g., linkage disequilibrium, allele frequencies, genetic architecture). As medical genomics studies become increasingly large and diverse, gaining insights into population history and consequently the transferability of disease risk measurement is critical. Here, we disentangle recent population history in the widely used 1000 Genomes Project reference panel, with an emphasis on populations underrepresented in medical studies. To examine the transferability of single-ancestry GWASs, we used published summary statistics to calculate polygenic risk scores for eight well-studied phenotypes. We identify directional inconsistencies in all scores; for example, height is predicted to decrease with genetic distance from Europeans, despite robust anthropological evidence that West Africans are as tall as Europeans on average. To gain deeper quantitative insights into GWAS transferability, we developed a complex trait coalescent-based simulation framework considering effects of polygenicity, causal allele frequency divergence, and heritability. As expected, correlations between true and inferred risk are typically highest in the population from which summary statistics were derived. We demonstrate that scores inferred from European GWASs are biased by genetic drift in other populations even when choosing the same causal variants and that biases in any direction are possible and unpredictable. This work cautions that summarizing findings from large-scale GWASs may have limited portability to other populations using standard approaches and highlights the need for generalized risk prediction methods and the inclusion of more diverse individuals in medical genomics.
View details for DOI 10.1016/j.ajhg.2017.03.004
View details for Web of Science ID 000398389600006
View details for PubMedID 28366442
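The polygenic risk scores described in the abstract above are, in their simplest form, a weighted sum of allele dosages using GWAS effect sizes as weights. The sketch below illustrates that basic form only. The SNP identifiers, weights, and the handling of missing SNPs are assumptions for illustration, and the paper's actual scoring (for example variant selection or clumping) is not reproduced here.

```python
def polygenic_score(effect_sizes, dosages):
    """Weighted sum of effect-allele dosages.

    effect_sizes: dict mapping SNP id -> GWAS effect size (beta).
    dosages: dict mapping SNP id -> effect-allele dosage (0 to 2).
    Assumption: SNPs missing from `dosages` contribute 0.
    """
    return sum(beta * dosages.get(snp, 0.0)
               for snp, beta in effect_sizes.items())

if __name__ == "__main__":
    # Hypothetical summary statistics and one individual's dosages.
    weights = {"snp1": 0.20, "snp2": -0.10, "snp3": 0.05}
    person = {"snp1": 2, "snp2": 1}
    print(polygenic_score(weights, person))
```

Because the weights come from one population, differences in allele frequency and linkage disequilibrium elsewhere shift the score distribution, which is the portability problem the abstract describes.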
Strategies for Enriching Variant Coverage in Candidate Disease Loci on a Multiethnic Genotyping Array
2016; 11 (12)
Investigating genetic architecture of complex traits in ancestrally diverse populations is imperative to understand the etiology of disease. However, the current paucity of genetic research in people of African and Latin American ancestry, Hispanic and indigenous peoples in the United States is likely to exacerbate existing health disparities for many common diseases. The Population Architecture using Genomics and Epidemiology, Phase II (PAGE II), Study was initiated in 2013 by the National Human Genome Research Institute to expand our understanding of complex trait loci in ethnically diverse and well characterized study populations. To meet this goal, the Multi-Ethnic Genotyping Array (MEGA) was designed to substantially improve fine-mapping and functional discovery by increasing variant coverage across multiple ethnicities at known loci for metabolic, cardiovascular, renal, inflammatory, anthropometric, and a variety of lifestyle traits. Studying the frequency distribution of clinically relevant mutations, putative risk alleles, and known functional variants across multiple populations will provide important insight into the genetic architecture of complex diseases and facilitate the discovery of novel, sometimes population-specific, disease associations. DNA samples from 51,650 self-identified African ancestry (17,328), Hispanic/Latino (22,379), Asian/Pacific Islander (8,640), and American Indian (653) and an additional 2,650 participants of either South Asian or European ancestry, and other reference panels have been genotyped on MEGA by PAGE II. MEGA was designed as a new resource for studying ancestrally diverse populations. Here, we describe the methodology for selecting trait-specific content for use in multi-ethnic populations and how enriching MEGA for this content may contribute to deeper biological understanding of the genetic etiology of complex disease.
View details for DOI 10.1371/journal.pone.0167758
View details for Web of Science ID 000392754300044
View details for PubMedID 27973554
View details for PubMedCentralID PMC5156387
Enabling improved low frequency variant imputation in multi-ethnic studies
WILEY-BLACKWELL. 2015: 593
View details for Web of Science ID 000363340500177
A Multi-Ethnic Genotyping Array for the Next Generation of Association Studies
WILEY-BLACKWELL. 2015: 551-52
View details for Web of Science ID 000363340500062
Relative performance of gene- and pathway-level methods as secondary analyses for genome-wide association studies
Despite the success of genome-wide association studies (GWAS), there still remains "missing heritability" for many traits. One contributing factor may be the result of examining one marker at a time as opposed to a group of markers that are biologically meaningful in aggregate. To address this problem, a variety of gene- and pathway-level methods have been developed to identify putative biologically relevant associations. A simulation was conducted to systematically assess the performance of these methods. Using genetic data from 4,500 individuals in the Wellcome Trust Case Control Consortium (WTCCC), case-control status was simulated based on an additive polygenic model. We evaluated gene-level methods based on their sensitivity, specificity, and proportion of false positives. Pathway-level methods were evaluated on the relationship between proportion of causal genes within the pathway and the strength of association. The gene-level methods had low sensitivity (20-63%), high specificity (89-100%), and low proportion of false positives (0.1-6%). The gene-level program VEGAS using only the top 10% of associated single nucleotide polymorphisms (SNPs) within the gene had the highest sensitivity (28.6%) with less than 1% false positives. The performance of the pathway-level methods depended on their reliance upon asymptotic distributions or if significance was estimated in a competitive manner. The pathway-level programs GenGen, GSA-SNP and MAGENTA had the best performance while accounting for potential confounders. Novel genes and pathways can be identified using the gene and pathway-level methods. These methods may provide valuable insight into the "missing heritability" of traits and provide biological interpretations to GWAS findings.
View details for DOI 10.1186/s12863-015-0191-2
View details for Web of Science ID 000352474200001
View details for PubMedID 25887572
View details for PubMedCentralID PMC4391470
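The sensitivity and specificity figures in the abstract above can be illustrated with a small sketch that scores one gene-level method against a simulated truth set. The gene names are hypothetical. The abstract does not define its "proportion of false positives", so that metric is deliberately left out rather than guessed.

```python
def sensitivity_specificity(all_genes, causal_genes, called_genes):
    """Return (sensitivity, specificity) for one method's significant genes.

    all_genes: every gene tested; causal_genes: simulated true positives;
    called_genes: genes the method declared significant.
    """
    all_genes = set(all_genes)
    causal = set(causal_genes) & all_genes
    called = set(called_genes) & all_genes
    tp = len(causal & called)              # causal and called
    fn = len(causal - called)              # causal but missed
    fp = len(called - causal)              # called but not causal
    tn = len(all_genes - causal - called)  # neither causal nor called
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity

if __name__ == "__main__":
    # Hypothetical simulation: 10 genes, 4 causal, 3 called by the method.
    genes = [f"gene{i}" for i in range(10)]
    sens, spec = sensitivity_specificity(
        genes, ["gene0", "gene1", "gene2", "gene3"], ["gene0", "gene1", "gene4"])
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```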
Genome-wide association study of hepatitis C virus- and cryoglobulin-related vasculitis
GENES AND IMMUNITY
2014; 15 (7): 500-505
The host genetic basis of mixed cryoglobulin vasculitis is not well understood and has not been studied in large cohorts. A genome-wide association study was conducted among 356 hepatitis C virus (HCV) RNA-positive individuals with cryoglobulin-related vasculitis and 447 ethnically matched, HCV RNA-positive controls. All cases had both serum cryoglobulins and a vasculitis syndrome. A total of 899 641 markers from the Illumina HumanOmni1-Quad chip were analyzed using logistic regression adjusted for sex, as well as genetically determined ancestry. Replication of select single-nucleotide polymorphisms (SNPs) was conducted using 91 cases and 180 controls, adjusting for sex and country of origin. The most significant associations were identified on chromosome 6 near the NOTCH4 and MHC class II genes. A genome-wide significant association was detected on chromosome 6 at SNP rs9461776 (odds ratio=2.16, P=1.16E-07) between HLA-DRB1 and DQA1: this association was further replicated in additional independent samples (meta-analysis P = 7.1 × 10^-9). A genome-wide significant association with cryoglobulin-related vasculitis was identified with SNPs near NOTCH4 and MHC Class II genes. The two regions are correlated and it is difficult to disentangle which gene is responsible for the association with mixed cryoglobulinemia vasculitis in this extended major histocompatibility complex region.
View details for DOI 10.1038/gene.2014.41
View details for Web of Science ID 000343960500009
View details for PubMedID 25030430
Admixture analysis of spontaneous hepatitis C virus clearance in individuals of African descent.
Genes and immunity
2014; 15 (4): 241-246
Hepatitis C virus (HCV) infects an estimated 3% of the global population with the majority of individuals (75-85%) failing to clear the virus without treatment, leading to chronic liver disease. Individuals of African descent have lower rates of clearance compared with individuals of European descent and this is not fully explained by social and environmental factors. This suggests that differences in genetic background may contribute to this difference in clinical outcome following HCV infection. Using 473 individuals and 792,721 single-nucleotide polymorphisms (SNPs) from a genome-wide association study (GWAS), we estimated local African ancestry across the genome. Using admixture mapping and logistic regression, we identified two regions of interest associated with spontaneous clearance of HCV (15q24, 20p12). A genome-wide significant variant was identified on chromosome 15 at the imputed SNP, rs55817928 (P = 6.18 × 10^-8) between the genes SCAPER and RCN. Each additional copy of the African ancestral C allele is associated with 2.4 times the odds of spontaneous clearance. Conditional analysis using this SNP in the logistic regression model explained one-third of the local ancestry association. Additionally, signals of selection in this area suggest positive selection due to some ancestral pathogen or environmental pressure in African, but not in European populations.
View details for DOI 10.1038/gene.2014.11
View details for PubMedID 24622687
Variants in HAVCR1 Gene Region Contribute to Hepatitis C Persistence in African Americans
JOURNAL OF INFECTIOUS DISEASES
2014; 209 (3): 355-359
To confirm previously identified polymorphisms in HAVCR1 that were associated with persistent hepatitis C virus (HCV) infection in individuals of African and of European descent, we studied 165 subjects of African descent and 635 subjects of European descent. Because the association was only confirmed in subjects of African descent (rs6880859; odds ratio, 2.42; P = .01), we then used 379 subjects of African descent (142 with spontaneous HCV clearance) to fine-map HAVCR1. rs111511318 was strongly associated with HCV persistence after adjusting for IL28B and HLA (adjusted P = 8.8 × 10^-4), as was one 81-kb haplotype (adjusted P = .0006). The HAVCR1 genomic region is an independent genetic determinant of HCV persistence in individuals of African descent.
View details for DOI 10.1093/infdis/jit444
View details for Web of Science ID 000329921700009
View details for PubMedID 23964107
Genome-Wide Association Study of Spontaneous Resolution of Hepatitis C Virus Infection: Data From Multiple Cohorts
ANNALS OF INTERNAL MEDICINE
2013; 158 (4): 235-?
BACKGROUND: Hepatitis C virus (HCV) infections occur worldwide and either spontaneously resolve or persist and markedly increase the person's lifetime risk for cirrhosis and hepatocellular carcinoma. Although HCV persistence occurs more often in persons of African ancestry and persons with genetic variants near interleukin-28B (IL-28B), the genetic basis is not well-understood. OBJECTIVE: To evaluate the host genetic basis for spontaneous resolution of HCV infection. DESIGN: 2-stage, genome-wide association study. SETTING: 13 international multicenter study sites. PATIENTS: 919 persons with serum HCV antibodies but no HCV RNA (spontaneous resolution) and 1482 persons with serum HCV antibodies and HCV RNA (persistence). MEASUREMENTS: Frequencies of 792 721 single nucleotide polymorphisms (SNPs). RESULTS: Differences in allele frequencies between persons with spontaneous resolution and persistence were identified on chromosomes 19q13.13 and 6p21.32. On chromosome 19, allele frequency differences localized near IL-28B and included rs12979860 (overall per-allele OR, 0.45; P = 2.17 × 10^-30) and 10 additional SNPs spanning 55 000 base pairs. On chromosome 6, allele frequency differences localized near genes for HLA class II and included rs4273729 (overall per-allele OR, 0.59; P = 1.71 × 10^-16) near DQB1*03:01 and an additional 116 SNPs spanning 1 090 000 base pairs. The associations in chromosomes 19 and 6 were independent and additive and explain an estimated 14.9% (95% CI, 8.5% to 22.6%) and 15.8% (CI, 4.4% to 31.0%) of the variation in HCV resolution in persons of European and African ancestry, respectively. Replication of the chromosome 6 SNP, rs4272729, in an additional 745 persons confirmed the findings (P = 0.015). LIMITATION: Epigenetic effects were not studied. CONCLUSION: IL-28B and HLA class II are independently associated with spontaneous resolution of HCV infection, and SNPs marking IL-28B and DQB1*03:01 may explain approximately 15% of spontaneous resolution of HCV infection.
View details for Web of Science ID 000315580300014
View details for PubMedID 23420232
Polymorphisms in Toll-like receptor genes influence antibody responses to cytomegalovirus glycoprotein B vaccine.
BMC research notes
2012; 5: 140-?
Congenital cytomegalovirus (CMV) infection is an important medical problem that as yet has no solution. A clinical trial of CMV glycoprotein B (gB) vaccine in young women showed promising efficacy. Improved understanding of the basis for prevention of CMV infection is essential for developing improved vaccines. We genotyped 142 women previously vaccinated with three doses of CMV gB for single nucleotide polymorphisms (SNPs) in TLR 1-4, 6, 7, 9, and 10, and their associated intracellular signaling genes. SNPs in the platelet-derived growth factor receptor (PDGFRA) and integrins were also selected based on their role in binding gB. Specific SNPs in TLR7 and IKBKE (inhibitor of nuclear factor kappa-B kinase subunit epsilon) were associated with antibody responses to gB vaccine. Homozygous carriers of the minor allele at four SNPs in TLR7 showed higher vaccination-induced antibody responses to gB compared to heterozygotes or homozygotes for the common allele. SNP rs1953090 in IKBKE was associated with changes in antibody level from second to third dose of vaccine; homozygotes for the minor allele exhibited lower antibody responses while homozygotes for the major allele showed increased responses over time. These data contribute to our understanding of the immunogenetic mechanisms underlying variations in the immune response to CMV vaccine.
View details for DOI 10.1186/1756-0500-5-140
View details for PubMedID 22414065
Identification of functional genetic variation in exome sequence analysis.
2011; 5: S13-?
Recent technological advances have allowed us to study individual genomes at a base-pair resolution and have demonstrated that the average exome harbors more than 15,000 genetic variants. However, our ability to understand the biological significance of the identified variants and to connect these observed variants with phenotypes is limited. The first step in this process is to identify genetic variation that is likely to result in changes to protein structure and function, because detailed studies, either population based or functional, for each of the identified variants are not practicable. Therefore, algorithms that yield valid predictions of a variant's functional significance are needed. Over the past decade, several programs have been developed to predict the probability that an observed sequence variant will have a deleterious effect on protein function. These algorithms range from empirical programs that classify using known biochemical properties to statistical algorithms trained using a variety of data sources, including sequence conservation data, biochemical properties, and functional data. Using data from the pilot3 study of the 1000 Genomes Project available through Genetic Analysis Workshop 17, we compared the results of four programs (SIFT, PolyPhen, MAPP, and VarioWatch) used to predict the functional relevance of variants in 101 genes. Analysis was conducted without knowledge of the simulation model. Agreement between programs was modest, ranging from 59.4% to 71.4%, and only 3.5% of variants were classified as deleterious and 10.9% as tolerated across all four programs.
View details for DOI 10.1186/1753-6561-5-S9-S13
View details for PubMedID 22373437
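The between-program agreement reported in the abstract above can be illustrated as pairwise percent agreement over variants that two programs both classified. This is a minimal sketch under that assumption. The variant identifiers and labels are hypothetical, the program names are used only as dictionary keys, and no real tool output or API is involved.

```python
from itertools import combinations

def pairwise_agreement(calls):
    """Fraction of shared variants on which each pair of programs agrees.

    calls: dict mapping program name -> dict of variant id -> label.
    Pairs with no variants in common are omitted from the result.
    """
    result = {}
    for a, b in combinations(sorted(calls), 2):
        shared = calls[a].keys() & calls[b].keys()
        if not shared:
            continue
        same = sum(1 for v in shared if calls[a][v] == calls[b][v])
        result[(a, b)] = same / len(shared)
    return result

if __name__ == "__main__":
    # Hypothetical classifications for four variants.
    calls = {
        "SIFT": {"v1": "deleterious", "v2": "tolerated", "v3": "tolerated"},
        "PolyPhen": {"v1": "deleterious", "v2": "deleterious", "v3": "tolerated"},
        "MAPP": {"v1": "deleterious", "v3": "deleterious", "v4": "tolerated"},
    }
    for pair, frac in pairwise_agreement(calls).items():
        print(pair, f"{frac:.1%}")
```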