Elsevier

Translational Research

Volume 159, Issue 2, February 2012, Pages 64-79

Review Article
Molecular genetic studies of complex phenotypes

https://doi.org/10.1016/j.trsl.2011.08.001

The approach to molecular genetic studies of complex phenotypes has evolved considerably in recent years. The candidate gene approach, which is restricted to an analysis of a few single-nucleotide polymorphisms (SNPs) in a modest number of cases and controls, has been supplanted by the unbiased approach of genome-wide association studies (GWAS), wherein a large number of tagger SNPs are typed in many individuals. GWAS, which are designed on the common disease-common variant (CD-CV) hypothesis, have identified several SNPs and loci for complex phenotypes. However, the alleles identified through GWAS are typically not causative but rather in linkage disequilibrium (LD) with the true causal variants. The common alleles, which may not capture the uncommon and rare variants, account for only a fraction of the heritability of complex traits. Hence, the focus is shifting to the rare variant-common disease (RV-CD) hypothesis, which surmises that rare variants exert large effect sizes on the phenotype. In conjunction with this conceptual shift, technologic advances in DNA sequencing techniques have dramatically enhanced whole genome or whole exome sequencing capacity. The sequencing approach affords identification of not only the rare but also the common variants. The approach, whether used in complementation with GWAS or as a stand-alone approach, could define the genetic architecture of the complex phenotypes. Robust phenotyping and large-scale sequencing studies are essential to extract the information content of the vast number of DNA sequence variants (DSVs) in the genome. To garner meaningful clinical information and link the genotype to a phenotype, the identification and characterization of a large number of causal fields beyond the information content of DNA sequence variants would be necessary. This review provides an update on the current progress and limitations in identifying DSVs that are associated with phenotypic effects.

Section snippets

Complexity of the Nuclear Genome

The human nuclear genome (the genome) is an apparently simple and yet exceedingly complex structure. The genome contains 3.2 billion nucleotides, which are composed of four repeating units that are ordered seemingly at random and are packed inside the nucleus as a 2-m-long polymer covered by the octameric units of histones. A complex system orchestrates the accessibility of the double-stranded DNA to various proteins that regulate DNA synthesis and gene expression in response to internal and

Diversity of the Human Genomes

Humans are genetically diverse. They differ in approximately 0.1% of their genomes. The single-nucleotide polymorphism (SNP) database (dbSNP, Build 132) lists more than 37 million variants among humans. With the exception of identical twins, no two humans have identical genomes. Every genome contains approximately 4 million DSVs that affect half of the genes in each genome collectively, and many are private (Table I).5, 6, 7, 8, 9, 10, 11, 12 Most DSVs in the genome are SNPs, but structural
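As a back-of-envelope check on the figures quoted above, the sketch below multiplies the 3.2 billion nucleotide genome size by the approximately 0.1% inter-individual difference. The function name and constants are illustrative only, and the calculation assumes a single haploid reference length and ignores diploidy and structural variants, so it indicates an order of magnitude rather than a precise count.

```python
# Rough arithmetic only: constants are taken from the text, names are hypothetical.
GENOME_SIZE_NT = 3_200_000_000   # about 3.2 billion nucleotides
DIFFERENCE_FRACTION = 0.001      # humans differ in about 0.1% of their genomes


def expected_differing_sites(genome_size: int, fraction: float) -> int:
    """Return the approximate number of nucleotide positions that differ."""
    if genome_size < 0 or not 0.0 <= fraction <= 1.0:
        raise ValueError("genome_size must be >= 0 and fraction within [0, 1]")
    return round(genome_size * fraction)


if __name__ == "__main__":
    sites = expected_differing_sites(GENOME_SIZE_NT, DIFFERENCE_FRACTION)
    print(f"Approximate differing positions: {sites:,}")
```

The result, about 3.2 million positions, is of the same order as the approximately 4 million DSVs reported per genome.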

Etiologic Complexity of Complex Phenotypes

The plethora of DSVs in the genome and the multilayer regulation of gene expression and function are indicative of the intricacy of the determinants of the complex phenotypes.17 The clinical phenotypes are presumed to result from the additive effects of and interactions among multiple causative alleles with various genomic and environmental factors. In a complex phenotype, the effect sizes of the involved alleles are expected to vary and to follow a gradient that ranges from minimal or

Genetic Approaches to Complex Phenotypes

The full spectrum of allele frequency in a population is expected to follow a gradient ranging from private to extremely common alleles.25 Conventionally, however, the variants are categorized into 3 classes based on their minor allele frequencies (MAFs) in the population. Common and rare variants are those that have population MAFs of >5% and <1%, respectively. Variants that have population MAFs from 1% to 5% are considered uncommon or infrequent. As observed for the genetic causes of
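The three conventional frequency classes described above can be sketched as a small function. This is a minimal illustration, not a method from the review: the function names are hypothetical, and the boundary values of exactly 1% and 5% are assigned to the "uncommon" class because the text defines common as >5% and rare as <1%.

```python
def minor_allele_frequency(alt_allele_count: int, total_alleles: int) -> float:
    """Compute the MAF from allele counts (total_alleles = 2 x individuals for autosomes)."""
    if total_alleles <= 0 or not 0 <= alt_allele_count <= total_alleles:
        raise ValueError("counts must satisfy 0 <= alt_allele_count <= total_alleles, total > 0")
    # The minor allele is whichever of the two alleles is less frequent.
    minor_count = min(alt_allele_count, total_alleles - alt_allele_count)
    return minor_count / total_alleles


def classify_variant(maf: float) -> str:
    """Classify a variant as 'common', 'uncommon', or 'rare' by its population MAF."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency must lie within [0, 0.5]")
    if maf > 0.05:
        return "common"      # MAF > 5%
    if maf < 0.01:
        return "rare"        # MAF < 1%
    return "uncommon"        # MAF from 1% to 5%


if __name__ == "__main__":
    # Hypothetical allele counts in 500 individuals (1000 alleles).
    for alt_count in (200, 30, 5):
        maf = minor_allele_frequency(alt_count, 1000)
        print(f"MAF = {maf:.3f} -> {classify_variant(maf)}")
```

Because the underlying allele frequency spectrum is a continuous gradient, these cut-offs are a convention rather than a biological boundary.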

Family studies

Complex phenotypes often show familial aggregation, in part because of shared genetic risk factors. When a single allele exerts a large effect on the phenotype, familial segregation will follow a Mendelian pattern of inheritance. However, the contribution of additional variants to the phenotype can give variable expressivity or incomplete penetrance.75 Unlike the phenotypes with a Mendelian pattern of inheritance, wherein the presence of the variant causes the phenotype, albeit with a variable

Design of Genetic Studies

Genetic studies aim to detect and quantify the risk of a disease, the effectiveness of a specific therapy, or the risk of an adverse side effect at an individual level and yet must depend on group data to attain such information. Consequently, the robustness of the group data is imperative for appropriately extending the group data to an individual. Various factors determine the robustness of the group data and the applicability of the findings to an individual, including sample size of the study,

Perspective

The GWAS, which has all but replaced the candidate gene approach, is built on the CD-CV hypothesis. It has been exceedingly successful in identifying many DSVs that are associated with the complex phenotypes. However, the direct clinical utility of the findings is usually limited, as the identified alleles have modest effect sizes on the phenotype, as anticipated. Nonetheless, the lack or paucity of clinical use of GWAS should not lessen the value of the main contribution of GWAS in providing

References (87)

  • E.S. Lander

    Initial impact of the sequencing of the human genome

    Nature

    (2011)
  • M. Blaxter

    Revealing the dark matter of the genome

    Science

    (2010)
  • S. Levy et al.

    The diploid genome sequence of an individual human

    PLoS Biol

    (2007)
  • C. Gunter

    Genomics: a picture worth 1000 Genomes

    Nat Rev Genet

    (2010)
  • E. Pennisi

    Genomics. 1000 Genomes Project gives new map of genetic diversity

    Science

    (2010)
  • E.R. Gamazon et al.

    Comprehensive survey of SNPs in the Affymetrix exon array using the 1000 Genomes dataset

    PLoS ONE

    (2010)
  • J. Wang et al.

    The diploid genome sequence of an Asian individual

    Nature

    (2008)
  • D.A. Wheeler et al.

    The complete genome of an individual by massively parallel DNA sequencing

    Nature

    (2008)
  • J.I. Kim et al.

    A highly annotated whole-genome sequence of a Korean individual

    Nature

    (2009)
  • J.M. Kidd et al.

    Mapping and sequencing of structural variation from eight human genomes

    Nature

    (2008)
  • P.H. Sudmant et al.

    Diversity of human copy number variation and multicopy genes

    Science

    (2010)
  • R.E. Mills et al.

    Mapping copy number variation by population-scale genome sequencing

    Nature

    (2011)
  • E.E. Eichler et al.

    Missing heritability and strategies for finding the underlying causes of complex disease

    Nat Rev Genet

    (2010)
  • R.M. Durbin et al.

    A map of human genome variation from population-scale sequencing

    Nature

    (2010)
  • A.J. Marian et al.

    Strategic approaches to unraveling genetic causes of cardiovascular diseases

    Circ Res

    (2011)
  • A.L. Barabasi et al.

    Network medicine: a network-based approach to human disease

    Nat Rev Genet

    (2011)
  • C.E. Romanoski et al.

    Network for activation of human endothelial cells by oxidized phospholipids: a critical role of heme oxygenase 1

    Circ Res

    (2011 Jul 7)
  • S. Kathiresan et al.

    Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans

    Nat Genet

    (2008)
  • S. Kathiresan et al.

    Common variants at 30 loci contribute to polygenic dyslipidemia

    Nat Genet

    (2009)
  • S. Debette et al.

    Identification of cis- and trans-acting genetic variants explaining up to half the variation in circulating vascular endothelial growth factor levels

    Circ Res

    (2011 Jul 14)
  • P. Innocenti et al.

    Experimental evidence supports a sex-specific selective sieve in mitochondrial genome evolution

    Science

    (2011)
  • B. Lemos et al.

    Polymorphic Y chromosomes harbor cryptic variation with manifold functional consequences

    Science

    (2008)
  • A.J. Marian

    Nature's genetic gradients and the clinical phenotype

    Circ Cardiovasc Genet

    (2009)
  • N. Risch et al.

    The future of genetic studies of complex human diseases

    Science

    (1996)
  • E.T. Cirulli et al.

    Uncovering the roles of rare variants in common disease through whole-genome sequencing

    Nat Rev Genet

    (2010)
  • W. Bodmer et al.

    Common and rare variants in multifactorial susceptibility to common diseases

    Nat Genet

    (2008)
  • J.K. Pritchard et al.

    The allelic architecture of human disease genes: common disease-common variant…or not?

    Hum Mol Genet

    (2002)
  • G. Rodriguez et al.

    Molecular genetic and functional characterization implicate muscle-restricted coiled-coil gene (MURC) as a causal gene for familial dilated cardiomyopathy

    Circ Cardiovasc Genet

    (2011 Jun 3)
  • S.P. Dickson et al.

    Rare variants create synthetic genome-wide associations

    PLoS Biol

    (2010)
  • M.I. McCarthy et al.

    Genome-wide association studies for complex traits: consensus, uncertainty and challenges

    Nat Rev Genet

    (2008)
  • X. Gao et al.

    Avoiding the high Bonferroni penalty in genome-wide association studies

    Genet Epidemiol

    (2010)
  • C. Newton-Cheh et al.

    Genome-wide association study identifies eight loci associated with blood pressure

    Nat Genet

    (2009)
  • T.M. Teslovich et al.

    Biological, clinical and population relevance of 95 loci for blood lipids

    Nature

    (2010)

    Supported by Grant R01-088498 from NHLBI, Grant R21 AG038597-01 from NIA, Burroughs Wellcome Award in Translational Research #1005907, the TexGen Fund from Greater Houston Community Foundation and George and Mary Josephine Hamman Foundation.
