Identification of common genetic variants that account for transcript isoform variation between human populations

  • Original Investigation
Human Genetics

Abstract

In addition to the differences between populations in transcriptional and translational regulation of genes, alternative pre-mRNA splicing (AS) is also likely to play an important role in regulating gene expression and generating variation in mRNA and protein isoforms. Recently, the genetic contribution to transcript isoform variation has been reported in individuals of recent European descent. We report here results of an investigation of the differences in AS patterns between human populations. AS patterns in 176 HapMap lymphoblastoid cell lines derived from individuals of European and African ancestry were evaluated using the Affymetrix GeneChip® Human Exon 1.0 ST Array. A variety of biological processes such as response to stimulus and transcription were found to be enriched among the differentially spliced genes. The differentially spliced genes also include some involved in human diseases that have different prevalence or susceptibility between populations. The genetic contribution to the population differences in transcript isoform variation was then evaluated by a genome-wide association using the HapMap genotypic data on single nucleotide polymorphisms (SNPs). The results suggest that local and distant genetic variants account for a substantial fraction of the observed transcript isoform variation between human populations. Our findings provide new insights into the complexity of the human genome as well as the health disparities between the two populations.

Acknowledgments

This Pharmacogenetics of Anticancer Agents Research (PAAR) Group (http://www.pharmacogenetics.org) study was supported by NIH/NIGMS grants U01 GM61393 and U01 GM61374. We are grateful to Dr. Jeong-Ah Kang for maintaining cell lines, Cheryl A. Roe for reviewing the manuscript and Drs. James Fackenthal and Emily Kistner for helpful discussion. T.A.C., T.X.C., A.C.S., and J.E.B. are employees of Affymetrix, Inc.

Author information

Corresponding author

Correspondence to Nancy J. Cox.

Additional information

W. Zhang and S. Duan contributed equally to this work.

Electronic supplementary material

Below is the link to the electronic supplementary material.

439_2008_601_MOESM1_ESM.xls

Supplemental Table S1. Differentially spliced probesets between the CEU and YRI samples; Supplemental Table S2. SNPs associated with the differentially spliced probesets; Supplemental Table S3. Primers used in the validation of differentially spliced probesets (XLS 746 kb)

439_2008_601_MOESM2_ESM.tif

Supplemental Fig. 1 The workflow chart. Exon-level expression of the CEU and YRI trios was measured using the Affymetrix Human Exon 1.0 ST array. Splicing index values were compared between the two populations to identify differential probesets. Genetic variants associated with the differential probesets and enriched pathways and/or Gene Ontology terms were then identified. 254x190mm (72 x 72 DPI) (TIFF 110 kb)

439_2008_601_MOESM3_ESM.tif

Supplemental Fig. 2 PS3527423 (PARP2) is differentially spliced among 54 unrelated CEU samples. Blue indicates the presence of both spliced and unspliced isoforms (300 bp and 400 bp); red indicates unspliced isoforms (400 bp); black indicates spliced isoforms (300 bp). 254x190mm (72 x 72 DPI) (TIFF 116 kb)

439_2008_601_MOESM4_ESM.tif

Supplemental Fig. 3 Validation of transcript isoform variation between populations using quantitative Real-Time PCR. The relative expression between the exon of interest (EOI) and the neighboring, housekeeping exon (HKE) is shown. (A) MPRL43 (probeset: 3303658) has a lower ratio (one-tail t test p = 0.02) in CEU, consistent with the trend of SI (CEU SI mean = -0.02; YRI SI mean = 0.06). (B) MTMR4 (probeset: 3764493) has a lower ratio (one-tail t test p = 0.05) in CEU, consistent with the trend of SI (CEU SI mean = 0.004; YRI SI mean = 0.024). 254x190mm (72 x 72 DPI) (TIFF 64 kb)
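The supplemental captions above refer to a splicing index (SI) that is compared between the CEU and YRI samples. The following is a minimal sketch of that kind of comparison, not the authors' pipeline. It assumes the commonly used definition of SI as the log2 ratio of probeset-level (exon) signal to gene-level signal, and it uses Welch's t statistic as a stand-in for the between-population test, which this excerpt does not specify. All function names and intensity values are hypothetical.

```python
import math
from statistics import mean, variance


def splicing_index(exon_signal, gene_signal):
    """Assumed SI definition: log2 of exon-level over gene-level signal."""
    return math.log2(exon_signal / gene_signal)


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Hypothetical (exon, gene) intensities for one probeset; not data from the study.
ceu = [(100.0, 200.0), (110.0, 200.0), (90.0, 200.0)]
yri = [(200.0, 200.0), (220.0, 200.0), (180.0, 200.0)]

ceu_si = [splicing_index(e, g) for e, g in ceu]
yri_si = [splicing_index(e, g) for e, g in yri]

# A negative statistic means the exon is relatively less included in CEU.
t_stat = welch_t(ceu_si, yri_si)
print(f"CEU SI mean = {mean(ceu_si):.3f}, YRI SI mean = {mean(yri_si):.3f}")
print(f"Welch t = {t_stat:.2f}")
```

Normalizing by gene-level signal is what separates a change in exon usage from a change in overall expression of the gene. The samples in the study include trios, so a real analysis would need to account for relatedness, which this sketch ignores.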

About this article

Cite this article

Zhang, W., Duan, S., Bleibel, W.K. et al. Identification of common genetic variants that account for transcript isoform variation between human populations. Hum Genet 125, 81–93 (2009). https://doi.org/10.1007/s00439-008-0601-x

  • DOI: https://doi.org/10.1007/s00439-008-0601-x
