Review
Impact of human genome sequencing for in silico target discovery
Nature of the human genomic sequence
In 1995, an international public consortium outlined a plan to complete the human genome sequence. The agreed strategy was to select genomic clones from a physical map and subject them to shotgun sequencing. Several rounds of sequencing are needed to reach the target of 99.99% accuracy and tenfold coverage for the high-quality, or finished, sequence. Because of progress in sequencing technology and in response to pressure from private sequencing initiatives, the
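To make the coverage and accuracy targets concrete, the following is a minimal sketch based on the idealized Lander-Waterman model, which assumes reads fall uniformly at random on the clone. The clone size (150 kb) and read length (500 bp) are illustrative assumptions and do not come from the article; the function names are likewise hypothetical.

```python
import math


def coverage(num_reads, read_length, target_size):
    """Average depth of coverage: total sequenced bases / target size."""
    return num_reads * read_length / target_size


def reads_needed(depth, read_length, target_size):
    """Number of reads required to reach a given average depth."""
    return math.ceil(depth * target_size / read_length)


def fraction_uncovered(depth):
    """Expected fraction of bases hit by no read (Poisson approximation)."""
    return math.exp(-depth)


def expected_errors(target_size, accuracy):
    """Expected number of erroneous bases at a given per-base accuracy."""
    return target_size * (1.0 - accuracy)


# Assumed example: one 150 kb clone sequenced with 500 bp reads.
CLONE_SIZE = 150_000   # assumption, not from the article
READ_LENGTH = 500      # assumption, not from the article

n_reads = reads_needed(10, READ_LENGTH, CLONE_SIZE)
print("reads for tenfold coverage:", n_reads)
print("expected uncovered fraction:", fraction_uncovered(10))
print("expected errors at 99.99% accuracy:", expected_errors(CLONE_SIZE, 0.9999))
```

Under these assumptions, tenfold coverage leaves only a tiny expected fraction of bases unsequenced, while 99.99% accuracy still corresponds to roughly one error per 10 000 bases. Real projects deviate from this model because of cloning bias and repeats.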
How many opportunities?
The human genome sequence contains sufficient information to identify novel genes that represent opportunities for therapeutic intervention. Until recently, the main source of novelty was expressed sequence tags (ESTs) [7,8]. EST collections have proven useful for identifying novel targets, for example cathepsin K [9]. However, rare transcripts and genes with a restricted pattern of expression, although attractive as drug targets, are the most difficult to identify using EST technology. Because of
Discovery genomics: a strategy to identify in silico novel targets
Perhaps the simplest strategy for identifying potential drug targets in the human genome is to apply bioinformatics tools to the numerous sequence databases available, in a process known as ‘discovery genomics’ [30] (Fig. 1). Gene predictions from ab initio or sequence similarity programs can be searched against databases of known effective drug target classes, such as GPCRs, proteases, ion channels, nuclear hormone receptors and kinases, to uncover novel members of these gene families. Choosing
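As a rough illustration of the family-based screen described above, the sketch below compares predicted peptides against reference members of known target families and reports the best-matching family. It is a toy stand-in only: shared k-mer (Jaccard) similarity replaces the alignment or profile searches that a real pipeline would use, and every sequence, name and threshold here is a made-up placeholder rather than anything taken from the article.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a peptide string."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def screen(predictions, families, k=3, threshold=0.3):
    """Assign each predicted peptide to its best-scoring family.

    Returns (prediction name, family, score) tuples for predictions
    whose best score reaches the threshold; others are dropped.
    """
    hits = []
    for name, seq in predictions.items():
        query = kmers(seq, k)
        best_family, best_score = None, 0.0
        for family, references in families.items():
            for ref in references:
                score = jaccard(query, kmers(ref, k))
                if score > best_score:
                    best_family, best_score = family, score
        if best_family is not None and best_score >= threshold:
            hits.append((name, best_family, round(best_score, 2)))
    return hits


# Placeholder strings, not real receptor or kinase sequences.
FAMILIES = {
    "GPCR": ["ACDEFGHIKLMNPQRSTVWY"],
    "kinase": ["YWVTSRQPNMLKIHGFEDCA"],
}
PREDICTIONS = {
    "pred_1": "ACDEFGHIKLMNPQRSTVWY",   # identical to the toy GPCR reference
    "pred_2": "GGGGGGGGGG",             # shares no k-mer with any reference
    "pred_3": "YWVTSRQPNMAAAAAAAAAA",   # partial overlap with the toy kinase
}

for hit in screen(PREDICTIONS, FAMILIES):
    print(hit)
```

In practice the similarity function would be replaced by a proper sequence search, and the threshold would need calibration against known family members; the structure of the loop (query each prediction against each family, keep the best hit) is the only part this sketch is meant to convey.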
Beyond the human genome sequence: functional genomics and bioinformatics
The challenge does not lie in finding and creating long lists of genes but in validating novel transcripts and picking winners for drug discovery early in the validation process. Various functional genomics platforms and approaches can be used to functionally characterize genes or reveal pathways. Most of them will require bioinformatics support for data management and analysis.
Conclusion
With the first draft of the human genome available, the identification of the majority of human genes is becoming a genuine possibility. Some of these genes will be similar to known drug target genes; others will be completely novel. However, in most cases, functional analysis will be necessary. Bioinformatics together with in silico analysis will play a major role in analysing, managing and connecting data to improve functional annotation using genomics methods. At present, computational
Acknowledgements
I thank colleagues from the Genome Informatics and Analysis Group at GlaxoSmithKline (Stevenage, UK) for their comments and suggestions.
References (80)
The Merck Gene Index project
Drug Discov. Today (1999)
Cathepsin K, but not cathepsins B, L or S, is abundantly expressed in human osteoclasts
J. Biol. Chem. (1996)
Basic local alignment search tool
J. Mol. Biol. (1990)
Prediction of complete gene structures in human genomic DNA
J. Mol. Biol. (1997)
A novel family of mammalian taste receptors
Cell (2000)
The impact of genomics on drug discovery
Prog. Med. Chem. (2000)
Identification and characterization of a novel human cortistatin-like peptide
Biochem. Biophys. Res. Commun. (1997)
Cloning and functional expression of a human orthologue of rat vanilloid receptor-1
Pain (2000)
A 4-Mb high-density single nucleotide polymorphism-based map around human ApoE
Genomics (1998)
Functional discovery via a compendium of expression profiles
Cell (2000)
Promoter analysis of co-regulated genes in the yeast genome
Comput. Chem.
The DNA sequence of human chromosome 22
Nature
The DNA sequence of human chromosome 21
Nature
Human whole-genome shotgun sequencing
Genome Res.
Shotgun sequencing of the human genome
Science
Whole-genome random sequencing and assembly of Haemophilus influenzae
Science
A whole-genome assembly of Drosophila
Science
Rapid cDNA sequencing (expressed sequence tags) from a directionally cloned human infant brain cDNA library
Nat. Genet.
How many genes in the human genome?
Nat. Genet.
Analysis of expressed sequence tags indicates 35 000 human genes
Nat. Genet.
Gene index analysis of the human genome estimates approximately 120 000 genes
Nat. Genet.
Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence
Nat. Genet.
The genome sequence of Drosophila melanogaster
Science
Gene recognition via spliced alignment
Proc. Natl. Acad. Sci. U. S. A.
Dynamite: a flexible code generating language for dynamic programming methods used in sequence comparison
ISMB
Locating protein-coding regions in DNA sequences by a multiple sensor-neural approach
Proc. Natl. Acad. Sci. U. S. A.
An assessment of gene prediction accuracy in large DNA sequences
Genome Res.
From bioinformatics to computational biology
Genome Res.
Computational methods for exon detection
Mol. Biotechnol.
Eukaryotic promoter recognition
Genome Res.
Gene finding approaches for eukaryotes
Genome Res.
Comparing the success of different prediction software in sequence analysis: a review
Briefings in Bioinformatics
Genomic sciences and the medicine of tomorrow
The role of innovation in drug development
Nat. Biotechnol.
Drug discovery: a historical perspective
Science
Pharmacogenetics and the practice of medicine
Nature
Ligand-receptor pairing via tree comparison
J. Comp. Biol.
Assessing the protease and protease inhibitor content of the human genome
J. Pept. Sci.
A new dynamic tool to perform assembly of expressed sequence tags, ESTs
Comp. Appl. Biosci.
Cited by (20)
Progress and problems in the exploration of therapeutic targets
2006, Drug Discovery Today
Case study: Use of a library of antisense inhibitors for gene functionalization and drug target validation
2004, Drug Discovery Today: TARGETS
Bioinformatics for the genomic sciences and towards systems biology. Japanese activities in the post-genome era
2002, Progress in Biophysics and Molecular Biology
Citation excerpt: Also hundreds of pathogenic microbial genomes each of which has hundreds or thousands of genes will be identified. These achievements will help increase the number of drug targets (Sanseau, 2001). Secondly, the Structure-Based Drug Design (SBDD) will become more active.
In silico identification of novel therapeutic targets
2002, Drug Discovery Today
From symptomatic treatments to causative therapy?
2001, Current Opinion in Chemical Biology