Identification of six novel genes by experimental validation of GeneMachine predicted genes

Gene. 2002 Feb 6;284(1-2):203-13. doi: 10.1016/s0378-1119(01)00897-6.

Abstract

In silico gene identification from finished and unfinished human genome sequence has become critically important in many projects seeking insight into the gene content of genomic regions implicated in disease. To establish limitations and criteria for in silico gene identification, and to identify novel genes of potential relevance to human prostate cancer and melanoma, 3 Mb of chromosome 1 sequence was analyzed using GeneMachine, a software suite comprising sequence similarity programs and four gene identification programs. A total of 49 potential transcripts were selected, of which 37 were chosen for experimental validation. We verified 16 of the predicted genes by experimental analysis. Comparing the predicted transcripts with their cloned forms helped to refine the predicted gene models and to identify splice variants for several of them. Although sequences matching ten of our verified genes have recently been deposited in GenBank, six remain novel. Our studies support the feasibility of identifying novel genes from regions of interest using draft human genome sequence.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosomes, Human, Pair 1 / genetics
  • Cloning, Molecular
  • DNA / chemistry
  • DNA / genetics
  • DNA, Complementary / chemistry
  • DNA, Complementary / genetics
  • Exons / genetics
  • Female
  • Gene Expression
  • Genes / genetics*
  • Genome, Human
  • Humans
  • Male
  • Molecular Sequence Data
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism
  • Reproducibility of Results
  • Sequence Analysis, DNA
  • Software*

Substances

  • DNA, Complementary
  • RNA, Messenger
  • DNA

Associated data

  • GENBANK/AF387611
  • GENBANK/AF387612
  • GENBANK/AF387613
  • GENBANK/AF387614
  • GENBANK/AF387615
  • GENBANK/AF387616
  • GENBANK/AF387617
  • GENBANK/AF387618
  • GENBANK/AF387619
  • GENBANK/AF387620