Knowledge discovery by automated identification and ranking of implicit relationships

Bioinformatics. 2004 Feb 12;20(3):389-98. doi: 10.1093/bioinformatics/btg421. Epub 2004 Jan 22.

Abstract

Motivation: New relationships are often implicit in existing information, but the amount and growth of published literature limit the scope of analysis an individual can accomplish. Our goal was to develop and test a computational method to identify relationships within scientific reports, such that large sets of relationships between unrelated items could be sought out and statistically ranked for their potential relevance as a set.

Results: We first construct a network of tentative relationships between 'objects' of biomedical research interest (e.g. genes, diseases, phenotypes, chemicals) by identifying their co-occurrences within all electronically available MEDLINE records. Relationships shared by two unrelated objects are then ranked against a random network model to estimate the statistical significance of any given grouping. When compared against known relationships, we find that this ranking correlates with both the probability and frequency of object co-occurrence, demonstrating the method is well suited to discover novel relationships based upon existing shared relationships. To test this, we identified compounds whose shared relationships predicted they might affect the development and/or progression of cardiac hypertrophy. When laboratory tests were performed in a rodent model, chlorpromazine was found to reduce the progression of cardiac hypertrophy.
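
The co-occurrence and ranking steps described above can be illustrated with a minimal sketch. The abstract does not specify the random network model, so this example assumes a simple one: in a network of n objects, two objects with k_a and k_c neighbors are expected to share about k_a * k_c / n neighbors by chance, and candidates are ranked by the ratio of observed to expected shared neighbors. The function names (`build_network`, `rank_implicit`) and the toy records are hypothetical placeholders, not the authors' implementation or real MEDLINE data.

```python
from collections import defaultdict
from itertools import combinations


def build_network(records):
    """Map each object to the set of objects it co-occurs with in any record."""
    neighbors = defaultdict(set)
    for objects in records:
        for a, b in combinations(sorted(set(objects)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def rank_implicit(neighbors, source):
    """Rank objects with no direct link to `source` by shared neighbors.

    Assumption (not from the paper): the expected number of shared
    neighbors under a random network is k_source * k_target / n.
    """
    n = len(neighbors)
    results = []
    for target in neighbors:
        if target == source or target in neighbors[source]:
            continue  # skip the source itself and already-known relationships
        shared = neighbors[source] & neighbors[target]
        if not shared:
            continue
        expected = len(neighbors[source]) * len(neighbors[target]) / n
        results.append((target, len(shared), len(shared) / expected))
    # Highest observed/expected ratio first
    return sorted(results, key=lambda r: r[2], reverse=True)


# Toy records: each list stands for the objects recognized in one abstract.
records = [
    ["compound_X", "gene_1"],
    ["compound_X", "gene_2"],
    ["gene_1", "disease_Y"],
    ["gene_2", "disease_Y"],
    ["compound_Z", "gene_3"],
    ["gene_3", "disease_Y"],
    ["compound_Z", "gene_4"],
]

network = build_network(records)
ranked = rank_implicit(network, "disease_Y")
for target, n_shared, ratio in ranked:
    print(f"{target}: {n_shared} shared, observed/expected = {ratio:.2f}")
```

In this toy network, compound_X never co-occurs with disease_Y but shares two intermediates with it, so it ranks above compound_Z, which shares only one. A fuller treatment would replace the ratio with a proper significance estimate and weight relationships by co-occurrence frequency.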

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Abstracting and Indexing / methods
  • Algorithms*
  • Artificial Intelligence*
  • Computing Methodologies
  • Database Management Systems
  • Databases, Bibliographic*
  • Information Storage and Retrieval / methods*
  • MEDLINE
  • Natural Language Processing*
  • Pattern Recognition, Automated
  • Periodicals as Topic*
  • Semantics
  • Terminology as Topic*