Increasing amounts of biological data are accumulating in the pharmaceutical industry and academic institutions. However, data does not equal actionable information, and guidelines for appropriate data capture, harmonization, integration, mining, and visualization need to be established to fully harness its potential. Here, we describe ongoing efforts at Merck & Co. to structure data in the area of chemogenomics. We are integrating complementary data from both internal and external sources into one chemogenomics database (Chemical Genetic Interaction Enterprise; CHEMGENIE). We demonstrate how this well-curated database facilitates compound set design, tool compound selection, target deconvolution in phenotypic screening, and predictive model building.
Copyright © 2017 Elsevier Ltd. All rights reserved.