Protein database and quantitative analysis considerations when integrating genetics and proteomics to compare mouse strains

J Proteome Res. 2011 Jul 1;10(7):2905-12. doi: 10.1021/pr200133p. Epub 2011 May 9.

Abstract

Decades of genetics research comparing mouse strains have identified many regions of the genome associated with quantitative traits. Microarrays have been used to identify which genes in those regions are differentially expressed and are therefore potentially causal; however, genetic variants that affect probe hybridization lead to many false conclusions. Here we used spectral counting to compare brain striata between two mouse strains. Using strain-specific protein databases, we concluded that proteomics was more robust to sequence differences than microarrays; however, some proteins were still significantly affected. To generate strain-specific databases, we used a complete database that contained all of the putative genetic isoforms for each protein. While the increased proteome coverage in the databases led to a 6.8% gain in peptide assignments compared to a nonredundant database, it also necessitated the development of a strategy for grouping similar proteins due to a large number of shared peptides. Of the 4563 identified proteins (2.1% FDR), there were 1807 quantifiable proteins/groups that exceeded minimum count cutoffs. With four pooled biological replicates per strain, we used quantile normalization, ComBat (a package that adjusts for batch effects), and edgeR (a package for differential expression analysis of count data) to identify 101 differentially expressed proteins/groups, 84 of which had a coding region within one of the genomic regions of interest identified by the Portland Alcohol Research Center.
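
As an illustration of two of the preprocessing steps named in the abstract (a minimum count cutoff followed by quantile normalization), the following Python sketch operates on a matrix of spectral counts with proteins/groups in rows and samples in columns. It is not the authors' code: the function names, the form of the cutoff (mean count per row), and the `min_mean` value are assumptions made for illustration, since the abstract does not state the actual cutoffs. The batch adjustment (ComBat) and differential expression testing (edgeR) steps are R packages and are not reproduced here.

```python
import numpy as np


def filter_min_counts(counts, min_mean=2.0):
    """Keep rows whose mean count across samples is at least min_mean.

    The cutoff form and value are hypothetical; the abstract only says
    that quantifiable proteins/groups exceeded minimum count cutoffs.
    """
    counts = np.asarray(counts, dtype=float)
    keep = counts.mean(axis=1) >= min_mean
    return counts[keep], keep


def quantile_normalize(counts):
    """Give every sample (column) the same distribution of values.

    Each column is sorted, the sorted columns are averaged row by row,
    and each original value is replaced by the average at its rank.
    Ties are broken by row order rather than averaged (a simplification).
    """
    counts = np.asarray(counts, dtype=float)
    order = np.argsort(counts, axis=0, kind="stable")
    sorted_vals = np.take_along_axis(counts, order, axis=0)
    rank_means = sorted_vals.mean(axis=1)
    normalized = np.empty_like(counts)
    np.put_along_axis(normalized, order, rank_means[:, None], axis=0)
    return normalized


if __name__ == "__main__":
    # Toy spectral counts: 5 proteins/groups x 3 samples (made-up numbers).
    toy = [[5, 4, 3],
           [2, 1, 4],
           [3, 4, 6],
           [4, 2, 8],
           [0, 1, 0]]
    kept, mask = filter_min_counts(toy, min_mean=2.0)
    print(mask)
    print(quantile_normalize(kept))
```

After normalization every column contains the same set of values, so differences between samples in overall count depth no longer dominate comparisons between strains.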

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alcohol Drinking / genetics
  • Algorithms
  • Amino Acid Sequence
  • Animals
  • Behavior, Animal / drug effects
  • Corpus Striatum / chemistry*
  • Databases, Protein
  • Ethanol / administration & dosage
  • Male
  • Mass Spectrometry
  • Mice
  • Mice, Inbred C57BL / genetics*
  • Mice, Inbred DBA / genetics*
  • Molecular Sequence Data
  • Multigene Family
  • Open Reading Frames
  • Polymorphism, Single Nucleotide
  • Protein Isoforms / analysis*
  • Protein Isoforms / chemistry
  • Protein Isoforms / genetics
  • Proteins / analysis*
  • Proteins / chemistry
  • Proteins / genetics
  • Proteome / genetics*
  • Proteomics / methods*
  • Quantitative Trait Loci*
  • Species Specificity

Substances

  • Protein Isoforms
  • Proteins
  • Proteome
  • Ethanol