Background: The laboratory mouse is the most commonly used model for studying variation in complex traits relevant to human disease. Here we present the whole-genome sequences of two inbred strains, LG/J and SM/J, which are frequently used to study variation in complex traits as diverse as aging, bone growth, adiposity, maternal behavior, and methamphetamine sensitivity.
Results: We identified small nucleotide variants (SNVs) and structural variants (SVs) in the LG/J and SM/J strains relative to the reference genome and discovered novel variants in these two strains by comparing their sequences to other mouse genomes. We find that 39% of the LG/J and SM/J genomes are identical-by-descent (IBD). We characterized amino acid-changing mutations using three algorithms: LRT, PolyPhen-2 and SIFT. We also identified polymorphisms between LG/J and SM/J that fall in regulatory regions and highly informative transcription factor binding sites (TFBS). We intersected these functional predictions with quantitative trait loci (QTL) mapped in advanced intercrosses of these two strains. We find that QTL are both over-represented in non-IBD regions and highly enriched for variants predicted to have a functional impact. We interrogated variants in QTL associated with metabolic traits (231 QTL identified in an F16 generation) and developmental traits (41 QTL identified in an F34 generation), and we highlight candidate quantitative trait genes (QTG) and nucleotides (QTN) in a QTL on chr13 associated with variation in basal glucose levels and in a QTL on chr6 associated with variation in tibia length.
Conclusions: We show how integrating genomic sequence with QTL reduces the QTL search space and helps researchers prioritize candidate genes and nucleotides for experimental follow-up. Additionally, given the phylogenetic position of LG/J and SM/J among inbred strains, these data contribute important information to the genomic landscape of the laboratory mouse.
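
As an illustration of how intersecting variant predictions with QTL intervals narrows the search space, the following is a minimal sketch and not the pipeline used in this study. The `Variant` and `QTL` types, the `damaging` flag, the function name, and all coordinates are hypothetical and chosen only for the example; in practice the flag might summarize calls from tools such as LRT, PolyPhen-2 and SIFT.

```python
from collections import defaultdict
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int        # 1-based position (toy value, not real data)
    damaging: bool  # hypothetical flag: predicted to have a functional impact


class QTL(NamedTuple):
    name: str
    chrom: str
    start: int      # inclusive interval start (toy value)
    end: int        # inclusive interval end (toy value)


def candidates_per_qtl(variants, qtls):
    """Return, for each QTL, the predicted-functional variants inside its interval."""
    # Index only the variants flagged as functional, grouped by chromosome.
    by_chrom = defaultdict(list)
    for v in variants:
        if v.damaging:
            by_chrom[v.chrom].append(v)

    # Keep the variants whose position falls within each QTL interval.
    result = {}
    for q in qtls:
        result[q.name] = sorted(
            v for v in by_chrom[q.chrom] if q.start <= v.pos <= q.end
        )
    return result


# Made-up toy input, for illustration only.
variants = [
    Variant("chr13", 150, True),
    Variant("chr13", 900, True),   # functional, but outside the interval
    Variant("chr13", 200, False),  # inside the interval, but not functional
    Variant("chr6", 50, True),     # outside the chr6 interval
]
qtls = [
    QTL("toy_glucose", "chr13", 100, 300),
    QTL("toy_tibia", "chr6", 10, 40),
]
hits = candidates_per_qtl(variants, qtls)
```

In this toy input, only one of the three chr13 variants survives both filters. A linear scan per QTL is adequate for a sketch; genome-scale data would typically call for an interval index or a dedicated tool.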