Background: Soybean (Glycine max) has been bred for thousands of years to produce seeds rich in protein for human and animal consumption, making the seeds an appealing bioreactor for producing valuable recombinant proteins at high levels. However, the effects of high-level recombinant protein expression on soybean physiology are not well understood. To address this, we investigated whether gene expression within transgenic soybean seed tissue is altered when large amounts of recombinant protein are produced and stored exclusively in the seeds. We used RNA-Seq to survey gene expression in three transgenic soybean lines expressing recombinant protein at levels of up to 1.61% of total protein in seed tissues. The three lines were ST77, expressing human thyroglobulin protein (hTG); ST111, expressing human myelin basic protein (hMBP); and 764, expressing a mutant, nontoxic form of a staphylococcal subunit vaccine protein (mSEB). All lines selected for analysis were homozygous and contained a single copy of the transgene.
Methods: Each transgenic soybean seed was screened for transgene presence by PCR and for recombinant protein expression by western blotting. Whole-seed mRNA was extracted and cDNA libraries were constructed for Illumina sequencing. Following alignment to the soybean reference genome, differential gene expression analysis was conducted using edgeR and Cufflinks. Functional analysis of differentially expressed genes was carried out using the gene ontology analysis tool AgriGO.
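As an illustration of the final step of a differential expression workflow, the following minimal Python sketch sorts genes into upregulated and downregulated sets from a table of per-gene results. It is not the edgeR or Cufflinks procedure used in the study. The `GeneResult` type, the `classify` function, the gene identifiers, and the cutoffs (FDR below 0.05, absolute log2 fold change of at least 1) are all assumptions made for the example; the abstract does not state the thresholds applied.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """One row of a hypothetical differential expression results table."""
    gene_id: str
    log2_fold_change: float  # transgenic line relative to wild type
    fdr: float               # p-value adjusted for multiple testing


def classify(results, fdr_cutoff=0.05, min_abs_log2fc=1.0):
    """Split genes into upregulated and downregulated lists.

    The cutoffs are illustrative defaults, not values from the study.
    """
    up, down = [], []
    for r in results:
        # Skip genes that fail either the significance or effect-size filter.
        if r.fdr >= fdr_cutoff or abs(r.log2_fold_change) < min_abs_log2fc:
            continue
        (up if r.log2_fold_change > 0 else down).append(r.gene_id)
    return up, down


# Placeholder records; these are not data from the study.
example = [
    GeneResult("gene_A", 2.3, 0.001),   # significant, higher in transgenic
    GeneResult("gene_B", -1.7, 0.02),   # significant, lower in transgenic
    GeneResult("gene_C", 0.4, 0.0005),  # significant but small effect
    GeneResult("gene_D", 3.1, 0.30),    # large effect but not significant
]

up, down = classify(example)
print(f"upregulated: {up}, downregulated: {down}")
```

Counting the entries in the two lists for each line gives the kind of per-line totals of differentially expressed genes that a study like this reports.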
Results: The transcriptomes of nine seeds from each transgenic line were sequenced and compared with those of wild-type seeds. Native soybean gene expression was significantly altered in line 764 (mSEB), with more than 3000 genes upregulated or downregulated. ST77 (hTG) and ST111 (hMBP) showed far fewer differences, with 52 and 307 differentially expressed genes, respectively. Gene ontology enrichment analysis found that upregulated genes in line 764 were annotated with functions related to endopeptidase inhibitors and protein synthesis, whereas downregulated genes were annotated to the nuclear pore and to protein transport. No significant gene ontology terms were detected in ST77, and only a few genes involved in photosynthesis and thylakoid functions were downregulated in ST111. Despite these differences, transgenic plants and seeds appeared phenotypically similar to non-transgenic controls. There was no correlation between recombinant protein expression level and the number of differentially expressed genes detected.
Conclusions: Measurable unscripted gene expression changes were detected in the seed transcriptomes of all three transgenic soybean lines analyzed, with line 764 the most substantially altered. Differences detected at the transcript level may be due to T-DNA insert locations, random mutations following transformation, direct effects of the recombinant protein itself, or a combination of these. The physiological consequences of such changes remain unknown.