While often represented as static entities, gene networks are highly context-dependent. Here, we developed a multi-task learning strategy to yield context-specific representations of gene network dynamics. We assembled a corpus of ~103 million human single-cell transcriptomes spanning a broad range of tissues and diseases and performed two-stage pretraining: first on non-malignant cells to generate a foundational model, then with continual learning on cancer cells to tune the model to the cancer domain. We performed multi-task learning with the foundational model to learn context-specific representations across a broad range of cell types, tissues, developmental stages, and diseases. We then leveraged the cancer-tuned model to jointly learn cell states and predict tumor-restricting factors within the colorectal tumor microenvironment. Model quantization enabled resource-efficient fine-tuning and inference while preserving biological knowledge. Overall, multi-task learning enables context-specific disease modeling that can yield contextual predictions of candidate therapeutic targets for human disease.
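The core idea of multi-task learning, a shared representation trained jointly against several prediction tasks, can be sketched in miniature. The toy data, dimensions, and task names below ("cell_type", "disease") are illustrative assumptions only, not the authors' architecture or corpus: a single shared linear encoder feeds one logistic head per task, and both task losses are summed into one objective before a gradient step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 64 "cells" x 16 "gene" features, with two binary labels per
# cell (e.g. a cell-type and a disease-state task) -- purely illustrative.
X = rng.normal(size=(64, 16))
y_type = (X @ rng.normal(size=16) > 0).astype(float)
y_dis = (X @ rng.normal(size=16) > 0).astype(float)
labels = {"cell_type": y_type, "disease": y_dis}

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Shared encoder (one linear layer) plus one logistic head per task.
W_shared = rng.normal(scale=0.1, size=(16, 8))
heads = {"cell_type": rng.normal(scale=0.1, size=8),
         "disease": rng.normal(scale=0.1, size=8)}

def total_loss():
    # Sum of binary cross-entropy losses over all tasks,
    # computed from the shared representation H.
    H = X @ W_shared
    loss = 0.0
    for task, w in heads.items():
        p = sigmoid(H @ w)
        y = labels[task]
        loss += -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    return loss

lr = 0.1
loss_before = total_loss()
for _ in range(200):
    H = X @ W_shared
    grad_W = np.zeros_like(W_shared)
    for task in heads:
        w = heads[task]
        err = (sigmoid(H @ w) - labels[task]) / len(X)  # dL/dlogit for BCE
        grad_W += X.T @ np.outer(err, w)  # gradient flowing into the shared encoder
        heads[task] = w - lr * (H.T @ err)
    W_shared -= lr * grad_W  # one update serves every task
loss_after = total_loss()
```

Because the encoder's gradient accumulates contributions from every head, the shared representation is shaped by all tasks at once, which is the property the abstract exploits to obtain context-specific representations.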