A variety of regression schemes have been proposed for images or for shapes, but available methods do not handle the two jointly. In this paper, we present a framework for joint image and shape regression that incorporates image and anatomical shape information in a consistent manner. Evolution is described by a generative model that is the analog of linear regression, fully characterized by baseline images and shapes (the intercept) and initial momentum vectors (the slope). Further, our framework adopts a control point parameterization of deformations, in which the dimensionality of the deformation is determined by the complexity of anatomical changes over time rather than by the sampling of the image and/or the geometric data. We derive a gradient descent algorithm that simultaneously estimates the baseline images and shapes, the locations of the control points, and the momenta. Experiments on real medical data demonstrate that our framework effectively combines image and shape information, resulting in improved modeling of 4D (3D space + time) trajectories.
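
To make the "intercept and slope" analogy concrete, the following is a minimal sketch of how control points and initial momenta could generate a trajectory. It is an illustration under stated assumptions, not the implementation described in this paper: the Gaussian kernel, the width `sigma`, the forward Euler integrator, and the names `gaussian_kernel` and `shoot` are all hypothetical choices. The update rules are the standard Hamiltonian equations commonly used for control-point geodesic shooting.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Scalar Gaussian kernel K(x, y) = exp(-|x - y|^2 / sigma^2) (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / sigma ** 2)


def shoot(control_points, momenta, sigma=1.0, n_steps=100):
    """Integrate control points c_i and momenta a_i from t=0 to t=1.

    Assumed dynamics (forward Euler):
      dc_i/dt =  sum_j K(c_i, c_j) a_j
      da_i/dt = -sum_j (a_i . a_j) grad_1 K(c_i, c_j)
    """
    c = [list(p) for p in control_points]
    a = [list(m) for m in momenta]
    dim = len(c[0])
    dt = 1.0 / n_steps
    for _ in range(n_steps):
        dc = [[0.0] * dim for _ in c]
        da = [[0.0] * dim for _ in a]
        for i in range(len(c)):
            for j in range(len(c)):
                k = gaussian_kernel(c[i], c[j], sigma)
                dot = sum(p * q for p, q in zip(a[i], a[j]))
                for d in range(dim):
                    dc[i][d] += k * a[j][d]
                    # grad_1 K = -2 (c_i - c_j) / sigma^2 * K, so the
                    # leading minus sign of da_i/dt cancels here.
                    da[i][d] += dot * 2.0 * (c[i][d] - c[j][d]) / sigma ** 2 * k
        for i in range(len(c)):
            for d in range(dim):
                c[i][d] += dt * dc[i][d]
                a[i][d] += dt * da[i][d]
    return c, a


# Usage: one control point at the origin with initial momentum (1.0, 0.5).
final_points, final_momenta = shoot([[0.0, 0.0]], [[1.0, 0.5]])
```

With a single control point the kernel gradient vanishes, so the momentum stays constant and the point moves along the straight line c(t) = c(0) + t a(0), which mirrors the intercept and slope of ordinary linear regression. With several interacting control points the trajectories curve, while the number of parameters still depends on the number of control points rather than on the image or mesh resolution.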