Macromolecular systems comprise a very large number of atomic degrees of freedom. There is strong evidence that the structural changes large biomolecular systems undergo on long time scales can be captured by models coarser than atomistic, although a suitable or optimal coarse-graining is not known a priori. Here we propose a systematic approach to learning a coarse representation of a macromolecule from microscopic simulation data. In particular, effective coarse variables are defined by partitioning the degrees of freedom both in structural (physical) space and in conformational space. Identifying groups of microscopic particles that form dynamically coherent states in different metastable states leads to a multiscale description of the system in both space and time. Applying this approach to the folding dynamics of two proteins provides a revised view of the classical idea of prestructured regions (foldons) that combine during protein folding, and suggests a hierarchical characterization of how folded structures assemble.