How can the components of visual comprehension be characterized as brain activity? Making sense of a dynamic visual world involves perceiving streams of activity as discrete units, such as eating breakfast or walking the dog. To parse activity into distinct events, the brain relies both on the perceptual (bottom-up) data available in the stimulus and on expectations about the course of the activity based on previous experience with, or knowledge about, similar types of activity (top-down data). Using fMRI, we examined the contributions of bottom-up and top-down processing to the comprehension of action streams by contrasting familiar action sequences with sequences that required exactly the same perceptual detection and motor responses (a yoked control) but carried no visual action familiarity. New methods incorporating structural equation modeling of the data yielded distinct patterns of interactivity among brain areas as a function of the degree to which bottom-up and top-down data were available.
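The structural equation modeling referred to above estimates directed path coefficients among brain regions from their observed covariance structure. As a minimal illustrative sketch only (not the authors' actual analysis pipeline), and assuming hypothetical region-of-interest time series and a hypothetical path model (V1 → MT → STS, V1 → STS), path coefficients for each region can be approximated by regressing it onto its modeled input regions:

```python
# Illustrative SEM-style path-coefficient sketch; the ROI names, path model,
# and simulated data are hypothetical and stand in for extracted fMRI signals.
import numpy as np

rng = np.random.default_rng(0)
n_scans = 200

# Simulated ROI time series.
v1 = rng.standard_normal(n_scans)
mt = 0.6 * v1 + 0.5 * rng.standard_normal(n_scans)
sts = 0.4 * v1 + 0.5 * mt + 0.5 * rng.standard_normal(n_scans)

def path_coefficients(target, sources):
    """Estimate path coefficients for one region by ordinary least squares
    on its modeled input regions (a simplification of full SEM fitting)."""
    X = np.column_stack(sources)
    coefs, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coefs

print("V1 -> MT:", path_coefficients(mt, [v1]))
print("V1 -> STS, MT -> STS:", path_coefficients(sts, [v1, mt]))
```

In full structural equation modeling, all path coefficients are instead estimated simultaneously by minimizing the discrepancy between the observed and model-implied covariance matrices, and the fitted models can be compared across experimental conditions to reveal condition-dependent interactivity among regions.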