Audio-visual integration in the human brain influences perception and the precision of motor tasks. We tested audio-visual integration during height estimation while participants viewed video clips of countermovement jumps (CMJs), using sparse-sampling fMRI at 3 T. Employing the technique of "sonification", we created artificial auditory-visual motion events by transforming the ground reaction force of the CMJs into the auditory domain, modulating the frequency and amplitude of the standard pitch "A" (440 Hz). We combined these sonified movements with either concordant or discordant visual movement displays. We hypothesized that processing of concordant audio-visual stimuli would enhance neural activity in audio-visual integration areas. To test this, we compared four conditions: (1) unimodal visual, (2) unimodal auditory, (3) auditory+visual concordant, and (4) auditory+visual discordant. The unimodal conditions, when compared against each other, yielded the expected activation maxima in primary visual and primary auditory cortex, respectively. Enhanced activation was found bilaterally in area V5/MT for the concordant bimodal condition as compared to both unimodal conditions. This effect was specific to the concordant bimodal condition, as evidenced by a direct comparison between the concordant and discordant bimodal conditions. Using "sonification", we provide evidence that area V5/MT is modulated by concordant auditory input despite the artificial nature of the stimuli, which argues for a role of this region in multimodal motion integration beyond the purely visual domain. This may explain previous behavioral evidence of facilitatory effects exerted by auditory motion stimuli on the perception of visual motion, and may provide a basis for future applications in motor learning and rehabilitation.
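The sonification step described above, in which a force trace modulates the frequency and amplitude of a 440 Hz tone, can be illustrated with a minimal sketch. The abstract does not specify the mapping, so everything beyond the 440 Hz base pitch is an assumption made for illustration: the linear force-to-frequency and force-to-amplitude mapping, the `freq_span` of 220 Hz, the sampling rates, the function name `sonify`, and the synthetic force trace. It should not be read as the procedure used in the study.

```python
import math

def sonify(force, sample_rate=8000, force_rate=100,
           base_freq=440.0, freq_span=220.0):
    """Map a force trace to a frequency- and amplitude-modulated tone.

    Assumed (hypothetical) mapping: force is normalized to [0, 1];
    frequency = base_freq + freq_span * level, amplitude = level.
    """
    lo, hi = min(force), max(force)
    span = (hi - lo) or 1.0                      # avoid division by zero
    levels = [(f - lo) / span for f in force]

    samples_per_point = sample_rate // force_rate
    phase = 0.0                                  # accumulated to keep the tone continuous
    tone = []
    for level in levels:
        freq = base_freq + freq_span * level
        for _ in range(samples_per_point):
            phase += 2 * math.pi * freq / sample_rate
            tone.append(level * math.sin(phase))
    return tone

# Synthetic stand-in for a ground reaction force trace (not real data):
# 0.5 s sampled at 100 Hz, rising from 700 to a peak of 1600 (arbitrary units).
force = [700 + 900 * math.sin(math.pi * i / 50) ** 2 for i in range(50)]
tone = sonify(force)
```

Accumulating the phase, rather than computing `sin(2πft)` separately for each force sample, avoids audible clicks when the frequency changes between samples. The resulting list of samples in [-1, 1] could be written to a sound file with any audio library.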