MVA studies in the L2 reading context have investigated the effects of annotations, in either dynamic video or static picture mode, on L2 vocabulary learning. While Al-Seghayer (2001) demonstrated the efficacy of the dynamic video mode over the static picture mode, Chun and Plass (1996) found the reverse, and Akbulut (2007) observed no significant difference between the two. These contradictory findings could be attributed either to the different videos and images used or to the different assessment tools administered (Mohsen & Balakumar, 2011). MVA studies in the listening context, however, have not yet compared the efficacy of these modes as aids to L2 vocabulary acquisition.