Expressing an abstract textual representation of a video clip in terms of these features (e.g., many systems used color-based filtering) is difficult. Letting a human identify the right clip by visual inspection eliminates this “gap”, thus making it more likely that the correct clip is found. What is surprising, though, is how fast the expert could perform this task, especially given the huge size of the storyboard, which spans almost 600 screens in total. Even the third place in the visual tasks (Fig. 5, top) is remarkable, especially considering that eight of ten tasks were solved correctly (tied with the 2nd system and only one correct solution short of the 1st one).