1. INTRODUCTION
Given the wide audience reached by the YouTube (www.youtube.com) video sharing service, the candidates in the 2008 United States presidential election race have been making use of this medium, creating channels for the election video material they want to disseminate. The medium is so popular that YouTube has devoted a separate section, named YouChoose (www.youtube.com/youchoose), to election material. The corpus of election video material is peculiar in that it is rich in speech, much of it is long-form content (videos can be an hour long), and the information it contains is dense. This complicates the end user's task of finding relevant video material and navigating within a found video. To make this task easier, we developed an audio indexing system that allows both search and navigation of this video material based on content. This is similar to previous audio indexing work [1, 2, 3, 4]; however, here the focus is on video material, the content of the index is controlled indirectly by the video producers who manage the content on their channels, and, most importantly, the system is designed to scale so that it can be applied beyond the election video domain. New video material found on the candidate channels is transcribed, the videos are then indexed to facilitate search, and a user interface allows the end user to navigate through the search results. If the content managers of a channel take a video down, we remove it from our index. The user interface to this corpus is available for general use on the YouChoose site as well as a Google Labs product (labs.google.com/gaudi). This paper describes the development of this system in more detail. Section 2 describes the system running data acquisition, data processing, and the serving infrastructure for the user interface and for search and retrieval.