Figure 2: Example of topics associated with a video,
and their corresponding weights.
and compute a vector dot product to determine the similarity
of the pair. Related videos are
then ranked by their similarity score to the watch video.
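The scoring and ranking step can be illustrated with a minimal sketch. It assumes each video is stored as a sparse mapping from topic to weight. The function names, topic labels and weights below are hypothetical and chosen only for illustration; they are not taken from the system described here.

```python
def dot(u, v):
    """Dot product of two sparse topic-weight vectors (dict: topic -> weight)."""
    # Iterate over the shorter vector; only shared topics contribute.
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v[t] for t, w in u.items() if t in v)


def rank_related(watch, candidates):
    """Return (video_id, score) pairs sorted by descending similarity."""
    scored = [(vid, dot(watch, vec)) for vid, vec in candidates.items()]
    # Drop candidates that share no topic with the watch video.
    scored = [(vid, s) for vid, s in scored if s > 0]
    return sorted(scored, key=lambda p: (-p[1], p[0]))


# Hypothetical topic weights, for illustration only.
watch = {"guitar": 0.8, "blues": 0.5}
candidates = {
    "v1": {"guitar": 0.5, "jazz": 0.4},
    "v2": {"blues": 1.0, "guitar": 1.0},
    "v3": {"cooking": 0.9},
}
ranked = rank_related(watch, candidates)
```

In this toy example v2 shares two topics with the watch video and ranks above v1, while v3 shares none and is dropped.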
2.2 Topics Versus Co-View Graph
It is important to note that the topical video representation
described in the previous section has very different
characteristics from the co-view video representation, used
in previous work on related video discovery [7]. In the co-view
video representation, each video is represented by a
node in the co-view graph. A node is then linked to the
other nodes in the graph if it is often viewed with them in
the same session [7]. In this approach, related videos are
suggested from the nodes that are in proximity to
the watch video in the co-view graph.
Therefore, in the co-view approach two videos are potentially
related if and only if they have a strong connection
in the co-view graph, i.e., they were often watched in the
same session. This approach works well for popular videos
with many views and high node connectivity. However, it
is much less reliable for videos with few or no views. For
these videos, using the co-view approach may lead to spurious
results, or yield no candidates for suggestions.
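The sparsity problem can be seen in a small sketch of a co-view graph. This is an assumed simplification, not the construction used in [7]: edges are raw co-occurrence counts over sessions, and the `min_coviews` threshold, the function names and the session data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_coview_graph(sessions, min_coviews=2):
    """Link two videos if they co-occur in at least min_coviews sessions."""
    counts = Counter()
    for session in sessions:
        # Count each unordered pair once per session.
        for a, b in combinations(sorted(set(session)), 2):
            counts[(a, b)] += 1
    graph = {}
    for (a, b), n in counts.items():
        if n >= min_coviews:
            graph.setdefault(a, {})[b] = n
            graph.setdefault(b, {})[a] = n
    return graph


def coview_candidates(graph, watch_id):
    """Neighbours of the watch video, strongest connection first."""
    neighbours = graph.get(watch_id, {})
    return sorted(neighbours, key=lambda v: (-neighbours[v], v))


# Hypothetical viewing sessions, for illustration only.
sessions = [
    ["popular_a", "popular_b", "popular_c"],
    ["popular_a", "popular_b"],
    ["popular_a", "popular_c"],
    ["popular_b", "new_video"],
]
graph = build_coview_graph(sessions)
popular_suggestions = coview_candidates(graph, "popular_a")
new_suggestions = coview_candidates(graph, "new_video")
```

Here the well-connected video receives candidates, whereas the video seen in a single session falls below the threshold and receives none. Lowering the threshold to 1 would instead admit an edge supported by a single session, which is the kind of weak evidence that can produce spurious results.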
In contrast, the topical video representation does not require
explicit co-view information to deem two videos related.
Instead, if two videos share (some of) the same topics
they will be related, even if they were never watched in the
same session before.
This semantic approach enables discovery of fresh, diverse
and relevant content. It has been shown that implicit
user feedback is often influenced by presentation bias [32],
and click metrics do not fully correlate with relevance [21].
Topic-based video suggestion can therefore limit the rich-get-richer
effect that can arise when relying solely on
co-view information and disregarding the video content.
2.3 Topic Indexing
Video representation using topic weight vectors is akin
to the bag-of-words document representation often used in
information retrieval applications. Therefore, we index
the topic video representations in a standard inverted index
structure [19] for efficient retrieval. Each video is represented
using a topic weight vector, and indexed as an entry
in the posting lists of its corresponding topics.
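A minimal in-memory sketch of such an index is given below. It assumes posting lists of (video, weight) pairs keyed by topic and term-at-a-time score accumulation at query time. The function names, the `exclude` and `k` parameters and the example data are hypothetical; a production index would add compression, sharding and pruning.

```python
from collections import defaultdict


def build_index(videos):
    """Map each topic to a posting list of (video_id, weight) entries."""
    index = defaultdict(list)
    for vid, topics in videos.items():
        for topic, weight in topics.items():
            index[topic].append((vid, weight))
    return index


def retrieve(index, watch_topics, exclude=None, k=10):
    """Accumulate dot-product scores by walking the watch video's posting lists."""
    scores = defaultdict(float)
    for topic, w in watch_topics.items():
        for vid, weight in index.get(topic, ()):
            if vid != exclude:
                scores[vid] += w * weight
    return sorted(scores.items(), key=lambda p: (-p[1], p[0]))[:k]


# Hypothetical topic weights, for illustration only.
videos = {
    "match_highlights": {"soccer": 0.9, "world cup": 0.6},
    "keeper_saves": {"soccer": 0.5, "goalkeeper": 0.8},
    "pasta_recipe": {"cooking": 1.0},
}
index = build_index(videos)
results = retrieve(index, {"soccer": 1.0, "world cup": 0.5})
```

Only the posting lists of the watch video's own topics are read, so videos that share no topic with it (here pasta_recipe) are never scored.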