I. INTRODUCTION
Humanoid interfaces can employ bodily expressions much as
human beings do. One useful bodily expression is the hand
gesture. It is known that hand gestures produced by humanoid
robots and animated agents improve the comprehensibility of
conversational content, not only in face-to-face interaction but
also in human-humanoid interaction [1, 2]. Gesture generation
is therefore one of the most important tasks in building
humanoid interfaces. In particular, gestures are indispensable
when humanoids give directions or guide visitors through an
exhibition. However, automatic gesture generation is a very
difficult task. One possible approach is to manually define a
gesture dictionary that assigns a gesture shape to each word
and registers that shape as part of the word's dictionary entry
(a minimal sketch of such a dictionary is given below). The
limitation of this approach is that manually assigning a gesture
shape to every word would require enormous cost and effort.
Thus, in this study, we propose a
method that automatically generates iconic gestures [3] by
employing image processing and machine learning techniques.
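
For contrast with the automatic approach, the manually defined
dictionary baseline mentioned above might look like the following
minimal Python sketch; every word, joint name, and keyframe value
here is a hypothetical illustration rather than an entry from an
actual system.

from dataclasses import dataclass


@dataclass
class GestureShape:
    """A hand gesture shape stored as joint-angle keyframes."""
    handedness: str                    # "left", "right", or "both"
    keyframes: list[dict[str, float]]  # joint name -> angle in degrees


# Each word must be paired with a hand-crafted gesture shape.
# All entries below are hypothetical illustrations.
GESTURE_DICTIONARY: dict[str, GestureShape] = {
    "big": GestureShape(
        handedness="both",
        keyframes=[{"shoulder_pitch": -40.0, "elbow_roll": -60.0}],
    ),
    "round": GestureShape(
        handedness="right",
        keyframes=[{"wrist_yaw": 90.0}, {"wrist_yaw": -90.0}],
    ),
}


def lookup_gesture(word: str) -> GestureShape | None:
    """Return the registered gesture shape for a word, if any."""
    return GESTURE_DICTIONARY.get(word.lower())

Scaling such a table to an entire vocabulary by hand is precisely
the cost that motivates the automatic method proposed here.
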
We also build a virtual agent system that takes a sentence as
input and produces hand gesture animations synchronized with
synthetic speech.
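
To make the overall design concrete, the following sketch outlines
one plausible shape of such a pipeline, assuming the speech
synthesizer provides per-word timing information; all function
names, types, and timing values are hypothetical placeholders,
not the system's actual implementation.

from dataclasses import dataclass


@dataclass
class GestureAnimation:
    word: str                          # word the iconic gesture depicts
    start_time: float                  # onset within the speech (seconds)
    duration: float                    # length of the gesture (seconds)
    keyframes: list[dict[str, float]]  # joint name -> angle in degrees


def synthesize_speech(sentence: str) -> list[tuple[str, float, float]]:
    """Placeholder TTS step: return (word, start, end) timings.

    A real system would call a speech synthesizer here; this stub
    simply assumes each word takes 0.4 seconds.
    """
    timings, t = [], 0.0
    for word in sentence.split():
        timings.append((word, t, t + 0.4))
        t += 0.4
    return timings


def predict_iconic_gesture(word: str) -> list[dict[str, float]] | None:
    """Placeholder for the learned model that maps a word to an
    iconic gesture shape derived from image features.
    """
    # Hypothetical rule standing in for the trained model.
    if len(word) > 3:
        return [{"elbow_yaw": 45.0}, {"elbow_yaw": -45.0}]
    return None


def generate(sentence: str) -> list[GestureAnimation]:
    """Produce gesture animations time-aligned with synthetic speech."""
    animations = []
    for word, start, end in synthesize_speech(sentence):
        shape = predict_iconic_gesture(word)
        if shape is not None:
            animations.append(
                GestureAnimation(word, start, end - start, shape)
            )
    return animations


if __name__ == "__main__":
    for anim in generate("the balloon is big and round"):
        print(anim.word, anim.start_time, anim.duration)
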