IV. GESTURE GENERATION
We implemented an automatic gesture generation mechanism
in a virtual agent system. We created a gesture dictionary that
defines the typical shape and size of each object word. The
shape was automatically assigned using the method proposed
in Section III. The size parameter was manually assigned
using a scale from 1 to 5. For example, scale parameter 3
(middle) was assigned to basketball, and the scale for a sponge
ball was set to 1 (small). In addition, three types of iconic
gesture animation were pre-defined: drawing a circle, a rectangle,
and a straight line. We used Unity as the animation engine
(Fig. 2). The gesture generation process flow is described as
follows.
(1) When a sentence is input by the user, a morphological
analyzer detects nouns. For each noun, the system refers to the
gesture dictionary. If a noun matches a dictionary entry, the
gesture shape and the scale are determined based on the shape
and size parameters. Then, this information, together with the
sentence, is sent to the agent module.
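Step (1) can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the dictionary entries, field names, and the trivial token-based noun detector stand in for the real morphological analyzer and gesture dictionary.

```python
# Hypothetical gesture dictionary: object word -> shape and size (scale 1-5),
# mirroring the examples in the text (basketball = 3, sponge ball = 1).
GESTURE_DICT = {
    "basketball": {"shape": "circle", "scale": 3},   # middle
    "sponge":     {"shape": "circle", "scale": 1},   # small (sponge ball)
    "whiteboard": {"shape": "rectangle", "scale": 4},
}

def detect_nouns(sentence):
    """Placeholder for the morphological analyzer: here, any token that
    appears in the gesture dictionary counts as a matched noun."""
    return [t for t in sentence.lower().split() if t in GESTURE_DICT]

def build_gesture_request(sentence):
    """Look up each detected noun, attach its shape and scale parameters,
    and bundle them with the sentence for the agent module."""
    gestures = [{"word": noun,
                 "shape": GESTURE_DICT[noun]["shape"],
                 "scale": GESTURE_DICT[noun]["scale"]}
                for noun in detect_nouns(sentence)]
    return {"sentence": sentence, "gestures": gestures}

request = build_gesture_request("He threw the basketball to me")
print(request["gestures"])
# [{'word': 'basketball', 'shape': 'circle', 'scale': 3}]
```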
(2) The agent module has a text-to-speech function that uses the
Microsoft Speech API (SAPI) and can generate lip sync
animation. This module selects agent gesture animations
according to the gesture shape and the scale specified above.
The module also determines the animation time schedule
including gestures and visemes for lip sync. Thus, the system
allows our virtual agent to speak a sentence with lip sync and
appropriate gestures at the proper timing.
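The scheduling in step (2) might be sketched as below. The animation names, per-word speech duration, and timing model are assumptions for illustration; in the actual system, SAPI viseme events would drive the timeline rather than a fixed per-word duration.

```python
# Hypothetical mapping from gesture shapes to the three pre-defined
# Unity animations described in the text.
PREDEFINED_ANIMS = {"circle": "DrawCircle",
                    "rectangle": "DrawRectangle",
                    "line": "DrawStraightLine"}

WORD_DURATION = 0.4  # assumed average seconds of speech per word

def schedule(sentence, gestures):
    """Align each gesture animation with the spoken position of its word,
    so the gesture plays while the word is voiced. Visemes for lip sync
    would be interleaved on the same timeline by the TTS engine."""
    words = sentence.lower().split()
    timeline = []
    for g in gestures:
        start = words.index(g["word"]) * WORD_DURATION
        timeline.append({"anim": PREDEFINED_ANIMS[g["shape"]],
                         "scale": g["scale"],
                         "start_sec": round(start, 2)})
    return timeline

events = schedule("He threw the basketball to me",
                  [{"word": "basketball", "shape": "circle", "scale": 3}])
print(events)
# [{'anim': 'DrawCircle', 'scale': 3, 'start_sec': 1.2}]
```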