the MMDAgent as a voice navigation model. The proposed
system divides the entire navigation route into sections,
each running from one intersection where the user turns right
or left to the next such intersection. We call each section a STEP section.
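
The division into STEP sections can be illustrated with a minimal sketch. The `Intersection` class, its `action` encoding, and the function `split_into_step_sections` are hypothetical names introduced here for illustration and are not taken from the proposed system. The sketch also assumes that the start point and the destination bound the first and last sections, which the text does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Intersection:
    name: str
    action: str  # "straight", "left", or "right" (hypothetical encoding)


def split_into_step_sections(route: List[Intersection]) -> List[List[Intersection]]:
    """Split a route into STEP sections, each ending at a turning intersection."""
    sections: List[List[Intersection]] = []
    current: List[Intersection] = []
    for node in route:
        current.append(node)
        # A right or left turn closes the current section; the next
        # section starts at that same intersection.
        if node.action in ("left", "right") and len(current) > 1:
            sections.append(current)
            current = [node]
    # Assumption: the remaining stretch up to the destination is a final section.
    if len(current) > 1:
        sections.append(current)
    return sections


route = [
    Intersection("A", "straight"),  # start point
    Intersection("B", "straight"),
    Intersection("C", "left"),
    Intersection("D", "straight"),
    Intersection("E", "right"),
    Intersection("F", "straight"),  # destination
]
step_sections = split_into_step_sections(route)
```

For this example route, the sketch yields three STEP sections: A to C, C to E, and E to F.
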
The proposed system navigates in the following order: