Although the proposed approach yields promising results, several open challenges remain. First, the ambiguity in the descriptions of sounds has not been fully resolved: an audio clip can be labeled with multiple semantic labels and onomatopoeias, whereas we considered only one semantic label and one onomatopoeia for each audio clip. Therefore, in future work, we will extend our approach so that an audio signal can have multiple tags. Real user annotations, which may include emotional information, will be included as well. Second, the two axes of the proposed iADL are not necessarily
orthogonal; there exists some degree of redundancy in representing a sound on the iADL, which was not addressed in this work. In future work, we will investigate the dependency between the two axes of the iADL: semantic labels and onomatopoeias.