The input data is
a one-dimensional combination of the instantaneous spectrum at
the power peak and the power pattern in the time domain. Because
the spectra of most environmental sounds change less markedly
over time than those of speech, this combination of power and
frequency patterns preserves the major features of environmental
sounds while drastically reducing the amount of data.
Two experiments were conducted using an original database
and a database created by the RWCP.
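
As a rough illustration of the input described above, the following sketch builds one such vector from a mono signal: it computes the frame-wise power pattern, takes the magnitude spectrum of the frame at the power peak, and concatenates the two. The frame length, hop size, Hann window, peak normalisation, and the names `frame_signal` and `build_input_vector` are assumptions made for this example only; they are not taken from the method itself.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def build_input_vector(x, frame_len=256, hop=128):
    """Concatenate the spectrum at the power peak with the power pattern.

    frame_len and hop are illustrative defaults (assumptions).
    Returns the combined 1-D vector and the index of the peak frame.
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)

    # Power pattern in the time domain: energy of each frame.
    power = np.sum(frames ** 2, axis=1)
    peak = int(np.argmax(power))

    # Instantaneous spectrum: magnitude spectrum of the peak-power frame.
    spectrum = np.abs(np.fft.rfft(frames[peak] * np.hanning(frame_len)))

    # Scale both parts to a maximum of 1 (assumed normalisation).
    spectrum = spectrum / (spectrum.max() + 1e-12)
    power = power / (power.max() + 1e-12)

    return np.concatenate([spectrum, power]), peak
```

With these assumed settings, a 2048-sample signal yields 129 spectral values followed by 15 power values, far fewer than a full spectrogram of the same signal would contain.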