"Image to sound" and "sound to image" transformation has been the subject of many studies. One of them was published by Peter B. L. Meijer, who developed a system consisting of a pipelined special-purpose computer connected to a regular television camera. He implemented an algorithm that preserves visual information by ensuring a one-to-one mapping between image and sound. The main advantage of this work is a low-cost, portable prototype conversion system whose power dissipation is suitable for battery operation. The author concludes that further development towards a practical application of the system still awaits a thorough evaluation with blind users.
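The source does not spell out Meijer's mapping, but a common column-scan scheme of this kind (vertical position mapped to pitch, brightness to amplitude, horizontal position to time) can be sketched as follows; the function name, parameters, and frequency range here are illustrative assumptions, not the original system's values:

```python
import numpy as np

def image_to_sound(image, duration=1.0, sample_rate=8000,
                   f_min=200.0, f_max=2000.0):
    """Map a grayscale image (2-D array, values in [0, 1]) to audio.

    Columns are scanned left to right over `duration` seconds; each
    row drives a sine oscillator whose frequency rises from f_min
    (bottom row) to f_max (top row) and whose amplitude equals the
    pixel brightness, so the image-to-sound correspondence is
    one-to-one up to sampling resolution.
    """
    rows, cols = image.shape
    samples_per_col = int(duration * sample_rate / cols)
    # one frequency per row, topmost row -> highest pitch
    freqs = np.linspace(f_max, f_min, rows)
    audio = []
    t0 = 0.0
    for c in range(cols):
        t = t0 + np.arange(samples_per_col) / sample_rate
        # superpose one sinusoid per row, weighted by brightness
        frame = np.sin(2 * np.pi * freqs[:, None] * t[None, :])
        audio.append((image[:, c:c + 1] * frame).sum(axis=0))
        t0 += samples_per_col / sample_rate
    signal = np.concatenate(audio)
    peak = np.abs(signal).max()
    return signal / peak if peak > 0 else signal

# A single bright pixel yields a pure tone during its column's time slot.
img = np.zeros((8, 8))
img[0, 0] = 1.0  # top-left pixel -> high pitch at the start of the scan
wave = image_to_sound(img)
```

Because the mapping is invertible in principle (each time-frequency cell corresponds to one pixel), no visual information is discarded beyond the chosen resolution.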
Another paper, by S. Matta et al. [2], presents a comparative study of existing methods. The authors examined various image-to-sound conversion techniques using a personalized testing process. Volunteers were asked to match binary images with their corresponding sounds, and the results showed that 75% of the volunteers could not associate the images with the correct sounds.
A. Cazan et al. [3] designed a low-power, portable image-to-sound conversion system to help visually impaired people. A two-phase testing procedure, involving road signs and plates adapted for blind people, was developed. The authors implemented a system based on existing algorithms and techniques. After an initial training phase, volunteers were asked to associate images with their corresponding sounds. The results showed that 40% of the volunteers could successfully identify the correct image-sound pairs.
Another paper, by D. Yang et al., describes a real-time assistive system for blind people. The device is named after its underlying algorithm, EBCOT, which stands for Embedded Block Coding with Optimized Truncation. The main idea of this algorithm is to apply two-tier coding with an optimal wavelet basis; embedded block coding and minimum rate-distortion are its main features. The system receives real-time images from a regular camera, processes them, and transfers the resulting sounds to headphones. Finally, the authors compared their system with existing compression methods such as SPIHT and EZW. The results showed that the EBCOT algorithm has the highest time efficiency among the existing methods.
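The "optimized truncation" at the heart of EBCOT can be illustrated with a simplified sketch of its rate-distortion optimization step: each independently coded block offers several candidate truncation points, and for a fixed Lagrange multiplier each block picks the point minimizing distortion plus weighted rate. The function name, data layout, and numbers below are illustrative assumptions, not the paper's implementation:

```python
def optimal_truncation(blocks, lam):
    """Simplified rate-distortion-optimal truncation.

    `blocks` is a list of code-blocks; each block is a list of
    candidate truncation points given as (rate, distortion) pairs.
    For a fixed Lagrange multiplier `lam`, each block independently
    picks the point minimizing D + lam * R -- the core idea behind
    EBCOT's optimized truncation.  Returns the chosen indices and
    the resulting total rate and total distortion.
    """
    choices, total_rate, total_dist = [], 0.0, 0.0
    for points in blocks:
        costs = [d + lam * r for (r, d) in points]
        best = min(range(len(points)), key=costs.__getitem__)
        choices.append(best)
        total_rate += points[best][0]
        total_dist += points[best][1]
    return choices, total_rate, total_dist

# Example: two code-blocks, each offering three truncation points.
blocks = [
    [(0, 100.0), (10, 40.0), (25, 5.0)],   # (rate, distortion)
    [(0, 80.0),  (5, 50.0),  (30, 2.0)],
]
# A small lambda favors low distortion; a large one favors low rate.
print(optimal_truncation(blocks, lam=1.0))
print(optimal_truncation(blocks, lam=10.0))
```

Sweeping the multiplier until the total rate meets a target budget yields an embedded bitstream that is near-optimal at every truncation level, which is what gives the scheme its favorable rate-distortion behavior.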
Another paper, by P. Codognet and G. Nouno, presents a real-time system that generates sound according to blinking lights placed on the tallest skyscrapers in Tokyo. The Red Light Spotters project combined an artistic creation process with embedded image-tracking and beat-prediction algorithms. The key idea was to achieve an emergent rhythmic process for musical creation and generative music. The results showed that the system could be applied to any other city under one condition: the presence of a rhythmic flow of lights. One of the studies related to sound-to-image mapping was written by K. Abe et al.