One of the main problems when working with audio recordings is labeling the data, since testing is impossible without properly labeled data. It is difficult to recognize all the notes played by all the instruments in a recording, and when numerous instruments are playing, the task becomes infeasible. Even if a score is available for a given piece of music, the actual performance differs from it because of human interpretation, imperfections of tempo, minor mistakes, and so on. Soft and short notes pose further difficulties, since they might not be heard, and grace notes leave some freedom to the performer; therefore, consecutive onsets may not correspond to consecutive notes in the score. As a result, some notes can be omitted. The problem of score following is addressed in [28].