ISAD 2 – Informed Sound Activity Detection in Music and Audio Signals

The aim of the ISAD 2 project is the development of explainable and comprehensible deep-learning models to enable a better understanding of the structural and acoustic properties of sound sources – whether they are music or environmental sounds.

ISAD 2 builds on the results of the ISAD project. In ISAD, researchers explored fundamental Music Information Retrieval (MIR) techniques for detecting characteristic sound events in a given music recording. The focus was on informed approaches that exploit musical knowledge in the form of score information, instrument samples, or musically salient sections. Central tasks were detecting audio sections with a specific timbre or instrument, identifying monophonic themes in complex polyphonic music recordings, and classifying music genres or playing styles based on melodic contours. The developed recognition methods were tested and evaluated experimentally in complex music scenarios, including instrumental Western classical music, jazz, and opera recordings.

In ISAD 2, the second phase of the project, the goals will be significantly extended. Besides music data, environmental and ambient sounds will be considered as a second complex audio domain. As a central methodology, the advantages of model-based and data-driven methods will be explored and combined to learn task-specific sound event representations. In addition, research will be conducted into how sound events can be recorded and analyzed on different time scales and with respect to hierarchically arranged categories, such as membership in specific instrument families. Different time scales are important, for example, for recognizing repetition patterns of sounds in long-term recordings.


  • German Research Foundation (DFG) - AB 675/2-2