Sounds surround us everywhere in our daily lives – as disturbing noise, as the soothing rustle of leaves, or as the warning sound of sirens on the street. Humans can not only distinguish important sounds from unimportant ones but also, drawing on experience, derive crucial information about their surroundings from what they hear.
"Machine listening" is a subfield of artificial intelligence that aims to replicate this human capability by automatically capturing and interpreting information from environmental sounds. This involves combining signal processing techniques and machine learning and developing algorithms for the analysis, source separation, and classification of music, speech, and environmental sounds. Source separation allows for the decomposition of complex acoustic scenes into their components, i.e., individual sound sources, while classification identifies sounds and assigns them to predefined sound sources or classes.
The resulting technologies and solutions are applied in a range of areas:
- Bioacoustics: Identifying animal species, studying behavioral patterns, or monitoring environmental impacts based on acoustic characteristics
- Noise monitoring: Recording noise data, identifying noise sources, and planning noise protection measures
- Logistics and traffic monitoring: Counting and classifying vehicles, analyzing traffic flows to improve emergency response planning, and implementing traffic management measures
- Safety surveillance (construction sites, public events): Detecting hazardous situations, vandalism, or break-ins acoustically
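
To make the classification step described above more concrete, the following is a minimal sketch of how signal processing and machine learning can be combined, not the method used in any particular system. It assumes synthetic sine tones in place of real recordings, and the names `band_energies`, `synth_tone`, `fit_centroids`, and `classify` as well as the class labels are hypothetical. The feature extraction (log energy per frequency band) and the nearest-centroid classifier were chosen only for brevity. Source separation is not covered here.

```python
import numpy as np

SAMPLE_RATE = 16_000  # Hz; assumed value for this toy example


def band_energies(signal, n_bands=8):
    """Signal processing step: log energy in equal-width frequency bands."""
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    bands = np.array_split(power, n_bands)
    return np.log(np.array([band.sum() for band in bands]) + 1e-12)


def synth_tone(freq_hz, rng, duration_s=0.25):
    """Hypothetical stand-in for a recording: a sine tone plus white noise."""
    t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
    return np.sin(2 * np.pi * freq_hz * t) + 0.3 * rng.standard_normal(t.size)


def fit_centroids(features, labels):
    """Learning step: store one mean feature vector per class."""
    return {str(c): features[labels == c].mean(axis=0) for c in np.unique(labels)}


def classify(feature, centroids):
    """Assign the predefined class whose centroid is closest to the feature."""
    return min(centroids, key=lambda c: np.linalg.norm(feature - centroids[c]))


# Build a small labeled training set from two made-up sound classes.
rng = np.random.default_rng(0)
class_freqs = {"low_hum": 200.0, "high_whistle": 2500.0}
train_x, train_y = [], []
for name, freq in class_freqs.items():
    for _ in range(10):
        train_x.append(band_energies(synth_tone(freq, rng)))
        train_y.append(name)

centroids = fit_centroids(np.array(train_x), np.array(train_y))

# Classify a new, unseen example.
prediction = classify(band_energies(synth_tone(2500.0, rng)), centroids)
print(prediction)
```

Real environmental sounds are far less distinct than two pure tones, so practical systems typically rely on richer time-frequency features and learned models; the sketch only illustrates the overall pipeline of feature extraction followed by assignment to predefined classes.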