Automatic Music Analysis

Audio and Visual Content Analysis

Audio signal processing and machine learning for music analysis

Audio signal processing and machine learning are revolutionizing music analysis. From audio matching to music annotation and similarity search, from automatic music transcription to music generation, new application possibilities are emerging for broadcast monitoring, music search and recommendation, music production, and music learning programs.

Our goal is to enable quick and customized access to musical content through our approaches and techniques for automatic music analysis. We are developing practical solutions that are applicable in various domains such as the music industry, entertainment, education, and music production. In addition to enhancing existing technologies, we aim to showcase new application possibilities for automatic music analysis and contribute to the evolution of algorithms and methods.

News and upcoming events

 

Press Release / 12.4.2024

Advertising monitoring for SWR radio programs

Our audio matching replaces manual checking of broadcast commercials

 

Event / 12.3.2024

DataTech 2024

Join our presentation »Digital Traces: Verification of Audio-Visual Content« at the Data Technology Seminar 2024 – EBU's annual flagship event for practitioners in data and AI for media.

 

New project

A musical question-and-answer game with AI

Development of an AI-based composition app

Understanding Music

How do I quickly find a suitable music piece in a large music catalog? Can I automatically receive recommendations for the perfect beat that harmonizes well with a music production I'm currently working on? Which programs in my archive are the most successful? These are typical questions where our technologies for automatic music analysis can help.

Audio signal processing and machine learning have fundamentally changed music analysis. The multidisciplinary research field "Music Information Retrieval" encompasses algorithms and techniques for extracting musical information from audio data and transforming it into interpretable formats. The results are applied in areas such as broadcast monitoring, music search and recommendation, music production, content tracking, and music education.

AI-based music analysis technologies

General challenges in automatic music analysis include processing large amounts of data, accounting for musical diversity and context, robustness to variations in recording quality, and efficient real-time processing for a wide range of applications.

Audio matching

Audio matching via audio fingerprinting enables the identification of specific audio recordings in music collections and streams. Media content is compared and matched based on acoustic fingerprints. Audio matching is used for analyzing music usage in broadcast monitoring, content tracking applications, archive maintenance, as well as in music search engines and recommendation systems.
 

At Fraunhofer IDMT, we research how to further improve the accuracy and efficiency of audio matching techniques in order to enable more precise detection and identification of media content.
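The core idea behind fingerprint-based audio matching can be illustrated with a toy sketch: each recording is reduced to a sequence of compact hashes, an inverted index maps hashes to their positions, and a query is identified by voting for consistent (track, time offset) pairs. All names, hash values, and the voting rule below are invented for illustration; real systems derive robust fingerprints from the audio spectrum.

```python
# Toy illustration of audio matching via fingerprint hashing (a sketch,
# not Fraunhofer IDMT's actual algorithm). We assume each recording has
# already been reduced to one coarse spectral hash per time frame.
from collections import Counter

def index_fingerprints(track_hashes):
    """Build an inverted index: hash -> list of (track_id, frame)."""
    index = {}
    for track_id, hashes in track_hashes.items():
        for frame, h in enumerate(hashes):
            index.setdefault(h, []).append((track_id, frame))
    return index

def match_query(index, query_hashes):
    """Vote on (track, time offset) pairs; a tall vote peak means a match."""
    votes = Counter()
    for q_frame, h in enumerate(query_hashes):
        for track_id, frame in index.get(h, []):
            votes[(track_id, frame - q_frame)] += 1
    if not votes:
        return None
    (track_id, offset), score = votes.most_common(1)[0]
    return track_id, offset, score

# A tiny invented database of three "tracks" (hash sequences); the query
# is an excerpt of track "b" starting at frame 2.
db = {"a": [1, 5, 3, 7, 2], "b": [4, 9, 6, 8, 6, 2], "c": [7, 7, 1, 3]}
index = index_fingerprints(db)
print(match_query(index, [6, 8, 6]))  # -> ('b', 2, 3)
```

The offset-voting step is what makes this robust: hashes may collide across tracks, but only the true source accumulates many votes at a single time offset.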

Annotation and similarity search for music

Annotation and similarity search for music facilitate the organization of music collections and simplify access to musical content. The use of metadata allows for versatile search and recommendation systems, automating the discovery of suitable music or musical elements. This is applicable, for example, in end-user streaming services or music production.


We are working on enhancing annotation and similarity search, particularly for large and diverse music collections, while also taking user preferences and contextual information into account.
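As a minimal illustration of similarity search over annotated music, the sketch below ranks catalog tracks by cosine similarity of their feature vectors. The track names, feature dimensions, and values are all hypothetical; a real system would use learned embeddings or rich tag metadata.

```python
# Minimal similarity-search sketch: rank tracks by cosine similarity of
# invented per-track feature vectors (e.g. tempo, energy, acousticness).
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def most_similar(catalog, query_vec, k=2):
    """Return the names of the k catalog tracks closest to the query."""
    ranked = sorted(catalog.items(),
                    key=lambda kv: cosine(kv[1], query_vec),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

catalog = {
    "calm_piano":   [0.2, 0.1, 0.9],
    "club_track":   [0.9, 0.95, 0.05],
    "indie_ballad": [0.4, 0.3, 0.7],
}
print(most_similar(catalog, [0.25, 0.15, 0.85]))  # -> ['calm_piano', 'indie_ballad']
```

In practice the same ranking machinery works whether the vectors come from editorial tags, automatic annotation, or neural embeddings; only the vector source changes.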

Automatic music transcription

Automatic music transcription involves converting music signals into symbolic music notation and extracting musical structures such as melodies, chords, and rhythms. These techniques are used in music learning programs, music game development, and music theoretical studies.


The specific challenges of automatic music transcription lie in capturing complex musical structures precisely, reliably, and in real time, even in polyphonic pieces or in the presence of background and ambient noise.
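One small, well-defined building block of transcription, mapping an estimated fundamental frequency to the nearest equal-tempered note, can be sketched as follows. This assumes the fundamental frequency has already been estimated from the signal, which is the genuinely hard part; the function and constants below are illustrative, not a production implementation.

```python
# Sketch: map a detected fundamental frequency (Hz) to the nearest
# equal-tempered note, using the standard MIDI convention (A4 = 69).
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(freq_hz, a4=440.0):
    """Return (note name, octave, cents deviation) for a frequency in Hz."""
    midi = 69 + 12 * math.log2(freq_hz / a4)   # fractional MIDI number
    nearest = round(midi)                       # nearest semitone
    cents = round((midi - nearest) * 100)       # deviation in cents
    return NOTE_NAMES[nearest % 12], nearest // 12 - 1, cents

print(freq_to_note(440.0))   # -> ('A', 4, 0)
print(freq_to_note(261.63))  # -> ('C', 4, 0)
```

The cents deviation is what a tuner or an intonation-analysis tool would report; a full transcription system applies this mapping frame by frame on top of pitch tracking, onset detection, and voice separation.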

Automatic music generation

Automatic music generation involves the development of algorithms and AI systems capable of creating original musical pieces or parts thereof. It provides automated support in the music production process and during live performances, for instance by generating melodies based on harmonies. This emerging field introduces new creative approaches to music composition and production.


However, automatic music generation is still a relatively young research field and requires further progress to produce realistic and coherent musical results that meet the expectations of music creators and listeners. At Fraunhofer IDMT, we are researching ways to make the AI composition process transparent and controllable. Our aim is to support the creative collaboration between music creators and AI.

 

Research project

Music Automaton (Musik-Automat)

Development of an AI-based composition app

 

Research project

ISAD 2

Development of explainable and comprehensible deep learning models for a better understanding of the sound source characteristics of music, environmental sounds, and ambient sounds

 

Research project

AI4Media

Center of excellence for AI in media – Our contributions: Audio forensics, audio provenance analysis, music analysis, privacy and recommendation systems

 

Reference project

Jamahook – AI Sound Matching

Search engine for loops and beats based on SoundsLike

 

Reference project

SWR Media Services

Audio matching software for automatic advertising monitoring of SWR radio programs

 

Research project

MusicBricks

Musical Building Blocks for Digital Makers and Content Creators: transferring state-of-the-art ICT to creative SMEs in order to develop novel business models

 

Research project

SyncGlobal

Global music search applied to cross-modal synchronization with video content

 

Research project

GlobalMusic2one

Adaptive, hybrid search technologies for global music portfolios

Research project

MuSEc

Audio analysis and PET for the MusicDNA sustainable ecosystem

 

Research project

Emused

Interactive app for learning how to improvise on a musical instrument

 

Research project

MiCO

Platform for multimodal and context-based analysis, into which a wide variety of analysis components for different media types can be integrated

Products

 

SoundsLike

AI-based Tagging and Search for Large Music Catalogs

 

Audio Matching

Detect a given audio query within a stream or file – even under noisy conditions or with a very short query

 

Speech and Music Detector

Software tool for automatic detection of music and speech sequences to optimize broadcasting programs or provide accurate accounting for copyright agencies

 

Automatic Music Transcription

Convert musical signals into notes for music games and music learning programs

Year | Title | Author(s) | Publication Type
2023 | An Introduction to Unsupervised Domain Adaptation in Sound and Music Processing | Bittner, Franca; Abeßer, Jakob | Conference Paper
2023 | Automatic Note-Level Score-to-Performance Alignments in the ASAP Dataset | Peter, Silvan David; Cancino-Chacón, Carlos Eduardo; Foscarin, Francesco; McLeod, Andrew; Henkel, Florian; Karystinaios, Emmanouil; Widmer, Gerhard | Journal Article
2023 | Uncertainty in Semi-Supervised Audio Classification - A Novel Extension for FixMatch | Grollmisch, Sascha; Cano, Estefanía; Lukashevich, Hanna; Abeßer, Jakob | Conference Paper
2023 | Knowledge Transfer from Neural Networks for Speech Music Classification | Kehling, Christian; Cano, Estefanía | Conference Paper
2023 | How reliable are posterior class probabilities in automatic music classification? | Lukashevich, Hanna; Grollmisch, Sascha; Abeßer, Jakob; Stober, Sebastian; Bös, Joachim | Conference Paper
2023 | Deep Learning-Based Music Instrument Recognition: Exploring Learned Feature Representations | Taenzer, Michael; Mimilakis, Stylianos Ioannis; Abeßer, Jakob | Conference Paper
2022 | Multi-input Architecture and Disentangled Representation Learning for Multi-dimensional Modeling of Music Similarity | Ribecky, Sebastian; Abeßer, Jakob; Lukashevich, Hanna | Journal Article
2022 | JSD: A Dataset for Structure Analysis in Jazz Music | Balke, Stefan; Reck, Julian; Weiß, Christof; Abeßer, Jakob; Müller, Meinard | Journal Article
2022 | Periodicity Pitch Perception Part III: Sensibility and Pachinko Volatility | Feldhoff, F.; Toepfer, H.; Harczos, Tamás; Klefenz, Frank | Journal Article
2022 | Multi-pitch Estimation meets Microphone Mismatch: Applicability of Domain Adaptation | Bittner, Franca; Gonzalez Rodriguez, Marcel; Richter, Maike; Lukashevich, Hanna; Abeßer, Jakob | Conference Paper
2022 | Three Metrics for Musical Chord Label Evaluation | McLeod, Andrew; Suermondt, Xavier; Rammos, Yannis; Herff, Steffen A.; Rohrmeier, Martin A. | Conference Paper
2022 | Towards Interpreting and Improving the Latent Space for Musical Chord Recognition | Nadar, Christon-Ragavan; Taenzer, Michael; Abeßer, Jakob | Conference Paper
2021 | A Benchmark Dataset to Study Microphone Mismatch Conditions for Piano Multipitch Estimation on Mobile Devices | Abeßer, Jakob; Bittner, Franca; Richter, Maike; Gonzalez Rodriguez, Marcel; Lukashevich, Hanna | Conference Paper
2021 | Jazz Bass Transcription Using a U-Net Architecture | Abeßer, J.; Müller, M. | Journal Article
2021 | A Novel Dataset for Time-Dependent Harmonic Similarity between Chord Sequences | Bittner, Franca; Abeßer, Jakob; Nadar, Christon-Ragavan; Lukashevich, Hanna; Kramer, Patrick | Presentation
2021 | Predominant Jazz Instrument Recognition. Empirical Studies on Neural Network Architectures | Mimilakis, Stylianos I.; Abeßer, Jakob; Chauhan, Jaydeep; Pillai, Prateek Pradeep; Taenzer, Michael | Conference Paper
2021 | Improving Semi-Supervised Learning for Audio Classification with FixMatch | Grollmisch, Sascha; Cano, Estefanía | Journal Article
2021 | Informing Piano Multi-Pitch Estimation with Inferred Local Polyphony Based on Convolutional Neural Networks | Taenzer, Michael; Mimilakis, Stylianos I.; Abeßer, Jakob | Journal Article
2021 | Towards Deep Learning Strategies for Transcribing Electroacoustic Music | Abeßer, J.; Nowakowski, M.; Weiß, C. | Conference Paper
2021 | Ensemble Size Classification in Colombian Andean String Music Recordings | Grollmisch, S.; Cano, E.; Mora Ángel, F.; López Gil, G. | Conference Paper
2020 | Cross-Version Singing Voice Detection in Opera Recordings: Challenges for Supervised Learning | Mimilakis, Stylianos Ioannis; Weiss, Christof; Arifi-Müller, Vlora; Abeßer, Jakob; Müller, Meinard | Conference Paper
2019 | Automatic Chord Recognition in Music Education Applications | Grollmisch, Sascha; Cano, Estefanía | Conference Paper
2019 | Analysis and Visualisation of Music | Wunsche, B.C.; Müller, S.; Tänzer, M. | Conference Paper
2019 | ACMUS-MIR: A new annotated data set of Andean Colombian music | Mora-Ángel, Fernando; López Gil, Gustavo A.; Cano, Estefanía; Grollmisch, Sascha | Conference Paper
2019 | Towards CNN-based Acoustic Modeling of Seventh Chords for Automatic Chord Recognition | Nadar, Christon-Ragavan; Abeßer, Jakob; Grollmisch, Sascha | Conference Paper
2019 | Musical Source Separation | Cano, E.; FitzGerald, D.; Liutkus, A.; Plumbley, M.D.; Stöter, F.-R. | Journal Article
2019 | Ensemble size classification in Colombian Andean string music recordings | Grollmisch, Sascha; Cano, Estefanía; Mora-Ángel, Fernando; López Gil, Gustavo A. | Conference Paper
2019 | Microtiming analysis in traditional Shetland fiddle music | Cano, E.; Beveridge, S. | Conference Paper
2019 | Investigating CNN-based Instrument Family Recognition for Western Classical Music Recordings | Mimilakis, Stylianos I.; Taenzer, Michael; Abeßer, Jakob; Weiss, Christof; Müller, Meinard; Lukashevich, Hanna | Conference Paper
2019 | Fundamental Frequency Contour Classification: A Comparison Between Hand-Crafted and CNN-Based Features | Abeßer, Jakob; Müller, Meinard | Conference Paper
2018 | Harmonic-percussive source separation with deep neural networks and phase recovery | Mimilakis, S.I.; Drossos, K.; Magron, P.; Virtanen, T. | Conference Paper
2018 | Reducing interference with phase recovery in DNN-based monaural singing voice separation | Mimilakis, S.I.; Magron, P.; Drossos, K.; Virtanen, T. | Conference Paper
2018 | Computational Corpus Analysis: A Case Study on Jazz Solos | Weiß, Christof; Balke, Stefan; Abeßer, Jakob; Müller, Meinard | Conference Paper
2018 | Jazz Solo Instrument Classification with Convolutional Neural Networks, Source Separation, and Transfer Learning | Gomez, Juan S.; Abeßer, Jakob; Cano, Estefanía | Conference Paper
2018 | Retrieval of Song Lyrics from Sung Queries | Kruspe, A.M.; Goto, M. | Conference Paper
2018 | Music Technology and Education | Cano, E.; Dittmar, C.; Abeßer, J.; Kehling, C.; Grollmisch, S. | Book Article
2018 | MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation | Drossos, K.; Serdyuk, D.; Virtanen, T.; Bengio, Y.; Mimilakis, S.I.; Schuller, G. | Conference Paper
2018 | Improving Bass Saliency Estimation using Label Propagation and Transfer Learning | Abeßer, Jakob; Balke, Stefan; Müller, Meinard | Conference Paper
2018 | Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask | Mimilakis, S.I.; Drossos, K.; Santos, J.F.; Virtanen, T.; Bengio, Y.; Schuller, G. | Conference Paper
2018 | The dimensions of perceptual quality of sound source separation | Cano, Estefanía; Liebetrau, Judith; Fitzgerald, Derry; Brandenburg, Karlheinz | Conference Paper
2017 | Computational methods for tonality-based style analysis of classical music audio recordings | Weiß, Christof | Doctoral Thesis
2017 | Deep learning for jazz walking bass transcription | Abeßer, Jakob; Balke, Stefan; Frieler, Klaus; Pfleiderer, Martin; Müller, Meinard | Conference Paper
2017 | Soundslike - automatic content-based music annotation and recommendation for large databases | Grollmisch, Sascha; Lukashevich, Hanna | Conference Paper
2017 | Score-informed analysis of tuning, intonation, pitch modulation, and dynamics in jazz solos | Abeßer, Jakob; Frieler, Klaus; Cano, Estefanía; Pfleiderer, Martin; Zaddach, Wolf-Georg | Journal Article
2017 | Instrument-centered music transcription of solo bass guitar recordings | Abeßer, Jakob; Schuller, Gerald | Journal Article
2017 | A recurrent encoder-decoder approach with skip-filtering connections for monaural singing voice separation | Mimilakis, S.I.; Drossos, K.; Virtanen, T.; Schuller, G. | Conference Paper
2017 | Data-driven solo voice enhancement for jazz music retrieval | Balke, Stefan; Dittmar, Christian; Abeßer, Jakob; Müller, Meinard | Conference Paper
2017 | Exploring sound source separation for acoustic condition monitoring in industrial scenarios | Cano, Estefanía; Nowak, Johannes; Grollmisch, Sascha | Conference Paper
2017 | Automatic speech/music discrimination for broadcast signals | Kruspe, Anna M.; Zapf, Dominik; Lukashevich, Hanna | Conference Paper
2016 | Towards evaluating multiple predominant melody annotations in jazz recordings | Balke, Stefan; Driedger, Jonathan; Abeßer, Jakob; Dittmar, Christian; Müller, Meinard | Conference Paper
2016 | Retrieval of textual song lyrics from sung inputs | Kruspe, Anna M. | Conference Paper
2016 | Automatic best take detection for electric guitar and vocal studio recordings | Bönsel, Carsten; Abeßer, Jakob; Grollmisch, Sascha; Mimilakis, Stylianos-Ioannis | Conference Paper
2016 | New sonorities for jazz recordings: Separation and mixing using deep neural networks | Cano, Estefanía; Abeßer, Jakob; Schuller, Gerald; Mimilakis, Stylianos-Ioannis | Conference Paper
2016 | Bootstrapping a system for phoneme recognition and keyword spotting in unaccompanied singing | Kruspe, Anna M. | Conference Paper
2016 | Midlevel analysis of monophonic jazz solos: A new approach to the study of improvisation | Frieler, Klaus; Pfleiderer, Martin; Zaddach, Wolf-Georg; Abeßer, Jakob | Journal Article
2016 | Phonotactic language identification for singing | Kruspe, Anna M. | Conference Paper
2015 | Training phoneme models for singing with "songified" speech data | Kruspe, Anna M. | Conference Paper
2015 | On the impact of key detection performance for identifying classical music styles | Weiß, Christof; Schaab, Maximilian | Conference Paper
2015 | Score-informed analysis of intonation and pitch modulation in jazz solos | Abeßer, Jakob; Cano, Estefanía; Frieler, Klaus; Pfleiderer, Martin; Zaddach, Wolf-Georg | Conference Paper
2015 | Tonal complexity features for style classification of classical music | Weiß, Christof; Müller, Meinard | Conference Paper
2015 | Keyword spotting in singing with duration-modeled HMMs | Kruspe, Anna M. | Conference Paper
2014 | Pitch-informed solo and accompaniment separation towards its use in music education applications | Cano, Estefanía; Schuller, Gerald; Dittmar, Christian | Journal Article
2014 | Automatic style classification of jazz records with respect to rhythm, tempo, and tonality | Eppler, Arndt; Männchen, Andreas; Abeßer, Jakob; Weiss, Christof; Frieler, Klaus | Conference Paper
2014 | Improving singing language identification through i-vector extraction | Kruspe, Anna M. | Conference Paper
2014 | Phase-based harmonic/percussive separation | Cano, Estefanía; Plumbley, Mark; Dittmar, Christian | Conference Paper
2014 | Performer profiling as a method of examining the transmission of Scottish traditional music | Beveridge, Scott; Gibson, Ronnie; Cano, Estefanía | Conference Paper
2014 | Score-informed tracking and contextual analysis of fundamental frequency contours in trumpet and saxophone jazz solos | Abeßer, Jakob; Pfleiderer, Martin; Frieler, Klaus; Zaddach, Wolf-Georg | Conference Paper
2014 | A GMM Approach to Singing Language Identification | Kruspe, Anna M.; Abeßer, Jakob; Dittmar, Christian | Conference Paper
2014 | Timbre-invariant audio features for style analysis of classical music | Weiss, Christof; Mauch, Matthias; Dixon, Simon | Conference Paper
2014 | Confidence measures in automatic music classification | Lukashevich, Hanna | Conference Paper
2014 | Keyword spotting in a-capella singing | Kruspe, Anna M. | Conference Paper
2014 | Automatic tablature transcription of electric guitar recordings by estimation of score- and instrument-related parameters | Kehling, Christian; Abeßer, Jakob; Dittmar, Christian; Schuller, Gerald | Conference Paper
2014 | Real-time transcription and separation of drum recordings based on NMF decomposition | Dittmar, Christian; Gärtner, Daniel | Conference Paper
2014 | Exploring phrase form structures. Pt.II: Monophonic jazz solos | Frieler, Klaus; Zaddach, Wolf-Georg; Abeßer, Jakob | Conference Paper
2014 | Dynamics in jazz improvisation - score-informed estimation and contextual analysis of tone intensities in trumpet and saxophone solos | Abeßer, Jakob; Cano, Estefanía; Frieler, Klaus; Pfleiderer, Martin | Conference Paper
2014 | A mid-level approach to local tonality analysis: Extracting key signatures from audio | Weiss, C.; Cano, E.; Lukashevich, Hanna | Conference Paper
2014 | A multiple-expert framework for instrument recognition | Abeßer, J.; Dittmar, C.; Lukashevich, H.; Grasis, M. | Conference Paper
2014 | Automatic competency assessment of rhythm performances of ninth-grade and tenth-grade pupils | Abeßer, J.; Dittmar, C.; Grollmisch, S.; Hasselhorn, J.; Lehmann, A. | Conference Paper
2014 | Quantifying and visualizing tonal complexity | Weiss, Christof; Müller, Meinard | Conference Paper
2014 | Chroma-based scale matching for audio tonality analysis | Weiss, Christof; Habryka, Julian | Conference Paper
2011 | Server based pitch detection for web applications | Dittmar, C.; Grollmisch, S.; Cano, E.; Dressler, K. | Conference Paper
2010 | Automatic Detection of Audio Effects in Guitar and Bass Recordings | Abeßer, J.; Stein, M.; Dittmar, C.; Schuller, G. | Conference Paper
2009 | Feature Selection vs. Feature Space Transformation in Music Genre Classification Framework | Lukashevich, H. | Conference Paper
2009 | Feature selection vs. Feature Space Transformation in automatic music genre classification tasks | Lukashevich, Hanna | Conference Paper
2004 | Further steps towards drum transcription of polyphonic music | Dittmar, Christian; Uhle, Christian | Conference Paper
2003 | Using a Physiological Ear Model for Automatic Melody Transcription and Sound Source Recognition | Heinz, T.; Brückmann, A. | Conference Paper
This list has been generated from the publication platform Fraunhofer-Publica