Technologies and Solutions

Fraunhofer IDMT develops state-of-the-art technologies for the automatic analysis of audio-visual material. They can be integrated into existing systems and applications, allowing broadcasters and digital archives to fully exploit the potential of their content.

A/V Matching and Phylogeny

Our fingerprinting-based technologies robustly detect full and partial matches, including very short segments, in audio and video datasets. They can be used for metadata propagation, rights clearance, de-duplication, finding content that stems from the same raw material and event, and for monitoring and analyzing radio or TV streams.
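As a rough illustration of how partial matching on fingerprints can work in general, the following minimal sketch uses offset voting: every hash shared by a query and a reference votes for a time offset, and the offset with the most votes is taken as the alignment. This is a generic textbook-style technique, not a description of the Fraunhofer IDMT implementation. The integer hash sequences and the function names `build_index` and `best_alignment` are assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_index(reference):
    """Map each fingerprint hash to the positions where it occurs in the reference."""
    index = defaultdict(list)
    for pos, h in enumerate(reference):
        index[h].append(pos)
    return index


def best_alignment(index, query):
    """Vote for (reference position - query position); return (offset, votes) or None."""
    votes = Counter()
    for qpos, h in enumerate(query):
        # .get() avoids creating empty entries for unknown hashes
        for rpos in index.get(h, ()):
            votes[rpos - qpos] += 1
    if not votes:
        return None
    return votes.most_common(1)[0]


# Hypothetical fingerprint hashes; real systems derive them from the signal.
reference = [17, 4, 99, 23, 8, 42, 15, 16, 77, 3]
query = [8, 42, 15, 500]  # short excerpt with one corrupted hash

index = build_index(reference)
print(best_alignment(index, query))
```

Because each hash votes independently, a short or partly corrupted excerpt can still be located as long as enough of its hashes agree on one offset.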

Audio Phylogeny Analysis automatically detects parent-child relationships in a set of near-duplicate audio items or segments. This helps to reconstruct the processing history within a set of transcoded copies and to identify the best-quality "root" item or segment, which is especially useful for de-duplication.
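To make the idea of a parent-child structure concrete, here is a deliberately simplified toy heuristic, not the Fraunhofer IDMT method. It assumes that each item already has a quality score and that pairwise dissimilarities are known; both inputs, the item names, and the function `build_phylogeny` are hypothetical. The highest-quality item is taken as the root, and every other item is attached to its most similar higher-quality item.

```python
def build_phylogeny(quality, distance):
    """Return (root, parent), where parent maps each non-root item to its assumed parent.

    quality:  dict item -> score (higher means closer to the original)
    distance: dict frozenset({a, b}) -> symmetric dissimilarity
    """
    items = sorted(quality, key=quality.get, reverse=True)
    root = items[0]
    parent = {}
    for i, child in enumerate(items[1:], start=1):
        candidates = items[:i]  # only items ranked as higher quality
        parent[child] = min(
            candidates, key=lambda p: distance[frozenset((p, child))]
        )
    return root, parent


# Hypothetical scores and dissimilarities for four copies of one recording.
quality = {"wav": 1.0, "mp3_256": 0.8, "mp3_128": 0.6, "aac": 0.4}
distance = {
    frozenset(("wav", "mp3_256")): 0.10,
    frozenset(("wav", "mp3_128")): 0.30,
    frozenset(("wav", "aac")): 0.50,
    frozenset(("mp3_256", "mp3_128")): 0.25,
    frozenset(("mp3_256", "aac")): 0.45,
    frozenset(("mp3_128", "aac")): 0.10,
}

root, parent = build_phylogeny(quality, distance)
print(root, parent)
```

In practice neither a reliable quality ranking nor a symmetric distance can be taken for granted, which is why the sketch should be read only as an illustration of the output structure: one root and a parent for every other item.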

Metadata Extraction and Enrichment

Fraunhofer IDMT provides various technologies for the automatic, efficient annotation of A/V content items to ensure that material can be found and used, or recommended to users. These include music annotation (e.g. detection of tempo, mood and genre), speech and music detection, video shot detection and key frame extraction, video motion analysis, and object/actor recognition.

Audio Forensics

The acquisition, coding and editing of audio material leave characteristic traces in the material, which can be analyzed with our audio forensics tools. These tools cover automatic detection of recording devices, detection of traces of previous encoding steps, and editing detection, all of which can support the assessment of the credibility of user-generated content.

Automatic Quality Control (QC)

The A/V Error Detection Libraries provide automatic detection of audiovisual errors for A/V production, content management, and archiving. Detectable errors include visual artifacts such as blocking, blurring, ringing, noise, freeze and field-order errors, as well as audio artifacts such as clipping, dropouts, phase shift, channel similarity, and traces of previous coding steps or low-bitrate encoding.
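As an illustration of what one such check can look like, the sketch below flags clipping as runs of consecutive samples at or near full scale. It is a minimal example under stated assumptions, not the API or algorithm of the A/V Error Detection Libraries: samples are assumed to be floats normalized to the range -1.0 to 1.0, and the function name `find_clipping` and the default values for `threshold` and `min_run` are hypothetical.

```python
def find_clipping(samples, threshold=0.99, min_run=3):
    """Return (start, end) index pairs (end exclusive) of runs of at least
    min_run consecutive samples whose absolute value is >= threshold."""
    runs = []
    start = None
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i  # a new candidate run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the signal.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples)))
    return runs


# Hypothetical normalized samples with two clipped regions and one isolated peak.
signal = [0.0, 0.5, 1.0, 1.0, 1.0, 0.4, -1.0, -0.2, -1.0, -1.0, -1.0, -1.0]
print(find_clipping(signal))
```

Requiring a minimum run length keeps a single legitimate full-scale peak from being reported as an error.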

Multimodal Analysis and Recommendation

To support meaningful queries, media systems often require many heterogeneous analysis components. This poses many challenges regarding efficient integration and flexible orchestration of extractors. We have conducted several projects in this domain and can support you with related challenges.

Media Security and Privacy

We have developed a unique approach that strongly decouples the virtual IDs used in a system from the underlying real IDs, thereby effectively hiding data sources. This can be used for privacy-preserving data analysis and recommendation, and for cooperative data usage among competitors.
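The general idea of separating virtual IDs from real IDs can be illustrated with keyed pseudonymization. The sketch below is a common generic technique and not the Fraunhofer IDMT approach, whose details are not described here. It assumes one secret key per usage context; the function name `virtual_id` and the example keys are hypothetical.

```python
import hashlib
import hmac


def virtual_id(real_id: str, context_key: bytes) -> str:
    """Derive a stable virtual ID from a real ID using HMAC-SHA256.

    Without the context key, the virtual ID cannot feasibly be mapped back
    to the real ID or linked to virtual IDs from another context.
    """
    return hmac.new(context_key, real_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical keys; in practice they would be generated randomly and kept secret.
analytics_key = b"example-key-for-analytics"
partner_key = b"example-key-for-partner"

same_context_a = virtual_id("user-42", analytics_key)
same_context_b = virtual_id("user-42", analytics_key)
other_context = virtual_id("user-42", partner_key)
print(same_context_a == same_context_b, same_context_a == other_context)
```

Within one context the same real ID always yields the same virtual ID, so usage data can still be aggregated, while different contexts see unrelated identifiers.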