Fraunhofer presents various AI-based content analysis tools at International Broadcasting Convention 2019

Whether the task is operating an intelligent media archive, providing real-time subtitles for TV shows, or analyzing entire radio or TV programs, artificial intelligence makes it possible to analyze media content systematically. This not only eases the daily routines of media professionals but also makes it possible to offer personalized content to media consumers in a privacy-preserving manner. At IBC 2019, taking place September 13 to 17 in Amsterdam, the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS and the Fraunhofer Institute for Digital Media Technology IDMT will present various tools for content analysis at Fraunhofer’s booth B80 in hall 8.

Program analysis tool by Fraunhofer IDMT
© Fraunhofer IDMT
With the Fraunhofer IDMT program analysis, radio programs can be analyzed according to the proportion of speech and music, the duration and temporal placement of radio contributions and their repetitions.

Artificial intelligence (AI) and machine learning algorithms make it possible to extract metadata automatically from audio, image, video, and text material. Content analysis tools are increasingly in demand, as most data inventories are too large to be annotated manually. To meet this demand, the data analysis and media technology experts of Fraunhofer IAIS and Fraunhofer IDMT have developed special tools for various application scenarios and media formats. Such tools benefit not only media professionals in editorial departments or archives of TV networks and radio stations, but also consumers.

High-capacity Audio Mining system analyzes about 2,000 hours of archived content each day at one of Germany’s major national TV and radio broadcasters

Finding specific soundbites in recorded audio or video material can be very tedious for journalists and editors. With the help of Fraunhofer IAIS’s Audio Mining system, it is possible to search audio or video tracks systematically for original utterances made by people. The tool uses deep learning to convert speech to text in material recorded during or for a radio or TV show. “Each broadcast is completely available as a text file, in which single words or strings of words can be detected within a fraction of a second. For each word, the time it was uttered during the recording is exactly identifiable. Users may then mark a certain word or a string of words in the text in order to get to the respective soundbite and cut it out from the overall recording,” explains Christoph Schmidt, head of the Speech Technologies division at Fraunhofer IAIS.
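The word-level lookup described in the quote can be illustrated with a minimal sketch. It assumes the transcript is already available as a list of words with start and end times; the `Word` class, the `find_phrase` function and the sample data are hypothetical and are not part of the Audio Mining system.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


def find_phrase(words, phrase):
    """Return a (start, end) time pair for every occurrence of phrase."""
    target = [t.lower() for t in phrase.split()]
    # Normalize case and trailing punctuation so "power." matches "power".
    tokens = [w.text.lower().strip(".,!?") for w in words]
    hits = []
    if not target:
        return hits
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            hits.append((words[i].start, words[i + len(target) - 1].end))
    return hits


# Hypothetical time-aligned transcript of a short utterance.
transcript = [
    Word("We", 0.0, 0.2), Word("will", 0.2, 0.4), Word("phase", 0.4, 0.8),
    Word("out", 0.8, 1.0), Word("nuclear", 1.0, 1.5), Word("power.", 1.5, 2.0),
]
print(find_phrase(transcript, "nuclear power"))  # [(1.0, 2.0)]
```

The returned time pairs are what a cutting tool would need to extract the soundbite from the overall recording.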

Another feature of the system is speech recognition combined with speaker clustering, which distinguishes the utterances of different individuals within a recording. This makes searching for content within archives substantially easier, as the tool can also respond to complex user requests (such as searching for “statements by Angela Merkel on nuclear power phase-out”). Likewise, users can jump to a certain sequence of utterances (made by a certain person during a talk show, for example) with a single mouse click. Among the TV and radio networks that use the tool regularly is ARD, one of Germany’s major national TV and radio broadcasters, which also has a number of regional branches and where approximately 2,000 hours of archived material need to be analyzed every day.

The experts of Fraunhofer IAIS are currently developing their Audio Mining technologies further into a full dialog system, that is, an intelligent assistant capable of responding to spoken commands or questions.

Beyond the analysis of preproduced content, the technology also applies to live events in which speech needs to be converted to text in real time. For example, the live recognition tool is used in the regional parliament of the state of Saxony to provide real-time subtitles during debates. In the future, the tool could be used for different types of live TV broadcasts (such as one-on-one interviews or talk shows) as well as by providers of streaming services, relieving media professionals of time-consuming transcription work.

AI-based program analysis tool for radio broadcasters

Fraunhofer IDMT’s AI-based program analysis tool is designed to analyze not just individual items but entire programs broadcast by radio stations. For example, the tool can help trace how a certain news story is aired by different stations (i.e. at what time and with what modifications), show to what extent certain program elements recur during a day or week, or estimate how much a program differs from the programs of other stations. Radio stations can use this information to optimize their programs accordingly.

One of the features of Fraunhofer IDMT’s tool is partial matching, which detects the reuse of jingles, commercial spots, news stories, or pieces of music during a defined period of time. “By looking at the number of repetitions of elements and when they occur during a day, it is possible to make conclusions about a program’s content and to compare programs with each other,” explains Patrick Aichroth, head of the Media Distribution and Security group at Fraunhofer IDMT.
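The kind of repetition statistics mentioned in the quote can be sketched as follows. The sketch does not implement partial matching itself: it assumes the matching step has already produced a list of detections, each a pair of an element identifier and the hour of day it aired. The function names, the identifiers and the use of Jaccard similarity to compare stations are illustrative assumptions, not a description of the Fraunhofer IDMT tool.

```python
from collections import Counter, defaultdict


def repetition_profile(detections):
    """Count how often each element aired, in total and per hour of day.

    detections: iterable of (element_id, hour_of_day) pairs, assumed to
    come from an upstream matching step that is not shown here.
    """
    totals = Counter()
    by_hour = defaultdict(Counter)
    for element_id, hour in detections:
        totals[element_id] += 1
        by_hour[element_id][hour] += 1
    return totals, by_hour


def overlap(totals_a, totals_b):
    """Jaccard similarity of the sets of elements aired by two stations."""
    a, b = set(totals_a), set(totals_b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Hypothetical detections for two stations on the same day.
station_a = [("jingle_1", 7), ("jingle_1", 8), ("news_42", 7)]
station_b = [("news_42", 9), ("song_7", 9)]

totals_a, by_hour_a = repetition_profile(station_a)
totals_b, _ = repetition_profile(station_b)
print(totals_a["jingle_1"])         # 2
print(overlap(totals_a, totals_b))  # one shared element out of three
```

A low overlap value would indicate that two programs share few elements, which is one simple way to express how much they differ.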

For this purpose, the tool also includes music detection and music analysis, which automatically identify genre, tempo, and other attributes.

Privacy-aware personalization services

Radio and TV broadcasters increasingly aim to offer listeners and viewers personalized content, while user privacy has become critically important.

Recommendations are typically generated either on the basis of content metadata (content-based) or on the basis of collaborative user and usage analysis (collaborative filtering). Each method has its advantages: the former makes it possible, for example, to provide recommendations across media boundaries and formats (i.e. images, text, audio and video), while the latter takes individual user feedback into account.

To leverage the advantages of both methods, the experts of Fraunhofer IDMT combine them into so-called “hybrid recommendation” approaches. In addition, they use a patented method that enables personalization without infringing on the user’s data sovereignty: it strongly decouples real user identities from the pseudonyms used for analysis, thereby securely hiding the real identity.
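The two ideas in this section can be illustrated with a generic sketch. It is not the patented Fraunhofer IDMT method, whose details are not given here. The sketch assumes a simple weighted blend of content-based and collaborative scores, and a keyed hash (HMAC-SHA256) as one common way to derive a pseudonym that cannot be linked to the real user ID without the secret key. All names, weights and scores are hypothetical.

```python
import hashlib
import hmac


def pseudonym(user_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym; only the key holder can link it to user_id."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def hybrid_scores(content_scores, collaborative_scores, weight=0.5):
    """Blend two score dictionaries; items missing from one side score 0.0 there."""
    items = set(content_scores) | set(collaborative_scores)
    return {
        item: weight * content_scores.get(item, 0.0)
        + (1 - weight) * collaborative_scores.get(item, 0.0)
        for item in items
    }


# The analysis side only ever sees the pseudonym, never the real user ID.
key = b"hypothetical-secret-kept-outside-the-analysis-system"
user = pseudonym("alice@example.org", key)

content = {"a": 0.8, "b": 0.2}        # hypothetical content-based scores
collaborative = {"b": 0.9, "c": 0.4}  # hypothetical collaborative scores

scores = hybrid_scores(content, collaborative)
ranked = sorted(scores, key=scores.get, reverse=True)
print(user[:12], ranked)
```

In this arrangement the key would be held by a component separate from the recommender, so usage data stored under the pseudonym cannot be tied back to a person by the analysis system alone.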

Both Fraunhofer institutes are currently working on combining their technologies and tools to expand the range of possibilities and applications for the media industry.

About Fraunhofer IAIS

The Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS is one of the leading scientific institutes in the fields of Artificial Intelligence, Machine Learning and Big Data in Germany and Europe. With its approximately 300 employees, the institute supports companies in the optimization of products, services, processes and structures as well as in the development of new digital business models. Fraunhofer IAIS thus shapes the digital transformation of our working and living environment. In the Speech Technologies business unit, the institute develops technologies such as automatic speech recognition, speaker recognition and speech synthesis that are tailored to the needs of its customers.