Software adapts speech to ambient noise level

Press Release / 2.2.2016

Oldenburg. Loudspeaker announcements at railway stations are often incomprehensible, since the surroundings are noisy. With new software, the clarity of such announcements can be considerably improved. A microphone picks up ambient noise and adjusts the spoken messages perfectly to the noise level. Even calls over mobile phones will be understood more easily with the help of this technology.

If a freight train rattles past, passengers usually only understand about half of an announcement such as "The train to Frankfurt am Main will be departing today from platform...". Researchers from the Oldenburg-based Project Group Hearing, Speech and Audio Technology of the Fraunhofer Institute for Digital Media Technology IDMT have developed software that significantly improves the intelligibility of speech, including the voices of speakers at conferences and conversations on mobile phones.

 

Microphone analyzes noise levels

The trick of the ADAPT DRC software is that the ambient noise is continually analyzed via a microphone, and the speech is adjusted to it in real time. "It is not enough to simply make the voice louder over the loudspeaker or mobile phone to drown out the noise," says project manager Dr. Jan Rennies-Hochmuth. Such approaches are already used today in car radios: they make the voice louder, but not necessarily easier to understand, because at high volumes the speakers reach their limits and start to rattle. "Speech is much more complex," says Rennies-Hochmuth. Firstly, it is important to reinforce certain pitches, that is, frequencies, in a targeted fashion. Vowels are relatively low-pitched, long-drawn-out word components that are easy to understand. Consonants like "p", "t" and "k", however, are very short and have higher frequencies. Although they are very important for understanding what is said, they are generally harder to make out in noisy environments. For example, the consonants determine whether someone listening to an announcement in German hears the word "Kasse" or "Tasse" (in English, "checkout" or "cup"). "Our algorithms are able to prioritize certain frequencies and to reinforce, at the right time, precisely those which are particularly disturbed by the ambient noise," adds Rennies-Hochmuth.
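
The idea of reinforcing only those frequencies that the ambient noise disturbs can be illustrated with a small sketch. This is not the ADAPT DRC algorithm, whose details are not given here. It assumes that speech and noise levels are already available per frequency band in dB, and the function name `band_gains_db`, the target signal-to-noise ratio and the boost limit are all hypothetical choices made for illustration.

```python
def band_gains_db(speech_db, noise_db, target_snr_db=6.0, max_boost_db=12.0):
    """Return a boost in dB for each frequency band.

    Assumption for this sketch: a band is boosted only by the amount
    its signal-to-noise ratio falls short of target_snr_db, and never
    by more than max_boost_db, so the loudspeaker is not overdriven.
    """
    if len(speech_db) != len(noise_db):
        raise ValueError("speech_db and noise_db need one value per band")
    gains = []
    for speech, noise in zip(speech_db, noise_db):
        deficit = target_snr_db - (speech - noise)
        gains.append(min(max_boost_db, max(0.0, deficit)))
    return gains


# Three illustrative bands: low (vowels), mid, high (consonants such as "t", "k").
speech = [-20, -25, -35]   # speech level per band in dB (made-up values)
noise = [-40, -30, -32]    # microphone noise estimate per band in dB (made-up values)
print(band_gains_db(speech, noise))  # [0.0, 1.0, 9.0]
```

In this made-up example the low band is left untouched because the vowels are already well above the noise, while the high band that carries the consonants receives most of the boost.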

 

Amplifying quiet speech components

Secondly, the software takes into account that parts of the speech signal differ in volume. Since spoken language is composed of loud and quiet parts, experts use the term "voice dynamics". Speech intelligibility increases particularly when loud parts are systematically subdued and quiet parts are specifically amplified. This technique is called Dynamic Range Compression (DRC). This is also of interest if, for example, you make a call using a mobile phone when you are on a noisy street. The ADAPT DRC software has already been developed to the point of application maturity and is available to industrial partners. Since modern conference equipment and mobile phones already have built-in microphones, these devices already possess the technology needed to record the ambient noise. For speaker systems at railway stations or airports, additional microphones would first have to be installed.
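
The principle of Dynamic Range Compression mentioned above can be sketched in a few lines. This is a textbook-style illustration under simplifying assumptions, not the ADAPT DRC implementation: the signal is a list of samples between -1.0 and 1.0, the level is measured per frame, and the names `rms_db`, `drc_gain_db` and `compress` as well as all parameter values are hypothetical.

```python
import math


def rms_db(frame, floor_db=-100.0):
    """Root-mean-square level of a frame in dB relative to full scale (1.0)."""
    power = sum(x * x for x in frame) / len(frame) if frame else 0.0
    if power <= 0.0:
        return floor_db
    return max(floor_db, 10.0 * math.log10(power))


def drc_gain_db(level_db, target_db=-20.0, ratio=3.0, max_gain_db=20.0):
    """Gain in dB that pulls a frame's level toward target_db.

    Frames louder than the target get a negative gain (subdued),
    quieter frames a positive gain (amplified), capped at max_gain_db.
    """
    gain = (target_db - level_db) * (1.0 - 1.0 / ratio)
    return min(max_gain_db, gain)


def compress(samples, frame_len=160, target_db=-20.0, ratio=3.0):
    """Apply one gain per frame; a real system would also smooth the gains."""
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        gain_db = drc_gain_db(rms_db(frame), target_db, ratio)
        factor = 10.0 ** (gain_db / 20.0)
        out.extend(x * factor for x in frame)
    return out


# A loud frame followed by a quiet frame (made-up constant amplitudes).
signal = [0.5] * 160 + [0.01] * 160
result = compress(signal)
print(round(rms_db(signal[:160]) - rms_db(signal[160:]), 1))  # level difference before
print(round(rms_db(result[:160]) - rms_db(result[160:]), 1))  # level difference after
```

With a ratio of 3, the level difference between the loud and the quiet frame shrinks to one third of its original value in dB. A practical compressor would additionally smooth the gain over time so that the changes are not audible as pumping.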

 

People with impaired hearing also benefit

"As studies at the IDMT have shown, the new software also makes it easier for people with impaired hearing to understand loudspeaker announcements or the voices on their mobile phones. Usually, hearing loss differs from person to person, so hearing aids have to be adjusted individually. In this respect, we were pleasantly surprised that the use of the ADAPT DRC software to improve loudspeaker announcements or voices over mobile phones or headsets appears to generally improve the understanding of spoken language for people who are hard of hearing," says Dr. Jan Rennies-Hochmuth, Head of the "Personalized Hearing Systems" Group at IDMT.

 

About the Project Group Hearing, Speech and Audio Technology of the Fraunhofer Institute for Digital Media Technology IDMT

The goal of the Project Group Hearing, Speech and Audio Technology of the Fraunhofer IDMT is to implement scientific findings on auditory perception of normal and impaired hearing in technological applications. Its scientists carry out applied research and development on behalf of industrial companies and public institutions in the fields of telecommunications, multimedia, health and care services, building technology, transport, industrial production and security. The project group was established in Oldenburg in 2008 as a branch of the Fraunhofer Institute for Digital Media Technology IDMT. Through scientific collaborations, it has close links with the hearing research facilities in Oldenburg and is also a partner in the cluster of excellence "Hearing4all".