SI-Live - Technology patented by Fraunhofer IDMT

Better communication in phone calls and video conferences

© Fraunhofer IDMT
Demo live display of speech intelligibility

Optimal voice transmission and seamless transitions between talkers are essential for telephone and video conferencing systems to achieve high user acceptance and a good user experience. In everyday work, both are crucial for efficient and successful communication. However, speech intelligibility is often impaired, for example by background noise in the microphone signal, sub-optimal technical settings or interference in the transmission channel. A particular problem in telecommunication is that talkers are often unaware that they cannot be understood well at the other end of the line. This often leads to interruptions, because communication partners must actively point out that there are intelligibility problems.

Live analysis & display of speech intelligibility

The patented software solution SI-Live, developed at Fraunhofer IDMT in Oldenburg, analyzes the current intelligibility of the microphone signal in real time to estimate whether the outgoing speech is clear enough for the conversation partners. The result can be displayed and converted into messages to the talker, prompting him or her, for example, to modify the technical settings, change the microphone position, remove interfering noise or speak more clearly. This enables the talker to take action without having to be interrupted by the communication partners.

The foundation of SI-Live is a multidimensional analysis of the outgoing voice signal based on state-of-the-art perceptual models. These models detect a variety of possible speech disturbances and assess their impact on communication quality. On the basis of the recorded microphone signal, the SI-Live algorithm estimates physical parameters, such as the speech and noise levels or the amount of reverberation, as well as various metrics derived from automatic speech recognition technology in short time intervals. Since no reference signal is required, the assessment can be conducted online and can be employed for live speech in any voice communication system.
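
To make the idea of a reference-free estimate in short time intervals more concrete, the following is a minimal sketch in Python. It is not the SI-Live algorithm, which relies on perceptual models and metrics derived from automatic speech recognition. As a simple stand-in, it computes the level of each short frame of the microphone signal, treats the quietest frames as noise and the loudest as speech, and turns a low speech-to-noise ratio into a message for the talker. All function names, the `noise_fraction` parameter, the 15 dB threshold and the message text are assumptions made for illustration.

```python
import math


def frame_levels_db(samples, frame_len):
    """Return the level in dB (re. full scale 1.0) of each full frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Floor the power to avoid log10(0) on digital silence.
        levels.append(10.0 * math.log10(max(power, 1e-12)))
    return levels


def estimate_snr_db(levels_db, noise_fraction=0.2):
    """Crude blind SNR estimate without any reference signal.

    Assumption: the quietest `noise_fraction` of frames contain only noise
    and the loudest `noise_fraction` contain speech.
    """
    ordered = sorted(levels_db)
    k = max(1, int(len(ordered) * noise_fraction))
    noise_db = sum(ordered[:k]) / k
    speech_db = sum(ordered[-k:]) / k
    return speech_db - noise_db


def talker_hint(snr_db, threshold_db=15.0):
    """Map the estimate to a message for the talker (hypothetical threshold)."""
    if snr_db < threshold_db:
        return "Background noise is high: move closer to the microphone or reduce the noise."
    return None
```

In a live system, such a function would be called repeatedly on the most recent few seconds of audio, so that the talker receives a hint only while the problem persists. A real implementation would also need estimates of reverberation and other disturbances, which this sketch does not attempt.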