Interview with Jan Wellmann
At IFA 2018 in Berlin, Telekom presented its smart speaker. Using the wake word »Hello Magenta«, the voice assistant gives the user access to basic functions, such as Google services, and controls Telekom’s smart home application and the television. It can also be connected directly to the landline and used as a telephone, and Alexa can be used alongside Telekom’s own voice service. The audio technology for the device was developed by the Fraunhofer IDMT in Oldenburg. Interview with Jan Wellmann, Head of Audio System Technology.
Mr Wellmann, from finding the initial idea to product design, building the demonstrator and manufacturing the first prototype in Taiwan – you’ve been involved in developing Telekom’s smart speaker from start to finish and were able to contribute the Fraunhofer IDMT’s expertise. What impact did this early involvement have on the product?
Working closely with the product designers and suppliers made it possible for us to define early on where we would have to, or be allowed to, compromise in order to arrive at the best price-performance ratio for the solution. The question for us was: How can we get as much sound as possible out of a very small enclosure? And the sound quality should not just be suitable for listening to music. We also wanted to cover other use cases with it, for example pleasant-sounding voice services or telephony.
What were the greatest challenges?
To optimize voice control, the device must be able to hear well over very short as well as very long distances. This means we had to ask ourselves: How can we accommodate the microphones in the enclosure in such a way that controllable directional pickup really gives us good coverage for different distances and rooms? To optimize the system, we went to our acoustics laboratory (which is actually a living room with standardized acoustics), where we could test all kinds of scenarios reproducibly. In this way, we were able to optimize the microphone positions and the associated algorithms. We used four MEMS microphones for the directional microphone array. The signals from the four microphones are combined in such a way that the array is most sensitive in one very specific direction. We then had to filter out the sound from the loudspeaker itself, by means of what is known as echo cancellation, and we also further developed some of our algorithms in order to remove background noise. By being able to optimize both the recording side and the playback side and to align them with the enclosure, we’ve achieved a quality that is very good, especially for hands-free equipment.
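To illustrate how the signals from four microphones can be combined into a directional characteristic, here is a minimal sketch of a delay-and-sum beamformer, one textbook way of doing this. It is not the algorithm used in the Telekom speaker, which the interview does not describe. The line-array geometry, the 48 kHz sample rate, the whole-sample delays and all function names are assumptions made for illustration only.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C

# Hypothetical geometry: four microphones in a line, 5 cm apart (x, y in metres).
MIC_POSITIONS = [(0.00, 0.0), (0.05, 0.0), (0.10, 0.0), (0.15, 0.0)]
SAMPLE_RATE = 48000  # Hz, assumed


def arrival_delays(mic_positions, angle_deg, sample_rate):
    """Whole-sample arrival delay of a far-field plane wave at each microphone.

    angle_deg is the direction towards the source, measured from the x axis.
    The microphone that is reached first gets delay 0.
    """
    angle = math.radians(angle_deg)
    ux, uy = math.cos(angle), math.sin(angle)
    # A microphone further along the source direction is hit earlier.
    times = [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in mic_positions]
    earliest = min(times)
    return [round((t - earliest) * sample_rate) for t in times]


def delay_and_sum(channels, delays):
    """Advance each channel by its delay and average the aligned samples."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


def simulate_array(signal, delays):
    """Toy capture: every microphone hears the same signal, shifted by its delay."""
    longest = max(delays)
    return [[0.0] * d + list(signal) + [0.0] * (longest - d) for d in delays]


def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


# A 1 kHz tone arriving from 0 degrees (along the array axis).
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(480)]
source_delays = arrival_delays(MIC_POSITIONS, 0.0, SAMPLE_RATE)
captured = simulate_array(tone, source_delays)

# Steer once towards the source and once 90 degrees away from it.
on_target = delay_and_sum(captured, source_delays)
off_target = delay_and_sum(
    captured, arrival_delays(MIC_POSITIONS, 90.0, SAMPLE_RATE)
)
```

When the array is steered towards the source, the four channels line up and `on_target` reproduces the tone; steered 90 degrees away, the channels are out of phase and `energy(off_target)` comes out clearly lower. A real device would additionally need fractional delays, echo cancellation and noise reduction, none of which are modelled here.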
What do you particularly enjoy about your project with Telekom?
The team performance in Oldenburg and the trusting working relationship with Telekom. Our first job on the project was to produce the signal-processing algorithms for the microphones and for loudspeaker playback. The whole thing then just grew and grew. For example, we were able to contribute our know-how in the area of acoustic measuring and testing. Together with the manufacturer, we also designed end-of-line tests for mass production and developed and built special hardware for them. What’s more, we suggested and conducted tests which at first glance have nothing at all to do with audio but simply contribute to reliability, such as temperature trials and measurements related to distortion factor or enclosure tolerances. This made us very happy because it helped our client to a considerable degree, and customer satisfaction is very important to us.
What happens next in your Audio System Technology group as far as voice-enabled devices are concerned? What can we expect in future?
Intelligent assistants are here to stay: on the one hand in a B2C context, in the entertainment and communication sector, and on the other hand in conjunction with B2B services.
»We want to pursue our strategy of ›We make smart things listen‹ and boost voice-enabled devices across all markets.«
At present, many devices are still unable to speak or hear. This is, however, something which would be very attractive for our customers or our customers’ customers. Voice control facilitates, for example, the safe and simple operation of machines, including an emergency call function. In addition, it’s possible to offer services based on voice control that would have been inconceivable in the past. Here, voice assistants and smart speakers are just the beginning and in future we’ll find speech recognition in many devices. We’re a one-stop shop for such projects – from idea to product.