Speaker authentication is important in all situations where people speak and need to be clearly recognised. This can be the case in human-machine interaction as well as in conversations between several people.
Just a few seconds of audio material are sufficient for intelligent algorithms to identify the person speaking. A new recording is compared with data that is already known in order to confirm or rule out a match. In this way it is possible, for example, to detect whether the same person is speaking in different audio recordings.
Distinguishing the individual speakers allows us to assess not only who is speaking in the recording at a given time, but also where, and how many people can be heard overall. In addition, we can identify the language spoken in the audio file.
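The comparison between new and known data can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Fraunhofer IDMT implementation: it assumes that each recording has already been turned into a fixed-length voice embedding by some speaker model (not shown here), and the function names, example vectors and the threshold of 0.7 are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(new_embedding, known_embedding, threshold=0.7):
    """Accept the match if the two embeddings are similar enough.

    The threshold is a hypothetical value; a real system would tune it
    on labelled recordings.
    """
    return cosine_similarity(new_embedding, known_embedding) >= threshold


# Made-up three-dimensional embeddings, for illustration only.
known = [0.9, 0.1, 0.3]
new_same = [0.8, 0.2, 0.35]
new_other = [-0.2, 0.9, 0.1]

print(same_speaker(new_same, known))
print(same_speaker(new_other, known))
```

A higher threshold would make the check stricter, at the cost of rejecting the correct speaker more often.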
When it is necessary to identify exactly who is speaking in a production environment at a given moment, the intelligent algorithms of Fraunhofer IDMT in Oldenburg come into play. Especially when certain machines may only be operated by authorised users, it is important to know who is giving the command. If the machine recognises that the operator is unauthorised, it will not be activated.
To give additional persons access, our speaker recognition system makes it possible to create a new SpeakerID within a few seconds. The new operator is then also able to execute voice commands on the machine. In our industrial working group “Audio Technology for Intelligent Production AiP”, we are working together with industrial partners on possible applications for this technology in practice.
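The enrolment and authorisation steps described above might look like the following minimal sketch. It again assumes that a voice embedding is available for every utterance; the class `SpeakerRegistry`, the function `execute_command`, the speaker labels and the threshold are hypothetical names and values chosen for illustration, not part of the Fraunhofer IDMT system.

```python
import math


def _cosine(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


class SpeakerRegistry:
    """Stores one reference embedding per authorised speaker ID."""

    def __init__(self, threshold=0.7):
        self.threshold = threshold  # hypothetical acceptance threshold
        self._profiles = {}

    def enroll(self, speaker_id, embedding):
        """Create or overwrite the profile for a speaker ID."""
        self._profiles[speaker_id] = list(embedding)

    def identify(self, embedding):
        """Return the best-matching speaker ID, or None if none is close enough."""
        best_id, best_score = None, self.threshold
        for speaker_id, profile in self._profiles.items():
            score = _cosine(embedding, profile)
            if score >= best_score:
                best_id, best_score = speaker_id, score
        return best_id


def execute_command(registry, embedding, command):
    """Run the command only if the voice belongs to an enrolled speaker."""
    speaker_id = registry.identify(embedding)
    if speaker_id is None:
        return "rejected: unknown speaker"
    return f"{speaker_id}: executing '{command}'"


registry = SpeakerRegistry()
registry.enroll("operator_a", [0.9, 0.1, 0.3])

new_operator = [0.1, 0.2, 0.95]  # made-up embedding of a new person
before = execute_command(registry, new_operator, "start")
registry.enroll("operator_b", new_operator)
after = execute_command(registry, new_operator, "start")
print(before)
print(after)
```

In this sketch the new operator is rejected before enrolment and accepted afterwards, which mirrors the idea of granting access by creating a new SpeakerID.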
Identifying individual speakers also makes it possible, for example, to search media archives systematically. Recordings in which the same person is speaking can be picked out. It is also possible to determine how long each person speaks and, in this way, to filter for a specific speaker. If only content in a particular foreign language is required, the intelligent algorithms can extract this as well.
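As a small illustration of this kind of archive search, the following sketch totals the speaking time per person and picks out the passages of one speaker. It assumes that an earlier analysis step has already split the recording into labelled segments; the segment format, the speaker labels and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical analysis output: (speaker label, start in seconds, end in seconds).
segments = [
    ("speaker_1", 0.0, 12.5),
    ("speaker_2", 12.5, 20.0),
    ("speaker_1", 20.0, 31.0),
    ("speaker_3", 31.0, 35.5),
]


def speaking_time(segments):
    """Sum the duration of all segments for each speaker."""
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += end - start
    return dict(totals)


def segments_for(segments, speaker):
    """Return the (start, end) times at which the given speaker talks."""
    return [(start, end) for label, start, end in segments if label == speaker]


totals = speaking_time(segments)
speaker_1_segments = segments_for(segments, "speaker_1")

print(len(totals), "speakers in the recording")
print(totals)
print(speaker_1_segments)
```

The number of keys in `totals` also gives the number of different people heard in the recording.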
In security-critical areas
Like a fingerprint, a person's voice can be used to identify them. In combination with other biometric methods, such as facial recognition, speaker recognition can be used in security-critical areas. For example, it can be used in forensics to determine the identity of a speaker in sound recordings.
Across groups, we at Fraunhofer IDMT are looking at a wide range of potential applications for our technologies. Speaker authentication can also be used for monitoring speech and voice disorders and, at the same time, for checking how speech therapy measures are progressing.