Have you ever tried to form a mental image of a person you have never seen but whose voice you have heard, for example a radio jockey you listen to every day? You certainly have, but that image is only your guess at how the person looks. Working out what a person's face looks like purely from his or her voice seems close to impossible.
AI (artificial intelligence) is what makes this seemingly impossible task possible. AI can generate people's faces after listening to their voices. The digital image of the face that the AI generates does not always match the actual speaker, but it is fairly good at capturing attributes such as gender, age, and ethnicity, along with features that are shared by many people.
The technology, called Speech2Face, is a neural network, a computer system loosely modeled on the way the human brain processes information. Scientists trained it on millions of educational videos from the internet that showed over 100,000 different people talking.
From its training, it has learned to correlate different aspects of a person's face with that person's voice. By analyzing the voice, it can estimate attributes such as gender, age, and ethnicity, as well as certain cranial features like the shape of the head and the width of the nose. When the researchers fed a person's audio clip to the system, it generated an image of the speaker's face with reasonable accuracy.
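The idea of learning a correlation from paired examples can be shown with a toy sketch. This is not the Speech2Face model, which is a neural network working on far richer voice and face representations. Here a single made-up voice measurement is mapped to a single made-up face measurement with an ordinary least-squares line. The function names and all numbers are hypothetical and chosen only for illustration.

```python
# Toy illustration only: synthetic numbers, not data from the study,
# and a one-dimensional line fit standing in for a neural network.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new voice measurement."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training pairs: one voice measurement per speaker,
# paired with one face measurement for the same speaker.
voice_feature = [1.0, 2.0, 3.0, 4.0]
face_feature = [3.0, 5.0, 7.0, 9.0]

model = fit_line(voice_feature, face_feature)

# Estimate the face measurement for a voice the model has not seen.
estimate = predict(model, 5.0)
print(estimate)
```

The real system follows the same pattern at a much larger scale: it sees many voice and face pairs during training, then predicts facial traits for a voice it has never heard.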
Certain features, such as hairstyle and facial hair, cannot be predicted from a person's voice. The developers stress that the main goal of the system was not to produce an exact image of a person but rather to capture the dominant facial traits that are correlated with the input speech.
The details of the study were recently published online on the preprint server arXiv. In a paper published on IEEE Xplore, the researchers say this technology could one day find a range of useful applications, such as generating faces for video calls without the need for cameras.
The system still needs improvement. It shows a gender bias, consistently associating low-pitched voices with male faces and high-pitched voices with female faces. It is also prone to occasional errors: roughly 6 percent of the faces it created were of the wrong gender, and some were of the wrong ethnicity.
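A small sketch shows why relying on pitch alone leads to this kind of mistake, and how a wrong-gender rate is counted. It is only an illustration: the threshold of 165 Hz, the sample values, and the function names are assumptions made up for this example, and the real system does not use a simple pitch rule.

```python
# Toy illustration only: the threshold and the samples are invented,
# and this rule is not how Speech2Face makes its predictions.

def guess_gender_from_pitch(pitch_hz, threshold_hz=165.0):
    """Naive rule: low-pitched voices are labeled male, the rest female."""
    return "male" if pitch_hz < threshold_hz else "female"


def error_rate(predicted, actual):
    """Fraction of predictions that do not match the true label."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


# Hypothetical speakers: (average pitch in Hz, true gender).
# The third speaker is a man with a high-pitched voice.
samples = [
    (110.0, "male"),
    (125.0, "male"),
    (180.0, "male"),
    (210.0, "female"),
    (230.0, "female"),
]

predicted = [guess_gender_from_pitch(pitch) for pitch, _ in samples]
actual = [gender for _, gender in samples]

rate = error_rate(predicted, actual)
print(predicted)
print(rate)
```

Any speaker whose pitch falls on the unexpected side of the threshold is mislabeled, which is the sort of bias described above.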
In the end, AI is making possible things that humans were never able to do. Its ability to generate people's faces after listening to their voices could have major implications, such as showing faces on video calls without the need for cameras.