China’s leading artificial intelligence (AI) company YITU Technology has unveiled the most precise Mandarin speech recognition system on the market, with accuracy beyond what such technology has achieved so far.
YITU’s system achieved a character error rate (CER) of 3.71 percent when recognizing Mandarin speech, 20 percent lower than that of the current best system of its kind. The results were obtained on AISHELL-2, the world’s largest Mandarin speech corpus. CER is a common metric in Mandarin speech recognition because the character is the most basic element of the Chinese language. It is the equivalent of word error rate in English speech recognition.
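For readers unfamiliar with the metric, CER is conventionally computed as the number of character-level substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of characters in the reference. The Python sketch below illustrates that general definition only; it is not YITU's evaluation code, and the function names and sample sentences are hypothetical. It also assumes whitespace is ignored, which is a common but not universal scoring choice.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using a rolling DP row."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp is i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference char missing in hyp)
                curr[j - 1] + 1,     # insertion (extra char in hyp)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = character edit distance / number of reference characters.

    Assumption: whitespace is stripped before scoring.
    """
    ref = [c for c in reference if not c.isspace()]
    hyp = [c for c in hypothesis if not c.isspace()]
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one wrong character out of six in the reference.
    reference = "今天天气很好"
    hypothesis = "今天天汽很好"
    print(f"CER: {character_error_rate(reference, hypothesis):.2%}")
```

Because insertions count as errors, CER computed this way can exceed 100 percent when the transcript is much longer than the reference.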
YITU’s system scored top marks across multiple scenarios for accuracy, speed and the ability to transcribe speech that mixes English and Chinese. YITU achieved these results using self-developed innovations in data collection and data labeling, as well as tools such as training systems and algorithm models.
Varied scenarios and conditions such as accents have long plagued scientists because they heavily affect accuracy. YITU researchers worked with the most common scenarios, including phone calls, voice commands, dialogues in quiet environments, audio programs and accented speech, to refine the system’s capabilities and train it.
As human-machine interaction plays an increasingly pivotal role in shaping people’s lives, speech recognition is widely regarded as the first and primary entry point for commercial applications. As such, it is seen as a frontier that global AI companies cannot afford to miss. YITU’s speech recognition system will be used to augment and expand the range of businesses offered by the company, most of which are based on its state-of-the-art computer vision technology.
Speech recognition technology had an early start but was initially hindered not only by high costs, large data volumes and a shortage of researchers, but also by the many scenarios that are too complex for systems to recognize and handle. Lu Hao, Chief Innovation Officer of YITU, said: “Performances of currently available technologies on the market are mixed and can only fulfill a few basic features. YITU’s goal in starting its own speech recognition research was to address challenges in this promising industry.”
Having reached this milestone, YITU will continue to invest in and contribute to speech recognition to push the technology further.