About
The client specializes in speech technology for Dialectal Arabic and other under-resourced languages, with a focus on automatic speech recognition and text-to-speech. The client is part of the Mohammed Bin Rashid Innovation Fund (MBRIF), an initiative launched by the UAE Ministry of Finance to support innovation in the UAE. The client set out to find a partner that could meet its solution development and scaling needs, and immediately found traction with us.
The Challenge
- Arabic is considered one of the more challenging languages for speech recognition systems because of its large lexical variety and complex morphology.
- One of the significant challenges is the automatic detection & conversion of over 19 Arabic dialects.
- Building lexicons for various use cases such as media, call centers & education.
- Support for multiple file types: wav, mp3, mp4, aac, and more.
- Support for both real-time and batch processing.
Solution
We developed the product on a foundation of automatic speech recognition, machine translation, and natural language processing (NLP). It can be deployed in the cloud, on premises, or in a hybrid model. The Arabic speech recognition models deliver leading accuracy across the board. We also built speech-to-text features such as speaker detection, language switching, timestamps, and diarization.
The solution won the 2021 GITEX Future Stars' Supernova Challenge held in Dubai, where it was named Best AI Innovator for its cutting-edge Arabic speech and voice technology. It was selected from over 700 entries.
The solution provides highly trained, tailored transcriptions to the client's customers at greater than 90% accuracy. Because we push our models to perform under complex, real-life conditions with background noise, multiple speakers, and diverse accents, the client's customers achieve vastly improved accuracy without compromising transcription speed. They use built-in reporting to search the collected audio data for keywords and phrases rather than relying on a faulty transcript, which lets them pinpoint specific timestamps and gather useful insight.
With Sakha as a technology partner that innovates rapidly and delivers quality, the partnership with the client continues to open up new opportunities at scale.
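As a rough illustration of how a keyword lookup can resolve to timestamps, here is a minimal Python sketch. It is not the product's implementation, which this case study does not describe. It assumes the recognition output is available as time-stamped, speaker-labelled segments; the Segment class, the find_keyword function, and the sample data are hypothetical names and values introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One diarized, time-stamped piece of recognized speech (hypothetical shape)."""
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds
    speaker: str   # speaker label assigned by diarization
    text: str      # recognized text for this segment


def find_keyword(segments, keyword):
    """Return (start, end, speaker) for every segment whose text contains keyword.

    Matching is a case-insensitive substring test; a production system would
    likely need dialect-aware normalization, which is out of scope here.
    """
    needle = keyword.casefold()
    return [
        (seg.start, seg.end, seg.speaker)
        for seg in segments
        if needle in seg.text.casefold()
    ]


# Illustrative sample data, not real product output.
segments = [
    Segment(0.0, 4.2, "agent", "Thank you for calling, how can I help?"),
    Segment(4.2, 9.8, "caller", "I would like a refund for my order."),
    Segment(9.8, 13.1, "agent", "I can process the refund today."),
]

for start, end, speaker in find_keyword(segments, "refund"):
    print(f"{start:.1f}s to {end:.1f}s ({speaker})")
```

Keeping the start and end times on every segment is what makes it possible to jump from a keyword hit straight to the matching point in the recording.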