A breakthrough in language technology has emerged from Pakistan, where Fahad Maqsood Qazi, a 23-year-old software developer from Hyderabad, has developed the first artificial intelligence-based tools for the Sindhi language. These tools enable text-to-speech (TTS) and speech-to-text (STT) functions in Sindhi, a landmark achievement for a language spoken by nearly 40 million people globally but long overlooked by AI-driven language services.
Qazi began the project in 2023 while working on an AI dubbing system for Flis Technologies, a company he co-founded. During this work he realized that foundational AI tools for the Sindhi language did not exist. Unlike globally dominant languages such as English or Mandarin, Sindhi had no publicly available tools for speech recognition or voice synthesis, leaving it largely invisible in the digital AI era.
Determined to fill this gap, Qazi began building his own dataset from scratch. He sourced hours of Sindhi audio from YouTube, audiobooks, and news broadcasts, and manually transcribed them to create a training base for his AI models. During this time, he discovered that Google employee Asad Memon had enabled Sindhi support on Mozilla’s Common Voice platform. Qazi merged this open-source dataset with his own, providing a robust foundation for his machine learning models.
By January 2024, Qazi had completed the first working versions of both TTS and STT models for Sindhi. Realizing the language also lacked a tokenizer, a basic software component that breaks sentences down into individual words or characters for AI processing, he developed one himself. The tokenizer was critical: it converts Sindhi text into units that machine learning systems can process, which in turn improves the models' accuracy.
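To illustrate what a tokenizer does, the following is a minimal Python sketch that splits Sindhi text into words and into characters. It is not Qazi's tokenizer, whose design the article does not describe; the function names `word_tokenize` and `char_tokenize` and the sample sentence are assumptions made for illustration, and the word rule is a simplification that does not handle optional diacritic marks.

```python
import re
import unicodedata

# Hypothetical, simplified rule: a token is either a run of letters/digits
# or a single punctuation mark. Python's \w is Unicode-aware, so it matches
# the Arabic-script letters used to write Sindhi. Combining diacritics are
# not matched by \w, so fully vocalized text would need a richer pattern.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def word_tokenize(text):
    """Split text into word and punctuation tokens."""
    # NFC normalization makes visually identical letters compare equal.
    normalized = unicodedata.normalize("NFC", text)
    return _TOKEN_RE.findall(normalized)


def char_tokenize(text):
    """Split text into individual characters, dropping whitespace."""
    normalized = unicodedata.normalize("NFC", text)
    return [ch for ch in normalized if not ch.isspace()]


if __name__ == "__main__":
    # Illustrative sample sentence (assumed, not taken from the article).
    sample = "سنڌي ٻولي سٺي آهي."
    print(word_tokenize(sample))
    print(char_tokenize(sample))
```

Production speech and language models usually rely on learned subword vocabularies rather than fixed rules like these, but the idea is the same: raw text becomes a sequence of discrete units a model can consume.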
The implications of this work go far beyond software development. In many countries where Sindhi-speaking diaspora communities live, the language is not formally taught, particularly to younger generations. This has led to a gradual erosion of reading and writing skills in Sindhi. Qazi hopes that his tools will bridge that gap, making it easier for these communities to stay connected to their linguistic heritage through voice-based technology.
He emphasized that the tools could help both the tech-savvy and the tech-shy engage with the language. A child or adult who cannot read Sindhi can now listen to stories or information via TTS. Conversely, someone unfamiliar with writing the language can speak into a phone or computer and have their words transcribed using STT. This is particularly significant for older generations or individuals with limited literacy, who may struggle to use digital devices in their native language.
In March 2024, Qazi uploaded his models to HuggingFace, a collaborative platform for AI models used by developers worldwide. By making his work open-source, he hopes to encourage further development in Sindhi language technology. Researchers, developers, and language activists can now build upon his models, enabling a broader ecosystem of applications that include translation tools, educational content, and even voice-controlled interfaces.
Qazi stressed that for Sindhi to remain relevant in the modern world, it must be accessible across digital platforms.
“Without access to tools like these, Sindhi could be excluded from digital spaces,” he said. “Now it can be part of systems like voice interfaces, educational resources, and translation tools.”
This accomplishment marks a new chapter for Sindhi language inclusion in the AI era. By building the foundational tools himself, Qazi has not only addressed a glaring digital gap but has also laid the groundwork for a more inclusive future where regional languages are part of global technological advancement.