Meta has introduced Omnilingual ASR, a new multilingual automatic speech recognition system that natively supports more than 1,600 languages and extends to over 5,400 through zero-shot in-context learning. The release marks a return to open-source distribution for the company after several restrictively licensed models in recent years. By processing paired audio and text examples at inference time, the system can transcribe spoken audio into text even for languages it never encountered during training. It is available under the Apache 2.0 license and is accessible through Meta’s website, GitHub, and a demonstration page on Hugging Face. Alongside the models, Meta has published a technical paper and a large speech corpus covering more than 350 underserved languages under open licenses.
Meta stated on its AIatMeta account on X that its aim is to remove language barriers and widen digital access. The Omnilingual ASR suite is designed to support speech-to-text functions across areas such as voice assistants, transcription tools, subtitle generation, and oral archive preservation. The system includes several model families: wav2vec 2.0 models for speech representation learning, CTC-based models for supervised training, and LLM-based models for advanced transcription. A zero-shot variant allows new languages to be supported using only a small number of paired audio and text examples. The models follow an encoder-decoder structure that converts raw audio signals into language-agnostic representations before decoding them into written output.
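To make the zero-shot flow concrete, here is a minimal sketch of what conditioning on paired examples at inference time could look like. The package name, class, and method signatures below (`omnilingual_asr`, `ASRPipeline`, `transcribe`, the `context` parameter, and the model identifier) are illustrative assumptions, not Meta’s documented API; the official GitHub repository is the authoritative reference.

```python
# Hypothetical sketch of zero-shot, in-context transcription.
# All names below are assumptions for illustration only.
from omnilingual_asr import ASRPipeline  # hypothetical import

# Load an LLM-based checkpoint (hypothetical model identifier).
pipeline = ASRPipeline.from_pretrained("omniASR_LLM_7B")

# For an unseen language, supply a handful of paired (audio, text)
# examples at inference time; the decoder conditions on them in context.
context_examples = [
    ("examples/greeting.wav", "reference transcription of the greeting"),
    ("examples/numbers.wav", "reference transcription of the numbers"),
]

# Transcribe a new utterance in the same language, zero-shot.
text = pipeline.transcribe("unseen_language_clip.wav", context=context_examples)
print(text)
```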
The system’s scale sets it apart from competing products. OpenAI’s Whisper supports 99 languages, while Meta’s system directly supports more than 1,600 and can extend to thousands more. Benchmarks indicate that Omnilingual ASR achieves character error rates below ten percent in more than seventy percent of supported languages, including more than five hundred languages never previously covered by ASR tools. Meta’s research highlights the value of this extended coverage for communities whose languages have often been missing from digital speech technologies. The suite’s largest model requires high-end hardware for inference, while smaller variants can run on lower-powered devices, making the family feasible for both enterprise-level setups and compact deployments.
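For readers interpreting the benchmark figure, character error rate (CER) is the character-level Levenshtein edit distance between a hypothesis and its reference transcription, divided by the reference length. The sketch below computes it in plain Python; this is the standard metric definition, not code from the Omnilingual ASR release.

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein edit distance between the
    hypothesis and reference, divided by the reference length."""
    ref, hyp = list(reference), list(hypothesis)
    # Dynamic-programming edit distance over characters.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1] / max(len(ref), 1)

# A CER under 0.10 means fewer than one character error per ten
# reference characters.
print(cer("omnilingual", "omnilingual"))  # 0.0
print(cer("recognition", "recognitoin"))  # ~0.18 (two errors / 11 chars)
```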
The release arrives during a significant transition for Meta’s AI division. Llama 4, released earlier in 2025, saw poor enterprise uptake and contributed to organisational restructuring. Mark Zuckerberg appointed Alexandr Wang as Chief AI Officer and approved new recruitment efforts to strengthen the research pipeline. Omnilingual ASR is positioned as a corrective step: a practical, accessible contribution to language technology with transparent data and permissive licensing. It aligns with Meta’s wider push to direct investment toward foundational AI, supported by the AI accelerators and infrastructure improvements announced in September, as well as renewed permission to train on public data across Europe after regulatory adjustments. The approach signals a shift toward cohesive platform development rather than fragmented updates.
The dataset behind Omnilingual ASR was created in collaboration with community organisations across Africa and Asia. African Next Voices, Mozilla Common Voice, and Lanfrica were among the contributors, enabling the inclusion of hundreds of low-resource languages. The recordings consist of natural speech gathered through culturally familiar prompts, and transcriptions follow established writing standards. Meta emphasises that expanding speech recognition to thousands of languages requires local partnerships, and the open-source release is intended to let communities personalise the models with their own data. All resources, including code and datasets, are available through GitHub, Hugging Face, and Meta’s AI blog, with installation supported through PyPI.
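As a rough illustration of what personalising the models with community data might involve, the sketch below writes paired recordings and transcriptions into a JSONL manifest, a common convention across ASR toolkits. The manifest layout, field names, and file paths are assumptions, not the format documented by the Omnilingual ASR repository, which should be consulted before preparing real data.

```python
# Hypothetical sketch: package community recordings as a JSONL manifest.
# The {"audio": ..., "text": ...} schema is a common ASR convention,
# not necessarily the exact format Omnilingual ASR expects.
import json
from pathlib import Path

recordings = [
    ("clips/story_01.wav", "transcription following the established orthography"),
    ("clips/story_02.wav", "another transcription in the same writing standard"),
]

manifest = Path("my_language_train.jsonl")
with manifest.open("w", encoding="utf-8") as f:
    for audio_path, text in recordings:
        # One JSON object per line: where the audio lives and what was said.
        f.write(json.dumps({"audio": audio_path, "text": text},
                           ensure_ascii=False) + "\n")

print(f"Wrote {len(recordings)} entries to {manifest}")
```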