CW Pakistan

Google EmbeddingGemma Leads Small Parameter Embedding Models With On-Device AI

  • September 6, 2025

Google has expanded its Gemma family with EmbeddingGemma, an open embedding model designed to run on-device on smartphones, laptops, and desktops. Built on the Gemma 3 architecture, the 308-million-parameter model is trained on more than 100 languages and is intended to produce high-quality embeddings efficiently while keeping data on the device. According to Min Choi, product manager, and Sahil Dua, lead research engineer, both at Google DeepMind, the model works with widely used tools including Ollama, llama.cpp, MLX, LiteRT, LM Studio, LangChain, LlamaIndex, and Cloudflare, which should make it straightforward for developers to deploy AI applications locally.

EmbeddingGemma posted strong results on the Massive Text Embedding Benchmark (MTEB) Multilingual v2, where it ranked as the top-performing model under 500 million parameters. The result underscores Google’s focus on models that run natively on personal hardware without depending on the cloud. The model also supports customizable output dimensions and can be applied to use cases such as Retrieval Augmented Generation (RAG) and semantic search. Because inference happens on the device, applications built on it can keep user data local and continue to work offline.

One of the most significant applications of EmbeddingGemma is in mobile RAG pipelines. RAG systems have traditionally relied on cloud or on-premises infrastructure to compute embeddings and generate context-aware responses. Moving that work to laptops and smartphones lets employees query information directly on their own hardware, which can make interactions with data faster and more secure while reducing reliance on internet connectivity. Choi and Dua emphasized that the quality of the initial retrieval step is crucial in such pipelines: poor embeddings surface irrelevant passages, which in turn lead to irrelevant or inaccurate answers. Google positions EmbeddingGemma’s high-quality representations as a way to make on-device RAG systems more reliable.
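
The retrieval step described above can be illustrated with a minimal sketch. It is not Google's implementation: the four-dimensional vectors are hand-written stand-ins for real model output, and the names `cosine`, `retrieve`, `build_prompt`, and `INDEX` are hypothetical helpers introduced only for this example. In a real pipeline the vectors would come from an embedding model such as EmbeddingGemma.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, index, k=2):
    """Return the k passages whose embeddings are most similar to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, passages):
    """Assemble the context that would be handed to a local generator model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Assumption: toy 4-dimensional embeddings written by hand for illustration.
INDEX = [
    ("Expense reports are due on the 5th of each month.", [0.9, 0.1, 0.0, 0.1]),
    ("The VPN client must be updated every quarter.", [0.1, 0.9, 0.1, 0.0]),
    ("Travel expenses need a manager's approval.", [0.6, 0.0, 0.3, 0.5]),
]

question = "When are expense reports due?"
query_vec = [0.85, 0.05, 0.1, 0.2]  # stand-in for the embedded question

passages = retrieve(query_vec, INDEX, k=2)
prompt = build_prompt(question, passages)
print(prompt)
```

If the embeddings placed the VPN passage closer to the question than the expense passages, the generator would receive the wrong context, which is the failure mode Choi and Dua describe.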

To support customizable output dimensions, EmbeddingGemma is trained with Matryoshka Representation Learning (MRL), a technique that lets developers choose among several embedding vector sizes depending on their needs. Developers can use the full 768-dimension vector for tasks that need the most detail, or opt for a smaller vector when speed and storage matter more. This adaptability makes the model suitable for scenarios ranging from advanced enterprise applications to lightweight mobile solutions. The release also reflects growing interest in the embedding model space, where Google faces competition from Cohere’s Embed 4, Mistral’s Codestral Embed, OpenAI’s Text Embedding 3 Large, and Qodo’s Qodo-Embed-1-1.5B.
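
As a rough sketch of how a Matryoshka-style embedding is typically shortened, the code below keeps the leading components of a vector and re-normalizes the result. The 128-dimension target, the helper names `truncate_embedding` and `storage_bytes`, and the float32 storage assumption are illustrative choices, not details from the article. The 768-dimension vector is a synthetic stand-in, so only the shapes and arithmetic are meaningful here; truncation preserves retrieval quality only for models actually trained with MRL.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def truncate_embedding(vec, dim):
    """Keep the first `dim` components, then re-normalize to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and the vector length")
    return l2_normalize(vec[:dim])

def storage_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw index size, assuming float32 values (4 bytes each)."""
    return num_vectors * dim * bytes_per_value

# Assumption: a synthetic 768-dimension vector standing in for model output.
full = l2_normalize([math.sin(i + 1) for i in range(768)])
small = truncate_embedding(full, 128)

print(len(full), len(small))          # 768 128
print(storage_bytes(1_000_000, 768))  # 3072000000 bytes for one million vectors
print(storage_bytes(1_000_000, 128))  # 512000000 bytes for one million vectors
```

Under these assumptions, a one-million-vector index shrinks from about 3.07 GB to about 0.51 GB, a sixfold saving that matters on storage-constrained phones.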

As interest in running AI applications natively on mobile devices grows, hardware makers such as Apple, Samsung, and Qualcomm are working on ways to support models without compromising device performance or battery life. The arrival of EmbeddingGemma illustrates how embedding models are becoming a core component of enterprise AI strategies, with developers and organizations keen to integrate them into local workflows. Google’s emphasis on multilingual training, flexible output sizes, and compatibility with popular AI frameworks makes EmbeddingGemma a notable entry in the embedding model market, particularly for developers seeking practical, private on-device solutions.

CW Media & all its sub-brands are copyrighted to SPIN-IDG Wakhan Media Inc., the publishing arm of NCC-RP Group. This site is designed by Crunch Collective. ©️1995-2026. Read Privacy Policy.