Elon Musk’s xAI has taken significant strides in improving the performance of its Grok-2 large language model (LLM) chatbot. In just three days, xAI developers completely rewrote the inference code stack using SGLang, an open-source system designed for efficient execution of complex language models. This resulted in a dramatic speed increase for both the full Grok-2 model and the streamlined Grok-2 Mini.
The improvements were announced by xAI developer Igor Babuschkin on the social network X. He revealed that Grok-2 Mini is now twice as fast as it was previously, crediting the gain to Lianmin Zheng and Saeed Maleki, who carried out the SGLang rewrite.
SGLang’s efficiency extends beyond speed. It also enables xAI to serve the larger Grok-2 model, which requires multi-host inference, at a significantly faster pace. Additionally, both models experienced a slight increase in accuracy alongside the speed boost.
Developed by a team from Stanford University, UC Berkeley, Texas A&M, and Shanghai Jiao Tong University, SGLang offers a versatile platform. It supports a wide range of models, including Llama, Mistral, and LLaVA, and is compatible with open-weight and API-based models like OpenAI’s GPT-4. The system’s strength lies in its ability to optimize execution through automatic cache reuse and program-level parallelism.
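The automatic cache reuse mentioned above lets requests that share a common prompt prefix (such as a system prompt or few-shot examples) skip recomputing that prefix. The sketch below is a toy, dictionary-based illustration of the idea only, not SGLang's actual implementation (which caches KV tensors in a radix tree via its RadixAttention mechanism); the class and method names here are hypothetical.

```python
# Toy illustration of prefix cache reuse. Real systems cache per-token
# KV tensors; here we cache simulated "states" keyed by token prefixes.

class PrefixCache:
    """Hypothetical sketch: requests sharing a prefix reuse cached work."""

    def __init__(self):
        self.cache = {}          # prefix tuple -> simulated state
        self.compute_calls = 0   # counts simulated "expensive" steps

    def _compute_step(self, state, token):
        # Stands in for one expensive forward pass over a new token.
        self.compute_calls += 1
        return state + [token]

    def run(self, tokens):
        # Find the longest already-cached prefix of this request.
        best, state = 0, []
        for i in range(len(tokens), 0, -1):
            key = tuple(tokens[:i])
            if key in self.cache:
                best, state = i, self.cache[key]
                break
        # Compute (and cache) only the uncached suffix.
        for i in range(best, len(tokens)):
            state = self._compute_step(state, tokens[i])
            self.cache[tuple(tokens[:i + 1])] = state
        return state

cache = PrefixCache()
cache.run(["sys", "A", "B"])    # cold: 3 compute steps
cache.run(["sys", "A", "C"])    # reuses the ["sys", "A"] prefix: 1 new step
print(cache.compute_calls)      # → 4
```

A second request with the same shared prefix costs only one additional step instead of three, which is the same effect that lets SGLang serve many requests built on a common prompt far more cheaply.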
This performance boost is accompanied by impressive results on the third-party LMSYS Chatbot Arena leaderboard, which benchmarks AI model performance. The full Grok-2 model has secured the number two spot with an Arena Score of 1293, placing it alongside Google’s Gemini-1.5 Pro and just behind OpenAI’s latest ChatGPT-4o.
Grok-2 Mini, also benefiting from the recent enhancements, climbed to the number five position with an Arena Score of 1268, trailing only GPT-4o mini and Claude 3.5 Sonnet. Notably, both Grok-2 and Grok-2 Mini are proprietary models developed by xAI.
Grok-2 has established itself as a leader, particularly in mathematical tasks, where it currently holds the top spot. The model also demonstrates strong performance across various categories, including Hard Prompts, Coding, and Instruction-Following, consistently ranking near the top. This performance surpasses prominent models like OpenAI’s GPT-4o (May 2024), which now sits at number four.
According to Babuschkin, the primary advantage of Grok-2 Mini is its speed. He added that further optimizations are underway to make it even faster, serving users who want strong performance without heavy computational cost.
The addition of Grok-2 and Grok-2 Mini to the LMSYS Chatbot Arena leaderboard, and their strong showing there, have drawn significant attention within the AI community. As xAI continues to refine its models, further improvements in speed, accuracy, and overall performance can be expected, keeping Grok-2 and Grok-2 Mini at the forefront of AI development.