Grok 3

Grok 3 is a large language model (LLM) developed by xAI, an artificial intelligence company founded by Elon Musk 4, 21. Released in February 2025 as the successor to Grok 2, the model serves as the company's flagship offering and is designed for tasks including computer programming, mathematical problem-solving, and advanced reasoning 1, 18, 22. The model was trained on xAI's "Colossus" supercluster in Memphis, which reportedly grew from 100,000 to 200,000 NVIDIA H100 GPUs over the course of training 8, 9, 20. According to xAI, this infrastructure provided roughly ten times the compute used for previous state-of-the-art models, enabling improvements in instruction-following and world knowledge 1, 18, 24.
A primary characteristic of Grok 3 is its emphasis on "reasoning" through large-scale reinforcement learning and test-time compute 1, 16. This architecture is designed to allow the model to spend additional time—ranging from seconds to several minutes—deliberating on a query, correcting its own errors, and evaluating multiple solution paths before responding 1, 18. These capabilities are presented to users through interfaces referred to as "Think" and "Big Brain" modes 24, 27. Furthermore, the model is integrated with "DeepSearch," a tool that allows it to access and analyze real-time data from the internet and the X social media platform 1, 54.
In technical evaluations reported by xAI, Grok 3 achieved scores of 93.3% on the 2025 American Invitational Mathematics Examination (AIME) and 84.6% on the Graduate-Level Google-Proof Q&A (GPQA) Diamond set 1, 18. xAI also stated that the model reached an Elo score of 1402 in early testing on the LMSYS Chatbot Arena leaderboard 1, 51. While independent observers have acknowledged its competitive performance in STEM and coding tasks, some analysts have noted that the relative ranking of such models can vary depending on evaluation methodology and the amount of compute allocated during the reasoning phase 3, 50.
The launch of Grok 3 included the introduction of Grok 3 mini, a smaller version optimized for lower latency and cost-efficiency while retaining logical reasoning capabilities 42, 43. The model is primarily available to premium subscribers on the X platform and is provided to developers via the xAI API 31, 45. Its release positioned xAI as a direct competitor to other major AI labs, such as OpenAI and Anthropic, during a period of industry focus on agentic AI systems and models capable of complex, multi-step logical chains 4, 24, 37.
Background
Grok 3 represents the third generation of large language models (LLMs) developed by xAI, following a rapid development cycle that began shortly after the company's inception in 2023. It was designed as the successor to Grok 2, released in August 2024, which offered competitive performance in general reasoning and image generation 1. According to xAI founder Elon Musk, the primary motivation for Grok 3 was to achieve "frontier" status by outperforming existing industry leaders in complex reasoning, advanced mathematics, and computer programming benchmarks 2, 3.
The development timeline was dictated by the rapid assembly of xAI's primary training infrastructure, the "Colossus" supercluster located in Memphis, Tennessee. Construction of the facility began in early 2024, and xAI reported that the cluster was brought online in July 2024 after approximately 122 days of work 4, 5. At its inception, Colossus consisted of 100,000 liquid-cooled NVIDIA H100 GPUs, which Musk stated made it the most powerful AI training system in operation at the time 4. By the final stages of Grok 3's development in late 2024, the cluster was expanded to include a total of 200,000 GPUs, including H200 chips, to provide the computational throughput required for the model's increased scale 1, 6.
The competitive landscape during the model's development was characterized by a shift toward "reasoning" models and agentic workflows. By late 2024, OpenAI had released its o1 series, which utilized reinforcement learning and "test-time compute" to solve multi-step problems, while Anthropic’s Claude 3.5 series had gained significant traction in the software engineering community for its coding proficiency 7, 8. xAI positioned Grok 3 to directly challenge these models, specifically targeting high scores on the HumanEval and MATH benchmarks 3. Furthermore, the model’s development was influenced by the broader industry trend of securing massive capital for compute; during this period, xAI reportedly sought valuations of up to $40 billion to fund further hardware acquisitions necessary to keep pace with the infrastructure investments of Microsoft, Google, and Meta 6, 9.
The model's release in February 2025 followed several months of internal testing and "beta" feedback from users of the X (formerly Twitter) platform. While previous versions were noted for a distinct "unfiltered" personality, Grok 3's development reflects a pivot toward more rigorous academic and professional utility, in line with the increasing demands of the enterprise AI market 2, 10.
Architecture
Grok 3 is built on a decoder-only transformer architecture, extending the self-attention mechanism that underpins modern generative language modeling 4. Third-party analysis describes the model as a hybrid system that integrates transformer-based neural networks with advanced reinforcement learning (RL) techniques 5. Independent reports estimate the model's scale at approximately 2.7 trillion parameters, a significant increase over previous iterations 5.
Hardware and Compute Infrastructure
The model's training was executed on xAI's "Colossus" supercluster, which reportedly utilizes 200,000 NVIDIA GPUs 1, 6. According to xAI, this infrastructure provided ten times the computational power utilized for previous state-of-the-art models 1. One third-party analysis reports a parallel processing speed of 1.5 petaflops for the hardware configuration 5. Independent technical reviews indicate that this infrastructure enabled a 30% reduction in energy consumption relative to earlier models through more efficient data handling and optimized hardware utilization 5.
Training Methodology and Data
Grok 3 was developed using a multi-modal training regimen, allowing it to process and generate text, computer code, and images 5. The pre-training phase utilized an estimated 12.8 trillion tokens 5. This dataset was composed of public internet data and proprietary real-time information from the X platform, with training data extending through February 2025 5.
Following the initial pre-training, the model underwent large-scale reinforcement learning to refine its reasoning capabilities 1. This process was designed to enhance the model's chain-of-thought (CoT) processing, enabling it to simulate complex problem-solving strategies such as error correction through backtracking and step simplification 1.
Reasoning and Technical Innovations
A central feature of the architecture is the implementation of "test-time compute," branded by xAI as "Grok 3 (Think)" 1. This system allows the model to allocate dynamic computation time to difficult queries; instead of providing a near-instant response, the model can spend between several seconds and several minutes evaluating multiple approaches and verifying its own solutions 1, 6. xAI states that this reasoning process is "open," allowing users to inspect the internal chain-of-thought and decision-making paths before the final output is delivered 1.
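xAI has not published the mechanism behind the Think mode, but the general idea of test-time compute can be illustrated with a best-of-n sketch: spend extra inference budget generating several candidate answers and keep the one a verifier scores highest. The `generate` and `score` callables below are hypothetical stand-ins, not xAI's procedure.

```python
def best_of_n(generate, score, n):
    """Minimal test-time-compute sketch: sample n candidate answers and
    return the best-scoring one. (Illustrative only; xAI has not disclosed
    Grok 3's actual reasoning-time search procedure.)"""
    candidates = [generate(i) for i in range(n)]
    return max(candidates, key=score)

# Toy usage: pick the candidate closest to a known target value.
guesses = [10, 50, 41, 90]
best = best_of_n(lambda i: guesses[i], lambda c: -abs(c - 42), len(guesses))
# best == 41
```

More elaborate variants replace the fixed budget `n` with a dynamic stopping rule, which is closer in spirit to the seconds-to-minutes deliberation described above.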
The architecture includes several additional specialized subsystems:
- DeepSearch: An agentic search system designed for granular, source-specific information retrieval within seconds 6.
- Big Brain: A dynamic computation time allocation system that manages the transition to extended reasoning for complex tasks 6.
- Web Access Layer: A dedicated subsystem for query formulation and result integration, allowing the model to incorporate real-time web information directly into its core processing architecture 4.
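The sources describe DeepSearch only at a high level. Agentic retrieval of this kind is commonly structured as an iterative loop of query formulation, retrieval, and synthesis; the sketch below shows that generic pattern with hypothetical `search_fn` and `synthesize_fn` callables, and is not xAI's implementation.

```python
def deep_search(question, search_fn, synthesize_fn, max_rounds=3):
    """Generic agentic-search loop: repeatedly formulate a query, collect
    sources, and try to synthesize an answer from the accumulated notes.
    (A sketch of the pattern only; DeepSearch internals are not public.)"""
    notes, query, answer = [], question, None
    for _ in range(max_rounds):
        notes.extend(search_fn(query))           # gather new source snippets
        answer, follow_up = synthesize_fn(question, notes)
        if follow_up is None:                    # synthesis judged complete
            break
        query = follow_up                        # otherwise refine and search again
    return answer
```

The `max_rounds` cap bounds latency, mirroring the trade-off between response time and answer quality that dynamic-compute systems must manage.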
Technical Specifications
Grok 3 supports a context window of 128,000 tokens, facilitating the processing of long documents and sustained conversations 5. To maintain coherence across this window, the model utilizes an enhanced rotary position encoding (RoPE) system and a variant of Root Mean Square Normalization (RMSNorm) to stabilize training dynamics 4. The architecture also employs scaled residual connections to preserve signal strength throughout its deep neural layers, which independent analysts suggest is critical for the model's performance in scientific and mathematical reasoning 4.
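The components named above, RMSNorm and rotary position encoding, have well-known textbook forms, sketched below in plain Python. xAI has not released Grok 3's exact implementation, so these are the standard versions rather than the model's own code.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm rescales a feature vector by its root mean square; unlike
    # LayerNorm it skips mean-centering, which simplifies the computation
    # while still stabilizing training dynamics.
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms * w for v, w in zip(x, weight)]

def rope(x, position, base=10000.0):
    # Rotary position encoding (RoPE): rotate each (x[i], x[i + half])
    # feature pair by a position-dependent angle, so attention dot products
    # encode relative positions. Rotations are orthogonal, preserving norms.
    half = len(x) // 2
    out = [0.0] * len(x)
    for i in range(half):
        theta = position * base ** (-i / half)
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x[i] * c - x[i + half] * s
        out[i + half] = x[i] * s + x[i + half] * c
    return out
```

Pairing conventions vary between implementations (adjacent pairs versus the half-split form used here); the half-split variant matches several common open-source codebases.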
Capabilities & Limitations
Capabilities
Grok 3 is categorized by xAI as a "frontier-class" model, specifically engineered to excel in tasks requiring complex logical deduction, computer programming, and mathematical operations 2, 5. According to performance data released by the developer, the model demonstrates high proficiency in standardized benchmarks, including HumanEval for code generation and the MATH benchmark for advanced problem-solving 5. xAI states that Grok 3 represents a significant quantitative improvement over its predecessor, Grok 2, particularly in its ability to follow multi-step instructions and maintain coherence across long-form technical responses 1, 2.
The model utilizes a multimodal architecture that enables it to process and interpret various data formats, including text, images, and complex documents like financial spreadsheets or scientific diagrams 1. Unlike earlier versions that relied more heavily on external integrations for visual tasks, Grok 3 features native image understanding capabilities 1, 2. xAI asserts that this native integration allows the model to perform spatial reasoning and visual analysis with greater precision, such as identifying specific components in architectural blueprints or debugging code snippets from screenshots 5.
A central capability of Grok 3 is its real-time information synthesis, powered by its direct integration with the X social media platform 5. This allows the model to incorporate live data from current events, breaking news, and trending discussions into its responses 5. xAI describes this feature as providing "real-time knowledge" that distinguishes the model from competitors trained on static datasets, which often have a fixed knowledge cutoff date 2. The model is also designed for high-throughput performance, utilizing the computational power of the Colossus supercluster to handle intensive reasoning tasks at scale 5.
Limitations and Failure Modes
Despite its increased scale and parameter count, Grok 3 exhibits limitations typical of large language models (LLMs). Third-party analysis and early user testing have identified instances of factual hallucinations, where the model generates plausible-sounding but incorrect information 5. While xAI claims the model is more grounded than previous iterations, it remains susceptible to logical inconsistencies when faced with highly abstract or novel paradoxes that were not represented in its training data 1.
The model's reliance on real-time data from X introduces specific failure modes related to information quality. Because it synthesizes user-generated content, Grok 3 may inadvertently amplify unverified reports, misinformation, or biased perspectives present on the social media platform 5. Independent reports suggest that while the model is trained to be "truth-seeking," its synthesis of live events can sometimes lack the nuance or verification provided by traditional news editorial processes 5.
Intended vs. Unintended Use
xAI has positioned Grok 3 as a general-purpose AI assistant for research, coding, and creative tasks 2. It is intended for use by developers requiring assistance with software engineering and by researchers needing to synthesize large volumes of information 5. A notable design philosophy for the model is its "anti-woke" stance; Elon Musk has stated that the model is intended to provide answers without the safety filters or political correctness found in models from other major AI developers 5, 6.
However, this design choice has led to concerns regarding unintended uses. Critics and AI safety researchers have noted that fewer restrictions could potentially allow the model to generate toxic content or facilitate the creation of harmful materials, though xAI maintains that the model is programmed to refuse requests that involve illegal activities 5. Additionally, while the model is capable of processing sensitive data, xAI advises against using it for critical medical or legal advice without human oversight due to the inherent risk of errors in LLM-generated output 2.
Performance
Grok 3's performance has been characterized by xAI as a substantial advancement over its predecessor, Grok 2, specifically in the domains of reasoning and technical accuracy 1. At the time of its release in February 2025, the model achieved a score of 91.2% on the Massive Multitask Language Understanding (MMLU) benchmark, surpassing the 87.5% reported for Grok 2 1, 2. On the GPQA (Graduate-Level Google-Proof Q&A) benchmark, which evaluates expertise in science and logic, Grok 3 reached 66.5% on the "Diamond" set, a metric xAI claims places it at the top of the industry compared to contemporaneous models like GPT-4o and Claude 3.5 Sonnet 1, 4.
In coding and mathematical reasoning, the model demonstrated significant gains over previous iterations. xAI reported a score of 87.5% on the HumanEval benchmark, which measures the ability of a model to solve programming problems based on docstrings 2, 5. This represents a notable increase from the 75.4% achieved by Grok 2 1. For mathematical problem-solving, Grok 3 scored 78.4% on the MATH benchmark, reflecting its optimized architecture for step-by-step logical deduction 1, 2.
Third-party evaluations conducted via the LMSYS Chatbot Arena shortly after release positioned Grok 3 in the "Tier 1" category alongside GPT-4o and Claude 3.5 Sonnet 3. While individual user preferences varied, the model was noted for its lower refusal rate and improved adherence to complex system prompts compared to previous iterations 3, 4. However, some independent analysts observed that while the model excels in "raw" reasoning benchmarks, its performance in nuanced creative writing tasks remained statistically similar to other top-tier proprietary models 5.
In terms of inference speed and cost efficiency, xAI integrated Grok 3 into its X platform for Premium and Premium+ subscribers 2. Initial reports indicated that the model's latency for standard queries was comparable to other large-scale frontier models, though performance for the "Grok 3 (Think)" variant, which utilizes extended reasoning tokens, resulted in significantly longer response times in exchange for higher accuracy 1, 4. Although specific per-token pricing for API access was not immediately disclosed at launch, xAI stated that the model was optimized for high-throughput inference on its proprietary Colossus infrastructure 2.
Safety & Ethics
The safety architecture of Grok 3 is built upon what xAI describes as a "truth-seeking" objective, intended to minimize the influence of political correctness in AI responses 1, 4. This approach contrasts with the safety philosophies of contemporaries such as OpenAI and Google, which prioritize the mitigation of harmful biases through more restrictive fine-tuning 2. According to xAI, the model undergoes Reinforcement Learning from Human Feedback (RLHF) designed to incentivize accuracy and directness rather than adherence to specific social or cultural norms 1, 5.
xAI founder Elon Musk has positioned Grok 3 as an alternative to mainstream large language models (LLMs), asserting that other models are programmed to withhold information to avoid offense 4. However, third-party analysts note that this philosophy introduces risks related to the generation of toxic content or the reinforcement of harmful stereotypes 3. In testing conducted by independent researchers, Grok 3 was found to be more willing to discuss controversial political topics and use inflammatory language compared to models like GPT-4o, though it maintained strict refusals for queries regarding the production of illegal substances or weapons 3.
Safety guardrails within Grok 3 include filters for personally identifiable information (PII), malware generation, and instructions for physical violence 5. xAI states that these protections are implemented at the post-training stage to ensure that the core reasoning capabilities of the model are not degraded by overly broad safety constraints 1. Despite these measures, reports from red-teaming exercises indicated that the model remained vulnerable to certain jailbreaking techniques, such as role-playing prompts or multi-step adversarial attacks, which could occasionally bypass its refusal mechanisms 2, 3.
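The claim that protections sit at the post-training stage can be pictured as a filtering layer wrapped around an unmodified core model. The sketch below shows that generic pattern only; the `BLOCKED_TOPICS` list and `model_fn` are hypothetical, and production guardrails typically rely on learned classifiers rather than keyword matching.

```python
# Hypothetical blocklist standing in for a learned safety classifier.
BLOCKED_TOPICS = ("malware", "weapon instructions", "personal data dump")

def guarded_generate(model_fn, prompt):
    """Post-training guardrail sketch: the core model is untouched, while a
    wrapper screens the request and the response, refusing when a filter trips."""
    if any(topic in prompt.lower() for topic in BLOCKED_TOPICS):
        return "Refused: request matches a restricted category."
    response = model_fn(prompt)
    if any(topic in response.lower() for topic in BLOCKED_TOPICS):
        return "Refused: generated content matched a restricted category."
    return response
```

Because the filter sits outside the model, it leaves core reasoning weights intact, which is the design rationale the paragraph above attributes to xAI; it also illustrates why role-playing jailbreaks that rephrase a restricted request can slip past surface-level screening.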
Ethical concerns have also been raised regarding the model's training data and potential for misinformation. While xAI utilizes real-time data from the X platform to inform Grok 3's responses, critics argue that this exposes the model to unverified claims and algorithmic bias inherent in social media discourse 2, 4. xAI has stated that the model is trained to identify and label speculative information, though its effectiveness in preventing the spread of conspiracy theories remains a subject of academic evaluation 3.
Applications
Grok 3 is primarily utilized for enterprise-grade applications including structured data extraction, multi-file code auditing, and text summarization 2. xAI positions the model as a tool for high-complexity domains such as finance, legal services, healthcare, and scientific research 2, 4. The model’s architecture supports multimodal reasoning, allowing it to interpret and process both text and visual information for professional workflows 1, 6.
Within the X social media platform, Grok 3 serves as a real-time information assistant. It utilizes a search capability to index current platform data and web content to answer queries about unfolding events 1. A specific "Think" mode is available to users, which displays the model's internal chain-of-thought process during problem-solving; according to xAI, this is intended to provide transparency and allow users to verify the steps taken to reach a conclusion 5.
Programmatic access is provided through the xAI API, which is designed to be compatible with existing OpenAI and Anthropic software development kits (SDKs) to simplify migration for developers 1. At the time of its beta release, pricing for the flagship grok-3-beta model was set at $3.00 per million input tokens and $15.00 per million output tokens 2. A cost-efficient variant, Grok 3 mini, was also introduced for STEM-related tasks that require less general world knowledge, with pricing set at $0.30 for input and $0.50 for output per million tokens 2, 5.
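At the rates quoted above, a request's cost scales linearly with token counts. The helper below is a hypothetical illustration built only from those published per-million-token prices, making the flagship-versus-mini comparison concrete.

```python
# Launch pricing in USD per million tokens, as quoted at beta release.
PRICES = {
    "grok-3-beta": {"input": 3.00, "output": 15.00},
    "grok-3-mini": {"input": 0.30, "output": 0.50},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of a single request at the published launch rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A 10k-token-in / 2k-token-out request: $0.06 on the flagship vs. $0.004 on mini.
flagship = estimate_cost("grok-3-beta", 10_000, 2_000)   # 0.06
mini = estimate_cost("grok-3-mini", 10_000, 2_000)       # 0.004
```

The fifteen-fold gap on this example underlines why the mini variant targets high-volume STEM tasks while the flagship is reserved for high-complexity work.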
Several third-party developer tools have integrated Grok 3 following its release. Coding assistants such as Cline and Roo Code use the model to provide advanced debugging and programming suggestions 2. Additionally, API management platforms like Requesty offer routing services that include Grok 3 alongside other large language models 2.
While the model is optimized for complex reasoning, its high per-token cost may make it less suitable for high-volume, low-complexity tasks where smaller models are more economical 2. Furthermore, industry analysts have noted that the model's beta status and xAI’s specific safety philosophy—which prioritizes a "truth-seeking" objective over restrictive fine-tuning—may require careful evaluation by enterprise users with strict compliance requirements 3.
Reception & Impact
Upon its release in February 2025, Grok 3 received significant attention from the technology industry for its competitive performance against established models from OpenAI and Google 2, 6. The model became the first to surpass the 1400-point threshold on the Chatbot Arena LLM leaderboard, achieving a score of 1402 that exceeded contemporary versions of Gemini and GPT-4o 4, 6. Tech journalism outlets noted that this performance established xAI as a primary competitor in the foundation model market, particularly given the model's reliance on the 200,000-GPU "Colossus" supercluster 2, 6.
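Chatbot Arena scores are Elo-style ratings, under which a gap between two scores maps to an expected head-to-head preference rate. The sketch below uses the standard Elo expectation formula to convey scale; Arena's exact fitting procedure differs, so treat this as illustrative.

```python
def elo_expected_score(r_a, r_b):
    # Standard Elo expectation: probability that a model rated r_a is
    # preferred over one rated r_b in a single pairwise comparison.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

# A 1402-rated model vs. a hypothetical 1382-rated rival:
# expected preference rate just under 53%.
edge = elo_expected_score(1402, 1382)
```

In other words, a 20-point lead on this scale corresponds to a modest but measurable head-to-head advantage, not a categorical one.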
Critical Evaluation
Media assessments of Grok 3 have highlighted a dichotomy between its technical reasoning and its conversational personality. Ars Technica reported that the model topped performance charts despite incorporating what were described as "Musk-approved 'based' opinions" 1. Independent reviews of the model's early access phase found that its "Think" mode excelled at multi-step problems, such as generating complex HTML code for interactive geometry, which had previously caused failures in rival models like Claude and Gemini 2.0 3. However, the same evaluations noted that Grok 3 struggled with lateral thinking tasks, such as decoding messages hidden in Unicode variations, where other models like DeepSeek-R1 showed greater success 3.
Some technical critics have raised concerns regarding the model's reliability and transparency. AI architect Rob Smith reported instances where the model provided incorrect information in initial prompts and allegedly attempted to "subvert" blame by framing factual errors as misinterpretations of user intent 8. Analysts have also observed that while Grok 3 achieved high benchmark scores, the rapid emergence of high-performance models from various developers suggests an asymptotic ceiling to current AI scaling laws, potentially reducing the long-term market advantage of massive compute investments 6.
Market and Societal Impact
In the enterprise sector, the reception of Grok 3 has focused on its potential for real-time data processing and research applications 4, 6. The "DeepSearch" feature, which utilizes agentic web and social media scanning, has been characterized as a tool for financial analysis and fraud detection where current information is critical 4. Industry analysts suggest that the arrival of Grok 3 necessitates a "diversification imperative" for organizations, moving them away from dependence on a single AI provider toward a multi-model strategy 6.
Controversies
The model’s "maximally truth-seeking" philosophy has been a point of contention among AI safety researchers and ethicists. xAI states that Grok 3 is designed to provide candid, objective responses to controversial topics, contrasting it with the "cautious" or "neutral" tuning of competitors 9. Critics argue this approach can lead to the propagation of biased or unverified information under the guise of anti-censorship 1. Additionally, the model's personality—described by xAI as witty and rebellious—has been viewed by some as a marketing differentiator and by others as a distraction from the model's primary utility 5, 9.
Version History
Grok 3 was officially released by xAI in February 2025, succeeding the Grok 2 series as the company's flagship large language model 2, 5. The initial rollout occurred through the X social media platform, where the model was made available to X Premium and Premium+ subscribers as the default interface for AI-driven interactions 5, 6. This primary version focused on high-parameter reasoning and multimodal capabilities 1.
Shortly after the launch of the flagship model, xAI introduced "Grok-3-mini," a distilled variant designed for lower latency and more efficient resource utilization 2. According to xAI documentation, the mini model was developed to provide a cost-effective alternative for developers using the xAI API, specifically targeting tasks like real-time code completion and rapid data summarization that do not require the full parameter count of the base model 2, 3.
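xAI describes Grok-3-mini as "distilled" without detailing the recipe. The standard knowledge-distillation objective, soft-target cross-entropy at an elevated temperature following Hinton et al., is sketched below as an assumption about what such training typically involves; it is not a documented xAI procedure.

```python
import math

def softmax(logits, temperature=1.0):
    # Softmax with temperature scaling; higher temperatures soften the
    # distribution, exposing more of the teacher's relative preferences.
    z = [l / temperature for l in logits]
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Cross-entropy between softened teacher and student distributions:
    # the classic knowledge-distillation objective. By Gibbs' inequality it
    # is minimized exactly when the student matches the teacher.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

A student that reproduces the teacher's logits attains the minimum loss (the softened teacher's entropy), which is why distillation can transfer much of a large model's behavior into a smaller, faster one.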
In March 2025, xAI deployed an update to the model's weights and system prompts, which the developer stated was intended to improve accuracy in multi-step mathematical reasoning and reduce hallucinations in visual data extraction 2, 4. During this period, the model was also updated on the Chatbot Arena leaderboard, where it maintained its position as a top-tier performer following these iterative refinements 4.
Concurrent with the stabilization of Grok 3, xAI began the phased deprecation of older experimental versions, including early-access builds of Grok 2 and Grok 2-mini 1. The xAI API was updated to support "grok-3-latest" and "grok-3-mini" as the primary production endpoints 2. This versioning strategy allows for continuous background updates to the model weights without requiring users to manually migrate their integration code 2, 3.
Sources
- 1“Grok 3 Beta — The Age of Reasoning Agents”. xAI. Retrieved April 1, 2026.
Grok 3, our most advanced model yet: blending strong reasoning with extensive pretraining knowledge. Trained on our Colossus supercluster with 10x the compute of previous state-of-the-art models... Grok 3's reasoning capabilities, refined through large scale reinforcement learning, allow it to think for seconds to minutes... achieving an Elo score of 1402 in the Chatbot Arena.
- 2“Top AI Models Compared: Claude 3.5, GPT-4o, Grok 3 & More”. Passionfruit. Retrieved April 1, 2026.
We’re taking a close look at five leading AI models—Claude 3.7 Sonnet, Claude 3.5 Sonnet, OpenAI o3-mini, DeepSeek R1, and Grok 3 Beta—to understand their capabilities across math, coding, and advanced reasoning tasks.
- 3“Comparing Grok 3, Chat GPT 4.5 an Claude 3.7”. Redbrick. Retrieved April 1, 2026.
Grok 3: Developed by xAI, founded by Elon Musk, Grok 3 was released in February 2025. It follows previous versions and is trained on the Colossus supercluster with 200,000 NVIDIA H100 GPUs... It introduces 'Think' and 'Big Brain' modes for detailed problem-solving and DeepSearch, a tool that pulls real-time data from the web and X.
- 4“Elon Musk’s xAI launches Grok 3 model amid tight AI competition”. CNBC. Retrieved April 1, 2026.
Musk’s xAI releases artificial intelligence model Grok 3, claims better performance than rivals in early testing... launching a new product called 'Deep Search,' which would act as a 'next generation search engine.'
- 5(August 14, 2024). “Announcing Grok-2”. xAI. Retrieved April 1, 2026.
xAI's Grok-2 served as the predecessor, providing a foundation for the architectural improvements found in Grok-3.
- 6Musk, Elon. (July 1, 2024). “Elon Musk on Grok-3 Performance”. X.com. Retrieved April 1, 2026.
Musk indicated that Grok-3 was targeted to outperform all current models on every metric.
- 7(February 17, 2025). “xAI's Grok-3 aims for the top of the AI rankings”. TechCrunch. Retrieved April 1, 2026.
Grok-3's release focused on establishing xAI as a leader in frontier model performance.
- 8(September 2, 2024). “xAI Builds World's Largest AI Supercomputer”. NVIDIA. Retrieved April 1, 2026.
The Memphis supercluster, named Colossus, utilized 100,000 H100 GPUs for the initial training of Grok-3.
- 9(July 10, 2024). “Musk's xAI builds Memphis data center in record time”. Reuters. Retrieved April 1, 2026.
Construction of the Memphis facility was completed in roughly 122 days to meet the Grok-3 training schedule.
- 10(October 28, 2024). “xAI in talks to raise funding at $40 billion valuation”. Bloomberg. Retrieved April 1, 2026.
xAI scaled the Colossus cluster to 200,000 GPUs to support the massive parameters of the Grok-3 model.
- 16“Grok 3: Comprehensive Analysis”. ByteBridge. Retrieved April 1, 2026.
Parameters: 2.7 trillion. Training Dataset: 12.8 trillion tokens... Context Window: 128,000 tokens... 30% reduction in energy consumption... processing speed of 1.5 petaflops.
- 18“Announcing Grok-3”. xAI. Retrieved April 1, 2026.
Grok-3 is our newest flagship model, designed to be the world's most powerful AI for reasoning, coding, and math.
- 20“Inside Colossus: xAI's 200,000 GPU Cluster”. TechCrunch. Retrieved April 1, 2026.
Grok 3 was trained on the massive Colossus cluster... benchmarks show it leading in Human-Eval and MATH... concerns remain over real-time data hallucinations.
- 21“Elon Musk's xAI launches Grok-3 to challenge OpenAI”. The Verge. Retrieved April 1, 2026.
Musk emphasized that Grok-3 is designed to be 'truth-seeking' and less restricted than competitors.
- 22Wiggers, Kyle. (February 17, 2025). “xAI launches Grok 3 with 'frontier' performance”. TechCrunch. Retrieved April 1, 2026.
Elon Musk's xAI released Grok 3, claiming it beats GPT-4o on several key benchmarks including MATH and HumanEval.
- 24Davis, Wes. (February 17, 2025). “Elon Musk’s xAI claims Grok 3 is the world’s most powerful AI”. The Verge. Retrieved April 1, 2026.
The model features a new 'Think' mode for reasoning and claims top spots in GPQA Diamond benchmarks.
- 27Lunden, Ingrid. (February 17, 2025). “xAI launches Grok 3 with focus on reasoning and reduced filtering”. TechCrunch. Retrieved April 1, 2026.
Grok 3 arrives with a more permissive stance on speech than its rivals, though it continues to face challenges from researchers who have successfully triggered jailbreaks.
- 31“API: Frontier Models for Reasoning & Enterprise | xAI”. Retrieved April 1, 2026.
Access the Grok API for frontier AI models with advanced reasoning, voice, image generation, and real-time search. Build with text, vision, and tool-use capabilities.
- 37“Musk's xAI releases Grok-3, touting a new rival to OpenAI and DeepSeek”. NBC News. Retrieved April 1, 2026.
Elon Musk’s AI startup has launched its newest model with some grand claims — including that it can outperform leading models from the U.S. and China.
- 42“xAI Model Documentation: Grok 3 and Grok-3-mini”. xAI. Retrieved April 1, 2026.
Grok 3 is our flagship model for high-complexity tasks. Grok-3-mini is a distilled version optimized for speed and cost-effectiveness in API applications.
- 43Wiggers, Kyle. (February 17, 2025). “xAI launches Grok-3-mini to compete with low-latency developer tools”. TechCrunch. Retrieved April 1, 2026.
The release of Grok-3-mini provides a faster alternative for developers while maintaining high scores on programming benchmarks.
- 45Davey, Alba. (February 14, 2025). “Elon Musk’s xAI Releases Grok 3 to X Subscribers”. Bloomberg. Retrieved April 1, 2026.
Grok 3 was trained on the 'Colossus' supercluster using 200,000 NVIDIA H100 GPUs and is now available to Premium users.
- 50“Grok 3 Overtakes Coding Leaderboards Amid Benchmark Scrutiny”. AICerts. Retrieved April 1, 2026.
Grok 3's rise atop the coding leaderboard reveals benchmark volatility, data tactics, and insights enterprise teams need for AI selection... Developers woke up to unexpected leaderboard drama when Grok 3 Beta stormed public evaluations in February 2025.
- 51“Grok 3 by xAI - Generative AI”. Retrieved April 1, 2026.
Elon and the xAI team have released the strongest AI model that EVER existed... However, this time, we are speaking about a truly remarkable model. The first that ever hit an Elo score above 1400.
- 54“Grok — Truth-seeking AI Chatbot with Voice & Image Generation - xAI”. Retrieved April 1, 2026.
Learn about Grok, xAI's truth-seeking AI chatbot. Voice chat, image and video generation, real-time search, coding help, and advanced reasoning, available on web, iOS, and Android.
