Grok 4 Fast

Grok 4 Fast is a proprietary large language model (LLM) developed by xAI and released on September 19, 2025 4, 34. As a specialized variant within the Grok 4 model family, it is designed to prioritize inference speed and operational cost-efficiency over the extensive reasoning processes found in more computationally intensive models 5, 36. The model is accessible through the xAI platform, mobile applications for iOS and Android, and third-party API providers including OpenRouter and Vercel 3, 20, 30. xAI positioned the model as a strategic expansion of its model suite, targeting applications where high-throughput responses are more critical than complex, multi-step logical deduction 5, 24.
Technically, Grok 4 Fast is a multimodal model capable of processing both text and image inputs while generating text-based outputs 19, 34. It features a 2-million-token context window, which xAI asserts enables the model to ingest and analyze approximately 3,000 pages of text in a single prompt 5, 33. While xAI has not publicly disclosed the model's parameter count or architectural details, it is classified as a proprietary system with restricted weights 1, 12. It is intended for tasks such as real-time question-answering, rapid code suggestions, and document drafting, using its large context capacity to handle extensive datasets or long-form conversation histories 2, 24.
Independent evaluations by Artificial Analysis characterize Grok 4 Fast as a leading model in its class for the balance of intelligence and speed 1, 37. In performance testing, the model’s output speed and time to first token (TTFT) metrics were found to be above the median performance for non-reasoning models in its price tier 1, 27. According to xAI, the model is substantially faster than the standard Grok 4, utilizing fewer "thinking tokens" on average while maintaining performance on core benchmarks 5, 34.
On the Artificial Analysis Intelligence Index—which aggregates performance across coding, knowledge, mathematics, and scientific reasoning—Grok 4 Fast achieved a score significantly higher than the category average 1, 37. According to data provided by the developer, the model demonstrates high proficiency in specialized benchmarks, with xAI reporting scores of 85.7% on GPQA Diamond, 92% on AIME 2025, and 93.3% on HMMT 2025 5, 25. The model's pricing is established at $0.20 per million input tokens and $0.50 per million output tokens, positioning it as a high-speed alternative to "mini" or "lite" models from other developers, such as OpenAI's GPT-5 mini 20, 26, 40.
Background
xAI developed Grok 4 Fast as part of a rapid iteration cycle following the July 9, 2025, release of the baseline Grok 4 model 29, 41. The model was officially introduced in early access beta on September 19, 2025, positioned as an efficiency-focused update released prior to the anticipated Grok 5 4, 34. Its development was influenced by a broader industry shift toward compact, high-capability models, often designated as "Flash" or "Mini" variants by competitors such as Google and OpenAI 11, 36.
The primary motivation for the model's creation was to provide reasoning capabilities with lower operational costs than its predecessor 4, 5. According to xAI, the model achieves performance levels comparable to the standard Grok 4 on academic benchmarks while utilizing 40% fewer "thinking tokens" on average 34. This optimization is intended to reduce computational overhead and latency 5, 34. xAI states that the model is specifically optimized for a "price-to-intelligence ratio" intended for its "Deep Search" function 5, 34.
At the time of release, the generative AI market was increasingly focused on balancing reasoning depth against operational cost 11, 36. Grok 4 Fast was designed to compete with models such as OpenAI's GPT-5 mini and Google's Gemini 2.5 Pro 11, 36, 40. Both Artificial Analysis and xAI's technical documentation list the model at $0.20 per million input tokens and $0.50 per million output tokens for both reasoning and non-reasoning modes 1, 20, 37.
The model's integration was tailored for the X (formerly Twitter) ecosystem, where it is available to X Premium and Premium+ users 5, 34. Beyond consumer use on the grok.com platform and mobile applications, xAI targeted broader developer adoption by making the model available via the xAI API and third-party gateways such as OpenRouter and Vercel AI Gateway 3, 30. This strategic rollout followed earlier efforts to expand the Grok user base, including 2025 integrations with other communication platforms and targeted growth in specific regional markets 2.
Architecture
Grok 4 Fast uses a transformer architecture optimized for low-latency inference and operational efficiency 11, 17. Unlike model families that split reasoning and general-purpose tasks across distinct models, xAI states that Grok 4 Fast employs a unified architecture that natively integrates "reasoning" and "non-reasoning" modes 11, 17. This design is intended to let the model dynamically adjust its computational expenditure to the complexity of the query 11. The architecture is multimodal, using an encoder-decoder structure that enables it to process both text and image inputs 18.
Context and Memory Management
According to xAI, the model supports a 2 million token context window, which is approximately equivalent to 3,000 pages of text 17, 18. Supporting this expansive context required specific innovations in memory management and positional encoding to maintain coherence and retrieval accuracy across large datasets 17. While the previous generation Grok-1 utilized an 8,000-token context window, the Grok 4 family represents a significant expansion in the ability to process and maintain awareness of long-form documents and extended conversations 22.
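The "2 million tokens ≈ 3,000 pages" equivalence can be sanity-checked with simple arithmetic. The sketch below is a back-of-envelope plausibility check (not from xAI documentation); the ~500 words per page and ~1.3 tokens per word figures are common rules of thumb, assumed here for illustration.

```python
# Back-of-envelope check: tokens per page implied by xAI's stated
# equivalence, versus a common rule of thumb (~500 words/page at
# roughly 1.3 tokens per English word). Illustrative only.

CONTEXT_TOKENS = 2_000_000   # Grok 4 Fast context window, per xAI
PAGES = 3_000                # xAI's stated page equivalence

tokens_per_page = CONTEXT_TOKENS / PAGES   # ~667 tokens/page implied
rule_of_thumb = 500 * 1.3                  # ~650 tokens/page assumed

print(f"{tokens_per_page:.0f} tokens/page implied by xAI's figures")
print(f"{rule_of_thumb:.0f} tokens/page from the rule of thumb")
```

The two estimates land within a few percent of each other, so the page equivalence is consistent with typical English tokenization rates.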
Training Methodology
The training of Grok 4 Fast leveraged large-scale reinforcement learning (RL) to optimize what xAI describes as "intelligence density" 17. This process focused on maximizing performance benchmarks while minimizing the number of "thinking tokens" required for complex problem-solving 11, 17. xAI asserts that this methodology resulted in a 40% reduction in average thinking tokens compared to the standard Grok 4 model 11, 17. Furthermore, the model was trained end-to-end with tool-use reinforcement learning, specifically targeting capabilities such as autonomous web browsing, X (formerly Twitter) search, and code execution 17.
Computational Infrastructure
Training was conducted on the "Colossus" supercomputer cluster located in Memphis, Tennessee 19, 20. At the time of the model's development, the cluster utilized an array of NVIDIA H100, H200, and Blackwell-series (GB200 and GB300) GPUs 19, 20, 21. By early 2026, the cluster expanded to approximately 555,000 GPUs with a total power capacity of 2 gigawatts 20, 21. The infrastructure provides a memory bandwidth of 194 petabytes per second (PB/s) and a network throughput of 3.6 terabits per second (Tb/s) per server 20. xAI utilizes NVIDIA’s Spectrum-X Ethernet networking to facilitate high-speed data transfer across the cluster 20.
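The cluster-level figures above imply a simple power budget per accelerator. The sketch below is back-of-envelope arithmetic, assuming the 2-gigawatt figure covers total site power (including cooling and networking overhead, so actual per-GPU draw would be lower).

```python
# Rough facility-level arithmetic from the reported Colossus figures.
# Assumption: 2 GW is total site capacity, not GPU draw alone.

TOTAL_POWER_W = 2e9    # 2 gigawatts of reported capacity
GPU_COUNT = 555_000    # reported GPU count by early 2026

watts_per_gpu = TOTAL_POWER_W / GPU_COUNT  # ~3,600 W of site power per GPU
print(f"{watts_per_gpu:.0f} W of site power per GPU")
```

At roughly 3.6 kW of site power per GPU, the figure is plausible for high-end accelerators once cooling and networking overhead are included.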
Technical Performance and Specifications
While xAI has not publicly disclosed the exact parameter count for Grok 4 Fast, industry analysis of the Grok family suggests the models occupy the same scale class as other frontier large language models (LLMs) such as GPT-4 and Claude 3 22. Third-party testing by Artificial Analysis recorded an average inference speed of 134.1 output tokens per second, which is higher than the recorded industry average of 96 tokens per second for its model class 18. The architecture’s emphasis on token efficiency reportedly resulted in a 98% decrease in operational costs relative to the baseline Grok 4 model when achieving equivalent results on reasoning benchmarks 11, 17.
Capabilities & Limitations
Multimodality and Input Processing
Grok 4 Fast supports multimodal inputs, including text, images, and voice 17. The model is designed to analyze complex visual data, such as engineering diagrams, and provide real-time responses to spoken prompts when accessed through xAI's mobile applications 17. While it processes these diverse modalities, its primary output remains text-based, supplemented by natural-sounding speech synthesis in specific consumer-facing app environments 17. For developers utilizing the API, the model facilitates "frontier-level" tool-calling, allowing it to interact with external services and execute tasks such as web searches on the open internet or live social data searches on the X platform 4.
Context Window and Document Analysis
A defining characteristic of Grok 4 Fast is its 2-million-token context window 17. xAI states that this expanded memory allows the model to process entire codebases, extensive legal documents, or lengthy conversational histories without requiring the user to chunk or summarize the input 17. According to technical assessments, the training of this model family focused on maintaining retrieval quality and reasoning accuracy across the entire 2-million-token span, a technique referred to as long-horizon reinforcement learning 4.
Speed and Inference Performance
The model is optimized for high throughput and low latency. Independent analysis and xAI data indicate that Grok 4 Fast can reach output speeds of approximately 342.3 tokens per second, roughly ten times faster than the baseline Grok 4 model 17. The model's time-to-first-token (TTFT) is reported at 2.55 seconds in this configuration 17, while third-party testing of the non-reasoning mode recorded a lower figure of 0.54 seconds 1. This speed is achieved partly through an optimization of "thinking tokens": the model uses 40% fewer internal reasoning steps on average than its larger counterparts to reach a conclusion 17.
Reasoning Modes and Benchmark Performance
Grok 4 Fast operates using two distinct modes: a low-latency "non-reasoning" mode for straightforward queries and a "reasoning" mode for tasks requiring multi-step logic 4, 17. xAI claims that despite its focus on speed, the model maintains high accuracy on specialized benchmarks. It has recorded scores of 85.7% on GPQA Diamond (graduate-level science), 92% on AIME 2025 (mathematics), and 93.3% on HMMT 2025 (advanced math) 17. The model's reasoning mode is specifically engaged for mathematical and logical analysis, though this mode is slower and consumes more computational resources than the standard non-reasoning mode 4.
Limitations and Intended Use
Despite its performance on specific benchmarks, xAI acknowledges that Grok 4 Fast may sacrifice depth for complex reasoning when compared to larger, non-optimized models 17. It is intended primarily for developers and knowledge workers who require fast, reliable outputs for coding, debugging, and report generation rather than those requiring the highest possible level of creative or philosophical nuance 17.
In terms of reliability, third-party reports suggest that while Grok 4 Fast represents an improvement over earlier iterations, it may still exhibit higher hallucination rates than its successor, Grok 4.1 Fast, which was later released with a reported 50% reduction in factual errors 4. Consequently, it is best suited for agentic workflows where it can verify information through tool use—such as Python code execution or document search with citations—rather than relying solely on internal weights for factual recall 4.
Performance
Grok 4 Fast's performance is characterized by high inference speeds and a competitive price-to-intelligence ratio within the non-reasoning model category 1. On the Artificial Analysis Intelligence Index, the model achieved a score of 23, which is significantly higher than the class median of 15 1. This composite index incorporates ten distinct evaluations, including GDPval-AA for agentic real-world work tasks, SciCode for coding proficiency, and GPQA Diamond for scientific reasoning 1. According to independent analysis, this score ranks Grok 4 Fast 14th out of 72 models in its specific class 1.
In terms of processing speed, Grok 4 Fast generates output at an average rate of 134.1 tokens per second (TPS) when accessed via the xAI API 1. This output rate exceeds the non-reasoning model class average of 95.7 TPS 1. The model's latency, measured as time to first token (TTFT), is recorded at 0.54 seconds 1. This response time is identified as highly competitive compared to the class median of 1.31 seconds, positioning the model alongside other low-latency variants such as GPT-4o-mini and Claude Haiku 1. For an end-to-end response of 500 tokens, the model maintains a lower overall response time due to its combination of rapid initial processing and sustained output speed 1.
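The end-to-end comparison in the paragraph above follows from the two reported metrics: total response time is approximately the TTFT plus the output length divided by the sustained output rate. The sketch below reproduces that arithmetic using only the figures cited in the text; it is illustrative, not a benchmark.

```python
# End-to-end latency estimate: total ≈ TTFT + n_tokens / output_speed.
# All inputs are the Artificial Analysis figures quoted in the text.

def response_time(ttft_s: float, tokens_per_s: float, n_tokens: int = 500) -> float:
    """Approximate wall-clock time to stream n_tokens of output."""
    return ttft_s + n_tokens / tokens_per_s

grok_4_fast = response_time(ttft_s=0.54, tokens_per_s=134.1)   # ~4.3 s
class_median = response_time(ttft_s=1.31, tokens_per_s=95.7)   # ~6.5 s

print(f"Grok 4 Fast:  {grok_4_fast:.1f} s for 500 tokens")
print(f"Class median: {class_median:.1f} s for 500 tokens")
```

The roughly two-second gap for a 500-token response shows how the fast TTFT and the above-average output rate compound in interactive use.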
The model's cost efficiency is defined by its pricing of $0.20 per 1 million input tokens and $0.50 per 1 million output tokens 1. Based on a standard 3:1 input-to-output token ratio, the blended cost is approximately $0.28 per 1 million tokens 1. While its input pricing aligns with the class average, its output pricing is lower than the $0.70 median for comparable models 1. Running the full suite of evaluations for the Artificial Analysis Intelligence Index reportedly incurred a total token cost of $17.45 for Grok 4 Fast 1.
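The blended figure quoted above can be reproduced directly from the per-token rates: with a 3:1 input-to-output ratio, 75% of tokens bill at the input rate and 25% at the output rate.

```python
# Blended cost under a 3:1 input-to-output token ratio,
# using the published Grok 4 Fast rates.

INPUT_PRICE = 0.20   # USD per 1M input tokens
OUTPUT_PRICE = 0.50  # USD per 1M output tokens

blended = 0.75 * INPUT_PRICE + 0.25 * OUTPUT_PRICE  # 0.275
print(f"${blended:.2f} per 1M tokens (blended)")    # ≈ $0.28
```

The result, $0.275 per million tokens, rounds to the $0.28 blended cost reported by Artificial Analysis.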
Performance is further influenced by the model's relative conciseness in comparison to its peers 1. During benchmark testing, Grok 4 Fast utilized 4.3 million output tokens to complete the Intelligence Index evaluations, whereas the average model in its class required 5.5 million tokens 1. This lower verbosity contributes to reduced operational costs and faster completion times for complex prompts 1. Additionally, the model's 2-million-token context window is designed to support large-scale retrieval-augmented generation (RAG) workflows, though xAI notes that output limits are typically more restricted than the total input capacity 1. In November 2025, a successor model, Grok 4.1 Fast, was released with the intent of further expanding these capabilities 4.
Safety & Ethics
Safety and Alignment Techniques
xAI utilizes large-scale reinforcement learning (RL) to optimize the behavioral propensities of the Grok 4 series, including Grok 4 Fast 7. According to the developer, the model’s alignment process focuses on a "refusal policy" designed to reject requests with clear intent to violate the law—specifically involving chemical, biological, radiological, and nuclear (CBRN) weapons, cyberattacks, self-harm, and child sexual abuse material (CSAM) [Grok 4 Model Card, Grok 4.1 Model Card]. xAI states that its training methodology attempts to balance these refusals with a goal of avoiding the "over-refusal" of sensitive or controversial topics [Grok 4.1 Model Card]. For the Grok 4.1 Fast update, xAI reported a significant reduction in hallucination rates, claiming the model halved its errors compared to earlier versions when evaluated on the FActScore benchmark 18.
Content Filtering and Red-Teaming
Independent security evaluations have highlighted vulnerabilities in the model's standard guardrails. Research by SplxAI found that when Grok 4 was tested without a system prompt, it obeyed hostile instructions in over 99% of prompt injection attempts and leaked restricted data [SplxAI]. However, when a hardened system prompt was applied, the model's safety score reached 100% in SplxAI's specific testing environment [SplxAI].
Multimodal safety has also been a subject of third-party red-teaming. Researchers at NeuralTrust identified a vulnerability termed "Semantic Chaining," which allows users to bypass image safety filters [NeuralTrust]. By using a sequence of seemingly innocuous instructions to modify a base image, attackers can force the model to generate prohibited visuals or render prohibited text-within-images (text-in-image exploits) that standard filters fail to detect [NeuralTrust]. NeuralTrust also demonstrated the "Echo Chamber" attack, which poisons the conversational context over multiple turns to gradually weaken the model's safety resistance [NeuralTrust].
Ethical Concerns and Political Bias
xAI states that it implements safeguards to improve political objectivity and prevent the model from becoming sycophantic [Grok 4 Model Card]. However, independent testing by Promptfoo found that Grok 4 exhibits a "maximalist" and contrarian persona, often disagreeing when other AI models agree [Promptfoo]. While Promptfoo's measurements categorized Grok 4 as more right-leaning than competitors like GPT-4.1, they concluded the model remains "left of center" on an absolute ideological scale [Promptfoo].
Data Privacy and X Integration
Grok 4 Fast’s native integration with the X platform allows for real-time search and ingestion of user-generated media, including posts and videos 7, 18. This integration has raised concerns regarding exposure to unfiltered or adversarial information environments [DataStudios]. Analysts have noted that during high-stakes events, such as geopolitical crises, the model’s reliance on real-time social media data can lead it to synthesize and propagate viral misinformation or coordinated propaganda rather than verified facts [DataStudios]. xAI addresses these risks through an "input filter model" designed to screen sensitive requests before they reach the core reasoning engine [Grok 4.1 Model Card].
Applications
Grok 4 Fast is primarily deployed in scenarios requiring high-speed inference and cost-effective data processing 1, 4. Its most visible consumer-facing application is the real-time search and summarization feature on the X social media platform 4. According to xAI, the model's integration with the platform's live data stream allows it to provide summaries of breaking news and trending topics more rapidly than models limited to static training datasets 4.
In professional and enterprise contexts, the model is positioned as a specialized solution for autonomous agent workflows and tool-calling 4. It is used to drive back-end processes such as automated customer support, where it has been evaluated using the τ²-bench Telecom benchmark 4. This benchmark measures an AI's ability to resolve technical service issues by independently selecting and invoking tools; xAI reports that Grok 4 Fast variants have achieved top rankings in this category compared to other closed-source frontier models 4. The model also supports the Model Context Protocol (MCP) and xAI's Agent Tools API, enabling it to execute Python code and conduct file searches with inline citations 4.
For developers, the model’s API is utilized for high-volume Retrieval Augmented Generation (RAG) and structured data extraction 4. With an expanded context window of up to 2 million tokens, the model can ingest and analyze extensive technical manuals, financial reports, or large document repositories 4. Its pricing—approximately $0.20 per million input tokens—is intended to support high-frequency tasks where the operational expense of more computationally intensive models would be prohibitive 4.
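For API-based use, a request can be sketched as follows. This is a hypothetical example, assuming the xAI API accepts the widely used OpenAI-style chat-completions schema and that the model identifier is `grok-4-fast`; the actual endpoint URL, model name, and authentication are not confirmed here and should be taken from xAI's documentation.

```python
# Hypothetical sketch of a chat-completions request body for the xAI API.
# Assumptions (verify against xAI's docs): OpenAI-style message schema,
# model identifier "grok-4-fast". No network call is made here.
import json

def build_request(prompt: str, model: str = "grok-4-fast") -> str:
    """Serialize a minimal chat request as a JSON string."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": prompt},
        ],
        "stream": True,  # stream tokens to benefit from the low TTFT
    }
    return json.dumps(payload)

body = build_request("Summarize the attached technical manual.")
print(json.loads(body)["model"])  # grok-4-fast
```

Sending the body would require an HTTP POST with an API key; streaming (`"stream": True`) is the natural fit for a model marketed on time-to-first-token.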
Multimodal applications include in-app image analysis and captioning within the xAI mobile applications for iOS and Android 17. Users can utilize the model to interpret visual data, such as engineering diagrams, or to receive real-time answers to spoken queries through voice-based interaction 17.
Despite its speed, Grok 4 Fast is not recommended for tasks requiring extensive multi-step reasoning or stylistic creative writing 4. Independent evaluations, such as the EQ-Bench for creative writing, indicate that the model does not excel in artistic or nuanced text generation 4. Furthermore, it is not designed for frontier-level coding or complex mathematical proofs, which the developer typically assigns to the more powerful Grok 4 or Grok 4 Heavy models 4.
Reception & Impact
Industry reception of Grok 4 Fast has largely focused on its aggressive pricing strategy and its position within a market split between high-speed "flash" models and slower, reasoning-heavy models 17. Upon its release in September 2025, tech analysts noted that the model's pricing of $0.20 per million input tokens and $0.50 per million output tokens represented a significant reduction in operational costs, making it roughly 47 times cheaper than the standard Grok 4 model 17. xAI further incentivized high-volume developer usage by offering a Batch API that provides a 50% discount on these rates for asynchronous processing 8.
Media commentary has characterized Grok 4 Fast as xAI's attempt to compete with the "small-but-capable" model trend established by competitors such as Google and OpenAI 17. Third-party analysis from Artificial Analysis highlighted the model's "price-to-intelligence ratio," asserting that it achieves performance comparable to Gemini 2.5 Pro at a lower cost 17. The model's performance in specialized benchmarks—including ranking first on the LMArena Search Arena and outperforming OpenAI's o3-search—has been cited as evidence of its effectiveness for real-time retrieval tasks 17.
In terms of developer adoption, the model's inclusion on third-party platforms such as OpenRouter and Vercel AI Gateway has expanded its reach beyond the native xAI ecosystem 17. Industry observers suggest that the lower barrier to entry provided by the "Fast" variant has likely contributed to xAI's growing user base, which reached an estimated 30 million monthly active users by August 2025 17. The model has seen significant adoption in the Indian market, a region where competitors like Gemini and ChatGPT previously held dominant positions 17.
While xAI asserts that Grok 4 Fast maintains near-equivalent accuracy to Grok 4 while using 40% fewer "thinking tokens," some commentary has noted that the model may sacrifice depth in complex reasoning scenarios compared to its larger counterparts 17. Despite these concerns, the rapid integration of the model into the X social media platform and Telegram has driven consumer engagement, evidenced by over 50 million downloads of the Grok application on the Google Play Store 17. The model's low latency, with a reported time-to-first-token of 2.55 seconds and an output speed of 342.3 tokens per second, has made it a frequent choice for developers building interactive applications like voice agents and real-time assistants 17.
Version History
Grok 4 Fast was officially released in early access beta on September 19, 2025 1, 17. At launch, the model was characterized by its 2-million-token context window and a pricing structure of $0.20 per million input tokens and $0.50 per million output tokens 17. According to xAI, the initial version achieved inference speeds up to 10 times faster than the baseline Grok 4 model by utilizing 40% fewer "thinking tokens" during processing 17. Early updates in September 2025 introduced tool-calling capabilities, a calculator, unit conversion tools, and the ability to generate videos from chat-generated images 8.
In November 2025, the model underwent a significant transition to the Grok 4.1 architecture 7. This update followed a "silent rollout" period from November 1 to November 14, 2025, during which xAI performed blind pairwise evaluations on live production traffic 7. The developer stated that the Grok 4.1 builds were preferred by users 64.78% of the time compared to the previous production model 7. Grok 4.1 Fast was formally released for web and mobile users on November 17, 2025, and was integrated into the xAI Enterprise API on November 19 5, 7. This iteration focused on improving the model's emotional intelligence and creative writing capabilities; the developer also claimed a threefold reduction in hallucinations compared to earlier versions 7, 8.
Following the 4.1 release, xAI updated its Agent Tools to support the Fast model variant and implemented a pricing reduction of up to 50% for tool-call execution 5. Later updates added interactive "cards" in the chat interface for real-time data such as stocks, sports, and weather 8. The model line continued to evolve into early 2026, with the introduction of Grok 4.20 and a specialized multi-agent variant on March 10, 2026 2.
Sources
- 1“Grok 4 Fast - Intelligence, Performance & Price Analysis”. Retrieved March 24, 2026.
Grok 4 Fast (Non-reasoning) was released on September 19, 2025. It was created by xAI. The model supports text and image input, outputs text, and has a 2m tokens context window. It generates output at 134.1 tokens per second with a time to first token (TTFT) of 0.54s.
- 2“What is xAI's Grok 4 Fast?”. Retrieved March 24, 2026.
Grok 4 Fast is a cost-efficient AI model developed by xAI, designed to deliver high-speed responses. Key features include a 2 million token context window, 40% fewer thinking tokens on average, and scores of 85.7% on GPQA Diamond and 92% on AIME 2025.
- 3“xAI on X: "For a limited time, Grok 4 Fast will be available for FREE on OpenRouter and Vercel AI Gateway."”. Retrieved March 24, 2026.
Grok 4 Fast is also generally available via the xAI API, with pricing starting at $0.20 / 1M input tokens and $0.50 / 1M output tokens.
- 4“xAI Releases Grok 4 Fast with Lower Cost Reasoning Model”. Retrieved March 24, 2026.
xAI has introduced Grok 4 Fast, a new reasoning model designed for efficiency and lower cost.
- 5“Grok 4 Fast”. Retrieved March 24, 2026.
Grok 4 Fast features... a 2M token context window, and a unified architecture that blends reasoning and non-reasoning modes in one model. We used large-scale reinforcement learning to maximize the intelligence density of Grok 4 Fast.
- 7“Colossus (supercomputer) - Grokipedia”. Retrieved March 24, 2026.
By February 2026, the cluster had been further expanded to approximately 555,000 NVIDIA GPUs of various types... The system features high memory bandwidth of 194 PB/s, network throughput of 3.6 Tb/s per server.
- 8“xAI Colossus Hits 2 GW: 555,000 GPUs, $18B, Largest AI Site - Introl”. Retrieved March 24, 2026.
Elon Musk announced xAI purchased a third building in Memphis, expanding Colossus to 2 gigawatts total capacity. The facility will house 555,000 NVIDIA GPUs.
- 11“Grok 4.1 Fast and Agent Tools API”. Retrieved March 24, 2026.
Grok 4.1 Fast sets a new standard in factuality, cutting the hallucination rate in half compared to Grok 4 Fast while still delivering performance on par with Grok 4 when evaluated on FActScore.
- 12“Grok 4 Model Card”. Retrieved March 24, 2026.
To reduce the potential for abuse of Grok 4... we take measures to improve Grok 4’s robustness, such as by adding safeguards to refuse requests... for chemical, biological, radiological, nuclear (CBRN) or cyber weapons.
- 17“Evaluating political bias in LLMs”. Retrieved March 24, 2026.
Grok is more right leaning than most other AIs, but it's still left of center... Grok is the most contrarian and the most likely to adopt maximalist positions.
- 18“Is Grok Safe for News and Politics?”. Retrieved March 24, 2026.
Grok’s integration with X ensures that it does not filter out these adversarial forces by default, making it more vulnerable to surfacing information that is popular or viral, rather than verified or trustworthy.
- 19“Grok 4 Model Card”. Retrieved March 24, 2026.
Grok 4 Fast supports multimodal inputs, including text, images, and voice... provide real-time responses to spoken prompts when accessed through xAI's mobile applications.
- 20“Models and Pricing | xAI Docs”. Retrieved March 24, 2026.
The Batch API lets you process large volumes of requests asynchronously at 50% of standard pricing — effectively cutting your token costs in half. ... All standard token types are billed at the rate for the model used in the request: Input tokens: Your query and conversation history
- 21“Grok 4.1”. Retrieved March 24, 2026.
November 17, 2025... We conducted a gradual silent rollout of preliminary Grok 4.1 builds to a progressively larger share of production traffic... During the two-week silent rollout (Nov 1-14) we ran continuous blind pairwise evaluations... preferred 64.78% of the time.
- 22“Grok Changelog”. Retrieved March 24, 2026.
Nov 18, 2025... Grok 4.1 released... Grok 4 Fast – state-of-the-art search and multimodal reasoning... 3× fewer hallucinations... Sep 05, 2025... Calculator, unit conversion tools... Tool calling for fast mode.
- 24“What Is Grok 4.1? A Look at xAI’s Latest AI Upgrade”. Retrieved March 24, 2026.
On November 17, 2025, xAI pushed out Grok 4.1 Beta... demonstrating higher emotional intelligence, crushing creative writing benchmarks.
- 25“xAI releases details and performance benchmarks for Grok 4 Fast”. Retrieved March 24, 2026.
- 26“Grok 4 Fast: Pricing, Benchmarks & Performance - LLM Stats”. Retrieved March 24, 2026.
Grok 4 Fast is a high-speed variant of Grok-4, optimized for faster inference while maintaining strong reasoning capabilities. It offers improved throughput and lower latency compared to the standard Grok-4 model.
- 27“Grok 4 Fast (Reasoning) API Provider Benchmarking & Analysis”. Retrieved March 24, 2026.
Analysis of API providers for Grok 4 Fast (Reasoning) across performance metrics including latency (time to first token), output speed (output tokens per second), and price. API providers benchmarked include Microsoft Azure and xAI.
- 34“Introducing Grok 4 Fast, a multimodal reasoning model with a 2M ...”. Retrieved March 24, 2026.
- 36“Grok 4 Fast: a lighter yet capable version of Grok 4 - Medium”. Retrieved March 24, 2026.
xAI introduced Grok 4 Fast in late September 2025 as a lighter yet capable version of Grok 4; the article was updated November 22, 2025, following the release of Grok 4.1 Fast.
- 37“Grok 4 Fast (Reasoning) Intelligence, Performance & Price Analysis”. Retrieved March 24, 2026.
Analysis of xAI's Grok 4 Fast (Reasoning) and comparison to other AI models across key metrics including quality, price, performance (tokens per second and time to first token), and context window.
- 40“GPT-5 Mini - API Pricing & Providers - OpenRouter”. Retrieved March 24, 2026.
GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It is priced at $0.25 per million input tokens and $2 per million output tokens, with a 400,000-token context window and a maximum output of 128,000 tokens.
- 41“Elon Musk confirms Grok 4 launch on July 9 with livestream event”. Retrieved March 24, 2026.
The Grok 4 rollout was accompanied by a livestream at 8 p.m. Pacific Time on July 9, 2025.
