Mistral Small 3.1 24B Instruct

Mistral Small 3.1 24B Instruct is a 24-billion parameter multimodal large language model (LLM) developed by the French AI firm Mistral AI 4. Released on March 17, 2025, the model serves as an iterative update within the "Small" family of models, specifically succeeding the text-only Mistral Small 3 (2501) variant 4. It is designed to balance computational efficiency with reasoning capabilities, targeting enterprise-scale applications and local deployments where privacy or hardware constraints are primary considerations 4. Unlike its immediate predecessor, version 3.1 introduces vision understanding, allowing it to process and analyze both text and image inputs within a single workflow 4.
The model is characterized by a 128,000-token context window, which facilitates the processing of long-form documents and complex, multi-turn conversational histories 4. Mistral AI states that the model is optimized for instruction-following, function calling, and the generation of structured outputs, which are essential for autonomous agents and integration with external software tools 4. Its training data encompasses dozens of languages and specialized datasets for mathematical reasoning and programming tasks 4. The model's architecture is optimized for inference on high-end consumer hardware, such as a single NVIDIA RTX 4090 GPU or specific Apple silicon configurations, providing a local alternative to larger cloud-hosted models 4.
Independent performance evaluations provided by Artificial Analysis place Mistral Small 3.1 24B Instruct in a moderate position within the mid-sized model market 4. It achieved an overall intelligence score of 14.5, outperforming 26% of compared models in its category, and a coding capability index of 13.9, which is higher than 39% of its peers 4. In specific reasoning benchmarks, the model recorded a score of 45.4% on the GPQA Diamond for graduate-level scientific reasoning and 29.9% on the IFBench instruction-following benchmark 4. While showing competency in technical domains with a 26.5% score on the SciCode Python programming benchmark, its performance in complex agentic scenarios was rated at 8.4 on the Artificial Analysis Agentic Index 4.
Within the Mistral AI product ecosystem, the 3.1 24B Instruct model is positioned as an intermediate solution between the smaller Ministral series and the flagship Mistral Large series 4. Shortly after its release, it was succeeded by Mistral Small 3.2, which Mistral AI indicates was designed to further improve accuracy on benchmarks like WildBench and reduce instances of repetitive or infinite generations 4. Mistral Small 3.1 has seen adoption in various third-party development environments and autonomous coding platforms, such as OpenClaw and Claude Code, where it is utilized for tasks including codebase exploration and multi-file editing 4. The model is accessible via API providers such as OpenRouter and Venice, and its weights have been made available for local execution 4.
Background
The development of Mistral Small 3.1 24B Instruct represents an evolution in Mistral AI’s strategy to provide mid-sized models that balance high-level reasoning with operational efficiency. Released on March 17, 2025, the model serves as a multimodal upgrade to Mistral Small 3 (2501), which was a text-only model released in January 2025 4. This iterative update reflects a broader industry shift toward integrating vision capabilities into standard instruction-tuned models, allowing the 'Small' family to move beyond pure text processing into visual document understanding and image analysis 4.
The lineage of the 'Small' model tier began with versions such as Mistral Small v24.09, which utilized a 22-billion parameter architecture 4. With the introduction of the 3.x series, the parameter count was increased to 24 billion. Mistral AI characterizes this 24B size as a strategic mid-point in their portfolio, positioned between the high-efficiency Mistral NeMo (12B) and the flagship Mistral Large 4. According to the developer, the 24B architecture is designed to offer a 'sweet spot' for enterprise deployments, providing reasoning capabilities comparable to larger models like Llama 3.3 70B or Qwen 32B while operating at approximately three times the speed on equivalent hardware 4.
The 3.1 update was driven by market demand for models capable of handling complex, data-rich workflows that involve both long-form text and visual information. While previous versions focused on low-latency text tasks, Mistral Small 3.1 was engineered to support a 128,000-token context window, facilitating the analysis of long documents and extensive codebases 4. The addition of multimodal features was intended to address use cases represented by benchmarks such as ChartQA and DocVQA, where models must interpret structured data within images or diagrams 4.
At the time of its release, the model was part of a rapid development cycle for Mistral AI, which saw the subsequent release of Mistral Small 3.2 only months later to further refine instruction following and reduce repetition 4. The 3.1 variant remained a significant milestone for the firm as it unified the reasoning strengths of their text models with the visual understanding developed for the Pixtral series, offering a unified system for complex analysis and privacy-sensitive local deployments 4.
Architecture
Model Structure and Parameters
Mistral Small 3.1 24B Instruct is a multimodal large language model based on the Transformer architecture 8. It features 24 billion parameters, a count intended to balance computational efficiency with high-level reasoning capabilities 4, 5. The model is an instruction-tuned version of the Mistral-Small-3.1-24B-Base-2503 foundation model 5, 8. According to Mistral AI, this parameter size allows the model to achieve performance metrics comparable to larger models, such as the Llama 70B series, while maintaining a smaller memory footprint 10.
Multimodal Integration
A significant architectural update in the 3.1 version is the integration of vision capabilities, making it a multimodal system capable of processing both text and image inputs simultaneously 4, 6. The architecture supports two-dimensional (2D) RGB image formats in addition to one-dimensional (1D) text strings 8. Mistral AI states that the vision component allows the model to perform tasks such as document understanding, image analysis, and visual reasoning 6, 10. The developer asserts that the model's visual reasoning performance is on par with their larger Pixtral Large model released in 2024 10.
Context Management and Tokenization
The model supports a context window of 128,000 tokens 4, 5, 6. Mistral AI reports that the architecture maintains 100% retrieval accuracy on passkey evaluations across this entire 128k range, which is intended to prevent performance degradation during the processing of long documents 10. The model utilizes the "Tekken" tokenizer, which features a vocabulary size of 131,000 tokens 5, 8. This tokenizer is designed for efficiency across dozens of supported languages and structured data formats like JSON 5.
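The practical implication of the 128k window is that entire reports or codebases can be packed into a single prompt, provided the token count is checked first. The following sketch counts prompt tokens with the Hugging Face transformers tokenizer; the repository identifier and the output-budget figure are assumptions introduced for the example, not values specified by Mistral AI.

```python
# Minimal sketch: checking a prompt against the 128k context window.
# Assumes the Hugging Face repo id below and that its Tekken tokenizer
# (~131k vocabulary) loads via AutoTokenizer.
from transformers import AutoTokenizer

MODEL_ID = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"  # assumed repo id
CONTEXT_WINDOW = 128_000

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

def fits_in_context(document: str, reserved_for_output: int = 2_000) -> bool:
    """Return True if the prompt plus an output budget fits in the window."""
    n_tokens = len(tokenizer.encode(document))
    print(f"Prompt tokens: {n_tokens:,} / {CONTEXT_WINDOW:,}")
    return n_tokens + reserved_for_output <= CONTEXT_WINDOW

print(fits_in_context("Quarterly report text ... " * 5_000))
```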
Optimization and Hardware Compatibility
The architecture is optimized for low-latency inference and deployment on standard enterprise and consumer-grade hardware 6. Mistral AI states that the model is "knowledge-dense" enough to fit on a single NVIDIA RTX 4090 GPU or a 32GB RAM MacBook once quantized 5, 6. Technical documentation from NVIDIA indicates the model is compatible with NVIDIA Ampere and Lovelace microarchitectures 8. Mistral AI reports that the model can achieve inference speeds of approximately 150 tokens per second in optimized environments 6.
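A rough weight-memory calculation illustrates why quantization is required for the single-GPU claim. The sketch below is a back-of-envelope estimate only: it counts weight storage at common precisions and ignores activations, KV cache, and runtime overhead.

```python
# Back-of-envelope sketch of weight memory for a 24B-parameter model at
# different precisions, showing why quantization is needed to fit a
# 24 GB RTX 4090. Activations, KV cache, and overhead are ignored.
PARAMS = 24e9
BYTES_PER_PARAM = {"bf16": 2.0, "fp8/int8": 1.0, "int4": 0.5}

for precision, nbytes in BYTES_PER_PARAM.items():
    gib = PARAMS * nbytes / 1024**3
    print(f"{precision:>8}: ~{gib:.1f} GiB of weights")
# bf16 ≈ 44.7 GiB (multi-GPU territory), fp8/int8 ≈ 22.4 GiB (fits a 24 GB
# card with little headroom), int4 ≈ 11.2 GiB (comfortable on consumer GPUs).
```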
Training and Agentic Capabilities
Mistral Small 3.1 24B Instruct was developed using refined training methodologies intended to improve long-context handling and tool-use precision 10. The model is described by the developer as "agent-centric," featuring native support for function calling and structured JSON output 5, 8. This design is intended to facilitate its use in autonomous workflows, such as codebase exploration and multi-step reasoning tasks 5, 6. While specific training datasets remain undisclosed, the developer notes that the model was fine-tuned to maintain strong adherence to system prompts and minimize repetitive outputs 5, 8.
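As an illustration of the function-calling interface, the sketch below sends a tool-enabled chat request in the OpenAI-compatible format that most hosted providers of the model expose. The endpoint URL, model alias, and the get_ticket_status tool are assumptions introduced for the example, not values documented by Mistral AI.

```python
# Hedged sketch of a function-calling request via an OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(base_url="https://api.mistral.ai/v1", api_key="YOUR_KEY")  # assumed endpoint

tools = [{
    "type": "function",
    "function": {
        "name": "get_ticket_status",  # hypothetical tool for illustration
        "description": "Look up a support ticket by its id.",
        "parameters": {
            "type": "object",
            "properties": {"ticket_id": {"type": "string"}},
            "required": ["ticket_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistral-small-2503",  # assumed model alias for Small 3.1
    messages=[{"role": "user", "content": "What is the status of ticket T-1042?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model decided to call the tool, the structured call appears here.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```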
Capabilities & Limitations
Mistral Small 3.1 24B Instruct is a multimodal model designed to process both text and image inputs within a 128,000-token context window 4. The model is intended for diverse applications ranging from conversational agents and function calling to long-document analysis and privacy-sensitive local deployments 4.
Vision and Multimodal Processing
The model features integrated vision capabilities, allowing it to perform image analysis, optical character recognition (OCR), and visual reasoning 4. According to Mistral AI, the 3.1 24B variant is optimized for describing image contents and extracting structured data from visual inputs, such as charts or documents 4. In practical implementations, users can provide image URLs alongside text prompts to ask questions about visual data 4. The model's vision architecture is intended to enable the translation of visual information into textual reasoning for tasks like technical diagram interpretation and image-based data entry 4.
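A minimal request of this kind, assuming an OpenRouter-style OpenAI-compatible endpoint, might look like the following; the model identifier and image URL are placeholders rather than documented values.

```python
# Hedged sketch of a multimodal request: an image URL plus a text question.
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")  # assumed provider

response = client.chat.completions.create(
    model="mistralai/mistral-small-3.1-24b-instruct",  # assumed provider-side id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize the totals shown in this invoice."},
            {"type": "image_url", "image_url": {"url": "https://example.com/invoice.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```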
Language and Instruction Following
Mistral Small 3.1 24B Instruct supports multilingual interactions across dozens of global languages 4. The model is tuned for instruction following, specifically for complex, multi-step prompts 4. However, benchmark testing by independent evaluators has shown varied results in this area. In the IFBench instruction-following benchmark, the model achieved a score of 29.9%, suggesting it may struggle with highly specific or constrained prompt adherence compared to larger models 4. Its performance in long-context reasoning is quantified at 19.7% on the AA-LCR evaluation, reflecting its capability to process information from extended documents, albeit with lower precision than flagship variants 4.
Specialized Reasoning and Coding
The model is designed for technical tasks including mathematical reasoning and programming 4. Benchmarks indicate a composite coding capability score of 13.9, which places it ahead of approximately 39% of models evaluated by Artificial Analysis 4. In graduate-level scientific reasoning (GPQA Diamond), it attained a score of 45.4% 4. It also supports agentic workflows and tool use; however, average tool call error rates have been recorded at 26.42% in some provider testing, indicating that users may encounter failures when the model attempts to orchestrate external functions 4.
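Given those reported error rates, implementers commonly validate tool calls before executing them. The sketch below is one hedged way to do so; the registry shape and the way errors are fed back to the model are assumptions, not a documented Mistral AI pattern.

```python
# Defensive wrapper around model-issued tool calls: validate the tool name
# and JSON arguments before executing, and surface errors so the caller can
# re-prompt the model. The `call` object mirrors the OpenAI-style tool_call.
import json

def execute_tool_call(call, registry: dict):
    """Run a model-requested tool only if its name and arguments are valid."""
    fn = registry.get(call.function.name)
    if fn is None:
        return {"error": f"unknown tool {call.function.name!r}"}
    try:
        args = json.loads(call.function.arguments)
    except json.JSONDecodeError as exc:
        return {"error": f"malformed arguments: {exc}"}
    return fn(**args)

# Usage: feed the returned dict back to the model as a tool-role message so
# it can correct a bad call on the next turn.
```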
Limitations and Known Issues
While positioned as a high-performance mid-sized model, Mistral Small 3.1 24B Instruct has notable limitations compared to larger models in the Mistral family, such as Mistral Large 2 or Mistral Medium 4. Its overall intelligence score is rated at 14.5, which is lower than many flagship enterprise models currently in use 4.
A primary limitation involves accuracy and hallucination rates. Independent testing shows an omniscience accuracy of 15.0%, with a hallucination rate—defined as the rate of incorrect answers among non-correct responses—of 23.2% 4. Furthermore, the model exhibits performance drops on highly difficult reasoning tasks; it scored 0.0% on both the GDPval-AA economic task evaluation and the CritPt research-level physics reasoning benchmark 4.
Mistral AI released an updated version, Mistral Small 3.2, to specifically address behavioral issues identified in version 3.1, including a tendency for infinite generations (repetition loops) and lower precision in structured outputs 4. Consequently, version 3.1 is considered less reliable for tasks requiring repetition reduction and high-accuracy function calling than its successor 4.
Performance
Mistral Small 3.1 24B Instruct is designed as a mid-sized model that aims to provide high-level reasoning and multimodal capabilities with lower computational requirements than flagship models 5, 8. Developer-provided evaluations place the model at the top of its parameter class, reporting that it frequently outperforms larger models such as Gemma 3 27B and proprietary small models like GPT-4o Mini in specific reasoning and vision tasks 5, 10.
General and Technical Benchmarks
In standard language understanding and reasoning tests, the model achieved a score of 80.62% on the Massive Multitask Language Understanding (MMLU) benchmark 5. For specialized reasoning, it scored 45.96% on GPQA Diamond (graduate-level science) and 69.3% on the MATH benchmark 5, 8. Its coding proficiency is reflected in scores of 88.41% on HumanEval and 74.71% on MBPP 5.
In comparative evaluations provided by Mistral AI, the 24B Instruct variant outperformed GPT-4o Mini and Claude 3.5 Haiku on GPQA Diamond and HumanEval 10. However, third-party analysis by Artificial Analysis indicates that while the model is above average in its size class, it ranks 17th out of 53 comparable models on their aggregate Intelligence Index 11.
Vision and Multimodal Performance
The model's integrated vision encoder allows it to process visual data alongside text. On vision-centric benchmarks, Mistral Small 3.1 24B Instruct achieved 68.91% on MathVista and 94.08% on DocVQA 5. It also recorded scores of 93.72% on AI2D and 86.24% on ChartQA 5, 8. According to developer reports, these figures represent a notable improvement over the previous text-only Mistral Small 3 (2501) and surpass the vision performance of GPT-4o Mini on the MathVista and DocVQA metrics 8, 10.
Throughput, Latency, and Efficiency
Mistral AI asserts that the model can achieve inference speeds of 150 tokens per second (tps) 10. Independent testing by Artificial Analysis recorded an average output of 139 tps, which is described as faster than the class average of 98 tps 11. On the OpenRouter platform, specific providers using FP8 quantization reported a throughput of approximately 62.5 tps with a time-to-first-token latency of 0.82 seconds 4.
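Throughput figures of this sort can be approximated client-side by timing a streaming completion. The sketch below measures time-to-first-token and a rough tokens-per-second figure by counting streamed chunks; the endpoint and model alias are assumptions, and chunk counts only approximate true token counts.

```python
# Hedged sketch: measure TTFT and a rough throughput over a streamed response.
import time
from openai import OpenAI

client = OpenAI(base_url="https://api.mistral.ai/v1", api_key="YOUR_KEY")  # assumed endpoint

start = time.perf_counter()
first_token_at = None
chunks = 0

stream = client.chat.completions.create(
    model="mistral-small-2503",  # assumed model alias for Small 3.1
    messages=[{"role": "user", "content": "Explain FP8 quantization in three sentences."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        if first_token_at is None:
            first_token_at = time.perf_counter()
        chunks += 1
end = time.perf_counter()

if first_token_at is not None:
    print(f"TTFT: {first_token_at - start:.2f}s")
    print(f"~{chunks / (end - first_token_at):.0f} chunks/s (rough proxy for tokens/s)")
```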
The model's 24-billion parameter size is intended to facilitate local deployment. When quantized to 8-bit precision (FP8), it can be hosted on a single NVIDIA RTX 4090 GPU or a consumer-grade MacBook with 32GB of RAM 8, 10. Evaluations of FP8 per-tensor quantization showed minimal performance degradation, maintaining a score of 85.96% on the ChartQA benchmark compared to the 86.04% achieved by the unquantized baseline 12.
Cost and API Pricing
As of March 2025, the model is positioned as a cost-efficient alternative for enterprise and developer use. Official API pricing via Mistral AI's platform is set at $0.10 per million input tokens and $0.30 per million output tokens 6, 11. This pricing is roughly one-twentieth that of Mistral’s larger Pixtral Large model, which charges $2.00 per million input tokens 6. While the model is affordable for its capability level, Artificial Analysis described it as "somewhat expensive" when compared strictly against other open-weight non-reasoning models of similar size 11.
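At those rates, per-request costs can be estimated directly; the following worked example assumes a 50,000-token input document and a 1,000-token response.

```python
# Worked example of per-request cost at the listed API rates.
INPUT_RATE = 0.10 / 1_000_000   # USD per input token
OUTPUT_RATE = 0.30 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A 50,000-token document summarized into a 1,000-token answer:
print(f"${request_cost(50_000, 1_000):.4f}")  # -> $0.0053
```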
Safety & Ethics
Mistral Small 3.1 24B Instruct is an instruction-finetuned variant of the Mistral-Small-3.1-24B-Base-2503 foundation model 8, 10. The model's safety and alignment profile is defined by its transition from a base pretrained model to one optimized for conversational assistance, reasoning, and agentic tasks 5. Mistral AI asserts that the model maintains strong adherence to system prompts, a feature intended to allow developers to enforce specific operational boundaries and safety constraints during deployment 10.
Alignment and Guardrails
The model utilizes instruction-tuning to improve its performance in following user directives and executing function calls 5, 8. While the developer characterizes the model as a solid foundation for both enterprise and consumer-grade applications, specific details regarding internal filtering mechanisms or the use of techniques like Reinforcement Learning from Human Feedback (RLHF) or Direct Preference Optimization (DPO) are not extensively detailed in its technical releases 5, 8. Third-party providers, such as those accessed via OpenRouter, indicate that content moderation often remains the responsibility of the implementing developer rather than being hard-coded into the model's weights 4. Technical documentation recommends the use of post-processing techniques, including text formatting and JSON parsing, to ensure the stability and safety of generated outputs in production environments 8.
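One minimal form of the recommended JSON post-processing is to extract and validate a JSON object before acting on it, as sketched below; the extraction heuristic and fallback behaviour are illustrative assumptions rather than guidance from Mistral AI.

```python
# Hedged sketch of post-processing: pull the first {...} block out of raw
# model output and validate it as JSON before any downstream action.
import json
import re

def parse_model_json(raw_output: str) -> dict | None:
    """Return the first valid JSON object found in the text, else None."""
    match = re.search(r"\{.*\}", raw_output, flags=re.DOTALL)
    if match is None:
        return None  # caller should retry or fall back to a safe default
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

print(parse_model_json('Sure! ```json\n{"action": "refund", "amount": 12.5}\n```'))
```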
Multimodal Safety Risks
As a multimodal model capable of processing both text and imagery, Mistral Small 3.1 is subject to safety challenges unique to vision-language systems. General research into multimodal large language models (MLLMs) has identified vulnerabilities such as 'hidden image prompts,' where malicious content embedded in visual data can bypass textual safety filters to trigger harmful outputs 13. Academic frameworks like SafeBench have highlighted that MLLMs frequently exhibit concerns regarding the generation of harmful content when presented with complex, interleaved inputs 15. Mistral AI positions the model for 'privacy-sensitive deployments' and 'local inference,' suggesting that its ability to run on consumer hardware like an NVIDIA RTX 4090 can mitigate some privacy risks associated with cloud-based data processing 4, 5, 8.
Transparency and Ethical Considerations
There is limited transparency regarding the specific datasets used to train Mistral Small 3.1. Documentation provided by NVIDIA and Mistral AI lists the data collection and labeling methods as 'undisclosed' 8. This lack of disclosure limits the ability of independent researchers to evaluate the model for inherent demographic biases or the inclusion of copyrighted materials 8. However, the model is released under the Apache 2.0 license, which Mistral AI states is intended to 'democratize' AI by allowing for community-led auditing and the development of specialized subject-matter experts through downstream fine-tuning 5, 10.
Applications
Mistral Small 3.1 24B Instruct is designed as a versatile model for generative AI tasks, including instruction following, image understanding, and autonomous agentic workflows 5. With its 24-billion-parameter count, it is frequently applied in scenarios where larger models are computationally prohibitive but stronger reasoning is required than smaller 1B–8B-parameter models can offer 4, 10.
Visual and Document Analysis
The model's multimodal capabilities enable applications in automated visual document analysis, such as invoice processing, chart interpretation, and document verification 4, 5. Mistral AI suggests its use for industrial applications, including visual inspection for quality checks, object detection in security systems, and image-based customer support 5. In the healthcare sector, the model is proposed for medical diagnostic assistance, though the developer indicates such applications generally require further domain-specific fine-tuning 5.
Enterprise and Conversational Agents
Equipped with a 128,000-token context window, the model is used for enterprise-grade chatbots that must maintain memory across long conversations or analyze extensive technical documentation 4, 5. It supports low-latency function calling, which allows it to act as a controller within automated workflows by executing external tools or APIs 5. Mistral AI asserts that the model is suitable for virtual assistants where quick, accurate responses are a primary requirement for user experience 5.
Technical and Specialized Support
The model's performance in STEM reasoning and programming facilitates its use as a coding assistant and technical support tool 4, 5. Third-party developers have utilized the model as a base for specialized reasoning variants, such as DeepHermes 24B 5. Furthermore, its architecture allows for fine-tuning in highly regulated or specialized fields, such as legal research or technical troubleshooting, where subject matter expertise is critical 5.
Constrained and Local Environments
A primary application of Mistral Small 3.1 is local deployment on consumer-grade hardware, which is a significant factor for small businesses and independent developers 10. The model can operate on a single NVIDIA RTX 4090 GPU or a Mac with 32GB of RAM, making it a candidate for on-device processing and privacy-sensitive applications where data cannot be transmitted to external cloud servers 5, 10. While the model offers high efficiency for its size, Mistral AI directs tasks requiring the highest levels of reasoning accuracy to larger models such as Mistral Large 4.
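For a locally hosted copy served through an OpenAI-compatible endpoint (for example, a local inference server listening on localhost), queries can be issued with the standard client, as in the hedged sketch below; the port, path, and model tag are assumptions that vary by serving stack.

```python
# Hedged sketch: querying a locally served copy of the model through an
# OpenAI-compatible endpoint. Base URL and model tag are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")  # assumed local server

response = client.chat.completions.create(
    model="mistral-small3.1",  # assumed local model tag
    messages=[{"role": "user",
               "content": "Redact all personal data from: 'Jane Doe, +33 6 12 34 56 78'"}],
)
print(response.choices[0].message.content)
```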
Reception & Impact
The reception of Mistral Small 3.1 24B Instruct has centered on its positioning as a high-utility intermediate model, though its release has also prompted discussions regarding the trade-offs between benchmark performance and general world knowledge. Mistral AI asserts that the model is the leading performer in its weight class, specifically citing its ability to outperform comparable models like Gemma 3 (27B) and proprietary alternatives such as GPT-4o Mini in reasoning and multimodal tasks 4, 5.
Industry and Community Reception
Community evaluation of the 24-billion parameter size has been generally positive, with the model being viewed as a viable open-source alternative for the "80%" of generative AI tasks that require a balance of reasoning and low latency 5. In local deployment tests, the model achieved its claimed generation speeds of approximately 150 tokens per second for standard text queries 8. However, technical reviews have identified significant hardware barriers for high-end features; while Mistral AI suggests the model can run on a single consumer-grade GPU or a 32GB RAM workstation, independent testing found that utilizing the full 128,000-token context window or multimodal features creates extreme memory demands that may lead to latency spikes or timeouts on consumer hardware 8.
Critiques of Knowledge Regression
A notable point of contention within the AI community involves the model's factual recall. Comparisons between the January 2025 release (v2501) and earlier iterations (v2409) indicated a regression in broad English knowledge 12. While MMLU scores—which measure STEM and academic proficiency—improved, performance on tests of general pop culture and "niche" knowledge declined 12. User reports on Hugging Face documented frequent hallucinations regarding popular television cast members and musical lyrics that the previous version handled accurately 12. Some researchers attributed this to "overfitting," suggesting that the training process prioritized optimizing for specific benchmarks at the expense of maintaining a diverse internal knowledge base 12.
Strategic and Economic Impact
Mistral AI’s strategy with the 3.1 24B release aligns with its mission to provide transparent, open-weight alternatives to the "opaque boxes" of proprietary US-based models 14. CEO Arthur Mensch has characterized such models as essential for preventing market concentration among a few heavily funded American firms 14. For small-to-medium enterprises (SMEs), the 24B size is viewed as an important development for the adoption of multimodal AI. The model's Apache 2.0 license allows for cost-effective local deployment, which is particularly relevant for organizations operating under strict privacy or regulatory frameworks, such as the EU AI Act, that cannot rely on cloud-based proprietary APIs 14, 16.
Version History
Mistral AI employs a versioning convention for its models that combines traditional version numbers with a four-digit chronological identifier (YYMM) representing the release date 5, 6. The Mistral Small lineage has evolved through several iterations, transitioning from a text-only architecture to multimodal and hybrid systems 4, 7.
Mistral Small 3 (2501)
Released in January 2025, Mistral Small 3 was the foundational model of the 3.x series. It was designed as a 24-billion parameter text-only model under the Apache 2.0 license, intended for low-latency tasks and local deployment on consumer-grade hardware 4, 5. According to Mistral AI, it provided a significant increase in reasoning capabilities over the earlier Mistral NeMo 12B model 4.
Mistral Small 3.1 (2503)
Mistral Small 3.1, released on March 17, 2025, introduced multimodal capabilities to the Small series 4, 5. This version integrated a vision encoder to support image analysis and optical character recognition (OCR) alongside text processing 5. The context window was expanded to 128,000 tokens. Mistral AI asserted that version 3.1 achieved higher text performance than its predecessor and reached inference speeds of up to 150 tokens per second 5.
Mistral Small 3.2 (2506)
Launched in June 2025, Mistral Small 3.2 focused on refining instruction following and technical reliability 4. Key updates included a reduction in "infinite generations" (unending repetitive loops) and improved accuracy for structured outputs and function calling 4. Mistral AI designated this version as the official replacement for version 3.1, which was scheduled for retirement in November 2025 5.
Mistral Small 4 (2603)
Mistral Small 4, released in March 2026, represented a major architectural transition to a mixture-of-experts (MoE) system 6, 7. While the model contains 119 billion total parameters, only 6 billion are active per token, maintaining the efficiency associated with the "Small" branding 7. This version unified the specialized capabilities of Mistral's Magistral (reasoning), Pixtral (vision), and Devstral (coding) models into a single system with an expanded 256,000-token context window 6, 7.
Sources
- [4] “mistral-small-3.1-24b-instruct-2503 Model by Mistral AI | NVIDIA NIM”. Retrieved March 24, 2026.
  Architecture Type: Transformer-based Language Model. Network Architecture: Instruction-tuned, multimodal, Transformer-based. Model Parameters: 24 billion. Supported Hardware Microarchitecture Compatibility: NVIDIA Ampere, NVIDIA Lovelace (e.g., RTX 4090).
- [5] “AI Model Catalog | Microsoft Foundry Models”. Retrieved March 24, 2026.
  Mistral Small 3.1 (25.03) often matches or outperforms much larger models, including 70B parameter Llama models. ...demonstrates 100% retrieval capability on passkey evaluations up to 128k context. ...demonstrates performance on par with Pixtral Large.
- [6] “LLM Model: Mistral Small 3.1 24B by Mistral AI | Deepranking.ai”. Retrieved March 24, 2026.
  reasoning (MMLU 80.62%, GPQA Diamond 45.96%), coding (HumanEval 88.41%, MBPP 74.71%), math (MATH 69.3%), and multilingual benchmarks (71.18% average across 24 languages), while also dominating vision tasks like MathVista (68.91%), DocVQA (94.08%), and AI2D (93.72%), outperforming similarly sized proprietary models like Gemma 3 27B and GPT-4o Mini in most categories.
- [7] “Mistral Small 3.1 24B Base vs Pixtral Large: Complete Comparison”. Retrieved March 24, 2026.
  Mistral Small 3.1 24B Base ($0.10/1M tokens) is 20.0x cheaper than Pixtral Large ($2.00/1M tokens). For output processing, Mistral Small 3.1 24B Base ($0.30/1M tokens) is 20.0x cheaper than Pixtral Large ($6.00/1M tokens).
- [8] “unsloth/Mistral-Small-3.1-24B-Instruct-2503 · Hugging Face”. Retrieved March 24, 2026.
  Mistral Small 3.1 can be deployed locally and is exceptionally 'knowledge-dense,' fitting within a single RTX 4090 or a 32GB RAM MacBook once quantized. Instruction Evals: MMLU 80.62%, MATH 69.30%, GPQA Diamond 45.96%, MBPP 74.71%, HumanEval 88.41%. Vision: Mathvista 68.91%, ChartQA 86.24%, DocVQA 94.08%, AI2D 93.72%.
- [10] “nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8 · Hugging Face”. Retrieved March 24, 2026.
  Evaluations against the unquantized baseline on ChartQA: baseline (0.8604), FP8 (0.8596). Checkpoint of Mistral-Small-3.1-24B-Instruct-2503 with FP8 per-tensor quantization.
- [11] “Multimodal AI Faces New Safety Threats | CSA”. Retrieved March 24, 2026.
  Enkrypt AI's new report reveals critical safety flaws in multimodal models, exposing risks like CSEM content and CBRN info via hidden image prompts.
- [12] “SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models”. Retrieved March 24, 2026.
  Multimodal Large Language Models (MLLMs) are showing strong safety concerns (e.g., generating harmful outputs for users), which motivates the development of evaluation frameworks.
- [13] “Mistral Small 3.1 Beats GPT-4o Mini with Just 24B Parameters”. Retrieved March 24, 2026.
  Mistral Small 3.1 breaks these barriers, delivering top-tier results locally with... a single RTX 4090 GPU or a Mac with 32 GB RAM.
- [14] “Mistral Small 3”. Retrieved March 24, 2026.
  Mistral Small 3 is a pre-trained and instructed model catered to the ‘80%’ of generative AI tasks—those that require robust language and instruction following performance, with very low latency.
- [15] “Mistral 3.1 Review: Can You Run It Locally?”. Retrieved March 24, 2026.
  Running Mistral 3.1 at full capacity (128k context) is a massive load on any system. Even our dual A100 setup struggled... using the multimodal functionality (especially image input) required additional GPU resources. Latency increased.
- [16] “mistralai/Mistral-Small-24B-Instruct-2501 · This Mistral Small has FAR less knowledge than the last.”. Retrieved March 24, 2026.
  Mistral Small v2501 is seeing a similar drop from v2409... pop culture information (TV shows, movies, music, games, sports...) is more scrambled and weakly held in v2501 vs v2409.
