Sonar Pro

Sonar Pro is a large language model (LLM) developed by Perplexity AI, designed specifically for search-centric applications and retrieval-augmented generation (RAG) 1. Launched on January 21, 2025, it serves as the flagship offering within the Sonar API family 4. Unlike standard generative AI models that rely primarily on static training data, Sonar Pro is engineered to integrate real-time internet connectivity to improve factual accuracy and provide up-to-date information 1. Perplexity positions the model as a solution for developers and enterprises seeking to build search tools that prioritize authority and verifiable sourcing through a web-grounded framework 4.

The model is distinguished from the base Sonar model by its capacity for multi-step reasoning and a larger context window 1. Perplexity states that Sonar Pro is capable of handling more nuanced search queries and maintaining coherence across complex follow-up questions 1. A primary feature of the model is its citation output; the developer asserts that Sonar Pro provides approximately double the number of citations per search compared to the standard Sonar model on average 1. This focus on transparent sourcing is intended to allow users to verify generated information against trusted web sources in real time, reducing the likelihood of AI hallucinations 1, 4.

In performance evaluations focused on factual correctness, Sonar Pro has been benchmarked using SimpleQA, a metric designed to measure an AI's ability to answer short, fact-seeking questions 1. On this benchmark, Sonar Pro achieved an F-score of 0.858, which Perplexity claims outperforms general-purpose frontier models from competitors such as OpenAI, Google, and Anthropic in factuality-oriented tasks 1, 4. By prioritizing real-time grounding over stored internal parameters, the model is positioned as a direct competitor to general-purpose models like GPT-4o for enterprise-grade retrieval tasks 1.

Commercially, Sonar Pro is a central component of Perplexity's strategy to expand its revenue beyond consumer subscriptions through developer-facing API services 4. It utilizes a usage-based pricing model that is more expensive than the base Sonar tier, reflecting its more intensive search operations and detailed response generation 4. Several organizations have integrated the model into their software prior to or at its general availability; for example, Zoom uses Sonar Pro to power its AI Companion 2.0, allowing users to perform real-time web searches within the video conferencing interface 1. Other early adopters include Copy.ai for sales prospect research and Doximity for medical journal inquiries, where in-line citations are utilized to provide research-backed answers in professional settings 1.

Background

The development of Sonar Pro occurred within a broader industry transition toward retrieval-augmented generation (RAG) and the mitigation of large language model (LLM) hallucinations. Most early generative AI models provided answers based primarily on static training data, which restricted their ability to provide current information or verify factual claims against external evidence 1, 4. Perplexity AI, which initially gained prominence as a consumer-facing search interface, developed the Sonar API family to provide developers with tools specifically optimized for factuality through real-time internet connectivity 1.

The move toward proprietary, search-specific models like Sonar Pro allowed for greater customization of search behaviors that general-purpose models often lacked. According to Perplexity, standard generative AI features available at the time of the model's release were limited by their reliance on stored parameters rather than trusted external sources 1. By building a model specifically for the API market, Perplexity aimed to offer search domain filters and customizable source selection, addressing developer demand for greater control over the grounding data used in AI responses 1.

On January 21, 2025, Perplexity officially launched the Sonar API service, which included both the base Sonar model and the more advanced Sonar Pro 4. While the base Sonar model was designed to be a lightweight and cost-effective option for simple question-and-answer tasks, Sonar Pro was developed to handle more complex, multi-step queries 1, 4. Perplexity positioned Sonar Pro as a solution for enterprise-level requirements, providing a larger context window and a higher average number of citations per search compared to the base version 1.

At the time of its release, the market for AI APIs was characterized by significant price competition and an increasing focus on benchmarking factual correctness 4. Perplexity asserted that Sonar Pro outperformed contemporary models from Google, OpenAI, and Anthropic on the SimpleQA benchmark, a metric designed to evaluate the factual accuracy of short-form answers, achieving an F-score of 0.858 1, 4. Early enterprise adoption included integrations by Zoom, which utilized Sonar Pro to provide real-time search capabilities within its video conferencing platform, and the medical platform Doximity, which used the model to surface clinical guidelines with inline citations 1.

Architecture

The technical architecture of Sonar Pro is founded on the Llama 3.3 70B model, which Perplexity AI has subjected to search-specific fine-tuning and alignment procedures 2. While the base Llama 3.3 70B provides the underlying linguistic and reasoning capabilities, Sonar Pro is differentiated by its optimization for two specific dimensions: answer factuality and readability 2. According to the developer, factuality in this context refers to the model’s ability to ground its outputs in real-time search results while resolving conflicting or missing information from multiple web sources 2. The readability component involves training the model to utilize markdown formatting effectively to organize complex information into concise, structured responses 2.

In terms of inference performance, Sonar Pro integrates with Cerebras hardware infrastructure to achieve high-speed text generation 2. The developer states that the model reaches a decoding throughput of approximately 1,200 tokens per second, which it claims is nearly ten times faster than the throughput of Gemini 2.0 Flash 2. This specialized hardware setup is intended to minimize latency during the retrieval-augmented generation (RAG) process, allowing for nearly instantaneous responses even when the model must synthesize data from numerous external sources 2. While the Pro model served through the Perplexity user interface utilizes this infrastructure, Perplexity noted at launch that the API version was scheduled for a subsequent transition to the same hardware 2.

The model’s context management system is significantly expanded relative to the standard Sonar model to support multi-step research tasks and longer conversational histories 1. This larger context window allows Sonar Pro to ingest and process more nuanced search histories and a higher volume of source material simultaneously 1. Perplexity reports that this architectural capacity enables the model to provide, on average, double the number of citations per search compared to the base Sonar model 1. For the 'Pro' tier, the architecture is specifically designed to execute multiple searches for a single user prompt, allowing it to aggregate information from a broader range of the internet before generating a final response 4.

Functionally, the architecture supports advanced developer features such as a dedicated JSON mode and the application of search domain filters 1. These capabilities allow for structured data extraction and targeted information retrieval across specific subsets of the web 1. In factual evaluation benchmarks, the developer reports that Sonar Pro achieved an F-score of 0.858 on SimpleQA, a metric designed to test the ability of large language models to answer short, fact-seeking questions without hallucinating 1. Additionally, the model is characterized by its adherence to user-provided instructions and world knowledge, with the developer citing superior performance over models such as GPT-4o mini and Claude 3.5 Haiku on the IFEval and MMLU benchmarks 2.

Capabilities & Limitations

Sonar Pro is specialized for tasks requiring high factual accuracy, real-time information retrieval, and verifiable citations. According to Perplexity AI, the model is engineered to prioritize factuality and authority by grounding its generative outputs in external web data rather than relying solely on internal training parameters 1, 4.

Primary Capabilities

The model is optimized for search-centric retrieval-augmented generation (RAG) and is capable of handling in-depth, multi-step queries 1. A central feature of Sonar Pro is its citation mechanism; the model provides approximately double the number of citations per search compared to the standard Sonar model 1, 4. This high citation density is intended to foster user trust and allow for easier verification of claims in high-stakes fields such as medicine or finance 1. For example, the medical platform Doximity utilizes the model to provide doctors with research-backed answers regarding guideline changes or insurance reimbursements 1.

For developers and enterprise users, Sonar Pro includes native support for structured outputs through a dedicated JSON mode 1. This allows the model’s research findings to be integrated into programmatic workflows without the need for additional post-processing. Additionally, the model supports search domain filters, which allow users to restrict retrieval to specific trusted domains or datasets, effectively customizing the sources from which the model retrieves information 1, 4. The model also features a larger context window than the base Sonar model, enabling it to maintain coherence over longer search sessions and follow-up questions 1.
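As an illustration, a request combining the JSON mode and domain-filter controls described above might look like the following sketch. The endpoint URL, the `sonar-pro` model name, and the `search_domain_filter` field reflect Perplexity's OpenAI-compatible API conventions, but they are assumptions here and should be verified against the current API reference; the query and domain list are invented.

```python
import json
import urllib.request

# Hypothetical sketch of a Sonar Pro request restricted to trusted domains.
# Endpoint and field names assume Perplexity's OpenAI-compatible chat API;
# verify the exact parameter names against the current API reference.
API_KEY = "YOUR_PERPLEXITY_API_KEY"  # placeholder

payload = {
    "model": "sonar-pro",
    "messages": [
        {"role": "system", "content": "Answer concisely with citations."},
        {"role": "user", "content": "What changed in the 2025 hypertension guidelines?"},
    ],
    # Restrict retrieval to specific trusted domains (assumed parameter name).
    "search_domain_filter": ["nih.gov", "who.int"],
}

request = urllib.request.Request(
    "https://api.perplexity.ai/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# Sending the request requires a valid key; the call itself is standard:
# with urllib.request.urlopen(request) as response:
#     body = json.loads(response.read())
#     print(body["choices"][0]["message"]["content"])
```

Because the request body is plain JSON over HTTPS, any OpenAI-compatible client library could be substituted for `urllib` without changing the payload shape.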

Performance and Benchmarks

Perplexity evaluates the model's factuality using the SimpleQA benchmark, which measures the ability of a model to answer short, fact-seeking questions. According to internal testing, Sonar Pro achieved an F-score of 0.858 on this benchmark 1. The developer states that this score indicates higher factual correctness than several competing models from Google, OpenAI, and Anthropic in the context of short-form factual retrieval 1, 4.
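For context on how such a score is computed: in the SimpleQA grading scheme, each answer is marked correct, incorrect, or not attempted, and the reported F-score is commonly taken as the harmonic mean of overall accuracy and accuracy on attempted questions. The sketch below assumes that definition, and the counts are made up purely for illustration.

```python
def simpleqa_f_score(correct: int, incorrect: int, not_attempted: int) -> float:
    """Harmonic mean of overall accuracy and accuracy-given-attempted,
    following the commonly cited SimpleQA definition (assumed here)."""
    total = correct + incorrect + not_attempted
    attempted = correct + incorrect
    overall = correct / total
    given_attempted = correct / attempted if attempted else 0.0
    if overall + given_attempted == 0:
        return 0.0
    return 2 * overall * given_attempted / (overall + given_attempted)

# Illustrative (made-up) counts: 850 correct, 130 incorrect, 20 unattempted.
print(round(simpleqa_f_score(850, 130, 20), 3))
```

Note how the harmonic mean penalizes a model that inflates its attempted-accuracy by declining to answer: abstentions lower overall accuracy, so both terms must be high for a strong F-score.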

Limitations and Failure Modes

Despite its focus on factuality, Sonar Pro has several identified limitations and specific failure modes. Its performance is fundamentally dependent on the quality and availability of external search indices; if the retrieved search results contain inaccuracies or if the model cannot access relevant real-time data, its output may be compromised 1. While the RAG architecture is intended to mitigate hallucinations, the model remains a generative system that can still produce incorrect information if it misinterprets source material or if sources are contradictory 4.

Sonar Pro is not intended for open-ended creative tasks such as fiction writing, poetry, or highly abstract roleplay, as its fine-tuning is focused on summarization and fact-seeking 1. Compared to larger, general-purpose large language models (LLMs) that lack integrated search, Sonar Pro may exhibit less versatility in non-search-related reasoning or creative brainstorming. Additionally, its use of real-time search can lead to more unpredictable pricing and latency for developers compared to static LLMs, as the model may execute multiple searches to satisfy a single complex prompt 4.

Performance

Sonar Pro's performance is characterized by its focus on factual accuracy and high-speed inference for search-related tasks. On the SimpleQA benchmark, which evaluates the ability of large language models to answer short, fact-seeking questions correctly, Sonar Pro achieved an F-score of 0.858 1. This score represents a notable improvement over the standard Sonar model, which recorded an F-score of 0.773 on the same metric 1, 4. Perplexity AI states that this performance level is achieved by combining the reasoning capabilities of the underlying model with real-time internet retrieval to verify information against current sources 1.

In standardized academic evaluations, the developer reports that Sonar Pro outperforms other models in its class, specifically GPT-4o mini and Claude 3.5 Haiku 2. These evaluations include the Instruction Following Evaluation (IFEval) benchmark, which tests adherence to complex formatting and logic constraints, and the Massive Multitask Language Understanding (MMLU) suite, which measures world knowledge across a broad range of domains 2. Internal A/B testing conducted by Perplexity AI further indicated that user satisfaction levels for Sonar Pro approach those of GPT-4o and exceed those of Claude 3.5 Sonnet 2.

Operational speed is a primary metric for the model, which utilizes Cerebras inference infrastructure to facilitate rapid output. According to the developer, Sonar Pro reaches a decoding throughput of 1,200 tokens per second 2. This speed is approximately 10 times faster than comparable models, such as Gemini 2.0 Flash, during specific decoding tasks 2. Perplexity asserts that this throughput enables near-instant answer generation for both simple and detailed queries 2.

Regarding cost efficiency, Sonar Pro is positioned as a more affordable alternative to traditional frontier models for search-centric applications. The API pricing is structured with a base rate of $5 per 1,000 searches, supplemented by token-based costs: $3 per 1 million input tokens and $15 per 1 million output tokens 4. While this makes Sonar Pro more expensive than the base Sonar model, the Pro version is designed to handle more complex, multi-step queries and provides approximately double the number of citations per search 1, 4.
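Under the published rates, estimating a bill is simple arithmetic. The helper below hard-codes the launch pricing quoted above ($5 per 1,000 searches, $3 per million input tokens, $15 per million output tokens); the workload figures are invented for illustration, and actual invoices would also depend on how many searches the model chooses to run per prompt.

```python
def sonar_pro_cost(searches: int, input_tokens: int, output_tokens: int) -> float:
    """Estimate Sonar Pro API cost in USD using the launch pricing quoted
    above: $5 per 1k searches, $3/M input tokens, $15/M output tokens."""
    search_cost = searches / 1_000 * 5.0
    input_cost = input_tokens / 1_000_000 * 3.0
    output_cost = output_tokens / 1_000_000 * 15.0
    return search_cost + input_cost + output_cost

# Illustrative workload: 10,000 searches, 20M input tokens, 5M output tokens.
# Breakdown: $50 (searches) + $60 (input) + $75 (output).
print(f"${sonar_pro_cost(10_000, 20_000_000, 5_000_000):.2f}")  # → $185.00
```

Because a single complex prompt may trigger multiple searches, the `searches` term is the hardest to predict in advance, which is the budget-predictability caveat noted later in this article.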

Safety & Ethics

The safety and ethical framework of Sonar Pro is centered on the mitigation of artificial intelligence hallucinations and the promotion of factual transparency. Perplexity AI states that the model is purpose-built for factuality, employing a retrieval-augmented generation (RAG) architecture to ground its responses in real-time web data rather than relying exclusively on internal training parameters 1, 4. This approach is intended to address the inherent limitations of standard generative models, which are often prone to providing outdated or incorrect information because they are informed only by static training data 4.

Factuality and Misinformation

To align the model with objective reporting, Perplexity utilizes alignment techniques designed to prioritize factual accuracy over creative or speculative content. The developer claims that these optimizations allow Sonar Pro to perform effectively on the SimpleQA benchmark, which evaluates an LLM's ability to answer short, fact-seeking questions correctly 1, 4. According to the developer's internal testing, Sonar Pro achieved an F-score of 0.858 on this benchmark, compared to 0.773 for the standard Sonar model 1. By prioritizing real-time information retrieval, the model seeks to minimize the risk of disseminating misinformation common in models that rely solely on stored weights 1.

Transparency and Verification

A core component of Sonar Pro’s safety architecture is the systematic inclusion of in-line citations. By providing direct links to the sources used to generate an answer, the model allows users to independently verify claims 1. This feature is intended to support high-stakes use cases, such as medical research or professional finance, where verifiable documentation is required for accuracy and trust 1. Additionally, the Sonar Pro API offers search domain filters, enabling developers to restrict the model's information retrieval to specific, trusted domains to further reduce the risk of surfacing content from unreliable sources 1.

Content Filtering and Privacy

Sonar Pro incorporates safety layers designed to filter harmful content and prevent the retrieval of dangerous material from the web. Perplexity asserts that these filters are integrated into the real-time search pipeline, allowing the model to handle queries about current, rapidly evolving events while adhering to safety guidelines 1. Regarding data privacy, the developer states that enterprise partners can implement Sonar Pro for private searches 1. This capability is utilized by third-party platforms, such as Zoom, which integrates the model to provide real-time information within its conferencing software without requiring users to exit the secure environment 1, 4.

Applications

Sonar Pro is primarily utilized in enterprise environments and specialized industries that require real-time data retrieval and verifiable information. The model is delivered via an API, enabling third-party developers to embed generative search features directly into their own software products 1, 4.

Enterprise and Productivity

A primary application of Sonar Pro is its integration into the Zoom AI Companion 2.0. Within this conferencing platform, the model powers a side-panel assistant that performs live web searches during active sessions 1. This allows participants to verify facts or research topics during meetings without navigating away from the video interface 4. According to Zoom product management, the model enables the assistant to retrieve information from sources external to the company's internal data silos 1.

Specialized Industry Research

In the healthcare sector, the medical network Doximity utilizes Sonar Pro to provide physicians with a research tool for clinical and administrative inquiries 1. Doctors use the model to track medical guideline updates and navigate insurance reimbursement protocols. Perplexity AI states that the model's in-line citations are essential in this context to provide the transparency required for medical decision-making 1.

In sales and go-to-market (GTM) automation, the platform Copy.ai has integrated the model to streamline background research on potential clients and target organizations 1. According to the developer, this integration has allowed sales representatives to reduce manual research time by approximately eight hours per week, contributing to a reported 20% increase in rep throughput 1.

Ideal and Non-Ideal Scenarios

Sonar Pro is intended for in-depth research tasks that involve complex, multi-step queries 1. Its architecture, which supports a larger context window and provides significantly more citations per search than the standard Sonar model, makes it suitable for detailed technical, financial, or academic inquiries 1, 4. It is specifically positioned for "fact-seeking" tasks where information from static training data may be outdated or insufficient 4.

Conversely, the model may not be the optimal choice for applications requiring extreme low latency or simple, single-turn interactions. Perplexity AI indicates that its standard Sonar model—which is lighter and faster—is more appropriate for basic question-and-answer features where speed is prioritized over depth of research 1. Furthermore, because Sonar Pro's pricing involves variable costs based on the number of searches performed per query, it may be less suited for high-volume, low-complexity tasks where budget predictability is the primary requirement 4.

Reception & Impact

Sonar Pro's industry reception has been characterized by its emphasis on speed and citation-backed accuracy. On the SimpleQA benchmark, which measures factual correctness in short answers, the model recorded an F-score of 0.858; Perplexity AI claims this performance exceeds that of comparable models from Google and OpenAI 1, 4. The model’s speed is attributed to its integration with Cerebras inference infrastructure, which reportedly enables a decoding throughput of 1,200 tokens per second 2. Early enterprise adoption includes integrations with Copy.ai for go-to-market research and Zoom’s AI Companion 2.0, where the model facilitates real-time searches within video conferencing sessions 1, 4.

Despite these performance metrics, the model and its underlying search engine have been central to discussions regarding the ethical implications of AI-driven search. An investigation by WIRED characterized the service as a "bullshit machine," reporting that the system bypassed exclusion protocols to scrape content from websites and occasionally presented fabricated information as fact 5. This criticism highlights a broader industry tension between generative search tools and the rights of web publishers.

The economic impact of Sonar Pro and similar "answer engines" has raised significant concerns within the digital marketing and publishing sectors. The shift toward AI-generated summaries is frequently described as a "visibility crisis" that challenges traditional search engine optimization (SEO) practices 7. Industry research indicates that 71.5% of users now utilize AI tools for information retrieval, often resulting in "zero-click searches" where users receive answers directly within the AI interface without visiting the source websites 6. This impact is noted to be particularly severe for manufacturers and firms providing technical documentation; their structured data is highly susceptible to AI summarization, which can lead to declining web traffic even when search rankings remain stable 8.

In the competitive landscape, Sonar Pro is positioned as a direct rival to Google’s AI Overviews and OpenAI’s SearchGPT 4, 7. While Perplexity states that Sonar Pro provides twice as many citations on average as its lighter models to improve transparency, industry analysts argue that this model of information delivery risks eroding the financial viability of the publishers it cites 4, 8. Consequently, some digital strategy specialists advocate for a transition toward "Generative Engine Optimization" (GEO), a strategy focused on ensuring brands remain visible and authoritative within the summaries generated by models like Sonar Pro 7.

Version History

The Sonar model family was developed as a specialized successor to Perplexity AI's earlier search-oriented models. On January 21, 2025, Perplexity officially transitioned the Sonar API into general availability, establishing a tiered structure that introduced Sonar Pro as the flagship model alongside a standard, lightweight Sonar version 1, 4.

Model Foundation and Migration

The underlying architecture of the Sonar family has evolved through successive iterations of Meta's Llama models. The current version of Sonar Pro is built upon the Llama 3.3 70B foundation 2. Perplexity states that this version was specifically fine-tuned to improve two primary dimensions: answer factuality and readability 2. This transition from earlier Llama 3-based versions was designed to enhance the model's ability to resolve conflicting information and ground responses more effectively in real-time search results 2.

API and Feature Evolution

With the formal launch of the Sonar Pro API in early 2025, several technical features were introduced to provide developers with greater control over retrieval-augmented generation (RAG) processes. These updates included:

  • Domain Filtering: A feature allowing users to restrict or prioritize specific web domains during the search and retrieval phase 1, 4.
  • Structured Outputs: The introduction of JSON mode for select usage tiers to facilitate integration into programmatic workflows 1.
  • Increased Citation Density: Sonar Pro was updated to provide approximately twice as many citations per search as the standard Sonar model 1, 4.
  • Context Window Expansion: The Pro tier was granted a larger context window to accommodate more complex, multi-step queries and extensive follow-up interactions 1.
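The structured-output feature in the list above can be sketched as a request body. The `response_format` shape shown here follows OpenAI-style JSON-schema conventions and is an assumption, as is the example schema; verify the exact field names and supported usage tiers against Perplexity's current documentation.

```python
import json

# Hypothetical request body asking Sonar Pro for structured (JSON) output.
# The response_format shape follows OpenAI-style conventions; verify the
# exact field names against Perplexity's current API documentation.
request_body = {
    "model": "sonar-pro",
    "messages": [
        {"role": "user", "content": "Summarize today's top EV-market headline."}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "schema": {
                "type": "object",
                "properties": {
                    "headline": {"type": "string"},
                    "summary": {"type": "string"},
                    "citations": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["headline", "summary", "citations"],
            }
        },
    },
}

# The body serializes to plain JSON, so it can be sent with any HTTP client.
serialized = json.dumps(request_body)
```

Constraining the output to a schema like this is what lets downstream code parse the model's findings directly, rather than scraping fields out of free-form prose.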

Infrastructure Updates

In tandem with model updates, Perplexity integrated the consumer-facing version of Sonar with Cerebras inference infrastructure. According to the developer, this integration enables a decoding throughput of 1,200 tokens per second 2. While the API version of Sonar Pro did not initially launch on this infrastructure, Perplexity indicated that a migration to Cerebras for API users was planned following the initial rollout 2.

Sources

  1. Introducing the Sonar Pro API by Perplexity. Retrieved March 24, 2026.

     "For enterprises seeking more advanced capabilities, the Sonar Pro API can handle in-depth, multi-step queries with added extensibility, like double the number of citations per search as Sonar on average. ... Sonar Pro leads this benchmark with an F-score of 0.858."

  2. Perplexity launches Sonar, an API for AI search | TechCrunch. Retrieved March 24, 2026.

     "Perplexity on Tuesday launched an API service called Sonar, allowing enterprises and developers to build the startup's generative AI search tools into their own applications. ... Perplexity claims Sonar Pro outperformed leading models from Google, OpenAI, and Anthropic on a benchmark that measures factual correctness... SimpleQA."

  4. Perplexity Is a Bullshit Machine - WIRED. Retrieved March 24, 2026.

     "A WIRED investigation shows that the AI-powered search startup Forbes has accused of stealing its content is surreptitiously scraping—and making things up out of thin air."

  5. The Future of Search: SEO and AI Impacts Explained. Retrieved March 24, 2026.

     "71.5% of people report using AI tools for search... AIOs contribute to the growth of zero-click searches, where users get answers directly in search results without clicking through to a website."

  6. Navigating AI Search vs Traditional SEO: Strategies to Stay Visible in 2025 and Beyond. Retrieved March 24, 2026.

     "As AI-generated answers replace traditional search results, your website may no longer appear in the same places... Generative Engine Optimization (GEO) is a new strategy that blends traditional SEO with AI search optimization."

  7. Why AI Search Is Quietly Killing Traditional SEO for Manufacturers. Retrieved March 24, 2026.

     "Manufacturing websites typically publish structured technical information... This type of content is highly compatible with AI summarisation systems. Search engines can extract the key facts and present simplified answers directly on the results page."

  8. TestingCatalog AI News (@testingcatalog) on Threads. Retrieved March 24, 2026.

     "BREAKING 🚨: Perplexity announced Sonar Pro, a new real-time search API which outperforms other competitors at the SimpleQA benchmark 👀"

Production Credits

Research: gemini-2.5-flash-lite (March 24, 2026)
Written By: gemini-3-flash-preview (March 24, 2026)
Fact-Checked By: claude-haiku-4-5 (March 24, 2026)
Reviewed By: pending review (March 25, 2026)

This page was last edited on March 26, 2026 · First published March 25, 2026