
Microsoft Built The AI Visibility Tool Google Still Doesn’t Have — Here’s What We Found

What Are the Key Takeaways From Bing’s AI Performance Dashboard?

  • Microsoft launched AI Performance inside Bing Webmaster Tools on February 9th, 2025, making it the first tool from any major search engine to show publishers exactly how their content is cited in AI-generated answers.
  • A single long-form article published by skincare brand SeoulCeuticals accumulated 3,200 AI citations in under four weeks, roughly 10 times more than any other page on the site.
  • Grounding Queries, the most valuable section in the dashboard, shows the sub-queries AI systems fire when retrieving your content, revealing how AI is positioning your brand, not how humans are searching for it.
  • Citations measure machine retrieval events, not human traffic: high citation volume does not guarantee click-throughs, and brands must build a commercial path from cited content to product pages.
  • The dashboard covers Microsoft Copilot and Bing AI Summaries directly, with partial ChatGPT coverage; Google Gemini, Perplexity, and Claude are not included.

General Summary

On February 9th, 2025, Microsoft released AI Performance inside Bing Webmaster Tools, giving publishers the first dedicated dashboard from any major search engine to measure how content performs inside AI-generated answers. Google Search Console still blends AI Overview activity into general organic data, making it impossible to isolate. Bing AI Performance changes that by reporting total citations, page-level citation activity, visibility trends, and, most valuably, Grounding Queries: the sub-queries AI systems fire when retrieving content to build a response. For e-commerce brands investing in content, this tool reveals which pages are shaping AI answers, which authority signals AI is associating with specific URLs, and where the gap between AI citation and human conversion is costing them revenue. The data is free, it is available to any site verified in Bing Webmaster Tools, and most brand owners have not looked at it yet.

Extractive Summary

Microsoft launched AI Performance inside Bing Webmaster Tools on February 9th, giving publishers measurable data on AI citation activity for the first time. Grounding Queries shows the sub-queries AI systems fire to retrieve your content, not the questions humans type, creating a direct window into how AI is positioning your brand. High citation volume does not translate directly into traffic, because citations are machine retrieval events, and without a commercial path from cited content to a product page, brands are building visibility with no way to capture revenue from it. The dashboard covers Microsoft Copilot and Bing AI Summaries primarily, with partial visibility into ChatGPT, while Gemini, Perplexity, and Claude remain outside its scope. The practical playbook is four steps: set up AI Performance, find the highest-cited page, analyse Grounding Queries for language drift, and audit that page for a clear link to a product.

Abstractive Summary

AI search has been running a parallel economy inside brand content for months. Every time a buyer asks ChatGPT or Copilot a product question, an AI system is deciding which brands are authoritative, which sources to cite, and what information to surface. Until Bing AI Performance launched, that economy was invisible to the brands being evaluated inside it. The deeper implication is not just that visibility is measurable now. It is that the gap between citation and commerce is the central strategic problem for every brand investing in content. AI systems are building authority profiles for brands based on what they cite. Brands that understand which pages are earning citations, what questions those citations are answering, and whether those pages connect to a purchase path will compound an advantage that competitors who are not watching the data cannot replicate. The brands ignoring this are donating content strategy to AI systems without capturing any of the commercial return.

What Is Bing AI Performance and Why Does It Matter?

Bing AI Performance is the first dedicated dashboard from any major search engine that shows publishers how their content performs inside AI-generated answers, launched by Microsoft on February 9th, 2025. It tracks how often individual pages are cited when Bing’s AI and Microsoft Copilot build responses to user queries, giving brands a measurable signal where none existed before.

Before this, brands had two options. Type questions into ChatGPT or Copilot and hope their name appeared. Or check bot traffic in Google Analytics and make educated guesses about what the crawls meant. Neither method produced reliable data. Neither told you which specific pages were doing the work.

Google Search Console does not separate AI Overviews from traditional organic traffic. The data is blended. A page that earns 10,000 impressions from AI Overviews and 500 from conventional search results looks identical to a page with 10,500 organic impressions. There is no way to isolate it. There is no way to act on it.

Bing changed that. The dashboard shows five data points: total citations across the site, average cited pages per day, page-level citation activity by URL, visibility trends over time, and Grounding Queries. The first four establish scale. The fifth one tells you why.

Access is free and immediate. Any site verified in Bing Webmaster Tools can go to bing.com/webmasters/aiperformance right now. Verification takes under two minutes. Most brand owners have not done it.

The dashboard also carries a disclaimer at the top that is worth reading carefully. It states that the data shown represents a sample of overall activity. The numbers are real measurements, not estimates. But they are not capturing everything. That matters when interpreting citation counts: if a page shows 3,200 citations, the actual number of retrieval events is likely higher.

What Did We Find When We Pulled Up the Dashboard Live?

One page had 3,200 AI citations. The next highest page on the same site had 334. The gap was not even close.

This happened on a live client call with SeoulCeuticals, a K-beauty skincare brand. We pulled up Bing AI Performance on screen share on March 18th. The page driving those citations was a long-form article titled ‘The Complete Guide to PDRN in Skincare: What the Research Actually Shows.’ It had gone live on February 23rd, less than four weeks earlier.

PDRN (polydeoxyribonucleotide) is a clinically backed skincare ingredient still relatively early in mainstream consumer awareness. The article was written to cover the topic comprehensively: mechanism of action, clinical evidence, concentration data, safety profile. It was long. It cited studies. It answered questions the way a medical reference source would.

That authority positioning is exactly what AI rewarded. While every other page on the site was earning hundreds of citations, one article was earning thousands, because AI systems had identified it as the most credible source on a specific topic.

The brand owner, Craig, looked at the data for about three seconds and said: ‘Do you think we should do a couple more PDRN articles?’ The instinct was right. But before replicating a result, you need to understand what actually drove it. That answer is inside Grounding Queries.

What the SeoulCeuticals result demonstrates is that AI citation is not distributed evenly across a content portfolio. One page can outperform the rest of a site by a factor of ten, because it happens to match the type of source AI is looking for when answering a specific category of question. Clinical depth on an emerging ingredient is exactly the kind of content AI retrieves when a buyer asks an informed question about that ingredient. Broad category articles, product comparison posts, and brand overview pages rarely earn the same citation density, because AI is not turning to them to answer specific technical questions.

What Are Grounding Queries and Why Do They Change Everything?

Grounding Queries are the sub-queries AI systems fire against the web index before building a response, and they are categorically different from the questions humans type. When someone asks Copilot ‘What is the best PDRN serum?’, the AI does not just answer from memory. It fires its own structured queries to retrieve verified information first. Those sub-queries are what appear in the Grounding Queries tab.

The distinction matters because the queries look nothing like search terms. They read as structured, clinical phrases: ‘polynucleotide clinical evidence skin regeneration’, ‘PDRN concentration wound healing study’, ‘polynucleotide anti-aging mechanism dermal fibroblasts’. These are machine-generated search strings optimised for precision retrieval, not human intent phrasing optimised for a search bar.

What Does Layer One of Grounding Query Analysis Tell You?

The first layer reveals what AI already associates with your content. If the grounding queries firing against a skincare article are medical reference phrases, AI is treating that page as a clinical source. That is a different authority positioning than ‘best PDRN serum to buy.’ Both framings might earn citations. Only one builds the kind of trust that places a brand ahead of competitors in AI-recommended product answers.

Knowing which positioning AI has assigned to each URL tells brands whether they are building the authority signal they intended, or an accidental one that stops short of commercial conversion.

What Is Language Drift and Why Should Brands Act on It?

Language drift occurs when AI associates a page with terminology the brand never actually wrote. Grounding queries may include phrases that do not appear anywhere in the original article. The AI is connecting those terms to the content anyway, based on semantic relationships it has inferred.

These phrases are content briefs. Each one represents a question AI is already trying to answer using content that only partially covers it. Writing a dedicated article on that exact phrase converts an accidental association into a deliberate one. Brands that systematically mine language drift from Grounding Queries and build content around those terms are compounding their authority with each piece they publish.

The practical move is straightforward. Screenshot the Grounding Queries tab. Drop every phrase into a spreadsheet. Sort by frequency. Separate phrases that suggest purchase intent from phrases that suggest pure research intent. A phrase like ‘best PDRN serum recommendation’ belongs in a different content track than ‘PDRN clinical trial wound healing.’ Both are opportunities. They are different types of opportunities, and they require different article formats, different calls to action, and different placements in the buyer journey.
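The spreadsheet step above can be sketched in a few lines of Python. The phrases and the purchase-intent keyword list below are illustrative assumptions, not data from the dashboard; a real workflow would paste in the actual Grounding Queries and tune the marker words to the brand’s category.

```python
from collections import Counter

# Hypothetical grounding-query phrases copied out of the dashboard.
phrases = [
    "polynucleotide clinical evidence skin regeneration",
    "pdrn concentration wound healing study",
    "best pdrn serum recommendation",
    "polynucleotide clinical evidence skin regeneration",
    "pdrn clinical trial wound healing",
]

# Crude keyword heuristic for splitting purchase intent from research intent.
PURCHASE_MARKERS = {"best", "buy", "recommendation", "price", "review"}

def classify(phrase: str) -> str:
    """Label a phrase 'purchase' if it contains any purchase-intent marker word."""
    words = set(phrase.lower().split())
    return "purchase" if words & PURCHASE_MARKERS else "research"

# Count repeats, then sort most-frequent first.
counts = Counter(phrases)
rows = sorted(
    ((count, classify(phrase), phrase) for phrase, count in counts.items()),
    reverse=True,
)
for count, intent, phrase in rows:
    print(f"{count}\t{intent}\t{phrase}")
```

The output is the two content tracks from the paragraph above: repeated research-intent phrases become clinical article briefs, and purchase-intent phrases become briefs for pages with a direct call to action.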

What Are Programmatic Queries and What Do They Signal?

Some entries in Grounding Queries look automated, because they are. GEO (generative engine optimisation) monitoring platforms systematically query the web to benchmark which brands AI systems treat as authoritative. When those programmatic queries fire against a specific URL, it means the content is being evaluated and catalogued by third-party systems tracking AI brand authority. Seeing those queries is a positive signal: the page has reached the threshold where it is worth monitoring.

What Is the Visibility Gap and Why Is It the Central Problem?

The Visibility Gap is the difference between AI citation volume and actual human traffic: a page can accumulate thousands of citations while generating almost no click-throughs. One documented case shows a page with over a thousand Bing AI citations during a period when it had three traditional Bing search impressions.

Citations in this dashboard are machine retrieval events. The AI fetched the page to build an answer. Whether a human then clicked through to the site is a separate question, and right now, AI Performance does not answer it. There is no click-through rate column. Microsoft acknowledges this gap. It has not been filled yet.

This means citation volume is a brand equity metric, not a revenue metric, until the brand builds the bridge between them.

The visibility gap is not unique to Bing. Perplexity, ChatGPT, and Gemini all build answers from cited sources, and none of them currently pass clean referral traffic signals back to publishers in a way that separates AI-driven visits from traditional organic ones. Bing is the first platform to show citation data directly. The gap it reveals exists everywhere AI generates answers from web content. What Bing AI Performance has done is make one corner of that invisible economy visible for the first time.

How Should Brands Think About Citations in Two Buckets?

The first bucket is authority. Every citation shapes AI’s internal model of what a brand knows, how expert it is, and which topics it owns. That model compounds. A brand cited consistently on a specific topic becomes the default reference for that topic. The position strengthens with each citation, whether the brand is paying attention or not.

The second bucket is commerce. Citation without conversion is brand equity with no revenue path attached. The question is not whether the AI cited the page. The question is whether someone who read an AI answer shaped by that page can find a product to buy. If there is no link from the informational article to the relevant product page, the brand has built a library with no gift shop.

What Was the Right Reaction to 3,200 Citations on a Single Page?

Craig did not celebrate for long. Within thirty seconds he said: ‘We need to put banners on this page. Banners that link directly to the PDRN product page.’ The article earns the citation. The citation puts the article in front of an AI-informed buyer. The article then needs to convert that buyer into a customer. The citation gets a brand into the room. The conversion architecture on the page closes the deal.

What Does Bing AI Performance Actually Cover and What Does It Miss?

Bing AI Performance tracks Microsoft Copilot and Bing AI Summaries directly, with partial coverage of ChatGPT through the Bing Search API relationship, but it does not cover Perplexity, Google Gemini, or Claude. This is a Microsoft ecosystem tool.

The ChatGPT coverage is real but shrinking. Because ChatGPT historically used the Bing Search API to browse the web, grounding queries fired by ChatGPT against the Bing index can appear in this dashboard. Craig’s read on the SeoulCeuticals data was that the citations reflected ChatGPT activity alongside Copilot. That was probably accurate at the time.

The overlap is declining. One analyst tracking 240 million ChatGPT citations found that Bing’s overlap with ChatGPT dropped from 26% in April 2024 to 8% by July 2024. ChatGPT is increasingly using Google’s index instead. The data in Bing AI Performance represents a partial ChatGPT picture: real signal, but not the complete story.

The dashboard’s sampling disclaimer applies here too: if a page shows 3,200 citations, the true number of retrieval events is higher. The tool is not undercounting deliberately; it is capturing what it can measure. That means the actual authority being built is larger than the dashboard reflects.

What Is the Practical Playbook for Using This Data?

Step one is access. Go to bing.com/webmasters/aiperformance. If the site is not verified in Bing Webmaster Tools, verification takes under two minutes. The dashboard is free, it is live, and it is already measuring content activity whether the brand owner has looked at it or not.

Step two is diagnosis. Go to the Pages tab. Find the highest-cited URL. If one page significantly outperforms every other page, the priority is understanding why before trying to replicate it. Topic selection, format, external authority signals like press coverage or backlinks, content depth: each of these can drive citation volume, and each one produces a different strategic response.
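The outlier check in step two can be done with a quick sketch. There is no documented bulk export assumed here; the URL-to-citation numbers below are hand-copied illustrative values, and the 5x threshold for flagging an outlier is an arbitrary assumption.

```python
# Hypothetical page-level citation counts copied out of the Pages tab.
citations = {
    "/guides/pdrn-complete-guide": 3200,
    "/blog/k-beauty-routine": 334,
    "/products/pdrn-serum": 120,
}

# Rank pages by citation count, highest first.
ranked = sorted(citations.items(), key=lambda kv: kv[1], reverse=True)
(top_url, top_count), (runner_url, runner_count) = ranked[0], ranked[1]
ratio = top_count / runner_count

print(f"Top page: {top_url} ({top_count} citations)")
if ratio >= 5:  # arbitrary outlier threshold
    print(f"Outlier: {ratio:.1f}x the next page — diagnose before replicating")
```

If one URL clears the threshold, that page goes into the Grounding Queries audit in step three before any new content is commissioned.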

Step three is the Grounding Queries audit. Screenshot the queries. Drop them into a spreadsheet. Look for phrases that repeat. Look for terminology the brand never wrote. Look for queries that suggest purchase intent rather than pure ingredient research. That data is telling you what AI thinks the brand is authoritative on, and what questions it is trying to confirm when it retrieves the page.

Step four is the commercial audit. Whatever the highest-cited page is, open it and ask one question: can someone reading this page find a product to buy? If the answer is no, fix that before writing a single new piece of content. The citation is already being earned. The only thing missing is the path to revenue.
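The commercial audit in step four can be partially automated: scan the cited article’s HTML for at least one link that points at a product page. This is a minimal sketch using only the standard library; the `/products/` URL pattern and the sample HTML are assumptions, so substitute whatever path structure the store actually uses.

```python
from html.parser import HTMLParser

class ProductLinkFinder(HTMLParser):
    """Collect hrefs that look like product-page links ('/products/' is assumed)."""

    def __init__(self):
        super().__init__()
        self.product_links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href", "")
            if "/products/" in href:
                self.product_links.append(href)

# Hypothetical article markup standing in for the fetched page HTML.
article_html = """
<article>
  <h1>The Complete Guide to PDRN in Skincare</h1>
  <p>Clinical evidence, concentration data, safety profile...</p>
  <a href="/products/pdrn-serum">Shop the PDRN serum</a>
</article>
"""

finder = ProductLinkFinder()
finder.feed(article_html)
print("Commercial path exists:", bool(finder.product_links))
```

An empty `product_links` list is the ‘library with no gift shop’ failure mode: the page is earning citations with no path to revenue.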

Why Is This Tool Available When Google’s Equivalent Still Isn’t?

Microsoft has a structural incentive Google does not. Bing’s market share in traditional web search sits below 4% globally. Copilot and AI-powered Bing represent the strongest growth lever Microsoft has in search. Giving publishers tools to measure and optimise for that AI channel directly serves Microsoft’s interest in making Bing AI indispensable to the content ecosystem.

Google’s position is different. Google Search is the dominant platform. Disaggregating AI Overview data from organic data gives publishers visibility into a dynamic that Google may prefer to manage on its own terms. There is no evidence Google will add this transparency soon. The incentive structure does not push in that direction.

For brands, this means Bing AI Performance is the best available proxy for a cross-platform phenomenon. The content strategies that earn citations in Microsoft’s AI ecosystem correlate strongly with strategies that earn citations in other AI systems. Comprehensive coverage of a specific topic, cited evidence, direct answers, low semantic distance between entity names and their attributes: these patterns perform well across Copilot, ChatGPT, Gemini, and Perplexity. Bing AI Performance gives brands a measurement framework for testing what works, even for the platforms it does not directly cover.

The practical implication is that brands should treat Bing AI Performance as a signal layer, not a complete measurement solution. When a content strategy moves the needle in Bing’s AI ecosystem, the underlying factors driving that result almost certainly operate across other platforms too. Topic authority, citation density, entity clarity, and content structure are not Bing-specific variables. They are the factors AI systems across the board use to evaluate source quality. Optimising for what this dashboard measures is optimising for AI search broadly.

What Should Brands Do Right Now?

The gap between brands using this data and brands ignoring it will widen fast. AI-cited content builds brand authority in AI systems automatically, whether the brand is measuring it or not. The brands measuring it can act on it. The ones not measuring it are donating content strategy to AI systems and capturing none of the commercial return.

Setting up Bing AI Performance takes two minutes. The data has already been accumulating. Go look at it.
