AI Overview citation patterns (GEO/AEO)

reference · Scope: marketing-site · Status: current

Created 2026-06-11 · Updated 2026-06-25

Summary

Overview

"AI Overview citation patterns" denotes the empirical question of which web pages get cited inside generative search surfaces — Google's AI Overviews and AI Mode, ChatGPT's web-grounded answers, Perplexity, Microsoft Copilot, and the smaller Anthropic / DeepSeek / Mistral surfaces — and what content, technical, and entity-graph properties predict that citation. The discipline is variously called Generative Engine Optimization (GEO) and Answer Engine Optimization (AEO); Whitespark's 2026 Local Search Ranking Factors report was the first practitioner survey to add "AI Search Visibility" as a formal ranking category, marking the moment AI citation moved from a side-bet to a first-class line item in local SEO planning.

This page consolidates the directly measured evidence as of mid-2026. It draws on one peer-reviewed primary study (Princeton's GEO paper, Aggarwal 2024), one peer-reviewed observational study (Schanbacher's SSRN paper on German real-estate sites), and a set of high-volume vendor analyses (Ahrefs' 17 M-citation freshness study; Seer's 3,119-query CTR study; Profound's 680 M-citation cross-platform overlap analysis; OtterlyAI's 1 M-citation third-party split; Surfer SEO's 36 M-AI-Overview / 46 M-citation tracker; BrightEdge's AI Overview prevalence tracker; Digital Applied's 5,000-site schema audit). It also catalogues a production rule-set — what is shipped on a typical client site, what is refused, and where the honest answer is "the direct evidence is thin and an SMB should optimize for customers, not for an unverified citation hypothesis."

Three things should be carried into any reading of the rest of the page. First, the strongest single primary finding is from the Princeton GEO paper, where Quotation Addition lifted AI-response visibility by +41% on Position-Adjusted Word Count (PAWC), Statistics Addition by +31%, and Cite Sources and Fluency Optimization tied at +28%; the only tactic that hurt visibility was keyword stuffing, at -8% to -10%. Second, the strongest single revenue translation is Seer's September 2025 study: brands cited inside an AI Overview earn +35% organic CTR and +91% paid CTR versus uncited brands on the same SERPs, while the AI Overview itself collapses overall organic CTR by 61% and paid CTR by 68%. Third, the strongest single "do not over-promise" finding is Dan Taylor's January 2026 Search Engine Land analysis of 107,352 AI-Overview/AI-Mode webpages, which found Core Web Vitals carry only weak negative Spearman correlations with AI visibility (LCP r = -0.12 to -0.18, CLS r = -0.05 to -0.09) — i.e., CWV is a gate, not a growth lever. The honest synthesis is that content patterns (quotations, statistics, citations, freshness, structured BLUF formatting) do most of the work; schema is hygiene; speed is a constraint, not a signal.

See the Aggarwal et al. Generative Engine Optimization paper (arXiv:2311.09735 v3, KDD '24), Google's May 15, 2026 generative-AI optimization guidance, Measurement frameworks for small-business websites for AI Overview prevalence, and Google Search Central's query fan-out documentation for the upstream primary sources, alongside the multi-vendor decoupling evidence catalogued in the "Indirect evidence" section below.

The Princeton GEO paper (Aggarwal et al., 2024) — the canonical primary study

The Princeton GEO paper (Aggarwal et al., Generative Engine Optimization, arXiv:2311.09735 v3) is the only peer-reviewed primary study that tests discrete content tactics against AI-response visibility under controlled conditions. The paper was published at ACM SIGKDD KDD '24 in Barcelona (DOI 10.1145/3637528.3671900), with affiliations across Princeton, Georgia Tech, the Allen Institute for AI, and IIT Delhi. It introduces GEO-bench (~10,000 queries drawn from 9 source datasets and spanning 25 domains) and evaluates nine tactics against two metrics: Position-Adjusted Word Count (PAWC), which weights citation share by where the cited passage appears in the AI response, and Subjective Impression (an LLM-rated quality score). The main answer engine is GPT-3.5-turbo; a 200-query subset is validated on the live Perplexity.ai engine. Table 6 of v3 reports the per-tactic lifts on PAWC.
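PAWC can be made concrete with a small sketch. The version below assumes each answer sentence contributes its word count, down-weighted by an exponential decay in sentence position, to the sources it cites. The decay form, the even split across co-cited sources, the function name, and the toy answer are illustrative assumptions, not a reproduction of the paper's code.

```python
import math
from collections import defaultdict

def position_adjusted_word_share(sentences):
    """sentences: list of (text, [source_ids]) in answer order.

    Returns {source_id: share}. Each sentence's word count is
    down-weighted by position (assumed weight: exp(-pos / n)) and
    split evenly across the sources that sentence cites.
    """
    n = len(sentences)
    total_words = sum(len(text.split()) for text, _ in sentences)
    if total_words == 0:
        return {}
    scores = defaultdict(float)
    for pos, (text, sources) in enumerate(sentences):
        if not sources:
            continue  # uncited sentences count toward the total only
        weight = math.exp(-pos / n)
        words = len(text.split())
        for src in sources:
            scores[src] += words * weight / len(sources)
    return {src: score / total_words for src, score in scores.items()}

# Hypothetical three-sentence answer citing sources "a" and "b".
answer = [
    ("Quotations lift visibility the most in the benchmark.", ["a"]),
    ("Statistics come second.", ["b"]),
    ("Keyword stuffing reduces visibility.", ["a", "b"]),
]
shares = position_adjusted_word_share(answer)
```

Under these assumptions, a source cited early and at length (here "a") earns a larger share than one cited briefly further down, which is the intuition behind reporting lifts on PAWC rather than on raw citation counts.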

Claim (verbatim): "Including citations, quotations from relevant sources, and statistics can significantly boost source visibility, with an increase of over 40% across various queries." Top methods (Cite Sources, Quotation Addition, Statistics Addition) "achieved a relative improvement of 30–40% on the Position-Adjusted Word Count metric and 15–30% on the Subjective Impression metric," with "visibility improvements up to 37%" on the live engine Perplexity.ai.

Source: Aggarwal et al., GEO: Generative Engine Optimization (Princeton / Georgia Tech / Allen Institute for AI / IIT Delhi), arXiv:2311.09735 v3, ACM SIGKDD KDD '24 — https://arxiv.org/abs/2311.09735 (Jun 28, 2024).

Confidence: Verified (peer-reviewed).

Caveat: The lifts were measured on edits to visible page text, NOT schema markup. Visibility is measured by citation-share in synthesised answers, not real click traffic. The Subjective Impression metric uses an LLM-as-judge, which carries the usual circularity caveats. Live-engine validation is a single 200-sample test on Perplexity. Tested on 2024-era engines (GPT-3.5, Perplexity); persistence on Gemini 3.5 / GPT-5 / Claude 4 / Sonar Pro is unverified.

Quotation Addition: +41% on PAWC (the top single tactic)

Quotation Addition — inserting verbatim quoted passages from named external sources — was the best-performing single tactic at +41% on PAWC. The paper's abstract and §1 state: "Including citations, quotations from relevant sources, and statistics can significantly boost source visibility, with an increase of over 40% across various queries." Confidence is Verified (primary read of paper v3).

An attribution error circulates in the SEO industry: it credits Statistics Addition with the +41% number. The paper itself names Quotation Addition as the +41% tactic on PAWC, and many second-generation SEO briefs inherit the confusion. Where the two disagree, cite the paper, not the SEO blogs.

Statistics Addition: +31% on PAWC (the #2 tactic)

Statistics Addition — inserting concrete numbers, percentages, dates, and currency values into prose — lifted PAWC by +31%, the second-best result in the study. The practical implication for KB-shaped content is direct: every claim that can carry a number should carry one. Vague qualifiers like "many businesses" or "most of the time" are precisely the low-lift pattern the paper measured. Confidence: Verified (primary).

The PAWC metric is the most defensible of the paper's citation measures because it counts actual word-share in the synthesised answer rather than relying on a judge's rating. Across the three top tactics — Cite Sources, Quotation Addition, Statistics Addition — the band of relative improvement on PAWC sits at 30-40%.

Claim: Top methods (Cite Sources, Quotation Addition, Statistics Addition) "achieved a relative improvement of 30–40% on the Position-Adjusted Word Count metric" (PAWC — word-share in the synthesised answer, weighted by position).

Source: Aggarwal et al., arXiv:2311.09735 v3 — Jun 28, 2024.

Confidence: Verified (peer-reviewed).

Cite Sources: +28% on PAWC (tied with Fluency Optimization)

Cite Sources — adding named external citations comprising author + institution + URL — lifted PAWC by +28%. Fluency Optimization (rewriting for readability without changing substantive claims) tied at +28%. Confidence: Verified (primary).

The proposed mechanism is that AI retrievers extract more reliably from passages that anchor claims to a named institution + date. The signal is the same one Wikipedia's verifiability policy enforces — attribution makes a passage extractable as a fact, not as an opinion.

The paper's softer companion metric — Subjective Impression, an LLM-rated measure of how favourably the source comes across in the synthesised answer — moved 15-30% under the same top tactics.

Claim: Top methods achieved a 15–30% relative improvement on the Subjective Impression metric — an LLM-rated measure of how favourably the source comes across in the synthesised answer.

Source: Aggarwal et al., arXiv:2311.09735 v3 — Jun 28, 2024.

Confidence: Verified (peer-reviewed).

Caveat: Subjective Impression uses an LLM as judge, so the metric carries the usual circularity caveats of LLM-as-judge methodology. Pair with the PAWC band (the harder metric) for a balanced citation.

Keyword Stuffing: -8% to -10% (the only tactic that hurts)

Of the nine tactics tested, keyword stuffing was the only one that performed worse than baseline, driving visibility down 8-10%. Practitioners carrying forward 2018-era SEO instincts (keyword density targets, exact-match anchors, semantic stuffing of variants) measurably reduce their AI citation eligibility. On this lever, GEO and old-school SEO now pull in opposite directions. Confidence: Verified (primary).

Rank-5 pages gain +115.1% (the biggest interventional upside)

The paper's strongest leverage finding is that the Cite Sources treatment lifted AI-response visibility by +115.1% for sources ranked fifth in the search results feeding the answer engine, while the top-ranked source lost visibility on average (roughly 30%) under the same treatment. The intervention pays off most where the marginal page sits: the middle of page 1. Confidence: Verified. The implication for client work is that sites ranking #3-#8 organically have the biggest upside from structured / GEO content; sites already ranking #1 should still ship the GEO patterns (insurance + extractability), because lower-ranked competitors that adopt them can take share from the top. The decoupling between organic rank and AI citation documented in the "Indirect evidence" section below extends this picture: rank-5 leverage is the controlled-experiment view; the observational vendor studies show the same pattern at scale.

Combined lifts

Combined Fluency + Statistics reached +35.8% on PAWC, well below the sum of the two individual lifts (+28% and +31%), so the lifts should not be read as additive. The paper does not report all pairwise combinations; the defensible operating assumption is that stacking Quotation + Statistics + Cite Sources on the same page is the durable pattern, with diminishing returns on each added tactic. The cap on combined lift is not measured.

Cross-engine validation: the GEO methods carried over to a live answer engine in a 200-sample Perplexity test.

Claim: "Visibility improvements up to 37%" on the live engine Perplexity.ai (200-sample test).

Source: Aggarwal et al., arXiv:2311.09735 v3 — Jun 28, 2024.

Confidence: Verified for the test result.

Caveat: The sample is 200 queries on a single engine. It is the only live-engine validation reported in the paper; it crosses the gap from "works in the synthetic benchmark" to "works on a shipping AI answer engine," but the sample is modest.

Freshness as a measured citation signal

Ahrefs (2025, 17 M citations): AI-cited content is 25.7% fresher on publish date

Ahrefs analyzed 17 million AI citations and found AI-cited URLs averaged 1,064 days since publication versus 1,432 days for organic SERP results — a 25.7% freshness advantage on publish date. On last-updated date the advantage narrowed to 13.1% (909 days vs 1,047). Source: https://ahrefs.com/blog/do-ai-assistants-prefer-to-cite-fresh-content/. Confidence: Verified.

Claim: Ahrefs (Despina Gavoyannis), 16.975 M citations across 7 AI platforms — "the average age of URLs cited by AI assistants is 1,064 days, compared to 1,432 days for URLs in organic SERPs — 25.7% 'fresher'… ChatGPT is most likely to cite newer pages."

Source: https://ahrefs.com/blog/fresh-content/ — 2026.

Confidence: Verified (primary measurement on a large sample).

Caveat: Ahrefs sells SEO tools; this is a direct measurement on its own crawl — vendor source, large-N primary data.
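The freshness percentages can be re-derived from the mean ages quoted above. The short check below treats the advantage as the relative gap against the organic SERP mean; the function name is hypothetical, and because the inputs are the rounded day counts from the text, the second figure lands within rounding of the reported 13.1% rather than exactly on it (presumably the source used unrounded means, which is an assumption).

```python
def freshness_advantage(cited_days, serp_days):
    """Relative age gap, as a percentage of the organic SERP mean age."""
    return (serp_days - cited_days) / serp_days * 100

publish_gap = freshness_advantage(1064, 1432)   # publish-date means
updated_gap = freshness_advantage(909, 1047)    # last-updated means

print(f"publish date: {publish_gap:.1f}%")
print(f"last updated: {updated_gap:.1f}%")
```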

A separate "67% more citations for recently updated pages" figure has circulated in some 2026 SEO writeups but could not be located in the Ahrefs source. The defensible numbers are 25.7% (publish) and 13.1% (updated) — not 67%. The finding pairs with Seer Interactive's October 2025 recency study (below): two independent datasets confirm the recency bias. A dateModified field that updates on real content changes is structurally beneficial; a dateModified that updates on cosmetic CSS changes is gaming and likely to be discounted.
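One way to keep dateModified honest is to tie it to a fingerprint of the visible text, so that template or CSS changes never bump the date. The sketch below is a minimal illustration under that assumption; the function names are hypothetical, and the regex-based tag stripping is a simplification that a production build would likely replace with a real HTML parser.

```python
import hashlib
import re
from datetime import date

def content_fingerprint(html):
    """Hash the visible text only, ignoring tags, styles, and whitespace."""
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    text = " ".join(text.split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def next_date_modified(old_html, new_html, old_date, today):
    """Bump dateModified only when the visible text actually changed."""
    if content_fingerprint(old_html) != content_fingerprint(new_html):
        return today
    return old_date

# Hypothetical page revisions.
original = '<p class="a">Open 9 to 5 on weekdays.</p>'
restyled = '<p class="b" style="color:red">Open 9 to 5 on weekdays.</p>'
rewritten = '<p class="a">Open 9 to 6 on weekdays.</p>'

published = date(2026, 1, 10)
today = date(2026, 6, 25)
cosmetic_result = next_date_modified(original, restyled, published, today)
content_result = next_date_modified(original, rewritten, published, today)
```

Here the restyled revision keeps the original date, while the rewritten revision takes the new one.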

Seer's complementary measurement on 5,000+ AI-cited URLs is the other half of the recency picture — a percentile distribution by content age rather than a mean-age comparison.

Claim: Seer Interactive analyzed 5,000+ AI-cited URLs (October 2025) and found a strong recency bias across ChatGPT, Perplexity, and Google AI Overviews: 65% of AI bot hits target content published in the past year; 79% in the past 2 years; 89% in the past 3 years; 94% in the past 5 years. Only 6% of hits target content older than 6 years.

Quote (Seer): "Nearly 65% of log hits were for content published within the past year (2025). 79% of total hits targeted content from the last two years (2024-2025). 89% of hits were on content updated within the last three years (2023-2025). 94% of hits occurred on content published within the last five years (2021-2025). Only 6% of hits were on content older than six years."

Source: Seer Interactive, Study: AI Brand Visibility and Content Recency, October 2025 — https://www.seerinteractive.com/insights/study-ai-brand-visibility-and-content-recency.

Confidence: Verified.

Whitehat SEO (2026): 3.2× more citations within 30 days

Whitehat SEO's 2026 cross-platform analysis reports: "Content updated within 30 days earns 3.2× more AI citations across platforms." Perplexity is the most freshness-sensitive — 82% citation rate for 30-day content vs 37% for older. Source: https://whitehat-seo.co.uk/blog/ai-engines-comparison-citations. The directional consistency with the Ahrefs and Seer findings is the load-bearing point; the absolute multipliers are vendor-published and should be flagged as such.

Profound (Aug 2024-Jun 2025, 680 M citations): cross-platform overlap is low

Profound's analysis of 680 million citations across the Aug 2024-June 2025 window found that only 11% of domains are cited by both ChatGPT and Perplexity, and Google AI Overviews and Google AI Mode cite the same URLs only 13.7% of the time. Source: https://www.tryprofound.com/blog/ai-platform-citation-patterns. Confidence: Verified. The strategic read is that citation strategy is not single-target — winning on Perplexity does not automatically win on ChatGPT, and winning on AI Overviews does not automatically win on AI Mode.
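A site owner can measure the same kind of fragmentation on their own citation exports. The sketch below computes the share of domains cited by both of two platforms; the source text does not state Profound's exact overlap formula, so the intersection-over-union definition used here is an assumption, as are the function names and sample URLs.

```python
from urllib.parse import urlparse

def cited_domains(urls):
    """Normalise cited URLs to bare lowercase domains."""
    return {urlparse(u).netloc.lower().removeprefix("www.") for u in urls}

def overlap_share(urls_a, urls_b):
    """Domains cited by both platforms, as a share of all domains cited by either."""
    a, b = cited_domains(urls_a), cited_domains(urls_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical citation exports from two platforms.
platform_one = [
    "https://en.wikipedia.org/wiki/Search_engine",
    "https://www.reddit.com/r/example/1",
    "https://example.com/guide",
]
platform_two = [
    "https://reddit.com/r/example/2",
    "https://www.youtube.com/watch?v=abc",
]
share = overlap_share(platform_one, platform_two)
```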

Companion citation-volume data from Qwairy Q3 2025: Perplexity averages 21.87 citations per response; ChatGPT averages 7.92 (https://www.qwairy.co/blog/provider-citation-behavior-q3-2025). The combination — high citation volume, freshness sensitivity, preference for proprietary data — makes Perplexity the platform where a normalized open-data dashboard or original-research page is most likely to be cited.

Schema and structured data as citation signals — the contested zone

The evidence on schema is the most internally contradictory part of the 2025-2026 AI-citation literature. Three findings sit in tension; the reconciliation matters for how the question is framed to clients.

Schanbacher (SSRN 2025): FAQPage and Product schema strongly predict ChatGPT visibility

Schanbacher (SSRN paper id 5641050) studied 1,508 German real-estate agent websites and found:

FAQPage schema: odds ratio ~13 for ChatGPT visibility (p<0.001)
Product schema: odds ratio ~4 (p<0.001) — sites with Product schema had 17.2% visibility vs 1.8% without (~10× lift)
Mobile-friendly: OR ~5.2
robots.txt present: OR ~3.4
Multi-level headings: h2 OR ~3.3, h3 OR ~2.3

Confidence: Verified (peer-reviewed) — but single domain (real estate) and single country (Germany). Generalizability across industries/languages is unverified.
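The "~10× lift" and the odds ratio for Product schema are different quantities, which is easy to miss when quoting them side by side. The worked arithmetic below derives both a rate ratio and a crude (unadjusted) odds ratio from the two visibility rates in the list above. The crude odds ratio comes out well above the reported OR ~4, which would be consistent with the reported figure being adjusted for other predictors; that reading is an assumption, not something stated in the text here.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)

with_schema = 0.172     # visibility rate with Product schema
without_schema = 0.018  # visibility rate without it

rate_ratio = with_schema / without_schema
crude_odds_ratio = odds(with_schema) / odds(without_schema)

print(f"rate ratio: {rate_ratio:.1f}")
print(f"crude odds ratio: {crude_odds_ratio:.1f}")
```

When citing the study to a client, quote the rate ratio for the "how much more often" question and keep the odds ratio for the "how strong is the predictor" question.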

Ahrefs (April 2026): no AI-citation lift from adding JSON-LD on already-cited pages

In direct counterpoint, Ahrefs' April 2026 controlled study tracked 1,885 pages adding JSON-LD against 4,000 controls between Aug 2025 and Mar 2026.

Claim: "Adding schema produced no major uplift in citations on any platform." AI Overviews on treated pages declined 4.6% more than controls (statistically significant); AI Mode and ChatGPT showed small, non-significant changes.

Source: Ahrefs, We Tracked 1,885 Pages Adding Schema. AI Citations Barely Moved. April 2026 — https://ahrefs.com/blog/schema-ai-citations/.

Confidence: Verified.

Critical caveat: Every page in the study already had 100+ AI Overview citations before treatment. The study measures whether schema moves already-cited pages, not whether it helps pages get cited in the first place. Confounded for acquisition; valid for incremental lift.

Contested by: Suganthan Mohanadasan's "three lives of schema" framing and SchemaApp. Both argue the study measures only one of three schema effects (real-time query fetch) and misses the larger Knowledge Graph and training-data effects that compound over years.

Reconciliation: Schanbacher measures acquisition; Ahrefs measures increment

The two studies are not actually in conflict once the populations are read carefully. Ahrefs measured the increment on pages that were already cited 100+ times. Schanbacher measured acquisition on a population where many pages had zero AI visibility. Both findings can be true: schema helps a site cross from "invisible" to "cited" but does not help it go from "cited" to "cited more." This reading aligns with Suganthan Mohanadasan's "three lives of schema" framing — see below.

Suganthan: schema has three lives — index-time, training-time, query-time

Suganthan Mohanadasan argues schema operates in three layers: (1) index-time: Google Knowledge Graph entity disambiguation; (2) training-time: canonical entity stores that feed next-generation LLMs (Wikidata, Schema.org corpus); (3) query-time: real-time fetch by AI agents at the moment of a user query. Single-experiment studies like Ahrefs' April 2026 result only measure layer 3. Layers 1 and 2 compound over years, which is why most "schema is dead" hot-takes miss the point. Source: https://suganthan.com/blog/three-lives-of-schema-markup/. Confidence: Single-source synthesis (analytical framing, not empirical study).

This is the cleanest counter-frame to "Ahrefs proved schema is dead." When a client cites the Ahrefs result as a reason not to invest in schema, the honest position is that Ahrefs measured one slice; the other slices accumulate value the experiment cannot see.

Digital Applied 5 K-site audit (April 2026): 22% schema validity, r=+0.34 with AI citation

Digital Applied audited 5,000 production sites in April 2026 and found that 71% deploy at least one schema type, but only 22% pass Google's Rich Results Test cleanly. Pearson correlation between clean-validation rate and AI-citation frequency: r = +0.34. Source: https://digitalapplied.com/blog/schema-markup-adoption-5k-site-audit-2026. Confidence: Single-source (vendor-published; methodology not independently audited).

The implication is that "has schema" is a useless metric and "has valid schema" is the lever. Most installed-base schema is broken — missing required properties, conflicting nested types, scraping-error strings in title fields. The defensible production rule is to ship validated schema, not just emit <script type="application/ld+json"> and hope.
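A build step can enforce "validated, not just emitted" by refusing to render a JSON-LD block that fails a check. The sketch below is a minimal illustration: the REQUIRED table is a hypothetical stand-in for the real per-type requirements in Google's structured-data documentation, and the function names are assumptions. It is not a substitute for the Rich Results Test.

```python
import json

# Hypothetical minimum property sets, for illustration only; the
# authoritative requirements live in Google's structured-data docs.
REQUIRED = {
    "FAQPage": ["mainEntity"],
    "Article": ["headline", "datePublished"],
    "LocalBusiness": ["name", "address"],
}

def schema_problems(node):
    """Return a list of human-readable problems; empty means it passed."""
    problems = []
    node_type = node.get("@type")
    if node.get("@context") != "https://schema.org":
        problems.append("missing or unexpected @context")
    required = REQUIRED.get(node_type, []) if isinstance(node_type, str) else []
    for prop in required:
        if node.get(prop) in (None, "", [], {}):
            problems.append(f"{node_type}: missing {prop}")
    return problems

def render_json_ld(node):
    """Emit the script tag only when the node passes the checks."""
    problems = schema_problems(node)
    if problems:
        raise ValueError("; ".join(problems))
    # Escape "</" so a string value can never close the script element.
    payload = json.dumps(node, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'

good = {"@context": "https://schema.org", "@type": "LocalBusiness",
        "name": "Example Builders", "address": "1 Example St, Ottawa, ON"}
broken = {"@context": "https://schema.org", "@type": "Article",
          "headline": ""}
```

Calling render_json_ld(good) returns the script tag, while render_json_ld(broken) raises, so a broken block fails the build instead of shipping.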

Google's own guidance: schema is not required for AI Overviews

Google's official May 15, 2026 guidance states that schema markup is not required for AI Overviews. See Technical SEO standards and structured-data discipline (2026).

Quote (verbatim): "From Google Search's perspective, optimizing for generative AI search is optimizing for the search experience, and thus still SEO."

Quote on commodity content (verbatim): "Be sure that you're writing non-commodity content that your readers will find helpful and reliable. Commodity content (for example, something like '7 Tips for First-Time Homebuyers') is often based on common knowledge, which could originate from anyone, and typically adds little unique insight for readers."

Source: Google, Optimizing your website for generative AI features on Google Search — https://developers.google.com/search/docs/fundamentals/ai-optimization-guide (May 15, 2026).

Confidence: Verified (Google primary).

Tactics Google explicitly says are unnecessary: llms.txt files, content chunking, AI-specific rewriting, inauthentic brand mentions, over-indexing on structured data for AI.

Tactics Google rewards: non-commodity content with a unique point of view, crawlable indexable pages, images/video, strong E-E-A-T.

This is consistent with the reconciliation above: schema is not a citation-acquisition cheat code, but it is necessary infrastructure for entity disambiguation, rich-result eligibility, and Knowledge-Graph membership.

Whitespark 2026: structured data as a formal AI-visibility input

Whitespark's 2026 Local Search Ranking Factors report (surveyed 47 local SEO experts, published late 2025 / early 2026) added "AI Search Visibility" as a formal ranking category for the first time. Structured data, consistent citations, and curated list mentions are named as direct AI-visibility inputs. "Dedicated page for each service" ranks #1 in Local Organic factors and #2 in AI Visibility factors. Reputation.com's summary: "Structured content — clear GBP fields, accurate website copy, and schema markup — gives AI systems the context needed to generate accurate answers." Source: https://whitespark.ca/local-search-ranking-factors/. Confidence: Verified for the category addition; weight estimates (proximity ~55%, GBP ~32%, reviews 16-20%) are practitioner-estimated industry-consensus, not measured experimentally.

The implication for an SMB clinic, contractor, or professional-services firm: a two-location operation with LocalBusiness schema + one page per service + a structured FAQ library should outperform a single-page prose site at AI citation, even at equal word count. Practitioners now formally list structured data as an AI-visibility input, not just an SEO nice-to-have; the weighting is survey-estimated, not experimentally measured.

Schanbacher / FAQPage / Product — the practical takeaway

The single most actionable schema finding for SMB sites is the Schanbacher result on FAQPage and Product. Even with the single-vertical caveat, the odds-ratio magnitude (OR ~13 for FAQPage, OR ~4 for Product) is large enough that the cost-benefit on a well-structured FAQ block is overwhelmingly positive — JSON-LD is non-render-blocking, payload is 1-10 KB per page, and the worst case is "no measurable lift" rather than "harm."

Schema for contractor sites — 2026 Google guidance

Translated to the contractor vertical specifically, the 2026 Google guidance stack is:

GeneralContractor / HomeAndConstructionBusiness / LocalBusiness — the primary business-entity schema. Pick the most specific applicable type.
Service schema per service line — kitchen renovation, basement finishing, custom home, etc.
Review and AggregateRating — subject to Google's authenticity requirements (reviews must be first-party, not third-party-collected).
Person schema for principals — with sameAs links to LinkedIn for E-E-A-T credentialing; pair with hasCredential for Gold Seal / P.Eng. / PMP.
FAQPage schema on service pages — surfaces in AI Overviews and zero-click features.
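Article schema on project case studies, with datePublished, address, image, photographer credit where applicable. (Schema.org's Project type is an Organization subtype, not an Article subtype, so Article is the type to emit.)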

Sources: schema.org; Google Search Central. Confidence: Verified for the schema types; Industry-consensus for the contractor-vertical adoption recommendations.

Based on HTTP Archive structured-data tracking, contractor sites lag the broader local-business average by roughly 15 percentage points in schema adoption — a directional finding (HTTP Archive 2024/2025 structured-data chapter; not vertical-isolated). With 45% of consumers now using ChatGPT/AI for local recommendations (BrightLocal LCRS 2026), schema-rich pages are increasingly the ones that get cited. The aphorism: schema-rich content gets cited; HomeStars profile pages do not.

The complete schema build-out for a Tier-2 Ontario ICI (industrial, commercial, institutional) general-contractor site is roughly:

Organization (root, in layout)
LocalBusiness / GeneralContractor per office location
Service per service-line page
Person + hasCredential per team-bio
FAQPage per service page
Article per project case study
BreadcrumbList site-wide
Organization + memberOf for HBA affiliations on the /affiliations page
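Part of the build-out above can be sketched as a single JSON-LD @graph assembled from plain site data. The example covers only the Organization, GeneralContractor, Service, and Person + hasCredential entries; the domain, business name, slugs, and person are hypothetical, and the property choices (parentOrganization, provider, worksFor) are one reasonable way to link the nodes rather than a prescribed layout.

```python
import json

SITE = "https://example.com"  # hypothetical domain

def build_graph(offices, services, people):
    """Assemble one JSON-LD @graph from plain site data."""
    org_id = f"{SITE}/#organization"
    graph = [{"@type": "Organization", "@id": org_id,
              "name": "Example Builders", "url": SITE}]
    for office in offices:
        graph.append({
            "@type": "GeneralContractor",
            "@id": f"{SITE}/locations/{office['slug']}#business",
            "name": f"Example Builders {office['city']}",
            "parentOrganization": {"@id": org_id},
            "address": {"@type": "PostalAddress",
                        "addressLocality": office["city"],
                        "addressRegion": "ON",
                        "addressCountry": "CA"},
        })
    for service in services:
        graph.append({
            "@type": "Service",
            "@id": f"{SITE}/services/{service['slug']}#service",
            "name": service["name"],
            "provider": {"@id": org_id},
        })
    for person in people:
        graph.append({
            "@type": "Person",
            "name": person["name"],
            "worksFor": {"@id": org_id},
            "sameAs": [person["linkedin"]],
            "hasCredential": [
                {"@type": "EducationalOccupationalCredential", "name": c}
                for c in person["credentials"]
            ],
        })
    return {"@context": "https://schema.org", "@graph": graph}

doc = build_graph(
    offices=[{"slug": "ottawa", "city": "Ottawa"}],
    services=[{"slug": "tenant-fit-out", "name": "Tenant fit-out"}],
    people=[{"name": "A. Example",
             "linkedin": "https://www.linkedin.com/in/a-example",
             "credentials": ["P.Eng.", "Gold Seal Certified"]}],
)
print(json.dumps(doc, indent=2))
```

Linking every node back to one Organization @id keeps the entity graph unambiguous when the same business appears on several pages.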

The technical preconditions — server-render and crawler access

The single largest hidden barrier to AI citation is not content quality; it is technical reachability. OtterlyAI's 1 M-citation study found 73% of audited sites have technical barriers (robots.txt blocks, JS-only rendering) preventing AI crawler access. The fix is non-optional; it is the floor below which content-level optimization does nothing.

Server-render rule

Every page of a client site should serve rendered HTML at the URL — either statically generated (Next.js export, Astro, Hugo) or server-side rendered (Next.js App Router with server components, classic SSR). Never ship a single-page-app shell where content arrives via client-side JavaScript.

AI crawlers (GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot) do not execute JavaScript at scale.

Claim: Multiple measurement sources indicate that AI crawlers do not execute JavaScript at scale.

OtterlyAI 1 M-citation study: 73% of audited sites blocked AI crawlers via robots.txt or JS-only rendering.

Cloudflare crawler logs (Jan-July 2025): GPTBot / ClaudeBot / PerplexityBot fetched raw HTML; no JS execution observed.

Sources: https://otterly.ai/blog/the-ai-citations-report-2026/; https://ekamoira.com/blog/ai-citations-llm-sources.

Confidence: Industry-consensus.
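The robots.txt half of the barrier is cheap to audit. The sketch below uses Python's standard urllib.robotparser to list which of the crawlers named above would be refused a given URL; the robots.txt content and URL are hypothetical, and the user-agent tokens are taken from the text rather than verified against each vendor's current documentation.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot"]

def blocked_crawlers(robots_txt, url):
    """Return the AI crawler tokens that robots.txt disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]

# Hypothetical robots.txt that singles out one crawler.
ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

blocked = blocked_crawlers(ROBOTS, "https://example.com/services/kitchen-renovation")
```

An empty result means no listed crawler is blocked at the robots.txt level; it says nothing about whether the page content is present in the raw HTML.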

A site that requires JS to render its content is invisible to AI engines — and increasingly invisible to AI-augmented search overall, since 48% of queries now trigger an AI Overview. See Measurement frameworks for small-business websites.

A defensible default stack is Next.js 15 App Router with server components — every page renders to HTML at build time or on the server. Client components ("use client") are reserved for interactivity; the content is always in the server-rendered tree. For sites built on a client-rendered framework, migration to a static-generation tool should precede any AI-visibility work. There is no exception.
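A quick way to test the server-render rule is to check whether a key phrase exists in the raw HTML outside of script and style elements, which approximates what a non-JavaScript crawler sees. The sketch below works on HTML strings so that it stays self-contained; fetching the page (for example with urllib.request) is left out, and the class name, function name, and sample markup are hypothetical.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collects text found outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)

def phrase_in_raw_html(html, phrase):
    """True if phrase appears in the text a non-JS crawler would read."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return phrase.lower() in text.lower()

# Hypothetical responses: a client-rendered shell and a server-rendered page.
spa_shell = '<div id="root"></div><script>render("Kitchen renovation in Ottawa")</script>'
ssr_page = "<main><h1>Kitchen renovation in Ottawa</h1><p>Permits included.</p></main>"

shell_ok = phrase_in_raw_html(spa_shell, "Kitchen renovation in Ottawa")
ssr_ok = phrase_in_raw_html(ssr_page, "Kitchen renovation in Ottawa")
```

The shell fails the check even though the phrase is in the response body, because it only exists inside a script that the crawler will not run.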

Schema as hygiene, not growth lever

A defensible working rule: always ship clean, validated schema (LocalBusiness, Organization, Service, Product, Article, BreadcrumbList, FAQPage where genuinely applicable) on every client site. Never position schema to a client as the lever that will move AI citations on its own.

Four findings frame this stance. Two argue against treating schema as a growth lever: Google's May 15, 2026 guidance says schema is not required for AI Overviews (see Technical SEO standards and structured-data discipline (2026)), and Ahrefs' April 2026 controlled study on 1,885 pages found no AI-citation lift from adding JSON-LD, with treated pages declining 4.6% more than controls on AI Overviews. Two argue for shipping it anyway: the Schanbacher peer-reviewed real-estate study shows Product schema correlates with roughly 10× ChatGPT visibility on acquisition, and Whitespark 2026 added AI Search Visibility as a formal category with structured data named as a direct input. Schema is necessary infrastructure for entity disambiguation, rich results, and Knowledge Graph membership. It is not a citation hack. The citation lever is the content patterns that sit above schema: quotations, statistics, citations, freshness, comprehensive topic coverage.

How this applies in practice: a client site emits validated schema by default — generated programmatically from the same data source as the visible content, so the two never drift. Schema cost is essentially zero (JSON-LD is non-render-blocking, 1-10 KB per page). The argument made to clients is "this is the foundation; the citation work lives on top."
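The "same data source" idea can be sketched with an FAQ block: one list of question/answer pairs feeds both the visible HTML and the FAQPage JSON-LD, so the two cannot drift apart. The function names, the definition-list markup, and the sample FAQ are illustrative assumptions.

```python
import html
import json

# Single source of truth for both the visible FAQ and its markup.
FAQS = [
    ("Do you handle permits?",
     "Yes, permit applications are included in every quote."),
    ("Which areas do you serve?",
     "Ottawa and the surrounding region."),
]

def faq_html(faqs):
    """Render the visible FAQ as a definition list."""
    rows = [f"<dt>{html.escape(q)}</dt><dd>{html.escape(a)}</dd>"
            for q, a in faqs]
    return "<dl>" + "".join(rows) + "</dl>"

def faq_json_ld(faqs):
    """Build the matching FAQPage structured data from the same pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {"@type": "Question", "name": q,
             "acceptedAnswer": {"@type": "Answer", "text": a}}
            for q, a in faqs
        ],
    }

visible = faq_html(FAQS)
markup = faq_json_ld(FAQS)
print(json.dumps(markup, indent=2))
```

Editing FAQS changes the page and the markup together, which is the property the rule is after.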

llms.txt — skeptical

OtterlyAI's 90-day measurement reported that 84 of 62,100 AI-bot requests (0.1%) targeted /llms.txt files, fewer than an average content page on the same domains received. Source: Kai Spriestersbach, "The llms.txt is dead," Medium. Confidence: Single-source, corroborated qualitatively by reports that Google added llms.txt to its docs in December 2024 and removed it within 24 hours.
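The same measurement can be repeated on a site's own access logs before deciding whether llms.txt is worth maintaining. The sketch below assumes the log has already been parsed into (user agent, path) pairs; the marker list, function name, and sample requests are hypothetical, and the last line simply re-derives the reported share from the two counts in the text.

```python
AI_BOT_MARKERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot")

def llms_txt_share(requests):
    """requests: iterable of (user_agent, path) pairs.

    Returns the fraction of AI-bot requests that targeted /llms.txt.
    """
    bot_paths = [path for agent, path in requests
                 if any(marker in agent for marker in AI_BOT_MARKERS)]
    if not bot_paths:
        return 0.0
    hits = sum(1 for path in bot_paths if path.split("?")[0] == "/llms.txt")
    return hits / len(bot_paths)

# Hypothetical parsed log entries.
sample = [
    ("Mozilla/5.0 (compatible; GPTBot/1.0)", "/services/basement-finishing"),
    ("Mozilla/5.0 (compatible; ClaudeBot/1.0)", "/llms.txt"),
    ("Mozilla/5.0 (compatible; PerplexityBot/1.0)", "/faq"),
    ("Mozilla/5.0 (Windows NT 10.0) Chrome/125.0", "/llms.txt"),
    ("Mozilla/5.0 (compatible; GPTBot/1.0)", "/about"),
]
sample_share = llms_txt_share(sample)

reported_share_pct = 84 / 62_100 * 100  # counts quoted in the text
```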

The /llms.txt proposal was published by Jeremy Howard (Answer.AI) on September 3, 2024 — a curated Markdown file at the website root specifically for LLM retrieval (https://answer.ai/posts/2024-09-03-llmstxt.html). Mintlify rolled it out across all hosted docs sites in November 2024, making thousands of sites llms.txt-aware "practically overnight." The traffic data has not followed.

The defensible stance: don't ship llms.txt as a citation strategy. If it ships at all, it should ship as a courtesy — the durable value is structured content on the actual pages, not the index file. Acknowledging this skepticism early helps inoculate the client against trend-chasing.

Who gets cited — the third-party / first-party split

OtterlyAI (Sept 2025): community platforms capture 52.5% of citations

OtterlyAI's analysis of 1 M+ AI citations (Jan-Sept 2025) across ChatGPT, Perplexity, and Google AI Overviews found:

52.5% of citations go to community platforms (Reddit, Quora)
47.5% go to brand domains
73% of audited sites have technical barriers preventing AI crawler access

Source: https://otterly.ai/blog/the-ai-citations-report-2026/. Confidence: Verified (vendor-published; methodology disclosed).

Industry headline framing — "AI Search Engines Depend 95% on Third-Party Sources" — is rhetorically inflated. The actual split is closer to 50/50, but the 47.5% brand share is concentrated among large publishers, not SMB sites.

The strategic implication is a two-track posture. (1) Make the client's own site AI-crawlable (server-render rule, schema hygiene, freshness signals). (2) Build presence on Reddit, Quora, YouTube, and authoritative third-party publications, because that is where half the citations live.

The underlying reason structured surfaces (Wikipedia, Reddit threads, YouTube transcripts) dominate the citation pool is extractability: AI retrievers can lift atomic facts out of records far more reliably than out of prose or pixels.

Claim (synthesis): A structured catalogue is a body of records (products or any structured knowledge) in which each item is described by defined, independent attributes. Because the attributes are exposed as data rather than buried in prose or pixels, the catalogue can be (a) queried and filtered by visitors via faceted search, and (b) read by machines — search engines, AI answer engines, and downstream tools. The same content rendered as prose or a PDF is not query-eligible and is not individually indexable per record.

Source: Synthesis of USPTO Patent 8,700,594 (structured data is "searchable by data type"; unstructured data — bitmaps, audio, text docs — is not); Integrate.io / MongoDB structured-vs-unstructured definitions; Google Search Central's structured-data documentation (each individual element labelled so users can search by ingredient, calorie count, cook time).

Confidence: Industry-consensus (combining primary-source definitions with primary-source search-engine documentation).
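The difference between records and prose shows up as soon as the content is queried. The sketch below models a tiny catalogue whose items carry independent attributes and applies a faceted filter and a facet count; the record type, field names, and sample projects are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Project:
    name: str
    sector: str
    city: str
    year: int

# Hypothetical catalogue: every item exposes the same attributes as data.
CATALOGUE = [
    Project("Clinic fit-out", "institutional", "Ottawa", 2025),
    Project("Warehouse addition", "industrial", "Kingston", 2024),
    Project("School retrofit", "institutional", "Kingston", 2025),
]

def facet_filter(records, **facets):
    """Keep records whose attributes match every requested facet value."""
    return [r for r in records
            if all(getattr(r, key) == value for key, value in facets.items())]

def facet_counts(records, attribute):
    """Count records per value of one attribute, as a facet sidebar would."""
    return Counter(getattr(r, attribute) for r in records)

recent_institutional = facet_filter(CATALOGUE, sector="institutional", year=2025)
by_city = facet_counts(CATALOGUE, "city")
```

Neither operation is possible on the same three projects written up as a paragraph or exported as a PDF, which is the extractability gap the claim describes.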

Wikipedia, YouTube, Reddit: the top-cited domains

Ahrefs' Q1 2026 cross-platform data places Wikipedia as the #1 cited domain in Google AI Mode at 11.22% of all tracked mentions; YouTube is #2 at 9.51%. Surfer SEO Tracker (March-Aug 2025, 36 M AI Overviews, 46 M citations) reports YouTube ~23.3%, Wikipedia ~18.4%, Google.com ~16.4%. Sources: Surfer SEO (https://surferseo.com/blog/ai-citation-report/); The Digital Bloom (https://thedigitalbloom.com/learn/google-ai-overviews-top-cited-domains-2025/); Ahrefs Q1 2026. Confidence: Industry-consensus.

The common thread among top-cited domains is structured, neutral, schema-rich content with explicit citation chains. Wikipedia operationalizes verifiability; YouTube's metadata (description, chapters, transcripts) is structurally extractable; Reddit's thread structure surfaces atomic Q&A. This is the empirical case for KB-shaped content: dense, neutral, encyclopedic, factually attributed, structurally extractable.

Profound: cross-platform fragmentation

The 11% ChatGPT/Perplexity domain overlap and 13.7% AI Overviews / AI Mode URL overlap from the 680 M-citation Profound dataset mean that citation strategy cannot be single-target. A page that wins on ChatGPT (favors Wikipedia / authoritative-publisher patterns) will often not win on Perplexity (favors Reddit / community / proprietary-data patterns) without separate effort.

Perplexity in particular favours "visible statistics and proprietary data, named sources with verifiable methodology" (Leapd 2026). A normalized open-data dashboard surfaced as a public page hits all three — visible statistics, proprietary normalization, named methodology — which is one reason "build on official open data, not scraping" is a defensible production pattern.

A forward-looking corollary, flagged as speculative, is that AI answer engines may begin to cite useful tools — calculators, configurators, dashboards — directly, treating them as citation targets rather than ad slots.

Claim: AI answer engines (AI Overviews, AI Mode, Perplexity, ChatGPT search) may increasingly surface and cite useful tools, and brand mentions / links correlate with visibility in AI overviews — but this is early and contested.

Source: Practitioner observation; cross-link to the AI-search rollout history (AI Overviews Canada full rollout Oct 28, 2024; AI Mode Canada launch Aug 21, 2025, English-only; Whitespark Q2 2025: 15% local / 92% informational AI Overview presence).

Confidence: Directional-Speculative.

Caveat: Do not over-weight. This is the most speculative claim on the page — useful as a forward-looking line in client conversations, not as a planning assumption.

Citation behaviour by platform

Platform	Avg. citations / response	Notable bias
Perplexity	21.87 (Qwairy Q3 2025)	Freshness-sensitive — 82% citation rate for 30-day content vs 37% for older (Whitehat SEO 2026). Reddit-heavy.
ChatGPT	7.92 (Qwairy Q3 2025)	Wikipedia-heavy. Often cites pages ranking at organic position 21+ in related Google queries (~90% of the time, per Semrush).
Google AI Overviews	(volume varies)	54.5% of citations match top organic URLs (BrightEdge Oct 2025, up from 32% in 2024); overlap >75% in YMYL sectors.
Google AI Mode	(volume varies)	Wikipedia #1 at 11.22%; YouTube #2 at 9.51% (Ahrefs Q1 2026).
Microsoft Copilot	(not tracked here)	Bing-aligned.

The honest claim is that platform-specific source preferences are real and durable: Reddit-heavy for Perplexity, Wikipedia-heavy for ChatGPT, Bing-aligned for Microsoft Copilot.

An important nuance for the AI Overviews row above: AIO correlates with traditional rankings more than the other AI surfaces because Google reuses its existing index via a query-fan-out / FastSearch mechanism, while ChatGPT and AI Mode draw from a wider pool. Ranking #1 neither guarantees nor is required for citation, but the conditional probability is highest for AIO.

Claim: AI Overviews specifically correlate with traditional rankings more than other surfaces, because Google reuses its index via "query fan-out" / FastSearch. ChatGPT and AI Mode draw from a wider pool. So ranking #1 neither guarantees nor is required for citation.

Source: Industry analysis (BrightEdge, Ahrefs, vendor commentary, 2025–2026).

Confidence: Industry-consensus.

Direct evidence on technical signals — what is and is not a lever

Core Web Vitals are a gate, not a signal of excellence

Dan Taylor (SALT.agency), writing in Search Engine Land on January 13, 2026, analyzed n=107,352 webpages in AI Overviews / AI Mode. Spearman correlations: LCP r = -0.12 to -0.18 (weak negative); CLS r = -0.05 to -0.09. Verbatim conclusion: "Core Web Vitals do not act as a growth lever for AI visibility. They act as a constraint. Good performance does not create an advantage. Severe failure creates disadvantage… Core Web Vitals are therefore best understood as a gate, not a signal of excellence."

Confidence on the Taylor analysis: LOW — no methodology disclosure, no published dataset, single-contributor analysis on a Semrush-owned property. The directional read (CWV as gate) is consistent with broader evidence but the magnitudes should not be quoted as load-bearing.
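For readers less familiar with the statistic, a Spearman coefficient is the Pearson correlation computed on ranks rather than raw values. The sketch below is a plain-Python illustration of that calculation. The LCP and visibility numbers are invented, and the code makes no attempt to reproduce Taylor's undisclosed method or figures.

```python
from statistics import mean


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical pages: LCP in seconds and an arbitrary AI-visibility score.
lcp_seconds = [1.2, 2.5, 3.1, 4.8, 6.0]
visibility = [0.42, 0.40, 0.45, 0.30, 0.33]
print(round(spearman(lcp_seconds, visibility), 3))
```

Coefficients in the -0.05 to -0.18 range, as reported above, indicate that rank order on the speed metric explains very little of the rank order on visibility, which is why the directional "gate" reading is more defensible than any specific magnitude.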

Indirect evidence (moderate): overlap with organic top-10

The overlap picture has two co-existing layers. The first is the "old" finding that AI surfaces draw heavily from the organic top 10:

seoClarity: "a whopping 99.5% of the time one or more of the top 10 web results was included in the AI Overview's sources."
Ahrefs (July 2025): "76.1% of URLs cited in AI Overviews also rank in the top 10 of Google search results."
Semrush AI Mode Comparison (2025, n=5,000 keywords): Perplexity has "over 91% domain overlap and 82% URL overlap" with Google's top 10; ChatGPT has the weakest overlap. A separate Semrush study found ChatGPT cites pages ranking at position 21+ in related Google queries ~90% of the time.
BrightEdge (Oct 2025): 54.5% of citations in AI Overviews match top organic URLs (up from 32% in 2024); overlap >75% in YMYL sectors.
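The figures above mix URL-level and domain-level overlap, which can differ widely for the same data. The sketch below shows one plausible way to compute both, taking the cited URLs as the denominator. The function names and sample URLs are hypothetical, and each vendor's actual definition (denominator, URL normalization, sampling) may differ.

```python
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Hostname (lower-cased by urlsplit) with a leading 'www.' removed."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def overlap_share(cited: list[str], top10: list[str]) -> tuple[float, float]:
    """Return (url_share, domain_share) of citations also found in the top 10."""
    if not cited:
        return 0.0, 0.0
    top_urls = set(top10)
    top_domains = {domain_of(u) for u in top10}
    url_hits = sum(1 for u in cited if u in top_urls)
    domain_hits = sum(1 for u in cited if domain_of(u) in top_domains)
    return url_hits / len(cited), domain_hits / len(cited)


# Hypothetical citation list and organic top results for one query.
cited = [
    "https://www.example.com/guide",
    "https://example.com/pricing",
    "https://reddit.com/r/x/1",
    "https://other.org/a",
]
top10 = [
    "https://www.example.com/guide",
    "https://reddit.com/r/y/2",
    "https://news.site/b",
]
print(overlap_share(cited, top10))
```

In this toy case only one of four cited URLs is an exact top-10 match, while three of four cited domains appear in the top 10. This is why a domain-overlap figure and a URL-overlap figure from the same study should never be quoted interchangeably.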

The second layer is the early-2026 decoupling — multiple independent studies show AI citation diverging sharply from Google organic rank, fastest on the more AI-native surfaces.

Claim: A parallel Ahrefs study (863k keywords / 4 M URLs) puts page-1 overlap with AI Overview citations at 38%, down from 76% in July 2025.

Source: ahrefs.com — 2026.

Confidence: Industry-consensus (consistent direction with BrightEdge and Moz).

Caveat: Ahrefs is a vendor; the 863k-keyword sample is internal.

Claim: Moz found 88% of Google AI Mode citations are outside the organic top 10 (2026).

Source: moz.com — 2026.

Confidence: Industry-consensus (corroborates BrightEdge, Ahrefs).

The trajectory matters more than either single level: page-1 overlap on AI Overviews fell from 76% to 38% in roughly six to seven months, and 88% of AI Mode citations sit outside the top 10. The two systems are diverging fast, not converging. Other corroborating points from the same window: BrightEdge Feb 2026 reports ~17% organic-top-10 overlap on AI Overview citations; Brandlight's top-Google ↔ AI-cited overlap dropped from 70% to under 20%; eMarketer (Dec 2025, citing Ahrefs) reports only ~8% of ChatGPT and ~8.6% of Gemini citations come from Google's top 10.

The honest answer for clients: optimize performance for customers and conversion, not for hypothetical AI citation gains. Citation behaviour is driven by content quality, domain authority, freshness, structured BLUF formatting, and platform-specific source preferences. Speed is not a known direct lever; it is at most an indirect floor via organic rank.

Revenue translation — what AI citation is worth

Seer (Sept 2025): -61% organic CTR from AI Overviews, +35% for cited brands

Seer Interactive's September 2025 study of 3,119 queries / 42 organizations / 25.1 M organic impressions found:

Organic CTR fell 61% (1.76% → 0.61%) when an AI Overview appeared on a SERP.
Paid CTR fell 68%.
But brands cited inside an AI Overview earned +35% organic clicks and +91% paid clicks versus uncited brands on the same SERPs.

Source: Seer Interactive, September 2025 — https://www.seerinteractive.com/insights/study-ai-brand-visibility-and-content-recency; Inc.com analysis. Confidence: Verified.

The strategic implication for client books: the CTR collapse from AI Overviews is real (-61% organic). The halo from being cited inside one is also real (+35% / +91%). The middle position — visible on the SERP but not cited in the AI Overview — is the worst place to be. This is the strongest "AI visibility matters in revenue terms" claim available in the 2025-2026 literature.
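The three positions described above (no AI Overview, cited inside one, visible but uncited) can be tracked as segments in a client's own query data. The following is a minimal sketch under assumed inputs: the row fields and sample numbers are hypothetical and do not correspond to any vendor's export schema or to Seer's dataset.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class QueryRow:
    """One tracked query; field names are illustrative, not a vendor schema."""
    query: str
    aio_present: bool
    brand_cited: bool
    impressions: int
    clicks: int


def segment(row: QueryRow) -> str:
    """Place a query in one of the three positions discussed above."""
    if not row.aio_present:
        return "no_aio"
    return "aio_cited" if row.brand_cited else "aio_uncited"


def ctr_by_segment(rows):
    """Aggregate clicks / impressions per segment (impression-weighted CTR)."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [clicks, impressions]
    for row in rows:
        bucket = totals[segment(row)]
        bucket[0] += row.clicks
        bucket[1] += row.impressions
    return {s: c / i for s, (c, i) in totals.items() if i > 0}


def cited_share(rows) -> float:
    """Share of AI-Overview queries on which the brand is cited."""
    aio_rows = [r for r in rows if r.aio_present]
    if not aio_rows:
        return 0.0
    return sum(r.brand_cited for r in aio_rows) / len(aio_rows)


# Hypothetical tracking data for four commercial queries.
rows = [
    QueryRow("a", False, False, 1000, 20),
    QueryRow("b", True, True, 1000, 10),
    QueryRow("c", True, False, 1000, 5),
    QueryRow("d", True, False, 1000, 7),
]
print(ctr_by_segment(rows))
print(cited_share(rows))
```

Segment-level CTR from a single client's data is observational and small-sample, so it is better used to monitor direction over time than to restate the study's percentages.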

AI Overview prevalence

BrightEdge's tracking reports that 48% of queries triggered an AI Overview as of February 2026. See Measurement frameworks for small-business websites. The CTR-collapse / cited-brand-halo dynamic is therefore live on roughly half of tracked queries and is the dominant revenue story underneath the citation question.

Whitespark 2026: AI Search Visibility as a formal category

The Whitespark 2026 addition of AI Search Visibility as a formal category is itself a revenue signal — the practitioners closest to local-SEO budgets are treating AI citation as a budgeted line item, not a side experiment. "Dedicated page for each service" ranks #2 in AI Visibility factors, behind only proximity-style signals. The directional read is that the IA pattern most aligned with AI citation is also the pattern most aligned with local-organic ranking — there is no large trade-off between optimizing for one and the other.

Vertical signal — legal as a leading-edge AI adopter

Clio's Legal Trends reports show AI adoption among legal professionals accelerating sharply: 19% in 2023 → 79% in October 2024 → 93% among mid-sized firms in April 2025. Clio's CEO Jack Newton: "AI has reached the level of adoption the cloud took a decade to obtain." Companion finding: 64% of mid-sized firms offer flat-fee billing (specifically flat-fee, not all alternative fee structures).

Sources: https://www.clio.com/about/press/clio-latest-legal-trends-report/; Clio 2025 Legal Trends for Mid-Sized Law Firms (April 2025). Confidence: Verified (vendor-published; flag commercial interest). Clio has a commercial interest in legal-tech adoption stats, but the direction is consistent across other 2024-25 legal-tech surveys.

The relevance to AI citation: high-AI-adoption verticals are also the verticals where end-users are most likely to use AI tools to find professional-services providers, which is why the citation pattern matters earliest and most acutely in legal. Flat-fee billing is structurally different from hourly billing — a flat fee cannot be priced without historical data on per-matter cost. The 64% flat-fee figure signals that mid-sized law firms are doing the engagement-profitability analysis that vertical SaaS supports but few firms historically had the structured data for. The data layer + AI overlay is what makes flat-fee viable, and the citation question (does the firm's page get surfaced when a prospect asks ChatGPT "best Toronto employment lawyer for severance review") sits directly on top of that data-layer maturity.

Query Fan-Out — Google's primary retrieval mechanism

Underneath all of the above sits Google's Query Fan-Out — the mechanism by which a single user prompt is decomposed into many sub-queries that each pull candidate URLs, with the final AI response synthesized from the union.

Quote (Google primary): "AI Overviews and AI Mode may use a 'query fan-out' technique — issuing multiple related searches across subtopics and data sources."

Source: Google Search Central — https://developers.google.com/search/docs/appearance/ai-features.

Confidence: Verified (Google primary).

The practical reading is that "rank for the query" is a 2010s frame; "be surfaced by at least one of the fan-out sub-queries" is the 2026 frame. Comprehensive topic coverage — multiple sub-pages per service, FAQ blocks for adjacent intents, glossary entries for entity disambiguation — is the IA pattern that wins fan-out. A page that comprehensively answers a topic and its adjacent sub-questions gets cited across many sub-queries it never directly targeted; a narrow single-keyword page does not. This is the structural reason the KB-shaped site beats the single-prose-page site at equal word count, and the structural reason Whitespark's "dedicated page for each service" ranks #2 in AI Visibility factors.
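The fan-out idea can be illustrated with a toy model: one prompt expands into sub-queries, each sub-query returns candidate URLs, and the candidate pool is their union. Everything in the sketch below is hypothetical, including the expansion table, the index, and the URLs. Google does not publish its actual sub-queries, so this shows only the shape of the mechanism, not its implementation.

```python
from collections import Counter

# Hypothetical sub-query expansion; real engines do not publish theirs.
FAN_OUT = {
    "severance review lawyer": [
        "what is a severance package",
        "how much does a severance review cost",
        "employment lawyer near me",
    ],
}

# Hypothetical index: sub-query -> URLs a retriever might return.
INDEX = {
    "what is a severance package": [
        "kb.example/glossary/severance",
        "wiki.example/severance",
    ],
    "how much does a severance review cost": [
        "kb.example/services/severance-review",
    ],
    "employment lawyer near me": [
        "kb.example/services/severance-review",
        "narrow.example/lawyer",
    ],
}


def candidate_pool(prompt: str) -> Counter:
    """Union of URLs across sub-queries, counted by how many surfaced each."""
    hits = Counter()
    for sub_query in FAN_OUT.get(prompt, [prompt]):
        hits.update(set(INDEX.get(sub_query, [])))
    return hits


def site_coverage(hits: Counter, site: str) -> int:
    """Number of (sub-query, URL) hits that belong to one site prefix."""
    return sum(n for url, n in hits.items() if url.startswith(site))


pool = candidate_pool("severance review lawyer")
print(site_coverage(pool, "kb.example/"), site_coverage(pool, "narrow.example/"))
```

In this toy setup the multi-page site enters the pool through all three sub-queries while the single-page site enters through one, which is the structural advantage the paragraph above attributes to comprehensive topic coverage.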

What the direct evidence does and does not say

A fair-minded synthesis of 2026 industry reporting is that the direct, controlled evidence on AI search citation remains thin. The Princeton GEO paper is the only primary peer-reviewed study; Schanbacher is peer-reviewed but single-vertical / single-country; the rest is vendor-published observational analysis with disclosed methodology but variable sample composition and zero reproducibility infrastructure. The decoupling between organic rank and AI citation documented in the "Indirect evidence" section is the broader honesty note: ranking on Google does not guarantee AI citation, and vice versa.

The defensible operational position:

Ship the content patterns. Quotations, statistics, citations, freshness, BLUF formatting, comprehensive topic coverage — these are the primary-study-backed levers (Aggarwal 2024) and they also serve human readers.
Ship the technical floor. Server-render, validated schema, crawler access, fresh dateModified on real changes — these prevent disqualification and accumulate value in schema's index-time and training-time layers.
Build the third-party surface. Reddit / YouTube / authoritative-publisher presence, because roughly half of AI citations live there.
Do not over-promise. Speed is a gate, not a lever. Schema is hygiene, not a hack. llms.txt is courtesy, not a strategy. Direct evidence on most "AI optimization" claims is thinner than the trade press suggests.
Measure where measurement is possible. Seer's -61% / +35% / +91% framing is the closest thing to a defensible revenue case. Track cited-vs-uncited share on commercial queries, not abstract "AI visibility scores."
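The "fresh dateModified on real changes" rule in the technical-floor item can be enforced in a build step. The sketch below advances dateModified only when a hash of the page body changes, then emits a minimal schema.org Article object as JSON-LD. The function names, the whitespace-normalization choice, and the stored-hash workflow are assumptions for illustration, not a prescribed implementation.

```python
import hashlib
import json
from datetime import date


def content_fingerprint(body: str) -> str:
    """Hash of the whitespace-normalized body text."""
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def next_date_modified(body, stored_hash, stored_date, today):
    """Advance dateModified only when the body content has actually changed."""
    new_hash = content_fingerprint(body)
    if new_hash == stored_hash:
        return stored_hash, stored_date
    return new_hash, today


def article_json_ld(headline, published, modified):
    """Serialize a minimal schema.org Article object as JSON-LD."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }, indent=2)


# Hypothetical page state: a whitespace-only edit should not bump the date.
old_hash = content_fingerprint("Severance review: what to expect.")
_, kept = next_date_modified(
    "Severance  review: what to expect.\n", old_hash, date(2026, 1, 10), date(2026, 6, 1)
)
print(article_json_ld("Severance review", date(2025, 9, 1), kept))
```

Hashing only the body text means template or navigation changes do not move the date, which keeps the freshness signal tied to substantive edits.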

Sources and confidence

Verified (primary peer-reviewed): Aggarwal et al., Generative Engine Optimization, arXiv:2311.09735 v3, Table 6 (Quotation Addition +41%, Statistics +31%, Cite Sources +28%, Fluency +28%, Combined Fluency+Statistics +35.8%, Keyword Stuffing -8 to -10%, Rank-5 pages +115.1%, PAWC band 30-40%, Subjective Impression band 15-30%, Perplexity live-engine validation up to 37% on n=200). Schanbacher, The Impact of JSON-LD Metadata on ChatGPT Visibility, SSRN paper id 5641050 (FAQPage OR ~13, Product OR ~4, mobile OR ~5.2, robots.txt OR ~3.4, h2 OR ~3.3, h3 OR ~2.3) — single-vertical, single-country caveat.
Verified (vendor-published, methodology disclosed): Ahrefs 17 M-citation freshness study (25.7% / 13.1%; 1,064 vs 1,432 days); Ahrefs Q1 2026 cross-platform domain data (Wikipedia 11.22%, YouTube 9.51% in AI Mode); Ahrefs Q1 2026 page-1 overlap study (863k keywords / 4M URLs; 38% AIO overlap, down from 76% Jul 2025); Ahrefs April 2026 schema controlled study (1,885 treated / 4,000 control; -4.6% AIO delta vs control); Seer Interactive Sept 2025 (n=3,119 queries / 42 orgs / 25.1 M impressions; -61% organic CTR, -68% paid CTR, +35% / +91% for cited brands); Seer Interactive Oct 2025 recency study (5,000+ AI-cited URLs; 65/79/89/94% within 1/2/3/5 years); Profound 680 M-citation study (11% ChatGPT/Perplexity overlap, 13.7% AI Overviews / AI Mode overlap); OtterlyAI 1 M-citation analysis (52.5% community, 47.5% brand, 73% with technical barriers); Surfer SEO Tracker (36 M AI Overviews / 46 M citations); BrightEdge Oct 2025 (54.5% overlap with top organic, up from 32%); BrightEdge Feb 2026 (~17% AIO-citation overlap with organic top 10); Moz 2026 (88% of AI Mode citations outside the top 10); Qwairy Q3 2025 (Perplexity 21.87 / ChatGPT 7.92 citations per response).
Verified (Google primary): Google May 15, 2026 AI optimization guidance ("schema not required for AI Overviews"; "optimizing for generative AI search is optimizing for the search experience, and thus still SEO"; commodity-content quote); Google Search Central query-fan-out documentation ("issuing multiple related searches across subtopics and data sources").
Verified (vendor or practitioner primary, not Google): BrightEdge Feb 2026 (48% of queries trigger an AI Overview); Whitespark 2026 Local Search Ranking Factors (AI Search Visibility added as formal category; n=47 expert survey).
Verified — vendor-published, flag commercial interest: Clio Legal Trends (AI adoption 19% → 79% → 93%; 64% flat-fee among mid-sized firms); Whitehat SEO 2026 (3.2× freshness multiplier; Perplexity 82% vs 37% citation rate by recency).
Industry-consensus: Surfer SEO and The Digital Bloom top-cited-domain summaries; Suganthan Mohanadasan three-lives-of-schema framing (single-source synthesis); contractor-vertical schema build-out (schema.org + Google Search Central + HTTP Archive directional adoption data); Reputation.com summary of Whitespark structured-content quote; AIO-vs-ChatGPT query-fan-out / FastSearch correlation nuance; eMarketer Dec 2025 / Brandlight decoupling corroboration; AI crawlers do-not-execute-JS measurement (OtterlyAI + Cloudflare crawler logs Jan-July 2025); extractability-of-records-vs-prose synthesis (USPTO Patent 8,700,594; Integrate.io / MongoDB; Google Search Central recipe example).
Single-source / Directional: Digital Applied 5 K-site audit (71% deploy schema, 22% valid, r=+0.34 — vendor, methodology not independently audited); OtterlyAI llms.txt 0.1% traffic measurement; Dan Taylor / SALT.agency CWV analysis on n=107,352 webpages (LCP r=-0.12 to -0.18, CLS r=-0.05 to -0.09 — no methodology disclosure, no published dataset, single-contributor on Semrush-owned property).
Directional-Speculative: AI answer engines may increasingly surface and cite useful tools, with brand mentions / links correlated with AI Overview visibility — early and contested; useful as a forward-looking line, not a planning assumption.
Corrections / rejected industry framings: Princeton +41% credited to Statistics Addition in many SEO blogs — the paper itself names Quotation Addition. Ahrefs "67% more citations for recently updated pages" cannot be located in the source; use 25.7% (publish) / 13.1% (updated). OtterlyAI "AI Search Engines Depend 95% on Third-Party Sources" framing is rhetorically inflated; the actual split is 52.5% / 47.5%.

See Measurement frameworks for small-business websites and Technical SEO standards and structured-data discipline (2026) for the live cross-link reference cards anchoring this page.