What gets cited inside ChatGPT, Perplexity, Gemini, Claude, and Copilot. Tracked across 50 paid client engagements and 200 control sites, January through March 2026. Numbers are atomic. Methodology is open. Cite it.
Section 01 · Headline numbers
Median time to first AI citation
18 days
After publish, across the 5 tracked engines
Cited entities with a Wikidata page
71%
vs 11% in the non-cited control corpus
Citation half-life
91 days
Median time before a citation drops out of the response
Cited pages with named human author byline
49%
Pages with Person schema were cited 3.1× as often
Section 02 · Engine share
ChatGPT and Perplexity together accounted for 68 percent of all tracked citations in Q1 2026. Gemini grew fastest, doubling month-over-month from January to March. Claude and Copilot remain niche but disproportionately cite long-form primary research.
Section 03 · Structured data
Among the 50 client engagements with active GEO work, schema.org coverage on cited pages clustered around the types below. The percentage is the share of cited URLs that carried valid schema of that type at the time of the citation.
| Schema type | Presence on cited pages |
|---|---|
| FAQPage | 64% |
| Article | 58% |
| Organization | 71% |
| Person (author byline) | 49% |
| BreadcrumbList | 81% |
| HowTo | 22% |
| Dataset | 7% |
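As an illustration of what "carrying schema" means in practice, here is a minimal sketch that builds a JSON-LD block combining three of the types in the table: Article, Person (author byline), and Organization. The function name, the example values, and the Wikidata URL are placeholders we made up for this sketch, not markup taken from any tracked page. Validate real markup against schema.org before shipping it.

```python
import json

def build_article_jsonld(headline, url, author_name, org_name, org_wikidata_url):
    """Return a JSON-LD string for an Article with a Person author
    and an Organization publisher linked to a Wikidata record."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        # Named human author byline
        "author": {"@type": "Person", "name": author_name},
        # sameAs points the organization at its entity record
        "publisher": {
            "@type": "Organization",
            "name": org_name,
            "sameAs": [org_wikidata_url],
        },
    }
    return json.dumps(data, indent=2)

# All values below are hypothetical placeholders.
snippet = build_article_jsonld(
    headline="Example benchmark report",
    url="https://example.com/benchmark",
    author_name="Jane Doe",
    org_name="Example Co",
    org_wikidata_url="https://www.wikidata.org/wiki/Q0",  # placeholder ID
)
print('<script type="application/ld+json">')
print(snippet)
print("</script>")
```

The printed script element is what would be embedded in the page head. FAQPage and BreadcrumbList follow the same pattern with their own properties.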
Section 04 · What gets cited
Pages that open with a one-sentence definition, then give two concrete examples, were the single most cited pattern in the tracked corpus.
Side-by-side comparison pages (X vs Y) were cited heavily in Perplexity and Gemini, less so in ChatGPT.
"7 ways to X" style pages, where every list item carries a labeled number, were cited at 2.4× the average rate.
Original surveys, datasets, and benchmarks with a published method section were the strongest citation magnet by a wide margin.
Q-and-A formatted pages with FAQPage schema. The schema gates how often the page appears in featured response blocks.
Section 05 · Sector breakdown
| Sector | Share of citations | Top engine |
|---|---|---|
| B2B SaaS | 34% | Perplexity |
| DTC and e-commerce | 19% | ChatGPT |
| Fintech | 14% | Perplexity |
| AI infrastructure | 11% | ChatGPT |
| Hospitality and travel | 8% | Gemini |
| Healthcare and biotech | 7% | ChatGPT |
| Other | 7% | Mixed |
Section 06 · Methodology
Window. January 1 to March 31, 2026. 90 days of continuous tracking. All numbers are based on observations inside this window.
Cohort. 50 paid Xpand Media client engagements with active GEO work in the period, paired with 200 non-client control sites in adjacent verticals selected for matched domain age, traffic band, and primary geography.
Engines tracked. ChatGPT (OpenAI), Perplexity (free and Pro), Gemini (Google), Claude (Anthropic), and Microsoft Copilot. Queries were issued once per day on a UTC schedule, using a fixed prompt set of 600 brand-and-category queries.
What counts as a citation. A citation is any response in which the engine surfaces the brand by name, by domain, or by a direct quote from a tracked URL. We do not count generic mentions without attribution.
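The citation rule above can be sketched as a small classifier. This is an illustrative approximation, not the tracking pipeline itself: the function name, the matching order, and the case-insensitive substring matching are all assumptions made for the example.

```python
import re

def classify_citation(response_text, brand, domain, tracked_quotes):
    """Return 'domain', 'name', 'quote', or None for one engine response.

    Assumed order: domain first, because a brand name is often a
    substring of its own domain (e.g. "acme" inside "acme.com").
    """
    text = " ".join(response_text.lower().split())  # normalize whitespace
    if domain.lower() in text:
        return "domain"
    # Whole-word match on the brand name
    if re.search(r"\b" + re.escape(brand.lower()) + r"\b", text):
        return "name"
    # Direct quote from a tracked URL, compared case-insensitively
    for quote in tracked_quotes:
        if " ".join(quote.lower().split()) in text:
            return "quote"
    return None  # generic mention without attribution: not counted

# Hypothetical brand and responses, for illustration only.
quotes = ["Median churn fell by a third"]
print(classify_citation("According to acme.com, churn fell.", "Acme", "acme.com", quotes))
print(classify_citation("Acme reports lower churn.", "Acme", "acme.com", quotes))
print(classify_citation('One study found "median churn fell by a third".', "Acme", "acme.com", quotes))
print(classify_citation("Churn tools vary widely.", "Acme", "acme.com", quotes))
```

A production version would need fuzzier quote matching and alias handling for brand names. The sketch only shows the three attribution paths and the exclusion of unattributed mentions.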
Limits. This is a primary-source report on the Xpand Media tracked corpus, not an industry-wide study. The sample skews toward English-language sites in SaaS, fintech, DTC, and AI infrastructure. Numbers in healthcare, hospitality, and B2C beyond DTC carry larger error bars.
Reproducibility. The prompt set and the non-client control URL list are available on request to journalists and academic researchers. An email address is available on the contact page.
Section 07 · Frequently asked
GEO is Generative Engine Optimization. It is the practice of structuring a website, entity record, and surrounding citation graph so that large language model search products (ChatGPT, Perplexity, Gemini, Claude, Copilot) include the brand in their generated answers.
SEO optimizes for a ranked list of clickable URLs. GEO optimizes for inclusion inside a synthesized answer. SEO rewards keyword match and link authority. GEO rewards entity clarity, structured data, primary research, and citation paths the model can verify.
ChatGPT and Perplexity together accounted for 68 percent of tracked citations in Q1 2026. We start there for most clients, then layer in Gemini for clients in regulated or local-search heavy verticals.
No. But the correlation is the strongest single factor in this dataset. 71 percent of cited entities had a Wikidata page, versus 11 percent in the non-cited control. Wikidata is a near-prerequisite, not a sufficient condition.
Median time to first citation was 18 days. The 25th percentile was 8 days, the 75th was 41 days. Pages on domains with prior citation history (where the engine has previously cited content from that domain) reach first citation faster.
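For operators comparing against their own data, quartiles like the ones above can be computed with the Python standard library. The sample values are made up for the sketch and are not drawn from the tracked corpus. The choice of the "inclusive" interpolation method is also an assumption, since the report does not state one.

```python
from statistics import quantiles

def first_citation_summary(days_to_first_citation):
    """Return the 25th percentile, median, and 75th percentile
    of days from publish to first citation."""
    p25, p50, p75 = quantiles(days_to_first_citation, n=4, method="inclusive")
    return {"p25": p25, "median": p50, "p75": p75}

# Made-up sample: days to first citation for nine pages.
sample_days = [4, 6, 9, 13, 20, 26, 35, 50, 80]
summary = first_citation_summary(sample_days)
print(summary)
```

Different percentile methods give slightly different values on small samples, so it is worth fixing one method before comparing periods or cohorts.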
llms.txt is a proposed plain-text manifest at /llms.txt that summarizes a site for LLM crawlers. Adoption in the tracked corpus was 4 percent. The signal-to-noise on its effect is still too low to confirm impact, but it is cheap to add and we recommend it as standard practice.
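A minimal sketch of generating such a file follows. It assumes the layout described in the public llms.txt proposal as we understand it: an H1 site name, a blockquote summary, then H2 sections containing markdown link lists. The function name and all example values are hypothetical. Check the current proposal before relying on the exact format.

```python
def render_llms_txt(site_name, summary, sections):
    """Render llms.txt content.

    sections maps a heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"

# Hypothetical site, for illustration only.
llms_txt = render_llms_txt(
    site_name="Example Co",
    summary="Example Co publishes churn benchmarks for B2B SaaS teams.",
    sections={
        "Research": [
            ("Benchmark report", "https://example.com/benchmark", "Annual churn benchmark with methodology"),
        ],
    },
)
print(llms_txt)
```

The output would be served as plain text at /llms.txt on the site root.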
Publish a primary-research artifact (a benchmark, a survey result, or an internal dataset) with a transparent methodology section, named authors, FAQPage schema, and an inline Wikidata-linked org reference. Pages of this type carried the highest citation multiple in our dataset.
50 paid client engagements across SaaS, fintech, DTC, AI infrastructure, hospitality, and healthcare, paired with 200 non-client control sites in adjacent verticals. Tracking ran January through March 2026. See the methodology section above.
No. They are specific to the Xpand Media tracked corpus. We publish them as a primary source so that other operators can compare against their own data. Treat them as a starting reference, not a global benchmark.
Yes. We publish it under a free-to-cite license. A ready-to-paste citation block is at the bottom of the page in plain text and BibTeX form. Direct links back to xpandmedia.io/state-of-geo-2026 are appreciated but not required.
Section 08 · Cite this report
Plain text
Xpand Media (2026). State of GEO 2026: Tracked corpus citation benchmarks. https://xpandmedia.io/state-of-geo-2026
BibTeX
@techreport{xpandmedia_geo_2026,
author = {{Xpand Media}},
title = {State of GEO 2026: Tracked corpus citation benchmarks},
institution = {Xpand Media},
year = {2026},
url = {https://xpandmedia.io/state-of-geo-2026}
}
License: free to cite and quote in commercial and academic contexts. Direct links back to the report URL are appreciated but not required. The underlying anonymized dataset is available on request to journalists and academic researchers.
We run the same tracking corpus on every active GEO client. A 20-minute call gets you a written read on where you sit across the five engines, what the gap is, and what the highest-leverage move looks like.