
Primary research · Xpand Media corpus · Free to cite · Q1 2026 · 50 engagements · 200 control sites

State of GEO 2026.

What gets cited inside ChatGPT, Perplexity, Gemini, Claude, and Copilot. Tracked across 50 paid client engagements and 200 control sites, January through March 2026. Numbers are atomic. Methodology is open. Cite it.


Section 01 · Headline numbers

Four atomic facts from the Q1 2026 corpus.

Median time to first AI citation

18 days

After publish, across the 5 tracked engines

Cited entities with a Wikidata page

71%

vs 11% in the non-cited control corpus

Citation half-life

91 days

Median time before a citation drops out of the response

Cited pages with named human author byline

49%

Pages with a Person schema cited 3.1× as often

Section 02 · Engine share

Where the citations actually came from.

ChatGPT and Perplexity together accounted for 68 percent of all tracked citations in Q1 2026. Gemini grew fastest, doubling month-over-month from January to March. Claude and Copilot remain niche but disproportionately cite long-form primary research.

ChatGPT

41%

Perplexity

27%

Gemini

18%

Microsoft Copilot

Claude

Section 03 · Structured data

What schema types live on the pages that got cited.

Among the 50 client engagements with active GEO work, schema.org coverage on cited pages clustered around the types below. The percentage is the share of cited URLs that carried valid schema of that type at the time of the citation.

Schema type	Presence on cited pages
FAQPage	64%
Article	58%
Organization	71%
Person (author byline)	49%
BreadcrumbList	81%
HowTo	22%
Dataset	7%
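
Each presence figure is the share of cited URLs that carried valid schema of that type. A minimal sketch of that calculation follows, assuming each cited URL has already been reduced to the set of schema.org types that validated on it. The `cited_pages` records and the `schema_presence` helper are hypothetical illustrations, not the report's tooling.

```python
from collections import Counter

def schema_presence(cited_pages):
    """Return {schema_type: percent of cited URLs carrying that type}."""
    total = len(cited_pages)
    if total == 0:
        return {}
    counts = Counter()
    for types in cited_pages.values():
        # A set guards against counting a type twice on the same URL.
        counts.update(set(types))
    return {t: round(100 * n / total, 1) for t, n in counts.items()}

# Hypothetical sample: URL -> schema types found valid at citation time.
cited_pages = {
    "https://example.com/a": {"Article", "BreadcrumbList", "FAQPage"},
    "https://example.com/b": {"Article", "BreadcrumbList"},
    "https://example.com/c": {"BreadcrumbList", "Organization"},
    "https://example.com/d": {"FAQPage", "BreadcrumbList"},
}

presence = schema_presence(cited_pages)
```

Because one page can carry several schema types, the percentages are independent of each other and are not expected to sum to 100.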

Section 04 · What gets cited

Five content patterns the engines kept reaching for.

01 · Definition + 2 examples

28%

Pages that open with a one-sentence definition, then give two concrete examples, were the single most cited pattern in the tracked corpus.

02 · Comparison table

21%

Side-by-side comparison pages (X vs Y) were cited heavily in Perplexity and Gemini, less so in ChatGPT.

03 · Numbered list with stats

18%

"7 ways to X" style pages, where every list item carries a labeled number, were cited at 2.4× the average rate.

04 · Primary research with named methodology

17%

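Original surveys, datasets, and benchmarks with a published method section. Pages of this type carried the highest citation multiple in the dataset, even though they account for a smaller share of total citations than the patterns above.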

05 · FAQ-style explainer

16%

Q-and-A formatted pages with FAQPage schema. The schema gates how often the page appears in featured response blocks.

Section 05 · Sector breakdown

Which sectors carried the most citation weight.

Sector	Share of citations	Top engine
B2B SaaS	34%	Perplexity
DTC and e-commerce	19%	ChatGPT
Fintech	14%	Perplexity
AI infrastructure	11%	ChatGPT
Hospitality and travel	8%	Gemini
Healthcare and biotech	7%	ChatGPT
Other	7%	Mixed

Section 06 · Methodology

How the data was collected.

Window. January 1 to March 31, 2026. 90 days of continuous tracking. All numbers are based on observations inside this window.

Cohort. 50 paid Xpand Media client engagements with active GEO work in the period, paired with 200 non-client control sites in adjacent verticals selected for matched domain age, traffic band, and primary geography.

Engines tracked. ChatGPT (OpenAI), Perplexity (free and Pro), Gemini (Google), Claude (Anthropic), and Microsoft Copilot. Each engine was queried daily, on a fixed UTC schedule, with a fixed prompt set of 600 brand-and-category queries.
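
For a rough sense of the tracking workload, the window and prompt-set figures can be multiplied out. This is a back-of-envelope sketch, assuming every prompt is issued once per engine per day and that Perplexity free and Pro count as a single engine. The report states neither assumption outright, so treat the totals as illustrative.

```python
from datetime import date

ENGINES = ["ChatGPT", "Perplexity", "Gemini", "Claude", "Microsoft Copilot"]
PROMPTS_PER_DAY = 600  # size of the fixed prompt set, from the methodology

start, end = date(2026, 1, 1), date(2026, 3, 31)
window_days = (end - start).days + 1          # inclusive of both endpoints
per_engine = PROMPTS_PER_DAY * window_days    # assumes one run per prompt per day
total = per_engine * len(ENGINES)             # assumes identical coverage per engine

print(window_days, per_engine, total)
```

The date arithmetic confirms the 90-day window; under the stated assumptions that comes to 54,000 queries per engine.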

What counts as a citation. A citation is any response in which the engine surfaces the brand by name, by domain, or by a direct quote from a tracked URL. We do not count generic mentions without attribution.
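
The three-way definition above (name, domain, or direct quote) can be sketched as a simple text check. This is an illustration only: the `is_citation` function, its parameters, and the 40-character minimum quote length are assumptions, and the report does not describe how matching was actually implemented.

```python
import re

def is_citation(response_text, brand, domain, tracked_quotes, min_quote_len=40):
    """True if the response names the brand, its domain, or quotes a tracked URL."""
    text = response_text.lower()
    # Brand by name: whole-word match so "Acme" does not fire on "acmeology".
    if re.search(r"\b" + re.escape(brand.lower()) + r"\b", text):
        return True
    # Brand by domain: the bare domain anywhere in the text or in a link.
    if domain.lower() in text:
        return True
    # Direct quote: a sufficiently long verbatim passage from a tracked page,
    # compared after collapsing runs of whitespace.
    normalized = " ".join(text.split())
    for quote in tracked_quotes:
        q = " ".join(quote.lower().split())
        if len(q) >= min_quote_len and q in normalized:
            return True
    return False

by_name = is_citation("According to Acme Analytics, churn fell.",
                      "Acme Analytics", "acme.example", [])
generic = is_citation("Many analytics vendors report lower churn.",
                      "Acme Analytics", "acme.example", [])
```

Here `by_name` is true and `generic` is false, mirroring the rule that generic mentions without attribution do not count. A production matcher would also need to handle paraphrase, brand aliases, and lookalike domains, which this sketch ignores.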

Limits. This is a primary-source report on the Xpand Media tracked corpus, not an industry-wide study. The sample skews toward English-language sites and toward SaaS, fintech, DTC, and AI infrastructure. Numbers in healthcare, hospitality, and B2C beyond DTC carry larger error bars.

Reproducibility. The prompt set and the non-client control URL list are available on request to journalists and academic researchers. The contact email is listed on the contact page.

Section 07 · Frequently asked

Ten common questions about the report.

01 · What does GEO mean?

GEO is Generative Engine Optimization. It is the practice of structuring a website, entity record, and surrounding citation graph so that large language model search products (ChatGPT, Perplexity, Gemini, Claude, Copilot) include the brand in their generated answers.

02 · How is GEO different from SEO?

SEO optimizes for a ranked list of clickable URLs. GEO optimizes for inclusion inside a synthesized answer. SEO rewards keyword match and link authority. GEO rewards entity clarity, structured data, primary research, and citation paths the model can verify.

03 · Which engine should we prioritize first?

ChatGPT and Perplexity together accounted for 68 percent of tracked citations in Q1 2026. We start there for most clients, then layer in Gemini for clients in regulated or local-search-heavy verticals.

04 · Does a Wikidata page guarantee citation?

No. But the correlation is the strongest single factor in this dataset. 71 percent of cited entities had a Wikidata page, versus 11 percent in the non-cited control. Wikidata is a near-prerequisite, not a sufficient condition.
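
To put the 71 versus 11 percent gap in comparable terms, the two shares can be turned into a prevalence ratio and an odds ratio. This is a worked-arithmetic sketch using only the two figures quoted above. The choice of these two summary statistics is ours, and both describe association, not cause.

```python
cited_with_wikidata = 0.71      # share among cited entities (from the report)
control_with_wikidata = 0.11    # share in the non-cited control (from the report)

# How many times more common a Wikidata page is among cited entities.
ratio = cited_with_wikidata / control_with_wikidata

def odds(p):
    """Convert a proportion into odds (p against 1 - p)."""
    return p / (1 - p)

odds_ratio = odds(cited_with_wikidata) / odds(control_with_wikidata)

print(f"prevalence ratio: {ratio:.1f}x, odds ratio: {odds_ratio:.1f}")
# prints: prevalence ratio: 6.5x, odds ratio: 19.8
```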

05 · How fast can we expect to see citations after publishing?

Median time to first citation was 18 days. The 25th percentile was 8 days, the 75th was 41 days. Pages on domains with prior citation history (where the engine has previously cited content from that domain) reach first citation faster.
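
Operators comparing against their own data can compute the same three summary points from a list of publish-to-first-citation delays. The sketch below is one way to do it, assuming Python's `statistics.quantiles` with the `inclusive` method. The report does not state which quantile convention it used, and the sample values are hypothetical.

```python
import statistics

def first_citation_quartiles(days_to_first_citation):
    """Return (p25, median, p75) of days from publish to first citation."""
    q1, q2, q3 = statistics.quantiles(
        days_to_first_citation, n=4, method="inclusive"
    )
    return q1, q2, q3

# Hypothetical sample, one value per page that earned at least one citation.
sample_days = [4, 7, 10, 14, 20, 27, 35, 50, 66]
p25, median, p75 = first_citation_quartiles(sample_days)
```

Pages that were never cited inside the tracking window have no delay value, so a calculation like this only describes pages that did get cited.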

06 · What is llms.txt and does it help?

llms.txt is a proposed plain-text manifest at /llms.txt that summarizes a site for LLM crawlers. Adoption in the tracked corpus was 4 percent. The signal-to-noise on its effect is still too low to confirm impact, but it is cheap to add and we recommend it as standard practice.
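
For teams adding the file, a small generator is usually enough. The sketch below follows the general shape of the llms.txt proposal as we understand it (an H1 title, a blockquote summary, then H2 sections of Markdown links). The `render_llms_txt` helper, the site name, and the URLs are hypothetical.

```python
def render_llms_txt(site_name, summary, sections):
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"

llms_txt = render_llms_txt(
    "Example Co",
    "Example Co publishes benchmark research on widget performance.",
    {
        "Research": [
            ("2026 benchmark", "https://example.com/benchmark",
             "Methodology and results"),
        ],
    },
)
```

The resulting string would be served as plain text at `/llms.txt` on the site root.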

07 · What is the single highest-leverage move for a B2B SaaS that wants to be cited?

Publish a primary-research artifact (a benchmark, a survey result, or an internal dataset) with a transparent methodology section, named authors, FAQPage schema, and an inline Wikidata-linked org reference. Pages of this type carried the highest citation multiple in our dataset.
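
The structured-data side of that recommendation can be sketched as a JSON-LD graph that combines an Article with a named Person author, an Organization linked to Wikidata through `sameAs`, and a FAQPage. This is a minimal illustration using standard schema.org types. The `research_page_jsonld` helper and every value passed to it, including the Wikidata URL, are placeholders.

```python
import json

def research_page_jsonld(title, url, author_name, org_name, org_wikidata_url, faqs):
    """Build a JSON-LD @graph with Article, Person, Organization, and FAQPage nodes."""
    org = {"@type": "Organization", "name": org_name, "sameAs": [org_wikidata_url]}
    article = {
        "@type": "Article",
        "headline": title,
        "url": url,
        "author": {"@type": "Person", "name": author_name},
        "publisher": org,
    }
    faq_page = {
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    return {"@context": "https://schema.org", "@graph": [article, faq_page]}

doc = research_page_jsonld(
    title="2026 Widget Latency Benchmark",
    url="https://example.com/benchmark",
    author_name="Jane Doe",
    org_name="Example Co",
    org_wikidata_url="https://www.wikidata.org/wiki/Q0000000",  # placeholder ID
    faqs=[("How was latency measured?", "With a fixed load profile over 30 days.")],
)
payload = json.dumps(doc, indent=2)
```

The serialized `payload` would go inside a `<script type="application/ld+json">` element on the page, and is worth running through a schema validator before publishing.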

08 · How was this corpus built?

50 paid client engagements across SaaS, fintech, DTC, AI infrastructure, hospitality, and healthcare, paired with 200 non-client control sites in adjacent verticals. Tracking ran January through March 2026. See the methodology section above.

09 · Are these numbers industry-wide?

No. They are specific to the Xpand Media tracked corpus. We publish them as a primary source so that other operators can compare against their own data. Treat them as a starting reference, not a global benchmark.

10 · Can we cite this report?

Yes. We publish it under a free-to-cite license. A ready-to-paste citation block is at the bottom of the page in plain text and BibTeX form. Direct links back to xpandmedia.io/state-of-geo-2026 are appreciated but not required.

Section 08 · Cite this report

Two formats. Copy and paste.

Plain text

Xpand Media (2026). State of GEO 2026: Tracked corpus citation benchmarks. https://xpandmedia.io/state-of-geo-2026

BibTeX

@techreport{xpandmedia_geo_2026,
  author       = {{Xpand Media}},
  title        = {State of GEO 2026: Tracked corpus citation benchmarks},
  institution  = {Xpand Media},
  year         = {2026},
  url          = {https://xpandmedia.io/state-of-geo-2026}
}

License: free to cite and quote in commercial and academic contexts. Direct links back to the report URL are appreciated but not required. The underlying anonymized dataset is available on request to journalists and academic researchers.

Want this kind of citation data for your brand?

We run the same tracking corpus on every active GEO client. A 20-minute call gets you a written read on where you sit across the five engines, what the gap is, and what the highest-leverage move looks like.

