The Robots.txt Line Hiding You From ChatGPT

A single line from 2023 is still quietly blocking AI search traffic on sites I audit every week.

Last Tuesday I was on a call with a dentist in Surrey. He'd paid an agency 14k for a rebuild in 2023, ranked well on Google, and couldn't figure out why ChatGPT kept recommending a competitor two postcodes over when someone asked for "Invisalign near me". He'd written 40+ blog posts. Schema was clean. Reviews were 4.9.

I opened his robots.txt in a new tab. Line 7 read Disallow: / under User-agent: GPTBot. Line 12 did the same for CCBot. The agency had installed a plugin called "Block AI Crawlers" in August 2023 and never touched it again. For roughly two and a half years, OpenAI's crawler had been told to leave. No wonder ChatGPT had never heard of him.

This is not a rare story. I've audited 180 local business sites since January and 41 of them still carry a 2023-era AI block in some form. That's about 23%, or nearly 1 in 4. If you're reading this, there's a real chance your site is one of them.

The 2023 panic that's still costing you traffic

Rewind to August 2023. OpenAI had just published the GPTBot user-agent string and a blog post explaining how to block it. The reaction from web publishers was immediate. The New York Times blocked it. Reuters blocked it. WordPress plugin developers shipped one-click toggles. Yoast added a setting. Rank Math added a setting. Every SEO newsletter told readers to "protect their content from AI theft".

The logic at the time made sense. GPT-4 had been trained on scraped web content without permission. Publishers were angry. Blocking the crawler felt like the right move.

But here's what changed. In July 2024, OpenAI announced its SearchGPT prototype. In October 2024, ChatGPT search went live for paying users. By February 2025, it was free to everyone, with no account needed, and pulling from live web results. ChatGPT now answers around 1 billion queries a week, and a growing share of those queries include local intent. "Best plumber in Leeds" is not a training-data question. It's a real-time retrieval question. And if GPTBot can't crawl you, you don't exist in the answer.

The 2023 block was built to stop training. It also stops retrieval. Most site owners never got the memo.

The six crawlers that matter in 2026

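There are six user-agents I check on every audit: GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot (Perplexity), Google-Extended (Google), CCBot (Common Crawl), and Bytespider (ByteDance). Each one powers a different AI surface your customers are using right now.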

Block any of these and you vanish from the corresponding surface. Block all six and you've opted out of the AI internet.

How to audit your own robots.txt in 60 seconds

Type your domain followed by /robots.txt into a browser. That's it. You'll see a plain text file. Here's what a typical 2023-era AI block looks like.

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: Google-Extended
Disallow: /

Sitemap: https://example.com/sitemap.xml

If you see Disallow: / under any AI user-agent, that crawler is locked out of your entire site. Not one page. The whole thing.
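If you have more than one site to check, the same audit can be scripted. The sketch below is a minimal illustration using Python's standard-library urllib.robotparser; the function name blocked_ai_agents and the AI_AGENTS watch list are assumptions made for this example. It only tests the homepage path, and the standard-library parser does not understand wildcard patterns inside paths, so treat the result as a first pass rather than a verdict.

```python
from urllib.robotparser import RobotFileParser

# Assumed watch list: the user-agents named in this post.
AI_AGENTS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot",
    "Google-Extended", "CCBot", "Bytespider",
]


def blocked_ai_agents(robots_txt, agents=AI_AGENTS):
    """Return the agents that are not allowed to fetch the homepage ("/")."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, "/")]


# The 2023-era example from above, pasted in as a string.
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: Google-Extended
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""

if __name__ == "__main__":
    print(blocked_ai_agents(SAMPLE))
```

For the sample file it should report GPTBot, ChatGPT-User, Google-Extended, and CCBot. To run it against a live site, paste the contents of yourdomain.com/robots.txt in place of SAMPLE.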

The tell-tale signs this came from a 2023 plugin: the block usually covers GPTBot, CCBot, and ChatGPT-User together, sometimes with Google-Extended bolted on after Google announced it in September 2023. If the block uses exactly those four user-agents and nothing else, you're looking at a template from that era.

Note

ChatGPT-User is not a crawler. It's the user-agent ChatGPT sends when it fetches a page on a user's behalf, for example when someone pastes a link into a conversation or the browse tool opens a result. Blocking it means ChatGPT cannot open your pages from within a chat. I've seen this cost local businesses a measurable 3-7% of direct traffic.

The 2026 robots.txt that actually works

Here's the file I hand to clients after an audit. It allows every major AI crawler, keeps your admin area private, and plays nicely with Google.

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-login.php
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Allow: /wp-admin/admin-ajax.php

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Allow: /

User-agent: Bytespider
Allow: /

Sitemap: https://example.com/sitemap.xml

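Two things to note. First, Allow: / adds no permissions of its own, because crawlers default to allowed when no rule matches. I include it anyway as documentation, so the next person who opens the file knows the intent was deliberate. Second, I've kept the standard wildcard block for WordPress admin and WooCommerce cart and account pages, but be aware that it only binds crawlers without a group of their own: under the Robots Exclusion Protocol (RFC 9309), a crawler that matches a named group follows that group and ignores User-agent: *. AI crawlers don't need those pages, and they'd only generate noise in your logs, so if you want the six named crawlers kept out as well, repeat the Disallow lines under each named group.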
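Group precedence is easy to verify rather than assume. A crawler with its own named group is matched against that group only, not the wildcard one. The sketch below checks this with Python's standard-library urllib.robotparser on a trimmed copy of the file above; the helper name can_crawl and the SomeOtherBot agent are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Trimmed copy of the recommended file: one wildcard group, one named group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /cart/

User-agent: GPTBot
Allow: /
"""


def can_crawl(agent, path, robots_txt=ROBOTS_TXT):
    """Report whether `agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # GPTBot has its own group, so the wildcard Disallow lines do not apply.
    print("GPTBot /cart/:", can_crawl("GPTBot", "/cart/"))
    # SomeOtherBot (hypothetical) has no group and falls back to "*".
    print("SomeOtherBot /cart/:", can_crawl("SomeOtherBot", "/cart/"))
```

The first check should come back True and the second False, which is the precedence the protocol describes.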

Why the plugins haven't updated themselves

You'd think a plugin that added an "AI block" toggle in 2023 would have revisited it by now. Most haven't. I checked three of the big ones in March 2026, and all three still ship the block as an opt-in toggle that, once switched on, admins rarely revisit. The Yoast team removed their AI block setting in late 2024, but if you had it toggled on when the setting disappeared, the robots.txt directive stayed in place. A ghost setting. Invisible in the dashboard, still live on the server.

The fastest way to clear it: delete the offending lines directly in your robots.txt file, or in whichever SEO plugin now owns the file. On WordPress, that's usually Yoast, Rank Math, or AIOSEO under a "Tools" menu. After you save, reload yourdomain.com/robots.txt in an incognito window to confirm the change went live.
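If you manage the file by hand or through a deploy script rather than a plugin, the deletion can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a drop-in tool: strip_ai_blocks is a hypothetical helper, it assumes groups are separated by blank lines (common in plugin-generated files, but not required by the protocol), and it only removes groups whose sole rule is Disallow: /. Review the output before uploading it.

```python
import re


def strip_ai_blocks(robots_txt, agents):
    """Drop groups that name only `agents` and contain only "Disallow: /"."""
    targets = {agent.lower() for agent in agents}
    text = robots_txt.replace("\r\n", "\n").strip()
    kept = []
    # Assumption: each group is separated from the next by a blank line.
    for block in re.split(r"\n\s*\n", text):
        fields = []
        for line in block.splitlines():
            line = line.split("#", 1)[0].strip()  # ignore comments
            if ":" in line:
                key, value = line.split(":", 1)
                fields.append((key.strip().lower(), value.strip()))
        names = [value.lower() for key, value in fields if key == "user-agent"]
        rules = [(key, value) for key, value in fields if key != "user-agent"]
        is_blanket_block = (
            bool(names)
            and all(name in targets for name in names)
            and rules == [("disallow", "/")]
        )
        if not is_blanket_block:
            kept.append(block)
    return "\n\n".join(kept) + "\n"


BEFORE = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""

if __name__ == "__main__":
    print(strip_ai_blocks(BEFORE, ["GPTBot", "CCBot"]))
```

The wildcard group and the Sitemap line are left untouched; only the two blanket blocks are removed.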

What happens after you unblock

GPTBot typically recrawls within 3-10 days. I've seen citations start appearing in ChatGPT search within 2-3 weeks for sites with good existing content. Perplexity moves faster, often within a week. Google-Extended controls use of your content in Google's Gemini models rather than in Search itself, so changes there surface on Google's own refresh cycle, which varies.

The dentist from the opening paragraph? We unblocked him on 2026-03-04. On 2026-03-21, I tested "Invisalign dentist Surrey" in ChatGPT and his practice was cited in the answer. First time in the site's history.

Check yours now

I built a free tool that reads your robots.txt, flags every AI crawler block, and shows you exactly which lines to remove. It takes 4 seconds and doesn't ask for an email. Run your site through it at /free-tools/robots-check/.

If it flags something and you're not sure what to do, hello@rankinglocal.ai is read by me directly.