Most AI visibility problems are subtle. This one isn't. If the wrong line is sitting in your robots.txt, a compliant AI crawler will not read your pages, and no amount of schema, reviews, or great content will get you cited. The free Robots Check finds that line in about 30 seconds. Here's what it looks at and, more importantly, how to read the result correctly, because the popular advice on this is mostly wrong.
What the tool does
Paste your domain into the Robots Check and it fetches your robots.txt directly from your server — no login, no account, no email. It parses the file and reports a simple allowed/blocked status for each AI crawler that matters, and it shows you the exact rule causing any block so you can fix it.
If you don't have a robots.txt at all, the tool tells you that too. Crawlers treat a missing file as "allow all", which is fine for AI visibility. The dangerous case is a file that blocks the wrong things.
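To make the allowed/blocked idea concrete, here is a minimal sketch of the same kind of check using Python's standard-library `urllib.robotparser`. It is not the Robots Check's actual implementation. The `check_crawlers` helper and the `AI_CRAWLERS` list are names made up for this example, and the standard-library parser applies rules in file order rather than by longest match, so treat its answers as an approximation of what each crawler really does.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens named in this article (an illustrative list, not exhaustive).
AI_CRAWLERS = [
    "OAI-SearchBot", "ChatGPT-User", "GPTBot",
    "Claude-SearchBot", "Claude-User", "ClaudeBot",
    "PerplexityBot", "Perplexity-User",
    "Googlebot", "Google-Extended", "Bingbot",
]


def check_crawlers(robots_txt, agents=AI_CRAWLERS, path="/"):
    """Return {agent: True if the agent may fetch `path`, else False}."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


if __name__ == "__main__":
    sample = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Disallow: /
"""
    for agent, allowed in check_crawlers(sample).items():
        print(f"{agent:18} {'allowed' if allowed else 'BLOCKED'}")
```

To run it against a live site instead of a pasted string, `RobotFileParser` also offers `set_url()` and `read()`, which fetch the file for you.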
The distinction almost every guide gets wrong
Here's the part worth slowing down for, because getting it wrong wastes effort. Most AI providers run two different kinds of crawler, and only one of them controls whether you can be cited.
- One crawler collects data to train the model. Blocking it has nothing to do with whether you show up in answers.
- A separate crawler powers the live search and citations. This is the one you must allow to be recommended.
Concretely:
- OpenAI: GPTBot is training-only. OAI-SearchBot is the one that controls ChatGPT search citations. Blocking GPTBot does not remove you from ChatGPT search; blocking OAI-SearchBot does. There's also ChatGPT-User for live, user-triggered fetches. (OpenAI's bots documentation spells this out.)
- Anthropic: ClaudeBot is training. Claude-SearchBot controls Claude's search citations. Claude-User handles user-triggered fetches.
- Google: Google-Extended only governs whether your content trains Gemini/Vertex; it does not control AI Overviews. AI Overviews and AI Mode ride on ordinary Googlebot, so blocking Googlebot is what removes you there.
- Microsoft: Bingbot is the single crawler that feeds both Bing and, because they share the index, Microsoft Copilot and ChatGPT's web search. One bot, both surfaces.
- Perplexity: PerplexityBot indexes, Perplexity-User fetches on demand.
So if someone "protected their content from AI" a couple of years ago by blocking GPTBot, they may have given up nothing on the citation side — or, if they blocked OAI-SearchBot or Googlebot, they may have made themselves invisible without realising it. The tool checks the citation-grade crawlers specifically, which is the set that actually changes whether you get recommended.
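One way to keep the training/citation split straight is to write it down as data. The sketch below encodes the roles from the list above and sorts a set of blocked crawlers by what each block actually costs. The role labels ("training", "citation", "user-fetch") and the `citation_impact` helper are this example's own invention, not anything the providers publish.

```python
# Roles as described in this article; the labels are this sketch's own.
CRAWLER_ROLES = {
    "GPTBot": ("OpenAI", "training"),
    "OAI-SearchBot": ("OpenAI", "citation"),
    "ChatGPT-User": ("OpenAI", "user-fetch"),
    "ClaudeBot": ("Anthropic", "training"),
    "Claude-SearchBot": ("Anthropic", "citation"),
    "Claude-User": ("Anthropic", "user-fetch"),
    "Google-Extended": ("Google", "training"),
    "Googlebot": ("Google", "citation"),
    "Bingbot": ("Microsoft", "citation"),
    "PerplexityBot": ("Perplexity", "citation"),
    "Perplexity-User": ("Perplexity", "user-fetch"),
}


def citation_impact(blocked):
    """Group blocked crawler names by role so citation-grade blocks stand out."""
    report = {"citation": [], "user-fetch": [], "training": [], "unknown": []}
    for name in sorted(blocked):
        _, role = CRAWLER_ROLES.get(name, (None, "unknown"))
        report[role].append(name)
    return report
```

Calling `citation_impact({"GPTBot"})` leaves the "citation" list empty, which is the "gave up nothing on the citation side" case. Add OAI-SearchBot or Googlebot to the set and they land in "citation", which is the case worth fixing.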
The "Disallow: /" disaster
The single most damaging pattern is also the simplest. Somewhere, sometime, a developer wrote this:
    User-agent: *
    Disallow: /
That tells every crawler on earth to stay out — Google included. If you ever find this on a live site and wonder why organic traffic is zero, now you know. The AI-era cousin is blocking specific bots "just in case," usually added in 2023 when "block the AI scrapers" was trending advice:
    User-agent: GPTBot
    Disallow: /

    User-agent: OAI-SearchBot
    Disallow: /
That second block is the one that quietly costs you ChatGPT citations. The first only costs you training inclusion.
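If you'd rather hunt for this pattern yourself, a few lines of Python will do it. The sketch below scans a robots.txt and reports every user-agent that sits under a bare `Disallow: /`, along with the line number of the rule. `find_blanket_blocks` is a hypothetical helper, and it is deliberately simple: it ignores wildcard paths such as `/*` and does not check whether an `Allow` rule in the same group softens the block.

```python
def find_blanket_blocks(robots_txt):
    """Return (line_number, user_agent) for each agent under a bare 'Disallow: /'."""
    hits = []
    agents = []        # user-agents of the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for lineno, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # a user-agent after rules starts a new group
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/":
                hits.extend((lineno, agent) for agent in agents)
    return hits
```

Run on the first snippet above it returns `[(2, "*")]`; on a file that only disallows `/wp-admin/` it returns an empty list.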
A robots.txt that allows the right crawlers
Here's a clean starting point that blocks genuine junk while allowing the citation-grade AI crawlers through:
    User-agent: *
    Disallow: /wp-admin/
    Disallow: /cgi-bin/
    Allow: /wp-admin/admin-ajax.php

    User-agent: OAI-SearchBot
    Allow: /

    User-agent: ChatGPT-User
    Allow: /

    User-agent: Claude-SearchBot
    Allow: /

    User-agent: PerplexityBot
    Allow: /

    Sitemap: https://yoursite.com/sitemap.xml
A note on precedence, since people get tangled here: a crawler obeys the single most specific user-agent group that matches it, not the union of * plus its own. So if you want OAI-SearchBot allowed, give it its own group; it won't inherit rules from the * block. Replace yoursite.com with your domain and drop the file at /robots.txt.
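The precedence rule is easier to trust once you see it run. Here is a rough sketch of group selection: parse the file into groups, then hand a crawler its own group if one exists and the `*` group otherwise. `parse_groups` and `rules_for` are hypothetical names, and matching on the exact lowercased token is a simplification of how real crawlers match their user-agent.

```python
def parse_groups(robots_txt):
    """Map each lowercased user-agent token to its list of (field, path) rules."""
    groups = {}
    current = []       # tokens of the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # a user-agent after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for token in current:
                groups[token].append((field, value))
    return groups


def rules_for(crawler, groups):
    """Pick the crawler's own group if present, otherwise fall back to '*'."""
    return groups.get(crawler.lower(), groups.get("*", []))
```

Fed the starter file above, `rules_for("OAI-SearchBot", groups)` comes back as just `[("allow", "/")]`, with none of the `/wp-admin/` rules, while `rules_for("Googlebot", groups)` falls through to the three `*` rules.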
The WordPress trap
If you're on WordPress, your robots.txt is often generated dynamically by your SEO plugin (Yoast, Rank Math, All in One SEO), so a file you edited by hand can get overwritten. And security plugins — Wordfence, Solid Security, Sucuri — apply bad-bot rules that can rate-limit or outright 403 AI crawlers because those bots were filed under "scrapers" in early 2024. Some managed hosts inject AI-bot blocks at the platform level that don't even appear in your site's robots.txt. If the Robots Check says you're blocked but your file looks clean, the block is probably coming from one of these layers.
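A quick way to spot one of these hidden layers is to request the same page twice, once with a plain browser-style user-agent and once with a crawler-style one, and compare the status codes. The sketch below does that with the standard library. The user-agent strings are simplified stand-ins invented for this example, not the strings the real crawlers send, and some firewalls also verify crawlers by IP address, so read the result as a hint rather than proof.

```python
import urllib.error
import urllib.request

# Simplified, hypothetical user-agent strings for illustration only.
BROWSER_UA = "Mozilla/5.0"


def crawler_ua(token):
    """Build a crawler-like user-agent containing the given token."""
    return f"Mozilla/5.0 (compatible; {token}/1.0)"


def status_for(url, user_agent, opener=urllib.request.urlopen):
    """Request `url` with the given User-Agent and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code   # e.g. 403 or 429 when a firewall or plugin steps in


def compare(url, token, opener=urllib.request.urlopen):
    """Return (browser_status, crawler_status) for the same URL."""
    return (
        status_for(url, BROWSER_UA, opener),
        status_for(url, crawler_ua(token), opener),
    )
```

If `compare("https://yoursite.com/", "OAI-SearchBot")` returns something like `(200, 403)`, the block is happening above robots.txt, most likely in a security plugin or at the host.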
Fix yours in ten minutes
- Run your domain through the Robots Check and note which crawlers are blocked.
- Open your current robots.txt at yoursite.com/robots.txt.
- Remove any Disallow: / lines under the citation crawlers (OAI-SearchBot, Claude-SearchBot, PerplexityBot) and confirm Googlebot and Bingbot aren't blocked.
- If you're on WordPress, make the change in your SEO plugin's robots.txt editor, and check whether a security plugin is throttling bots.
- Re-run the check. Give crawlers a couple of days to re-fetch, then test your brand name in ChatGPT and Perplexity.
It's the cheapest AI visibility win there is. Most sites need zero changes; some need one line removed; a few have been invisible for months and never knew. Run it, and if the result looks strange, hello@rankinglocal.ai reaches me directly.