You paste your domain into the box. You pick a category from the dropdown. You type in your city. You click the button.
Sixty seconds later you see a number. For most contractors and local shops I have shown this to, that number is somewhere between 10 and 35 the first time they run it. Mine was 23. The first reaction is always the same — a quiet pause, then a question about whether the tool is broken.
It is not broken. You just have not been getting cited by AI engines, and until this moment nobody told you that in plain English. This post walks through what the AI Visibility Checker actually does under the hood, what the score means, and what to do with the top three issues it hands you.
What the tool actually does
The checker at /free-tools/ai-visibility/ is not a keyword tool. It does not scrape Google. It queries seven AI engines directly and asks them the kind of question a customer would ask.
The seven engines are ChatGPT, Perplexity, Claude, Google AI (the AI Overviews layer), Grok, DeepSeek, and Kimi. Each one gets hit through its API or public endpoint with a set of category-specific prompts built from your inputs. If you tell the tool you are a plumber in Calgary, it runs prompts like "best emergency plumber in Calgary," "24 hour plumber Calgary recommendations," and "who should I call for a burst pipe in Calgary."
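Mechanically, the prompt set is just templates with your category and city slotted in. A minimal sketch, assuming hypothetical template strings (the actual prompt list is not published):

```python
# Sketch of category-specific prompt generation.
# These template strings are illustrative, not the tool's real prompts.
TEMPLATES = [
    "best emergency {category} in {city}",
    "24 hour {category} {city} recommendations",
    "who should I call for a {category} problem in {city}",
]

def build_prompts(category: str, city: str) -> list[str]:
    """Fill each template with the user's category and city."""
    return [t.format(category=category, city=city) for t in TEMPLATES]

prompts = build_prompts("plumber", "Calgary")
# prompts[0] == "best emergency plumber in Calgary"
```

The same three inputs you type into the form are the only variables; everything else is fixed per category.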
The tool then parses each response. It is looking for one thing: does the AI name your domain or your business name as a recommended provider? That is what I call a citation. A mention in a list of seven options counts. A passing reference in a paragraph about the city counts. Being listed as the top recommendation counts more.
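At its simplest, that check is a case-insensitive match of the answer text against your name and domain. A sketch of the idea; the real parser handles more edge cases (punctuation, partial names, weighting the top spot):

```python
def is_cited(response_text: str, business_name: str, domain: str) -> bool:
    """Return True if the engine's answer names the business or its domain.
    Plain case-insensitive substring match, for illustration only."""
    haystack = response_text.lower()
    return business_name.lower() in haystack or domain.lower() in haystack

answer = "For a burst pipe, locals recommend Acme Plumbing or FastFix Calgary."
is_cited(answer, "Acme Plumbing", "acmeplumbing.ca")   # True
is_cited(answer, "Yellow Pencil", "yellowpencil.ca")   # False
```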
No signup. No credit card. No email capture before the result. You get the number, the breakdown, and the top three issues on the same page you started on.
What a citation actually means
This is the part most SEO people get wrong when they first see the tool.
A citation is not a backlink. It is not a Google ranking. It is the AI engine, in its generated answer, saying your name out loud when a prospect asks a question. That is the new shelf space. If ChatGPT answers "who is the best roofer in Edmonton" and it lists five companies and you are one of them, you got cited. If it lists five and you are not there, you did not.
The checker runs multiple prompts per engine because one answer is noise. Three or four answers across a category start to be signal. Across seven engines and several prompts each, you end up with a citation rate per engine — something like 2 out of 5 on Perplexity, 0 out of 5 on Claude, 4 out of 5 on ChatGPT.
How the 0 to 100 score is built
The GEO Score is a weighted average of your citation rate across all seven engines. Weighted, not flat, because some engines send more commercial traffic than others in a local context. ChatGPT and Perplexity carry more weight than Kimi at the moment. That weighting updates as traffic patterns shift.
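The arithmetic is a weighted mean of per-engine citation rates. A sketch with made-up weights; the real weights are not published and shift over time:

```python
# Illustrative engine weights only. The real values are not published
# and are rebalanced as traffic patterns shift.
ENGINE_WEIGHTS = {
    "chatgpt": 0.25, "perplexity": 0.20, "google_ai": 0.15,
    "claude": 0.12, "grok": 0.12, "deepseek": 0.10, "kimi": 0.06,
}

def geo_score(citations: dict[str, tuple[int, int]]) -> int:
    """citations maps engine -> (prompts cited, prompts run).
    Returns a 0-100 weighted average of the citation rates."""
    total = sum(
        ENGINE_WEIGHTS[engine] * (hit / run)
        for engine, (hit, run) in citations.items()
    )
    return round(100 * total / sum(ENGINE_WEIGHTS.values()))

score = geo_score({
    "chatgpt": (3, 5), "perplexity": (1, 5), "claude": (0, 5),
    "google_ai": (2, 5), "grok": (2, 5), "deepseek": (1, 5), "kimi": (0, 5),
})
# 32 with these illustrative weights; the weights are what move the number.
```

Full marks everywhere yields 100, zero citations yields 0, and everything in between is a tug-of-war between the heavily weighted engines and the rest.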
A score of 0 means no engine named you in any prompt. A score of 100 means every engine named you in every prompt — I have never seen that, and I have run this on a lot of domains. A realistic good score for a local business is 60 to 75. My own shop, Yellow Pencil, sat at 23 when I first built the tool and ran it on myself.
Sample output for a fictional Calgary plumber, GEO Score 41 / 100:

- ChatGPT: cited in 3 of 5 prompts
- Perplexity: 1 of 5
- Claude: 0 of 5
- Google AI: 2 of 5
- Grok: 2 of 5
- DeepSeek: 1 of 5
- Kimi: 0 of 5

Top 3 issues: no structured service-area schema on the homepage, business name inconsistent across 4 directory citations, no category-specific landing pages for emergency-service queries.
That is what you see on screen. No fluff, no 40-tab dashboard, no upsell before the number.
The top 3 issues list is ranked by fixability
The tool does not just tell you what is broken. It ranks the three most-fixable gaps first. Fixable means two things at once: the issue is actually contributing to your low score, and you can address it in a reasonable amount of time without rebuilding your site.
So you will not see abstract advice like "improve your E-E-A-T." You will see things like "add FAQ schema to your service pages," "fix the name mismatch on your Google Business Profile," or "publish a city-specific service page for your top two revenue categories." Concrete, countable, doable.
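To make the first of those concrete: FAQ schema is a small JSON-LD block that goes in your page's head. A minimal sketch built with Python's json module, using schema.org's FAQPage type; the question and answer here are placeholders, not real site content:

```python
import json

# Minimal FAQPage JSON-LD per schema.org. The Q&A pair is a placeholder.
faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [{
        "@type": "Question",
        "name": "Do you offer 24 hour emergency plumbing in Calgary?",
        "acceptedAnswer": {
            "@type": "Answer",
            "text": "Yes, we answer emergency calls around the clock.",
        },
    }],
}

# Paste the printed block into a <script type="application/ld+json"> tag.
print(json.dumps(faq_schema, indent=2))
```

One question-answer pair per real FAQ on the page; do not mark up questions the page does not actually answer.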
The reason I cap it at three is the same reason I stopped handing clients 40-page audits years ago. Nobody fixes 40 things. People fix two or three, if you are lucky. Give them the two or three that move the score.
The 23 to 67 delta on Yellow Pencil
I ran the checker on my own domain the day I finished building it. Score: 23. The top three issues were missing organization schema, no author bios on blog posts, and business name inconsistency between my footer and my Google Business Profile.
I fixed all three in a weekend. Schema took an hour. Author bios took two hours. The name thing took fifteen minutes and a coffee.
I then did the slower work over the next ten weeks: one new service page per week tied to a real category, a monthly cadence of case-study posts with proper schema, and cleaning up the six directories where my NAP (name, address, phone) was off by a word or a suite number.
Ninety days later I re-ran the checker. Score: 67. ChatGPT went from 1 of 5 to 4 of 5. Perplexity went from 0 of 5 to 3 of 5. Claude is still stubborn at 2 of 5, which is a whole other post.
How often you should re-run it
Monthly is the baseline. The AI engines update their training and retrieval pipelines constantly, so a score from three months ago is stale. Monthly gives you a clean trend line.
Weekly if you are actively changing things. Publishing new pages, fixing schema, launching a new service line — weekly runs tell you which change actually moved the number. This is the closest thing to a feedback loop you will get in AI search right now.
I would not run it daily. The variance inside a week is mostly noise. Give your changes seven days to propagate before you re-check.
Go run it
Head to /free-tools/ai-visibility/, punch in your domain, your category, and your city. Sixty seconds. No email. You will get a number you can argue with, three issues you can actually fix, and a baseline you can measure against next month.
If the number stings, good. That is useful information. If you want to talk through what you saw, hello@rankinglocal.ai is read by me directly.