Crawler Documentation

GSCWizard-Bot | the GSC Wizard On-Page SEO Checker

GSCWizard-Bot/1.0 (On-Page SEO Checker) is the user agent used by GSC Wizard to power the On-Page SEO report. It fetches a single HTML page at a time to check whether a page's top-ranking queries in Google Search Console appear in the page's <title>, <meta name="description">, and <h1> tags.

User agent string: GSCWizard-Bot/1.0 (On-Page SEO Checker)
Purpose: On-page SEO checks
Trigger: User-initiated only
Respects robots.txt: Yes
Cache: 7 days per URL

What is GSCWizard-Bot?

GSCWizard-Bot is the crawler that powers the On-Page SEO report inside GSC Wizard. It runs only when a signed-in user opens the report for a Google Search Console property they own - it does not spider the open web, follow links, or build any kind of index.

If you see this user agent in your access logs, it means a verified owner of your site (or someone with access to your Search Console property) opened the On-Page SEO Checker in GSC Wizard and asked it to analyse your pages.

What the crawler checks

For every page it fetches, GSCWizard-Bot extracts three HTML elements and compares them against the highest-impression query that page already ranks for in Google Search Console over the last 28 days.

1 Page title - <title>

Reads the contents of the <title> tag and checks whether every significant word (3+ characters) from the top GSC query appears in it. Surfaces the current title verbatim alongside a pass/fail flag.
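The "every significant word" rule can be pictured as a simple case-insensitive containment check. This is an illustrative sketch only; the function name and tokenisation are assumptions, not GSC Wizard's actual code:

```python
import re


def title_covers_query(title: str, query: str) -> bool:
    """True if every word of 3+ characters from the query
    appears (case-insensitively) somewhere in the title."""
    haystack = title.lower()
    words = [w for w in re.findall(r"[a-z0-9]+", query.lower()) if len(w) >= 3]
    return all(w in haystack for w in words)
```

Under this reading, short stop-words like "of" or "to" in the query are ignored, so a title can pass without repeating them verbatim.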

2 Meta description - <meta name="description">

Reads the content attribute of the meta description and checks for the same keyword coverage. Also records the current character length.

3 H1 headings - <h1>

Extracts every <h1> on the page and checks whether any of them contain the target keywords.

That's it. GSCWizard-Bot does not execute JavaScript, does not render pages, does not download images or stylesheets, does not follow links inside your pages, and does not collect any other data. It reads the raw HTML response only.
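Because only raw HTML is read, the extraction step can be sketched with nothing but Python's standard-library parser. The class below is hypothetical (the crawler's real parser is unspecified) and shows the three elements being collected from an HTML string:

```python
from html.parser import HTMLParser


class SEOTagExtractor(HTMLParser):
    """Collect <title>, <meta name="description">, and every <h1>
    from raw HTML. Illustrative, stdlib-only sketch."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.h1s = []
        self._current = None  # tag whose text we are accumulating

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._current = tag
            if tag == "h1":
                self.h1s.append("")
        elif tag == "meta":
            a = dict(attrs)
            if (a.get("name") or "").lower() == "description":
                self.description = a.get("content", "")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1s[-1] += data
```

Feeding a page's response body to an instance yields the title text, the meta description (or None), and the list of h1 strings the checks run against.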

Crawl behaviour

One request per page, 10-second timeout

Each URL is fetched exactly once per run with a hard 10-second timeout. Redirects are followed (up to the platform default).

Results cached for 7 days

Once we have successfully crawled a URL, we do not re-fetch it for 7 days - the same user reopening the report will see the cached HTML snapshot.
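The 7-day cache behaves like a simple TTL lookup keyed by URL. An in-memory sketch (the real cache is presumably persistent and shared across the user's sessions):

```python
import time

CACHE_TTL = 7 * 24 * 60 * 60  # 7 days, in seconds

_cache: dict[str, tuple[float, str]] = {}  # url -> (fetched_at, html)


def fetch_with_cache(url: str, fetch) -> str:
    """Return the cached snapshot if it is younger than 7 days;
    otherwise call fetch(url) and store the fresh result."""
    entry = _cache.get(url)
    if entry and time.time() - entry[0] < CACHE_TTL:
        return entry[1]
    html = fetch(url)
    _cache[url] = (time.time(), html)
    return html
```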

Capped at 200 pages per run

A single user-triggered analysis crawls at most 200 pages - always a subset of the property's own pages, selected from the URLs with the most Search Console impressions.
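Selecting the crawl set reduces to a sort-and-slice over Search Console rows. A sketch, assuming rows arrive as (url, impressions) pairs (the shape and function name are assumptions):

```python
def select_crawl_set(rows, cap: int = 200) -> list[str]:
    """rows: iterable of (url, impressions) pairs for one property.
    Return up to `cap` URLs, highest impressions first."""
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    return [url for url, _ in ranked[:cap]]
```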

HTML only, no JS rendering

Sends Accept: text/html and parses the response body directly. No headless browser, no script execution, no asset fetching.

SSRF-protected

URLs that resolve to private, internal, or loopback IP ranges are blocked before any network request is made.
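This kind of pre-request guard is straightforward with the standard library. A sketch of the idea (not the crawler's actual implementation) that resolves the hostname and rejects non-global addresses:

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_safe_url(url: str) -> bool:
    """Reject URLs whose host resolves to a private, loopback,
    link-local, or reserved IP - resolve *before* fetching so
    DNS-based SSRF tricks are caught too."""
    host = urlparse(url).hostname
    if not host:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True
```

Checking every resolved address (not just the first) matters: a hostname can return a mix of public and internal records.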

How to allow GSCWizard-Bot

Most sites need no changes. If you run a WAF, bot-management layer, or restrictive firewall and the On-Page SEO report comes back empty (or with HTTP 403 / 429 errors), whitelist the user agent.

robots.txt (explicit allow)

User-agent: GSCWizard-Bot
Allow: /

Cloudflare WAF / Bot Fight Mode

Add a WAF custom rule under Security → WAF → Custom rules:

(http.user_agent contains "GSCWizard-Bot")

Action: Skip → All remaining custom rules, Bot Fight Mode, Super Bot Fight Mode

Nginx

if ($http_user_agent ~* "GSCWizard-Bot") {
    set $allow_bot 1;
}
# use $allow_bot to bypass rate limits / challenges

Apache

SetEnvIfNoCase User-Agent "GSCWizard-Bot" allow_bot
# Use RequireAny (not RequireAll) so the env match grants the bot
# access alongside your existing rules, rather than locking out
# every other visitor:
<RequireAny>
    Require env allow_bot
    # ...your existing Require rules...
</RequireAny>

How to block GSCWizard-Bot

We respect robots.txt. Add the following rule to stop GSCWizard-Bot from fetching any page on your site:

User-agent: GSCWizard-Bot
Disallow: /

If you prefer a hard block at the edge, any rule that rejects requests where the User-Agent header matches GSCWizard-Bot will work - invert the allow-rules above.

Verifying the user agent

The full, exact user-agent string sent by our crawler is:

GSCWizard-Bot/1.0 (On-Page SEO Checker)

Requests always carry Accept: text/html and originate from our hosted workers. If a request's user-agent mentions "GSCWizard" but does not match the exact string above, it is not us - treat it accordingly.
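When auditing access logs, a quick triage of logged user-agent strings can be sketched as follows (helper name and labels are illustrative; note that an exact string match alone does not prove origin, since any client can copy the string):

```python
EXACT_UA = "GSCWizard-Bot/1.0 (On-Page SEO Checker)"


def classify_user_agent(ua: str) -> str:
    """Triage a logged user-agent string: 'genuine-format' if it
    matches our exact UA, 'likely-spoofed' if it merely mentions
    GSCWizard, 'other' for everything else."""
    if ua == EXACT_UA:
        return "genuine-format"
    if "GSCWizard" in ua:
        return "likely-spoofed"
    return "other"
```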

Questions or abuse reports

If you believe GSCWizard-Bot is misbehaving, crawling more than described above, or you'd like us to stop crawling your site entirely, please get in touch: