What Is the AI Crawler robots.txt Builder?
A robots.txt file tells crawlers which parts of your site they may access. With the rise of AI training crawlers from OpenAI, Anthropic, Google, Perplexity, and Common Crawl, publishers now use robots.txt to signal whether their content may be used for AI training. This tool builds AI-specific robots.txt blocks with clear categories for training bots versus search/referral bots.
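Quick Answer
Build robots.txt rules to control which AI crawlers access your site. Use a selective policy to block training bots (GPTBot, ClaudeBot, Google-Extended) while allowing search engines (Googlebot, Bingbot). Place crawler-specific blocks above the general wildcard (User-agent: *) rules. Standards-compliant parsers match the most specific user-agent group regardless of order, but this placement keeps the file readable and is safer with simpler parsers.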
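As a rough illustration of the selective policy described above, the following Python sketch assembles the text by hand. It is not the tool's actual implementation: the function name build_selective_policy and the TRAINING_BOTS list are hypothetical, and the bot names are only the three examples mentioned above.

```python
# Hypothetical list for illustration; the tool's real categories may differ.
TRAINING_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended"]


def build_selective_policy(training_bots):
    """Return robots.txt text that blocks each listed bot site-wide."""
    lines = ["# AI training crawlers: blocked site-wide"]
    for bot in training_bots:
        # One group per crawler, separated by a blank line.
        lines += [f"User-agent: {bot}", "Disallow: /", ""]
    # The wildcard group goes last, below the crawler-specific groups.
    lines += [
        "# All other crawlers, including search engines: allowed",
        "User-agent: *",
        "Allow: /",
    ]
    return "\n".join(lines) + "\n"


robots_txt = build_selective_policy(TRAINING_BOTS)
print(robots_txt)
```

The comment lines in the output double as documentation for anyone who later edits the file.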
Limitations
- robots.txt is a voluntary standard with no enforcement mechanism, and not all crawlers respect it. Some AI data collection also happens through means other than web crawling.
- Some CDNs and WAFs (including Cloudflare Bot Management) can override robots.txt with their own bot-blocking rules. Check your CDN configuration after deploying robots.txt changes.
- New AI crawlers appear regularly. This tool includes crawlers known as of early 2026. Check for new crawler names periodically and update your robots.txt accordingly.
How to Use
- Choose a policy preset: Open (block nothing), Selective (block training bots, allow search engines), or Strict (block all AI crawlers).
- Customize individual crawler blocks by checking or unchecking specific bots.
- Copy the generated robots.txt blocks and add them to your site's robots.txt file, above any general wildcard rules.
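After pasting the generated blocks, it can help to confirm that the rules behave as intended. The sketch below uses Python's standard urllib.robotparser module for this. The SAMPLE text, the is_allowed helper, and the example.com URL are assumptions made for illustration. This check covers only how a compliant parser reads the file, not whether a given crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: one blocked training bot plus a wildcard group.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt, user_agent, url="https://example.com/article"):
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print("GPTBot allowed:", is_allowed(SAMPLE, "GPTBot"))
print("Googlebot allowed:", is_allowed(SAMPLE, "Googlebot"))
```

With these sample rules, GPTBot should be reported as blocked and Googlebot as allowed, since Googlebot falls through to the wildcard group.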
Common Use Cases
- Block AI training crawlers from OpenAI, Anthropic, and others while keeping Google and Bing search indexing active.
- Create a strict policy that blocks all known AI crawlers from accessing any content.
- Add explanatory comments to robots.txt so other developers understand the policy decisions.