What is robots.txt Builder?
A robots.txt file gives crawlers site-level access rules. It can allow or disallow paths, point to the sitemap, and document crawler policies. Static sites often need a small, predictable robots.txt file because build outputs and GitHub Pages deployments only publish what exists in the final folder.
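As a rough illustration of the allow, disallow, and sitemap rules described above, the sketch below parses a small robots.txt file with Python's standard-library urllib.robotparser. The file content, the example.com URLs, and the /drafts/ path are assumptions made up for this example; they are not output from the tool.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site (illustrative only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

# Parse the rules from memory instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether a generic crawler ("*") may fetch each URL.
home_allowed = parser.can_fetch("*", "https://example.com/")
draft_allowed = parser.can_fetch("*", "https://example.com/drafts/post")

# Sitemap lines are collected separately from the access rules.
sitemaps = parser.site_maps()

print(home_allowed, draft_allowed, sitemaps)
```

Python's parser applies the first matching rule in file order, which is why the more specific Disallow line is listed before the broad Allow line here. Other crawlers may resolve overlapping rules differently, so keeping rules non-overlapping is the safer habit.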
Quick answer
Use robots.txt to tell crawlers which parts of your site they should not access. It controls crawl traffic, not indexing; directives like noindex belong in robots meta tags or the X-Robots-Tag HTTP header.
Last updated: 2026-05-25
Limitations
- robots.txt directives are advisory. Bad actors and some AI crawlers may ignore them entirely.
- Blocking a page in robots.txt does not prevent indexing if other pages link to it. To keep a page out of search results, use a noindex meta tag or HTTP header, and leave the page crawlable so crawlers can actually read that directive.
- Each subdomain needs its own robots.txt file. The file at example.com/robots.txt does not apply to subdomain.example.com.
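The per-host scope in the last point can be sketched as a small helper that derives which robots.txt file governs a given page. The function name robots_txt_url is hypothetical and the URLs are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the scheme and host that serve page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/post"))
print(robots_txt_url("https://subdomain.example.com/blog/post"))
```

The two calls resolve to different files, one per host, which is why rules published at example.com/robots.txt have no effect on subdomain.example.com.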
Sources: MDN Web Docs · W3C Specifications · jquery.app on GitHub
How to use this tool
- Enter the public site URL and sitemap URL.
- Choose whether normal crawlers should be allowed across the site.
- Add disallowed paths only when there is a real reason to block crawling.
- Copy the result into robots.txt at the published site root.
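The steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the tool's actual implementation: the function name build_robots_txt, its parameters, and the example paths are all hypothetical.

```python
from urllib.parse import urljoin


def build_robots_txt(site_url, sitemap_url=None, allow_all=True, disallowed_paths=()):
    """Build robots.txt text for one site (hypothetical helper, not the tool's code)."""
    lines = ["User-agent: *"]

    if not allow_all:
        # Ask compliant crawlers to stay away from the whole site.
        lines.append("Disallow: /")
    elif disallowed_paths:
        for path in disallowed_paths:
            # Rule paths are matched from the site root, so ensure a leading slash.
            lines.append("Disallow: " + (path if path.startswith("/") else "/" + path))
    else:
        # Nothing is blocked; state that explicitly.
        lines.append("Allow: /")

    if sitemap_url:
        # Sitemap references should be absolute; resolve relative input against the site URL.
        lines.append("")
        lines.append("Sitemap: " + urljoin(site_url, sitemap_url))

    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(
    "https://example.com/",
    sitemap_url="https://example.com/sitemap.xml",
    disallowed_paths=["/private/", "tmp/"],
)
print(robots_txt)
```

The returned text would then be saved as robots.txt in the folder that gets published as the site root, for example the build output directory of a static site.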
What you can use it for
- Create a clean robots.txt file for GitHub Pages.
- Add a sitemap reference without hand-writing the file.
- Document public crawler access before launch.