CheckSEO
Free Robots.txt Guide & Generator

Robots.txt Examples

Learn how robots.txt controls search engine crawling. Study basic and advanced examples, then generate or fetch a robots.txt for any website.

What Is a Robots.txt File?

A robots.txt file tells search engine crawlers (such as Googlebot and Bingbot) which pages or sections of your website they may crawl and which they may not. It must sit at the root of the host it applies to: yourwebsite.com/robots.txt.
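
Because the file always sits at the root of the host, its location can be derived from any page URL. The following is a minimal Python sketch using only the standard library; the function name `robots_url` is an illustrative choice and not part of any tool described on this page.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any port); drop path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("https://example.com/blog/post?id=7"))
# https://example.com/robots.txt
```

Note that the scheme and host are preserved, so a subdomain such as shop.example.com resolves to its own robots.txt rather than the one on example.com.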

Control Crawling

Block crawlers from accessing private or irrelevant sections

Save Crawl Budget

Focus crawler resources on your most important pages

Recognized Standard

Follows the Robots Exclusion Protocol (RFC 9309), which all major search engines recognize

Sitemap Declaration

Point crawlers to your sitemap for better URL discovery

Basic Robots.txt Examples

A basic robots.txt uses User-agent to target crawlers and Disallow to block paths. An empty Disallow: means everything is allowed.

Allow All Crawlers

robots.txt - Allow Everything
# Generated by CheckSEO (https://checkseo.in/robots-txt-generator.html)

User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml

Block All Crawlers

robots.txt - Block Everything
# Generated by CheckSEO (https://checkseo.in/robots-txt-generator.html)

User-agent: *
Disallow: /

Block Specific Directories

robots.txt - Block Selected Paths
# Generated by CheckSEO (https://checkseo.in/robots-txt-generator.html)

User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /tmp/
Allow: /

Sitemap: https://example.com/sitemap.xml
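
To see how a crawler might read the basic examples above, the rules can be fed to Python's standard `urllib.robotparser` module. This is a sketch, not a model of any specific search engine; the helper name `allowed` and the bot name `ExampleBot` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /tmp/
Allow: /
"""

def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/admin/users"))  # False
print(allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/blog/"))        # True

# An empty Disallow allows everything; "Disallow: /" blocks everything.
print(allowed("User-agent: *\nDisallow:\n", "ExampleBot", "https://example.com/admin/"))   # True
print(allowed("User-agent: *\nDisallow: /\n", "ExampleBot", "https://example.com/blog/"))  # False
```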

Directive Reference

| Directive | Required | Description |
| --- | --- | --- |
| User-agent | Yes | Which crawler the rules apply to. Use * for all crawlers. |
| Disallow | Yes | Path or prefix to block. Empty value means nothing is blocked. |
| Allow | No | Override a broader Disallow for a specific sub-path. |
| Sitemap | No | Full URL of your XML sitemap. Can appear multiple times. |

Advanced Robots.txt Examples

Advanced robots.txt files combine separate User-agent groups for different bots, wildcard patterns, and Crawl-delay.

Multiple User-Agents with Different Rules

robots.txt - Per-Bot Rules
# Generated by CheckSEO (https://checkseo.in/robots-txt-generator.html)

# Rules for all crawlers
User-agent: *
Disallow: /admin/
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /search?
Allow: /

# Googlebot gets full access
User-agent: Googlebot
Disallow:

# Block AI crawlers (training bots and AI assistant fetchers)
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

Sitemap: https://example.com/sitemap.xml
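
A crawler follows the group that names it and falls back to the `*` group only when no specific group matches. The sketch below checks that behavior on a shortened version of the per-bot file, again using Python's `urllib.robotparser`; the helper `check` and the `example.com` host are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /tmp/
Allow: /

User-agent: Googlebot
Disallow:

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def check(agent: str, path: str) -> bool:
    """Report whether the named crawler may fetch the path on example.com."""
    return parser.can_fetch(agent, "https://example.com" + path)

print(check("Googlebot", "/admin/"))  # True: its own group has an empty Disallow
print(check("GPTBot", "/blog/"))      # False: its own group blocks everything
print(check("Bingbot", "/admin/"))    # False: no own group, so the * group applies
print(check("Bingbot", "/blog/"))     # True
```

The first result is worth noting: because Googlebot has its own group, the `/admin/` rule in the `*` group does not apply to it.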

E-Commerce Robots.txt

robots.txt - E-Commerce Website
# Generated by CheckSEO (https://checkseo.in/robots-txt-generator.html)

User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /wishlist/
Disallow: /search?
Disallow: /*?sort=
Disallow: /*?filter=
Disallow: /*?page=
Allow: /products/
Allow: /categories/
Allow: /blog/
Allow: /
Crawl-delay: 1

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-products.xml
Sitemap: https://example.com/sitemap-blog.xml
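
Crawl-delay and Sitemap values can be read back programmatically, which is a quick way to confirm that a file parses as intended. This sketch uses `crawl_delay()` and `site_maps()` from Python's `urllib.robotparser` (the latter needs Python 3.8 or newer) on a shortened version of the e-commerce file; `ExampleBot` is a made-up crawler name.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Crawl-delay: 1

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-products.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.crawl_delay("ExampleBot"))  # 1
print(parser.site_maps())
# ['https://example.com/sitemap.xml', 'https://example.com/sitemap-products.xml']

# This parser treats a blank line as the end of a group, so a Crawl-delay
# that follows a blank line is not attached to any User-agent.
detached = RobotFileParser()
detached.parse("User-agent: *\nDisallow: /cart/\n\nCrawl-delay: 1\n".splitlines())
print(detached.crawl_delay("ExampleBot"))  # None
```

Other parsers may be more lenient about blank lines, but keeping Crawl-delay directly under the rules it belongs to avoids the ambiguity.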

WordPress Robots.txt

robots.txt - WordPress Website
# Generated by CheckSEO (https://checkseo.in/robots-txt-generator.html)

User-agent: *
Disallow: /wp-admin/
# Caution: the next two rules can also hide CSS and JS that crawlers need to render pages
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /trackback/
Disallow: /feed/
Disallow: /*?replytocom=
Disallow: /*?s=
Allow: /wp-admin/admin-ajax.php
Allow: /wp-content/uploads/
Allow: /

Sitemap: https://example.com/sitemap_index.xml
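
The WordPress file relies on Allow overriding a broader Disallow (`/wp-admin/admin-ajax.php` inside `/wp-admin/`). Google documents this as "the most specific rule wins", measured by path length, with Allow winning a tie. The sketch below implements that precedence for plain prefixes only, with no wildcard handling; `is_allowed` and `WORDPRESS_RULES` are hypothetical names. Python's built-in `urllib.robotparser` applies rules in file order instead, so it is not used here.

```python
def is_allowed(rules, path):
    """Apply longest-match precedence to (directive, prefix) pairs from one group."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        # A longer prefix wins; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len, allowed = len(prefix), is_allow
    return allowed

WORDPRESS_RULES = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
    ("Disallow", "/wp-content/cache/"),
    ("Allow", "/wp-content/uploads/"),
    ("Allow", "/"),
]

print(is_allowed(WORDPRESS_RULES, "/wp-admin/options.php"))     # False
print(is_allowed(WORDPRESS_RULES, "/wp-admin/admin-ajax.php"))  # True
print(is_allowed(WORDPRESS_RULES, "/blog/hello-world/"))        # True
```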

Wildcard Patterns

Google and Bing support * (match any sequence) and $ (match end of URL) in Disallow and Allow directives.

robots.txt - Wildcard Patterns
# Generated by CheckSEO (https://checkseo.in/robots-txt-generator.html)

User-agent: *
# Block all PDF files
Disallow: /*.pdf$

# Block all URLs with query parameters
Disallow: /*?

# Block all URLs containing /print/
Disallow: /*/print/

# Allow specific file types
Allow: /*.js$
Allow: /*.css$
Allow: /*.png$
Allow: /*.jpg$

Sitemap: https://example.com/sitemap.xml
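
One way to reason about `*` and `$` is to translate each pattern into a regular expression. The sketch below does that in Python; `rule_matches` is an illustrative helper, and it approximates the matching that Google and Bing describe rather than reproducing any crawler's actual code. Python's `urllib.robotparser` does not interpret these wildcards, which is why the translation is written by hand.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the URL path (plus query)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces, then let each '*' stand for any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"  # '$' pins the match to the end of the URL
    # Patterns always match from the start of the path, so re.match is enough.
    return re.match(regex, path) is not None

print(rule_matches("/*.pdf$", "/docs/report.pdf"))       # True
print(rule_matches("/*.pdf$", "/docs/report.pdf?dl=1"))  # False: URL does not end in .pdf
print(rule_matches("/*?", "/shop?color=red"))            # True
print(rule_matches("/*/print/", "/blog/print/post"))     # True
```

The second result shows a common surprise: a `$`-anchored rule stops matching as soon as a query string is appended to the URL.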

Advanced Directive Reference

| Directive | Support | Description |
| --- | --- | --- |
| Crawl-delay | Partial | Seconds between requests. Honored by Bing and some other crawlers, but ignored by Google. |
| * (wildcard) | Google/Bing | Match any sequence of characters in Allow/Disallow paths. |
| $ (end match) | Google/Bing | Match end of URL. Example: /*.pdf$ blocks all PDFs. |
| Multiple Sitemap | All | Declare multiple sitemap files. Place outside User-agent blocks. |

Google Robots.txt Guidelines

Do
  • Place robots.txt at the root of your domain
  • Use UTF-8 encoding
  • Keep rules simple and test them with the robots.txt report in Google Search Console
  • Declare your sitemap URL in robots.txt
  • Use Allow to override broader Disallow rules
  • Use specific User-agent when rules differ by bot
Don't
  • Use robots.txt as a security tool (it doesn't hide content)
  • Block CSS/JS files that Google needs to render pages
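  • Block pages you want indexed (a crawler must be able to fetch a page to index its content)
  • Forget that Disallow only prevents crawling, not indexing
  • Set Crawl-delay too high (it slows discovery)
  • Use robots.txt to remove pages from search results (use a noindex meta tag instead, and leave the page crawlable so the tag can be seen)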
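
Since keeping a page out of search results depends on a noindex directive rather than on robots.txt, it can help to confirm that a page actually carries one. The following Python sketch looks for a robots meta tag in an HTML string using the standard `html.parser` module; the class and function names are illustrative, and it does not check the `X-Robots-Tag` HTTP header or bot-specific meta tags.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(part.strip().lower() for part in content.split(","))

def has_noindex(html: str) -> bool:
    """Return True if the HTML declares noindex in a robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives

print(has_noindex('<head><meta name="robots" content="noindex, follow"></head>'))  # True
print(has_noindex('<head><meta name="robots" content="index, follow"></head>'))    # False
```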

Fetch Robots.txt from Website

Enter any website URL to fetch its current robots.txt file. If no robots.txt exists, we will generate a recommended default for you.
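
The fetch-or-fallback idea can be sketched in a few lines of Python. This is not the CheckSEO implementation, which is not shown on this page; `fetch_text`, `robots_or_default` and the contents of `DEFAULT_ROBOTS` are all assumptions chosen for illustration. The fetcher is passed in as a parameter so the fallback logic can be exercised without network access.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

# Assumed fallback: allow everything. A real tool may recommend something else.
DEFAULT_ROBOTS = "User-agent: *\nDisallow:\n"

def fetch_text(url, timeout=10.0):
    """Download a URL as text; return None when the server answers with a 4xx error."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except HTTPError as error:
        if 400 <= error.code < 500:
            return None  # treated here as "no robots.txt exists"
        raise

def robots_or_default(site, fetch=fetch_text):
    """Return the site's robots.txt, or DEFAULT_ROBOTS if the site has none."""
    body = fetch(site.rstrip("/") + "/robots.txt")
    return body if body is not None else DEFAULT_ROBOTS
```

Calling `robots_or_default("https://example.com")` performs a real HTTP request; passing a stub such as `fetch=lambda url: None` returns the default without touching the network.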

Build Robots.txt

Use this visual builder to create a custom robots.txt file. Fill in the fields and download your file.
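
The same file can be assembled in code. The sketch below is a minimal Python builder, not the logic behind the visual builder on this page; the function name and the dictionary keys (`user_agents`, `disallow`, `allow`, `crawl_delay`) are hypothetical.

```python
def build_robots_txt(groups, sitemaps=()):
    """Render robots.txt text from a list of group dicts and optional sitemap URLs."""
    blocks = []
    for group in groups:
        lines = [f"User-agent: {agent}" for agent in group["user_agents"]]
        # With no 'disallow' key, emit an empty Disallow (nothing is blocked).
        lines += [f"Disallow: {path}".rstrip() for path in group.get("disallow", [""])]
        lines += [f"Allow: {path}" for path in group.get("allow", [])]
        if group.get("crawl_delay") is not None:
            lines.append(f"Crawl-delay: {group['crawl_delay']}")
        blocks.append("\n".join(lines))
    if sitemaps:
        # Sitemap lines sit outside any User-agent group.
        blocks.append("\n".join(f"Sitemap: {url}" for url in sitemaps))
    return "\n\n".join(blocks) + "\n"

print(build_robots_txt(
    [{"user_agents": ["*"], "disallow": ["/admin/"], "allow": ["/"]}],
    sitemaps=["https://example.com/sitemap.xml"],
))
```

Each group is written as one unbroken block, with blank lines only between groups, which keeps every directive attached to its User-agent line.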

Want to Monitor Your Robots.txt?

CheckSEO validates your robots.txt automatically, tracks directive changes, detects crawl-blocking risks, and alerts you when robots.txt issues affect your search visibility. Go beyond static files with continuous robots.txt monitoring.

  • Automatic robots.txt validation
  • User-agent rule tracking
  • Crawl risk detection
  • Sitemap declaration check