SEO · April 2026 · 6 min read

How to Create a robots.txt File (With Examples)

The robots.txt file sits at your domain's root (example.com/robots.txt) and tells search engine crawlers which URLs they can and cannot access. It's a crawl directive, not a security mechanism — it politely asks crawlers to skip certain paths, but doesn't prevent access. Every public website should have one.

Derek Giordano

Designer & Developer

In this guide

01 Basic robots.txt Syntax
02 Common Rules
03 Common Mistakes
04 Testing Your robots.txt

Basic robots.txt Syntax

A robots.txt file consists of one or more blocks, each starting with a User-agent line followed by Allow or Disallow rules:

User-agent: *
Disallow: /admin/
Disallow: /api/
Allow: /api/public/

Sitemap: https://example.com/sitemap.xml

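User-agent: * applies the block to every crawler that doesn't have a more specific block of its own. Disallow asks crawlers not to fetch URLs whose path starts with the given value. Allow carves out an exception: when both match, major crawlers such as Googlebot apply the most specific (longest) matching rule, so /api/public/ stays crawlable even though /api/ is disallowed. The Sitemap line tells crawlers where to find your XML sitemap and must be a full URL.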
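To make the Allow/Disallow interaction concrete, here is a minimal Python sketch of the longest-match evaluation described in RFC 9309 and used by Googlebot. The function name is_allowed and the RULES list are illustrative, not part of any library, and the sketch assumes plain path prefixes only (no wildcards, no per-crawler groups).

```python
# Hypothetical helper for illustration; handles plain path prefixes only.
RULES = [
    ("disallow", "/admin/"),
    ("disallow", "/api/"),
    ("allow", "/api/public/"),
]

def is_allowed(rules, path):
    """Return True if `path` may be crawled under longest-match evaluation."""
    best_len = -1   # length of the most specific matching rule seen so far
    allowed = True  # a path that matches no rule is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty value matches nothing
        is_allow = directive == "allow"
        # The longer prefix wins; on a tie, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed

for path in ["/api/private/keys", "/api/public/docs", "/admin/login", "/blog/post"]:
    print(path, "->", "allowed" if is_allowed(RULES, path) else "blocked")
```

Not every crawler evaluates rules this way; some older parsers apply the first matching rule in file order, so it can be safer to write rules whose outcome doesn't depend on the evaluation order.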

Common Rules

Block admin areas: Disallow: /admin/

Block staging/test pages: Disallow: /staging/

Block search result pages: Disallow: /search?

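Block print versions: Disallow: /*?print (the * wildcard is honored by major crawlers such as Googlebot and Bingbot, but not necessarily by every bot)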

Allow everything (default): User-agent: * followed by an empty Disallow: line, or by no Disallow rules at all.

Block everything (pre-launch): User-agent: * followed, on the next line, by Disallow: /

Be careful with 'Disallow: /' — it blocks your entire site from search engines. This is useful before launch but catastrophic if left in production.
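One way to guard against shipping a pre-launch file is a small check in the deploy pipeline. The sketch below is a hypothetical helper (blocks_entire_site is not from any library) that looks for a bare Disallow: / in the group that applies to all crawlers. It is deliberately simple and would not catch equivalents such as Disallow: /* or a site-wide block aimed at one named crawler.

```python
def blocks_entire_site(robots_txt: str) -> bool:
    """Return True if the 'User-agent: *' group contains a bare 'Disallow: /'."""
    in_wildcard_group = False
    last_was_agent = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank or malformed line
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            # Consecutive User-agent lines share one group of rules.
            if not last_was_agent:
                in_wildcard_group = False
            in_wildcard_group = in_wildcard_group or value == "*"
            last_was_agent = True
        else:
            last_was_agent = False
            if field == "disallow" and in_wildcard_group and value == "/":
                return True
    return False

PRE_LAUNCH = "User-agent: *\nDisallow: /\n"
PRODUCTION = "User-agent: *\nDisallow: /admin/\nSitemap: https://example.com/sitemap.xml\n"

print(blocks_entire_site(PRE_LAUNCH))
print(blocks_entire_site(PRODUCTION))
```

A production build step could fail whenever this returns True for the file about to be published.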

Common Mistakes

Blocking CSS and JavaScript files. Google needs to render your pages to assess quality, so don't disallow your /css/ or /js/ directories.

Blocking the entire site accidentally. Adding Disallow: / during development and forgetting to remove it keeps crawlers away from every page.

Using robots.txt for security. It doesn't prevent access; it only asks crawlers to skip the URL, and because the file is publicly readable it also advertises every path you list. Sensitive pages should require authentication, not rely on robots.txt.

Getting the trailing slash wrong on directory paths. Rules are prefix matches: Disallow: /admin blocks /admin, /admin/ and anything else that starts with /admin (including /administrator), while Disallow: /admin/ blocks only URLs inside the directory and leaves /admin itself crawlable.
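The effect of a trailing slash is easy to check with Python's standard-library urllib.robotparser, since path rules are prefix matches. The helper name can_crawl and the example.com URLs below are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

def can_crawl(robots_txt, url, agent="*"):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

WITHOUT_SLASH = "User-agent: *\nDisallow: /admin"
WITH_SLASH = "User-agent: *\nDisallow: /admin/"

# Columns: path, crawlable under "Disallow: /admin", crawlable under "Disallow: /admin/"
for path in ["/admin", "/admin/", "/admin/users", "/administrator"]:
    url = "https://example.com" + path
    print(path, can_crawl(WITHOUT_SLASH, url), can_crawl(WITH_SLASH, url))
```

Without the slash, all four paths are blocked, including the unrelated /administrator. With the slash, only the URLs inside the /admin/ directory are blocked.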

Testing Your robots.txt

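Google Search Console's robots.txt report, which replaced the older standalone robots.txt Tester, shows the robots.txt files Google found for your site, when they were last fetched, and any warnings or errors. Its URL Inspection tool reports whether a specific URL is blocked by robots.txt. Test specific URLs against your rules before deploying. The Robots.txt Generator builds the file interactively: add rules, see the output, and test URLs against it in real time.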
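For a quick local check before deploying, a draft file can be tested against a table of URLs and expected outcomes. The sketch below uses Python's standard-library urllib.robotparser; the DRAFT rules, the EXPECTED table, and the find_mismatches helper are illustrative assumptions. As far as its behavior shows, this parser does plain prefix matching and applies rules in file order, so wildcard rules and overlapping Allow/Disallow pairs are better verified with the search engine's own tools.

```python
from urllib.robotparser import RobotFileParser

# Draft rules to verify (hypothetical example).
DRAFT = """\
User-agent: *
Disallow: /admin/
Disallow: /staging/
"""

# URL -> whether crawlers should be allowed to fetch it.
EXPECTED = {
    "https://example.com/": True,
    "https://example.com/blog/hello": True,
    "https://example.com/admin/settings": False,
    "https://example.com/staging/home": False,
}

def find_mismatches(robots_txt, expected, agent="*"):
    """Return the URLs whose actual result differs from the expected one."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url, want in expected.items()
            if parser.can_fetch(agent, url) != want]

mismatches = find_mismatches(DRAFT, EXPECTED)
print("all expectations met" if not mismatches else f"unexpected results: {mismatches}")
```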

Frequently Asked Questions

What is robots.txt?

A text file at your domain root that tells search engine crawlers which URLs they should and shouldn't crawl. It controls crawl behavior, not indexing. To keep a page out of the index, use a noindex meta tag and leave the page crawlable so the tag can be read.

Does robots.txt prevent pages from being indexed?

No. It prevents crawling, not indexing. If other sites link to a disallowed URL, Google may still index it (without the page content). Use a noindex meta tag to prevent indexing, and don't disallow that page in robots.txt: a crawler that can't fetch the page never sees the tag.

Where do I put robots.txt?

At the root of your domain: https://example.com/robots.txt. It must be at the root; a file in a subdirectory (example.com/blog/robots.txt) is ignored. Each subdomain needs its own file.


Written by the creator of Ultimate Design Tools. BA in Business Marketing.
