How to Create a robots.txt File (With Examples)
The robots.txt file sits at your domain's root (example.com/robots.txt) and tells search engine crawlers which URLs they can and cannot access. It's a crawl directive, not a security mechanism: it politely asks crawlers to skip certain paths but doesn't prevent access. Every public website should have one.
- Covers basic robots.txt syntax.
- Covers common rules.
- Covers common mistakes.
- Covers testing your robots.txt.
Basic robots.txt Syntax
A robots.txt file consists of one or more blocks, each starting with a User-agent line followed by Allow or Disallow rules:
User-agent: *
Disallow: /admin/
Disallow: /api/
Allow: /api/public/
Sitemap: https://example.com/sitemap.xml
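User-agent: * applies to all crawlers. Disallow asks crawlers not to fetch any URL whose path begins with the specified prefix. Allow overrides a Disallow for a more specific path, so here /api/public/ stays crawlable even though the rest of /api/ is blocked. The Sitemap line tells crawlers where to find your XML sitemap and applies regardless of which User-agent block it appears near.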
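The way Allow and Disallow interact can be sketched in a few lines of Python. This is a minimal illustration, assuming the longest-match precedence described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie) and plain prefix rules without wildcards. The `RULES` list and `is_allowed` helper are hypothetical names, not part of any library.

```python
# Rules from the example block above, as (directive, path-prefix) pairs.
RULES = [
    ("disallow", "/admin/"),
    ("disallow", "/api/"),
    ("allow", "/api/public/"),
]


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len = -1
    allowed = True  # no matching rule means the path is crawlable
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        # A longer prefix is more specific; on a tie, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


print(is_allowed("/api/public/status"))  # True: Allow is the longest match
print(is_allowed("/api/private/keys"))   # False: only Disallow: /api/ matches
print(is_allowed("/blog/post"))          # True: no rule matches
```

Not every crawler or parsing library resolves conflicts this way, so it is worth checking the documentation of the crawler you care about.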
Common Rules
Block admin areas: Disallow: /admin/
Block staging/test pages: Disallow: /staging/
Block search result pages: Disallow: /search?
Block print versions: Disallow: /*?print
Allow everything (default): User-agent: * followed by no Disallow rules.
Block everything (pre-launch): User-agent: * Disallow: /
Be careful with 'Disallow: /': it blocks your entire site from search engines. This is useful before launch but catastrophic if left in production.
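Rules such as /*?print rely on wildcard matching, which RFC 9309 describes and major crawlers such as Googlebot support: `*` matches any run of characters, and a trailing `$` anchors the end of the URL. The following is a rough sketch of that matching, not any crawler's actual implementation; `pattern_matches` is a hypothetical helper name.

```python
import re


def pattern_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern against a URL path plus query string."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then turn each robots.txt '*' into '.*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so an unanchored pattern is a prefix match.
    return re.match(regex, path) is not None


print(pattern_matches("/search?", "/search?q=shoes"))     # True
print(pattern_matches("/search?", "/search"))             # False: no query string
print(pattern_matches("/*?print", "/articles/42?print"))  # True
print(pattern_matches("/*?print", "/articles/42"))        # False
```

Because unanchored patterns are prefix matches, /search? only catches URLs that actually carry a query string, which is why it leaves a bare /search page crawlable.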
Common Mistakes
- Blocking CSS and JavaScript files. Google needs to render your pages to assess quality, so don't disallow your /css/ or /js/ directories.
- Blocking the entire site accidentally by adding Disallow: / during development and forgetting to remove it.
- Using robots.txt for security. It doesn't prevent access; it only asks crawlers to skip the URL. Sensitive pages should require authentication, not rely on robots.txt.
- Getting the trailing slash wrong on directory paths. Rules are prefix matches, so Disallow: /admin blocks /admin, /admin/ and everything under it, but also unrelated paths such as /administrator. Disallow: /admin/ blocks only URLs under the directory and leaves the bare /admin path crawlable.
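The effect of a trailing slash can be checked with Python's standard-library urllib.robotparser, which treats each rule as a plain path prefix. Because rules are prefix matches, Disallow: /admin also covers /admin/ and /administrator, while Disallow: /admin/ leaves the bare /admin path crawlable. This is a small sketch; the `blocked` helper and the example.com URLs are illustrative only.

```python
from urllib.robotparser import RobotFileParser


def blocked(rule: str, path: str) -> bool:
    """Return True if `path` is disallowed for all crawlers under `rule`."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {rule}"])
    return not parser.can_fetch("*", f"https://example.com{path}")


# Compare the two spellings of the rule against four sample paths.
for rule in ("/admin", "/admin/"):
    for path in ("/admin", "/admin/", "/admin/users", "/administrator"):
        print(f"Disallow: {rule:<8} {path:<15} blocked={blocked(rule, path)}")
```

With Disallow: /admin all four paths come back blocked. With Disallow: /admin/ only /admin/ and /admin/users do.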
Testing Your robots.txt
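Google Search Console includes a robots.txt report, which replaced the older standalone robots.txt tester. It lists the robots.txt files Google found for your site, when each was last crawled, and any warnings or errors. To see whether a specific URL is blocked, use the URL Inspection tool. Test specific URLs against your rules before deploying. The Robots.txt Generator builds the file interactively: add rules, see the output, and test URLs against it in real time.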
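Checks like these can also be scripted as part of a deploy step. The following is a minimal sketch using Python's urllib.robotparser; `find_blocked`, `MUST_BE_CRAWLABLE`, and the sample URLs are hypothetical. Note that this parser applies the first matching rule in file order rather than longest-match precedence, so its answers can differ from a search engine's when a file mixes overlapping Allow and Disallow rules.

```python
from urllib.robotparser import RobotFileParser


def find_blocked(robots_txt: str, urls: list[str], agent: str = "*") -> list[str]:
    """Return the subset of `urls` that `agent` may not crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Hypothetical pages that must stay crawlable in production.
MUST_BE_CRAWLABLE = [
    "https://example.com/",
    "https://example.com/blog/first-post",
]

# A pre-launch file that was never cleaned up blocks every page.
leftover = "User-agent: *\nDisallow: /\n"
print(find_blocked(leftover, MUST_BE_CRAWLABLE))
```

Failing the build whenever `find_blocked` returns a non-empty list is one way to catch a leftover Disallow: / before it reaches production.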
Frequently Asked Questions
What is robots.txt?
Does robots.txt prevent pages from being indexed?
Where do I put robots.txt?
Use the Robots.txt Generator: free, no signup required.