How to write a good robots.txt file

A practical guide to keeping crawler rules narrow, readable, and easy to maintain on a small website.

Published Jun 23, 2026 | Updated Jun 23, 2026

A good robots.txt file is usually small. Its job is to guide crawling, not to control every indexing outcome on the site: a URL blocked in robots.txt can still appear in search results if other pages link to it. The file should mainly help crawlers avoid sections that are clearly not meant for public discovery, such as admin paths, internal search pages, or temporary utility folders.
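As a minimal sketch of what such a file can look like, the snippet below defines a small rule set and checks a few URLs against it with Python's standard urllib.robotparser module. The paths (/admin/, /search, /tmp/) and the example.com host are illustrative assumptions, not recommendations for any particular site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for a small site; every path here is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
Disallow: /tmp/
"""

def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def is_allowed(parser: RobotFileParser, path: str) -> bool:
    """Return True if a generic crawler ("*") may fetch the given path."""
    return parser.can_fetch("*", "https://example.com" + path)

parser = build_parser(ROBOTS_TXT)
for path in ["/", "/blog/first-post", "/admin/users", "/search?q=shoes", "/tmp/export.csv"]:
    print(path, "allowed" if is_allowed(parser, path) else "blocked")
```

Python's parser uses simple prefix matching, so its answers may differ from a specific search engine's crawler on files that mix Allow rules or wildcards. For a short Disallow-only file like this one, it is a reasonable quick check.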

The safest pattern is to keep the rules explicit and easy to read. Name the paths you actually want to limit, and avoid broad blocks that are hard to audit later. Do not use robots.txt as a substitute for noindex, redirects, or removal when those are the better tools: a crawler that is blocked from a page never sees a noindex directive on it.
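One way to catch broad blocks before they ship is a small lint pass over the file. The sketch below is a hypothetical helper, not part of any standard tool. Its definition of "broad" (a site-wide Disallow or a rule that starts with a wildcard) is an assumption you may want to adjust.

```python
def find_broad_rules(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line_number, rule) pairs for Disallow rules that block very wide areas."""
    flagged = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() != "disallow":
            continue
        # Assumption: site-wide blocks and leading wildcards count as "broad".
        if value in ("/", "/*") or value.startswith("*"):
            flagged.append((number, line))
    return flagged

# Illustrative input; the rules are made up for this example.
sample = """\
User-agent: *
Disallow: /
Disallow: /admin/
Disallow: *.pdf
"""

for number, rule in find_broad_rules(sample):
    print(f"line {number}: review broad rule: {rule}")
```

A flagged rule is not automatically wrong. The point is to make each wide block a deliberate, reviewed choice rather than something that slips through.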

When a robots.txt file stays narrow, it is easier to maintain and much easier to reason about during a launch or audit. That matters because the best crawl controls are the ones your future self can still understand without guessing what each line was supposed to protect.

Why this guide matters

Use this guide when you want a little more context before publishing, need a quick refresher on best practices, or want to avoid the mistakes that commonly lead to crawl or indexing issues later.

Use this with the matching tool

Robots.txt Generator

If you want to apply this advice immediately, use the related tool and compare the output against the points covered in this guide.

