How to build a clean XML sitemap
A simple approach to keeping only preferred URLs in your sitemap and avoiding crawl noise.
A clean XML sitemap should read like a curated index of the pages you want search engines to crawl and index. That means canonical URLs only, no URLs that redirect, and no low-value duplicates that dilute the file's usefulness.
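As a rough illustration of those three rules, here is a minimal Python sketch of a filter. The `Page` record and its fields (`url`, `status`, `canonical`) are hypothetical stand-ins for whatever your crawler or CMS export provides, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """Hypothetical crawl record; the field names are assumptions."""
    url: str
    status: int      # HTTP status returned when the URL is fetched
    canonical: str   # URL declared by the page's canonical tag


def sitemap_candidates(pages):
    """Return URLs that resolve directly, are self-canonical, and are unique."""
    seen = set()
    kept = []
    for page in pages:
        if page.status != 200:
            continue  # skips redirects (3xx) and errors (4xx/5xx)
        if page.canonical != page.url:
            continue  # page points to a different preferred URL
        if page.url in seen:
            continue  # already listed once
        seen.add(page.url)
        kept.append(page.url)
    return kept
```

Anything the filter drops is worth reviewing on its own: a redirecting or non-canonical URL that shows up in your source list usually means an internal link or a template is still pointing at the wrong address.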
The easiest way to keep it clean is to generate it from the same preferred URL source you use for internal links and canonicals. If all three signals (the sitemap, internal links, and canonical tags) agree, the sitemap stays much more trustworthy and you spend less time cleaning up drift later.
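One way to sketch this idea in Python: keep a single list of preferred URLs and build the sitemap directly from it using only the standard library. The `build_sitemap` function name and the example URLs are illustrative assumptions; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(preferred_urls):
    """Serialize preferred URLs into sitemap XML (UTF-8 bytes)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # dict.fromkeys drops repeats while preserving the original order.
    for url in dict.fromkeys(preferred_urls):
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & when serializing.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical single source of truth, shared with the code that
# renders internal links and canonical tags.
PREFERRED_URLS = [
    "https://example.com/",
    "https://example.com/pricing",
]

if __name__ == "__main__":
    print(build_sitemap(PREFERRED_URLS).decode("utf-8"))
```

Because the sitemap is derived from the same list as the other two signals, a URL change only has to be made in one place.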
For small sites, less is usually better. A shorter sitemap with the right pages in it is more valuable than a larger one filled with every route the platform can produce. The file should help discovery, not act like a dump of everything your CMS knows about.
Use this guide when you want a little more context before publishing, need a quick refresher on best practices, or want to avoid the mistakes that commonly lead to crawl or indexing issues later.
If you want to apply this advice immediately, use the related tool and compare the output against the points covered in this guide.