Northstar SEO Toolkit | Technical SEO tools for lean websites
Sitemaps

How to build a clean XML sitemap

A simple approach to keeping only preferred URLs in your sitemap and avoiding crawl noise.

Published Jun 23, 2026 | Updated Jun 23, 2026

A clean XML sitemap should feel like a curated index of the pages you want search engines to care about. That means canonical URLs only, no URLs that redirect, and no low-value duplicates that dilute the file's usefulness.
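As a rough illustration of those three rules, the sketch below filters a list of crawled pages down to sitemap candidates. The `Page` record, its fields, and the `example.com` URLs are assumptions made up for this example; your crawler or CMS export will likely use different names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """Hypothetical crawl record; field names are assumptions for this sketch."""
    url: str        # URL the page was fetched at
    canonical: str  # URL declared in the page's rel="canonical"
    status: int     # HTTP status returned for `url`


def sitemap_candidates(pages):
    """Return URLs that respond with 200 and are their own canonical, without repeats."""
    seen = set()
    kept = []
    for page in pages:
        if page.status != 200:
            continue  # drops redirects (3xx) and errors (4xx/5xx)
        if page.canonical != page.url:
            continue  # drops duplicates that point at another preferred URL
        if page.url in seen:
            continue  # drops repeated entries
        seen.add(page.url)
        kept.append(page.url)
    return kept


pages = [
    Page("https://example.com/", "https://example.com/", 200),
    Page("https://example.com/old-pricing", "https://example.com/pricing", 301),
    Page("https://example.com/pricing", "https://example.com/pricing", 200),
    Page("https://example.com/pricing?ref=nav", "https://example.com/pricing", 200),
]

print(sitemap_candidates(pages))
# ['https://example.com/', 'https://example.com/pricing']
```

The redirecting URL and the parameterized duplicate are both dropped, so only the two preferred URLs remain.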

The easiest way to keep it clean is to generate it from the same preferred URL source you use for internal links and canonicals. If all three signals (sitemap entries, internal links, and canonical tags) agree, the sitemap stays trustworthy and you spend less time cleaning up drift later.
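One way to picture a single preferred URL source is a list that feeds both the sitemap and the canonical tags. The following is a minimal sketch using Python's standard library; `PREFERRED_URLS`, `build_sitemap`, and `canonical_tag` are hypothetical names for this example, not part of any particular tool.

```python
import html
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical single source of truth for preferred URLs.
PREFERRED_URLS = [
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/guides/xml-sitemaps",
]


def build_sitemap(urls):
    """Serialize the preferred URLs as a sitemap document (UTF-8 bytes)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # ElementTree escapes &, <, >
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def canonical_tag(url):
    """Render a canonical link element from the same source list."""
    if url not in PREFERRED_URLS:
        raise ValueError(f"{url} is not a preferred URL")
    return f'<link rel="canonical" href="{html.escape(url, quote=True)}">'


if __name__ == "__main__":
    with open("sitemap.xml", "wb") as fh:
        fh.write(build_sitemap(PREFERRED_URLS))
    print(canonical_tag("https://example.com/pricing"))
```

Because both functions read from the same list, a URL cannot appear in the sitemap without also being a valid canonical target, which is the kind of agreement described above. Internal link templates could read from the same list in the same way.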

For small sites, less is usually better. A shorter sitemap with the right pages in it is more valuable than a larger one filled with every route the platform can produce. The file should help discovery, not act like a dump of everything your CMS knows about.

Why this guide matters

Use this guide when you want a little more context before publishing, need a quick refresher on best practices, or want to avoid the mistakes that commonly lead to crawl or indexing issues later.

Use this with the matching tool
Sitemap Generator

If you want to apply this advice immediately, run the Sitemap Generator and compare its output against the points covered in this guide.