# Best practices for creating a sitemap

> Learn how to create an optimized sitemap

A well-structured **.xml sitemap** is an essential component when using **Website** in **Agents**. It ensures that all the important pages of your website are included in the crawl, helping your AI Agent access the most relevant content.

<Info>
  A sitemap can be used with all licenses.
</Info>

## **Why is a sitemap important?**

A sitemap is a roadmap of your website for your Agent. Using the Website feature, you can easily add website content. The sitemap lists the URLs you want Watermelon to access, helping it know exactly which pages to crawl and add to your AI Agent’s knowledge base.

**With a properly set up sitemap, Watermelon can:**

* **Access all key pages**: Ensure important pages (like product pages, FAQs, or blogs) are included.
* **Save time**: Instead of manually adding individual URLs, you can use your sitemap to automatically fetch a list of all your key URLs.
* **Ensure content accuracy**: A sitemap ensures that your AI Agent stays up to date with the most current version of your website’s content.

## **How to set up a sitemap**

Setting up a sitemap is relatively easy, and there are various tools available to help you create one. Here are a few options:

* **CMS Plugins**: Many content management systems (CMS) like WordPress have plugins (e.g., Yoast SEO, All in One SEO) that automatically generate an XML sitemap for your site.
* **Online Tools**: You can also use free online sitemap generators like [XML-sitemaps.com](http://XML-sitemaps.com) to create a sitemap quickly.
* **Manual Creation**: If you’re comfortable with code, you can create a custom XML sitemap manually. For detailed instructions, see [Google’s official guide on sitemaps](https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap).

Once your sitemap is ready, you can upload it to **Website** in Agents for quick and accurate crawling of your website’s content.

## **Best practices for an effective sitemap**

### **1. Only include important pages**

Ensure your sitemap contains the most relevant and important pages you want Watermelon to access. Avoid including URLs for irrelevant or duplicate content (such as filtered versions of the same page or admin pages).

Examples of important pages to include:

* Home page
* Product or service pages
* Blog and FAQ sections
* Contact and pricing pages
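
As a rough illustration of the filtering described above, the following Python sketch narrows a list of candidate URLs down to unique, relevant pages. The excluded path prefixes and the example URLs are assumptions made for this example; they are not part of Watermelon, so adjust them to your own site.

```python
from urllib.parse import urlsplit

# Hypothetical path prefixes to leave out of the sitemap; adjust to your site.
EXCLUDED_PREFIXES = ("/admin/", "/login/")


def select_sitemap_urls(urls):
    """Return unique URLs, skipping filtered views and excluded paths."""
    selected = []
    seen = set()
    for url in urls:
        parts = urlsplit(url)
        # A query string usually marks a filtered or sorted copy of a page.
        if parts.query:
            continue
        if parts.path.startswith(EXCLUDED_PREFIXES):
            continue
        # Treat "/pricing" and "/pricing/" as the same page.
        key = (parts.netloc.lower(), parts.path.rstrip("/") or "/")
        if key in seen:
            continue
        seen.add(key)
        selected.append(url)
    return selected


candidates = [
    "https://www.example.com/",
    "https://www.example.com/pricing",
    "https://www.example.com/pricing/",
    "https://www.example.com/products?color=red",
    "https://www.example.com/admin/settings",
]
print(select_sitemap_urls(candidates))
```

With these example URLs, only the home page and the first pricing URL remain; the duplicate, the filtered view, and the admin page are dropped.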

### **2. Create a clean and simple URL structure in your .xml sitemap**

Sitemaps should follow a clear and organized URL structure. Make sure your URLs are clean, concise, and easy to understand. It's recommended to use a structure similar to the one shown below.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/foo.html</loc>
    <lastmod>2026-06-04</lastmod>
  </url>
</urlset>
```
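
If you prefer to generate this structure with a script instead of writing it by hand, a minimal sketch using Python's standard library could look like the one below. The `build_sitemap` function and the page list are hypothetical and only meant as an illustration; `ET.indent` requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Serialize the sitemap namespace as the default namespace (no prefix).
ET.register_namespace("", SITEMAP_NS)


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs; lastmod is YYYY-MM-DD."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    ET.indent(urlset)  # pretty-print; available from Python 3.9
    xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    return xml_bytes.decode("utf-8")


# Example data taken from the sample sitemap above.
pages = [("https://www.example.com/foo.html", "2026-06-04")]
print(build_sitemap(pages))
```

Building the file through an XML library rather than string concatenation means special characters in URLs, such as `&`, are escaped for you.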

### **3. Descriptive URLs**

Make sure your URLs are clear and describe the content of the page. For example, use **/blog/best-practices-for-ai-agent** rather than **/page?id=12345**. This helps both Watermelon and search engines understand what each page is about.

### **4. Limit the size of your sitemap**

Depending on your license, you have a limit on the number of crawls per month. It's recommended to take this into consideration when creating your Agent.

A sitemap for a large website can contain many URLs. To avoid performance issues, limit each sitemap to **50,000 URLs** or **50MB in size**, and split anything larger into multiple sitemaps so that Watermelon can handle it more easily.

If your website contains many product pages, we recommend adding these through an XML feed and adding your other pages through a sitemap, because an XML feed reads product details better than the Website functionality.

For more information, see [Google’s guidelines on sitemap limits](https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap).
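
Splitting a long URL list into sitemap-sized groups can be sketched in a few lines of Python. The `split_urls` helper and the generated example URLs are hypothetical. Note that this sketch only enforces the 50,000 URL limit; it does not check the 50MB file size limit.

```python
MAX_URLS_PER_SITEMAP = 50_000


def split_urls(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `limit` items."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


# Hypothetical site with 120,001 pages.
all_urls = [f"https://www.example.com/page-{n}" for n in range(120_001)]
chunks = split_urls(all_urls)
print([len(chunk) for chunk in chunks])
```

Each chunk can then be written to its own sitemap file.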

### **5. Keep your sitemap updated**

Whenever you add, remove, or change content on your website, make sure to update your sitemap. This ensures that Watermelon is always accessing the latest version of your site.

### **6. Avoid adding blocked URLs**

Ensure that your sitemap doesn’t include any URLs that are blocked by **robots.txt** or have a “noindex” tag. Watermelon skips these URLs and doesn’t add their content to the Agent, which could leave your AI Agent with incomplete knowledge.
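
One way to spot blocked URLs before adding them to your sitemap is to test them against your robots.txt rules. The sketch below uses Python's built-in `urllib.robotparser`; the robots.txt content, the example URLs, and the `*` user agent are assumptions for illustration, since this article does not state which user agent Watermelon uses. It does not detect “noindex” tags, which live in the page itself.

```python
from urllib.robotparser import RobotFileParser


def crawlable_urls(urls, robots_txt, user_agent="*"):
    """Return the URLs that the given robots.txt rules allow to be fetched."""
    parser = RobotFileParser()
    # Parse the rules from text so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt rules and sitemap candidates.
robots_txt = """\
User-agent: *
Disallow: /admin/
"""
candidates = [
    "https://www.example.com/pricing",
    "https://www.example.com/admin/settings",
]
print(crawlable_urls(candidates, robots_txt))
```

Any URL that this check filters out is a candidate for removal from the sitemap.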

<Tip>
  For more information on how to add the sitemap to your Agent, check [this article](/help-center/features/website-source).
</Tip>
