A well-structured .xml sitemap is an essential component when using Website in Agents. It ensures that all the important pages of your website are included in the crawl, helping your AI Agent access the most relevant content.

Why is a sitemap important?

A sitemap is like a roadmap for the Website functionality in your Agent. You can easily add the information from your website via the Website feature.
A sitemap lists all the URLs on your website that you want Watermelon to access. By uploading a well-organized sitemap, you ensure that Watermelon knows exactly which pages to crawl and integrate into your AI Agent’s knowledge base. 
With a properly set up sitemap, Watermelon can:
  • Access all key pages: Ensure important pages (like product pages, FAQs, or blogs) are included.
  • Save time: Instead of manually adding individual URLs, you can use your sitemap to automatically fetch a list of all your key URLs.
  • Ensure content accuracy: A sitemap ensures that your AI Agent stays up to date with the most current version of your website’s content.
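To illustrate how a sitemap yields a list of URLs automatically, here is a minimal Python sketch that reads the `<loc>` entries from a sitemap document. It is not how Watermelon itself processes sitemaps; the function name `extract_urls` and the sample data are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element inside <url> entries."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample sitemap with two pages.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/foo.html</loc><lastmod>2022-06-04</lastmod></url>
  <url><loc>https://www.example.com/faq</loc></url>
</urlset>"""

for url in extract_urls(sample):
    print(url)
```

Running a check like this against your own sitemap is a quick way to see exactly which pages would be offered for crawling.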

How to set up a sitemap

Setting up a sitemap is relatively easy, and there are various tools available to help you create one. Here are a few options:
  • CMS Plugins: Many content management systems (CMS) like WordPress have plugins (e.g., Yoast SEO, All in One SEO) that automatically generate an XML sitemap for your site.
  • Online Tools: You can also use free online sitemap generators like XML-sitemaps.com to create a sitemap quickly.
  • Manual Creation: If you’re comfortable with code, you can create a custom XML sitemap manually. For detailed instructions, see Google’s official guide on sitemaps.
Once your sitemap is ready, you can upload it to Website in Agents for quick and accurate crawling of your website’s content.

Best practices for an effective sitemap

1. Only include important pages

Ensure your sitemap contains the most relevant and important pages you want Watermelon to access. Avoid including URLs for irrelevant or duplicate content (such as filtered versions of the same page or admin pages). Examples of important pages to include:
  • Home page
  • Product or service pages
  • Blog and FAQ sections
  • Contact and pricing pages
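As a rough sketch of this selection step, the Python snippet below drops filtered variants (URLs with a query string), admin-style paths, and duplicates from a list of candidate URLs. The excluded prefixes and the function names are hypothetical; adjust them to your own site.

```python
from urllib.parse import urlsplit

# Hypothetical path prefixes that should not end up in the sitemap.
EXCLUDED_PREFIXES = ("/admin", "/wp-admin", "/cart")


def is_sitemap_candidate(url: str) -> bool:
    """Reject filtered variants (query strings) and excluded sections."""
    parts = urlsplit(url)
    if parts.query:  # e.g. /shoes?color=red is a filtered copy of /shoes
        return False
    return not any(
        parts.path == prefix or parts.path.startswith(prefix + "/")
        for prefix in EXCLUDED_PREFIXES
    )


def select_pages(urls):
    """Keep the first occurrence of each acceptable URL, ignoring trailing slashes."""
    seen = set()
    selected = []
    for url in urls:
        key = url.rstrip("/")
        if key in seen or not is_sitemap_candidate(url):
            continue
        seen.add(key)
        selected.append(url)
    return selected


candidates = [
    "https://www.example.com/",
    "https://www.example.com/pricing",
    "https://www.example.com/pricing/",
    "https://www.example.com/shoes?color=red",
    "https://www.example.com/admin/login",
]
print(select_pages(candidates))
```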

2. Create a clean and simple URL structure in your .xml sitemap

Sitemaps should follow a clear and organized URL structure. Make sure your URLs are clean, concise, and easy to understand. It’s recommended to use a structure similar to the one shown below.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/foo.html</loc>
    <lastmod>2022-06-04</lastmod>
  </url>
</urlset>
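If you prefer to generate this structure rather than write it by hand, the following Python sketch builds an equivalent document with the standard library. The function name `build_sitemap` and the example page are assumptions; `ET.indent` requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages) -> bytes:
    """Build a sitemap from (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    ET.indent(urlset)  # pretty-print for readability (Python 3.9+)
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


xml_bytes = build_sitemap([("https://www.example.com/foo.html", "2022-06-04")])
print(xml_bytes.decode("utf-8"))
```

Using an XML library instead of string concatenation means special characters in URLs, such as `&`, are escaped for you.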

3. Descriptive URLs

Make sure your URLs are clear and describe the content of the page. For example, use /blog/best-practices-for-ai-agents rather than /page?id=12345. This helps both Watermelon and search engines understand what each page is about.
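One common way to get descriptive URLs is to derive the path from the page title. The Python sketch below shows a simple `slugify` helper; the name and the exact rules (ASCII only, lowercase, hyphens) are assumptions, and most CMSs already provide something similar.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated URL segment."""
    # Strip accents by decomposing characters and dropping non-ASCII marks.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of non-alphanumeric characters with a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


print("/blog/" + slugify("Best Practices for AI Agents"))
```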

4. Limit the size of your sitemap

Depending on your license, you have a limit on the number of crawls per month, so take this into account when creating your Agent. A sitemap for a large website can include many URLs. To avoid performance issues, limit each sitemap to 50,000 URLs or 50 MB in size, and split anything larger into multiple sitemaps to make it easier for Watermelon to handle.
If your website contains a lot of product pages, we recommend adding these through an XML feed and adding the other pages via a sitemap, because an XML feed reads product details better than the Website functionality.
For more information, see Google’s guidelines on sitemap limits.
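The splitting step can be sketched in Python as follows: the URL list is cut into chunks of at most 50,000 entries, and a sitemap index file (part of the sitemap protocol) lists the resulting files. The file names are hypothetical, and whether Watermelon accepts a sitemap index or needs each sitemap uploaded separately is not covered here, so treat this as an illustration only.

```python
import xml.etree.ElementTree as ET
from itertools import islice

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield successive lists of at most `size` URLs."""
    iterator = iter(urls)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def build_sitemap_index(sitemap_urls) -> bytes:
    """Build a sitemap index that points at the individual sitemap files."""
    index = ET.Element("sitemapindex", xmlns=NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


# Hypothetical site with 120,001 product URLs.
all_urls = [f"https://www.example.com/p/{i}" for i in range(120_001)]
chunks = list(chunk_urls(all_urls))
index_xml = build_sitemap_index(
    f"https://www.example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)
)
print([len(chunk) for chunk in chunks])
```

This sketch only enforces the URL count; the 50 MB limit would need a separate check on the size of each generated file.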

5. Keep your sitemap updated

Whenever you add, remove, or change content on your website, make sure to update your sitemap. This ensures that Watermelon is always accessing the latest version of your site.

6. Avoid adding blocked URLs

Ensure that your sitemap doesn’t include any URLs that are blocked by robots.txt or have a “noindex” tag. These pages won’t be crawled, which could lead to incomplete knowledge for your AI Agent.
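To check a sitemap against robots.txt before uploading it, something like the Python sketch below can help. It uses the standard library's `urllib.robotparser` and only covers robots.txt rules, not "noindex" tags. The user agent `"*"` is an assumption, since the name of the crawler used by Watermelon is not given here.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(urls, robots_txt: str, user_agent: str = "*"):
    """Return only the URLs that the given robots.txt allows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt content and sitemap URLs.
robots_txt = "User-agent: *\nDisallow: /admin/\n"
sitemap_urls = [
    "https://www.example.com/pricing",
    "https://www.example.com/admin/login",
]
print(allowed_urls(sitemap_urls, robots_txt))
```

Any URL that this check filters out is a candidate for removal from the sitemap, or for a change to robots.txt if the page should be crawled after all.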