Sitemap Index Files: When and How to Use Them (Complete Guide)

Got more than 50,000 URLs on your website? Or maybe your sitemap file is pushing 50MB? Then you need a sitemap index.

Here's the deal: XML sitemaps have hard limits: 50,000 URLs or 50MB (uncompressed) per file, whichever comes first. If a file exceeds either limit, Google won't process it properly. The solution? Split your sitemap into multiple files and use a sitemap index to organize them.

Think of a sitemap index as a "table of contents" for your sitemaps. It's a master file that points to all your individual sitemap files, telling search engines where to find everything.

In this guide, I'll show you when you need a sitemap index, how to create one, and the best ways to organize your sitemaps for maximum SEO benefit.

What is a Sitemap Index?

A sitemap index is an XML file that contains a list of other sitemap files. Instead of listing individual URLs, it lists sitemap locations.

Simple example:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2025-11-26</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2025-11-25</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2025-11-20</lastmod>
  </sitemap>
</sitemapindex>

Key differences from regular sitemaps:

Feature | Regular Sitemap | Sitemap Index
Root element | <urlset> | <sitemapindex>
Contains | Individual URLs | Links to other sitemaps
Child elements | <url> entries | <sitemap> entries
Purpose | List pages to index | Organize multiple sitemaps
File limit | 50,000 URLs or 50MB | 50,000 sitemaps or 50MB

When Do You Need a Sitemap Index?

Scenario 1: You Have More Than 50,000 URLs

This is the most common reason. If your site has:

  • 100,000 product pages (e-commerce)
  • 75,000 blog posts (news site)
  • 200,000 user-generated pages (forum, marketplace)

You must split into multiple sitemaps.

Example structure for 150,000 URLs:

sitemap_index.xml (main file)
├── sitemap-1.xml (50,000 URLs)
├── sitemap-2.xml (50,000 URLs)
└── sitemap-3.xml (50,000 URLs)
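
The split above can be sketched in a few lines of Python. This is a minimal illustration, not a full generator: the helper names (`chunk_urls`, `plan_sitemaps`) and the placeholder URLs are my own assumptions, and the file names simply mirror the sitemap-N.xml pattern from the example structure.

```python
from typing import Dict, Iterator, List

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit described above


def chunk_urls(urls: List[str], size: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each holding at most `size` entries."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def plan_sitemaps(urls: List[str]) -> Dict[str, List[str]]:
    """Map hypothetical file names (sitemap-1.xml, ...) to their URL chunk."""
    return {
        f"sitemap-{n}.xml": chunk
        for n, chunk in enumerate(chunk_urls(urls), start=1)
    }


# Placeholder URLs, used only to demonstrate the split
urls = [f"https://example.com/page-{i}" for i in range(150_000)]
plan = plan_sitemaps(urls)
for name, chunk in plan.items():
    print(name, len(chunk))
```

In practice you would probably pass a smaller `size` so each file has room to grow before it hits the limit.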

Scenario 2: Your Sitemap Exceeds 50MB

Even if you have fewer than 50,000 URLs, if your sitemap file is larger than 50MB uncompressed, you need to split it.

This happens when:

  • You include lots of image sitemap data
  • You include video sitemap metadata
  • You have very long URLs
  • You include extensive hreflang annotations

Solutions:

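  1. Compress with gzip: Can reduce transfer size by 80-90%, but the 50MB limit applies to the uncompressed file, so compression alone won't bring an oversized sitemap under it
  2. Split into multiple files: Required whenever the uncompressed file is still over 50MB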
  3. Remove unnecessary data: Strip out optional elements
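
Here is a rough sketch of how you might check a sitemap against the size limit and gzip it for serving. It assumes "50MB" means 52,428,800 bytes measured on the uncompressed UTF-8 file; `needs_split` and `gzip_sitemap` are hypothetical helper names, and the sample XML is a stand-in for a real file.

```python
import gzip

# Assumption: the 50MB limit is 50 * 1024 * 1024 bytes, measured uncompressed
MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024


def needs_split(xml_text: str) -> bool:
    """True if the uncompressed UTF-8 sitemap is larger than the limit."""
    return len(xml_text.encode("utf-8")) > MAX_UNCOMPRESSED_BYTES


def gzip_sitemap(xml_text: str) -> bytes:
    """Return gzip-compressed bytes, e.g. to save as sitemap.xml.gz."""
    return gzip.compress(xml_text.encode("utf-8"))


# Stand-in sitemap body with repetitive entries, which compress well
entries = "".join(
    f"<url><loc>https://example.com/page-{i}</loc></url>\n" for i in range(1000)
)
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    + entries
    + "</urlset>"
)

compressed = gzip_sitemap(sample)
print("uncompressed bytes:", len(sample.encode("utf-8")))
print("gzipped bytes:", len(compressed))
print("needs split:", needs_split(sample))
```

The size check deliberately runs on the uncompressed bytes, since that is what the limit above refers to.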

Scenario 3: You Want Better Organization

Even if you're under the limits, sitemap indexes help you organize content logically.

Benefits of organization:

  • Easier debugging: Isolate issues to specific content types
  • Faster updates: Regenerate only changed sections
  • Better monitoring: Track indexing by content type in Search Console
  • Cleaner architecture: Separate concerns (blog vs products vs pages)

Example for a mid-sized site (20,000 URLs):

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-blog.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-categories.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
  </sitemap>
</sitemapindex>

Scenario 4: You Have Multiple Languages or Regions

For international sites, organize by language or region:

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-en-us.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-en-gb.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-es-es.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-fr-fr.xml</loc>
  </sitemap>
</sitemapindex>

How to Create a Sitemap Index

Method 1: Manual Creation (Small Sites)

For simple setups, create the index file manually.

Step-by-step:

  1. Create your individual sitemaps:
     • sitemap-posts.xml
     • sitemap-pages.xml
     • sitemap-products.xml

  2. Create the index file (sitemap_index.xml):

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://yourdomain.com/sitemap-posts.xml</loc>
    <lastmod>2025-11-26</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://yourdomain.com/sitemap-pages.xml</loc>
    <lastmod>2025-11-26</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://yourdomain.com/sitemap-products.xml</loc>
    <lastmod>2025-11-26</lastmod>
  </sitemap>
</sitemapindex>

  3. Upload all files to your website root
  4. Submit the index to Google Search Console

Important: Only submit the index file, not the individual sitemaps. Google will automatically discover and crawl the child sitemaps.

Method 2: WordPress Plugins (Automatic)

Popular SEO plugins automatically create sitemap indexes when needed.

Yoast SEO:

  • Automatically creates /sitemap_index.xml
  • Splits by content type (posts, pages, categories, tags)
  • Updates automatically when you publish content

Rank Math:

  • Creates /sitemap_index.xml
  • More granular control over what's included
  • Can set custom limits per sitemap

All in One SEO:

  • Creates /sitemap.xml (which is actually an index)
  • Organizes by post type
  • Includes images automatically

To verify:

  1. Visit https://yourdomain.com/sitemap_index.xml (or /sitemap.xml if you use All in One SEO)
  2. You should see a list of child sitemaps
  3. Click each one to verify it contains URLs
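
If you'd rather not click through by hand, the same check can be sketched with Python's standard library. This is only an illustration: `list_child_sitemaps` is a hypothetical helper, and the index is inlined as sample bytes so the snippet runs without touching your site.

```python
import xml.etree.ElementTree as ET

# Namespace shared by sitemap and sitemap index files
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_child_sitemaps(index_xml: bytes) -> list:
    """Return every <loc> value found under <sitemap> entries in an index."""
    root = ET.fromstring(index_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", NS)
        if loc.text
    ]


# Sample index, inlined so the sketch needs no network access
SAMPLE_INDEX = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-pages.xml</loc></sitemap>
</sitemapindex>"""

children = list_child_sitemaps(SAMPLE_INDEX)
for child in children:
    print(child)
```

To run it against a real site, you would download the index yourself and pass the raw response bytes to the function.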

Method 3: Python Script (Custom Sites)

For custom sites, generate the index programmatically.

Example Python script:

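from datetime import datetime
from xml.sax.saxutils import escape

def generate_sitemap_index(sitemap_files):
    """Generate a sitemap index from a list of sitemap URLs"""
    # Simplification: every entry is stamped with today's date. Where you can,
    # use each child sitemap's real modification date instead.
    today = datetime.now().strftime("%Y-%m-%d")

    xml = '<?xml version="1.0" encoding="UTF-8"?>\n'
    xml += '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'

    for sitemap_url in sitemap_files:
        xml += '  <sitemap>\n'
        # Escape &, < and > so URLs with query strings stay valid XML
        xml += f'    <loc>{escape(sitemap_url)}</loc>\n'
        xml += f'    <lastmod>{today}</lastmod>\n'
        xml += '  </sitemap>\n'

    xml += '</sitemapindex>'
    return xml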

# Usage
sitemaps = [
    'https://example.com/sitemap-posts.xml',
    'https://example.com/sitemap-pages.xml',
    'https://example.com/sitemap-products.xml',
]

index_xml = generate_sitemap_index(sitemaps)

# Write to file
with open('sitemap_index.xml', 'w', encoding='utf-8') as f:
    f.write(index_xml)

print("Sitemap index created successfully!")

For dynamic generation (Flask example):

from flask import Flask, Response
from datetime import datetime
import os

app = Flask(__name__)

@app.route('/sitemap_index.xml')
def sitemap_index():
    # Discover all sitemap files
    sitemap_files = []
    for filename in os.listdir('./sitemaps'):
        if filename.startswith('sitemap-') and filename.endswith('.xml'):
            sitemap_files.append(f'https://example.com/sitemaps/{filename}')

    # Generate index
    xml = '<?xml version="1.0" encoding="UTF-8"?>\n'
    xml += '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'

    for sitemap_url in sorted(sitemap_files):
        xml += '  <sitemap>\n'
        xml += f'    <loc>{sitemap_url}</loc>\n'
        xml += f'    <lastmod>{datetime.now().strftime("%Y-%m-%d")}</lastmod>\n'
        xml += '  </sitemap>\n'

    xml += '</sitemapindex>'

    return Response(xml, mimetype='application/xml')

Method 4: Node.js Script

Express.js example:

const express = require('express');
const app = express();

const sitemapFiles = [
  'https://example.com/sitemap-posts.xml',
  'https://example.com/sitemap-pages.xml',
  'https://example.com/sitemap-products.xml',
];

app.get('/sitemap_index.xml', (req, res) => {
  const now = new Date().toISOString().split('T')[0];

  let xml = '<?xml version="1.0" encoding="UTF-8"?>\n';
  xml += '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n';

  sitemapFiles.forEach(url => {
    xml += '  <sitemap>\n';
    xml += `    <loc>${url}</loc>\n`;
    xml += `    <lastmod>${now}</lastmod>\n`;
    xml += '  </sitemap>\n';
  });

  xml += '</sitemapindex>';

  res.header('Content-Type', 'application/xml');
  res.send(xml);
});

app.listen(3000);

Best Practices for Organizing Sitemaps

1. Organize by Content Type

Recommended structure:

sitemap_index.xml
├── sitemap-posts.xml (blog posts)
├── sitemap-pages.xml (static pages)
├── sitemap-products.xml (product pages)
├── sitemap-categories.xml (category pages)
└── sitemap-authors.xml (author archives)

Benefits:

  • Easy to debug issues with specific content types
  • Can exclude entire content types if needed
  • Clear organization in Search Console
  • Easier to regenerate specific sections

2. Organize by Update Frequency

For sites with mixed content freshness:

sitemap_index.xml
├── sitemap-daily.xml (news, trending content)
├── sitemap-weekly.xml (blog posts)
├── sitemap-monthly.xml (product pages)
└── sitemap-static.xml (rarely changing pages)

Benefits:

  • Google can prioritize fresh content
  • Accurate <lastmod> dates per sitemap
  • Efficient crawl budget usage

3. Organize by Date (For Large Archives)

For sites with massive archives:

sitemap_index.xml
├── sitemap-2025-11.xml
├── sitemap-2025-10.xml
├── sitemap-2025-09.xml
└── ... (one per month)

Benefits:

  • Old sitemaps rarely need regeneration
  • Easy to add new months
  • Historical organization
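
A date-based layout like this could be produced by bucketing pages on their publish month. The sketch below assumes you already have (URL, publish date) pairs from your CMS or database; `group_by_month` and the sample pages are hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple


def group_by_month(pages: Iterable[Tuple[str, date]]) -> Dict[str, List[str]]:
    """Bucket URLs into sitemap-YYYY-MM.xml file names, newest month first."""
    groups = defaultdict(list)
    for url, published in pages:
        groups[f"sitemap-{published:%Y-%m}.xml"].append(url)
    # File names sort chronologically because of the YYYY-MM pattern
    return dict(sorted(groups.items(), reverse=True))


# Hypothetical pages with their publish dates
pages = [
    ("https://example.com/a", date(2025, 11, 3)),
    ("https://example.com/b", date(2025, 10, 20)),
    ("https://example.com/c", date(2025, 11, 18)),
]

monthly = group_by_month(pages)
for filename, urls_in_month in monthly.items():
    print(filename, urls_in_month)
```

A busy month can still exceed 50,000 URLs, so a real generator would need to split oversized buckets further.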

4. Organize by Language/Region

For international sites:

sitemap_index.xml
├── sitemap-en-us.xml (English - US)
├── sitemap-en-gb.xml (English - UK)
├── sitemap-es-mx.xml (Spanish - Mexico)
├── sitemap-fr-fr.xml (French - France)
└── sitemap-de-de.xml (German - Germany)

Benefits:

  • Clear separation of language versions
  • Easier hreflang management
  • Can target specific regions in Search Console

Sitemap Index Limits and Rules

Hard Limits

Limit | Value | What Happens If Exceeded
Max sitemaps per index | 50,000 | Google won't process extras
Max file size (uncompressed) | 50MB | Google won't process
Max file size (gzipped) | 10MB recommended | Slower processing

Best Practices

Keep individual sitemaps manageable:

  • Aim for 10,000-40,000 URLs per sitemap
  • Don't max out at exactly 50,000
  • Leave room for growth

Use consistent naming:

✅ Good:
sitemap-posts.xml
sitemap-pages.xml
sitemap-products.xml

❌ Bad:
blog.xml
static.xml
prod_sitemap.xml

Include lastmod dates:

<sitemap>
  <loc>https://example.com/sitemap-posts.xml</loc>
  <lastmod>2025-11-26</lastmod>  <!-- Include this! -->
</sitemap>

Use absolute URLs:

✅ Good:
<loc>https://example.com/sitemap-posts.xml</loc>

❌ Bad:
<loc>/sitemap-posts.xml</loc>
<loc>sitemap-posts.xml</loc>
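
A quick way to catch relative URLs before publishing is to require both a scheme and a host. This is a minimal sketch using the standard library; `is_absolute_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def is_absolute_url(loc: str) -> bool:
    """True if `loc` has an http(s) scheme and a host name."""
    parts = urlsplit(loc)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# The three <loc> values from the example above
candidates = [
    "https://example.com/sitemap-posts.xml",
    "/sitemap-posts.xml",
    "sitemap-posts.xml",
]
for loc in candidates:
    print(loc, "->", "ok" if is_absolute_url(loc) else "not absolute")
```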

Common Mistakes to Avoid

Mistake #1: Submitting Individual Sitemaps

Wrong approach:

  • Submit sitemap-posts.xml to Search Console
  • Submit sitemap-pages.xml to Search Console
  • Submit sitemap-products.xml to Search Console

Right approach:

  • Submit only sitemap_index.xml
  • Google automatically discovers child sitemaps

Mistake #2: Mixing URL and Sitemap Entries

Wrong (mixing in one file):

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
  </sitemap>
  <url>  <!-- Can't mix these! -->
    <loc>https://example.com/about</loc>
  </url>
</sitemapindex>

Right: Keep them separate. Index files contain only sitemap references, not URLs.

Mistake #3: Circular References

Wrong:

<!-- sitemap_index.xml -->
<sitemap>
  <loc>https://example.com/sitemap_index.xml</loc>  <!-- Don't reference itself! -->
</sitemap>

Right: Index files should only reference child sitemaps, never themselves.
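
Both of the last two mistakes are easy to catch with a small script before you upload. The sketch below is an assumption-laden illustration, not a full validator: `find_index_problems` is a hypothetical helper, and it only looks for stray non-<sitemap> children and for an index that lists its own URL.

```python
import xml.etree.ElementTree as ET

# ElementTree prefixes every tag with the namespace in braces
SM = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def find_index_problems(index_xml: bytes, index_url: str) -> list:
    """Return human-readable problems found in a sitemap index."""
    problems = []
    root = ET.fromstring(index_xml)
    if root.tag != SM + "sitemapindex":
        problems.append("root element is not <sitemapindex>")
    for child in root:
        if child.tag != SM + "sitemap":
            # e.g. a <url> entry mixed into the index
            problems.append(f"unexpected child element: {child.tag.replace(SM, '')}")
            continue
        loc = child.findtext(SM + "loc", default="").strip()
        if loc == index_url:
            problems.append("index references itself")
    return problems


# Deliberately broken sample combining both mistakes
BAD_INDEX = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap_index.xml</loc></sitemap>
  <url><loc>https://example.com/about</loc></url>
</sitemapindex>"""

problems = find_index_problems(BAD_INDEX, "https://example.com/sitemap_index.xml")
for problem in problems:
    print(problem)
```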

Mistake #4: Broken Child Sitemap URLs

If a child sitemap returns 404, Google can't process it.

How to avoid:

  • Test all child sitemap URLs before submitting
  • Use absolute URLs
  • Verify files are uploaded correctly
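
Testing the child URLs can be automated along these lines. Treat this as a sketch under assumptions: `http_status` and `find_broken` are hypothetical helpers, the check uses a plain HEAD request via `urllib`, and some servers may answer HEAD differently from GET. The demo at the bottom uses canned status codes so it runs without network access.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a HEAD request, or 0 if the host is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 or 500
    except urllib.error.URLError:
        return 0  # DNS failure, refused connection, timeout


def find_broken(
    urls: Iterable[str],
    fetch_status: Callable[[str], int] = http_status,
) -> Dict[str, int]:
    """Map each URL that did not return 200 to the status it returned."""
    statuses = {url: fetch_status(url) for url in urls}
    return {url: code for url, code in statuses.items() if code != 200}


# Offline demo: canned responses stand in for real requests
canned = {
    "https://example.com/sitemap-posts.xml": 200,
    "https://example.com/sitemap-pages.xml": 404,
}
broken = find_broken(canned, fetch_status=canned.get)
print(broken)
```

Dropping the `fetch_status` argument makes `find_broken` use real HEAD requests against your child sitemap URLs.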

Mistake #5: Outdated lastmod Dates

Wrong:

<sitemap>
  <loc>https://example.com/sitemap-posts.xml</loc>
  <lastmod>2020-01-01</lastmod>  <!-- 5 years old! -->
</sitemap>

Right: Update lastmod when the child sitemap changes.
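
One way to keep the index honest is to derive each entry's lastmod from the newest <lastmod> inside the child sitemap itself. The sketch below assumes every URL entry uses the same YYYY-MM-DD format, so that plain string comparison orders dates correctly; `newest_lastmod` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import Optional

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def newest_lastmod(child_sitemap_xml: bytes) -> Optional[str]:
    """Return the latest <lastmod> among a child sitemap's <url> entries."""
    root = ET.fromstring(child_sitemap_xml)
    dates = [
        el.text.strip()
        for el in root.findall("sm:url/sm:lastmod", NS)
        if el.text
    ]
    # Assumption: all dates share the YYYY-MM-DD format, so max() on strings works
    return max(dates) if dates else None


# Sample child sitemap, inlined for illustration
CHILD = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/post-1</loc><lastmod>2025-10-02</lastmod></url>
  <url><loc>https://example.com/post-2</loc><lastmod>2025-11-26</lastmod></url>
  <url><loc>https://example.com/post-3</loc></url>
</urlset>"""

latest = newest_lastmod(CHILD)
print(latest)
```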

Monitoring and Maintenance

Check Google Search Console

  1. Go to Sitemaps section
  2. Submit your sitemap index
  3. Wait 24-48 hours
  4. Check the "Discovered URLs" count

What to look for:

  • All child sitemaps discovered
  • No errors or warnings
  • URL count matches expectations

Regular Audits

Monthly checklist:

  • [ ] Verify all child sitemaps are accessible
  • [ ] Check for 404 errors in child sitemaps
  • [ ] Confirm lastmod dates are accurate
  • [ ] Review URL counts per sitemap
  • [ ] Check for new content types that need new sitemaps

Automated Monitoring

Set up alerts for:

  • Child sitemap 404 errors
  • Sitemap file size approaching limits
  • URL count approaching 50,000
  • Processing errors in Search Console
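
The size and URL-count alerts could start as something as simple as the check below. The 90% threshold, the `limit_warnings` helper, and the sample numbers are all assumptions for illustration; wire the output into whatever alerting you already use.

```python
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # assumes the 50MB limit is 52,428,800 bytes


def limit_warnings(name: str, url_count: int, size_bytes: int,
                   threshold: float = 0.9) -> list:
    """Return warnings for a sitemap at or above `threshold` of either limit."""
    warnings = []
    if url_count >= MAX_URLS * threshold:
        warnings.append(f"{name}: {url_count} URLs (limit {MAX_URLS})")
    if size_bytes >= MAX_BYTES * threshold:
        warnings.append(f"{name}: {size_bytes} bytes (limit {MAX_BYTES})")
    return warnings


# Hypothetical measurements: (file name, URL count, uncompressed size in bytes)
measurements = [
    ("sitemap-posts.xml", 12_000, 3 * 1024 * 1024),
    ("sitemap-products.xml", 46_000, 10 * 1024 * 1024),
]
for name, url_count, size_bytes in measurements:
    for warning in limit_warnings(name, url_count, size_bytes):
        print("WARNING", warning)
```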

Migrating to a Sitemap Index

Already have a single large sitemap? Here's how to migrate:

Step 1: Create child sitemaps

  • Split your existing sitemap by content type or date
  • Keep each under 50,000 URLs

Step 2: Create the index file

  • Reference all new child sitemaps
  • Use consistent naming

Step 3: Upload everything

  • Upload index and all child sitemaps
  • Test each URL to verify they work

Step 4: Update robots.txt

Old:
Sitemap: https://example.com/sitemap.xml

New:
Sitemap: https://example.com/sitemap_index.xml
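
If you manage robots.txt from a deploy script, the swap above might look like this. It is a sketch only: `update_robots` is a hypothetical helper that works on the file's text, replaces the old Sitemap line when it finds one, and appends the new line otherwise.

```python
def update_robots(robots_txt: str, old_url: str, new_url: str) -> str:
    """Point the Sitemap directive at `new_url`, leaving other lines untouched."""
    lines = []
    replaced = False
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        # Directive names are matched case-insensitively
        if sep and key.strip().lower() == "sitemap" and value.strip() == old_url:
            lines.append(f"Sitemap: {new_url}")
            replaced = True
        else:
            lines.append(line)
    if not replaced:
        lines.append(f"Sitemap: {new_url}")
    return "\n".join(lines) + "\n"


# Sample robots.txt content for illustration
current = (
    "User-agent: *\n"
    "Disallow: /admin/\n"
    "Sitemap: https://example.com/sitemap.xml\n"
)
updated = update_robots(
    current,
    "https://example.com/sitemap.xml",
    "https://example.com/sitemap_index.xml",
)
print(updated)
```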

Step 5: Submit to Search Console

  • Remove old sitemap
  • Submit new index
  • Monitor for errors

Step 6: Keep old sitemap temporarily

  • Leave old sitemap in place for 30 days
  • Helps with transition
  • Remove after Google processes new index

Advanced: Dynamic Sitemap Index Generation

For very large sites, generate the index dynamically based on available sitemaps.

Python example with auto-discovery:

import os
import glob
from datetime import datetime
from flask import Flask, Response

app = Flask(__name__)

@app.route('/sitemap_index.xml')
def sitemap_index():
    # Auto-discover all sitemap files
    sitemap_pattern = './sitemaps/sitemap-*.xml'
    sitemap_files = glob.glob(sitemap_pattern)

    # Generate index
    xml = '<?xml version="1.0" encoding="UTF-8"?>\n'
    xml += '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'

    for filepath in sorted(sitemap_files):
        filename = os.path.basename(filepath)
        mod_time = datetime.fromtimestamp(os.path.getmtime(filepath))

        xml += '  <sitemap>\n'
        xml += f'    <loc>https://example.com/sitemaps/{filename}</loc>\n'
        xml += f'    <lastmod>{mod_time.strftime("%Y-%m-%d")}</lastmod>\n'
        xml += '  </sitemap>\n'

    xml += '</sitemapindex>'

    return Response(xml, mimetype='application/xml')

Benefits:

  • Automatically includes new sitemaps
  • Uses actual file modification times
  • No manual updates needed

Next Steps

Now that you understand sitemap indexes:

  • Keep it under limits: 50,000 sitemaps per index, 50,000 URLs per sitemap
  • Monitor regularly: Check Search Console for errors and processing status
  • Update lastmod dates: Keep them accurate for efficient crawling

Bottom line: Sitemap indexes aren't just for massive sites. Even mid-sized sites benefit from the organization and flexibility they provide.

Ready to see how your sitemap index is structured? Visualize it with our tool to understand your site's organization at a glance.
