Troubleshooting: XML Sitemap when hosting on GitHub
TLDR: Don’t commit your auto-generated sitemap.xml to GitHub. If you do, the committed copy keeps being served and the sitemap never updates.
Why is a sitemap important?
Sitemaps encourage crawling. So if you want your site to be “discovered” by Google, it’s better to have an up-to-date sitemap. You can check your website’s indexing state in Google Search Console by going to the Coverage tab, or simply by typing site:yourwebsite.com in Google.
Note: For small sites with fewer than 500 pages and a well-linked structure, a sitemap is not strictly necessary, since Google can usually find all the pages by following links. Still, it’s better to monitor the indexing “health” of your site as it grows.
Context
I host my Minimal Mistakes Jekyll site on GitHub Pages and commit all the changes there. sitemap.xml is auto-generated by the jekyll-sitemap plugin.
Problem
Unindexed pages as shown in Google Search Console
The Index Coverage report in Google Search Console gave me 1 error message and 12 pages that were indexed but not submitted in the sitemap.
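To see which pages are missing from a sitemap without waiting for Search Console, you can compare the sitemap against a list of URLs you expect to be there. Below is a minimal Python sketch of that check. The function names, the example.com URLs and the sample sitemap are all made up for illustration; in practice you would feed it the sitemap.xml your live site actually serves.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def missing_from_sitemap(xml_text, site_urls):
    """Return the URLs from site_urls that the sitemap does not list."""
    listed = sitemap_urls(xml_text)
    return sorted(url for url in site_urls if url not in listed)


# Hypothetical stale sitemap: it predates the newest post.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-post/</loc></url>
</urlset>"""

# Hypothetical list of pages that exist on the site today.
EXAMPLE_PAGES = [
    "https://example.com/",
    "https://example.com/old-post/",
    "https://example.com/new-post/",
]

if __name__ == "__main__":
    for url in missing_from_sitemap(EXAMPLE_SITEMAP, EXAMPLE_PAGES):
        print("not in sitemap:", url)
```

If recent posts show up in the output, the sitemap being served is probably an old copy.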
Solution
Delete sitemap.xml from GitHub
Building the site locally showed no sitemap problems, so the jekyll-sitemap plugin worked as intended. The problem was that I had committed sitemap.xml to the GitHub repository that GitHub Pages builds my site from. When building the site, GitHub Pages also runs jekyll-sitemap to generate a fresh sitemap.xml. But because a file with that name already existed in the source (the old version I had committed earlier), the plugin left it in place and the new version was never written.
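A quick way to catch this is to check whether any build output is tracked by git. The sketch below is one possible check, not part of Jekyll or GitHub Pages: the helper names and the GENERATED_FILES set are my own assumptions, and it relies only on the standard `git ls-files` command.

```python
import subprocess

# Assumption: files that the site build generates and that should not be committed.
GENERATED_FILES = {"sitemap.xml"}


def tracked_files(repo_dir="."):
    """Return the paths git currently tracks in repo_dir."""
    result = subprocess.run(
        ["git", "ls-files"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def committed_generated_files(tracked_paths, generated=GENERATED_FILES):
    """Return the generated files that appear among the tracked paths."""
    return sorted(path for path in tracked_paths if path in generated)


if __name__ == "__main__":
    for path in committed_generated_files(tracked_files()):
        print("generated file is committed:", path)
```

If sitemap.xml is reported, running `git rm sitemap.xml`, then committing and pushing, removes it from the repository so the next build can generate a fresh one.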
The mistake is really stupid. I hope this post helps you avoid making it.