
Create Sitemaps and use a Sitemap Index


Sitemaps are often overlooked when creating a website, largely because most people are unaware of their benefits or of how to set them up. A properly formatted sitemap index and set of sitemap files, submitted to Google Webmaster Tools, can increase the number of indexed pages, because Google (and other search engines) gain visibility into parts of your website that they might not otherwise find. For new websites, it also helps your pages get indexed more quickly and with greater coverage. Let’s walk through how it works.

Depending on the size of your website, you may want to split your URLs into several sitemaps, all of which are referenced from a sitemap index file. This is useful for websites with more than 30 or so webpages, and can help more of your pages get indexed, and faster. It also organizes things a bit better. I’ll get to that later.

Also worth noting: there are several tools that can generate sitemaps, but it is still highly desirable to have your developers, if you have developers, customize a sitemap to exclude any pages with session IDs, shopping carts, etc. You can do this manually, but it takes more time. A few tools that crawl websites fairly well include GSiteCrawler and Xenu Link Sleuth. The only downside is that they basically see what search engines see. If your site has pages that aren’t linked to from anywhere on the site, these tools will not see them, and you need to add those to the sitemaps manually.
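If you do have a developer on hand, the exclusion step could look roughly like the Python sketch below. The parameter names in SESSION_PARAMS and the path segments in EXCLUDED_SEGMENTS are my own guesses for illustration, not something any particular tool uses, so swap in whatever your site actually puts in its URLs.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical exclusion rules: adjust these to match your own site.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}
EXCLUDED_SEGMENTS = {"cart", "checkout"}


def is_sitemap_worthy(url):
    """Return True if the URL is not a cart page and carries no session ID."""
    parts = urlsplit(url)
    # Compare whole path segments so "/cartoons/" is not mistaken for "/cart/".
    segments = {segment.lower() for segment in parts.path.split("/") if segment}
    if segments & EXCLUDED_SEGMENTS:
        return False
    query_keys = {key.lower() for key in parse_qs(parts.query, keep_blank_values=True)}
    return not (query_keys & SESSION_PARAMS)


def filter_urls(urls):
    """Keep only the URLs that belong in a sitemap, preserving their order."""
    return [url for url in urls if is_sitemap_worthy(url)]
```

Run your crawler's URL list through filter_urls before writing any sitemap files, and the session and cart pages never make it in.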

Standard Sitemap Format:

This format should be used for each sitemap, whether that sitemap is the only one on your site or if it is included as one of many.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.decisivedesign.com</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.decisivedesign.com/webpage1</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>

The urlset tag is required, including the URL for the namespace (xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"). The url and loc tags are also required. Lastmod, changefreq, and priority are all optional. You can read more about them on the Sitemaps.org protocol page.

If you have a small website, with no large categories of products, services, or pages, you can put all of your URLs in one sitemap file using the above code as an example (I would generate it using one of the above tools, though), save it in your root web directory as sitemap.xml (or something else), and submit it to the search engines. Bam, you’re done.
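If you would rather script the file than type it out, here is a minimal Python sketch using only the standard library. The function name build_sitemap and its lastmod argument are my own invention for illustration. It writes the required urlset, url, and loc tags, and adds lastmod only if you pass a value.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, lastmod=None):
    """Return sitemap XML as a string for the given page URLs.

    Only <urlset>, <url>, and <loc> are written by default; <lastmod>
    is added to every entry when a value is passed in.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in urls:
        url_el = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL for us.
        ET.SubElement(url_el, "loc").text = page_url
        if lastmod is not None:
            ET.SubElement(url_el, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")
```

Writing the returned string to sitemap.xml in your web root gives you the single-file setup described above.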

Splitting sitemaps and using sitemap indexes:

Is your website a bit larger? Do you want quicker indexing of your pages and a larger percentage of them to get indexed? Read on.

If your website is large and has product or service categories, or any hierarchical navigation, you can use that as a logical way to split your sitemaps up. This will keep things organized, and we have also seen a larger percentage of our webpage URLs get indexed, and faster, using this method.

Setup

Let’s say the DecisiveDesign site sold web browsers (it’s a fictional story, people!).

You have Internet Explorer, Firefox, and Chrome (only 3, to keep it simple). These are your categories, silos, types of product, whatever you want to call them.

Then you have individual pages within each silo, let’s say 3 articles each (at individual URLs).

So what we want here is a sitemap for each set of articles, then a sitemap index that references those 3 sitemaps.
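If your URLs already carry the category in the path, the split itself can be automated. The Python sketch below assumes (and this is only an assumption about your URL layout) that the first path segment is the silo name, as in /Firefox/some-article.html. Pages sitting directly in the site root get filed under a made-up "root" group.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_category(urls):
    """Group page URLs by their first path segment (the silo)."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        # A page with no directory above it has no category of its own.
        category = segments[0] if len(segments) > 1 else "root"
        groups[category].append(url)
    return dict(groups)
```

Each key in the result becomes one sitemap file. Note that this would name the Internet Explorer group "Internet-Explorer", after its URL path.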

Sitemaps

Explorer-sitemap.xml

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.decisivedesign.com/Internet-Explorer/IE-text-shadow-not-recognized.html</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.decisivedesign.com/Internet-Explorer/IE-not-following-web-standards.html</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.decisivedesign.com/Internet-Explorer/why-doesnt-IE-work-right.html</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>

 

Firefox-sitemap.xml

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.decisivedesign.com/Firefox/firefox-is-a-good-browser.html</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.decisivedesign.com/Firefox/firefox-renders-text-shadow-kind-of-funny.html</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.decisivedesign.com/Firefox/use-firefox-instead-of-IE.html</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>

 

Chrome-sitemap.xml

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.decisivedesign.com/Chrome/chrome-is-probably-the-fastest-browser.html</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.decisivedesign.com/Chrome/chrome-will-continue-improving.html</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.decisivedesign.com/Chrome/google-owns-our-interwebs.html</loc>
    <lastmod>2011-08-26T05:11:19Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>

 

Sitemap Index (sitemap-index.xml)

Now it’s time to create the sitemap index, which will reference the URLs of the sitemaps you created above. In this instance, this is the only file you need to submit to search engines like Google. Their spiders will follow the index and pick up the individual sitemaps and the URLs within them automatically. Host this file as sitemap-index.xml (or similar) in the root directory of your site, and make sure to upload all the other sitemap files to whatever path you choose. Then submit it to your search engines.

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.decisivedesign.com/sitemaps/Explorer-sitemap.xml</loc></sitemap>
  <sitemap><loc>http://www.decisivedesign.com/sitemaps/Firefox-sitemap.xml</loc></sitemap>
  <sitemap><loc>http://www.decisivedesign.com/sitemaps/Chrome-sitemap.xml</loc></sitemap>
</sitemapindex>
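The index file is simple enough to generate as well. Here is a rough Python sketch using the standard library. The function name build_sitemap_index is mine, and it assumes you already know the public URL where each sitemap file will live.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML as a string, one <sitemap> entry per URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap_el = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap_el, "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")
```

Pass it the three sitemap URLs from the example above and write the result to sitemap-index.xml.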

That’s about it. It looks simple (or it may not), but honestly I manually pulled out 85,000 URLs today into 3 sets of 51 sitemaps… and it took me several hours. I highly suggest having a coder run through it and automate the process if you have the means. However, as in my case, doing it now and spending the time on it will likely still be worth every minute.
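For anyone who does have a coder handy, the tedious part of what I did by hand is mostly just cutting a long URL list into file-sized pieces. A minimal Python sketch is below. The function name chunk_urls is made up, and the default of 50,000 is the per-file URL limit in the Sitemaps.org protocol. You can pass a much smaller number if, like me, you prefer lots of small sitemaps.

```python
def chunk_urls(urls, max_per_file=50000):
    """Yield successive lists holding at most max_per_file URLs each."""
    if max_per_file < 1:
        raise ValueError("max_per_file must be at least 1")
    for start in range(0, len(urls), max_per_file):
        yield urls[start:start + max_per_file]
```

Each list it yields becomes one sitemap file, and the file URLs then go into the sitemap index.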

Let me know if you have any questions – hope I haven’t missed anything or poorly explained something, it’s kinda late!

 

7 Comments

  • Hi there! Would you mind if I share your blog with my twitter group? There’s a lot of folks that I think would really enjoy your content. Please let me know. Cheers

    • Jason

      Of course not!

  • I am really loving the theme/design of your site. Do you ever run into any browser compatibility issues? A couple of my blog readers have complained about my site not working correctly in Explorer but looks great in Firefox. Do you have any suggestions to help fix this problem?

    • Jason

      Internet Explorer has many compliancy issues. They attempt to create their own standards and push them because they have a lot of software that is widespread. Things are slowly changing though. I always try to let people know about IE’s shenanigans and offer better alternatives like Google Chrome & Firefox, but until everyone is standardized we will have to hack things to appear correctly in IE, unfortunately.

      If you are having specific problems, Google the issues and see if you can’t find fixes/hacks/tweaks that will make it look ‘right’ in IE.

  • When I originally commented I clicked the “Notify me when new comments are added” checkbox and now each time a comment is added I get four e-mails with the same comment. Is there any way you can remove me from that service? Appreciate it!

    • Jason

      Perhaps it is because you have made multiple comments. I’ll see if I can track it down.

  • Thanks a lot…

    Hi, I really appreciate your post, it seems that you know what are you doing. I’ll be looking forward for your next post….
