Truth about Sitemaps

In all my years of WordPress development, traffic growth, and SEO work, I have seen many clients with wildly mistaken beliefs about XML sitemaps. They’re useful, for sure, but like any tool, a little training and background on how all the pieces work goes a long way.

Submitting A Sitemap Doesn’t Make Google Index Your Site

The most common misunderstanding is that the XML sitemap helps get your pages indexed. The first thing we’ve got to get straight is this: Google does not index your pages just because you asked nicely. Google indexes pages because (a) it found and crawled them, and (b) it considers them good enough quality to be worth indexing. Pointing Google at a page and asking it to index it doesn’t accomplish much on its own.

Understand that by submitting an XML sitemap to Google Webmaster Tools, you’re giving Google a signal that you consider these pages good-quality content, ready to be indexed.

Consistency is the key to success

One of the most common errors I see clients make is a lack of consistency in their messaging to Google about a page. If you block a page in robots.txt and then include it in an XML sitemap, you’re teasing Google: the sitemap offers up a page, and your robots.txt takes it away. The same goes for meta robots: don’t include a page in an XML sitemap and also set its meta robots to “noindex,follow.”
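As a concrete sketch of that conflict (the /members/ path is hypothetical), a robots.txt like this forbids crawling a directory:

```text
# robots.txt: tells every crawler not to fetch anything under /members/
User-agent: *
Disallow: /members/
```

If sitemap.xml simultaneously lists `https://example.com/members/profile/`, you are asking Google to index a page it is not allowed to crawl, which is exactly the mixed signal to avoid.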

While I’m at it, let me rant briefly about meta robots: “noindex” means don’t index the page. “Nofollow,” despite a common misconception, says nothing about indexing; it means “don’t follow the outbound links from this page.” In all my experience, I have found very few reasons to ever set meta robots to “noindex,nofollow.”
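For reference, these are the standard tags as they would appear in a page’s head (the values are the generic ones, not anything site-specific):

```html
<!-- Keep this page out of the index, but still crawl it and
     let equity pass through its outbound links -->
<meta name="robots" content="noindex,follow">

<!-- Indexing is unaffected; just don't follow this page's outbound links -->
<meta name="robots" content="nofollow">
```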

Every page on your website should fall into one of two categories:

  • Premium-level copy that you hope to have Google index and use as a landing page for your website.
  • Secondary pages (useful to visitors, but not anything you would expect to be a search landing page).

Both categories are important; the biggest difference is that you want to promote the premium-level copy more, so you should always make sure it is showing up in your sitemaps correctly. If you are using your secondary pages to promote your primary content, make sure you use internal links and relevant SEO keywords to help build up that linking.

General Site Quality

It would appear that Google measures overall site quality and uses that site-wide metric to influence rankings.

Look at this from Google’s standpoint: you could have one perfect page with all the proper SEO, the right amount of content, and everything else Google looks for, alongside 300 other pages of content that are not nearly as good. Google doesn’t want to send people to a site like this, because it knows that visitors will explore beyond that one page. Really, why would it want to send someone to a website like that?

Google understands that every website has a certain number of “secondary” pages that are useful to users but aren’t the kind of content pages that should be landing pages from search: pages for sharing content, leaving comments, logging in, retrieving a lost password, and so on.

If your XML sitemap includes all of these pages, what are you telling Google? That you have no idea which pages are your quality content and which are secondary.

Here’s what you need to be telling Google instead: yes, we have a website here with 300 pages; these 75 are our primary content pages, and you can deprioritize the others, because they’re secondary.
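In sitemap terms, that message is simply a sitemap that lists only the primary pages. A minimal sketch (the URL is hypothetical):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- one of the 75 primary content pages -->
  <url>
    <loc>https://example.com/guide-to-widgets/</loc>
  </url>
  <!-- ...74 more primary URLs; login, lost-password, and other
       secondary pages are deliberately left out -->
</urlset>
```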

Behind the Scenes

Keep in mind: Google is going to use what you submit in your XML sitemap as a hint about what’s likely important on your website. Just because a page isn’t in your XML sitemap doesn’t mean Google will ignore it. You might still have many thousands of pages with barely enough content and link equity to get indexed that really shouldn’t be.

It’s important to do a site: search to see all the pages Google is indexing from your site, uncover pages you overlooked, and clean those out of that “average grade” Google is going to give your site by setting meta robots to “noindex,follow” (or blocking them in robots.txt). The weakest pages that still made the index are likely to be listed last in a site: search.

Noindex vs. robots.txt

There’s a vital but subtle distinction between using meta robots and using robots.txt to prevent indexation of a page. Meta robots “noindex,follow” allows the link equity flowing into that page to pass out to the pages it links to. If you block the page with robots.txt instead, Google never crawls it, so any link equity pointed at it is simply thrown away rather than helping your other pages.

Consider a page like an About page or a Privacy Policy page, which is probably linked to from every page on your website via the main menu or the footer. There’s a ton of link juice flowing to those pages; do you want to throw that away? Or would you rather let that link equity flow back out to everything in your main menu?
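To make the difference concrete, here are both options for a hypothetical privacy-policy page:

```text
Option 1: meta robots on the page itself
  <meta name="robots" content="noindex,follow">
  Google still crawls the page, drops it from the index, and the
  link equity flowing in passes through to the pages it links to.

Option 2: robots.txt
  User-agent: *
  Disallow: /privacy-policy/
  Google never crawls the page, so the equity pointed at it is wasted.
```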

Indexation debugging

Let’s say you run a shopping website with 30,000 product pages, 2,000 category pages, and 10,000 subcategory pages. You submit your XML sitemap of 42,000 pages and find that Google is indexing only 23,000 of them. How do you figure out which pages it is indexing?

First of all, your category and subcategory pages are probably VERY important search targets for you. I would create a category-sitemap.xml and a subcategory-sitemap.xml and submit those separately. Using SEO by Yoast Premium will make sure this happens for you, the right way. You should expect close to 100% indexation here; if you’re not getting it, you need to create more content, build more links to it, and keep checking what gets indexed.

Chances are, the trouble lies with some of the 30,000 product pages. But which ones?

The best way to handle this is to split your site into multiple sitemaps and see which ones are underperforming. Or, again, use SEO by Yoast Premium, and it will do almost all of the work for you.
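If you’re not on WordPress, a short script can do the splitting. This is a minimal sketch, assuming a hypothetical URL scheme of https://example.com/&lt;type&gt;/&lt;slug&gt;; it groups URLs by their first path segment and emits one sitemap per group using only Python’s standard library:

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string for the given list of URLs."""
    ET.register_namespace("", NS)  # emit xmlns without a prefix
    urlset = ET.Element(f"{{{NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{NS}}}url")
        ET.SubElement(entry, f"{{{NS}}}loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def split_by_type(urls):
    """Group URLs by their first path segment (e.g. products,
    categories, subcategories) and build one sitemap per group."""
    groups = {}
    for url in urls:
        # hypothetical scheme: https://example.com/<type>/<slug>
        page_type = url.split("/")[3]
        groups.setdefault(page_type, []).append(url)
    return {t: build_sitemap(us) for t, us in groups.items()}


sitemaps = split_by_type([
    "https://example.com/products/widget-a",
    "https://example.com/products/widget-b",
    "https://example.com/categories/widgets",
])
print(sorted(sitemaps))                     # ['categories', 'products']
print("widget-a" in sitemaps["products"])   # True
```

Submit each resulting file separately, then compare indexation per file to isolate which group of pages Google is skipping.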

If you decide to do it manually, or you’re not using WordPress, start by narrowing down the problem. These are good places to start:

  • If a product page doesn’t have a product image, it typically won’t be included in the sitemap automatically, even if you are using SEO by Yoast
  • Pages with less than 300 words of text will likely not be indexed either

Dynamic Sitemaps with SEO by Yoast

You don’t have to build sitemaps manually, and you shouldn’t; they need to be dynamic, just like your website. Take the time to set them up right the first time, and you will be rewarded with positive growth for your website and your business.


Whether you are building a new website or fixing an old one, your sitemap is a powerful tool that should be used to help drive traffic to your website. Just remember that a properly built sitemap can make your website thrive, while a poor one can hurt it. We build every website that goes through our company with SEO by Yoast for a reason: it works.

As always, we are here to help you and your company grow. If you have any questions for the Tier One staff, please let us know.
