Thursday, December 26, 2013

Google: Sitemaps Do Not Guarantee Indexing

This is likely obvious to most readers here, but it bears repeating: submitting an XML sitemap to Google does not mean that every page in that sitemap will be indexed. A Google Webmaster Help thread has Google's Gary Illyes responding to a question about why a site that has submitted 40,000 pages only has 100 pages indexed in Google.
For example, here are two sites that have submitted their URLs to Google via an XML sitemap file. One has submitted 17,987 pages and Google has actually indexed all of them, plus one. :) The other has submitted over 7 million pages, but Google has only indexed about 4 million of them, which is about 53% of the pages submitted.
Why did Google index all the pages on one site but only about half on the other? And why did Google index only 100 of the 40,000 pages submitted by the site in the thread above?
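To put those numbers side by side, here is a minimal Python sketch that turns submitted and indexed counts into a coverage percentage. The function name index_coverage is made up for illustration, and only the exact figures quoted above are used; the 7 million site is left out because its counts are only approximate.

```python
def index_coverage(submitted: int, indexed: int) -> float:
    """Return indexed pages as a percentage of submitted pages."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive page count")
    return 100.0 * indexed / submitted


# Figures quoted in this post (hypothetical variable names).
small_site = index_coverage(17_987, 17_988)  # all submitted pages, plus one
forum_site = index_coverage(40_000, 100)     # the site from the help thread

print(f"{small_site:.2f}%")  # 100.01%
print(f"{forum_site:.2f}%")  # 0.25%
```

Coverage can slightly exceed 100% when Google indexes a page that was not listed in the sitemap, which is what happened with the first site.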

Gary from Google explains:
First and foremost, submitting a Sitemap doesn't guarantee the pages referenced in it will be indexed. Think of a Sitemap as a way to help Googlebot find your content: if the URLs weren't included in the Sitemap, the crawlers might have a harder time finding those URLs and thus they might be indexed slower. Another thing you want to pay attention to is that our algorithms may decide not to index certain URLs at all. For instance, if the content is shallow, it may totally happen it will not be indexed at all.
Google may take a look and decide, based on the content or the PageRank, that the page is not worth indexing.
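For readers who have not looked inside one, a sitemap is just a list of URLs in a small XML format. The sketch below builds a minimal one with Python's standard library, assuming the sitemaps.org 0.9 schema; build_sitemap and the example.com URLs are placeholders, not anything from the thread. Listing a URL this way only helps Googlebot discover it; as Gary says, it does not guarantee indexing.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(root, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Placeholder URLs for illustration only.
xml_text = build_sitemap([
    "https://example.com/",
    "https://example.com/about",
])
print(xml_text)
```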
Forum discussion at Google Webmaster Help.
