by Kaitlyn Smeland Dhanaliwala on October 14th, 2009
Vanessa Fox wrote a post yesterday over at Search Engine Land on a couple of new features being tested in the labs section of Google Webmaster Tools. One of the new tools available to webmasters is called Fetch as Googlebot.
The Fetch as Googlebot tool displays a page’s code exactly as the Googlebot sees it when crawling the site’s content. Within the Webmaster Tools interface, you can access the tool through “Labs” in the left-hand navigation. Once you add a URL to be fetched, wait until the status reads “Success,” then click the “Success” link to view the page’s code.
Here’s an example of what you’ll see:
At first glance, this looks a lot like the source code that you can view through any old browser. But there are a few additional attributes of the data from Fetch as Googlebot:
- The HTTP header information is displayed. The HTTP header is the response provided by the site’s server to a request from the Googlebot. The HTTP header includes the:
- response status (200, 404, 301, etc.)
- type of server
- cookie settings for the domain
- time/date last modified
- content type (html, xml, etc.)
- You can check for instances of cloaking. Cloaking is the practice of delivering one page to regular users and a different page to search engines. The aim is usually to achieve higher rankings for certain search queries without changing the content visible to users. (To be clear, this is not allowed by Google and is generally considered a “black hat” SEO technique!) Obviously this information will not be clearly visible from the source code in your browser, because you (a user) would not trigger the content that might be intended exclusively for search engine bots. The Fetch as Googlebot tool, however, would reveal any potential cloaking. And as Vanessa Fox points out, if you’re a consultant joining a project, this tool lets you spot any instances of cloaking right away.
- The Fetch as Googlebot content is retrieved and delivered in real time. You don’t have to wait for the Googlebot to re-crawl your site.
- Therefore, this might be an especially useful tool when testing site edits and redirects immediately after pushing those changes live.
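Outside of Webmaster Tools, you can approximate the same check from a short script: request the page with Googlebot’s User-Agent string and inspect the status, headers, and body yourself. Here’s a rough Python sketch — the URL is a placeholder, and note that this only mimics Googlebot’s User-Agent, not its IP addresses, so cloaking keyed to IP lookups will slip past it:

```python
import urllib.request

# Placeholder URL -- substitute a page on your own site.
URL = "http://www.example.com/"

# Googlebot's published User-Agent string.
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")
# A typical browser User-Agent, for comparison.
BROWSER_UA = "Mozilla/5.0 (Windows; U; Windows NT 5.1)"

def fetch(url, user_agent):
    """Request url with the given User-Agent; return (status, headers, body)."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        return resp.getcode(), dict(resp.headers), resp.read()

if __name__ == "__main__":
    status, headers, bot_body = fetch(URL, GOOGLEBOT_UA)

    # The same fields Fetch as Googlebot reports:
    print("Status:", status)
    print("Server:", headers.get("Server"))
    print("Content-Type:", headers.get("Content-Type"))
    print("Last-Modified:", headers.get("Last-Modified"))
    print("Set-Cookie:", headers.get("Set-Cookie"))

    # Crude cloaking check: does a "bot" get different HTML than a browser?
    _, _, browser_body = fetch(URL, BROWSER_UA)
    if bot_body != browser_body:
        print("Bodies differ -- possible cloaking (or just dynamic content).")
```

Because the request happens live, this also works for sanity-checking a redirect or a content change the moment you push it, the same way the real-time fetch in Webmaster Tools does.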
However, one downside of the new tool is that it does not tell you whether the Googlebot can access content embedded in rich media such as a Flash presentation. Since search engines typically cannot read Flash, this is still something you need to check on your site yourself.
For example, here is a Flash-rich page for the Samsung Jet phone. As you can see, there is a little bit of text describing the phone’s attributes and processing power, which Samsung might want to be crawled.
But what does the search engine see?
Not very descriptive content, huh?
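If you want to run that check on your own pages, one rough approach is to fetch the raw HTML yourself and see whether the descriptive text appears anywhere outside the embedded .swf object — text rendered only inside the Flash movie lives in the binary, not the HTML, so a crawler never sees it. A minimal sketch, where the URL and phrases are made-up placeholders:

```python
import urllib.request

# Hypothetical placeholders -- use your own page and marketing copy.
URL = "http://www.example.com/flash-product-page"
KEY_PHRASES = ["processing power", "product specifications"]

def phrases_in_html(url, phrases):
    """Return {phrase: bool} for whether each phrase appears in the raw HTML.

    Text that exists only inside a Flash (.swf) object will come back
    False here, which is exactly the gap this check exposes.
    """
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return {phrase: phrase in html for phrase in phrases}

if __name__ == "__main__":
    for phrase, found in phrases_in_html(URL, KEY_PHRASES).items():
        print("FOUND  " if found else "MISSING", phrase)
```

Any phrase that comes back MISSING is a candidate for moving into plain HTML (or into an alternate text block) so search engines can index it.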