Sebastian's Pamphlets

If you've read my articles somewhere on the Internet, expect something different here.

MOVED TO SEBASTIANS-PAMPHLETS.COM

Please click the link above to read current posts. This archive will disappear soon!

Stay tuned...

Monday, July 18, 2005

Mozilla-Googlebot Helps with Debugging

Tracking Googlebot-Mozilla is a great way to discover bugs in a Web site. Try it yourself by filtering your logs for her user agent name:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
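If you don't have a log analyzer at hand, a few lines of script will do. Here is a minimal sketch in Python, assuming a plain-text access log in which each request line contains the full user agent string. The function name and the sample lines (including the IP addresses) are made up for illustration.

```python
# User agent string quoted in the post.
GOOGLEBOT_MOZILLA_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def googlebot_mozilla_lines(lines):
    """Yield only the log lines that contain the Googlebot-Mozilla user agent."""
    for line in lines:
        if GOOGLEBOT_MOZILLA_UA in line:
            yield line.rstrip("\n")


# Hypothetical sample lines; a real log would be read from a file instead,
# e.g. with open("access.log") as f: hits = list(googlebot_mozilla_lines(f))
sample_log = [
    '192.0.2.1 - - [18/Jul/2005:10:00:00 +0000] "GET /page.php?var=val&&&& HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"\n',
    '192.0.2.2 - - [18/Jul/2005:10:00:05 +0000] "GET /page.php HTTP/1.1" '
    '200 512 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"\n',
    '192.0.2.3 - - [18/Jul/2005:10:00:09 +0000] "GET /index.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/20050511 Firefox/1.0.4"\n',
]

hits = list(googlebot_mozilla_lines(sample_log))
for hit in hits:
    print(hit)
```

Matching on the complete string rather than just "Googlebot" is deliberate: it keeps the plain Googlebot/2.1 requests out of the result, so only her visits remain.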

Although Googlebot-Mozilla can add pages to the index, I see her mostly digging in 'fishy' areas. For example, she explores URLs where I redirect spiders to a page without a query string to avoid indexing of duplicate content. She is very interested in pages with a robots NOINDEX,FOLLOW tag when she knows another page carrying the same content, available from a similar URL but stating INDEX,FOLLOW. She goes after unusual query strings like 'var=val&&&&', which resulted from a script bug fixed months ago but are still represented by probably thousands of useless URLs in Google's index. She fetches a page using two different query strings, checking for duplicate content and alerting me to a superfluous input variable used in links on a forgotten page. She fetches dead links to read my very informative error page ... and her best friend is the AdSense bot, since they seem to share IPs as well as an interest in page updates before Googlebot is aware of them.
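Two of these patterns are easy to spot mechanically once her requests are extracted. The following is only a rough sketch, assuming you already have the list of request URLs she fetched; the helper names and the sample URLs are hypothetical. It flags query strings with empty segments (the 'var=val&&&&' kind) and paths she requested with more than one query string.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def has_empty_query_parts(url):
    """Return True if the query string contains empty segments, as in 'var=val&&&&'."""
    query = urlsplit(url).query
    return query != "" and "" in query.split("&")


def paths_with_several_queries(urls):
    """Map each path to its distinct query strings, keeping only paths seen with more than one."""
    seen = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        seen[parts.path].add(parts.query)
    return {
        path: sorted(queries)
        for path, queries in seen.items()
        if len(queries) > 1
    }


# Hypothetical URLs standing in for what was pulled from the logs.
fetched = [
    "/page.php?var=val&&&&",
    "/page.php?var=val",
    "/about.html",
    "/list.php?id=7&sort=",
]

broken = [url for url in fetched if has_empty_query_parts(url)]
duplicates = paths_with_several_queries(fetched)

print(broken)
print(duplicates)
```

A path showing up in the second result is not necessarily a bug, but it is a good candidate for a duplicate content check or a leftover input variable.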

