I checked my website bandwidth overview tonight. So far for the month of January, the bandwidth served from the main multimedia.cx domain is actually much higher than the bandwidth served by my gaming blog, which never happens (lots more pictures over there). I dug a little deeper into the details and found this:
So who is 66.249.67.1? Why, none other than crawl-66-249-67-1.googlebot.com. Why has it taken such an interest in my site? Oh, little pages like this:
66.249.67.1 - - [01/Jan/2009:00:58:01 -0500] "GET /fate/index.php?stderr=41851 HTTP/1.1" 200 69107 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.67.1 - - [01/Jan/2009:00:58:06 -0500] "GET /fate/index.php?build_record=43652 HTTP/1.1" 200 3297 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
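For anyone who wants to run the same kind of check against their own access log, here is a minimal Python sketch that adds up the bytes served per client IP. It assumes the log is in Apache's combined format with plain ASCII quotes and hyphens, as it would appear in the raw file. The function name `bytes_by_client` and the regular expression are my own illustration, not what the hosting stats package actually does.

```python
import re
from collections import Counter

# Combined Log Format:
#   ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def bytes_by_client(lines):
    """Return a Counter mapping client IP -> total response bytes."""
    totals = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        size = m.group("bytes")
        # Apache logs "-" when no body was sent; count that as 0 bytes.
        totals[m.group("ip")] += 0 if size == "-" else int(size)
    return totals

# The two requests quoted above, as they would appear in the raw log.
sample = [
    '66.249.67.1 - - [01/Jan/2009:00:58:01 -0500] '
    '"GET /fate/index.php?stderr=41851 HTTP/1.1" 200 69107 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.67.1 - - [01/Jan/2009:00:58:06 -0500] '
    '"GET /fate/index.php?build_record=43652 HTTP/1.1" 200 3297 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

totals = bytes_by_client(sample)
for ip, total in totals.most_common():
    print(ip, total)
```

On a real log you would pass the open file object instead of `sample`; `most_common()` then puts the hungriest clients at the top.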
You see, I thought I had administered my FATE web database responsibly by adding a robots exclusion file at http://fate.multimedia.cx/robots.txt that simply disallows crawlers. I completely neglected that http://multimedia.cx/fate/ is a perfectly valid route into the site. A robots.txt file only applies to the host that serves it, so the rules on fate.multimedia.cx say nothing about URLs under multimedia.cx.
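The oversight is easy to reproduce with Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt contents below are assumptions (a blanket disallow on the fate subdomain, no rules at all on the main domain), not copies of the real files, and the `Disallow: /fate/` rule at the end is one plausible fix rather than necessarily the one applied.

```python
from urllib.robotparser import RobotFileParser

def make_parser(robots_lines):
    """Build a parser from the lines of one host's robots.txt."""
    rp = RobotFileParser()
    rp.parse(robots_lines)
    return rp

# Assumed contents of http://fate.multimedia.cx/robots.txt
fate_host = make_parser(["User-agent: *", "Disallow: /"])

# Assumed contents of http://multimedia.cx/robots.txt (no rules)
main_host = make_parser([])

# Each parser stands for a single host's file, so the same page is
# checked against a different rule set depending on the route taken.
via_subdomain = fate_host.can_fetch(
    "Googlebot", "http://fate.multimedia.cx/index.php?stderr=41851")
via_main = main_host.can_fetch(
    "Googlebot", "http://multimedia.cx/fate/index.php?stderr=41851")

# Hypothetical fix: exclude the /fate/ path on the main domain too.
fixed_main = make_parser(["User-agent: *", "Disallow: /fate/"])
via_main_fixed = fixed_main.can_fetch(
    "Googlebot", "http://multimedia.cx/fate/index.php?stderr=41851")

print(via_subdomain, via_main, via_main_fixed)
```

The crawler is turned away at the subdomain but walks straight in through the main domain until that host's own robots.txt excludes the path as well.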
Now you can search some FATE stuff in Google…
http://www.google.com/search?num=100&hl=en&c2coff=1&safe=off&rls=en&hs=S2Q&q=site%3Amultimedia.cx%2Ffate&btnG=Search