xisto Community
Zero Ziat

Help! I Am Hit... By the GOOGLEBOT


Yeah, on the very first day, while I was still setting up my website, my Drupal CMS logged this warning:

 

Type: page not found
Date: Sunday, September 17, 2006 - 22:09
User: Anonymous/Guest
Location: http://forums.xisto.com/no_longer_exists/
Referrer:
Message: robots.txt not found.
Severity: warning
Hostname: 66.249.65.133


So I ran a tracert on that mysterious IP, since NO ONE had an account on my site except ME.

 

CONSOLE
Tracing route to crawl-66-249-65-133.googlebot.com [66.249.65.133]
over a maximum of 30 hops:

 1  41 ms  19 ms  19 ms  rcor2bras1.antel.net.uy [*]
 2  21 ms  18 ms  20 ms  gcorecor2-acc.antel.net.uy [*]
 3  22 ms  19 ms  20 ms  ibordeagu1-backb.antel.net.uy [*]
 4  179 ms  178 ms  178 ms  POS1-1.GW4.MIA4.ALTER.NET [157.130.83.137]
 5  180 ms  178 ms  225 ms  0.so-0-1-0.XL1.MIA4.ALTER.NET [152.63.82.126]
 6  202 ms  214 ms  253 ms  0.so-7-0-0.XL1.CHI2.ALTER.NET [152.63.68.81]
 7  201 ms  201 ms  286 ms  POS6-0.BR3.CHI2.ALTER.NET [152.63.68.1]
 8  206 ms  201 ms  205 ms  so-4-2-1.mpr1.ord7.us.above.net [64.125.12.245]
 9  205 ms  201 ms  202 ms  so-1-0-0.cr2.ord2.us.above.net [64.125.30.142]
10  205 ms  203 ms  222 ms  so-2-0-0.cr1.dca2.us.above.net [64.125.27.14]
11  207 ms  202 ms  202 ms  so-0-0-0.cr2.dca2.us.above.net [64.125.29.122]
12  204 ms  205 ms  204 ms  so-6-0-0.mpr2.iad1.us.above.net [64.125.28.130]
13  203 ms  204 ms  204 ms  so-3-0-0.mpr1.iad2.us.above.net [64.125.29.134]
14  209 ms  205 ms  205 ms  main1.above.net [209.249.73.66]
15  206 ms  204 ms  205 ms  crawl-66-249-65-133.googlebot.com [66.249.65.133]

Trace complete.

 

Well, if you haven't seen it before, that guy is the Googlebot, which is basically just a bot (duh) that crawls pages, indexes keywords and similar stuff, and puts it into Google's database so the page shows up when someone queries that keyword or a similar one.

 

I am sure it will be working by tomorrow, but if you search my name on Google you will get a lot of gaming profiles and a lot of Spanish/English stuff. (:) Stalk, Stalk!)

 

What tipped me off was the IP (I remembered some of the octets) and the fact that it was looking for robots.txt (a file that tells bots, whether they are programmed for indexing or for something else such as translation engines, which parts of your website they should stay out of).
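If you'd rather not eyeball a tracert every time, the same check can be sketched in Python: do a reverse DNS lookup on the IP, see whether the name ends in a Google crawler domain, then do a forward lookup on that name and confirm it points back to the same IP. This is only a rough sketch. The function names and the list of suffixes are my own assumptions (the googlebot.com one comes from the trace above), so check Google's current documentation before relying on it.

```python
import socket
from typing import Callable, Sequence

# Assumed suffixes for Google's crawler hostnames (illustrative, not exhaustive).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Return the primary hostname that reverse DNS reports for ip."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> Sequence[str]:
    """Return the IPv4 addresses that hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], Sequence[str]] = forward_lookup,
) -> bool:
    """Reverse-then-forward DNS check for a visitor claiming to be Googlebot."""
    try:
        hostname = reverse(ip)
    except OSError:
        # No reverse DNS record (or no network): cannot confirm anything.
        return False
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # Anyone can fake a reverse record, so the forward lookup must agree.
        return ip in forward(hostname)
    except OSError:
        return False
```

Calling is_googlebot("66.249.65.133") does live DNS queries, so the answer depends on the DNS records at the time you run it. The reverse and forward parameters are only there so the logic can be tried out with fake resolvers.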

 

Well, I am happy by now. See ya!


Congratulations on getting 'botted'.

 

As you mentioned above, well-behaved bots look for a file named robots.txt, which contains the permissions and restrictions that most bots adhere to, but nothing 'requires' a bot to even stop and read the file. Bot visits can be good for getting your site indexed on the search engines, but you run the risk of a bot picking up information which you don't want made public. One trick is to store your 'personal files' in a folder "above" the public_html folder on your site. Apache only maps URLs onto files under the document root, so files stored above it can't be requested over the web unless you specifically configure access to them.
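To make the "above public_html" idea concrete, here is a small Python sketch that tells you whether a file sits inside the document root (and so could be reached by a URL) or outside it. The paths and the function name are made up for illustration, and it only compares path components, so it does not resolve symlinks or ".." segments.

```python
from pathlib import PurePosixPath


def is_under_document_root(file_path: str, document_root: str) -> bool:
    """Return True if file_path lies inside document_root.

    Compares whole path components, so /home/user/public_html_backup
    is NOT treated as being inside /home/user/public_html.
    """
    file_parts = PurePosixPath(file_path).parts
    root_parts = PurePosixPath(document_root).parts
    return file_parts[: len(root_parts)] == root_parts


# Hypothetical layout of a shared hosting account.
DOCUMENT_ROOT = "/home/user/public_html"

print(is_under_document_root("/home/user/public_html/index.php", DOCUMENT_ROOT))
print(is_under_document_root("/home/user/private/notes.txt", DOCUMENT_ROOT))
```

With this layout the first file is inside the document root and the second is not, so only the first one is something a crawler could ever request.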


Robots.txt is very easy to understand and you can write it with notepad.

Visit http://forums.xisto.com/no_longer_exists/ for more information. Basically, it tells each bot (identified by its user agent) which directories and files it may or may not crawl (a.k.a. index).

This is useful when you hold files/filenames that may be changed or deleted in the future. You don't want bots to report them back as "broken links," which may lead search engines to treat your site/content as "unreliable."
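For anyone who wants to see what such a file looks like, here is a minimal sketch. The directory names are invented for the example. It uses Python's standard urllib.robotparser module to show how a well-behaved bot would interpret the rules.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content; the directory names are hypothetical.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/

User-agent: Googlebot
Disallow: /drafts/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text the way a polite crawler would."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A bot with its own group follows that group instead of the "*" group.
print(parser.can_fetch("Googlebot", "/drafts/post.html"))
# Any other bot falls back to the "*" rules.
print(parser.can_fetch("SomeOtherBot", "/private/secret.html"))
print(parser.can_fetch("SomeOtherBot", "/index.html"))
```

Keep in mind this only models what a polite bot does. The file is a request, not a lock, so anything truly private still needs to be kept out of the web-accessible folders.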
