longtimeago

Google And Its Caching Service


I am sure everyone who uses Google is aware of its caching system, especially those who host websites. Google's caching service works like this: the so-called "Google spider" crawls the web, takes a snapshot of each page (a kind of snapshot, I would call it) and stores it in its cache. It does not just store the snapshot, it also makes it available to the public. Whenever you search for something in Google, the results show a link for each page, and next to it a small "Cached" link; clicking that shows the cached copy of that page. Google also refreshes these cached pages at regular intervals.

Now my question is: how can this be legal? The spider also takes snapshots of copyrighted material, doesn't it? And if someone has sensitive data on a page, Google caches that too and serves it to the public. So how can caching copyrighted data be legal?

Also, is there any way to stop the Google spider from entering a website so that its contents won't be cached? What I mean is: if someone hosts sensitive data such as personal information and does not want Google to store that page in its cache, what should that person do?


Well, you said it: Google is a great resource for finding cached copies of content that has since been removed, whether because of copyright infringement or whatever else made the webmaster take it down.

But a webmaster who thinks of this can tell robots not to cache the page, which is a very quick procedure.

You just need to add a meta tag in the <head> section of each web page you do not want cached:

<meta name="robots" content="noarchive">

That would be just about it. :)
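
If you want to double-check that a page really carries the tag, something along these lines might help. It is only a rough sketch using Python's standard `html.parser`; the names `RobotsMetaParser` and `has_noarchive` and the sample page are made up for illustration, and it only inspects the HTML you give it, not what any search engine has actually stored.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for item in attr.get("content", "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def has_noarchive(html_text):
    """Return True if the HTML contains a robots meta tag with 'noarchive'."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noarchive" in parser.directives


# Hypothetical sample page, only for demonstration.
page = '<html><head><meta name="robots" content="noarchive"></head><body>Hi</body></html>'
print(has_noarchive(page))
```

You could feed it the saved source of each page you care about and list the ones where it returns False.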


So, miladinoskim, that's it? Is it really as simple as that? If so, and if everyone does this, I guess the Google spider will have no place left to crawl, right?


So, miladinoskim, that's it? Is it really as simple as that? If so, and if everyone does this, I guess the Google spider will have no place left to crawl, right?

No. The Google spider and others like Yahoo! Slurp or MSN's bot will still crawl your site, but they won't cache the content of your web page. Your page will still show up in the search results, but the 'cached' link won't.
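
To make the difference between crawling and caching a bit more concrete: whether a well-behaved bot may fetch a page at all is usually governed by a separate robots.txt file, which nobody has brought up in this thread so far, so treat this as an aside. The sketch below uses Python's standard `urllib.robotparser`; the rules, the `example.com` URLs and the `/private/` path are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: keep Googlebot out of /private/ only.
robots_txt = """\
User-agent: Googlebot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def may_crawl(agent, url):
    """Return True if these robots.txt rules let the given agent fetch the URL."""
    return parser.can_fetch(agent, url)


# Crawling is blocked only where robots.txt says so ...
print(may_crawl("Googlebot", "http://example.com/private/data.html"))
# ... everything else may still be crawled; 'noarchive' would only hide the cached copy.
print(may_crawl("Googlebot", "http://example.com/index.html"))
```

So the meta tag keeps a crawled page from showing a cached copy, while a robots.txt rule like the one above asks the bot not to fetch the page in the first place. Both rely on the bot choosing to honour them.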


Does anyone have any idea when Googlebot crawls a website? To be clear, I want to know whether it crawls at random, at regular intervals, or when traffic to the site is low. Also, how much bandwidth do these bots use when they crawl a site? In particular, how much bandwidth does Googlebot consume when it crawls and takes that snapshot?
