Is A Php File Searchable?

rockarolla · February 13, 2008

Hi, my web site is stored entirely in an SQL database. That is to say, whenever I need to load a page, I take it out of the database and then display it. My question is: is any search engine able to "crawl" my web site's content? I would appreciate some info so that I can change the way my site works.

Sten · February 13, 2008

Provided this is what you mean, yes. So I gather you've stored everything in a database and are using PHP to fetch part of it and show it, just like a CMS? PHP is server-side: it does everything on the server and then sends the result to the user as normal HTML, which a search engine will index if it finds it. If you're not using search-engine-friendly URLs, though, it probably won't find the content, since it's only picking up the index.php page.
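To make the server-side point concrete, here is a minimal sketch. It is written in Python with an in-memory SQLite database rather than PHP and a real SQL server, purely for illustration. The `pages` table, the `render_page` function, and the sample content are all hypothetical. It shows that whatever queries the server runs, the visitor (human or crawler) only ever receives the finished HTML.

```python
import sqlite3
from html import escape


def make_demo_db():
    """Create an in-memory database holding page content (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (slug TEXT PRIMARY KEY, title TEXT, body TEXT)")
    conn.execute(
        "INSERT INTO pages VALUES (?, ?, ?)",
        ("about", "About us", "We sell guitars."),
    )
    return conn


def render_page(conn, slug):
    """Server side: look the page up and return plain HTML, or None if missing."""
    row = conn.execute(
        "SELECT title, body FROM pages WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        return None
    title, body = row
    return (
        f"<html><head><title>{escape(title)}</title></head>"
        f"<body><p>{escape(body)}</p></body></html>"
    )


conn = make_demo_db()
page_html = render_page(conn, "about")  # this string is all a crawler would receive
print(page_html)
```

The database connection never leaves the server, so a password on the database does not, by itself, hide a page from a crawler.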

yordan · February 13, 2008

By the way, pages needing passwords cannot be crawled, because the robots cannot guess your passwords. So pages you have protected, or scripts that require you to log in before entering, are not searchable by robots.

rockarolla · February 14, 2008

Quoting yordan: "By the way, pages needing passwords cannot be crawled, because the robots cannot guess your passwords. So pages you have protected, or scripts that require you to log in before entering, are not searchable by robots."

That's my point. The database table has a user/pass, so it might not be accessible to a bot, or it might be... I think they "crawl" a particular site by "virtually opening" it and then checking the content... at least that is my experience. As I really need to know this, I'm doing some crawling around too, to become an expert...

I'll see how it goes and share what I find...

Here is the advice given by Google:

https://support.google.com/webmasters/?answer=40349

yordan · February 14, 2008

Hi RockaRolla, when I do crawling for fun, it simply opens the PHP scripts, so I get files named something.php downloaded to my PC, nothing more. For instance, from 4gallery photo galleries, I don't get the pictures. However, I have the PHP scripts, so I can re-create the website. But of course I cannot connect to the database (for instance at asta it's your database, so even by guessing the passwords I could not connect), so I cannot reach dynamic pages that are displayed using a particular user's rights, name, and password.

Quatrux · February 14, 2008

Well, if your pages are dynamic, it would be very good to send the right headers to the bot/crawler, and even to the browser. When Apache (or, as far as I know, any other HTTP server) sends a plain HTML file, it sends the correct content length and so on, but when a PHP file is generated/parsed, Apache can't really know its length in advance. This can be done with some PHP output buffering and sending the header from PHP. I think Google is full of these kinds of suggestions.
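The buffering idea can be sketched as follows. In PHP itself this would typically involve `ob_start()`, `ob_get_length()` and `header()`; the Python below only mirrors the idea, and `build_response` and `demo_page` are hypothetical names made up for this example. Whether a crawler actually needs the header is not settled in this thread, so treat this as an illustration of the technique only.

```python
import io


def build_response(write_page):
    """Buffer everything the page writes, then add a Content-Length header."""
    buffer = io.StringIO()
    write_page(buffer)                        # the page "prints" into the buffer
    body = buffer.getvalue().encode("utf-8")  # the length is counted in bytes
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(body)),
    }
    return headers, body


def demo_page(out):
    """A stand-in for a dynamic page that writes its output piece by piece."""
    out.write("<html><body>")
    out.write("<p>Café menu</p>")
    out.write("</body></html>")


headers, body = build_response(demo_page)
print(headers["Content-Length"], "bytes")
```

The length is taken after encoding because the header counts bytes, not characters; the "é" above takes two bytes in UTF-8.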

Athlon1600 · August 7, 2008

that's what I thought too at first :mellow:

Don't worry about it; every extension is indexable to some extent.
Here is a useful tool:

http://tools.seochat.com/tools/search-spider-simulator/

Enter your website's URL and you will see your page exactly as the Google search engine spider would see it.

FirefoxRocks · August 7, 2008

My thought on this is that pages needing POST form data (user logins, not database logins) are not accessible by search engines. This is because robots cannot guess the input required to access the content (passwords, email addresses, etc). If you have links to content outputted by PHP from a database, it should be searchable.

toby · August 7, 2008

You can give Google a user/pass, and sitemaps, but generally, what it can't find a link to, it won't index.
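For the sitemap part, a minimal sketch of generating one from a list of page URLs might look like this. The example.com URLs and the `build_sitemap` helper are hypothetical; in practice the list would presumably come from the same database that holds the pages. The namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped for us
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs of database-driven pages.
page_urls = [
    "http://www.example.com/index.php?page=about",
    "http://www.example.com/index.php?page=contact",
]
sitemap_xml = build_sitemap(page_urls)
print(sitemap_xml)
```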

rnd-am · February 7, 2009

Neither Google nor any other search engine will crawl (and hence index) pages that are not linked from other, previously crawled pages. So you can have an entire universe in your DB, but if there are no links to those particular generated pages, then those pages are invisible to search engines. BTW, ordinary forums are frequently configured to allow browsing of pages even by users who are not logged in, i.e. crawlers and indexing robots.
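The link-following behaviour described above can be sketched with a toy crawler. This is only an illustration under stated assumptions: the `SITE` dictionary stands in for a real web site (no network access is involved), and all URLs and names are made up. Real search engine crawlers are far more elaborate.

```python
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical site: URL -> HTML the server would send for it.
SITE = {
    "/index.php": '<a href="/index.php?page=about">About</a>',
    "/index.php?page=about": '<a href="/index.php?page=contact">Contact</a>',
    "/index.php?page=contact": "<p>Write to us.</p>",
    "/index.php?page=secret": "<p>Never linked from anywhere.</p>",
}


def crawl(start):
    """Breadth-first crawl: visit only pages reachable through links."""
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        parser = LinkCollector()
        parser.feed(SITE.get(url, ""))
        for link in parser.links:
            if link in SITE and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


found = crawl("/index.php")
print(sorted(found))
```

The `page=secret` entry exists on the "server" but is never reached, because no crawled page links to it.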

Ahsaniqbalkmc · February 10, 2011

Quoting yordan: "By the way, pages needing passwords cannot be crawled, because the robots cannot guess your passwords. So pages you have protected, or scripts that require you to log in before entering, are not searchable by robots."

Websites built on a CMS like WordPress also have passwords on them; I mean, the databases always have passwords, and yet they are crawled by search engine bots. I don't get the concept of password-protected pages not being crawled by search engines, because on the internet today almost everything is protected by a password.
Can anyone help me understand the concept behind this?
Thank you in advance.
