marijnnn, posted February 18, 2005:

Well, I haven't seen many topics about search engine technology in this forum, even though it's named that way :P But here we go:

For school, I had to make a search engine. I got lucky because it was in ASP.NET on a Windows machine, so I could just use the Indexing Service included in Windows. It has extra filters for PDF files, and it searches through .doc, .xls, ...

But let's say I wasn't that lucky and had to do it in PHP on a Linux server. How could one do it? I mean a little search bar that returns whether a file contains the search words or not.

My idea was to open every file periodically and build a database, but that would use a lot of space. And what about database-driven sites that get their content from the DB? Having all your pages crawled every time a search is executed seems like a bad idea too.

I'm out of inspiration. How would you do it? Got any good scripts?
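The "open every file periodically and build a database" idea above is essentially an inverted index: a map from each word to the files that contain it. Here is a minimal sketch of that idea, written in Python for brevity rather than PHP. The function names (`tokenize`, `build_index`, `search`) and the list of file extensions are assumptions made for illustration, not anything from the thread.

```python
import os
import re
from collections import defaultdict

# Words are runs of ASCII letters and digits; a real site may need a richer rule.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


def build_index(root, extensions=(".txt", ".html", ".php")):
    """Walk `root` and map every word to the set of file paths containing it.

    The extension list is an assumed default; binary formats such as PDF
    would need a text-extraction step that is not shown here.
    """
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    words = set(tokenize(handle.read()))
            except OSError:
                continue  # unreadable file: skip it instead of aborting the run
            for word in words:
                index[word].add(path)
    return index


def search(index, query):
    """Return the paths that contain every word of the query (AND semantics)."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

Running `build_index` from a scheduled job and calling `search` per request avoids crawling the pages on every query. The index stores only words and paths, not the page text, so it is usually much smaller than a full copy of the site. For a database-driven site, the same loop could presumably iterate over content rows instead of files.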
miCRoSCoPiC^eaRthLinG, posted February 18, 2005:

Could it be done with a combination of the locate/slocate command and grep? Might be worth a try. locate and slocate do for file names roughly what the Indexing Service does on Windows: the updatedb command builds a database of file and folder names, and locate searches it to help you find files. Note that this database holds names only, not file contents. grep, of course, is a regular-expression matcher for file contents. I don't see any reason why it can't be done, though I'm not sure how. I'm pretty new to PHP myself.
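One way to read the locate-plus-grep suggestion above: let locate supply candidate paths from its name database, then scan each candidate's contents the way `grep -il` would. A rough sketch in Python follows. `locate_paths` and `files_containing` are hypothetical helper names, and the sketch assumes `locate` is installed and `updatedb` has been run.

```python
import re
import subprocess


def locate_paths(name_pattern):
    """Ask locate for paths matching `name_pattern`.

    Assumes the locate command exists and its database is up to date;
    returns an empty list when the command is not available.
    """
    try:
        result = subprocess.run(
            ["locate", name_pattern],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return []
    return [line for line in result.stdout.splitlines() if line]


def files_containing(paths, words):
    """Return the paths whose contents include every word, ignoring case.

    This is the grep half of the idea: each file is read in full on every
    call, so the cost grows with the total size of the candidate files.
    """
    patterns = [re.compile(re.escape(word), re.IGNORECASE) for word in words]
    if not patterns:
        return []
    matches = []
    for path in paths:
        try:
            with open(path, encoding="utf-8", errors="ignore") as handle:
                text = handle.read()
        except OSError:
            continue  # directories and unreadable files are skipped
        if all(pattern.search(text) for pattern in patterns):
            matches.append(path)
    return matches
```

A call such as `files_containing(locate_paths("*.html"), ["search", "engine"])` would combine the two steps. Because every query rescans the files, this is only practical for small sites.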
marijnnn, posted February 18, 2005:

Silly me, I should've thought of that myself. I'll look into how one can search in that database :P

I wonder if you can do it on a shared server, though. I think the database would contain the documents of all hosted sites :s

Anyway, it's a nice idea. I'll have to look into it one day soon.
miCRoSCoPiC^eaRthLinG, posted February 18, 2005:

Yup, the locate db would normally contain a complete list of the files of all the hosted sites, even though a normal user wouldn't be able to get the complete listing using locate. Maybe if you create a user with locate execution privileges and cut out the rest of the system privileges, it could be run on a shared hosting system. But then again, I don't know if any admin would be ready to do it. You'd be setting up an almost gaping security hole by opening the whole directory structure to the world.

I think you should be able to come up with some workaround, though.
daniwood, posted March 3, 2005:

The search site market is very restricted for people who want to start a new business in this area.
Ryan1405241476, posted March 3, 2005:

I agree, I wouldn't bother with it. Some things are hard to pursue. If you mean this for a small internal search engine for a website, I am pretty sure there are pre-made ones that exist, or you can use Google search, as long as Google has added your site to its index. Other search engines like Yahoo may also give you that ability.
jcguy, posted March 12, 2005:

Yah, there are also other free search engines that you can get for your site, like: http://www.freefind.com/

The catch is that they put ads on the results page.

Also, I think that if you want to design a search engine that indexes the world's websites, it's very tough, because your database would be so large that the cost of running it would be very high.
webguide, posted March 29, 2005:

Have you tried looking at the source code for a popular open source search engine like Nutch?