I would recommend excluding all words under 4 characters. It's a good idea, but as you said, it puts a lot of load on the server. Maybe use a table: each post is parsed for keywords, which are then added to the table. If the word already exists, add to its count; if it doesn't, insert a new row. Then use the caching feature to cache the top words in the forum. That would mean the load would be minimal, as the lookup is added to a query that is already in use.
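
Here is a rough sketch of the table idea, just to show the update-or-insert step. It uses Python with SQLite purely for illustration; the table name `keyword_counts`, the column `hits`, and the function names are all made up for this example and are not part of any forum software. The forum's own caching feature is not shown here.

```python
import re
import sqlite3

MIN_LENGTH = 4  # words under 4 characters are excluded


def extract_keywords(text, min_length=MIN_LENGTH):
    """Return the lowercase words in text that have at least min_length letters."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= min_length]


def create_table(conn):
    # Hypothetical schema: one row per keyword, with a running count.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS keyword_counts ("
        "word TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
    )


def record_post(conn, text):
    """Parse one post and update the keyword table."""
    for word in extract_keywords(text):
        # If the word already has a row, add to its count...
        cur = conn.execute(
            "UPDATE keyword_counts SET hits = hits + 1 WHERE word = ?", (word,)
        )
        # ...otherwise insert a new row.
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO keyword_counts (word, hits) VALUES (?, 1)", (word,)
            )
    conn.commit()


def top_words(conn, limit=10):
    """Return the most frequent keywords as (word, hits) tuples."""
    return conn.execute(
        "SELECT word, hits FROM keyword_counts "
        "ORDER BY hits DESC, word ASC LIMIT ?",
        (limit,),
    ).fetchall()
```

The parsing cost is paid once, when a post is saved. The result of `top_words` is the part that would be handed to the cache, so displaying the list would not need to re-scan any posts.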