
Some tips on searching...

Posted: Mar 17th, '05, 13:45
by spanky
Recently, several people have asked why search doesn't seem to work properly and why they aren't getting the results they expect.

Let me assure you, the search is not broken. The search function in this BB software uses things called "stopwords". These are words that, if used in a search query, will deliberately return no matches. Stopwords are very common words, such as "the", "how", and "many", that would eat up server resources horribly if they were searchable.

These stopwords are not indexed and will not yield search results. For example, try searching for CAPBOY's thread titled "How Many Days" (viewtopic.php?t=231). Unfortunately, each of the three words in that thread title is a stopword. Searching for any or all of those words will not yield any results.

You may ask, "What's the reason for this?" Could you imagine indexing every word in every post on this forum? That index would be huge and searching on it would be very costly (resource-wise).
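To make the idea concrete, here is a minimal sketch of an index that skips stopwords. It is not the board software's actual code. The stopword set holds only the words named in this thread (the real list is much longer), topic 232 and its title are made up for illustration, and the function names `tokenize` and `build_index` are hypothetical.

```python
import re
from collections import defaultdict

# Illustrative stopword set built from the words named in this thread;
# the board's real list is much longer.
STOPWORDS = {"the", "how", "many", "days"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(posts):
    """Map each non-stopword to the set of topic ids that contain it."""
    index = defaultdict(set)
    for topic_id, text in posts.items():
        for word in tokenize(text):
            if word not in STOPWORDS:  # stopwords never enter the index
                index[word].add(topic_id)
    return index

posts = {
    231: "How Many Days",                     # title from this thread
    232: "Some tips on searching the forum",  # hypothetical second topic
}
index = build_index(posts)
```

In this sketch topic 231 never appears in the index at all, because every word in its title is a stopword, which is why no search on those words can find it.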

This will eventually be moved to the FAQ section for safekeeping.

Posted: Mar 17th, '05, 13:52
by RedRider
Thanks, Spanky, that answers many questions. And CAPBOY, you are a genius: imagine starting a thread that is unsearchable!

Talk about not leaving any traces...

I'm glad I found it, though.

Re: Some tips on searching...

Posted: Mar 17th, '05, 14:22
by BigKahuna13
Spanky,

A clarification: does the search fail if you use any stopwords, or only if you specify nothing but stopwords? I'd expect the latter. Having a search fail because one term is a stopword would be dumb.

Does the source code document the stopwords? Publishing them might be useful.

Re: Some tips on searching...

Posted: Mar 17th, '05, 14:54
by spanky
The search stopwords are not indexed, so searching on them doesn't produce any hits. A few trial searches lead me to believe that stopwords are simply ignored, so including them has no impact on the query.

Here is the list of stopwords: search_stopwords.txt
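The behavior suggested by those trial searches can be sketched roughly as follows. This is a guess at the logic, not the board's actual code: the index contents, topic ids 232 and 233, the function name `search`, and the choice to require every remaining term to match are all assumptions made for illustration.

```python
# Illustrative stopword set built from the words named in this thread.
STOPWORDS = {"the", "how", "many", "days"}

# Hypothetical pre-built index: word -> ids of topics containing it.
INDEX = {
    "tips": {232},
    "searching": {232},
    "lift": {233},
    "tickets": {233},
}

def search(query):
    """Return ids of topics matching every non-stopword term in the query."""
    terms = [w for w in query.lower().split() if w not in STOPWORDS]
    if not terms:
        return set()  # only stopwords were given, so nothing can match
    results = None
    for term in terms:
        hits = set(INDEX.get(term, ()))
        results = hits if results is None else results & hits
    return results
```

Under these assumptions, `search("how many days")` comes back empty because nothing is left after the stopwords are dropped, while `search("how many lift tickets")` behaves exactly like `search("lift tickets")`.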

Posted: Mar 17th, '05, 15:17
by tyrolean_skier
Definitely a candidate for the FAQ section.

Posted: Mar 17th, '05, 15:23
by Bling Skier
"And At Band Camp"....