Re: IDF scoring issue

2008-12-17 Thread Grant Ingersoll
On Dec 17, 2008, at 9:26 AM, Rajiv2 wrote: Because the search term is provided by a user, and that user would explicitly have to put quotes around "marietta ga", when I believe the search text as it is: fleming roofing inc., marietta ga -- should score higher for "marietta ga" Just

Re: IDF scoring issue

2008-12-17 Thread Matthew Hall
Well, you could also do a simple test of removing IDF from the scoring equation and seeing if the query then reacts the way you want it to. Simply write your own custom Similarity that does this, and test it to see how it works. Handily enough, I've already done this, so here's some code you
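The snippet above is truncated, so the following is not the poster's code. It is a minimal, self-contained sketch of what "removing IDF from the scoring equation" does to a single term's contribution, assuming the classic Lucene 2.x formulas (tf as the square root of the term frequency, idf as log(numDocs / (docFreq + 1)) + 1, with idf entering the score squared). The class name, the index size, and the document frequencies are hypothetical, and queryNorm, coord, boosts, and field norms are deliberately left out.

```java
// Toy model of classic tf-idf term scoring. This is NOT Lucene code;
// it only mirrors the shape of the classic formulas for illustration.
final class IdfSketch {

    // Term-frequency factor: square root of the occurrences in the field.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms receive a larger weight.
    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    // Contribution of one query term to a document's score.
    // With useIdf == false the idf factor is pinned to 1.0, which is
    // what a custom similarity that ignores idf would amount to.
    static double termScore(int freq, int docFreq, int numDocs, boolean useIdf) {
        double weight = useIdf ? idf(docFreq, numDocs) : 1.0;
        return tf(freq) * weight * weight; // idf is applied twice
    }

    public static void main(String[] args) {
        int numDocs = 1000;      // hypothetical index size
        int rareDocFreq = 3;     // hypothetical: a rare term such as "fleming"
        int commonDocFreq = 400; // hypothetical: a common term such as "ga"

        System.out.println("with idf, rare term:      " + termScore(1, rareDocFreq, numDocs, true));
        System.out.println("with idf, common term:    " + termScore(1, commonDocFreq, numDocs, true));
        System.out.println("without idf, rare term:   " + termScore(1, rareDocFreq, numDocs, false));
        System.out.println("without idf, common term: " + termScore(1, commonDocFreq, numDocs, false));
    }
}
```

With idf in place, a match on the rare term dominates a match on the common one; with idf pinned to 1.0, every matching term counts the same. In Lucene 2.x itself, the usual way to try this would presumably be to subclass DefaultSimilarity, override idf(int docFreq, int numDocs) to return 1.0f, and pass an instance to Searcher.setSimilarity before running the query.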

Re: IDF scoring issue

2008-12-17 Thread Rajiv2
Because the search term is provided by a user, and that user would explicitly have to put quotes around "marietta ga", when I believe the search text as it is: fleming roofing inc., marietta ga -- should score higher for "marietta ga" rajiv Grant Ingersoll-6 wrote: > > > On Dec 16, 2008, at

Re: IDF scoring issue

2008-12-17 Thread Grant Ingersoll
On Dec 16, 2008, at 8:19 PM, Rajiv2 wrote: Hello, I'm using the default Lucene QueryParser on the search text: fleming roofing inc., marietta ga Also, I don't want to modify the search text by putting quotes around "marietta ga", which forces the query parser to make a phrase query. Why no

Re: IDF scoring issue

2008-12-16 Thread Anshum
Hi Rajiv, If I'm interpreting your problem correctly, I'd suggest you try using a PhraseQuery with an appropriate slop value. Though again, it depends on what exactly you are trying to fetch. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to e

Re: IDF scoring issue

2008-12-16 Thread Rajiv2
To answer your questions: 1. there are only two words in the document I'm searching -- city and state abbrev., lowercased and analyzed by WhitespaceAnalyzer 2. the only field and default field is text, so the query becomes text:fleming text:roofing text:inc. ...etc. Using query operator AND inst

Re: IDF scoring issue

2008-12-16 Thread Erick Erickson
Note a couple of things: 1> how a doc scores also takes into account how many other words are in the field you're querying on. 2> Is "text" your default field? Because what you posted is really searching text:fleming :roofing :inc.. Note also the implicit OR between each of them.
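Point 1> refers to length normalization. As a rough sketch, assuming the classic Lucene default of 1 / sqrt(number of terms in the field), the same match counts for more in a short field than in a long one. The class name and the field lengths below are hypothetical, and the sketch ignores the precision lost when Lucene stores norms in a compact encoded form.

```java
// Toy model of classic field-length normalization. This is NOT Lucene code.
final class LengthNormSketch {

    // Shorter fields get a larger norm, so one match counts for more.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        int shortField = 2; // hypothetical: a field holding only "marietta ga"
        int longField = 8;  // hypothetical: a field with six other words as well

        System.out.println("norm for 2-term field: " + lengthNorm(shortField));
        System.out.println("norm for 8-term field: " + lengthNorm(longField));
    }
}
```

Under this assumption a two-word field keeps roughly twice the weight of an eight-word field for the same matching term, which is one reason documents with identical term matches can still score differently.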