On Fri, Jun 5, 2009 at 21:31, Abhi<abhirama.b...@gmail.com> wrote:
> Say I have indexed the following strings:
>
> 1. "cool gaming laptop"
> 2. "cool gaming lappy"
> 3. "gaming laptop cool"
>
> Now when I search with a query say "cool gaming computer", I want string 1
> and 2 to appear on top (where search terms are closer to each other)
> followed by 3.
>
> I can use a Term query to search but, the problem is that word proximity
> does not come into picture. All 3 document get an even score. The behaviour
> that I want is documents that have "cool" and "gaming" and "computer" (these
> words might be present or not in the indexed document) as close to each
> other as possible should get a higher score.
>
> I can use a Phrase query so that proximity of search terms affect scoring
> but, I do not get any result because string "computer" is not present in any
> of the indexed documents.
>
> Is there a way to achieve the above?

I would rewrite the query to this:

    cool gaming computer "cool gaming" "gaming computer" "cool gaming computer"

Naively assuming a score of 1.0 for each hit, you would get something like:

1. "cool gaming laptop" => 3 (cool, gaming, "cool gaming")
2. "cool gaming lappy"  => 3 (cool, gaming, "cool gaming")
3. "gaming laptop cool" => 2 (cool, gaming)

And of course, if it actually finds "cool gaming computer", that document
would get 6.

Daniel

-- 
Daniel Noll
Senior Developer, Nuix
http://nuix.com/
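
The arithmetic in the reply can be checked with a small sketch. It does not use Lucene at all: it only mimics the naive "1.0 per matching clause" assumption from the reply, so real Lucene scores will differ. The class and method names (NaiveProximityScore, expand, score) are made up for illustration, and tokenizing on whitespace is an assumption. In Lucene itself, the expanded clauses would presumably be combined as optional (SHOULD) clauses of a BooleanQuery, with a TermQuery per word and a PhraseQuery per phrase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lucene: reproduces the naive scoring only.
class NaiveProximityScore {

    // Expand a query into its single terms plus every contiguous phrase
    // of two or more terms, e.g. "cool gaming computer" gives 6 clauses.
    static List<List<String>> expand(String query) {
        List<String> terms = Arrays.asList(query.trim().toLowerCase().split("\\s+"));
        List<List<String>> clauses = new ArrayList<>();
        for (int len = 1; len <= terms.size(); len++) {
            for (int start = 0; start + len <= terms.size(); start++) {
                clauses.add(terms.subList(start, start + len));
            }
        }
        return clauses;
    }

    // Naive score: 1 for each clause that occurs in the document
    // as an exact, contiguous run of tokens.
    static int score(String document, String query) {
        List<String> docTokens = Arrays.asList(document.trim().toLowerCase().split("\\s+"));
        int score = 0;
        for (List<String> clause : expand(query)) {
            if (Collections.indexOfSubList(docTokens, clause) >= 0) {
                score++;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        String query = "cool gaming computer";
        String[] docs = {"cool gaming laptop", "cool gaming lappy", "gaming laptop cool"};
        for (String doc : docs) {
            System.out.println("\"" + doc + "\" => " + score(doc, query));
        }
    }
}
```

Running main prints 3, 3 and 2 for the three indexed strings, matching the figures above, and a document containing "cool gaming computer" scores 6 because all six clauses match.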