Hi Joel,

I am encountering the same problem. Could you please elaborate a bit on this?

Many thanks,
Liat

2009/11/2 Joel Halbert <j...@su3analytics.com>

> I opted to use the following query to solve this problem, since it meets
> my requirements, for the time being.
>
> +(cheese sandwich) "cheese sandwich"~slop
>
> This includes documents with one or more of the terms, but prefers those
> with an edit distance <= the slop.
>
>
> -----Original Message-----
> From: Joel Halbert <j...@su3analytics.com>
> Reply-To: java-user@lucene.apache.org
> To: java-user@lucene.apache.org
> Subject: Re: scoring adjacent terms without proximity search
> Date: Sat, 31 Oct 2009 08:38:29 +0000
>
> Thank you all for your suggestions, I shall have a little think about
> the best way forward, and report back if I do anything interesting that
> works well.
>
> In answer to Grant's question, why not use PhraseQuery, we do not want
> to have an artificial upper limit on the slop, i.e. we do want to
> include documents that might only have a subset of words from the
> phrase. (e.g. just cheese, or just sandwich, but not both).
>
>
> -----Original Message-----
> From: Robert Muir <rcm...@gmail.com>
> Reply-To: java-user@lucene.apache.org
> To: java-user@lucene.apache.org
> Subject: Re: scoring adjacent terms without proximity search
> Date: Fri, 30 Oct 2009 16:04:03 -0400
>
> > I suppose you could precompute the proximity associations by indexing
> > n-grams (in this case, Lucene calls them shingles), such that there
> > is a single token in your index containing cheese_sandwich (effectively)
> >
>
> doh, I see Grant already led you in this direction. (sorry for the
> duplicate mail)
> on average it's worked for me for some things like this.
>
> although, I'll try to contribute something actually useful, and mention
> that if you use things like shingles, it's good to consider modifying
> DefaultSimilarity, look at setDiscountOverlaps param.
> otherwise, i've measured cases where injecting additional tokens will cause
> more harm than good, because it has an adverse effect on lengthnorm.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
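
For anyone following along, here is a rough sketch of how the query string from Joel's message could be assembled for an arbitrary phrase. It is plain Java with no Lucene dependency; the class name ProximityBoostQuery and the method build are made up for illustration and are not part of Lucene. It assumes the resulting string is handed to the classic query parser with the default OR operator, and it does not escape query-syntax special characters in the input.

```java
// Hypothetical helper (not a Lucene API): builds a query string of the form
//   +(t1 t2 ...) "t1 t2 ..."~slop
// The required group matches documents containing any of the terms; the
// optional sloppy phrase only adds score when the terms occur close together.
final class ProximityBoostQuery {
    private ProximityBoostQuery() {}

    static String build(String phrase, int slop) {
        if (slop < 0) {
            throw new IllegalArgumentException("slop must be >= 0");
        }
        // Collapse runs of whitespace so the terms are separated by single spaces.
        String terms = phrase.trim().replaceAll("\\s+", " ");
        if (terms.isEmpty()) {
            throw new IllegalArgumentException("phrase must contain at least one term");
        }
        // Assumption: the input contains no query-syntax special characters.
        return "+(" + terms + ") \"" + terms + "\"~" + slop;
    }

    public static void main(String[] args) {
        // prints: +(cheese sandwich) "cheese sandwich"~3
        System.out.println(build("cheese sandwich", 3));
    }
}
```

The slop value is whatever suits your data; a larger slop rewards looser matches, while documents with only one of the terms still match through the first clause.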