Radha: replying/reforwarding the same message over and over doesn't tend to be a useful way to encourage additional replies. if you do have something to add to an existing discussion that you've started, you should at least do it as a reply to the orriginal discussion so people have the full context... http://www.nabble.com/Need-help-%3A-SpanNearQuery-to23077372.html
I'm really not sure that anyone has any new additional info to offer you. this is a particularly hard problem, that doesn't have a very efficient solution that i know of. it sounds like you've already tried the obvious solution, but aren't happy with the scores produced -- if you can't get the scores you want by tweaking the Similarity options available, then implementing custom Query/Scorer classes is really the only remaining option. : Date: Sun, 19 Apr 2009 17:52:27 +0530 : From: Radha Sreedharan <radh...@gmail.com> : Reply-To: java-user@lucene.apache.org : To: java-user@lucene.apache.org : Subject: Proximity and Percentage match search in Lucene : : What I need is the following : : If my document field is ( ab,bc,cd,ef) and Search tokens are (ab,bc,cd). : : Given the following : : I should get a hit even if all of the search tokens aren't present : If the tokens are found they should be found within a distance x of : each other ( proximity : search) : > : > I need the percentage match of the search tokens with the document field. : > : > Currently this is my query : : > 1) I form all possible permutation of the search tokens : > 2) do a spanNearQuery of each permutation : > 3) Do a DisjunctionMaxQuery on the spannearqueries. : > : > This is how I compute % match : : > % match = ( Score by running the query on the document field ) / : > ( score by running the query on a document field created out of search tokens ) : > : > The numerator gives me the actual score with the search tokens run on the field. : > Denominator gives me the best possible or maximum possible score with the current search : tokens : > : > For this example << If my document field is ( ab,bc,cd,ef) and Search tokens are : (ab,bc,cd).>> I expect a % match of around 90%. : > : > However I get a match of only around 50% without a boost. Using a boost infact reduces : my percentage. : > : > I even overrode the queryNorm method to return a one, still the percentage did not increase. : * : Is there any way of implementing this using the current set of : implementation classes in Lucene and not making complex changes to the : structure by itself. : ( which is what i gather has to be done from the previous replies) : : Can anyone suggest an alternative way of implementing this requirement : using the existing bunch of classes in Lucene and not necessarily : using the ones I have used* : -Hoss --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org