What I need is the following. Suppose my document field is (ab, bc, cd, ef) and the search tokens are (ab, bc, cd).
Given the following:

I should get a hit even if not all of the search tokens are present.
If the tokens are found, they should be within a distance x of each other (proximity search).

> I need the percentage match of the search tokens with the document field.
>
> Currently this is my query:
> 1) I form all possible permutations of the search tokens.
> 2) I do a SpanNearQuery for each permutation.
> 3) I do a DisjunctionMaxQuery over the SpanNearQueries.
>
> This is how I compute the % match:
> % match = (score from running the query on the document field) /
>           (score from running the query on a document field created from the
>            search tokens)
>
> The numerator gives me the actual score with the search tokens run on the
> field.
> The denominator gives me the best possible (maximum) score with the
> current search tokens.
>
> For this example << If my document field is (ab, bc, cd, ef) and the search
> tokens are (ab, bc, cd) >> I expect a % match of around 90%.
>
> However, I get a match of only around 50% without a boost. Using a boost
> in fact reduces my percentage.
>
> I even overrode the queryNorm method to return 1, but the percentage still
> did not increase.

Is there any way of implementing this using the existing classes in Lucene, without making complex changes to Lucene's structure itself (which is what I gather would have to be done, judging from the previous replies)? Can anyone suggest an alternative way of implementing this requirement using the existing classes in Lucene, not necessarily the ones I have used?
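
To make the requirement concrete, here is a minimal sketch in plain Java of the percentage I am after, computed directly from token positions rather than from Lucene scores. It is only an illustration under stated assumptions: `ProximityMatch`, `percentMatch`, and the `window` parameter are hypothetical names and not part of Lucene, and "within a distance x" is interpreted here as "inside some run of `window` consecutive token positions".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, NOT a Lucene class: measures how many distinct search
// tokens occur together inside one window of consecutive field positions.
class ProximityMatch {

    // Returns 100 * (most distinct search tokens seen in any window) /
    // (number of distinct search tokens). Returns 0 for an empty query
    // or a non-positive window.
    static double percentMatch(List<String> fieldTokens,
                               List<String> searchTokens,
                               int window) {
        Set<String> wanted = new HashSet<>(searchTokens);
        if (wanted.isEmpty() || window <= 0) {
            return 0.0;
        }
        int best = 0;
        // Slide a window of `window` positions over the field.
        for (int start = 0; start < fieldTokens.size(); start++) {
            int end = Math.min(fieldTokens.size(), start + window);
            Set<String> seen = new HashSet<>();
            for (int i = start; i < end; i++) {
                String token = fieldTokens.get(i);
                if (wanted.contains(token)) {
                    seen.add(token); // count each search token once per window
                }
            }
            best = Math.max(best, seen.size());
        }
        return 100.0 * best / wanted.size();
    }

    public static void main(String[] args) {
        List<String> field = Arrays.asList("ab", "bc", "cd", "ef");
        List<String> query = Arrays.asList("ab", "bc", "cd");
        System.out.println(percentMatch(field, query, 3));
    }
}
```

For the example above this measure gives 100% rather than the roughly 90% I expect, because it does not penalize the extra field token `ef`. It still returns a partial percentage when only some of the search tokens fall within the window, which is the behaviour I would like to get from a Lucene query.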