I am not an expert, but I think you can solve problem 1 by overriding the coord function in the Similarity class:
coord(q,d) is a score factor based on how many of the query terms are found in the specified document. Typically, a document that contains more of the query's terms will receive a higher score than another document with fewer query terms. This is a search time factor computed in coord(q,d) by the Similarity in effect at search time.

The default Similarity class defines coord as follows:

    public float coord(int overlap, int maxOverlap)
    Implemented as overlap / maxOverlap.

Donna L. Gresh
Services Research, Mathematical Sciences Department
IBM T.J. Watson Research Center
(914) 945-2472
http://www.research.ibm.com/people/g/donnagresh
[EMAIL PROTECTED]

Tobias Hill <[EMAIL PROTECTED]> wrote on 10/31/2007 09:51:12 AM:

> My documents all have a field with a variable number of terms
> (but rather few):
> Doc1.field = "foo bar gro"
> Doc2.field = "foo bar gro mot slu"
> Now I would like to search using the terms "foo bar gro"
>
> Problem 1:
> I would like to express that at least any two of the three terms
> must match. Do I have to construct this clause myself - i.e.
> "(foo & bar) | (foo & gro) | (bar & gro)", or is there some
> clever way to do this?
>
> Problem 2:
> I would like to express that if the doc.field has too many terms
> that weren't matched it should not be included at all in the
> result. In the example above Doc2 might have too many
> terms that were not matched to be included in the result.
> Is this kind of query possible, and how?
>
> The general case:
> I want to find those docs that have X% of the search terms
> matched and where the actual match covers at least Y% of
> the available terms on the document.
>
>
> I am very thankful for any feedback on this.
> Tobias
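
To make the coord idea for problem 1 concrete, here is a minimal standalone sketch of the arithmetic. It only mirrors the coord(int overlap, int maxOverlap) contract quoted above and does not extend Lucene's Similarity class. The class name CoordSketch, the method names, and the minOverlap parameter are hypothetical. It also assumes that a document whose score is forced to zero ends up excluded from the results, which depends on how hits are collected and should be checked against the Lucene version in use.

```java
// Standalone sketch: mirrors the coord(overlap, maxOverlap) contract,
// but is NOT a subclass of Lucene's Similarity. All names are hypothetical.
final class CoordSketch {

    // Default behaviour as quoted above: fraction of query terms matched.
    static float defaultCoord(int overlap, int maxOverlap) {
        return (float) overlap / maxOverlap;
    }

    // Variant for "at least minOverlap of the query terms must match":
    // return 0 below the threshold, otherwise the default fraction.
    static float atLeastCoord(int overlap, int maxOverlap, int minOverlap) {
        if (overlap < minOverlap) {
            return 0.0f; // assumption: a zero factor zeroes the document's score
        }
        return (float) overlap / maxOverlap;
    }

    public static void main(String[] args) {
        int maxOverlap = 3; // query terms: foo, bar, gro
        int minOverlap = 2; // "at least any two of the three terms"
        for (int overlap = 0; overlap <= maxOverlap; overlap++) {
            System.out.println(overlap + " matched -> "
                    + atLeastCoord(overlap, maxOverlap, minOverlap));
        }
    }
}
```

In a real override, the body of atLeastCoord would go into coord(int overlap, int maxOverlap) with the threshold fixed or derived from maxOverlap (for the X% case, something like the ceiling of X/100 times maxOverlap). Since coord only receives overlap and maxOverlap and knows nothing about how many terms the document field holds, this sketch does not address problem 2.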