Soeren Pekrul wrote:
The score for a document is the sum of the term weights w(tf, idf) over the query terms it contains. So you already have a combination of coordination level matching and IDF. Now suppose your query requests three terms A, B and C. Two of them (A and B) occur quite often in the collection, and one (C) is very rare. Documents that match only C could then get a higher score than documents containing both A and B. To avoid this, you can give the coordination more influence by multiplying the sum of term weights by the coordination level as an additional factor.
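A minimal sketch of the idea, not any library's actual scoring code. The term weights are made-up values for illustration (tf = 1, A and B common, C rare), the names score, rank, WEIGHTS and DOCS are hypothetical, and the coordination factor is assumed to be the fraction of query terms a document matches.

```python
# Hypothetical term weights w(tf, idf) with tf = 1:
# A and B are common (low idf), C is rare (high idf).
WEIGHTS = {"A": 1.0, "B": 1.0, "C": 3.0}

# Each document is represented only by the set of query terms it contains.
DOCS = {
    "D(A,B,C)": {"A", "B", "C"},
    "D(A,C)": {"A", "C"},
    "D(A,B)": {"A", "B"},
    "D(C)": {"C"},
    "D(A)": {"A"},
}


def score(doc_terms, weights=WEIGHTS, use_coord=False):
    """Sum the weights of the query terms found in the document.

    With use_coord=True the sum is multiplied by the fraction of
    query terms the document matches (assumed coordination factor).
    """
    matched = [term for term in weights if term in doc_terms]
    total = sum(weights[term] for term in matched)
    if use_coord:
        total *= len(matched) / len(weights)
    return total


def rank(docs, use_coord):
    """Return document names ordered by descending score."""
    return sorted(
        docs,
        key=lambda name: score(docs[name], use_coord=use_coord),
        reverse=True,
    )


# Plain sum of weights: D(C) (3.0) outranks D(A,B) (2.0).
print(rank(DOCS, use_coord=False))
# With the coordination factor: D(A,B) (about 1.33) outranks D(C) (1.0).
print(rank(DOCS, use_coord=True))
```

With these numbers the coordination factor is enough to move D(A,B) above D(C). If the weight of C were much larger (for example 5.0), this linear factor would not be enough, and the coordination would need an even stronger influence.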
Addendum: For the query Q(A, B, C) with

A: df++ (idf--)
B: df++ (idf--)
C: df-- (idf++)

the user would probably expect the following ranking:

1. D(A, B, C)
2. D(A, C), D(B, C)
3. D(A, B)
4. D(C)
5. D(A), D(B)

Sören