We have an application where every term position in a document is associated 
with an "engine score".
A term query should then be scored by the sum of the term's "engine scores" in 
a document, rather than by the term frequency.
For example, a term frequency of 5 with an average engine score of 100 should be 
equivalent to a term frequency of 1 with an engine score of 500.
 
As I understand it, if I keep the engine score of each position in its payload, 
I can use scorePayload in combination with a summing version of 
PayloadFunction to get the sum of the engine scores of a term in a document, 
and so achieve my goal.
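
To make the intended arithmetic concrete, here is a minimal sketch in plain 
Java. It does not use any Lucene classes; the class name EngineScoreSum, its 
methods, and the 4-byte float encoding of the payload are my own assumptions 
for illustration only.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: models one payload per term position.
final class EngineScoreSum {

    // Encode one engine score as a 4-byte payload (big-endian float; assumed layout).
    static byte[] encode(float score) {
        return ByteBuffer.allocate(4).putFloat(score).array();
    }

    // Decode the engine score stored in a single position's payload.
    static float decode(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }

    // Sum the engine scores over all positions of a term in one document.
    static float sum(List<byte[]> payloads) {
        float total = 0f;
        for (byte[] payload : payloads) {
            total += decode(payload);
        }
        return total;
    }

    public static void main(String[] args) {
        // Five occurrences with engine score 100 each.
        List<byte[]> five = Arrays.asList(
                encode(100f), encode(100f), encode(100f), encode(100f), encode(100f));
        // One occurrence with engine score 500.
        List<byte[]> one = Arrays.asList(encode(500f));
        System.out.println(sum(five) == sum(one));
    }
}
```

With this scoring, the two documents in the example above get the same score 
for the term, which is the behaviour we want.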
 
There are two issues with this solution:
1. Even the simplest term query would have to scan the positions file in order 
to read the payloads, which could be a performance issue.
We would prefer to index the sum of engine scores per document in advance, in 
addition to the term frequency. This would be a kind of payload at the document 
level. Does Lucene support that, or is there another solution for this issue?
 
2. The "engine score" of a phrase occurrence is defined as the product of the 
engine scores of the terms that compose the phrase.
So in scorePayload I need the payloads of all the terms in the phrase in order 
to score the phrase occurrence appropriately.
As far as I understand, the current interface of scorePayload does not provide 
this information.
Is there another way this can be achieved in Lucene?
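
For clarity, the phrase scoring described in item 2 can be sketched as follows 
in plain Java, again without any Lucene classes. The class and method names 
are hypothetical. Summing the occurrence scores into a document-level score is 
my assumption, made by analogy with the term query case.

```java
// Hypothetical helper, not part of Lucene: models the desired phrase scoring.
final class PhraseEngineScore {

    // Score of one phrase occurrence: the product of the engine scores
    // of the terms that compose the phrase at that occurrence.
    static double occurrenceScore(double[] termScores) {
        double product = 1.0;
        for (double score : termScores) {
            product *= score;
        }
        return product;
    }

    // Document-level phrase score: sum over all occurrences
    // (an assumption, mirroring the sum used for single terms).
    static double documentScore(double[][] occurrences) {
        double total = 0.0;
        for (double[] occurrence : occurrences) {
            total += occurrenceScore(occurrence);
        }
        return total;
    }

    public static void main(String[] args) {
        // Two occurrences of a two-term phrase, one engine score per term.
        double[][] occurrences = { {2.0, 3.0}, {4.0, 5.0} };
        System.out.println(documentScore(occurrences));
    }
}
```

The point is that occurrenceScore needs the payloads of every term in the 
occurrence at once, not one payload at a time.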
 
Thanks in advance,
Arnon.
