I have been thinking about Lucene and relevance feedback. I want to
implement a variation of the Rocchio formula, and that is where I run
into a problem. The formula looks like this:
Query(new) = alpha * Query(old)
           + beta  * Sum(Relevant Documents)
           - gamma * Sum(Non Relevant Documents)
The documents in this formula have to be in a vector representation,
and that is the problem: if I work with TermFreqVectors, the vectors
are not equally long and contain different terms. My current solution
is to reduce the TermFreqVectors to the terms they have in common and
then perform the computation.
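To make the idea concrete, here is a minimal sketch of the alternative I
am wondering about: instead of cutting the vectors down to shared terms,
treat each one as a sparse term -> weight map and let a missing term
count as 0, so the result covers the union of all terms. Everything in
it is an assumption for illustration: the class name RocchioSketch, the
helper addScaled, and the use of plain java.util maps (Java 8 or later)
in place of Lucene's TermFreqVector. It follows the formula exactly as
written above, so it sums the documents without averaging and does not
clip negative weights.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of Lucene.
public class RocchioSketch {

    // Adds weight * vector into acc. A term that is not yet in acc is
    // treated as having weight 0, so vectors with different terms and
    // different lengths can be combined directly.
    static void addScaled(Map<String, Double> acc,
                          Map<String, Double> vector,
                          double weight) {
        for (Map.Entry<String, Double> e : vector.entrySet()) {
            acc.merge(e.getKey(), weight * e.getValue(), Double::sum);
        }
    }

    // Query(new) = alpha * Query(old)
    //            + beta  * Sum(relevant)
    //            - gamma * Sum(nonRelevant)
    static Map<String, Double> rocchio(Map<String, Double> oldQuery,
                                       List<Map<String, Double>> relevant,
                                       List<Map<String, Double>> nonRelevant,
                                       double alpha, double beta, double gamma) {
        Map<String, Double> newQuery = new HashMap<>();
        addScaled(newQuery, oldQuery, alpha);
        for (Map<String, Double> doc : relevant) {
            addScaled(newQuery, doc, beta);
        }
        for (Map<String, Double> doc : nonRelevant) {
            addScaled(newQuery, doc, -gamma);
        }
        return newQuery;
    }
}
```

Filling the maps from the term and frequency arrays of a TermFreqVector
would be a separate step that is not shown here.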
So my questions are:
Is this the only way to do it? (I hope not.)
Is there an add-on for Lucene that provides a real vector representation?
Does anyone have experience with this issue?
Thanks
Stefan