Hi Grant and Jose,
just to give some more details: as Jose said, avg_length is precalculated
at indexing time using a specific Similarity class. Basically this can
be done through the lengthNorm method: for each document and field the
total length is stored, when the indexing process is finish
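
To illustrate the idea above, here is a minimal sketch in plain Java of how
an average field length could be derived from per-document lengths collected
while indexing, and how it then enters the BM25 length normalization. This is
not the authors' code and it does not use the Lucene API: the class
AvgLengthSketch, its method names, and the parameters k1 and b are
hypothetical, and the idf formula is just one common BM25 variant.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of Lucene or the authors' module. */
final class AvgLengthSketch {

    /** Average field length: total token count divided by the number of documents. */
    static double averageLength(List<Integer> fieldLengths) {
        if (fieldLengths.isEmpty()) {
            return 0.0;
        }
        long total = 0;
        for (int length : fieldLengths) {
            total += length; // one length per document, recorded at indexing time
        }
        return (double) total / fieldLengths.size();
    }

    /**
     * BM25 contribution of one term in one document.
     * Assumes the non-negative idf variant log(1 + (N - n + 0.5) / (n + 0.5)).
     */
    static double bm25TermScore(int tf, int docLength, double avgLength,
                                long docCount, long docFreq, double k1, double b) {
        double idf = Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
        // Documents longer than average get a larger norm and so a lower score.
        double norm = k1 * (1.0 - b + b * docLength / avgLength);
        return idf * tf * (k1 + 1.0) / (tf + norm);
    }
}
```

With the same term frequency, a document shorter than avg_length scores
higher than a longer one, which is why the average has to be known before
query time.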
Hi Grant,
Our query expansion approach is quite simple: we apply pseudo-relevance
feedback techniques, where a number of top retrieved documents are
used to extract the candidate terms to expand the original query.
We have used TermPositions at query time to extract
the term statistics n
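
The pseudo-relevance feedback step described above could be sketched roughly
as follows. This is only an illustration under assumptions: the class
FeedbackSketch and its method are hypothetical, documents are modeled as
plain lists of terms rather than read through TermPositions, and candidates
are ranked by raw frequency in the top documents, which is a stand-in and not
necessarily the term-selection formula used in the module.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Lucene or the authors' module. */
final class FeedbackSketch {

    /**
     * Picks up to maxTerms expansion terms from the top retrieved documents,
     * skipping terms that are already in the original query.
     */
    static List<String> expansionTerms(List<List<String>> topDocs,
                                       Set<String> queryTerms, int maxTerms) {
        // Count how often each candidate term occurs across the top documents.
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> doc : topDocs) {
            for (String term : doc) {
                if (!queryTerms.contains(term)) {
                    counts.merge(term, 1, Integer::sum);
                }
            }
        }
        // Most frequent first; ties broken alphabetically for a stable order.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            if (result.size() == maxTerms) {
                break;
            }
            result.add(entry.getKey());
        }
        return result;
    }
}
```

The selected terms would then be appended to the original query before it is
run a second time.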
Hi José,
Can you explain your approach to implementing it? I'm curious how you
incorporated the avg. doc length. Also, have you followed any of
the flexible indexing discussions?
Finally, what's the license on this code?
Thanks,
Grant
On Oct 21, 2008, at 10:14 AM, José Ramón Pérez Agüer
Hello,
We have implemented a research module for Lucene using BM25 and our
structured version of BM25 as ranking functions and a couple of
state-of-the-art query expansion algorithms.
This implementation is quite different from other query expansion
modules for Lucene that are available on the web.
We