Re: Query Expansion Module for Lucene based on BM25 ranking function

2008-10-23 Thread Joaquin Perez Iglesias
Hi Grant and Jose, just to give some more details: as Jose said, avg_length is precalculated at indexing time using a specific Similarity class. Basically this can be done through the lengthNorm method: for each document and field the total length is stored, when the indexing process is finish
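
To illustrate the bookkeeping described in this message, here is a minimal standalone sketch of accumulating per-field lengths at indexing time and deriving an average from them. It does not use Lucene and is not the authors' Similarity class; the class name AvgLengthTracker and its methods are hypothetical, and it assumes the caller supplies each field's token count (for example, the value a lengthNorm callback receives).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene or the authors' module.
public class AvgLengthTracker {
    // Sum of token counts seen so far, per field name.
    private final Map<String, Long> totalLength = new HashMap<>();
    // Number of documents recorded so far, per field name.
    private final Map<String, Long> docCount = new HashMap<>();

    // Call once per (document, field) at indexing time with that field's token count.
    public void record(String field, int numTokens) {
        totalLength.merge(field, (long) numTokens, Long::sum);
        docCount.merge(field, 1L, Long::sum);
    }

    // Average field length over all recorded documents; 0.0 if the field was never recorded.
    public double averageLength(String field) {
        long docs = docCount.getOrDefault(field, 0L);
        if (docs == 0) {
            return 0.0;
        }
        return (double) totalLength.get(field) / docs;
    }
}
```

The average can then be persisted alongside the index so that a BM25-style scorer can read it at query time instead of recomputing it.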

Re: Query Expansion Module for Lucene based on BM25 ranking function

2008-10-22 Thread José Ramón Perez Aguera
Hi Grant, Our query expansion approach is quite simple: we apply pseudo-relevance feedback techniques, where a number of top retrieved documents are used to extract candidate terms to expand the original query. We have used TermPositions at query time to extract the term statistics n
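
As a rough illustration of the pseudo-relevance feedback idea described in this message, the sketch below picks expansion terms from the top retrieved documents. It is an assumption-laden stand-in, not the authors' code: documents are plain token lists rather than Lucene TermPositions, candidates are ranked by raw frequency across the feedback documents (the message's actual term statistics are cut off), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of pseudo-relevance feedback; not the authors' module.
public class PseudoRelevanceFeedback {

    // topDocs: tokens of the top retrieved documents (assumed already analyzed).
    // queryTerms: terms of the original query, which are excluded from the candidates.
    // maxTerms: maximum number of expansion terms to return.
    public static List<String> expansionTerms(List<List<String>> topDocs,
                                              Set<String> queryTerms,
                                              int maxTerms) {
        // Count how often each candidate term occurs across the feedback documents.
        Map<String, Integer> freq = new HashMap<>();
        for (List<String> doc : topDocs) {
            for (String term : doc) {
                if (!queryTerms.contains(term)) {
                    freq.merge(term, 1, Integer::sum);
                }
            }
        }

        // Sort by descending count; break ties alphabetically for a stable order.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        // Keep at most maxTerms of the best-ranked candidates.
        List<String> result = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < maxTerms; i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```

The returned terms would be appended to the original query before a second retrieval pass; a real implementation would normally weight candidates with a collection-aware statistic rather than raw counts.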

Re: Query Expansion Module for Lucene based on BM25 ranking function

2008-10-22 Thread Grant Ingersoll
Hi José, Can you explain your approach to implementing it? I'm curious how you incorporated the avg. doc length. Also, have you followed any of the flexible indexing discussions? Finally, what's the license on this code? Thanks, Grant On Oct 21, 2008, at 10:14 AM, José Ramón Pérez Agüer