Hi,
Thanks for the fix.
I also wonder whether you know of any free collections for testing pruning
approaches. Almost all the papers use TREC collections, which I don't have.
For now, I use the Reuters21578 collection and Carmel's Kendall's tau extension
to measure similarity. But I need a collection with
On 29/03/2012 11:14, Andrzej Bialecki wrote:
The problem in our implementation is that we use a within-document term
frequency (the number of occurrences of t in the current document) and
not a collection-wide term frequency... so, it looks to me that the fix
would be to first fully traverse the
On 27/03/2012 20:25, Zeynep P. wrote:
While using the pruning package, I realised that ridf is calculated in
RIDFTermPruningPolicy as follows:
Math.log(1 - Math.pow(Math.E, termPositions.freq() / maxDoc)) - df
However, according to the original paper (Blanco et al.) for residual idf,
it should be -log(df/D) + log(1 - e^(-tf/D)). T
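
For reference, the formula quoted above can be sketched in plain Java as follows. This is only an illustration under a few assumptions, not the code from the pruning package: the class and method names (RidfSketch, ridf, collectionTf) are made up for the example, tf is taken to be the collection-wide term frequency, and the natural logarithm is used because that is what Math.log computes. Note also that in Java an int divided by an int truncates, so the ratios are computed as doubles.

```java
public class RidfSketch {

    /**
     * Residual IDF as quoted in the thread: -log(df/D) + log(1 - e^(-tf/D)).
     *
     * @param df           number of documents that contain the term
     * @param collectionTf total occurrences of the term in the whole collection
     *                     (assumed meaning of "tf" in the formula)
     * @param numDocs      number of documents in the collection (D)
     */
    static double ridf(long df, long collectionTf, long numDocs) {
        if (df <= 0 || collectionTf <= 0 || numDocs <= 0) {
            throw new IllegalArgumentException("df, collectionTf and numDocs must be positive");
        }
        // Observed IDF; the cast avoids integer division.
        double observedIdf = -Math.log((double) df / numDocs);
        // Average number of occurrences per document.
        double lambda = (double) collectionTf / numDocs;
        // 1 - e^(-lambda), written with expm1 to stay accurate for small lambda.
        double expectedDocRatio = -Math.expm1(-lambda);
        return observedIdf + Math.log(expectedDocRatio);
    }

    /** Sums within-document frequencies into a collection-wide frequency. */
    static long collectionTf(int[] perDocFreqs) {
        long total = 0;
        for (int f : perDocFreqs) {
            total += f;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical term: appears in 3 documents of a 1000-document collection.
        int[] freqs = {3, 1, 6};
        long tf = collectionTf(freqs);
        System.out.println("collection tf = " + tf);
        System.out.println("ridf = " + ridf(freqs.length, tf, 1000));
    }
}
```

A term whose occurrences are spread one per document gets a value close to zero, while a term that clusters in a few documents gets a clearly positive value.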
That is perfect
Thank you very much
Best regards
ZP
--
View this message in context:
http://lucene.472066.n3.nabble.com/delete-entries-from-posting-list-Lucene-4-0-tp3838649p3839095.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
On 19/03/2012 11:24, Zeynep P. wrote:
I need to delete entries from a posting list. How can I do this in Lucene 4.0?
I need this to test different pruning algorithms.
Thanks in advance
http://issues.apache.org/jira/browse/LUCENE-1812
http://issues.apache.org/jira/browse/LUCENE-2632
--
Best reg
I need to delete entries from a posting list. How can I do this in Lucene 4.0?
I need this to test different pruning algorithms.
Thanks in advance
ZP
--
View this message in context:
http://lucene.472066.n3.nabble.com/delete-entries-from-posting-list-Lucene-4-0-tp3838649p3838649.html
Sent from