Re: Doc length nomalization in Lucene LM

2016-07-22 Thread Ahmet Arslan
Hi, Yes, as you discovered, there is some precision loss during the encode/decode process. Ahmet On Friday, July 22, 2016 1:59 PM, Dwaipayan Roy wrote: Thanks for your reply. But I still have some doubts. From your answer, I think you mean to say that the document length is just saved in

Re: Doc length nomalization in Lucene LM

2016-07-22 Thread Dwaipayan Roy
Thanks for your reply. But I still have some doubts. From your answer, I think you mean to say that the document length is just saved in byte format for less memory consumption. But while debugging, I found that the doc length that is passed to score() is 2621.44, whereas the actual doc length is 2

Re: Doc length nomalization in Lucene LM

2016-07-22 Thread Ahmet Arslan
Hi Roy, It is about storing the document length in a byte (to use less memory). Please edit the source code to avoid this encode/decode thing:
    /**
     * Encodes the document length in a lossless way
     */
    @Override
    public long computeNorm(FieldInvertState state) {
      return state.getLength() - state.getN
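
The precision loss discussed in this thread can be reproduced without Lucene. The sketch below is an illustration, not the library source: it assumes the norm is stored as a single byte by the same scheme as Lucene's SmallFloat.floatToByte315 / byte315ToFloat, and that the encoded value is 1/sqrt(length) with a boost of 1, decoded back as 1/(f*f). The class name NormPrecisionDemo, the helpers encodeLength and decodeLength, and the sample lengths are hypothetical; the length 2000 is an arbitrary example, not the length from the original message.

```java
// Sketch only: a standalone re-implementation of a one-byte float encoding
// in the style of Lucene's SmallFloat (assumed scheme, see the lead-in).
final class NormPrecisionDemo {

    // Keep only the top bits of the float; everything below is truncated.
    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);
        if (smallfloat <= ((63 - 15) << 3)) {
            // Underflow: zero/negative maps to 0, tiny positive values to 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1; // Overflow: saturate at the largest byte (0xFF).
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    // Inverse mapping: rebuild a float from the stored byte.
    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Assumed norm encoding: boost of 1 divided by sqrt(document length).
    static byte encodeLength(float length) {
        return floatToByte315(1.0f / (float) Math.sqrt(length));
    }

    // Assumed decoding: invert the square root on the lossy float.
    static float decodeLength(byte norm) {
        float f = byte315ToFloat(norm);
        return 1.0f / (f * f);
    }

    public static void main(String[] args) {
        // Hypothetical document lengths, chosen only for illustration.
        for (float length : new float[] {10f, 100f, 2000f, 2500f}) {
            byte norm = encodeLength(length);
            System.out.println(length + " -> byte " + (norm & 0xff)
                    + " -> " + decodeLength(norm));
        }
    }
}
```

Under these assumptions, lengths of 2000 and 2500 are stored as the same byte and both decode to approximately 2621.44, the value reported as being passed to score(). Many distinct lengths share one decoded value, which is why a lossless computeNorm was suggested.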