Re: understanding the norm encode and decode

2015-03-05 Thread Ahmet Arslan
Hi András, That's a good catch! Do you want to correct that javadoc mistake and create a patch? https://wiki.apache.org/lucene-java/HowToContribute If you don't have a jira account, anyone can create one. https://issues.apache.org/jira/browse/lucene Ahmet On Thursday, March 5, 2015 11:15 AM,

Re: understanding the norm encode and decode

2015-03-05 Thread wangdong
Thank you for your detailed answer. I get it now. Since the document I read is official material, I was unsure whether it was correct, so I asked the question. Thank you again! andrew On 2015/3/5 17:14, András Péteri wrote: Sorry, I also got it wrong in the previous message. :) It goes 0.89f -> 123 -> 0.875f. On Thu

Re: understanding the norm encode and decode

2015-03-05 Thread András Péteri
Sorry, I also got it wrong in the previous message. :) It goes 0.89f -> 123 -> 0.875f. On Thu, Mar 5, 2015 at 10:08 AM, András Péteri wrote: > Hi Andrew, > > If you are using Lucene 3.6.1, you can take a look at the method which > creates a single byte value out of the received float using bit >

Re: understanding the norm encode and decode

2015-03-05 Thread András Péteri
Hi Andrew, If you are using Lucene 3.6.1, you can take a look at the method which creates a single byte value out of the received float using bit manipulation at [1]. There is also a 256-element decoder table in Similarity, where each byte corresponds to a decoded float value computed by [2]. The
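
For readers following the thread, here is a minimal Java sketch of the single-byte float encoding described above (3 mantissa bits, zero exponent 15). It is written as an illustration of the bit manipulation, not copied from the Lucene source, so the class name `NormByteSketch` and the method names `encode` and `decode` are assumptions; check it against the code referenced in [1] and [2] before relying on it. It reproduces the 0.89f -> 123 -> 0.875f round trip reported later in the thread.

```java
final class NormByteSketch {

    // Hypothetical stand-in for the 256-element decoder table mentioned in the
    // message: one decoded float per possible byte value.
    static final float[] DECODER = new float[256];

    static {
        for (int i = 0; i < 256; i++) {
            DECODER[i] = decode((byte) i);
        }
    }

    // Packs a float into one byte, assuming 3 mantissa bits and zero exponent 15.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);      // drop the low-order mantissa bits
        int zero = (63 - 15) << 3;         // offset between float and byte exponents
        if (small <= zero) {
            // Zero and negative inputs map to 0; tiny positives to the smallest nonzero byte.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= zero + 0x100) {
            return (byte) -1;              // overflow: clamp to the largest byte value
        }
        return (byte) (small - zero);
    }

    // Expands a byte produced by encode() back into a float.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        byte b = encode(0.89f);
        // prints "123 -> 0.875"
        System.out.println((b & 0xff) + " -> " + DECODER[b & 0xff]);
    }
}
```

Under this sketch every float from 0.875f up to, but not including, 1.0f lands on byte 123, which is why 0.89f decodes to 0.875f rather than to its original value.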

Re: understanding the norm encode and decode

2015-03-04 Thread wangdong
Thank you for your discussion. I am a junior user of Lucene, so I am not familiar with some of the deeper concepts you mentioned. My question is simple. I just want to know how the official document gets 0.75 from decode(encode(0.89)). Why not 0.875? (0.875=0.5+0.25+0.125) Thanks, andrew On 2015/3/

Re: understanding the norm encode and decode

2015-03-04 Thread Adrien Grand
Norms and doc values are indeed using the same API. However implementations differ a bit (eg. norms are stored in memory and use different compression schemes). The precision loss is up to the similarity. You could write a similarity impl which keeps full float precision, but scoring being fuzzy a

Re: understanding the norm encode and decode

2015-03-04 Thread Ahmet Arslan
Hi Adrien, I read somewhere that norms are stored using docValues. In my understanding, docvalues can store lossless float values. So the question is, why do several decode/encode methods still exist in similarity implementations? Intuitively switching to docvalues for norms should prevent prec

Re: understanding the norm encode and decode

2015-03-04 Thread Adrien Grand
Hi, Floats require 32 bits but norms are encoded on a single byte. So there is a precision loss when encoding float values into a single byte. In your example, 0.75 and 0.89 are sufficiently close to each other so that they are encoded to the same byte. On Wed, Mar 4, 2015 at 4:48 AM, wangdong w

understanding the norm encode and decode

2015-03-03 Thread wangdong
I read the article about the scoring section in lucene as follows: Encoding and decoding of the resulted float norm in a single byte are done by the static methods of the class Similarity:encodeNorm()