Norms and doc values do indeed use the same API. However, the implementations differ a bit (e.g. norms are stored in memory and use different compression schemes).
The precision loss is up to the similarity. You could write a similarity implementation that keeps full float precision, but since scoring is fuzzy anyway, this would multiply your memory needs for norms by 4 without really improving the quality of the scores of your documents. This precision loss is the right trade-off for most use cases.

On Wed, Mar 4, 2015 at 3:04 PM, Ahmet Arslan <iori...@yahoo.com.invalid> wrote:
> Hi Adrien,
>
> I read somewhere that norms are stored using docValues.
> In my understanding, docvalues can store lossless float values.
> So the question is, why do several decode/encode methods still exist in
> similarity implementations?
> Intuitively, switching to docvalues for norms should prevent the precision
> loss.
>
> Ahmet
>
>
> On Wednesday, March 4, 2015 3:22 PM, Adrien Grand <jpou...@gmail.com> wrote:
> Hi,
>
> Floats require 32 bits but norms are encoded on a single byte. So
> there is a precision loss when encoding float values into a single
> byte. In your example, 0.75 and 0.89 are sufficiently close to each
> other so that they are encoded to the same byte.
>
>
> On Wed, Mar 4, 2015 at 4:48 AM, wangdong <hrdxwa...@gmail.com> wrote:
>> I read the article about the scoring section in lucene as follows:
>>
>> Encoding and decoding of the resulted float norm in a single byte are done
>> by the static methods of the class Similarity: encodeNorm()
>> <http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/Similarity.html#encodeNorm%28float%29>
>> and decodeNorm()
>> <http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/Similarity.html#decodeNorm%28byte%29>.
>> Due to loss of precision, it is not guaranteed that decode(encode(x)) = x,
>> e.g. decode(encode(0.89)) = 0.75. At scoring (search) time, this norm is
>> brought into the score of document as *norm(t, d)*, as shown by the formula
>> in Similarity
>> <http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/Similarity.html>.
>>
>> I cannot understand the formula decode(encode(0.89)) = 0.75.
>> How can I get 0.75 from the left-hand side?
>>
>> Can anyone help me?
>> Thanks in advance!
>>
>> andrew
>
>
>
> --
> Adrien
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>

--
Adrien
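
For readers following the thread, here is a minimal sketch of why squeezing a 32-bit float into one byte loses precision. It is a toy, not Lucene's encoding: the class name NormByteSketch and the bit layout (keep only the low 7 exponent bits and the top mantissa bit of the IEEE 754 float) are assumptions chosen to make the truncation easy to see. Lucene's own encoder uses a different bit layout and also handles zero, underflow and overflow, so the values it produces may differ from the ones printed here.

```java
public class NormByteSketch {
    // Toy layout (an assumption for illustration, not Lucene's):
    // keep bits 22..29 of the float, i.e. the low 7 exponent bits
    // and the single highest mantissa bit. Everything below bit 22
    // (the remaining 22 mantissa bits) is thrown away.
    private static final int SHIFT = 22;

    // Only valid for 0 < f < 2, where the sign bit and the top
    // exponent bit are both zero and can therefore be dropped.
    public static byte encode(float f) {
        if (!(f > 0f && f < 2f)) {
            throw new IllegalArgumentException("toy encoder expects 0 < f < 2");
        }
        // The cast keeps the low 8 bits of the shifted pattern.
        return (byte) (Float.floatToRawIntBits(f) >> SHIFT);
    }

    // Put the 8 kept bits back in place; the discarded bits become zero.
    public static float decode(byte b) {
        return Float.intBitsToFloat((b & 0xFF) << SHIFT);
    }

    public static void main(String[] args) {
        // 0.89 = 1.78 * 2^-1; with one mantissa bit, 1.78 truncates to 1.5.
        System.out.println(decode(encode(0.89f))); // prints 0.75
        // 0.75 = 1.5 * 2^-1 is exactly representable in this layout.
        System.out.println(decode(encode(0.75f))); // prints 0.75
        // Both inputs collapse to the same byte.
        System.out.println(encode(0.89f) == encode(0.75f)); // prints true
    }
}
```

In this toy layout every value in [0.75, 1.0) decodes to 0.75, which is the kind of collision described above: the byte simply has no bits left to tell such values apart. Keeping the full float would avoid that at four times the storage per document and field.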