Hi Andrew,

If you are using Lucene 3.6.1, you can take a look at the method which
creates a single byte value out of the received float using bit
manipulation at [1]. There is also a 256-element decoder table in
Similarity, where each byte corresponds to a decoded float value
computed by [2].

The first method encodes 0.89f to byte 123, and the second method
decodes 123 to 0.875f (bit pattern 0x3F600000), not 0.75f, so it seems
that the documentation is incorrect in this regard.
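
To make the round trip easy to check, here is a minimal, self-contained
sketch of the two methods and the decoder table. It is written from my
reading of the linked SmallFloat source, so treat it as an illustration
rather than the exact Lucene code. The class name SmallFloatSketch is
made up for this mail, and NORM_TABLE only stands in for the decoder
table mentioned above.

```java
// Sketch of the one-byte float encoding (3 mantissa bits, 5 exponent bits,
// zero exponent point 15). Assumption: this mirrors floatToByte315 and
// byte315ToFloat in Lucene 3.6.1; SmallFloatSketch is a hypothetical name.
final class SmallFloatSketch {

    // Illustrative 256-entry decoder table: one decoded float per byte value.
    static final float[] NORM_TABLE = new float[256];
    static {
        for (int i = 0; i < 256; i++) {
            NORM_TABLE[i] = byte315ToFloat((byte) i);
        }
    }

    // Encode: keep only the top bits of the IEEE 754 representation
    // (low exponent bits plus the 3 highest mantissa bits), rebased to fit a byte.
    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);
        if (smallfloat <= ((63 - 15) << 3)) {
            // Zero or negative maps to 0; positive underflow maps to the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1; // overflow: largest representable code (0xFF)
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    // Decode: shift the byte back into place and restore the exponent offset.
    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        byte b = floatToByte315(0.89f);
        System.out.println(b + " -> " + byte315ToFloat(b)); // 123 -> 0.875
    }
}
```

With only 3 mantissa bits, the values available between 0.5 and 1.0 are
0.5, 0.5625, 0.625, ..., 0.875, 0.9375. Encoding truncates instead of
rounding, so 0.89f falls back to 0.875f. In this sketch 0.75f gets its
own byte (122) and decodes back to 0.75f exactly.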

[1] https://github.com/apache/lucene-solr/blob/lucene_solr_3_6_1/lucene/core/src/java/org/apache/lucene/util/SmallFloat.java#L75
[2] https://github.com/apache/lucene-solr/blob/lucene_solr_3_6_1/lucene/core/src/java/org/apache/lucene/util/SmallFloat.java#L88

On Thu, Mar 5, 2015 at 3:45 AM, wangdong <hrdxwa...@gmail.com> wrote:
> Thank you for your discussion.
>
> I am a junior user of Lucene, so I am not familiar with some of the deep
> concepts you mentioned.
> My question is simple. I just want to know how to get 0.75 from
> decode(encode(0.89)) in the official documentation.
>
> why not 0.875?   (0.875=0.5+0.25+0.125)
>
> thanks
> andrew
>
> On 2015/3/4 22:54, Adrien Grand wrote:
>>
>> Norms and doc values are indeed using the same API. However
>> implementations differ a bit (eg. norms are stored in memory and use
>> different compression schemes).
>>
>> The precision loss is up to the similarity. You could write a
>> similarity impl which keeps full float precision, but scoring being
>> fuzzy anyway this would multiply your memory needs for norms by 4
>> while not really improving the quality of the scores of your
>> documents. This precision loss is the right trade-off for most
>> use-cases.
>>
>> On Wed, Mar 4, 2015 at 3:04 PM, Ahmet Arslan <iori...@yahoo.com.invalid>
>> wrote:
>>>
>>> Hi Adrien,
>>>
>>> I read somewhere that norms are stored using docValues.
>>> In my understanding, docvalues can store lossless float values.
>>> So the question is, why do several decode/encode methods still exist in
>>> similarity implementations?
>>> Intuitively, switching to docvalues for norms should prevent the
>>> precision loss.
>>>
>>> Ahmet
>>>
>>>
>>> On Wednesday, March 4, 2015 3:22 PM, Adrien Grand <jpou...@gmail.com>
>>> wrote:
>>> Hi,
>>>
>>> Floats require 32 bits but norms are encoded on a single byte. So
>>> there is a precision loss when encoding float values into a single
>>> byte. In your example, 0.75 and 0.89 are sufficiently close to each
>>> other so that they are encoded to the same byte.
>>>
>>>
>>> On Wed, Mar 4, 2015 at 4:48 AM, wangdong <hrdxwa...@gmail.com> wrote:
>>>>
>>>> I read the article about the scoring section in lucene as follows:
>>>>
>>>> Encoding and decoding of the resulted float norm in a single byte are
>>>> done by the static methods of the class Similarity: encodeNorm()
>>>> <http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/Similarity.html#encodeNorm%28float%29>
>>>> and decodeNorm()
>>>> <http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/Similarity.html#decodeNorm%28byte%29>.
>>>> Due to loss of precision, it is not guaranteed that decode(encode(x)) =
>>>> x, e.g. decode(encode(0.89)) = 0.75. At scoring (search) time, this norm
>>>> is brought into the score of document as norm(t, d), as shown by the
>>>> formula in Similarity
>>>> <http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/search/Similarity.html>.
>>>>
>>>> I cannot understand the formula decode(encode(0.89)) = 0.75.
>>>> How can I get 0.75 from the left-hand side?
>>>>
>>>> Can anyone help me?
>>>> Thanks in advance!
>>>>
>>>> andrew
>>>
>>>
>>>
>>> --
>>> Adrien
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>
>>
>

-- 
András

