[ 
https://issues.apache.org/jira/browse/LUCENE-7179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15226699#comment-15226699
 ] 

Nicholas Knize commented on LUCENE-7179:
----------------------------------------

I don't disagree with the "pain" points. But you have to remember that 
{{GeoPointField}} works by way of a quad tree represented in unsigned long 
space. This isn't "quantization" for memory/disk purposes; it's a dimensionality 
reduction technique. {{GeoPointTermsEnum}} relations simply reduce to a bunch 
of prefix masking and bit operations. The fact that the space-filling curve is 
represented as a 64-bit long is only for bit operation simplicity. I could 
change it to a bigger bit space and make it closer to lossless, but that just 
makes the enum code hairier.
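
To illustrate the general idea of a quad tree packed into a 64-bit long, here is a minimal sketch of Morton (Z-order) bit interleaving with a prefix-mask comparison. It is not the actual {{GeoPointField}} or {{GeoPointTermsEnum}} code: the class {{MortonSketch}} and the methods {{scale}}, {{spread}}, {{interleave}} and {{sharesPrefix}} are hypothetical names, and the assumption is 32 bits per dimension with the latitude bit placed above the longitude bit at each level.

```java
class MortonSketch {
    // Map a coordinate in [min, max] onto an unsigned 32-bit value held in a long.
    // Flooring is an assumption of this sketch; the top of the range is clamped.
    static long scale(double value, double min, double max) {
        double cells = (double) (1L << 32);
        long bits = (long) Math.floor((value - min) / (max - min) * cells);
        return Math.min(bits, 0xFFFFFFFFL);
    }

    // Spread the low 32 bits of v so that bit i moves to bit 2*i.
    static long spread(long v) {
        v &= 0xFFFFFFFFL;
        v = (v | (v << 16)) & 0x0000FFFF0000FFFFL;
        v = (v | (v << 8)) & 0x00FF00FF00FF00FFL;
        v = (v | (v << 4)) & 0x0F0F0F0F0F0F0F0FL;
        v = (v | (v << 2)) & 0x3333333333333333L;
        v = (v | (v << 1)) & 0x5555555555555555L;
        return v;
    }

    // Interleave two 32-bit values into one 64-bit Morton code:
    // longitude bits land on even positions, latitude bits on odd positions.
    static long interleave(long lonBits, long latBits) {
        return spread(lonBits) | (spread(latBits) << 1);
    }

    // Two codes fall in the same quad-tree cell at the given level (0..32)
    // when their top 2*level bits match. Level 0 is the whole space.
    static boolean sharesPrefix(long a, long b, int level) {
        if (level == 0) {
            return true;
        }
        long mask = -1L << (64 - 2 * level);
        return (a & mask) == (b & mask);
    }

    public static void main(String[] args) {
        long lon = scale(-73.98, -180.0, 180.0);
        long lat = scale(40.75, -90.0, 90.0);
        long code = interleave(lon, lat);
        System.out.println(Long.toBinaryString(code));
    }
}
```

With this layout, a cell relation check is a mask and a compare, and widening the code beyond 64 bits would mean replacing the single long with a multi-word representation.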

bq. I tried to port LatLonPoints "rounds down" test and it fails

If you're referring to {{TestEncodingUtils.testEncodeDecodeRoundsDown}}, it 
passes fine with the LUCENE-7164 64-bit space change. It won't pass if you 
change the GeoPoint encoding to use {{Math.round}}. But again... all of these 
inconsistencies are occurring within the expected accepted TOLERANCE, so they 
shouldn't be a surprise. It's the same as casting a double to a float and back 
and expecting numerical stability.
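
The difference between a "rounds down" encoding and a {{Math.round}} encoding can be sketched as follows. This is only an illustration under stated assumptions: the class {{QuantizeSketch}}, its methods, and the grid step of 2^-20 are hypothetical and chosen so the arithmetic is exact; they are not the encoding or the TOLERANCE used by GeoPoint or LatLonPoint.

```java
class QuantizeSketch {
    // Hypothetical grid step: a power of two, so scaling by it is exact in doubles.
    static final double STEP = 1.0 / (1 << 20);

    // Quantize by flooring: the decoded value never exceeds the input.
    static long encodeFloor(double value) {
        return (long) Math.floor(value / STEP);
    }

    // Quantize by rounding to nearest: the decoded value may exceed the input.
    static long encodeRound(double value) {
        return Math.round(value / STEP);
    }

    static double decode(long bits) {
        return bits * STEP;
    }

    public static void main(String[] args) {
        // A value three quarters of the way between two grid points.
        double x = 10.0 + 0.75 * STEP;

        double down = decode(encodeFloor(x));
        double nearest = decode(encodeRound(x));

        System.out.println("floor decoded <= x: " + (down <= x));
        System.out.println("round decoded <= x: " + (nearest <= x));
        System.out.println("both within one step: "
            + (Math.abs(x - down) < STEP && Math.abs(x - nearest) < STEP));

        // The float analogy: a double does not generally survive a float round trip.
        double d = 0.1;
        System.out.println("float round trip exact: " + ((double) (float) d == d));
    }
}
```

Both encodings stay within one grid step of the input, so they differ only in which side of the input the decoded value lands on. A test that asserts the decoded value is never greater than the input holds for the floor variant and not for the rounding variant.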

bq. its buggy for some double values

?? Not sure I follow. It's not lossless, if that's what you mean? But that's also 
a known limitation of using 32-bit unsigned space.

bq. Otherwise I don't think we should do quantization!

It's needed for GeoPointField. But if we don't want to handle the dimensional 
reduction limitations, we can remove this approach altogether. Note that we 
haven't even begun to tap into its optimization potential.

> GeoPoint and LatLonPoint test data should quantize once
> -------------------------------------------------------
>
>                 Key: LUCENE-7179
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7179
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Nicholas Knize
>         Attachments: LUCENE-7179.patch
>
>
> {{LatLonPoint}} and {{GeoPointField}} tests pre-quantize test data to ensure 
> consistency with indexed (encoded) data. The pre-quantized data is then 
> indexed, undergoing another quantization. To guarantee numerical stability, 
> this should be changed such that the test data is quantized after indexing.



