mkaravel commented on PR #2971:
URL: https://github.com/apache/parquet-java/pull/2971#issuecomment-2814278248

   > Then I have some questions:
   >
   > What is an invalid value in a geometry feature? NaN? +/-Inf? Anything else?
   > From the above discussion, it seems that +/-Inf are invalid values in
   > terms of a bbox. If that's true, definitely we should not make it as the final
   > bbox to persist in the file. Is NaN a valid value in a bbox? Is it a good way
   > to use NaN values for an empty bbox?
   > Is it a good approach to drop the entire bbox if any NaN or invalid value
   > appears? In this way, we do not fail the writer at the cost of missing bbox.
   > I'm in favor of this so we do not produce any confusing stats to users. It is
   > really hard to downstream users to decide if the provided stats are reliable
   > for predicate push down.
   
   @wgtmac Just wanted to share my view/opinion on this:
   * In a geometry feature NaN or +/-inf values do not make sense. +/-inf 
values could make sense in a geometric (not geographic) bounding box, but this 
would be a convention.
   * Given the above statement, +/-inf values in a bounding box could 
potentially be persisted but this would be corrupt or invalid data. The 
engine/reader could choose how to handle them.
   * I think using NaN values to represent empty boxes makes a lot of sense 
and it provides more information than dropping a box of NaN values (by "drop" 
I mean write nothing instead). Specifically, if I see no bounding box I am 
basically forced to believe that I know nothing about my data. If I see an 
empty box then I know I can safely skip this piece of data for certain 
operations (like spatial predicates).
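
To illustrate the difference between a missing box and an empty box on the reader side, here is a minimal sketch. Everything in it is hypothetical: `BBoxSketch`, `Box`, and `canSkip` are illustrative names and not part of parquet-java, and the all-NaN encoding of an empty box is only the convention proposed above. Treating non-finite bounds as "cannot skip" is likewise an assumption about how a conservative reader might behave.

```java
import java.util.Optional;

/** Hypothetical sketch only; none of these names exist in parquet-java. */
final class BBoxSketch {

    /** A 2D bounding box. An empty box is encoded with all-NaN bounds (assumed convention). */
    static final class Box {
        final double xMin, yMin, xMax, yMax;

        Box(double xMin, double yMin, double xMax, double yMax) {
            this.xMin = xMin;
            this.yMin = yMin;
            this.xMax = xMax;
            this.yMax = yMax;
        }

        static Box empty() {
            return new Box(Double.NaN, Double.NaN, Double.NaN, Double.NaN);
        }

        /** True only when every bound is NaN, i.e. the box covers no coordinates. */
        boolean isEmpty() {
            return Double.isNaN(xMin) && Double.isNaN(yMin)
                && Double.isNaN(xMax) && Double.isNaN(yMax);
        }

        /** True when no bound is NaN or +/-Infinity. */
        boolean isFinite() {
            return Double.isFinite(xMin) && Double.isFinite(yMin)
                && Double.isFinite(xMax) && Double.isFinite(yMax);
        }

        /** Closed-interval overlap test; only meaningful for finite boxes. */
        boolean intersects(Box o) {
            return xMin <= o.xMax && o.xMin <= xMax
                && yMin <= o.yMax && o.yMin <= yMax;
        }
    }

    /** Returns true when the data may be skipped for an "intersects query" predicate. */
    static boolean canSkip(Optional<Box> stats, Box query) {
        if (!stats.isPresent()) {
            return false; // no bbox written: nothing is known, so the data must be read
        }
        Box box = stats.get();
        if (box.isEmpty()) {
            return true; // empty box: no coordinates at all, so nothing can match
        }
        if (!box.isFinite()) {
            return false; // partial NaN or +/-Inf: treat as unreliable (assumed policy)
        }
        return !box.intersects(query);
    }
}
```

With this convention, `canSkip(Optional.empty(), query)` is always `false`, while `canSkip(Optional.of(Box.empty()), query)` is always `true`, which is the extra information an explicit empty box carries over a dropped one.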
   
   Hope this makes sense.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

