orlp commented on PR #514:
URL: https://github.com/apache/parquet-format/pull/514#issuecomment-3183142450

   > Am I understanding correctly that the change to add nan_counts means that 
NaNs are now excluded from the min/max statistics? I believe this creates a new 
problem for engines that use total ordering as the sign of the NaNs is not 
known, and therefore there are scenarios where predicates can't be pushed down 
where they could be in the absence of nan_counts.
   
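   The predicates can still be pushed down. If your engine uses total ordering,
then for a predicate `col > c` you keep a page when the statistics indicate
`col_max > c || nan_count != 0`. A positive NaN sorts above every number in the
total order, so a page that contains any NaN has to be treated as a possible
match.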
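
A minimal sketch of that page-pruning check, under stated assumptions: the
`PageStats` class, its field names, and `may_match_greater_than` are
hypothetical and are not part of the Parquet format or of any engine's API. It
assumes min/max exclude NaNs, that `c` itself is not NaN, and it uses the page
maximum as the bound, since a page can only hold a non-NaN value above `c` when
its maximum exceeds `c`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageStats:
    """Hypothetical per-page statistics; field names are illustrative only."""
    col_min: Optional[float]  # smallest non-NaN value, None if the page has none
    col_max: Optional[float]  # largest non-NaN value, None if the page has none
    nan_count: int            # number of NaN values in the page


def may_match_greater_than(stats: PageStats, c: float) -> bool:
    """Return True if the page might hold a row with col > c under total ordering.

    The sign of the NaNs is unknown, so any NaN is treated conservatively as a
    possible match (a positive NaN sorts above every number).
    """
    if stats.nan_count != 0:
        return True
    return stats.col_max is not None and stats.col_max > c


pages = [
    PageStats(col_min=0.0, col_max=5.0, nan_count=0),  # max above c: keep
    PageStats(col_min=0.0, col_max=1.0, nan_count=0),  # max below c, no NaNs: skip
    PageStats(col_min=0.0, col_max=1.0, nan_count=3),  # NaNs present: keep
]
kept = [may_match_greater_than(p, 2.0) for p in pages]
```

The check is deliberately conservative: a page with NaNs is never skipped, which
costs some pruning when the NaNs turn out to be negative but never drops a
matching row.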



