JFinis commented on PR #221:
URL: https://github.com/apache/parquet-format/pull/221#issuecomment-2943678700

   > There is actually a problem with the singular NaN count for data systems 
which use IEEE 754 total ordering (such as DataFusion): they would need two 
counts for efficient page filtering in the face of NaNs: one for positive NaNs 
and one for negative NaNs.
   
   I don't think that's a big problem. It just means that if the system needs 
to include either -NaN or +NaN in a query, any page that has a non-zero 
`nan_count` has to be scanned. Yes, that might mean that you scan a page in 
vain: if you're only looking for, say, +NaN, the page might happen to contain 
only -NaN. But this seems to be a rather small problem.
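
   To make the trade-off concrete, here is a rough sketch of that conservative 
filtering rule. Only `nan_count` comes from the discussion; `PageStats`, 
`compute_stats`, `must_scan_for_nan` and `has_positive_nan` are hypothetical 
names for illustration, not part of any Parquet API.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class PageStats:
    # Hypothetical stand-in for per-page statistics; a single count covers
    # both +NaN and -NaN.
    nan_count: int


def compute_stats(values: List[float]) -> PageStats:
    # math.isnan ignores the sign bit, so both NaN signs are counted together.
    return PageStats(nan_count=sum(1 for v in values if math.isnan(v)))


def must_scan_for_nan(stats: PageStats) -> bool:
    # Conservative rule: the count cannot tell +NaN from -NaN, so any page
    # with a non-zero count has to be scanned when the query involves a NaN.
    return stats.nan_count > 0


def has_positive_nan(values: List[float]) -> bool:
    # The sign is only known once the page is actually read.
    return any(math.isnan(v) and math.copysign(1.0, v) > 0 for v in values)


POS_NAN = math.copysign(float("nan"), 1.0)
NEG_NAN = math.copysign(float("nan"), -1.0)

pages = {
    "a": [1.0, POS_NAN],  # contains +NaN
    "b": [2.0, NEG_NAN],  # contains only -NaN
    "c": [3.0, 4.0],      # no NaN at all
}

# A query looking for +NaN: page "c" is skipped from statistics alone,
# pages "a" and "b" are both scanned, and "b" turns out to be scanned in vain.
scanned = [name for name, vals in pages.items()
           if must_scan_for_nan(compute_stats(vals))]
hits = [name for name in scanned if has_positive_nan(pages[name])]
```

   The cost of the single count is the wasted scan of page "b"; pages without 
any NaN are still pruned.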


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

