It could also be reasonable to add metadata to SortField that declares
which floating-point ordering a sort uses. The Java ordering and the
IEEE 754 total ordering are both valid, depending on context.
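
To make the difference concrete, here is a rough sketch comparing the two
orderings. Double.compare is the real JDK method; FloatOrderDemo,
totalOrder and key are hypothetical names of mine, and the bit trick is
one common way to realize the IEEE 754 totalOrder predicate for binary64,
not anything taken from Iceberg or Parquet.

```java
import java.util.Arrays;

final class FloatOrderDemo {
    // Sketch of the IEEE 754 totalOrder predicate for doubles (an assumption
    // about how one might implement it, not Iceberg's code).
    static int totalOrder(double a, double b) {
        return Long.compare(key(a), key(b));
    }

    // Maps raw bits to a signed long whose natural order matches totalOrder:
    // -NaN < -Infinity < -value < -0 < +0 < value < Infinity < NaN.
    static long key(double d) {
        // The raw variant keeps the NaN sign bit and payload intact.
        long bits = Double.doubleToRawLongBits(d);
        // When the sign bit is set, flip the 63 magnitude bits so that
        // larger magnitudes sort first among negative values.
        return bits ^ ((bits >> 63) & Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        // A quiet NaN with the sign bit set, i.e. "-NaN".
        double negNaN = Double.longBitsToDouble(0xFFF8000000000000L);
        Double[] values = {
            Double.NaN, 1.0, 0.0, -0.0, negNaN,
            Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY, -1.0
        };

        // Java ordering: every NaN is equal and sorts after +Infinity.
        Double[] javaOrder = values.clone();
        Arrays.sort(javaOrder, Double::compare);

        // IEEE 754 total ordering: the NaN sign bit decides which end it lands on.
        Double[] ieeeOrder = values.clone();
        Arrays.sort(ieeeOrder, FloatOrderDemo::totalOrder);

        print("Double.compare", javaOrder);
        print("totalOrder", ieeeOrder);
    }

    // Prints raw bits too, because both NaNs render as "NaN".
    static void print(String label, Double[] sorted) {
        System.out.println(label + ":");
        for (Double d : sorted) {
            System.out.println("  " + d + " (0x"
                + Long.toHexString(Double.doubleToRawLongBits(d)) + ")");
        }
    }
}
```

Under Double.compare the sign-bit-set NaN is indistinguishable from
Double.NaN, so only the total ordering can actually produce the leading
"-NaN" from the spec's sentence.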

On Thu, Feb 27, 2025 at 6:00 PM Gang Wu <ust...@gmail.com> wrote:

> FYI: there was an effort from Jan (cc'd) to introduce a total order for
> floating-point numbers on the Parquet side: [1][2].
>
> [1] https://github.com/apache/parquet-format/pull/221
> [2] https://github.com/apache/parquet-format/pull/196
>
> On Thu, Feb 27, 2025 at 4:24 AM Devin Smith
> <devinsm...@deephaven.io.invalid> wrote:
>
>> The spec https://iceberg.apache.org/spec/#sorting says
>>
>>> Sorting floating-point numbers should produce the following behavior:
>>> -NaN < -Infinity < -value < -0 < 0 < value < Infinity < NaN. This
>>> aligns with the implementation of Java floating-point types comparisons.
>>
>>
>> As far as I know, this does not align with how Java compares
>> floating-point types, because Java has no concept of -NaN. There
>> may be some more explicit total ordering regimes, such as IEEE 754-2019
>> - Standard for Floating-Point Arithmetic
>> <https://ieeexplore.ieee.org/document/8766229> (or maybe, IEEE
>> 754-2008), but it's unclear if that was the intention of the Iceberg spec.
>> If the intention is to use this IEEE 754 total ordering, it probably makes
>> sense to link to the specification along with the implications (regarding
>> qNaN, sNaN, sign-bit on NaN, etc). If the intention is to use the Java
>> ordering, it probably makes sense to remove the reference to -NaN and to
>> link to the relevant javadoc.
>>
>>
>> https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#equivalenceRelation
>>
>>
>> https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#compareTo(java.lang.Double)
>>
>> https://en.wikipedia.org/wiki/IEEE_754#Total-ordering_predicate
>>
>> What is the correct interpretation?
>>
>> Thanks,
>> -Devin
>>
>>
