Hello Iceberg Community,

I'm reaching out regarding a correctness issue I've encountered while
working with Iceberg tables with UUID type columns. In Java based spec
implementation github.com/apache/iceberg UUID literals are compared using
`java.util.UUID.compareTo` which has a known bug due to its non-compliance
with RFC 4122 and RFC 9562 for comparisons. I opened an issue for this at
https://github.com/apache/iceberg/issues/14216. Also note that
implementations based on Go, C++, rust, and python do not contain this bug
as far as I can tell and only affects Java based implementation.

Background:
The Apache Iceberg specification has supported the UUID data type for
several years now.
According to RFC 4122 and RFC 9562, UUIDs should be compared using unsigned
byte-wise comparisons. The Apache Parquet specification is in compliance
with what is stipulated in the RFCs. However, the Java implementation of
Apache Iceberg specification relies on java.util.UUID, which performs
comparisons using 2 signed long values - a well-known bug that violates the
RFC specification.
Correctness issues may be encountered when reading Iceberg tables: during
ManifestEntry filtering, ManifestEvaluator uses UUID MIN/MAX metrics from
ManifestEntry to prune partitions and data files based on filter
expressions. Due to the incorrect comparison semantics, some ManifestEntry
that should NOT be pruned are incorrectly pruned, potentially leading to
missing data in query results.

While interest for UUID data type may not be that high (possibly the reason
this bug did not surface), I think we should consider fixing this bug. I am
thinking of two potential approaches and would appreciate the community's
input:

Option 1: Fix the comparison logic directly in Apache Iceberg Java
implementation
While this provides correct UUID comparison semantics going forward and
aligns with RFC specifications and Parquet implementation, it could cause
backward-compatibility issues with existing tables and folks relying on the
bug. Existing UUID metrics would need migration or invalidation (?)

Option 2: Disable UUID filter pushdown to ManifestEvaluator.
It avoids "different query results" issues with existing tables. No
migration required for existing metrics (but what do we do with these
MIN/MAX UUID metrics? leave dormant?) Some performance degradation due to
inability to prune partitions and data files based on UUID predicates
Doesn't address the underlying bug.

Questions for the Community
What does (or should) the Iceberg specification say how UUID values are to
be compared? Note that some engines may use Apache Parquet (RFC-compliant)
to prepare MIN/MAX metrics for ManifestFile.
Has anyone else encountered this issue or have thoughts on the best path
forward?
For Option 1, what would be the recommended migration strategy for existing
tables with UUID MIN/MAX metrics?
Are there other approaches I haven't considered?
Would the community be open to a fix that includes a table format version
bump or a feature flag to handle backward compatibility?

I can try to contribute a fix once we align on the approach. Any guidance
or feedback would be greatly appreciated.

Thank you for your time and consideration.

Cheers,
Vishal Boddu

Reply via email to