For background: the result of the day function and a date value are
basically the same thing, the number of days from the Unix epoch. When we
started using metadata tables, we realized that a lot of people use the day
function and then get a confusing ordinal value out. If we just change the
type to `date`, engines can display the value correctly. This isn't
required by the spec; it's just a convenience.
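
To make the equivalence concrete, here is a minimal sketch in plain Python. It is not the PyIceberg or Java implementation; the helper names `day_transform` and `as_date` are made up for illustration, and it assumes only that both the day ordinal and a date are stored as days since 1970-01-01.

```python
from datetime import date, timedelta

# Unix epoch as a calendar date.
EPOCH = date(1970, 1, 1)


def day_transform(d: date) -> int:
    """Hypothetical helper: days since the Unix epoch for a given date."""
    return (d - EPOCH).days


def as_date(ordinal: int) -> date:
    """Hypothetical helper: read the same integer back as a date."""
    return EPOCH + timedelta(days=ordinal)


ordinal = day_transform(date(2024, 9, 27))
print(ordinal)           # the raw int an engine would show for an `int` result type
print(as_date(ordinal))  # the same stored value, displayed as a date
```

The stored integer is identical either way; only the declared type changes how an engine renders it.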

On Fri, Sep 27, 2024 at 8:30 AM Russell Spitzer <russell.spit...@gmail.com>
wrote:

> Good thing DateType is an Integer :)
> https://github.com/apache/iceberg/blob/113c6e7d62e53d3e3cb15b1712f3a1db473ca940/api/src/main/java/org/apache/iceberg/types/Type.java#L37
>
> On Thu, Sep 26, 2024 at 8:38 PM Kevin Liu <kevin.jq....@gmail.com> wrote:
>
>> Hey folks,
>>
>> While reviewing a PR to fix DayTransform in PyIceberg (#1208
>> <https://github.com/apache/iceberg-python/pull/1208>), we found an
>> inconsistency between the spec and the Java Iceberg library.
>>
>> According to the spec
>> <https://iceberg.apache.org/spec/#partition-transforms>, the result type
>> for the "day partition transform" should be `int`, similar to other
>> time-based partition transforms (year/month/hour). However, in the Java
>> Iceberg library, the result type for day partition transform is `DateType` (
>> source
>> <https://github.com/apache/iceberg/blob/dddb5f423b353d961b8a08eb2cb4371d453c2959/api/src/main/java/org/apache/iceberg/transforms/Days.java#L47>).
>> This seems to be a discrepancy from the spec, as the day partition
>> transform is the only time-based transform with a non-int result
>> type—whereas the others use IntegerType (source
>> <https://grep.app/search?q=getResultType&filter[repo][0]=apache/iceberg&filter[path][0]=api/src/main/java/org/apache/iceberg/>
>> ).
>>
>> Could someone confirm if my understanding is correct? If so, is there any
>> historical context for this difference? Lastly, how should we approach
>> resolving this moving forward?
>>
>> Best,
>> Kevin
>>
>>
