Thanks for the summary @Micah, also sorry I couldn't be in the meeting
yesterday. I do hope we can get the wider physical type size for both
Duration and Timestamp Nanos in the near future as well.
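
For context on why the wider type matters, here is a rough sketch of the
range arithmetic behind the summary quoted below. The helper names and the
365.2425-day average year are my own assumptions for illustration; nothing
here comes from the Parquet proposal itself.

```python
# Rough range check for a signed nanosecond count, assuming a day is always
# 24 hours (no calendar logic). Helper names are hypothetical.
NANOS_PER_DAY = 24 * 60 * 60 * 10**9
DAYS_PER_YEAR = 365.2425  # average Gregorian year, assumed for illustration


def max_years(bits: int) -> float:
    """Largest magnitude, in years, of a signed `bits`-wide nanosecond count."""
    max_value = 2 ** (bits - 1) - 1
    return max_value / NANOS_PER_DAY / DAYS_PER_YEAR


def bits_needed(years: int) -> int:
    """Signed bit width that can hold `years` of nanoseconds (366-day upper bound)."""
    nanos = years * 366 * NANOS_PER_DAY
    return nanos.bit_length() + 1  # one extra bit for the sign


if __name__ == "__main__":
    print(f"int64 nanos cover about +/- {max_years(64):.0f} years")
    print(f"+/- 10000 years needs {bits_needed(10000)} bits")
```

An int64 nanosecond count covers roughly +/- 292 years, while +/- 10000
years needs about 70 bits, so either FLBA[10] (80 bits) or int128 would be
wide enough.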

On Thu, Jul 10, 2025 at 11:49 AM Micah Kornfield <emkornfi...@gmail.com>
wrote:

> As an update/TL;DR: the current proposal in Parquet is as follows (there
> are still some active discussions):
>
> Two new logical types.
> 1.  YearMonth interval annotates an int32 (interval is stored as the
> number of months)
> 2.  DurationNanos annotates an int64 (interval is stored as the number of
> nanoseconds)
>
> Two key decisions here are:
> 1.  Not blocking progress on defining a larger-width physical type (either
> FLBA[10 or 16] or int128).  This topic is being discussed separately in
> Parquet. This means that the storable value will not fulfill ANSI SQL's +/-
> 10000 year requirement but will still provide a reasonable range.
> 2.  The naming of DurationNanos better reflects that the semantics, at
> least as proposed, do not involve any "calendar logic" (a day is always 24
> hours).  While the ANSI SQL standard uses these definitions, other engines
> record the number of days separately (and days can be more than or less
> than 24 hours).
>
> Iceberg is not bound by the naming convention chosen in Parquet, but if
> there are strong opinions on either of these decisions, please chime in on
> the Parquet discussion (
> https://lists.apache.org/thread/n8jdft4mltdcf91v7t8qf1hz5cg8nbnz).
> Hopefully that will help avoid rehashing the conversations on this mailing
> list.
>
> Thanks,
> Micah
>
>
>
> On Thu, Jul 3, 2025 at 6:33 PM yun zou <yunzou.colost...@gmail.com> wrote:
>
>> Hi Laurent,
>>
>> Thank you for raising the Parquet and Arrow compatibility topic. The
>> discussion is currently ongoing in the Parquet community.
>> You can follow the thread here:
>> https://lists.apache.org/thread/n8jdft4mltdcf91v7t8qf1hz5cg8nbnz
>>
>> Best Regards,
>> Yun
>>
>> On Thu, Jul 3, 2025 at 8:42 AM Laurent Goujon <laur...@dremio.com.invalid>
>> wrote:
>>
>>> Like Russell, I think the addition of new types that are widely used in
>>> analytics seems like a good thing.
>>>
>>> The document still has various open comments regarding the
>>> representation, so I wonder whether things have been settled. I'm
>>> also curious whether this proposal will be accompanied by proposals in the
>>> Parquet and Arrow projects to align the types and representations, similar
>>> to what happened with the variant type?
>>>
>>> Laurent
>>>
>>>
>>> On Wed, Jun 18, 2025 at 3:46 PM yun zou <yunzou.colost...@gmail.com>
>>> wrote:
>>>
>>>> Dear Community,
>>>>
>>>> I would like to bump this thread for the discussion of adding Interval
>>>> Type support.
>>>>
>>>> How does everyone feel about moving forward with the support of
>>>> Year-Month and Day-Time Intervals, especially the part about having
>>>> 16-byte signed values to represent nanoseconds?
>>>>
>>>> The change will first be made in the Parquet community; here is the
>>>> PR:
>>>> https://github.com/apache/parquet-format/pull/496/files
>>>>
>>>> Please feel free to provide any feedback or suggestions!
>>>>
>>>> Best Regards,
>>>> Yun
>>>>
>>>>
>>>>
>>>> On Mon, Apr 21, 2025 at 10:29 AM Russell Spitzer <
>>>> russell.spit...@gmail.com> wrote:
>>>>
>>>>> I think this is a pretty good idea for us to adopt in terms of
>>>>> compatibility with other systems
>>>>> and I really appreciate that Naren made sure to use a broad enough
>>>>> definition to support all
>>>>> available engines. I'm really interested to know how other folks feel
>>>>> about this proposal and
>>>>> I hope we can reach some common ground here.
>>>>>
>>>>> On Mon, Apr 21, 2025 at 12:24 PM Naren Krishna
>>>>> <naren.kris...@snowflake.com.invalid> wrote:
>>>>>
>>>>>> Dear Community,
>>>>>>
>>>>>> I want to propose the addition of the Interval types to the Iceberg
>>>>>> spec. A value of an Interval type represents a duration of time, and can
>>>>>> be calculated by the difference between two dates or times. Intervals are
>>>>>> supported across a variety of different engines (e.g. Parquet, Spark,
>>>>>> Arrow, Oracle, Snowflake) and are widely used in time-series analysis for
>>>>>> calculations and comparisons of time spans and date arithmetic.
>>>>>>
>>>>>> For more information, see this high-level proposal
>>>>>> <https://docs.google.com/document/d/12ghQxWxyAhSQeZyy0IWiwJ02gTqFOgfYm8x851HZFLk/edit?usp=sharing>
>>>>>> providing a recommendation to build Interval types in Iceberg following
>>>>>> the ANSI SQL standard specification. Per ANSI SQL standard, this proposal
>>>>>> recommends the creation of two types of Intervals: Year-Month and
>>>>>> Day-Time Intervals. The linked document also details the implementations
>>>>>> of Interval types in various engines and is intended to spur discussion
>>>>>> in the Iceberg community.
>>>>>>
>>>>>> Thanks,
>>>>>> Naren Krishna
>>>>>>
>>>>>
