Created https://github.com/apache/iceberg/pull/8981 to finalize the thread.

On Fri, Oct 27, 2023 at 8:55 PM Renjie Liu <liurenjie2...@gmail.com> wrote:

> You are right. Null always needs special treatment. I think allowing
> null values in equality ids is reasonable, but should we treat them as
> distinct? PG treats nulls as distinct by default, but allows configuration
> to treat them as not distinct:
>
>
> https://stackoverflow.com/questions/8289100/create-unique-constraint-with-null-columns
>
>
> On Sat, Oct 28, 2023 at 04:00 Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
>>> Iceberg spec has a clear definition of constraints about identifier id
>>> fields <https://iceberg.apache.org/spec/#identifier-field-ids> . I
>>> think it would make sense if equality id fields share similar constraints.
>>
>>
>> Makes sense, however it appears that for equality delete null values are
>> intentionally allowed, whereas they aren't for identifier IDs, so either
>> way there is an exception.
>>
>> On Thu, Oct 26, 2023 at 2:19 AM Renjie Liu <liurenjie2...@gmail.com>
>> wrote:
>>
>>> Hi, Micah:
>>>
>>> Iceberg spec has a clear definition of constraints about identifier id
>>> fields <https://iceberg.apache.org/spec/#identifier-field-ids> . I
>>> think it would make sense if equality id fields share similar constraints.
>>>
>>> On Thu, Oct 26, 2023 at 4:24 AM Micah Kornfield <emkornfi...@gmail.com>
>>> wrote:
>>>
>>>> Sorry I think I missed a question:
>>>>
>>>> Similarly, I think we could handle fields with primitive or struct types
>>>>
>>>>
>>>> Struct types add another dimension of complexity. I don't think it is
>>>> necessarily harmful to support them, but they also don't seem to add
>>>> a lot of value compared to enumerating the leaf columns. Since the
>>>> change is potentially backwards incompatible, we might not be able to get
>>>> away with disallowing them?
>>>>
>>>> Thanks,
>>>> Micah
>>>>
>>>>
>>>> On Wed, Oct 25, 2023 at 1:22 PM Micah Kornfield <emkornfi...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think nesting in struct makes sense to support as this is consistent
>>>>> with partitioning input columns.
>>>>>
>>>>> I can propose a PR if there aren't any more opinions here.
>>>>>
>>>>> On Fri, Oct 20, 2023 at 3:49 PM Ryan Blue <b...@tabular.io> wrote:
>>>>>
>>>>>> You're right. It calls out that `float` and `double` columns can't be
>>>>>> used, but there's a question around what is "equal" for maps, at the 
>>>>>> least.
>>>>>>
>>>>>> I think the reasonable thing to do is to allow top-level fields and
>>>>>> fields that are nested within only structs. Any field nested within a
>>>>>> map or list should not be allowed. Similarly, I think we could handle
>>>>>> fields with primitive or struct types, but fields that contain lists or
>>>>>> maps should not be allowed.
>>>>>>
>>>>>> Does that sound reasonable to you? We could be more conservative and
>>>>>> disallow deletion by struct fields as well.
>>>>>>
>>>>>> Ryan
>>>>>>
>>>>>> On Fri, Oct 20, 2023 at 1:40 PM Micah Kornfield <
>>>>>> emkornfi...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Iceberg Dev,
>>>>>>> Are equality delete files intended to support nested columns of
>>>>>>> nested types (lists, structs and maps) or "children" of nested types?
>>>>>>> I couldn't find anything prohibiting it in the specification [1]
>>>>>>> (apologies if I missed it) but it seems like this adds a fair amount
>>>>>>> of complexity and ambiguity if they are supported.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Micah
>>>>>>>
>>>>>>> [1] https://iceberg.apache.org/spec/
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ryan Blue
>>>>>> Tabular
>>>>>>
>>>>>
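
For reference, the rule Ryan proposes in the quoted thread (allow top-level
fields and fields nested only within structs; reject anything nested in a map
or list, and anything whose type contains a list or map) can be sketched
roughly as below. This is only an illustration under stated assumptions: the
`Primitive`, `Struct`, `ListType` and `MapType` classes and the
`is_allowed_equality_field` function are hypothetical names invented for the
sketch, not the Iceberg API, and the `float`/`double` and null questions from
the thread are deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, List


# Hypothetical, simplified schema model (not the Iceberg API).
@dataclass
class Primitive:
    name: str  # e.g. "int", "string"


@dataclass
class Struct:
    fields: Dict[str, object]  # field name -> type


@dataclass
class ListType:
    element: object


@dataclass
class MapType:
    key: object
    value: object


def contains_list_or_map(t: object) -> bool:
    """True if the type is a list/map or a struct that contains one."""
    if isinstance(t, (ListType, MapType)):
        return True
    if isinstance(t, Struct):
        return any(contains_list_or_map(c) for c in t.fields.values())
    return False


def is_allowed_equality_field(schema: Struct, path: List[str]) -> bool:
    """Check a field path against the rule proposed in the thread."""
    if not path:
        return False
    current: object = schema
    for name in path:
        # Every ancestor must be a struct; we never descend into a list or map.
        if not isinstance(current, Struct) or name not in current.fields:
            return False
        current = current.fields[name]
    # The field itself must be a primitive or a struct free of lists and maps.
    return not contains_list_or_map(current)


schema = Struct({
    "id": Primitive("long"),
    "point": Struct({"x": Primitive("int"), "y": Primitive("int")}),
    "location": Struct({
        "lat": Primitive("int"),
        "tags": ListType(Primitive("string")),
    }),
    "attrs": MapType(Primitive("string"), Primitive("string")),
})
```

Under this sketch, `["id"]`, `["point"]` and `["location", "lat"]` would be
accepted, while `["location"]`, `["location", "tags"]` and `["attrs"]` would be
rejected. The more conservative option also mentioned in the thread
(disallowing struct-typed fields altogether) would amount to additionally
requiring that the final type be a `Primitive`.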
