You are right. Null always needs special treatment. I think allowing null
values in equality ids is reasonable, but should we treat nulls as distinct?
PostgreSQL treats them as distinct by default, but can be configured to
treat them as not distinct:

https://stackoverflow.com/questions/8289100/create-unique-constraint-with-null-columns
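
To make the two options concrete, here is a rough Python sketch of how an
equality delete key would match a row under each interpretation. This is
not Iceberg code: `matches_delete` and the `nulls_distinct` flag are names
made up for illustration, and rows are modeled as plain tuples.

```python
from typing import Optional, Sequence

# A row (or delete key) projected onto the equality id columns.
# None stands in for a null value.
Key = Sequence[Optional[object]]


def matches_delete(row: Key, delete_key: Key, nulls_distinct: bool) -> bool:
    """Return True if `row` would be removed by `delete_key`.

    nulls_distinct=True  : null never equals anything, including null.
    nulls_distinct=False : null equals null, and nothing else.
    """
    for row_value, delete_value in zip(row, delete_key):
        if row_value is None or delete_value is None:
            # At least one side is null.
            if nulls_distinct:
                return False
            if row_value is not None or delete_value is not None:
                # Only one side is null, so the values differ.
                return False
        elif row_value != delete_value:
            return False
    return True


# Same row and delete key, different answers depending on the choice.
print(matches_delete((1, None), (1, None), nulls_distinct=True))
print(matches_delete((1, None), (1, None), nulls_distinct=False))
```

With nulls treated as distinct, a delete key containing null can never
match any row, which is the main practical difference between the two.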


On Sat, Oct 28, 2023 at 04:00 Micah Kornfield <emkornfi...@gmail.com> wrote:

>> Iceberg spec has a clear definition of constraints about identifier id
>> fields <https://iceberg.apache.org/spec/#identifier-field-ids> . I think
>> it would make sense if equality id fields share similar constraints.
>
>
> Makes sense. However, it appears that null values are intentionally
> allowed for equality deletes, whereas they aren't for identifier IDs, so
> either way there is an exception.
>
> On Thu, Oct 26, 2023 at 2:19 AM Renjie Liu <liurenjie2...@gmail.com>
> wrote:
>
>> Hi, Micah:
>>
>> Iceberg spec has a clear definition of constraints about identifier id
>> fields <https://iceberg.apache.org/spec/#identifier-field-ids> . I think
>> it would make sense if equality id fields share similar constraints.
>>
>> On Thu, Oct 26, 2023 at 4:24 AM Micah Kornfield <emkornfi...@gmail.com>
>> wrote:
>>
>>> Sorry I think I missed a question:
>>>
>>> Similarly, I think we could handle fields with primitive or struct types
>>>
>>>
>>> Struct types add another dimension of complexity. I don't think it is
>>> necessarily harmful to support them, but they also don't seem to add a
>>> lot of value compared to enumerating the leaf columns. Since the change
>>> is potentially backwards incompatible, we might not be able to get away
>>> with disallowing them?
>>>
>>> Thanks,
>>> Micah
>>>
>>>
>>> On Wed, Oct 25, 2023 at 1:22 PM Micah Kornfield <emkornfi...@gmail.com>
>>> wrote:
>>>
>>>> I think nesting in struct makes sense to support as this is consistent
>>>> with partitioning input columns.
>>>>
>>>> I can propose a PR if there aren't any more opinions here.
>>>>
>>>> On Fri, Oct 20, 2023 at 3:49 PM Ryan Blue <b...@tabular.io> wrote:
>>>>
>>>>> You're right. It calls out that `float` and `double` columns can't be
>>>>> used, but there's a question around what is "equal" for maps, at the
>>>>> least.
>>>>>
>>>>> I think the reasonable thing to do is to allow top-level fields and
>>>>> fields that are nested only within structs. Any field nested within a
>>>>> map or list should not be allowed. Similarly, I think we could handle
>>>>> fields with primitive or struct types, but fields that contain lists or
>>>>> maps should not be allowed.
>>>>>
>>>>> Does that sound reasonable to you? We could be more conservative and
>>>>> disallow deletion by struct fields as well.
>>>>>
>>>>> Ryan
>>>>>
>>>>> On Fri, Oct 20, 2023 at 1:40 PM Micah Kornfield <emkornfi...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Iceberg Dev,
>>>>>> Are equality delete files intended to support columns of nested
>>>>>> types (lists, structs and maps) or "children" of nested types? I
>>>>>> couldn't find anything prohibiting it in the specification [1]
>>>>>> (apologies if I missed it), but it seems like this adds a fair amount
>>>>>> of complexity and ambiguity if they are supported.
>>>>>>
>>>>>> Thanks,
>>>>>> Micah
>>>>>>
>>>>>> [1] https://iceberg.apache.org/spec/
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Tabular
>>>>>
>>>>
