Created https://github.com/apache/iceberg/pull/8981 to finalize the thread.
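
For anyone skimming, here is a rough Python sketch of the rule Ryan proposes in the thread below: a field qualifies as an equality field only if it is top-level or nested solely within structs, it does not itself contain a list or map, and it is not a `float` or `double`. This is not code from the PR or from any Iceberg library. The `Field`, `Struct`, `ListType` and `MapType` classes and the `equality_field_allowed` function are hypothetical names made up for illustration. Whether a struct that contains a `float` should be rejected is not settled in the thread, so the sketch leaves that case allowed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class Struct:
    fields: List["Field"]


@dataclass
class ListType:
    element: "Field"


@dataclass
class MapType:
    key: "Field"
    value: "Field"


# A primitive type is represented by its name, e.g. "long" or "string".
FieldType = Union[str, Struct, ListType, MapType]


@dataclass
class Field:
    id: int
    name: str
    type: FieldType


def children(t: FieldType) -> List[Field]:
    """Direct child fields of a nested type (empty for primitives)."""
    if isinstance(t, Struct):
        return t.fields
    if isinstance(t, ListType):
        return [t.element]
    if isinstance(t, MapType):
        return [t.key, t.value]
    return []


def find_path(fields: List[Field], field_id: int,
              ancestors: Tuple[FieldType, ...] = ()) -> Optional[Tuple[Field, Tuple[FieldType, ...]]]:
    """Return the field with the given id plus the types it is nested in."""
    for f in fields:
        if f.id == field_id:
            return f, ancestors
        found = find_path(children(f.type), field_id, ancestors + (f.type,))
        if found is not None:
            return found
    return None


def contains_list_or_map(t: FieldType) -> bool:
    """True if the type is, or transitively contains, a list or map."""
    if isinstance(t, (ListType, MapType)):
        return True
    return any(contains_list_or_map(c.type) for c in children(t))


def equality_field_allowed(schema: Struct, field_id: int) -> bool:
    found = find_path(schema.fields, field_id)
    if found is None:
        return False  # unknown field id
    f, ancestors = found
    # Any field nested within a map or list is disallowed.
    if any(isinstance(a, (ListType, MapType)) for a in ancestors):
        return False
    # Fields that contain lists or maps are disallowed.
    if contains_list_or_map(f.type):
        return False
    # float and double columns cannot be used (per the thread).
    if isinstance(f.type, str) and f.type in ("float", "double"):
        return False
    return True


# Hypothetical example schema.
schema = Struct([
    Field(1, "id", "long"),
    Field(2, "score", "double"),
    Field(3, "location", Struct([
        Field(4, "city", "string"),
        Field(5, "tags", ListType(Field(6, "element", "string"))),
    ])),
    Field(7, "props", MapType(Field(8, "key", "string"),
                              Field(9, "value", "string"))),
    Field(10, "point", Struct([Field(11, "x", "int"), Field(12, "y", "int")])),
])
```

With this schema, `id` (1), `location.city` (4), `point` (10) and `point.x` (11) are accepted, while `score` (2), `location` (3, it contains a list), `tags` (5), its element (6), `props` (7) and the map key (8) are rejected.
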
On Fri, Oct 27, 2023 at 8:55 PM Renjie Liu <liurenjie2...@gmail.com> wrote:

> You are right. Null always needs special treatment. I think allowing
> null values in equality ids is reasonable, but should we treat null as
> distinct? PG treats it as distinct by default, but allows configuration to
> treat it as not distinct:
>
> https://stackoverflow.com/questions/8289100/create-unique-constraint-with-null-columns
>
> On Sat, Oct 28, 2023 at 04:00 Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
>>> Iceberg spec has a clear definition of constraints about identifier id
>>> fields <https://iceberg.apache.org/spec/#identifier-field-ids> . I
>>> think it would make sense if equality id fields share similar constraints.
>>
>> Makes sense, however it appears that for equality deletes null values are
>> intentionally allowed, whereas they aren't for identifier IDs, so either
>> way there is an exception.
>>
>> On Thu, Oct 26, 2023 at 2:19 AM Renjie Liu <liurenjie2...@gmail.com>
>> wrote:
>>
>>> Hi, Micah:
>>>
>>> Iceberg spec has a clear definition of constraints about identifier id
>>> fields <https://iceberg.apache.org/spec/#identifier-field-ids> . I
>>> think it would make sense if equality id fields share similar constraints.
>>>
>>> On Thu, Oct 26, 2023 at 4:24 AM Micah Kornfield <emkornfi...@gmail.com>
>>> wrote:
>>>
>>>> Sorry I think I missed a question:
>>>>
>>>> Similarly, I think we could handle fields with primitive or struct types
>>>>
>>>> Struct types add another dimension of complexity. I don't think it is
>>>> necessarily harmful to support them, but it also doesn't seem like they add
>>>> a lot of value when compared to enumerating the leaf columns. Since the
>>>> change is potentially backwards incompatible, we might not be able to get
>>>> away with disallowing them?
>>>>
>>>> Thanks,
>>>> Micah
>>>>
>>>> On Wed, Oct 25, 2023 at 1:22 PM Micah Kornfield <emkornfi...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think nesting in struct makes sense to support as this is consistent
>>>>> with partitioning input columns.
>>>>>
>>>>> I can propose a PR if there aren't any more opinions here.
>>>>>
>>>>> On Fri, Oct 20, 2023 at 3:49 PM Ryan Blue <b...@tabular.io> wrote:
>>>>>
>>>>>> You're right. It calls out that `float` and `double` columns can't be
>>>>>> used, but there's a question around what is "equal" for maps, at the
>>>>>> least.
>>>>>>
>>>>>> I think the reasonable thing to do is to allow top-level fields and
>>>>>> fields that are nested within only structs. Any field nested within a
>>>>>> map or list should not be allowed. Similarly, I think we could handle
>>>>>> fields with primitive or struct types but fields that contain lists or
>>>>>> maps should not be allowed.
>>>>>>
>>>>>> Does that sound reasonable to you? We could be more conservative and
>>>>>> disallow deletion by struct fields as well.
>>>>>>
>>>>>> Ryan
>>>>>>
>>>>>> On Fri, Oct 20, 2023 at 1:40 PM Micah Kornfield <
>>>>>> emkornfi...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Iceberg Dev,
>>>>>>> Are equality delete files intended to support nested columns of
>>>>>>> nested types (lists, structs and maps) or "children" of nested types?
>>>>>>> I couldn't find anything prohibiting it in the specification [1]
>>>>>>> (apologies if I missed it) but it seems like this adds a fair amount
>>>>>>> of complexity and ambiguity if they are supported.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Micah
>>>>>>>
>>>>>>> [1] https://iceberg.apache.org/spec/
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ryan Blue
>>>>>> Tabular