Xuanwo, do you favor deprecating or removing `distinct_count`? Given the lack of any real implementation, I favor removal (PR 12183).

Jacob Marble
🔥🐅

On Tue, Feb 11, 2025 at 10:25 PM Xuanwo <xua...@apache.org> wrote:

> Here is my +1 binding.
>
> The current status of `distinct_count` is quite confusing, which has also
> led to additional discussions in `iceberg-rust` about whether we need to
> add it and how to maintain it.
>
> Removing it seems reasonable to me, as there are no known use cases for
> `distinct_count` in a single data file.
>
> On Tue, Feb 11, 2025, at 23:05, Fokko Driesprong wrote:
>
>> My mistake, I suggested sending out an email with a quick vote on the PR.
>> I like the suggestion to use this thread for discussion since the number of
>> options is limited.
>>
>> I'm in favor of deprecating the field, to avoid re-using the
>> field-id in the future.
>>
>> Kind regards,
>> Fokko
>>
>> On Tue, Feb 11, 2025 at 05:46, Manu Zhang <owenzhang1...@gmail.com> wrote:
>>
>>> Hi Jacob,
>>>
>>> Thanks for initiating the vote.
>>> Typically, we would first have a DISCUSSION thread to reach a consensus on
>>> the preferred option and then follow it up with a VOTE thread for
>>> confirmation.
>>>
>>> Maybe we can take this as a DISCUSSION thread?
>>>
>>> Best,
>>> Manu
>>>
>>> On Tue, Feb 11, 2025 at 7:20 AM Jacob Marble
>>> <jacobmar...@firetiger.com.invalid> wrote:
>>>
>>>> This vote will be open for at least 72 hours.
>>>>
>>>> I propose that distinct_counts be either deprecated (#12182
>>>> <https://github.com/apache/iceberg/pull/12182>) or removed (#12183
>>>> <https://github.com/apache/iceberg/pull/12183>) from the spec.
>>>>
>>>> According to #767 <https://github.com/apache/iceberg/issues/767>
>>>> data_file.distinct_counts was deprecated about four years ago. Furthermore,
>>>> it is not implemented in the canonical Java and Python implementations.
>>>>
>>>> Please share your thoughts, and vote one of the following:
>>>> - remove
>>>> - deprecate
>>>> - no-op
>>>>
>>>> Jacob Marble
>>>> 🔥🐅
>
> Xuanwo
>
> https://xuanwo.io/