Xuanwo, do you favor deprecating or removing `distinct_count`?

Due to lack of any real implementation, I myself favor removal (PR 12183).

Jacob Marble
🔥🐅


On Tue, Feb 11, 2025 at 10:25 PM Xuanwo <xua...@apache.org> wrote:

> Here is my +1 binding.
>
> The current status of `distinct_count` is quite confusing, which has also
> led to additional discussions in `iceberg-rust` about whether we need to
> add it and how to maintain it.
>
> Removing it seems reasonable to me, as there are no known use cases for
> `distinct_count` in a single data file.
>
> On Tue, Feb 11, 2025, at 23:05, Fokko Driesprong wrote:
>
> My mistake, I suggested sending out an email with a quick vote on the PR.
> I like the suggestion to use this thread for discussion since the number of
> options is limited.
>
> I'm in favor of deprecating the field, so that we avoid reusing the
> field-id in the future.
>
> Kind regards,
> Fokko
>
> On Tue, Feb 11, 2025 at 05:46, Manu Zhang <owenzhang1...@gmail.com> wrote:
>
> Hi Jacob,
>
> Thanks for initiating the vote.
> Typically, we would first have a DISCUSSION thread to reach a consensus on
> the preferred option and then follow it up with a VOTE thread for
> confirmation.
>
> Maybe we can take this as a DISCUSSION thread?
>
> Best,
> Manu
>
>
> On Tue, Feb 11, 2025 at 7:20 AM Jacob Marble
> <jacobmar...@firetiger.com.invalid> wrote:
>
> This vote will be open for at least 72 hours.
>
> I propose that distinct_counts be either deprecated (#12182
> <https://github.com/apache/iceberg/pull/12182>) or removed (#12183
> <https://github.com/apache/iceberg/pull/12183>) from the spec.
>
> According to #767 <https://github.com/apache/iceberg/issues/767>
> data_file.distinct_counts was deprecated about four years ago. Furthermore,
> it is not implemented in the canonical Java and Python implementations.
>
> Please share your thoughts, and vote for one of the following:
> - remove
> - deprecate
> - no-op
>
> Jacob Marble
> 🔥🐅
>
> Xuanwo
>
> https://xuanwo.io/
>
>
