I feel like this shouldn't be a big problem going forward since all new tables
will be using the V2 format where the snapshot ID inheritance is enabled. There
is currently a bug in our rewrite manifests action that checks the flag but
doesn't check the format version. I have a fix for that local
Thanks Ryan, then I think I have a path forward. I will also file a feature
request for a thread-safe appendManifest on GitHub. Thanks again.
On Thu, Aug 31, 2023 at 10:04 AM Ryan Blue wrote:
> I think making the operation thread-safe and parallelizing is a good idea.
> It should be pretty easy.
>
>
I think making the operation thread-safe and parallelizing is a good idea.
It should be pretty easy.
And yes, versions of Iceberg older than the one where that config property
was added would be the ones where it is unsafe. It's probably safe for most
people, but we still can't change the default.
Thanks Ryan, what might you consider an "older" version of Iceberg? Is it
fair to say any version before
https://github.com/apache/iceberg/commit/c3dc9824b381e5e479e356be5e0f4fcf61a9fc37
? If that is the case, my organization controls the Iceberg reader, so it
might be less of a concern for me.
Another o
There are a couple of problems with default values. First, they are part of v3
and haven’t been implemented yet. But the second, larger issue is that null
is a value. A default doesn’t replace a null that was written in the data.
I don’t think default values would help out here.
What I meant by derive
It isn't safe for this to be set for any table that may be read by an older
version of Iceberg.
On Thu, Aug 31, 2023 at 9:49 AM Pucheng Yang wrote:
> Ryan, in what cases can we not set it to true for v1 tables?
>
> My major goal is to reduce the runtime of snapshot creation jobs.
>
> Is it OK if
Ryan, in what cases can we not set it to true for v1 tables?
My major goal is to reduce the runtime of snapshot creation jobs.
Is it OK if I set it to true during snapshot table creation and set it back to
false when finished?
On Thu, Aug 31, 2023 at 9:44 AM Ryan Blue wrote:
> This isn't something that
This isn't something that we can set to `true` because it is a
forward-incompatible change. That's why we added a flag.
However, we should make sure that this is the default behavior in v2
tables, since it is safe for v2 (where inheritance happens automatically).
If I remember correctly, we still
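
A minimal sketch of the rule described in this message, for illustration only: inheritance is treated as always on for v2 tables and as opt-in through the flag for v1 tables. The property name is the one quoted in this thread; the function name and the plain-dict representation of table properties are hypothetical, and this is not Iceberg's actual implementation.

```python
from typing import Mapping

# Property name as quoted in the thread.
SNAPSHOT_ID_INHERITANCE = "compatibility.snapshot-id-inheritance.enabled"


def can_inherit_snapshot_id(format_version: int, properties: Mapping[str, str]) -> bool:
    """Hypothetical helper: decide whether snapshot ID inheritance applies.

    Checks the format version first, then falls back to the flag.
    """
    if format_version >= 2:
        # v2 tables: inheritance happens automatically, regardless of the flag.
        return True
    # v1 tables: only when the flag was explicitly enabled (assumed default: false).
    return properties.get(SNAPSHOT_ID_INHERITANCE, "false").strip().lower() == "true"
```

Checking the format version before the flag is the point of the sketch: a check that looks only at the flag would treat a v2 table without the property as if inheritance were disabled.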
Hi community,
The table property "compatibility.snapshot-id-inheritance.enabled" was introduced to
avoid manifest rewrites where possible (PR:
https://github.com/apache/iceberg/commit/c3dc9824b381e5e479e356be5e0f4fcf61a9fc37
).
During my recent investigation on a super long snapshot table creation on a
huge t
We generally don't recommend fanout writers because they create lots of
small data files. It also isn't clear why the table's partitioning isn't
causing Spark to distribute the data properly -- maybe you're using an old
Spark version?
In any case, you can distribute the data yourself to align with