Ray, in which cases can we not set it to true for v1 tables? My main goal is to reduce the runtime of snapshot creation jobs.
Is it OK if I set it to true during snapshot table creation and set it back to false when finished?

On Thu, Aug 31, 2023 at 9:44 AM Ryan Blue <b...@tabular.io> wrote:
> This isn't something that we can set to `true` because it is a
> forward-incompatible change. That's why we added a flag.
>
> However, we should make sure that this is the default behavior in v2
> tables, since it is safe for v2 (where inheritance happens automatically).
> If I remember correctly, we still rewrite by default in v2 even though it's
> safe.
>
> On Thu, Aug 31, 2023 at 9:40 AM Pucheng Yang <py...@pinterest.com.invalid>
> wrote:
>
>> Hi community,
>>
>> Table prop "compatibility.snapshot-id-inheritance.enabled" is introduced
>> to avoid manifest rewrite if possible (PR:
>> https://github.com/apache/iceberg/commit/c3dc9824b381e5e479e356be5e0f4fcf61a9fc37
>> ).
>>
>> During my recent investigation on a super long snapshot table creation on
>> a huge table, I found that the majority of time spent is on
>> manifest rewrite during appendManifest operation (code link:
>> https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java#L279)
>> due to this table prop being default as False.
>>
>> Russell brought a point of considering setting this table prop to True
>> and suggested I start a discussion on the dev list.
>>
>> Correct me if I am wrong, after looking at the code, my understanding of
>> the implications are:
>> 1. There will be manifests not having snapshot id in some cases. For
>> example, during snapshot table creation, we append manifest files without
>> snapshot id to a table.
>> 2. The manifest file name will be the name specified during the "first
>> write" (the "second write" is manifest copy during appendManifest
>> operation). An example will be "stage-%d-task-%d-manifest-%s" which is the
>> name used during snapshot creation, but since the last param is UUID, it
>> should be fine.
>>
>> Would like to hear from you, thanks!
>>
>
>
> --
> Ryan Blue
> Tabular
>
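
To make the "set it to true during creation, then back to false" idea concrete, here is a minimal sketch of the toggle pattern. It is not Iceberg code: a plain dict stands in for the table's property map, and `temporary_property` and `run_snapshot_job` are hypothetical names used only for illustration. Only the property key is taken from the thread.

```python
from contextlib import contextmanager

# Property key quoted from the thread.
INHERITANCE_PROP = "compatibility.snapshot-id-inheritance.enabled"


@contextmanager
def temporary_property(props, key, value):
    """Set props[key] = value for the duration of the block, then restore it.

    `props` is a plain dict standing in for a table's property map
    (an assumption for illustration, not an Iceberg API).
    """
    missing = object()  # sentinel: distinguishes "unset" from any real value
    previous = props.get(key, missing)
    props[key] = value
    try:
        yield props
    finally:
        # Restore the earlier state even if the job raised an exception.
        if previous is missing:
            props.pop(key, None)
        else:
            props[key] = previous


def run_snapshot_job(props):
    """Hypothetical job: reports whether it would skip the manifest rewrite."""
    return props.get(INHERITANCE_PROP, "false") == "true"


table_props = {INHERITANCE_PROP: "false"}
with temporary_property(table_props, INHERITANCE_PROP, "true"):
    skipped_rewrite = run_snapshot_job(table_props)
```

Restoring the property only changes how later commits behave. Manifests appended while it was true would presumably still lack snapshot ids (implication 1 in the original message), so this pattern by itself does not settle whether a v1 table stays readable by older readers.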