PR-2303 defines how the batch job does the compaction work; issue 2308 decides what the behavior should be when a compaction txn and a row-delta txn commit at the same time. They shouldn't block each other, but we will need to resolve both of them.

On Tue, May 18, 2021 at 9:36 AM Huadong Liu <huadong...@gmail.com> wrote:
> Thanks. Compaction is https://github.com/apache/iceberg/pull/2303 and it
> is currently blocked by https://github.com/apache/iceberg/issues/2308?
>
> On Mon, May 17, 2021 at 6:17 PM OpenInx <open...@gmail.com> wrote:
>
>> Hi Huadong
>>
>> From the perspective of iceberg developers, we don't expose the format v2
>> to end users because we think there is still other work that needs to be
>> done. As you can see there are still some unfinished issues from your link.
>> As for whether v2 will cause data loss, from my perspective as a
>> designer, semantics and correctness should be handled very rigorously if we
>> don't do any compaction. Once we introduce the compaction action, we will
>> encounter this issue: https://github.com/apache/iceberg/issues/2308,
>> we've proposed a solution but still not reached an agreement in the
>> community. I will suggest using v2 in production after we resolve this
>> issue at least.
>>
>> On Sat, May 15, 2021 at 8:01 AM Huadong Liu <huadong...@gmail.com> wrote:
>>
>>> Hi iceberg-dev,
>>>
>>> I tried v2 row-level deletion by committing equality delete files after
>>> *upgradeToFormatVersion(2)*. It worked well. I know that Spark actions
>>> to compact delete files and data files
>>> <https://github.com/apache/iceberg/milestone/4> etc. are in progress. I
>>> currently use the JAVA API to update, query and do maintenance ops. I am
>>> not using Flink at the moment and I will definitely pick up Spark actions
>>> when they are completed. Deletions can be scheduled in batches (e.g.
>>> weekly) to control the volume of delete files. I want to get a sense of the
>>> risk level of losing data at some point because of v2 Spec/API changes if I
>>> start to use v2 format now. It is not an easy question. Any input is
>>> appreciated.
>>>
>>> --
>>> Huadong
>>>
>>