Re: Iceberg transaction experience

2021-02-10 Thread Ryan Blue
I think that the only way to do what you want is to add extra validation hooks, or to turn off retries like you're doing. If you want to work on adding validations, I'm happy to work on it with you. On Wed, Feb 10, 2021 at 12:03 PM Omar Aloraini wrote: > Hi Ryan, thanks for the reply > > I ha

Re: Iceberg transaction experience

2021-02-10 Thread Omar Aloraini
Hi Ryan, thanks for the reply. I have seen a discussion on a pull request where you (and others) suggested using the snapshot summary for the tracking data. This was several weeks ago, and I think I tried it, but it didn't work out. If two instances of the jobs were running at the same time (by a

Re: Iceberg transaction experience

2021-02-10 Thread Ryan Blue
Thanks, Omar. For use cases like this, I recommend adding the tracking data into the snapshot summary rather than table metadata. That avoids needing to use a transaction and is what we do for similar cases, like streaming sinks that need to ensure checkpoints aren't committed twice. It sounds li
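To make the snapshot-summary idea concrete, here is a minimal sketch of the pattern: each commit writes a tracking ID into the summary of the snapshot it creates, and a writer skips the commit if that ID already appears in the table's history. This is a toy in-memory model, not the Iceberg API. `ToyTable`, `Snapshot`, and the summary key `example.checkpoint-id` are hypothetical names chosen for illustration, and the sketch is single-process, so it does not model concurrent writers or commit retries.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical summary key for this sketch; not a property defined by Iceberg.
CHECKPOINT_KEY = "example.checkpoint-id"


@dataclass
class Snapshot:
    """One committed table state plus its free-form summary map."""
    snapshot_id: int
    files: Tuple[str, ...]
    summary: Dict[str, str]


class ToyTable:
    """In-memory stand-in for a table's snapshot history (assumption, not Iceberg)."""

    def __init__(self) -> None:
        self.snapshots: List[Snapshot] = []

    def has_checkpoint(self, checkpoint_id: str) -> bool:
        # Scan the history for a snapshot whose summary carries this ID.
        return any(
            snap.summary.get(CHECKPOINT_KEY) == checkpoint_id
            for snap in self.snapshots
        )

    def last_checkpoint(self) -> Optional[str]:
        # Walk from newest to oldest and return the first tracking ID found.
        for snap in reversed(self.snapshots):
            if CHECKPOINT_KEY in snap.summary:
                return snap.summary[CHECKPOINT_KEY]
        return None

    def append(self, files: List[str], checkpoint_id: str) -> bool:
        """Commit files tagged with checkpoint_id; return False if already committed."""
        if self.has_checkpoint(checkpoint_id):
            return False
        snapshot = Snapshot(
            snapshot_id=len(self.snapshots) + 1,
            files=tuple(files),
            summary={CHECKPOINT_KEY: checkpoint_id},
        )
        self.snapshots.append(snapshot)
        return True


if __name__ == "__main__":
    table = ToyTable()
    print(table.append(["a.parquet"], "ckpt-1"))  # first commit of ckpt-1
    print(table.append(["a.parquet"], "ckpt-1"))  # replay of ckpt-1 is skipped
    print(table.last_checkpoint())
```

Because the tracking ID travels inside the same snapshot as the data it describes, the data and the marker are written in one commit, which is why no separate metadata update (and so no transaction) is needed in this model.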

Iceberg transaction experience

2021-02-10 Thread Omar Aloraini
Hello All, Over the last two months I have been using Iceberg. For the most part it did what I expected, but when I started using the transaction API (Table::newTransaction) I came across a few behaviors that I consider counter-intuitive, at least for my perception of what a transaction is. My team's go