The WAP feature still requires you to commit each write as a new snapshot;
it just doesn't make that snapshot the current one. In that case, each
distributed process still needs to commit its change and generate a new
version file. It won't help if you want to avoid highly concurrent commits.
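As a rough illustration of the difference, here is a toy model in Python.
It is not the Iceberg API: ToyTable, wap_style, and coordinator_style are
hypothetical names, and the model only counts commits and version files. It
assumes that every commit, staged or not, produces one new version file.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyTable:
    """Toy stand-in for table metadata (hypothetical, not the Iceberg API)."""
    snapshots: List[List[str]] = field(default_factory=list)  # all committed snapshots
    current: Optional[int] = None  # index of the current snapshot, if any
    metadata_versions: int = 0     # assumed: one new version file per commit

    def commit(self, files, stage_only=False):
        """Commit files as a new snapshot; stage_only mimics a WAP-style write."""
        self.snapshots.append(list(files))
        self.metadata_versions += 1
        snapshot_id = len(self.snapshots) - 1
        if not stage_only:
            self.current = snapshot_id
        return snapshot_id


def wap_style(table, files_per_writer):
    """Each writer commits its own staged snapshot: one commit per writer."""
    return [table.commit(files, stage_only=True) for files in files_per_writer]


def coordinator_style(table, files_per_writer):
    """Writers hand their file lists to a coordinator, which commits once."""
    all_files = [f for files in files_per_writer for f in files]
    return table.commit(all_files)


writers = [["a.parquet"], ["b.parquet"], ["c.parquet"]]

wap = ToyTable()
wap_style(wap, writers)

coord = ToyTable()
coordinator_style(coord, writers)

print(wap.metadata_versions, wap.current)
print(coord.metadata_versions, coord.current)
```

With three writers, the WAP-style path in this model still performs three
commits and leaves no current snapshot until one is published, while the
coordinator path performs a single commit.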
Best,
Yufei
I believe this could also be achieved using the Write-Audit-Publish
feature, where you audit a set of writes and then choose to publish them.
I'm not as familiar with that feature, but you might look into it as
well.
Thanks,
Kyle Bendickson
?
Thanks,
Mayur
From: Jack Ye
Sent: Friday, December 3, 2021 4:26 PM
To: Iceberg Dev List
Subject: Re: Single multi-process commit
Hi Mayur,
I think what you describe of writing in parallel and committing using a
coordinator is the strategy used by most of the engines today. The stream
of DataFile objects (statistics collected from written data files) is passed to
the coordinator to do a single commit. In Spark, it's passed as
Write