It sounds good to me.

Thanks!
Regards
JB

On Wed, Dec 11, 2024 at 7:20 PM Russell Spitzer
<russell.spit...@gmail.com> wrote:
>
> Hi Y'all!
>
> Today we had a little discussion on the Apache Iceberg Catalog Community Sync
> about DROP and DROP WITH PURGE. Currently the SparkCatalog implementation
> inside of the reference library handles DROP WITH PURGE differently from
> other implementations. The pseudocode is essentially:
>
>
> ```
> use Spark to list files to be removed and delete them
> send a drop table request to the Catalog
> ```
>
> As opposed to other systems:
>
> ```
> send a drop table request to the Catalog with the purge flag enabled
> ```
>
> This has led us to a situation where it becomes difficult for REST Catalogs
> with custom purge implementations (or those that ignore purge) to
> work properly with Spark.
>
> Bringing this behavior in line with non-Spark implementations
> could have dramatic impacts on users of the Iceberg library, but our
> consensus in the Catalog Sync today was that we should eventually make
> it the default behavior. To this end I propose the following:
>
> 1. We support a flag to allow current Spark users to delegate to the REST
>    Catalog (all other catalog behaviors remain the same). PR available here
>    (credit to Tobias, who wrote the PR and brought up this topic).
> 2. We deprecate the client-side delete for Spark.
> 3. In the next major release (Iceberg 2.0?) we change the behavior officially
>    to only send through the Drop Purge flag, with no client-side file removal.
> 4. For all non-REST catalog implementations we keep the code the same for
>    legacy compatibility.
>
> A user of 1.8 will then be able to choose whether their Spark DROP PURGE
> for a REST catalog is done locally (client side) or remotely (by the catalog).
>
> A user of 2.0 will only be able to do a remote purge.
>
> Users of non-REST Catalogs will have no change in behavior.
>
>
> Thanks for your consideration,
> Russ
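
For readers following the thread, the two DROP WITH PURGE flows from the
quoted proposal can be sketched roughly as below. This is only an illustrative
model, not Iceberg code: FakeRestCatalog, client_side_purge, delegated_purge,
drop_with_purge and the delegate_to_rest flag name are all hypothetical, and
storage is modeled as a plain set of file paths.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRestCatalog:
    """Hypothetical stand-in for a REST catalog; records each drop request."""
    tables: dict  # table name -> list of data file paths
    requests: list = field(default_factory=list)

    def list_files(self, name):
        # Stand-in for "use Spark to list files to be removed".
        return list(self.tables.get(name, []))

    def drop_table(self, name, purge=False):
        # Record the request; what a real catalog does with `purge`
        # (custom purge, ignore purge, ...) is up to that catalog.
        self.requests.append((name, purge))
        return self.tables.pop(name, None) is not None


def client_side_purge(catalog, storage, name):
    """Current Spark flow: the client deletes files, then sends a plain drop."""
    for path in catalog.list_files(name):
        storage.discard(path)
    return catalog.drop_table(name, purge=False)


def delegated_purge(catalog, name):
    """Proposed flow: only send the drop request with the purge flag set."""
    return catalog.drop_table(name, purge=True)


def drop_with_purge(catalog, storage, name, delegate_to_rest=False):
    """Assumed opt-in flag: choose between the two flows for a REST catalog."""
    if delegate_to_rest:
        return delegated_purge(catalog, name)
    return client_side_purge(catalog, storage, name)
```

In this model the catalog only ever sees purge=True in the delegated flow,
which is the case where a catalog with a custom (or ignored) purge can apply
its own policy. In the client-side flow the files are already gone by the time
the catalog receives the drop request.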
