It sounds good to me. Thanks!

Regards
JB

On Wed, Dec 11, 2024 at 7:20 PM Russell Spitzer <russell.spit...@gmail.com> wrote:
>
> Hi Y'all!
>
> Today we had a little discussion on the Apache Iceberg Catalog Community Sync
> about DROP and DROP WITH PURGE. Currently the SparkCatalog implementation
> inside of the reference library has a unique method of DROP WITH PURGE vs
> other implementations. The pseudo code is essentially
>
> ```
> use Spark to list files to be removed and delete them
> send a drop table request to the Catalog
> ```
>
> As opposed to other systems
>
> ```
> send a drop table request to the Catalog with the purge flag enabled
> ```
>
> This has led us to a situation where it becomes difficult for REST Catalogs
> with custom purge implementations (or those that ignore purge) to
> work properly with Spark.
>
> Bringing this behavior in line with non-Spark implementations
> would have possibly dramatic impacts on users of the
> iceberg library, but our consensus in the Catalog Sync today was that we should
> eventually have that be the default behavior. To this end I propose the
> following:
>
> 1. We support a flag to allow current Spark users to delegate to the REST Catalog
>    (all other catalog behaviors remain the same). PR available here from
>    (Credit to Tobias who wrote the PR and brought up this topic)
> 2. We deprecate the client side delete for Spark.
> 3. In the next major release (Iceberg 2.0?) we change the behavior officially to
>    only send through the Drop Purge flag with no client side file removal.
> 4. For all non-REST catalog implementations we keep the code the same for legacy
>    compatibility.
>
> A user of 1.8 will then have the ability to choose, for their Spark DROP
> PURGEs, whether to purge locally or remotely for REST.
>
> A user of 2.0 will only be able to do a remote purge.
>
> Users of non-REST Catalogs will have no change in behavior.
>
>
> Thanks for your consideration,
> Russ
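
For readers of the thread, the two purge paths in the quoted proposal can be sketched roughly as below. This is only an illustration under stated assumptions: `FakeRestCatalog`, `drop_with_purge`, and the `delegate_to_catalog` parameter are hypothetical names made up for this sketch, not Iceberg or Spark APIs, and the flag name in the actual PR may differ.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRestCatalog:
    """Hypothetical in-memory stand-in for a REST catalog (not an Iceberg API)."""
    tables: dict = field(default_factory=dict)    # table name -> set of file paths
    requests: list = field(default_factory=list)  # log of (table name, purge flag)

    def drop_table(self, name, purge):
        # Record what the client asked for; a real catalog would decide
        # on its own how (or whether) to honor the purge flag.
        self.requests.append((name, purge))
        return self.tables.pop(name, None) is not None


def drop_with_purge(catalog, storage, name, delegate_to_catalog):
    """Sketch of DROP ... PURGE. `storage` is a set of file paths (assumption)."""
    if delegate_to_catalog:
        # Proposed behavior: only send the drop request with the purge flag set.
        return catalog.drop_table(name, purge=True)
    # Current Spark behavior: the client lists and deletes the files itself,
    # then sends a plain drop request.
    files = set(catalog.tables.get(name, ()))
    storage.difference_update(files)
    return catalog.drop_table(name, purge=False)
```

With `delegate_to_catalog=False` the client removes the files and the catalog only sees a plain drop; with `delegate_to_catalog=True` the client touches no files and the catalog receives the purge flag, so a custom purge implementation gets a chance to run.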