Hi Y'all!

Today we had a little discussion on the Apache Iceberg Catalog Community Sync about DROP and DROP WITH PURGE. Currently, the SparkCatalog implementation in the reference library handles DROP WITH PURGE differently from other implementations. The pseudocode is essentially:
```
use Spark to list files to be removed and delete them
send a drop table request to the Catalog
```

As opposed to other systems:

```
send a drop table request to the Catalog with the purge flag enabled
```

This has led us to a situation where it is difficult for REST Catalogs with custom purge implementations (or those that ignore purge) to work properly with Spark, because the Catalog never sees the purge request.

Bringing this behavior in line with non-Spark implementations could have dramatic impacts on users of the Iceberg library, but our consensus in the Catalog Sync today was that it should eventually be the default behavior.

To this end I propose the following:

- We support a flag to allow current Spark users to delegate to the REST Catalog (all other catalog behaviors remain the same). PR available here: <https://github.com/apache/iceberg/pull/11317> (*Credit to Tobias, who wrote the PR and brought up this topic*)
- We deprecate the client-side delete for Spark.
- In the next major release (Iceberg 2.0?) we change the behavior officially <https://github.com/apache/iceberg/issues/11754> to only send through the Drop Purge flag, with no client-side file removal.
- For all non-REST catalog implementations we keep the code the same for legacy compatibility.

A user of 1.8 will then be able to choose whether their Spark DROP PURGE operations purge locally or remotely for REST Catalogs.
A user of 2.0 will only be able to do a remote purge.
Users of non-REST Catalogs will see no change in behavior.

Thanks for your consideration,
Russ
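
P.S. For anyone who wants to see the difference concretely, here is a rough Python sketch of the two flows. It is a toy model only: `ToyCatalog`, `drop_with_purge`, and the `delegate_to_catalog` switch are hypothetical names made up for illustration, not the Iceberg or Spark API and not the flag name from the PR. The point it tries to show is that in the client-side flow the catalog only ever receives a plain drop, so a custom purge implementation never gets a chance to run.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for a REST catalog; not the Iceberg API."""
    tables: dict = field(default_factory=dict)    # table name -> list of file paths
    storage: set = field(default_factory=set)     # files that currently exist
    requests: list = field(default_factory=list)  # log of (table, purge) drop requests

    def create_table(self, name, files):
        self.tables[name] = list(files)
        self.storage.update(files)

    def drop_table(self, name, purge=False):
        # Record what the catalog was actually asked to do.
        self.requests.append((name, purge))
        files = self.tables.pop(name)
        if purge:
            # The catalog decides how to purge; this toy version just deletes.
            self.storage.difference_update(files)


def client_side_drop_purge(catalog, name):
    """Current Spark-style flow: the client deletes files, then sends a plain drop."""
    for path in list(catalog.tables[name]):
        catalog.storage.discard(path)
    catalog.drop_table(name, purge=False)


def delegated_drop_purge(catalog, name):
    """Proposed flow: only send the drop request with the purge flag enabled."""
    catalog.drop_table(name, purge=True)


def drop_with_purge(catalog, name, delegate_to_catalog):
    """Pick a flow based on a (hypothetical) user-facing switch."""
    if delegate_to_catalog:
        delegated_drop_purge(catalog, name)
    else:
        client_side_drop_purge(catalog, name)
```

In this sketch both flows leave storage empty, but only the delegated flow lets the catalog observe `purge=True`, which is what a catalog with a custom (or ignored) purge would need in order to behave as intended.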