Based on my observations, users don't appear to be missing this feature,
but I'm OK with adding it to Spark for compatibility purposes.

Yufei


On Wed, Jul 17, 2024 at 11:14 AM Szehon Ho <szehon.apa...@gmail.com> wrote:

> Hi Gabor
>
> I'm neutral on this, but can be convinced.  My initial thought is that
> there would be no way to have ADD PARTITION (I assume old Hive workloads
> would rely on this), and these are not ANSI SQL standard statements, while
> Spark is moving in that direction.
>
> The second point, guaranteeing a metadata-only operation, is interesting;
> an alternative would be a flag that makes a query fail unless it can be
> answered from metadata alone.
>
> Thanks
> Szehon
>
>
>
> On Wed, Jul 17, 2024 at 2:12 AM Gabor Kaszab <gaborkas...@apache.org>
> wrote:
>
>> Hey Community,
>>
>> I learned recently that Spark doesn't support DROP PARTITION for Iceberg
>> tables. I understand this is because DROP PARTITION is something used for
>> Hive tables, and Iceberg's hidden-partitioning model makes commands like
>> this unnatural.
>>
>> However, I think that DROP PARTITION would still have some value for
>> users. In fact in Impala we implemented this even for Iceberg tables.
>> Benefits could be:
>>  - Users with existing workloads on Hive tables could keep running them
>> after migrating their tables to Iceberg.
>>  - As opposed to DELETE FROM, DROP PARTITION guarantees a metadata-only
>> operation: no delete files are written.
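>>
>> For illustration, here is a rough sketch of the two statements (the table
>> name and partition column are made up; the first uses the Hive-style
>> syntax being discussed, which Spark currently rejects for Iceberg tables):
>>
>>   -- Metadata-only: removes the data files of the matching partition
>>   ALTER TABLE db.events DROP PARTITION (day = '2024-07-17');
>>
>>   -- May write delete files instead of dropping data files, e.g. on
>>   -- merge-on-read tables
>>   DELETE FROM db.events WHERE day = '2024-07-17';
>>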
>>
>> I'm curious what the community thinks of this.
>> Gabor
>>
>>
