Thanks, Ryan, for the confirmation; I'm definitely interested in taking a
look.  If it can be done, a serializable isolation level could probably be
an option here, as it is for the other operations.  I will look into it a
bit and ping you when I get a chance.
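
For context, a minimal sketch of the precedent I mean by "the other
operations": the row-level commands read their isolation level from table
properties.  The write.merge.isolation-level property below is the MERGE
one; an equivalent knob for insert overwrite would be new and is purely
hypothetical at this point:

    import org.apache.iceberg.Table;
    import org.apache.iceberg.catalog.Catalog;
    import org.apache.iceberg.catalog.TableIdentifier;

    // Opt a table's MERGE commits into serializable isolation. An
    // analogous property for INSERT OVERWRITE does not exist yet.
    static void useSerializableMerges(Catalog catalog) {
      Table table = catalog.loadTable(TableIdentifier.of("db", "foo"));
      table.updateProperties()
          .set("write.merge.isolation-level", "serializable")
          .commit();
    }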

Szehon

On Tue, Jul 20, 2021 at 5:11 PM Ryan Blue <b...@tabular.io> wrote:

> Szehon,
>
> We implemented the current behavior because that’s what was expected for
> INSERT OVERWRITE. But the ReplacePartitions operation uses the same base class
> as the expression overwrite, so you could add more validation, including
> the conflict checks that you’re talking about by calling the
> validateAddedDataFiles helper method
> <https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java#L249-L250>
> on the base class.
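>
> To make that concrete, here is a rough, untested sketch of what the extra
> check could look like in BaseReplacePartitions. The startingSnapshotId and
> conflictDetectionFilter fields and the exact validateAddedDataFiles
> signature are assumptions on my part, so check them against master:
>
>     // Sketch only: assumes a validateAddedDataFiles overload taking
>     // (base, startingSnapshotId, conflictDetectionFilter, caseSensitive).
>     @Override
>     public List<ManifestFile> apply(TableMetadata base) {
>       if (conflictDetectionFilter != null) {
>         // fail the commit if a concurrent commit added files matching the
>         // filter since the snapshot this overwrite was based on
>         validateAddedDataFiles(
>             base, startingSnapshotId, conflictDetectionFilter, caseSensitive);
>       }
>       return super.apply(base);
>     }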
>
> If you want to implement this, ping me on Slack and I can point you in the
> right direction.
>
> Ryan
>
> On Tue, Jul 20, 2021 at 4:20 PM Szehon Ho <szehon.apa...@gmail.com> wrote:
>
>> Hi,
>>
>> Does anyone know if it's feasible to make Spark's "insert overwrite"
>> implement a serializable transaction, the way delete, update, and merge do?
>>
>> Maybe at least for "overwrite by filter", since the filter can narrow down
>> the conflict checks needed on the commitWithSerializableTransaction side.  I
>> don't have the full context on the Spark side to know whether it's feasible
>> to do the rewrite the way Delete/Merge/Update do, in order to use this
>> mechanism.
>>
>> It's for a use case like "insert overwrite into table foo partition
>> (date=...) select ... from foo", which I understand is not the common use
>> case for insert overwrite, since the source is usually a select from
>> another table.
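>>
>> To make that scenario concrete, a sketch with a hypothetical partition
>> value and made-up columns (assumes an existing SparkSession named spark):
>>
>>     // Self-referencing overwrite: the SELECT reads from the same table
>>     // and partition that the INSERT OVERWRITE replaces.
>>     spark.sql(
>>         "INSERT OVERWRITE TABLE foo PARTITION (date = '2021-07-20') " +
>>         "SELECT id, data FROM foo WHERE date = '2021-07-20'");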
>>
>> Thanks in advance,
>> Szehon
>>
>
> --
> Ryan Blue
> Tabular
>
