Szehon,

We implemented the current behavior because that’s what was expected for INSERT
OVERWRITE. But the ReplacePartitions operation uses the same base class as
the expression overwrite, so you could add more validation, including the
conflict checks you’re describing, by calling the
validateAddedDataFiles helper method
<https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java#L249-L250>
on the base class.
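To make the shape of that change concrete, here is a small self-contained sketch of the pattern being described: a shared base class exposes an opt-in conflict-validation hook that a partition-replace operation can enable. All class and method names here (SnapshotProducer, ReplacePartitions, validateNoConflictingFiles, the example partition path) are illustrative stand-ins, not Iceberg's actual API; in Iceberg the real hook would be the validateAddedDataFiles helper on MergingSnapshotProducer linked above.

```java
import java.util.List;
import java.util.function.Predicate;

// Illustrative base class: analogous role to MergingSnapshotProducer,
// which both the expression overwrite and ReplacePartitions extend.
abstract class SnapshotProducer {
    private Predicate<String> conflictFilter = null;

    // Opt-in hook a subclass can call, analogous to invoking the
    // validateAddedDataFiles helper on the base class.
    protected void validateNoConflictingFiles(Predicate<String> filter) {
        this.conflictFilter = filter;
    }

    // 'concurrentlyAddedFiles' stands in for files committed by other
    // writers since this operation's snapshot was read.
    String commit(List<String> concurrentlyAddedFiles) {
        if (conflictFilter != null) {
            for (String file : concurrentlyAddedFiles) {
                if (conflictFilter.test(file)) {
                    throw new IllegalStateException("Conflicting file: " + file);
                }
            }
        }
        return "committed";
    }
}

// Today's INSERT OVERWRITE behavior: no validation, last writer wins.
// With validation enabled, only files landing in the replaced
// partitions are treated as conflicts, narrowing the checks needed.
class ReplacePartitions extends SnapshotProducer {
    ReplacePartitions(boolean serializable) {
        if (serializable) {
            validateNoConflictingFiles(path -> path.contains("date=2021-07-20"));
        }
    }
}

public class Demo {
    public static void main(String[] args) {
        // Current behavior: a concurrent write to the same partition is ignored.
        ReplacePartitions lazy = new ReplacePartitions(false);
        System.out.println(lazy.commit(List.of("data/date=2021-07-20/f1.parquet")));

        // With the opt-in check: the same concurrent write fails validation.
        ReplacePartitions strict = new ReplacePartitions(true);
        try {
            strict.commit(List.of("data/date=2021-07-20/f2.parquet"));
            System.out.println("no conflict");
        } catch (IllegalStateException e) {
            System.out.println("conflict detected");
        }
    }
}
```

The point of the sketch is only that the validation is additive: because ReplacePartitions already shares the base class, enabling the check doesn't require restructuring the operation.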

If you want to implement this, ping me on Slack and I can point you in the
right direction.

Ryan

On Tue, Jul 20, 2021 at 4:20 PM Szehon Ho <szehon.apa...@gmail.com> wrote:

> Hi,
>
> Does anyone know if it's feasible to consider making Spark's "insert
> overwrite" implement a serializable transaction, like delete, update, merge?
>
> Maybe at least for "overwrite by filter", since that can narrow down the
> conflict checks needed on the commitWithSerializableTransaction side.  I
> don't have the full context on the Spark side on whether it's feasible to do
> the rewrite as Delete/Merge/Update does, to use this mechanism.
>
> It's for a use case like "insert overwrite into table foo partition
> (date=...) select ... from foo", which I understand is not the common use
> case for insert overwrite, as it's usually a select from another table.
>
> Thanks in advance,
> Szehon
>

-- 
Ryan Blue
Tabular
