Thanks Ryan for the confirmation, I'm definitely interested in taking a look. If it can be done, the serializable isolation level could probably be an option here, as it is for the other operations. I will look a bit and ping you when I get a chance.
Szehon

On Tue, Jul 20, 2021 at 5:11 PM Ryan Blue <b...@tabular.io> wrote:

> Szehon,
>
> We implemented the current behavior because that’s what was expected for
> INSERT OVERWRITE. But the ReplacePartitions operation uses the same base
> class as the expression overwrite, so you could add more validation,
> including the conflict checks that you’re talking about, by calling the
> validateAddedDataFiles helper method
> <https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java#L249-L250>
> on the base class.
>
> If you want to implement this, ping me on Slack and I can point you in the
> right direction.
>
> Ryan
>
> On Tue, Jul 20, 2021 at 4:20 PM Szehon Ho <szehon.apa...@gmail.com> wrote:
>
>> Hi,
>>
>> Does anyone know if it's feasible to consider making Spark's "insert
>> overwrite" implement a serializable transaction, like delete, update,
>> and merge?
>>
>> Maybe at least for "overwrite by filter"; then it could narrow down the
>> conflict checks needed on the commitWithSerializableTransaction side. I
>> don't have the full context on the Spark side of whether it's feasible
>> to do the rewrite as Delete/Merge/Update do, to use this mechanism.
>>
>> It's for a use case like "insert overwrite into table foo partition
>> (date=...) select ... from foo", which I understand is not the common
>> use case for insert overwrite, as it's usually a select from another
>> table.
>>
>> Thanks in advance,
>> Szehon

--
Ryan Blue
Tabular
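[Editor's note: the conflict check discussed above boils down to this rule: a commit that read its base snapshot at some point must fail if any concurrent commit has since added data files in a partition the overwrite replaces. The sketch below is a minimal, self-contained illustration of that idea only; it is not the actual Iceberg MergingSnapshotProducer API, and the class, record, and method names here are hypothetical.]

```java
import java.util.List;
import java.util.Objects;

// Hypothetical model of the serializable-isolation check for INSERT OVERWRITE:
// a commit that started from `baseSnapshotId` must fail if any later snapshot
// added files to a partition the overwrite is about to replace.
public class OverwriteConflictCheck {

    // A data file appended by some snapshot, tagged with its partition value.
    record AddedFile(long snapshotId, String partition) {}

    static void validateAddedDataFiles(long baseSnapshotId,
                                       String overwrittenPartition,
                                       List<AddedFile> history) {
        for (AddedFile f : history) {
            // Files committed after our base snapshot, in a partition we are
            // replacing, are a serializable-isolation conflict.
            if (f.snapshotId() > baseSnapshotId
                    && Objects.equals(f.partition(), overwrittenPartition)) {
                throw new IllegalStateException(
                    "Conflict: snapshot " + f.snapshotId()
                        + " added files to partition " + f.partition());
            }
        }
    }

    public static void main(String[] args) {
        List<AddedFile> history = List.of(
            new AddedFile(10, "date=2021-07-19"),
            new AddedFile(11, "date=2021-07-20"));

        // Reader started at snapshot 11: nothing newer, the commit is safe.
        validateAddedDataFiles(11, "date=2021-07-20", history);

        // Reader started at snapshot 10: snapshot 11 wrote into the partition
        // being overwritten, so serializable isolation rejects the commit.
        boolean conflicted = false;
        try {
            validateAddedDataFiles(10, "date=2021-07-20", history);
        } catch (IllegalStateException e) {
            conflicted = true;
        }
        System.out.println(conflicted); // prints "true"
    }
}
```

Narrowing the check to the overwrite's filter, as suggested above for "overwrite by filter", corresponds to matching only the partitions that filter selects rather than the whole table.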