Re: Serializable isolation for insert overwrites?

2021-07-20 Thread Szehon Ho
Thanks Ryan for the confirmation, I'm definitely interested in taking a look. If it can be done, the serializable isolation level could probably be an option, as it is for the other operations. I will look a bit and ping you when I get a chance. Szehon On Tue, Jul 20, 2021 at 5:11 PM Ryan Blue wrote:

Re: Java Deserialization Vulnerability

2021-07-20 Thread Steven Wu
Yeah, it is a general Java serialization issue, wider than just Iceberg tables. Flink typically doesn't recommend Java serialization for checkpoint state, as that can't support schema evolution. Flink has built-in support for schema evolution for POJO or Avro data types. On Mon, Jul 19, 2021 at

Re: Serializable isolation for insert overwrites?

2021-07-20 Thread Ryan Blue
Szehon, We implemented the current behavior because that’s what was expected for INSERT OVERWRITE. But the ReplacePartitions operation uses the same base class as the expression overwrite, so you could add more validation, including the conflict checks that you’re talking about by calling the vali

Serializable isolation for insert overwrites?

2021-07-20 Thread Szehon Ho
Hi, Does anyone know if it's feasible to consider making Spark's "insert overwrite" implement a serializable transaction, like delete, update, and merge? Maybe at least for "overwrite by filter", since that can narrow down the conflict checks needed on the commitWithSerializableTransaction side. I don't h

Re: Proposal: Support for views in Iceberg

2021-07-20 Thread Anjali Norwood
Thank you Ryan (M), Piotr and Vivekanand for the comments. I have and will continue to address them in the doc. Great to know about Trino views, Piotr! Thanks to everybody who has offered help with implementation. The spec as it is proposed in the doc has been implemented and is in use at Netflix

Subsurface LIVE tomorrow morning

2021-07-20 Thread Dave Nielsen
Hey folks, a reminder - SubsurfaceConf starts tomorrow. It's free, and there are 3 Iceberg talks: - Why & How Netflix Created and Migrated to Iceberg - by Ted Gooch of Netflix - Iceberg Case Studies - by Ryan Blue - Enabling Analysts to Build a Lakehouse with Spark SQL & Iceberg - by Sachin Bansal

Re: Proposal: Support for views in Iceberg

2021-07-20 Thread Piotr Findeisen
Hi, FWIW, in Trino we just added Trino views support. https://github.com/trinodb/trino/pull/8540 Of course, this is by no means usable by other query engines. Anjali, your document does not talk much about compatibility between query engines. How do you plan to address that? For example, I am fa

Re: Proposal: Support for views in Iceberg

2021-07-20 Thread Ryan Murray
Thanks Anjali! I have left some comments on the document. I unfortunately have to miss the community meetup tomorrow but would love to chat more/help w/ implementation. Best, Ryan On Tue, Jul 20, 2021 at 7:42 AM Anjali Norwood wrote: > Hello, > > John Zhuge and I would like to propose the foll