Re: Java Deserialization Vulnerability

2021-07-20 Thread Steven Wu
Yeah, it is a general Java serialization issue, wider than just Iceberg tables. Typically Flink doesn't recommend Java serialization for checkpoint state, as it can't support schema evolution. Flink has built-in support for schema evolution for POJO or Avro data types. On Mon, Jul 19, 2021 at

Re: Java Deserialization Vulnerability

2021-07-19 Thread Ryan Blue
Yes, I think so. Sounds like an unsecured bucket could lead to code execution running with data infrastructure privileges. While it isn't exactly Flink's problem, we should probably treat this like a potential privilege escalation issue. How does Flink handle this for other cases? I think it would

Re: Java Deserialization Vulnerability

2021-07-19 Thread Steven Wu
Let's assume the Flink checkpoint state is uploaded to S3. An attacker needs to be able to read from and write to S3 to manipulate the S3 files. Is this the scenario we are concerned about? On Mon, Jul 19, 2021 at 3:51 PM Ryan Blue wrote: > Thanks, Steven. Do you think that there is a potential pro

Re: Java Deserialization Vulnerability

2021-07-19 Thread Ryan Blue
Thanks, Steven. Do you think that there is a potential problem with an attacker having access to where the state is stored and using that to inject code? Is this something we should just update to avoid it entirely? On Mon, Jul 19, 2021 at 3:43 PM Steven Wu wrote: > I believe Flink source is the

Re: Java Deserialization Vulnerability

2021-07-19 Thread Steven Wu
I believe the Flink source is the only place that uses Java serialization for checkpoint state: https://github.com/apache/iceberg/issues/1698. @OpenInx already updated the Flink sink to avoid Java serialization (a long time ago). On Mon, Jul 19, 2021 at 1:53 PM Jack Ye wrote: > Yes I totally agree t

Re: Java Deserialization Vulnerability

2021-07-19 Thread Jack Ye
Yes, I totally agree that the distributed system itself should ensure the integrity of objects passing across nodes. I am more concerned about the Flink case where some information is persisted and can be modified to execute arbitrary code. Maybe people working on Flink can comment on this a bit

Re: Java Deserialization Vulnerability

2021-07-19 Thread Ryan Blue
Jack, I might be incorrect here, but I'll at least throw out some thoughts. If I understand correctly, the attacker requires access to modify some serialized object so that deserialization leads to arbitrary code execution. I think that the best way to protect against that is to avoid making it po

Java Deserialization Vulnerability

2021-07-17 Thread Jack Ye
Hi everyone, We use Java serialization and deserialization a lot in Iceberg. I wonder if we have considered the potential for a Java deserialization attack, where an attacker can replace serialized bytes to execute arbitrary code through the readObject method. Currently our SerializationUtil.deseria
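
For readers unfamiliar with the attack described above, here is a minimal sketch of one common mitigation: an ObjectInputStream subclass that only resolves classes on an allow-list. It is an illustration only, not Iceberg's or Flink's code. The names AllowListDemo and AllowListObjectInputStream are hypothetical, and the sketch assumes the set of classes expected in the stream is known in advance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;

// Hypothetical demo class; not part of Iceberg or Flink.
public class AllowListDemo {

  /** ObjectInputStream that refuses to resolve any class not on the allow-list. */
  static final class AllowListObjectInputStream extends ObjectInputStream {
    private final Set<String> allowed;

    AllowListObjectInputStream(InputStream in, Set<String> allowed) throws IOException {
      super(in);
      this.allowed = allowed;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
        throws IOException, ClassNotFoundException {
      // Called for each class descriptor in the stream, before the class is
      // loaded and before any readObject logic of that class can run.
      if (!allowed.contains(desc.getName())) {
        throw new InvalidClassException(desc.getName(), "class is not on the allow-list");
      }
      return super.resolveClass(desc);
    }
  }

  /** Plain Java serialization to a byte array. */
  public static byte[] serialize(Object obj) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(obj);
    }
    return bytes.toByteArray();
  }

  /** Deserializes, rejecting any class whose name is not in {@code allowed}. */
  public static Object deserialize(byte[] data, Set<String> allowed)
      throws IOException, ClassNotFoundException {
    try (ObjectInputStream in =
        new AllowListObjectInputStream(new ByteArrayInputStream(data), allowed)) {
      return in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    byte[] data = serialize(new ArrayList<>(Arrays.asList("a", "b")));

    // Expected class is on the allow-list: the round trip succeeds.
    Object ok = deserialize(data, Collections.singleton("java.util.ArrayList"));
    System.out.println("accepted: " + ok);

    // Same bytes, empty allow-list: the stream is rejected before ArrayList is loaded.
    try {
      deserialize(data, Collections.<String>emptySet());
    } catch (InvalidClassException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

An allow-list only narrows the set of classes an attacker can instantiate. It does not replace protecting the storage location itself or moving away from Java serialization, which are the options discussed in the replies.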