Yeah, it is a general Java serialization issue, wider than just Iceberg
tables. Flink typically doesn't recommend Java serialization for checkpoint
state, because it can't support schema evolution. Flink has built-in support
for schema evolution for POJO and Avro data types.
On Mon, Jul 19, 2021 at
Yes, I think so. Sounds like an unsecured bucket could lead to code
execution running with data infrastructure privileges. While it isn't
exactly Flink's problem, we should probably treat this like a potential
privilege escalation issue. How does Flink handle this for other cases? I
think it would
Let's assume the Flink checkpoint state is uploaded to S3. An attacker would
need to be able to read from and write to S3 to manipulate the S3 files. Is
this the scenario we are concerned about?
On Mon, Jul 19, 2021 at 3:51 PM Ryan Blue wrote:
Thanks, Steven. Do you think that there is a potential problem with an
attacker having access to where the state is stored and using that to
inject code? Is this something we should just update to avoid it entirely?
On Mon, Jul 19, 2021 at 3:43 PM Steven Wu wrote:
I believe the Flink source is the only place that uses Java serialization
for checkpoint state: https://github.com/apache/iceberg/issues/1698.
@OpenInx already updated the Flink sink to avoid Java serialization (a long
time ago).
On Mon, Jul 19, 2021 at 1:53 PM Jack Ye wrote:
Yes, I totally agree that the distributed system itself should ensure the
integrity of objects passed across nodes. I am more concerned about the
Flink case where some information is persisted and can be modified to
execute arbitrary code. Maybe people working on Flink can comment on this a
bit
Jack,
I might be incorrect here, but I'll at least throw out some thoughts. If I
understand correctly, the attacker requires access to modify some
serialized object so that deserialization leads to arbitrary code
execution. I think that the best way to protect against that is to avoid
making it po
Hi everyone,
We use Java serialization and deserialization a lot in Iceberg. I wonder if
we have considered the potential for a Java deserialization attack, where an
attacker replaces serialized bytes to execute arbitrary code through the
readObject method.
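To make the concern concrete, here is a minimal sketch of one common
mitigation: an ObjectInputStream subclass that checks each class descriptor
against an allowlist before the class is loaded, so readObject on an
unexpected class never runs. The class name AllowlistDeserializer and its
helpers are hypothetical and used only for illustration; this is not
Iceberg's SerializationUtil or anything Flink ships.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Set;

// Hypothetical helper for illustration only; not an Iceberg or Flink API.
final class AllowlistDeserializer {

  // Rejects any class descriptor in the stream whose name is not allowlisted.
  private static final class AllowlistObjectInputStream extends ObjectInputStream {
    private final Set<String> allowed;

    AllowlistObjectInputStream(InputStream in, Set<String> allowed) throws IOException {
      super(in);
      this.allowed = allowed;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
        throws IOException, ClassNotFoundException {
      // Called for each class descriptor before the class is loaded.
      if (!allowed.contains(desc.getName())) {
        throw new InvalidClassException(
            desc.getName(), "class is not on the deserialization allowlist");
      }
      return super.resolveClass(desc);
    }
  }

  static byte[] serialize(Object obj) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(obj);
    }
    return buffer.toByteArray();
  }

  static Object deserialize(byte[] bytes, Set<String> allowed)
      throws IOException, ClassNotFoundException {
    try (ObjectInputStream in =
        new AllowlistObjectInputStream(new ByteArrayInputStream(bytes), allowed)) {
      return in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    Set<String> allowed = Collections.singleton("java.util.ArrayList");

    // An allowlisted class round-trips normally.
    byte[] listBytes = serialize(new ArrayList<>(Arrays.asList("a", "b")));
    System.out.println(deserialize(listBytes, allowed));

    // A class outside the allowlist is rejected before it is instantiated.
    byte[] mapBytes = serialize(new HashMap<String, String>());
    try {
      deserialize(mapBytes, allowed);
    } catch (InvalidClassException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

A real allowlist would also need entries for serializable superclasses,
array types, and the types of nested fields, since each of those appears as
its own class descriptor in the stream. On Java 9 and later,
java.io.ObjectInputFilter offers a built-in way to express similar checks.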
Currently our SerializationUtil.deseria