Hi Ivan,
Beam coders are wrapped in Flink's TypeSerializers. So I don't think it
will result in double serialization.
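For illustration, here is a rough sketch of what that wrapping looks like (this is a simplified, hypothetical adapter, not the actual one from the Beam Flink runner, and most required TypeSerializer methods are omitted):

import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.beam.sdk.coders.Coder;
import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.core.memory.DataInputView;
import org.apache.flink.core.memory.DataOutputView;

// Hypothetical sketch of a Flink TypeSerializer delegating to a Beam Coder.
// The bytes are produced by the Beam coder exactly once; Flink only moves
// them around, so there is no second serialization layer. Declared abstract
// because the remaining TypeSerializer methods (duplicate, copy,
// snapshotConfiguration, ...) are omitted for brevity.
public abstract class BeamCoderSerializer<T> extends TypeSerializer<T> {

  private final Coder<T> coder;

  protected BeamCoderSerializer(Coder<T> coder) {
    this.coder = coder;
  }

  @Override
  public void serialize(T record, DataOutputView target) throws IOException {
    // Adapt Flink's DataOutputView to the OutputStream the coder expects.
    coder.encode(record, new OutputStream() {
      @Override
      public void write(int b) throws IOException {
        target.write(b);
      }
    });
  }

  @Override
  public T deserialize(DataInputView source) throws IOException {
    // Adapt Flink's DataInputView to an InputStream for the coder.
    return coder.decode(new InputStream() {
      @Override
      public int read() throws IOException {
        try {
          return source.readUnsignedByte();
        } catch (EOFException e) {
          return -1; // signal end of stream
        }
      }
    });
  }
}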
Thanks!
Eleanore
On Tue, May 19, 2020 at 4:04 AM Ivan San Jose wrote:
Perfect, thank you so much Arvid. I'd expect more people to be using Beam
on top of Flink, but it seems it is not so popular.
On Tue, 2020-05-19 at 12:46 +0200, Arvid Heise wrote:
Hi Ivan,
I fear that only a few mailing list users actually have deeper Beam
experience. I only used it briefly 3 years ago. Most users here are using
Flink directly to avoid these kinds of double-abstraction issues.
It might be better to switch to the Beam mailing list if you have
Beam-specific questions.
Actually, I'm also thinking about how Beam coders relate to the runner's
serialization... I mean, on Beam you specify a coder for each Java type so
that Beam is able to serialize/deserialize that type, but then, is the
runner used under the hood serializing/deserializing the result again, so
that everything ends up being serialized twice?
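(For context, this is roughly how a coder is pinned to a type in Beam; the
MyEvent class is just a placeholder, and the AvroCoder import path assumes
a Beam version from this era, where AvroCoder lives in the core SDK:)

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.AvroCoder;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class CoderRegistrationExample {

  // Placeholder event type, purely illustrative.
  public static class MyEvent {
    public String id;
    public long timestamp;
  }

  public static void main(String[] args) {
    Pipeline pipeline =
        Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Tell Beam which coder to use for this Java type; the runner then
    // applies this coder wherever MyEvent values are shuffled or stored.
    pipeline.getCoderRegistry()
        .registerCoderForClass(MyEvent.class, AvroCoder.of(MyEvent.class));
  }
}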
Yep, sorry if I'm bothering you, but I think I'm still not getting this:
how could I tell Beam to tell Flink to use that serializer instead of the
standard Java one? Because I think Beam is abstracting us from Flink's
checkpointing mechanism, so I'm afraid that if we use the Flink API
directly we might break something.
Hi Ivan,
The easiest way is to use some implementation that's already there [1]. I
already mentioned Avro and would strongly recommend giving it a go. If you
make sure to provide a default value for as many fields as possible, you
can always remove them later, giving you great flexibility. I can give
more details if needed.
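As a sketch of that pattern (names are placeholders; this assumes
AvroCoder's reflection-based schemas, where @Nullable fields default to
null and can therefore be added or removed later without breaking old
records):

import org.apache.avro.reflect.Nullable;
import org.apache.beam.sdk.coders.AvroCoder;
import org.apache.beam.sdk.coders.DefaultCoder;

// Evolution-friendly type: @Nullable gives the field a null default in the
// generated Avro schema, so older encoded records remain readable.
@DefaultCoder(AvroCoder.class)
public class MyEvent {
  public String id;                         // present from day one
  @Nullable public String addedLaterField;  // safe to add or remove later
}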
Thanks for your complete answer Arvid, we will try to approach all the
things you mentioned, but take into account that we are using Beam on top
of Flink, so, to be honest, I don't know how we could implement the custom
serialization thing (
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/s
Hi Ivan,
First, let's address the issue with idle partitions. The solution is to use
a watermark assigner that also emits a watermark after some idle timeout [1].
Now the second question, on why Kafka offsets are committed for in-flight,
checkpointed data. The basic idea is that you are not losing any data: on
recovery, Flink restores the offsets from the checkpoint itself and simply
re-reads and reprocesses everything after the last completed checkpoint; the
offsets committed back to Kafka are essentially just for progress monitoring.
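A minimal sketch of such an assigner (this assumes Flink 1.11+, where
WatermarkStrategy.withIdleness exists; on 1.10 the same effect needs a
custom AssignerWithPeriodicWatermarks, and MyEvent is a placeholder type):

import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class IdlenessExample {

  // Placeholder event type carrying an event-time field.
  public static class MyEvent {
    public long timestamp;
  }

  public static WatermarkStrategy<MyEvent> strategy() {
    return WatermarkStrategy
        // tolerate up to 10s of out-of-order events
        .<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(10))
        .withTimestampAssigner((event, recordTs) -> event.timestamp)
        // after 1 minute without records, mark the partition idle so it
        // no longer holds back the overall watermark
        .withIdleness(Duration.ofMinutes(1));
  }
}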
Hi, we are starting to use Beam with Flink as the runner in our
applications, and recently we would like to get the advantages that Flink
checkpointing provides, but it seems we are not understanding it
clearly.
Simplifying, our application does the following:
- Read messages from a couple of Kafka topics
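A minimal Beam sketch of that first step (the broker address and topic
names are placeholders):

import java.util.Arrays;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReadFromKafkaExample {
  public static void main(String[] args) {
    Pipeline pipeline =
        Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Consume two topics; KafkaIO tracks offsets through the runner's
    // checkpointing mechanism when running on Flink.
    PCollection<KV<String, String>> messages =
        pipeline.apply(KafkaIO.<String, String>read()
            .withBootstrapServers("broker:9092")
            .withTopics(Arrays.asList("topic-a", "topic-b"))
            .withKeyDeserializer(StringDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
            .withoutMetadata()); // keep only KV<key, value>

    pipeline.run();
  }
}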