Re: Flink runner - restore from savepoint after evolving POJOs

2021-07-09 Thread Pavel Solomin
Hello Luke, thanks for your reply. I do not set serialVersionUID myself, but when I change Avro schema of my PoJo, the generated class has that VersionUID changed. So, as I understand your suggestion, it is basically this: 1 - don't use Avro-generated PoJo, cause there is no control of serialVers

Dataproc+Flink+Beam Portable Runner Tutorial

2021-07-09 Thread Joey Tran
Hello! I'm trying to just demo Beam/Flink and I tried following the instructions with Google's Dataproc but I get a bunch of errors ranging from jackson dependency issues to some issue about "No container id". Does anyone know if these dataproc instructions[1] are complete? I ran through it prett

Re: GroupIntoShards not sending bytes further when dealing with huge amount of data

2021-07-09 Thread Evan Galpin
I wanted to circle back here and state that using release 2.31.0, with _or_ without --experiments=use_deprecated_read, seems to resolve the slowness in my case. It's still on my radar to get a minimal pipeline extracted from my previously problematic pipeline so as to hopefully aid in debugging ef

[ANNOUNCE] Beam 2.31.0 Released

2021-07-09 Thread Andrew Pilloud
The Apache Beam team is pleased to announce the release of version 2.31.0. Apache Beam is an open source unified programming model to define and execute data processing pipelines, including ETL, batch and stream (continuous) processing. See https://beam.apache.org You can download the release her

Calculating Top K element per window with Beam SQL

2021-07-09 Thread Talat Uyarer
Hi, I am trying to calculate the top k element based on a streaming aggregation per window. Do you know if there is any support for it on BeamSQL or How can achieve this goal with BeamSQL on stream ? Sample Query SELECT customer_id, app, sum(bytes_total) as bytes_total FROM PCOLLECTION GROUP BY c

Re: Dataproc+Flink+Beam Portable Runner Tutorial

2021-07-09 Thread Joey Tran
Hi all, Thank you very much for the responses! I feel a bit better about using dataproc since it's not in beta like flink on GKE. I rebuilt the flinkrunner as you specified but I still get an error. I've attached the stdout from trying to run with the patched flink runner. Here's instructions t

Re: Dataproc+Flink+Beam Portable Runner Tutorial

2021-07-09 Thread Joey Tran
Hi kyle, I noticed the one you linked didnt work when i copied and pasted it so chose the only one in the directory and that’s how when i got the error message Thanks! Joey On Fri, Jul 9, 2021 at 4:50 PM Kyle Weaver wrote: > Hi Joey, my mistake, I picked the wrong jar. The correct jar should b