Flink 1.12.8 release

2022-02-15 Thread Joey L
Hi, Is there a planned release date for 1.12.8 or a scheduled release cycle for minor versions? Regards, J

Job manager slots are in bad state.

2022-02-15 Thread Josson Paul
We are using Flink version 1.11.2. At times, when task managers are restarted for some reason, the job managers throw the exception I attached here: an IllegalStateException. We never had this issue with Flink 1.8; it started happening after upgrading to Flink 1.11.2. Why are the slots n

Re: Joining Flink tables with different watermark delay

2022-02-15 Thread Meghajit Mazumdar
Hi Francesco, Thank you so much for your reply. This was really helpful. In reply to your tips: *> As described here , we have deprecated the syntax `GROUP BY WINDOW`, you should

Re: Failed to serialize the result for RPC call : requestMultipleJobDetails after Upgrading to Flink 1.14.3

2022-02-15 Thread Chirag Dewan
Ah, I should have looked better. I think https://issues.apache.org/jira/browse/FLINK-25732 causes this. Are there any side effects of this? How can I avoid this problem so that it doesn't affect my processing? Thanks On Wednesday, 16 February, 2022, 10:19:12 am IST, Chirag Dewan wrote:

Failed to serialize the result for RPC call : requestMultipleJobDetails after Upgrading to Flink 1.14.3

2022-02-15 Thread Chirag Dewan
Hi, We are running a Flink cluster with 2 JMs in HA and 2 TMs on a standalone K8s cluster. After migrating to 1.14.3, we started to see some exceptions in the JM logs: 2022-02-15 11:30:00,100 ERROR org.apache.flink.runtime.rest.handler.job.JobIdsHandler [] POD_NAME: eric-bss-em-sm-streamser

Re: TM OOMKilled

2022-02-15 Thread Xintong Song
Thanks Alexey, In my experience, common causes for TM OOMKill are: 1. RocksDB uses more memory than expected. Unfortunately, the memory hard limit is not supported by RocksDB. Flink conservatively estimates RocksDB's memory footprint and tunes its parameters accordingly, which is not 100% safe. 2.
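For reference, the configuration options usually involved in these two causes look roughly like this in flink-conf.yaml (a sketch only; the sizes are illustrative placeholders, not recommendations, and the right values depend on the workload):

```
# Managed memory is the budget Flink plans for RocksDB (enabled by default)
state.backend.rocksdb.memory.managed: true
# Extra JVM headroom in case RocksDB overshoots its estimated footprint
taskmanager.memory.jvm-overhead.max: 1g
# Off-heap memory reserved for user code and libraries (e.g. Kafka clients)
taskmanager.memory.task.off-heap.size: 256m
```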

Re: Flink 1.12.x DataSet --> Flink 1.14.x DataStream

2022-02-15 Thread saravana...@gmail.com
Thanks Zhipeng. Working as expected. Thanks once again. Saravanan On Tue, Feb 15, 2022 at 3:23 AM Zhipeng Zhang wrote: > Hi Saravanan, > > One solution could be using a streamOperator to implement `BoundedOneInput` > interface. > An example code could be found here [1]. > > [1] > https://gith

Re: Exception Help

2022-02-15 Thread Jonathan Weaver
I've narrowed it down to a TableSource that is returning a MAP type as a column. Only errors when the column is referenced, and not on the first row, but somewhere in the stream of rows. On 1.15 master branch (I need the new JSON features in 1.15 for this project so riding the daily snapshot durin

Change column names Pyflink Table/Datastream API

2022-02-15 Thread Francis Conroy
Hi all, I'm hoping to be able to change the column names when creating a table from a datastream, the flatmap function generating the stream is returning a Tuple4. It's currently working as follows: inputmetrics = table_env.from_data_stream(ds, Schema.new_builder()

How to get memory specific metrics for tasknodes

2022-02-15 Thread Diwakar Jha
Hello, I'm running Flink 1.11 on AWS EMR using the Yarn application. I'm trying to access memory metrics (Heap.Max, Heap.Used) per tasknode in CloudWatch. I have 50 tasknodes and it creates millions of metrics (including per operator), though I need only a few metrics per tasknode (Heap.Max, Heap.Use

Re: Exception Help

2022-02-15 Thread Sid Kal
Hi Jonathan, It would be better if you describe your scenario along with the code. It would be easier for the community to help. On Tue, 15 Feb 2022, 23:33 Jonathan Weaver, wrote: > I'm getting the following exception running locally from my IDE (IntelliJ) > but seems to not occur > when runnin

Exception Help

2022-02-15 Thread Jonathan Weaver
I'm getting the following exception running locally from my IDE (IntelliJ) but seems to not occur when running on a cluster. I'm assuming it may be related to memory settings on the runtime (machine has 64GB of ram avail) but not sure what setting to try and change. Caused by: java.lang.IndexOutOf

Re: Log4j2 configuration

2022-02-15 Thread jonas eyob
1. Ok, thanks! 2. We are using application mode. No changes to the distribution other than updating the log4j-console.properties file. content of /lib/: * flink-csv-1.14.3.jar * flink-json-1.14.3.jar * flink-table_2.12-1.14.3.jar * log4j-api-2.17.1.jar * log4j-slf4j-impl-2.17.1.jar * flink-dist_2

Re: method select(org.apache.flink.table.api.ApiExpression cannot find symbol .select($("handlingTime"),

2022-02-15 Thread HG
😬 it worked Table t = tupled3DsTable .window(Session.withGap(lit(5).minutes()).on($("handlingTime")).as("w")) .groupBy($("transactionId")) .select($("handlingTime"), $("transactionId"), $("originalEvent"), $("handlingTime").sum().over($("w"))); Op di 15 feb. 2022 om 15:50

Re: Log4j2 configuration

2022-02-15 Thread Chesnay Schepler
1) You either need to modify the log4j-console.properties file, or explicitly set the log4j.configurationFile property to point to your .xml file. 2) Have you made modifications to the distribution (e.g., removing other logging jars from the lib directory)? Are you using application mode, or s
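A minimal sketch of the second option (the system property name is the standard Log4j 2 one; the file path is a placeholder):

```
# flink-conf.yaml: pass the Log4j 2 config location to every Flink JVM
env.java.opts: -Dlog4j.configurationFile=/path/to/log4j2.xml
```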

Re: TM OOMKilled

2022-02-15 Thread Alexey Trenikhun
Hi Xintong, I've checked - `state.backend.rocksdb.memory.managed` is not explicitly configured, so as you wrote it should be true by default. Regarding task off-heap, I believe KafkaConsumer needed off-heap memory some time ago From: Xintong Song Sent: Monday,

Log4j2 configuration

2022-02-15 Thread jonas eyob
Hey, We are deploying our Flink Cluster on standalone Kubernetes with a long-running job written in Scala. We recently upgraded our Flink cluster from 1.12 to 1.14.3, after which we started seeing a few problems related to logging, which I have been struggling to fix for the past few days. Relate

Re: Performance Issues in Source Operator while migrating to Flink-1.14 from 1.9

2022-02-15 Thread Martijn Visser
Hi Arujit, I'm also looping in some contributors from the connector and runtime perspective in this thread. Did you also test the upgrade first by only upgrading to Flink 1.14 and keeping the FlinkKafkaConsumer? That would offer a better way to determine if a regression is caused by the upgrade of

Re: method select(org.apache.flink.table.api.ApiExpression cannot find symbol .select($("handlingTime"),

2022-02-15 Thread HG
Will try. The order of clauses is a little bit obscure to me. I would expect the groupBy to come first. In the docs it only does the window and select. Indeed I need to groupBy. Will share the complete final class when I have finished this project so that others can benefit. On Tue, Feb 15, 2022,

Re: method select(org.apache.flink.table.api.ApiExpression cannot find symbol .select($("handlingTime"),

2022-02-15 Thread Chesnay Schepler
Aren't you missing a groupBy() between window() and select()? On 15/02/2022 15:45, HG wrote: Hi all, When I execute the code : Table tupled3DsTable = tableEnv.fromDataStream(tuple3ds, Schema.newBuilder() .column("f0","TIMESTAMP_LTZ(3)")// Only TIMESTAMP_LTZ(0) to TIMESTAMP_

method select(org.apache.flink.table.api.ApiExpression cannot find symbol .select($("handlingTime"),

2022-02-15 Thread HG
Hi all, When I execute the code : Table tupled3DsTable = tableEnv.fromDataStream(tuple3ds, Schema.newBuilder() .column("f0","TIMESTAMP_LTZ(3)") // Only TIMESTAMP_LTZ(0) to TIMESTAMP_LTZ(3) allowed .column("f1","STRING") .column("f2","STR

Re: Unit test harness for Sources

2022-02-15 Thread James Sandys-Lumsdaine
Thanks for the reply. If I upgrade my legacy Sources to use the new split Sources, is there a better unit test harness for that? Thanks, James. Sent from my iPhone On 15 Feb 2022, at 13:24, Chesnay Schepler wrote: I don't think there is anything of the sort for the legacy sources. I would

Re: Unit test harness for Sources

2022-02-15 Thread Chesnay Schepler
I don't think there is anything of the sort for the legacy sources. I would suggest following the example at https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/testing/#testing-flink-jobs and using a job that only contains the source (+ something to either extract th
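Following the linked testing guide, such a source-only job can be sketched like this. `MySource` is a stand-in for the legacy source under test, and the test assertions would go where the print is:

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.util.CloseableIterator;

public class SourceOnlyJobTest {

    // Placeholder for the legacy source under test (hypothetical)
    static class MySource implements SourceFunction<String> {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<String> ctx) {
            for (String s : new String[] {"a", "b", "c"}) {
                if (!running) break;
                ctx.collect(s);
            }
        }

        @Override
        public void cancel() { running = false; }
    }

    static List<String> runJob() throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);
        DataStream<String> stream = env.addSource(new MySource());
        List<String> results = new ArrayList<>();
        // executeAndCollect() runs the pipeline and hands back the emitted elements
        try (CloseableIterator<String> it = stream.executeAndCollect()) {
            it.forEachRemaining(results::add);
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runJob());
    }
}
```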

Re: Save app-global cache used by RichAsyncFunction to Flink State?

2022-02-15 Thread Chesnay Schepler
I'm not sure if this would work, but you could try implementing the CheckpointedFunction interface and getting access to state that way. On 14/02/2022 16:25, Clayton Wohl wrote: Is there any way to save a custom application-global cache into Flink state so that it is used with checkpoints + sav

Performance Issues in Source Operator while migrating to Flink-1.14 from 1.9

2022-02-15 Thread Arujit Pradhan
Hey team, We are migrating our Flink codebase and a bunch of jobs from Flink-1.9 to Flink-1.14. To ensure uniform performance, we ran a bunch of jobs for a week in both 1.9 and 1.14 simultaneously with the same resources and configurations and monitored them. Though most of the jobs are runn

Re: How to cogroup multiple streams?

2022-02-15 Thread Chesnay Schepler
You could first transform each stream to a common format (in the worst case, an ugly Either-like capturing all possible types), union those streams, and then do a keyBy + window function. This is how coGroup is implemented internally. On 14/02/2022 16:08, Will Lauer wrote: OK, here's what I ho
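The pattern described above, as a minimal sketch. The common (key, payload) shape, the key assignment, and the count window are illustrative assumptions; a real job would pick its own common type and window:

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class UnionCoGroupSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Two streams of different types, mapped to one common (key, payload) shape
        DataStream<Tuple2<String, String>> ints = env.fromElements(1, 2)
                .map(i -> Tuple2.of("k" + i, "int:" + i))
                .returns(Types.TUPLE(Types.STRING, Types.STRING));
        DataStream<Tuple2<String, String>> strs = env.fromElements("x", "y")
                .map(s -> Tuple2.of("k1", "str:" + s))
                .returns(Types.TUPLE(Types.STRING, Types.STRING));

        // union + keyBy + window stands in for a coGroup across N streams
        ints.union(strs)
            .keyBy(t -> t.f0)
            .countWindow(2)
            .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + "|" + b.f1))
            .print();

        env.execute("union-cogroup-sketch");
    }
}
```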

Re: Flink 1.12.x DataSet --> Flink 1.14.x DataStream

2022-02-15 Thread Zhipeng Zhang
Hi Saravanan, One solution could be using a streamOperator to implement `BoundedOneInput` interface. An example code could be found here [1]. [1] https://github.com/apache/flink-ml/blob/56b441d85c3356c0ffedeef9c27969aee5b3ecfc/flink-ml-core/src/main/java/org/apache/flink/ml/common/datastream/Data
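The operator idea above, as a minimal sketch of a DataSet-style full aggregation on a bounded DataStream. The class name and the Long sum are illustrative, not the linked example:

```java
import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.BoundedOneInput;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

// A one-input operator that implements BoundedOneInput so it can emit a
// final result once the bounded input is exhausted.
public class SumAtEndOperator extends AbstractStreamOperator<Long>
        implements OneInputStreamOperator<Long, Long>, BoundedOneInput {

    private long sum;

    @Override
    public void processElement(StreamRecord<Long> element) throws Exception {
        sum += element.getValue();
    }

    @Override
    public void endInput() throws Exception {
        // called exactly once when the bounded input ends
        output.collect(new StreamRecord<>(sum));
    }
}
```

It would be wired in with something like `stream.transform("sum-at-end", Types.LONG, new SumAtEndOperator())`.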

Re: Removing unused flink-avro causes savepoint to fail loading

2022-02-15 Thread Chesnay Schepler
Indeed, when flink-avro is on the classpath we automatically register 1 serializer with Kryo. There is no switch to ignore this error or to exclude the Avro serializer somehow. As such you'll need to rewrite the savepoint, either with the state-processing-api or by creating a slightly m

Re: "No operators defined in streaming topology" error when Flink app still starts successfully

2022-02-15 Thread Chesnay Schepler
When you call env.execute(), the StreamExecutionEnvironment is reset, clearing all sources/transformations from it. That's why env.getExecutionPlan() complains; there aren't any operations, so a plan cannot be created. You need to create the execution plan before calling execute(). String
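The correct ordering, as a minimal sketch (the job body is illustrative):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PlanBeforeExecute {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(1, 2, 3).map(i -> i * 2).print();

        // Grab the plan first: execute() resets the environment and clears
        // the registered transformations, after which no plan can be built.
        String plan = env.getExecutionPlan(); // JSON description of the job graph
        System.out.println(plan);

        env.execute("plan-before-execute");
    }
}
```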