Flink sink never executes

2020-12-21 Thread Ben Beasley
First off, I want to thank the folks on this email list for their help thus far. I’m facing another strange issue: if I add a window to my stream, the sink no longer executes. However, the sink executes without the windowing. I described my problem on Stack Overflow

Re: No execution.target specified in your configuration file

2020-12-21 Thread Kostas Kloudas
Glad I could help! On Mon, Dec 21, 2020 at 3:42 AM Ben Beasley wrote: > > That worked. Thank you, Kostas. > > > > From: Kostas Kloudas > Date: Sunday, December 20, 2020 at 7:21 AM > To: Ben Beasley > Cc: user@flink.apache.org > Subject: Re: No execution.target specified in your configuration fi

checkpointing seems to be throttled.

2020-12-21 Thread Colletta, Edward
Using session cluster with three taskmanagers, cluster.evenly-spread-out-slots is set to true. 13 jobs running. Average parallelism of each job is 4. Flink version 1.11.2, Java 11. Running on AWS EC2 instances with EFS for high-availability.storageDir. We are seeing very high checkpoint times

Queryable state on task managers that are not running the job

2020-12-21 Thread Martin Boyanov
Hi, I'm running a long-running Flink job in cluster mode and I'm interested in using the queryable state functionality. I have the following problem: when I query the Flink task managers (i.e. the queryable state proxy), it is possible to hit a task manager which doesn't have the requested state, b

Re: Will job manager restarts lead to repeated savepoint restoration?

2020-12-21 Thread Till Rohrmann
What exactly are the problems when the checkpoint recovery does not work? Even if the ZooKeeper connection is temporarily disconnected, which leads to the JobMaster losing leadership and the job being suspended, the next leader should continue where the first job stopped because of the lost Zoo

Re: Will job manager restarts lead to repeated savepoint restoration?

2020-12-21 Thread vishalovercome
Thanks for your reply! What I have seen is that the job terminates when there's intermittent loss of connectivity with ZooKeeper. This is in fact the most common reason why our jobs are terminating at this point. Worse, it's unable to restore from checkpoint during some (not all) of these terminat

Re: Challenges Deploying Flink With Savepoints On Kubernetes

2020-12-21 Thread vishalovercome
Thanks for your reply! What I have seen is that the job terminates when there's intermittent loss of connectivity with ZooKeeper. This is in fact the most common reason why our jobs are terminating at this point. Worse, it's unable to restore from checkpoint during some (not all) of these terminat

Re: checkpointing seems to be throttled.

2020-12-21 Thread Yun Gao
Hi Edward, For the second issue, have you also set the statebackend type? I ask because, except for the default heap statebackend, other statebackends should throw an exception if the state.checkpoint.dir is not set. Since heap statebackend stores all the snapshots in the JM's memory,

Re: [Help Required:]-Unable to submit job from REST API

2020-12-21 Thread Yun Gao
Hi Puneet, From the doc it seems submitting a job via the REST API should send a POST request to /jars/:jarid/run [1]. The response "Not Found" should mean the REST API server does not know the request type. Best, Yun [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops
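
As a rough illustration of the request shape Yun describes, the sketch below builds (but does not send) a POST to /jars/:jarid/run using only the Python standard library. The host, port, jar ID, and entry class are hypothetical placeholders, not values from this thread; the optional JSON fields `entryClass` and `parallelism` are taken from the Flink REST API documentation and should be checked against the version in use.

```python
import json
import urllib.parse
import urllib.request


def build_run_request(base_url, jar_id, entry_class=None, parallelism=None):
    """Build (but do not send) a POST request for /jars/:jarid/run."""
    body = {}
    if entry_class is not None:
        body["entryClass"] = entry_class
    if parallelism is not None:
        body["parallelism"] = parallelism
    # The jar ID is the file name reported after upload; quote it for the path.
    path = "/jars/" + urllib.parse.quote(jar_id, safe="") + "/run"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, port, jar ID, and entry class, for illustration only.
request = build_run_request(
    "http://localhost:8081",
    "example-id_job.jar",
    entry_class="com.example.Job",
    parallelism=2,
)
print(request.get_method(), request.full_url)
```

Once the placeholders are replaced with real values, the request could be sent with `urllib.request.urlopen(request)`. Note that the path contains the `/jars/` prefix and the `/run` suffix around the jar ID.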

[Help Required:]-Unable to submit job from REST API

2020-12-21 Thread Puneet Kinra
Hi all, I am unable to submit a job from the REST API (Flink Monitoring API). *Steps followed:* 1) Load the jar using the load API. 2) Can see the jar in the /tmp/flink-web folder. 3) Try to run the jar using the following. *Request* http://host-ip/45f30ad6-c8fb-4c2c-9fbf-c4f56acdd9d9_stream-processor-jar-wi

checkpoint delay consume message

2020-12-21 Thread nick toker
Hello, We noticed the following behavior: if we enable Flink checkpoints, there is a delay between the time we write a message to the Kafka topic and the time the Flink Kafka connector consumes this message. The delay is closely related to checkpointInterval and/or minPauseBetweenC

RE: checkpointing seems to be throttled.

2020-12-21 Thread Colletta, Edward
Thanks for the quick response. We are using FsStateBackend, and I did see checkpoint files and directories in the EFS mounted directory. We do monitor backpressure through the REST API periodically and we do not see any. From: Yun Gao Sent: Monday, December 21, 2020 10:40 AM To: Colletta, Edward ;

Re: checkpoint delay consume message

2020-12-21 Thread Yun Gao
Hi Nick, Are you using EXACTLY_ONCE semantics? If so, the sink would use transactions, and only commit the transaction on checkpoint completion to ensure end-to-end exactly-once. A detailed description can be found in [1]. Best, Yun [1] https://flink.apache.org/features/2018/03/01/end-t
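
The mechanism Yun describes can be pictured with a toy model: records written by a transactional sink stay in an open transaction and only become visible to a reader that sees committed data (for Kafka, a consumer with `isolation.level=read_committed`) once the checkpoint completes. The class and method names below are hypothetical and are not Flink's API; this is a simplified sketch of the idea, not the connector's implementation.

```python
class TransactionalSinkModel:
    """Toy model: records join an open transaction and become visible on commit."""

    def __init__(self):
        self.open_transaction = []  # written, but not yet committed
        self.committed = []         # what a read-committed consumer can see

    def write(self, record):
        # Writing only appends to the currently open transaction.
        self.open_transaction.append(record)

    def on_checkpoint_complete(self):
        # Commit everything written since the previous checkpoint.
        self.committed.extend(self.open_transaction)
        self.open_transaction = []


sink = TransactionalSinkModel()
sink.write("a")
sink.write("b")
visible_before = list(sink.committed)  # nothing is visible yet
sink.on_checkpoint_complete()
visible_after = list(sink.committed)   # both records are now visible
print(visible_before, visible_after)
```

In this model the visibility delay is bounded by the time between checkpoints, which would be consistent with a delay that tracks the checkpoint interval when the topic is written by an exactly-once producer.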

Re: RE: checkpointing seems to be throttled.

2020-12-21 Thread Yun Gao
Hi Edward, Are you setting the FsStateBackend via code or flink-conf.yaml? If via code, it requires a path parameter, and the path would be the state.checkpoint.dir. If via flink-conf.yaml, I tried on 1.12 by setting state.backend: filesystem in the config file and enabling checkpointing, and it indeed
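
For the flink-conf.yaml case, a small pre-flight check along the following lines can flag a non-default state backend that has no checkpoint directory configured, which is the situation raised earlier in this thread. This is a minimal sketch and not Flink's own validation logic: the parser only handles flat `key: value` lines, the helper names are hypothetical, and it assumes the key is spelled `state.checkpoints.dir` as in Flink's configuration reference (the thread writes it as state.checkpoint.dir) and that `jobmanager` is the default backend name.

```python
def parse_flat_conf(text):
    """Parse flat 'key: value' lines, ignoring blank lines and '#' comments."""
    conf = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


def missing_checkpoint_dir(conf):
    """Return True if a non-default backend is set without a checkpoint directory."""
    backend = conf.get("state.backend", "jobmanager")  # assumed default name
    return backend != "jobmanager" and not conf.get("state.checkpoints.dir")


# Hypothetical configuration text, for illustration only.
example_conf = (
    "state.backend: filesystem\n"
    "# state.checkpoints.dir: file:///mnt/efs/checkpoints\n"
)
conf = parse_flat_conf(example_conf)
print(missing_checkpoint_dir(conf))
```

A check like this says nothing about a backend that is constructed in code, where the path is passed to the backend directly.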

RE: RE: checkpointing seems to be throttled.

2020-12-21 Thread Colletta, Edward
Doh! Yeah, we set the state backend in code and I read the flink-conf.yaml file and use the high-availability storage dir. From: Yun Gao Sent: Monday, December 21, 2020 11:28 AM To: Colletta, Edward ; user@flink.apache.org Subject: Re: RE: checkpointing seems to be throttled. This email is f

Re: checkpoint delay consume message

2020-12-21 Thread nick toker
Hi, I am confused: the delay is in the source when reading messages, not on the sink. nick On Mon, Dec 21, 2020 at 18:12, Yun Gao <yungao...@aliyun.com> wrote: > Hi Nick, > > Are you using EXACTLY_ONCE semantics ? If so the sink would use > transactions, and only commit the transa

NullPointerException while calling TableEnvironment.sqlQuery in Flink 1.12

2020-12-21 Thread Yuval Itzchakov
Hi, While trying to execute a query via TableEnvironment.sqlQuery in Flink 1.12, I receive the following exception: java.lang.NullPointerException :114, RelMetadataQuery (org.apache.calcite.rel.metadata) :76, RelMetadataQuery (org.apache.calcite.rel.metadata) get:39, FlinkRelOptClusterFactory$$an

Re: Will job manager restarts lead to repeated savepoint restoration?

2020-12-21 Thread vishalovercome
I don't know how to reproduce it, but what I've observed are three kinds of termination when connectivity with ZooKeeper is somehow disrupted. I don't think it's an issue with ZooKeeper, as it has supported a much bigger Kafka cluster for a few years. 1. The first kind is exactly this - https://github.

numRecordsOutPerSecond metric and side outputs

2020-12-21 Thread Alexey Trenikhun
Hello, Does the numRecordsOutPerSecond metric take into account the number of records sent to side outputs, or does it provide the rate only for the main output? Thanks, Alexey

Re: NullPointerException while calling TableEnvironment.sqlQuery in Flink 1.12

2020-12-21 Thread Danny Chan
Hi Yuval Itzchakov ~ The thread you pasted has a different stack trace from your case. In the pasted thread, the JaninoRelMetadataProvider was missing because we only set it once in a thread-local variable; when the RelMetadataQuery was used in a different working thread, the JaninoRelMetadataProvi

Re: a question about KubernetesConfigOptions

2020-12-21 Thread Yang Wang
Hi Debasish Ghosh, Thanks for your attention to the native K8s integration of Flink. 1. Volumes and volume mounts are not supported now, and we are trying to get this done via a pod template. Refer here[1] for more information. 2. Currently, on different deployments, Flink has different CPU config

Re: Re: checkpoint delay consume message

2020-12-21 Thread Yun Gao
Hi nick, Sorry, I initially thought that the data was also written into Kafka with Flink. So it can be ensured that there is no delay on the write side, right? Does the delay on the read side persist? Best, Yun --Original Mail -- Sender:nick toker