Re: Default Flink Metrics Graphite

2020-09-03 Thread Till Rohrmann
Hi Vijay, yes the last value is the timestamp when this value was sent to Graphite. Cheers, Till On Wed, Sep 2, 2020 at 6:39 PM Vijayendra Yadav wrote: > Hi Till, > > *Info below, also I have a question at the end. * > pretty much what was told earlier, for 1.10.0 use: > metrics.reporter.grph

Re: Use of slot sharing groups causing workflow to hang

2020-09-03 Thread Till Rohrmann
Hi Ken, I believe that we don't have a lot if not any explicit logging about the slot sharing group in the code. You can, however, learn indirectly about it by looking at the required number of AllocatedSlots in the SlotPool. Also the number of "multi task slot" which are created should vary becau

Re: Fail to deploy Flink on minikube

2020-09-03 Thread Till Rohrmann
In order to exclude a Minikube problem, you could also try to run Flink on an older Minikube and an older K8s version. Our end-to-end tests use Minikube v1.8.2, for example. Cheers, Till On Thu, Sep 3, 2020 at 8:44 AM Yang Wang wrote: > Sorry i forget that the JobManager is binding its rpc addr

Re: Count of records coming into the Flink App from KDS and at each step through Execution Graph

2020-09-03 Thread aj
Hello Vijay, I have the same use case where I am reading from Kafka and want to report count corresponding to each event every 5 mins. On Prometheus, I want to set an alert if fr any event we do not receive the event like say count is zero. So can you please help me with how you implemented this f

Re: Flink 1.8.3 GC issues

2020-09-03 Thread Piotr Nowojski
Hi, Have you tried using a more recent Flink version? 1.8.x is no longer supported, and latest versions might not have this issue anymore. Secondly, have you tried backtracking those references to the Finalizers? Assuming that Finalizer is indeed the class causing problems. Also it may well be a

Combined streams backpressure

2020-09-03 Thread Adam Venger
Hi. I'm thinking about a solution to a problem I have. I need to create keyed session windows from multiple streams of data. Combining streams is done by watermarks. The problem is, one of the streams can be slower. This opens too many windows that wait for the stream to catch up, which wastes reso

Re: FileSystemHaServices and BlobStore

2020-09-03 Thread Alexey Trenikhun
Hi Yang, Yes, I’ve persisted CompletedCheckpointStore, CheckpointIDCounter and RunningJobsRegistry Thanks, Alexey From: Yang Wang Sent: Wednesday, September 2, 2020 8:21:20 PM To: Alexey Trenikhun Cc: dev ; Flink User Mail List Subject: Re: FileSystemHaService

Re: Combined streams backpressure

2020-09-03 Thread Piotr Nowojski
Hi, This is a known problem. As of recently, there was no way to solve this problem generically, for every source. This is changing now, as one of the motivations behind FLIP-27, was to actually address this issue [1]. Note, as of now, there are no FLIP-27 sources yet in the code base, but for Fli

Re: Job Manager taking long time to upload job graph on remote storage

2020-09-03 Thread Prakhar Mathur
We tried uploading the same blob from Job Manager k8s pod directly to GCS using gsutils and it took 2 seconds. The upload speed was 166.8 MiB/s. Thanks. On Wed, Sep 2, 2020 at 6:14 PM Till Rohrmann wrote: > The logs don't look suspicious. Could you maybe check what the write > bandwidth to your

Re: Job Manager taking long time to upload job graph on remote storage

2020-09-03 Thread Till Rohrmann
Hmm then it probably rules GCS out. What about ZooKeeper? Have you experienced slow response times from your ZooKeeper cluster? Cheers, Till On Thu, Sep 3, 2020 at 6:23 PM Prakhar Mathur wrote: > We tried uploading the same blob from Job Manager k8s pod directly to GCS > using gsutils and it to

Re: Flink 1.8.3 GC issues

2020-09-03 Thread Josson Paul
1) We are in the process of migrating to Flink 1.11. But it is going to take a while before we can make everything work with the latest version. Meanwhile since this is happening in production I am trying to solve this. 2) Finalizae class is pointing to org.apache.flink.streaming.runtime.tasks.Syst

FLINK YARN SHIP from S3 Directory

2020-09-03 Thread Vijayendra Yadav
Hi Team, Is there any feature to be able to ship directory to containers from s3 Directory instead of local. -yt,--yarnship Ship files in the specified directory (t for transfer)

Unit Test for KeyedProcessFunction with out-of-core state

2020-09-03 Thread Alexey Trenikhun
Hello, I want to unit test KeyedProcessFunction which uses with out-of-core state (like rocksdb). Does Flink has mock for rocksdb, which can be used in unit tests ? Thanks, Alexey

Re: Bug: Kafka producer always writes to partition 0, because KafkaSerializationSchemaWrapper does not call open() method of FlinkKafkaPartitioner

2020-09-03 Thread Ken Krugler
Assuming you’re not doing custom partitioning, then another workaround is to pass Optional.empty() for the partitioner, so that it will use the Kafka partitioning vs. a Flink partitioner. Or at least that worked for us, when we encountered this same issue. — Ken > On Sep 3, 2020, at 5:36 AM, D

Re: Flink 1.8.3 GC issues

2020-09-03 Thread Piotr Nowojski
Hi Josson, 2. Are you sure that all/vast majority of those objects are pointing towards SystemProcessingTimeService? And is this really the problem of those objects? Are they taking that much of the memory? 3. It still could be Kafka's problem, as it's likely that between 1.4 and 1.8.x we bumped K

flink on k8s 如果jobmanager 所在pod重启后job失败如何处理

2020-09-03 Thread dty...@163.com
请教一个问题。在使用k8s 部署的flink 集群,如果jobmanger 重启后,1)job所在的jar包会清除,jobmanager 找不到这个job的jar 包,2)正在运行的job也会取消,重启后的jobmanager 如何找到之前运行的job dty...@163.com

Re: flink on k8s 如果jobmanager 所在pod重启后job失败如何处理

2020-09-03 Thread Yun Tang
Hi Please use English to ask questions in user mailing list. I have added this thread to user-zh mailing list, if you would like to reply this thread again, please remove user mailing list in senders. When talking about the question how to handle job manager failure in k8s, you could consider

Re: Job Manager taking long time to upload job graph on remote storage

2020-09-03 Thread Prakhar Mathur
Yes, I will check that, but any pointers on why Flink is taking more time than gsutil upload? On Thu, Sep 3, 2020 at 10:14 PM Till Rohrmann wrote: > Hmm then it probably rules GCS out. What about ZooKeeper? Have you > experienced slow response times from your ZooKeeper cluster? > > Cheers, > Til

State Storage Questions

2020-09-03 Thread Rex Fenley
Hello! I've been digging into State Storage documentation, but it's left me scratching my head with a few questions. Any help will be much appreciated. Qs: 1. Is there a way to use RocksDB state backend for Flink on AWS EMR? Possibly with S3 backed savepoints for recovery (or maybe hdfs for savep

Re: Fail to deploy Flink on minikube

2020-09-03 Thread superainbower
Hi Till & Yang, I can deploy Flink on kubernetes(not minikube), it works well So there are some problem about my minikube but I can’t find and fix it Anyway I can deploy on k8s now Thanks for your help! | | superainbower | | superainbo...@163.com | 签名由网易邮箱大师定制 On 09/3/2020 15:47,Till Rohrmann wro