Adaptive Watermarks Generator

2020-05-08 Thread 杨东晓
Hi , I noticed there is a paper describe about adaptive watermark generator on top of Apache Flink v1.6.2 : https://www.dfki.de/fileadmin/user_upload/import/10260_awad-adaptive-watermarks.pdf This looks like a more precise generator with much less data drop . Does anybody know more detail about th

Re: Cannot start native K8s

2020-05-08 Thread Dongwon Kim
Hi Yang, Oops, I forget to copy /etc/kube/admin.conf to $HOME/.kube/config so that the current user account can access to K8s. Now that I copied it, I found that kubernetes-session.sh is working fine. Thanks very much! Best, Dongwon [flink@DAC-E04-W06 ~]$ kubernetes-session.sh 2020-05-09 12:43:4

Re: Cannot start native K8s

2020-05-08 Thread Yang Wang
Hi Dongwon Kim, Thanks a lot for your information. I will dig into this issue. I think the "UnknownHostException" is caused by incorrectly setting the Kubernetes ApiServer address. Maybe you are using "kubernetes.default.svc". However, it could not be accessed outside of the Kubernetes cluster. Y

Re: Cannot start native K8s

2020-05-08 Thread Dongwon Kim
Hello Yang, I'm using K8s v1.18.2 installed by Kubeadm over a cluster of 5 nodes (not a Minikube). Previously, as you pointed out, openjdk version "1.8.0_252" was installed. I bump up java version to openjdk 11.0.7 but got something different: [flink@DAC-E04-W06 bin]$ ./kubernetes-session.sh 2020

Re: Cannot start native K8s

2020-05-08 Thread Yang Wang
Hi Dongwon Kim, Are you running Flink on a minikube or a real Kubernetes cluster? I just could not reproduce it in a real Kubernetes cluster with java 8u252. For minikube, i get the exception with you. Best, Yang Yang Wang 于2020年5月6日周三 上午9:29写道: > Hi Dongwon Kim, > > I think it is a known iss

Re: No Slots available exception in Apache Flink Job Manager while Scheduling

2020-05-08 Thread Xintong Song
Linking to the jira ticket, for the record. https://issues.apache.org/jira/browse/FLINK-17560 Thank you~ Xintong Song On Sat, May 9, 2020 at 2:14 AM Josson Paul wrote: > Set up > -- > Flink verson 1.8.3 > > Zookeeper HA cluster > > 1 ResourceManager/Dispatcher (Same Node) > 1 TaskManager

Re: Flink on Kubernetes unable to Recover from failure

2020-05-08 Thread Yun Tang
Hi Morgan If "because the number of task slots has been reduced to zero", do you mean the total task slots reduced to 0? And how many registered task managers could you see when this happened (you could click to the "Task Managers" tab to view related information). All containers running do no

Re: Correctly implementing of SourceFunction.run()

2020-05-08 Thread Senthil Kumar
OK, thank you. Much appreciated. Yes, I don’t want the job to fail. The source has very little data that is being pumped into a Broadcast stream. From: Robert Metzger Date: Friday, May 8, 2020 at 9:51 AM To: Jingsong Li Cc: Senthil Kumar , "user@flink.apache.org" Subject: Re: Correctly imple

No Slots available exception in Apache Flink Job Manager while Scheduling

2020-05-08 Thread Josson Paul
Set up -- Flink verson 1.8.3 Zookeeper HA cluster 1 ResourceManager/Dispatcher (Same Node) 1 TaskManager 4 pipelines running with various parallelism's Issue -- Occationally when the Job Manager gets restarted we noticed that all the pipelines are not getting scheduled. The error that i

Re: Correctly implementing of SourceFunction.run()

2020-05-08 Thread Robert Metzger
Hey Kumar, if you are swallowing any and all exceptions, your Flink job will not fail because of issues arising from your custom source. It might make sense to stop the source if you are catching an InterruptedException. Throwing exceptions out of the run method basically signals the Flink framew

Re: MongoDB sink;

2020-05-08 Thread Robert Metzger
I'm also not aware of a MongoDB sink in Flink. The code provided by Jingsong applies to the "SinkFunction" interface of Flink. That's a good starting point to implement a custom Sink. On Wed, May 6, 2020 at 9:46 AM Jingsong Li wrote: > Hi, > > My impression is that MongoDB's API is not complicat

Re: Flink on Kubernetes unable to Recover from failure

2020-05-08 Thread Robert Metzger
Hey Morgan, Is it possible for you to provide us with the full logs of the JobManager and the affected TaskManager? This might give us a hint why the number of task slots is zero. Best, Robert On Tue, May 5, 2020 at 11:41 AM Morgan Geldenhuys < morgan.geldenh...@tu-berlin.de> wrote: > > Commun

Re: flink-s3-fs-hadoop retry configuration

2020-05-08 Thread Robert Metzger
I validated my assumption. Putting s3.connection.maximum: 123456 into the flink-conf.yaml file results in the following DEBUG log output: 2020-05-08 16:20:47,461 DEBUG org.apache.flink.fs.s3hadoop.common.HadoopConfigLoader [] - Adding Flink config entry for s3.connection.maximum as fs.s3a.

Re: flink-s3-fs-hadoop retry configuration

2020-05-08 Thread Robert Metzger
Hey Jeff, Which Flink version are you using? Have you tried configuring the S3 filesystem via Flink's config yaml? Afaik all config parameters prefixed with "s3." are mirrored into the Hadoop file system connector. On Mon, May 4, 2020 at 8:45 PM Jeff Henrikson wrote: > > 2) How can I tell if

Re: What's the best practice to determine whether a job has finished or not?

2020-05-08 Thread Kurt Young
+dev Best, Kurt On Fri, May 8, 2020 at 3:35 PM Caizhi Weng wrote: > Hi Jeff, > > Thanks for the response. However I'm using executeAsync so that I can run > the job asynchronously and get a JobClient to monitor the job. JobListener > only works for synchronous execute method. Is there other w

Re: java.lang.IllegalStateException: The RPC connection is already closed

2020-05-08 Thread Robert Metzger
Are you able to reproduce the issue reliably? If so, it would be nice if you could test if the issue still happens with the 1.10.1 release candidate: https://dist.apache.org/repos/dist/dev/flink/flink-1.10.1-rc3/ On Mon, May 4, 2020 at 6:08 PM Manish G wrote: > https://issues.apache.org/jira/bro

Re: Assertion failed: (last_ref), function ~ColumnFamilySet, file db/column_family.cc, line 1238

2020-05-08 Thread Seth Wiesman
Gordon is correct. Additionally, if you are using flink 1.10 you may be running into a known bug that has been resolved in 1.10.1 which will be released soon. Seth https://issues.apache.org/jira/browse/FLINK-16313 On Fri, May 8, 2020 at 5:19 AM Tzu-Li (Gordon) Tai wrote: > Hi, > > The last ti

Re: What is the RocksDB local directory in flink checkpointing?

2020-05-08 Thread Yun Tang
Hi LakeShen You could refer to [1] and [2] to know the temporary directory in YARN, the related log could be "Setting directories for temporary files to: " or "Overriding Fink's temporary file directories with those specified in the Flink config: " [1] https://github.com/apache/flink/blob/0dd

Re: Window processing in Stateful Functions

2020-05-08 Thread m@xi
Dear Igal,Very insightful answer. Thanks. Igal Shilman wrote > An alternative approach would be to implement a * > thumbling window * > per vertex(a stateful function instance)by sending to itself a delayed > message [2]. When that specific delayedmessage arrives you wouldhave to > purge the olde

Re: Assertion failed: (last_ref), function ~ColumnFamilySet, file db/column_family.cc, line 1238

2020-05-08 Thread Tzu-Li (Gordon) Tai
Hi, The last time I saw this error, was that there was a mismatch in the used flink-state-processor-api version and other core Flink dependencies. Could you confirm that? Also, are you seeing this assertion error consistently, or only occasionally? cc'ing Seth, maybe he has other clues on the cau

Re: Shared Checkpoint Cleanup and S3 Lifecycle Policy

2020-05-08 Thread Congxian Qiu
Hi Currently, it is hard to determine which files can be deleted safely in the shared folder, the ground truth is in the checkpoint metafile. I've created an issue[1] for such a feature [1] https://issues.apache.org/jira/browse/FLINK-17571 Best, Congxian Trystan 于2020年5月8日周五 下午1:05写道: > Aha,

Re: What's the best practice to determine whether a job has finished or not?

2020-05-08 Thread Caizhi Weng
Hi Jeff, Thanks for the response. However I'm using executeAsync so that I can run the job asynchronously and get a JobClient to monitor the job. JobListener only works for synchronous execute method. Is there other way to achieve this? Jeff Zhang 于2020年5月8日周五 下午3:29写道: > I use JobListener#onJo

Re: What's the best practice to determine whether a job has finished or not?

2020-05-08 Thread Jeff Zhang
I use JobListener#onJobExecuted to be notified that the flink job is done. It is pretty reliable for me, the only exception is the client process is down. BTW, the reason you see ApplicationNotFound exception is that yarn app is terminated which means the flink cluster is shutdown. While for stand

Re: Assertion failed: (last_ref), function ~ColumnFamilySet, file db/column_family.cc, line 1238

2020-05-08 Thread luisfaamaral
No one? :) -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/