Re: Handle idle kafka source in Flink 1.9

2020-07-22 Thread Niels Basjes
Have a look at this presentation I gave a few weeks ago. https://youtu.be/bQmz7JOmE_4 Niels Basjes On Wed, 22 Jul 2020, 08:51 bat man, wrote: > Hi Team, > > Can someone share their experiences handling this. > > Thanks. > > On Tue, Jul 21, 2020 at 11:30 AM bat man wrote: > >> Hello, >> >> I ha

Re: Flink failed to resume from checkpoint stored on S3

2020-07-22 Thread Congxian Qiu
Hi Xiaolong From the log, seems there is no `_metadata` file in the checkpoint directory s3:///flink/checkpoint_dir/65786c3307a10e79a52b4de478cfe996/chk-7853. Do you configurate the retain checkpoint configuration[1] ever? If we do not configuration it, the checkpoint will be deleted if job

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Congxian Qiu
Thanks Dian for the great work and thanks to everyone who makes this release possible! Best, Congxian Rui Li 于2020年7月23日周四 上午10:48写道: > Thanks Dian for the great work! > > On Thu, Jul 23, 2020 at 10:22 AM Jingsong Li > wrote: > > > Thanks for being the release manager for the 1.11.1 release,

Re: Rolling update of flink cluster in kubernetes

2020-07-22 Thread Yang Wang
Hi Suraj, AFAIK, it is not a good practice to rolling update the JobManager and TaskManagers. Since every restarting of Pod will cause a failover of Flink job. Instead, i prefer to enable the HA configuration(e.g. zookeeper), then you could completely delete the current deployment and start a new

Re: Docker Taskmanager unable to connect to Flink JpbManager...Connection RefusedHi,

2020-07-22 Thread Yang Wang
Hi Avijit, I think you need to create a network via "docker network create flink-network". And then use "docker run ... --name=jobmanager --network flink-network" to set the hostname. Also "jobmanager.rpc.address" need to be set the jobmanager. Refer the doc[1] for more information. If you really

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Rui Li
Thanks Dian for the great work! On Thu, Jul 23, 2020 at 10:22 AM Jingsong Li wrote: > Thanks for being the release manager for the 1.11.1 release, Dian. > > Best, > Jingsong > > On Thu, Jul 23, 2020 at 10:12 AM Zhijiang > wrote: > >> Thanks for being the release manager and the efficient work,

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Jingsong Li
Thanks for being the release manager for the 1.11.1 release, Dian. Best, Jingsong On Thu, Jul 23, 2020 at 10:12 AM Zhijiang wrote: > Thanks for being the release manager and the efficient work, Dian! > > Best, > Zhijiang > > -- > F

Re: Flink app cannot restart

2020-07-22 Thread Yang Wang
Could you check for that whether the JobManager is also running on the lost Yarn NodeManager? If it is the case, you need to configure "yarn.application-attempts" to a value bigger than 1. BTW, the logs you provided are not Yarn NodeManager logs. And if you could provide the full jobmanager log,

Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

2020-07-22 Thread Xintong Song
Hi Peter, Thanks for reporting this issue. >From the exception stack, it seems there's indeed a problem. However, I'm not able to reproduce this issue on my machine, and I guess that's why this is not discovered before the release. Could you help share some more details (and maybe screenshots) on

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Zhijiang
Thanks for being the release manager and the efficient work, Dian! Best, Zhijiang -- From:Konstantin Knauf Send Time:2020年7月22日(星期三) 19:55 To:Till Rohrmann Cc:dev ; Yangze Guo ; Dian Fu ; user ; user-zh Subject:Re: [ANNOUNCE] A

Flink failed to resume from checkpoint stored on S3

2020-07-22 Thread Xiaolong Wang
Deare community, One of my Flink job failed yesterday, and when I tried to resume from the latest checkpoint, following exceptions happen: ``` Log Type: jobmanager.err Log Upload Time: Wed Jul 22 09:04:24 + 2020 Log Length: 506 SLF4J: Class path contains multiple SLF4J bindings. SLF4J:

Re: MaxConnections understanding on FlinkKinesisProducer via KPL

2020-07-22 Thread Vijay Balakrishnan
Hi Gordon, Thx for your reply. FlinkKinesisProducer default is ThreadPool which is what I am using. So, does that mean only 10 threads are making calls to KDS by default ?? I see from the number of records coming to the KDS that I need only 1-2 shards. So, the bottleneck is on the KPL side. Does th

NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

2020-07-22 Thread Peter Westermann
I just started testing Flink 1.11.1 and noticed that the Task Managers section in the UI doesn’t load. The exception in the log is: j.i.NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo \tat j.i.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) \tat

Rolling update of flink cluster in kubernetes

2020-07-22 Thread Suraj Puvvada
Hello Wanted to understand the best practices around running Flink in Kubernetes especially from a continuous deployment perspective. Is it possible to do a rolling update ? Thanks Suraj

Re: Recommended pattern for implementing a DLQ with Flink+Kafka

2020-07-22 Thread Tom Fennelly
Thanks Chen. I'm thinking about errors that occur while processing a record/message that shouldn't be retried until after some "action" has been taken Vs flooding the system with pointless retries e.g. - A side output step might involve an API call to an external system and that system is d

Re: Docker Taskmanager unable to connect to Flink JpbManager...Connection RefusedHi,

2020-07-22 Thread Avijit Saha
Hi, I have built a docker image containing both Flink 1.11 and the job jar as per instructions at: https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/docker.html The jobanager starts up fine as follows: - FLINK_PROPERTIES="jobmanager.rpc.address: 127.0.0.1" - docker run --env

RE: Recommended pattern for implementing a DLQ with Flink+Kafka

2020-07-22 Thread Chen Qin
Could you more specific on what “failed message” means here?In general side output can do something like were  def process(ele) {   try{    biz} catch {   Sideout( ele + exception context)}}  process(func).sideoutput(tag).addSink(kafkasink) Thanks,Chen   From: Eleanore JinSent: Wednesday, July

Re: Recommended pattern for implementing a DLQ with Flink+Kafka

2020-07-22 Thread Eleanore Jin
+1 we have a similar use case for message schema validation. Eleanore On Wed, Jul 22, 2020 at 4:12 AM Tom Fennelly wrote: > Hi. > > I've been searching blogs etc trying to see if there are > established patterns/mechanisms for reprocessing of failed messages via > something like a DLQ. I've rea

Is it possible to do state migration with checkpoints?

2020-07-22 Thread Sivaprasanna
Hi, We are trying out state schema migration for one of our stateful pipelines. We use few Avro type states. Changes made to the job: 1. Updated the schema for one of the states (added a new 'boolean' field with default value). 2. Modified the code by removing a couple of ValueStates. To

Re: Kafka Consumer consuming rate suddenly dropped

2020-07-22 Thread Jake
Hi Mu Kong Yes, you need check your kafka cluser server log, network traffic, disk latency, cpu load. Jake > On Jul 22, 2020, at 7:34 PM, Till Rohrmann wrote: > > Hi Mu Kong, > > I think Jake was asking for the logs of your Kafka cluster and not the Flink > TM logs. > > Cheers, > Till >

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Konstantin Knauf
Thank you for managing the quick follow up release. I think this was very important for Table & SQL users. On Wed, Jul 22, 2020 at 1:45 PM Till Rohrmann wrote: > Thanks for being the release manager for the 1.11.1 release, Dian. Thanks > a lot to everyone who contributed to this release. > > Che

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Till Rohrmann
Thanks for being the release manager for the 1.11.1 release, Dian. Thanks a lot to everyone who contributed to this release. Cheers, Till On Wed, Jul 22, 2020 at 11:38 AM Hequn Cheng wrote: > Thanks Dian for the great work and thanks to everyone who makes this > release possible! > > Best, Hequ

Re: Kafka Consumer consuming rate suddenly dropped

2020-07-22 Thread Till Rohrmann
Hi Mu Kong, I think Jake was asking for the logs of your Kafka cluster and not the Flink TM logs. Cheers, Till On Wed, Jul 22, 2020 at 12:47 PM Mu Kong wrote: > Hi, Jake, > > Thanks for offering help. > I didn't find anything related to kafka in my tm log. > Is there a way to enable the loggin

Recommended pattern for implementing a DLQ with Flink+Kafka

2020-07-22 Thread Tom Fennelly
Hi. I've been searching blogs etc trying to see if there are established patterns/mechanisms for reprocessing of failed messages via something like a DLQ. I've read about using checkpointing and restarting tasks (not what we want because we want to keep processing forward) and then also how some u

Re: Kafka Consumer consuming rate suddenly dropped

2020-07-22 Thread Mu Kong
Hi, Jake, Thanks for offering help. I didn't find anything related to kafka in my tm log. Is there a way to enable the logging, or am I just looking into the wrong place? Thanks in advance. Best regards, Mu

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Hequn Cheng
Thanks Dian for the great work and thanks to everyone who makes this release possible! Best, Hequn On Wed, Jul 22, 2020 at 4:40 PM Jark Wu wrote: > Congratulations! Thanks Dian for the great work and to be the release > manager! > > Best, > Jark > > On Wed, 22 Jul 2020 at 15:45, Yangze Guo wro

Re: Key group is not in KeyGroupRange

2020-07-22 Thread Ori Popowski
The problem was caused by by concurrent access to the ValueState by another thread. Thanks to Yun Tang for pointing this out. Discussion is in FLINK-18637 On Tue, Jul 21, 2020 at

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Jark Wu
Congratulations! Thanks Dian for the great work and to be the release manager! Best, Jark On Wed, 22 Jul 2020 at 15:45, Yangze Guo wrote: > Congrats! > > Thanks Dian Fu for being release manager, and everyone involved! > > Best, > Yangze Guo > > On Wed, Jul 22, 2020 at 3:14 PM Wei Zhong wrote:

Flink app cannot restart

2020-07-22 Thread Rainie Li
Hi Flink help, I am new to Flink. I am investigating one flink app that cannot restart when we lose yarn node manager (tc.yarn.rm.cluster.NumActiveNMs=0), while other flink apps can restart automatically. *Here is job's restartPolicy setting:* *env.setRestartStrategy(RestartStrategies.fixedDelay

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Yangze Guo
Congrats! Thanks Dian Fu for being release manager, and everyone involved! Best, Yangze Guo On Wed, Jul 22, 2020 at 3:14 PM Wei Zhong wrote: > > Congratulations! Thanks Dian for the great work! > > Best, > Wei > > > 在 2020年7月22日,15:09,Leonard Xu 写道: > > > > Congratulations! > > > > Thanks Dian

Changing watermark in the middle of a flow

2020-07-22 Thread Lorenzo Nicora
Hi I have a linear streaming flow with a single source and multiple sinks to publish intermediate results. The time characteristic is Event Time and I am adding one AssignerWithPeriodicWatermarks immediately after the source. I need to add a different assigner, in the middle of the flow, to change

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Wei Zhong
Congratulations! Thanks Dian for the great work! Best, Wei > 在 2020年7月22日,15:09,Leonard Xu 写道: > > Congratulations! > > Thanks Dian Fu for the great work as release manager, and thanks everyone > involved! > > Best > Leonard Xu > >> 在 2020年7月22日,14:52,Dian Fu 写道: >> >> The Apache Flink co

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Leonard Xu
Congratulations! Thanks Dian Fu for the great work as release manager, and thanks everyone involved! Best Leonard Xu > 在 2020年7月22日,14:52,Dian Fu 写道: > > The Apache Flink community is very happy to announce the release of Apache > Flink 1.11.1, which is the first bugfix release for the Apach