Re: flink 1.2 job fails while trying to use -p option

2017-05-23 Thread Tzu-Li (Gordon) Tai
Hi Sathi, You seem to have built the Kinesis connector on Flink 1.3, but using it against Flink 1.2. Please make sure that the library versions and the Flink version match. As for the case you’ve bumped in to: it just happens that with parallelism 1, the `markAsTemporarilyIdle` method is not c

Tumbling window expects a time attribute for grouping in a stream environment

2017-05-23 Thread enrico canzonieri
Hi, I'm trying to window and groupBy a stream using the table api, but I get ValidationException in the windowing function. Here is the relevant code: tableEnv.registerTableSource(schema.getName, src) val table = tableEnv.scan(schema.getName) val t = table.window(Tumble over 1.minutes on 'time as

flink 1.2 job fails while trying to use -p option

2017-05-23 Thread Sathi Chowdhury
Need quick help on this Trying to run /bin/flink run -p 6 –c CLASSNAME - /mnt/flink/jarname.jar everything works without –p option ,but runs with parallelization 1, which is what I am trying to get past. java.lang.NoSuchMethodError: org.apache.flink.streaming.api.functions.source.SourceFunctio

Re: trying to externalize checkpoint to s3

2017-05-23 Thread Sathi Chowdhury
Hi Till, thanks for your reply.I have to try out my fatjar not including Hadoop classes as well. From: Till Rohrmann Date: Tuesday, May 23, 2017 at 7:12 AM To: Ted Yu Cc: Sathi Chowdhury , user Subject: Re: trying to externalize checkpoint to s3 Hi Sathi, the last error indicates that you ar

Re: High Availability on Yarn

2017-05-23 Thread Jain, Ankit
Following up on this. From: "Jain, Ankit" Date: Tuesday, May 16, 2017 at 12:14 AM To: Stephan Ewen , "user@flink.apache.org" Subject: Re: High Availability on Yarn Bringing it back to list’s focus. From: "Jain, Ankit" Date: Thursday, May 11, 2017 at 1:19 PM To: Stephan Ewen , "user@flink.apa

Re: large sliding window perf question

2017-05-23 Thread Stefan Richter
Hi, Which state backend and Flink version are you using? There was a problem with large merging states on RocksDB, caused by some inefficiencies in the merge operator of RocksDB. We provide a custom patch for this with all newer versions of Flink. Best, Stefan > Am 23.05.2017 um 21:24 schrieb

large sliding window perf question

2017-05-23 Thread Chen Qin
Hi there, I have seen some weird perf issue while running event time based job with large sliding window (24 hours offset every 10s) pipeline looks simple, tail kafka topic and assign timestamp and watermark, forward to large sliding window (30days) and fire every 10 seconds and print out. what

Re: How does Job manager schedule jobs to task managers

2017-05-23 Thread Till Rohrmann
Hi Sendoh, Flink does not necessarily spread the tasks across all available TMs. It depends a little bit on the topology of the job, because in general it is beneficial to deploy downstream tasks to the same machines where their input tasks run. I described the process in more detail here [1]. Thi

Re: Async IO Question

2017-05-23 Thread Till Rohrmann
Hi Frank, yes it could be related to the bug we have fixed in 1.3. Could you try it out with Flink 1.3 to see if it fixes your problem? If not, then I would like to take a look at your code to exactly see what is happening there. Cheers, Till On Tue, May 23, 2017 at 4:28 PM, Frank Xue wrote: >

How does Job manager schedule jobs to task managers

2017-05-23 Thread Sendoh
Hi Flink users, I saw our Flink cluster doesn't dispatch jobs equally. So some nodes are busy and have few task slots left, and some nodes are not busy and have many task slots. We don't have yarn setup. Would that be the reason? Or does Job manager dispatch job first to some nodes are not busy

Re: Async IO Question

2017-05-23 Thread Frank Xue
Thanks for the reply Till! I am using Flink 1.2.1 so it could be an issue with the bug you mentioned that looks to be fixed in 1.3. The restart strategy is a fixed delay restart and I have tried various checkpoint and restart intervals and the behavior remains the same. Pretty much inside the async

Re: Need help debugging back pressure job

2017-05-23 Thread Till Rohrmann
Hi Fritz, you're right that back pressure should propagate upstream to the sources. Thus, the cause of the back pressure should be the operator following the last operator with back pressure. In order to debug it you could take a look at the stack trace of the TM. Simply go to the machine on whic

Re: trying to externalize checkpoint to s3

2017-05-23 Thread Till Rohrmann
Hi Sathi, the last error indicates that you are running Flink on a cluster with an incompatible Hadoop version. Please make sure that you use/build Flink with the Hadoop version you have running on your cluster. Especially make sure that the Hadoop version Flink is built for is compatible with the

Re: Async IO Question

2017-05-23 Thread Till Rohrmann
Hi Frank, which version of Flink are you using? There was a problem with correctly recognizing failed asynchronous operations, see FLINK-6435 [1]. In general, if an exception occurs within AsyncFunction#asyncInvoke, then the job should fail. Depending on which restart strategy you have chosen, th

Re: ERROR while creating save points..

2017-05-23 Thread Till Rohrmann
Hi Sathi, this looks not right on the first glance. Could you please share the exact command you've used to trigger a savepoint and also the full stack trace of the exception? Cheers, Till On Mon, May 22, 2017 at 10:55 PM, Sathi Chowdhury < sathi.chowdh...@elliemae.com> wrote: > I was able to b

Re: yarnship option

2017-05-23 Thread Mikhail Pryakhin
Hi Robert, Thanks for your advice but this option doesn’t work for us because our job depends on third party jars with transitive dependencies to the same jar but of different versions and it hardly possible to say which version will participate in the fat jar.. Anyway thanks a lot for your he

Re: yarnship option

2017-05-23 Thread Robert Metzger
Hi Mike, I would recommend you to build a "fat jar" containing your application code and all required dependencies. On Tue, May 23, 2017 at 10:33 AM, Mikhail Pryakhin wrote: > Hi Robert! > > You were right.. it was my fault, I didn’t copy all required dependencies > for the flink-connector-kafk

Re: Excessive stdout is causing java heap out of mem

2017-05-23 Thread Robert Metzger
Hi Fritz, what are you doing on your task manager? Are you keeping many objects on the heap in your application? Are you using any window operators of Flink? If so, which statebackend are you using? On Tue, May 23, 2017 at 7:02 AM, Fritz Budiyanto wrote: > Hi Robert, > > Thanks Robert, I’ll s

Re: state.backend.fs.checkpointdir may support multi path and failover.

2017-05-23 Thread Petr Novotnik
I think for this case you want to set up HA name nodes [1]. > https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html that will allow you to set the checkpoint dir as: `hdfs://mycluster/XXX` P. On 05/23/2017 01:56 PM, yunfan123 wrote: > For now I c

state.backend.fs.checkpointdir may support multi path and failover.

2017-05-23 Thread yunfan123
For now I can only config like: state.backend.fs.checkpointdir: hdfs://namenodehost:port/XXX But the machine running namenode may be down. This config should support multi path in case of some become unavailable. Normally, user should make sure any of them can be used. -- View this message in

Re: yarnship option

2017-05-23 Thread Mikhail Pryakhin
Hi Robert! You were right.. it was my fault, I didn’t copy all required dependencies for the flink-connector-kafka. That made impossible to load the class. Everything works as expected but only if I put required dependencies inside the flink lib folder, but when dependencies are located in a di