Need help debugging back pressure job

2017-05-22 Thread Fritz Budiyanto
Hi All, Any tips on debugging back pressure ? I have a workload where it get stuck after it ran for a couple of hours. I assume the cause of the back pressure is the block next to the one showing as having the back pressure, is this right ? Any idea on how to get the backtrace ? (I’m using stan

Re: Excessive stdout is causing java heap out of mem

2017-05-22 Thread Fritz Budiyanto
Hi Robert, Thanks Robert, I’ll start using the logger. I didn’t pay attention whether the error occur when I accessed the log from job manager. I will do that in my next test. Anyone has any suggestion on how to debug out of memory exception on flink jm/tm ? — Fritz > On May 22, 2017, at 1

Re: trying to externalize checkpoint to s3

2017-05-22 Thread Ted Yu
Adding back user@ Please check the hadoop-common jar in the classpath. On Mon, May 22, 2017 at 7:12 PM, Sathi Chowdhury < sathi.chowdh...@elliemae.com> wrote: > Tried it , > > It does not fail like before but a new error popped up..looks like a jar > problem(clash ) to me > > thanks > > java.lan

Re: trying to externalize checkpoint to s3

2017-05-22 Thread SHI Xiaogang
Hi Sathi, According to the format specification of URI, "abc-checkpoint" is the host name in the given uri and the path is null. Therefore, FsStateBackend are complaining about the usage of the root directory. Maybe "s3:///abc-checkpoint" ("///" instead of "//") is the uri that you want to use. I

trying to externalize checkpoint to s3

2017-05-22 Thread Sathi Chowdhury
We are running flink 1.2 in pre production I am trying to test checkpoint stored in external location in s3 I have set these below in flink-conf.yaml state.backend: filesystem state.checkpoints.dir: s3://abc-checkpoint state.backend.fs.checkpointdir: s3://abc-checkpoint I get this failure in jo

Re: ERROR while creating save points..

2017-05-22 Thread Sathi Chowdhury
I was able to bypass that one ..by running it from bin/flink … Now encountering by: java.lang.NullPointerException: Checkpoint properties say that the checkpoint should have been persisted, but missing external path. From: Sathi Chowdhury Date: Monday, May 22, 2017 at 12:55 PM To: "user@flink.ap

Re: yarnship option

2017-05-22 Thread Mikhail Pryakhin
Hi Robert!Thanks a lot for your reply!>Can you double check if the job-libs/flink-connector-kafka-0.9_2.11-1.2.1.jar contains org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09 ?The jar does contain the class org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09.class (The

Re: State in Custom Tumble Window Class

2017-05-22 Thread rhashmi
Could you elaborate this more? If i assume if i set window time to max .. does it mean my window will stay for infinite time framework, Wouldn't this may hit memory overflow with time? -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/State-i

ERROR while creating save points..

2017-05-22 Thread Sathi Chowdhury
Hi Flink Dev, I am running flink on yarn from EMR and I was running this command to test an external savepoint flink savepoint 8c4c885c5899544de556c5caa984502d /mnt The program finished with the following exception: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not

Async IO Question

2017-05-22 Thread Frank Xue
Hi, I have a question related to async io for Flink. I found that when running unordered (AsyncDataStream.unorderedWait) failures within each individual asyncInvoke is added back to be retried, but when I run it ordered (AsyncDataStream.orderedWait) and an exception is thrown within asyncInvoke, i

Re: yarnship option

2017-05-22 Thread Robert Metzger
Hi, this issue is unexpected :) Can you double check if the job-libs/flink-connector-kafka-0.9_2.11-1.2.1.jar contains org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09 ? Also, were there any previous Kafka09 related exceptions in the log?? >From this SO answer, it seems that this

Re: Job submission: Fail using command line. Success using web (flink1.2.0)

2017-05-22 Thread Robert Metzger
Hi Rami, I think the problem is that when submitting your job through the web interface, Akka will not use remoting (= it will not send your message to a remote machine). When you submit your message from the client, it'll go through akka's network stack (=remoting). Akka rejects messages above a

Re: Excessive stdout is causing java heap out of mem

2017-05-22 Thread Robert Metzger
Hi Fritz, The TaskManagers are not buffering all stdout for the webinterface (at least I'm not aware of that). Did the error occur when accessing the log from the JobManager? Flinks web front end lazily loads the logs from the taskmanagers. The suggested method for logging is to use slf4j for log

yarnship option

2017-05-22 Thread Mikhail Pryakhin
Hi all! I'm playing with flink streaming job on yarn cluster. The job consumes events from kafka and prints them to the standard out. The job uses flink-connector-kafka-0.9_2.11-1.2.1.jar library that is passed via the --yarnship option. Here is the way I run the job: export HADOOP_USER_NAME=hd

Re: Flink metrics related problems/questions

2017-05-22 Thread Aljoscha Krettek
Ah ok, the onTimer() and processElement() methods are all protected by synchronized blocks on the same lock. So that shouldn’t be a problem. > On 22. May 2017, at 15:08, Chesnay Schepler wrote: > > Yes, that could cause the observed issue. > > The default implementations are not thread-safe; i

Re: Flink metrics related problems/questions

2017-05-22 Thread Chesnay Schepler
Yes, that could cause the observed issue. The default implementations are not thread-safe; if you do concurrent writes they may be lost/overwritten. You will have to either guard accesses to that metric with a synchronized block or implement your own thread-safe counter. On 22.05.2017 14:17,

Re: Checkpoint ?

2017-05-22 Thread Aljoscha Krettek
Hi Jim, What are your checkpointing settings? Are you checkpointing to a distributed file system, such as HDFS or S3 or the local file system. The latter should not be used in a production setting and I would not expect this to work properly. (Except if the local filesystem is actually a networ

Re: Implementing Flink Jobs :: Java-API or Scala-API

2017-05-22 Thread Aljoscha Krettek
Hi, New features should be available on both the Java and Scala API at the same time so you can pick whatever suits you best. If you ever find something that doesn’t work in one API but does work in the other, please file and issue for that. Best, Aljoscha > On 21. May 2017, at 17:11, Mikhail

Re: Need help on Streaming API | Flink | GlobalWindow and Customized Trigger

2017-05-22 Thread Aljoscha Krettek
Hi, If your could give us a look at your custom Trigger we might be able to figure out what’s going on. Best, Aljoscha > On 22. May 2017, at 09:06, Samim Ahmed wrote: > > Hello All, > > Hope you are doing well.. > > Myself Samim and I am working of POC(proof of concept) for a project. In th

Re: Job submission: Fail using command line. Success using web (flink1.2.0)

2017-05-22 Thread Rami Al-Isawi
Hi Robert, Yes there is an OversizedPayloadException in the job manager log: --- 2017-05-22 15:39:18,942 INFO org.apache.flink.runtime.client.JobSubmissionClientActor - Upload jar files to job manager akka.tcp://flink@localhost:6123/user/jobmanager. 2017-05-22 15:39:18,957 INFO

Re: State in Custom Tumble Window Class

2017-05-22 Thread Aljoscha Krettek
Hi, Why don’t you set the allowed lateness to Long.MAX_VALUE? This way no data will ever be considered late. If you make the trigger via PurgingTrigger.of(EventTimeTrigger.of(…)). You ensure that window state is not kept after a window fires. Best, Aljoscha > On 17. May 2017, at 13:39, rizhas

Re: Flink metrics related problems/questions

2017-05-22 Thread Aljoscha Krettek
@Chesnay With timers it will happen that onTimer() is called from a different Thread than the Tread that is calling processElement(). If Metrics updates happen in both, would that be a problem? > On 19. May 2017, at 11:57, Chesnay Schepler wrote: > > 2. isn't quite accurate actually; metrics o

Re: ConnectedStream keyby issues

2017-05-22 Thread Aljoscha Krettek
Hi, The State will never be automatically GC’ed. You have to do it in the onTimer() callback, as mentioned earlier. Best, Aljoscha > On 19. May 2017, at 10:39, gaurav wrote: > > Hello > > I am little confused on when the state will be gc. For example, > > Example 1: > > Class abc extends R

Need help on Streaming API | Flink | GlobalWindow and Customized Trigger

2017-05-22 Thread Samim Ahmed
Hello All, Hope you are doing well.. Myself Samim and I am working of POC(proof of concept) for a project. In this project we are using Apache Flink to process the stream data and find the required pattern and finally dump those patterns in DB. So to implement this we have used the global w