Re: In 1.2-SNAPSHOT, EventTimeSessionWindows are not firing untill the whole stream is processed

2016-12-05 Thread Aljoscha Krettek
Hi, could you please try adding this custom watermark debugger to see what's going on with the element timestamps and watermarks: public static class WatermarkDebugger extends AbstractStreamOperator implements OneInputStreamOperator { private static final long serialVersionUID = 1L;

Re: separation of JVMs for different applications

2016-12-05 Thread Manu Zhang
Thanks Stephan, They don't use YARN now but I think they will consider it. Do you think it would be beneficial to provide such an option as "separate-jvm" in stand-alone mode for streaming processor and long running services ? Or do you think it would introduce too much complexity ? Manu On Tue

Fwd: Default restart behavior with checkpointing

2016-12-05 Thread Rohit Agarwal
Hi, https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/fault_tolerance.html says: Number of retries: The setNumberOfExecutionRerties() method defines how many times the job is restarted after a failure. When checkpointing is activated, but this value is not explicitly set,

Re: Resource under-utilization when using RocksDb state backend

2016-12-05 Thread Robert Metzger
Another Flink user using RocksDB with large state on SSDs recently posted this video for oprimizing the performance of Rocks on SSDs: https://www.youtube.com/watch?v=pvUqbIeoPzM That could be relevant for you. For how long did you look at iotop. It could be that the IO access happens in bursts, de

RE: Equivalent of Rx combineLatest() on a join?

2016-12-05 Thread denis.dollfus
Actually that doesn't work as expected because emitted values are not purged. I'll experiment with purging triggers and/or evictors, though I have the feeling that Flink was not designed for what we need to do here -- but I'll keep on searching. In the meantime any advice is appreciated. If the

Re: Resource under-utilization when using RocksDb state backend

2016-12-05 Thread Cliff Resnick
Hi Robert, We're following 1.2-SNAPSHOT, using event time. I have tried "iotop" and I see usually less than 1 % IO. The most I've seen was a quick flash here or there of something substantial (e.g. 19%, 52%) then back to nothing. I also assumed we were disk-bound, but to use your metaphor I'm hav

Re: separation of JVMs for different applications

2016-12-05 Thread Stephan Ewen
Hi! Are your customers using YARN? In that case, the default configuration will start a new YARN application per Flink job, no JVMs are shared between jobs. By default, even each slot has its own JVM. Greetings, Stephan PS: I think the "spawning new JVMs" is what Till referred to when saying "sp

Re: JVM Non Heap Memory

2016-12-05 Thread Chesnay Schepler
Hey Daniel, the fix won't make it into 1.1.4 since it is only relevant if you're using Flink Meters together with either the Graphite or Ganglia Reporter. The Meter metric is however not available in 1.1 at all, so it can't be the underlying cause. My fix is only for 1.2; the fixed issue coul

Re: JVM Non Heap Memory

2016-12-05 Thread Ufuk Celebi
Quick question since the Meter issue does _not_ apply to 1.1.3, which Flink metrics are you using? – Ufuk On 5 December 2016 at 16:44:47, Daniel Santos (dsan...@cryptolab.net) wrote: > Hello, > > Thank you all for the kindly reply. > > I've got the general idea. I am using version flink's 1.

Variable Tuple Type

2016-12-05 Thread Max Kießling
Hey, for a project we need to represent data as lists. So each entry in the DataSets basically holds a list of basic data type elements. When processing the data we keep joining lists of the same shape and so the list size grows over time e.g. (a,b,c) x (c,d,e) -> (a,b,c,d,e) Currently our solut

Re: Thread 'SortMerger spilling thread' terminated due to an exception: No space left on device

2016-12-05 Thread Miguel Coimbra
Hello Fabian, Thanks for the attention. Still haven't solved this. I did set up a cron job to clean the Docker images daily - thanks for that hint. As a last resort, I am going to look into a 2 TB NAS to see if this works. What is confusing me is that this happens also for the com-orkut.ungraph.t

Re: JVM Non Heap Memory

2016-12-05 Thread Daniel Santos
Hello, Thank you all for the kindly reply. I've got the general idea. I am using version flink's 1.1.3. So it seems the fix of Meter's won't make it to 1.1.4 ? Best Regards, Daniel Santos On 12/05/2016 01:28 PM, Chesnay Schepler wrote: We don't have to include it in 1.1.4 since Meter's do n

RE: Equivalent of Rx combineLatest() on a join?

2016-12-05 Thread denis.dollfus
Asking the response helped me to find the answer (yes, rubber duck debugging) as it seems that the code below does what I need: s3 = s1.join(s2) .where(new KeySelector1()).equalTo(new KeySelector2()) .window(GlobalWindow.create

Re: separation of JVMs for different applications

2016-12-05 Thread Manu Zhang
> > The pro for the multi-tenant cluster mode is that you can share data > between jobs and you don't have to spin up a new cluster for each job. I don't think we have to spin up a new cluster for each job if every job gets its own JVMs. For examples, Storm will launch a new worker(JVM) for a new

Re: microsecond resolution

2016-12-05 Thread jeff jacobson
Thanks for clearing things up, Stephan and Kostas On Mon, Dec 5, 2016 at 8:08 AM, Kostas Kloudas wrote: > Hi Jeff, > > As Stephan said the interpretation of the timestamps is up to the logic of > your job. > And as for the documentation, thanks for reporting this. > We should update it. > > Chee

Re: JVM Non Heap Memory

2016-12-05 Thread Chesnay Schepler
We don't have to include it in 1.1.4 since Meter's do not exist in 1.1; my bad for tagging it in JIRA for 1.1.4. On 05.12.2016 14:18, Ufuk Celebi wrote: Just to note that the bug mentioned by Chesnay does not invalidate Stefan's comments. ;-) Chesnay's issue is here: https://issues.apache.org

Re: JVM Non Heap Memory

2016-12-05 Thread Ufuk Celebi
Just to note that the bug mentioned by Chesnay does not invalidate Stefan's comments. ;-) Chesnay's issue is here: https://issues.apache.org/jira/browse/FLINK-5261 I added an issue to improve the documentation about cancellation (https://issues.apache.org/jira/browse/FLINK-5260). Which version

Re: microsecond resolution

2016-12-05 Thread Kostas Kloudas
Hi Jeff, As Stephan said the interpretation of the timestamps is up to the logic of your job. And as for the documentation, thanks for reporting this. We should update it. Cheers, Kostas > On Dec 5, 2016, at 1:56 PM, Stephan Ewen wrote: > > @Jeff - good point about the docs. > > I think Kos

Re: JVM Non Heap Memory

2016-12-05 Thread Chesnay Schepler
Hello Daniel, I'm afraid you stumbled upon a bug in Flink. Meters were not properly cleaned up, causing the underlying dropwizard meter update threads to not be shutdown either. I've opened a JIRA and will open a PR soon. Thank your for re

Re: microsecond resolution

2016-12-05 Thread Stephan Ewen
@Jeff - good point about the docs. I think Kostas is right though - the event timestamps are up to the user's interpretation. The built-in window assigners interpret them as "Unix Epoch Millis", but you can define your own window assigners that interpret the timestamps differently. The system int

Re: microsecond resolution

2016-12-05 Thread jeff jacobson
Thanks Kostas. So if we're comfortable treating timestamps as longs (and doing conversions to human readable time at our application level), we can use CEP, ML lib etc. in addition to all basic Flink functions? That's great news? To Matthias's point, *why then does the following not read "**Both t

Re: Flink 1.1.3 OOME Permgen

2016-12-05 Thread Konstantin Knauf
Yep, I would suppose so. You need to have the reference from the AppClassLoader to the UserCodeClassLoader. On 05.12.2016 12:37, Robert Metzger wrote: > I executed this snipped in each Flink job: > > @Override > public void open(Configuration config) { > ObjectMapper somethingWithJackson = new

Re: Flink 1.1.3 OOME Permgen

2016-12-05 Thread Robert Metzger
I executed this snipped in each Flink job: @Override public void open(Configuration config) { ObjectMapper somethingWithJackson = new ObjectMapper(); try { ObjectNode on = somethingWithJackson.readValue("{\"a\": \"b\"}", ObjectNode.class); } catch (IOException e) { throw new RuntimeE

Re: flink-job-in-yarn,has max memory

2016-12-05 Thread Robert Metzger
Hi, The TaskManager reports a total memory usage of 3 GB. That's fine, given that you requested containers of size 4GB. Flink doesn't allocate all the memory assigned to the container to the heap. Are you running a batch or a streaming job? On Tue, Nov 29, 2016 at 12:43 PM, wrote: > Hi, >

Re: Flink 1.1.3 OOME Permgen

2016-12-05 Thread Konstantin Knauf
Hi Robert, you need to actually use Jackson. The problematic field is a cache, which is filled by all classes, which were serialized/deserialized by Jackson. Best, Konstantin On 05.12.2016 11:55, Robert Metzger wrote: > I've submitted Wordcount 410 times to a testing cluster and a streaming > j

Re: CEP issue

2016-12-05 Thread Robert Metzger
Hi Kieran, which statebackend are you using for your CEP job? Using RocksDB as a state backend could potentially fix the issue. What's the number of keys in your stream? On Tue, Nov 29, 2016 at 3:18 PM, kieran . wrote: > Hello, > > I am currently building a multi-tenant monitoring application

Equivalent of Rx combineLatest() on a join?

2016-12-05 Thread denis.dollfus
Hi all, [first email here, I'm new to Flink, Java and Scala, sorry if I missed something obvious] I'm exploring Flink in the context of streaming calculators. Basically, the data flow boils down to multiple data streams with variable update rates (ms, seconds, ..., month) which are joined befo

Re: Dealing with Multiple sinks in Flink

2016-12-05 Thread Robert Metzger
For enabling JMX when starting Flink from your IDE, you need to do the following: Configuration configuration = new Configuration(); configuration.setString("metrics.reporters", "my_jmx_reporter"); configuration.setString("metrics.reporter.my_jmx_reporter.class", "org.apache.flink.metrics.jmx.JMXR

Re: Dealing with Multiple sinks in Flink

2016-12-05 Thread vinay patil
Yes I had configured it as given in the documentation. I can see this line in Job Manager Logs : Started JMX server on port 9020 (but this was on EMR ) How to do this locally ? can we check these metrics while running the pipeline from IDE ? If yes what is teh default JMX port to connect ? or do w

Re: JVM Non Heap Memory

2016-12-05 Thread Stefan Richter
Hi Daniel, the behaviour you observe looks like some threads are not canceled. Thread cancelation in Flink (and Java in general) is always cooperative, where cooperative means that the thread you want to cancel should somehow check cancelation and react to it. Sometimes this also requires some

Re: Dealing with Multiple sinks in Flink

2016-12-05 Thread Robert Metzger
Hi Vinay, the JMX port depends on the port you've configured for the JMX metrics reporter. Did you configure it? Regards, Robert On Fri, Dec 2, 2016 at 11:14 AM, vinay patil wrote: > Hi Robert, > > I had resolved this issue earlier as I had not set the Kafka source > parallelism to number of

Re: Flink 1.1.3 OOME Permgen

2016-12-05 Thread Robert Metzger
I've submitted Wordcount 410 times to a testing cluster and a streaming job 290 times and I could not reproduce the issue with 1.1.3. Also, the heapdump of one of the TaskManagers looked pretty normal. Do you have any ideas how to reproduce the issue? On Fri, Dec 2, 2016 at 3:21 PM, Robert Metzge

Re: In 1.2-SNAPSHOT, EventTimeSessionWindows are not firing untill the whole stream is processed

2016-12-05 Thread Robert Metzger
I'll add Aljoscha and Kostas Kloudas to the conversation. They have the best overview over the changes to the window operator between 1.1. and 1.2. On Mon, Dec 5, 2016 at 11:33 AM, Yassine MARZOUGUI < y.marzou...@mindlytix.com> wrote: > I forgot to mention : the watermark extractor is the one inc

Re: Resource under-utilization when using RocksDb state backend

2016-12-05 Thread Robert Metzger
Hi Cliff, which Flink version are you using? Are you using Eventtime or processing time windows? I suspect that your disks are "burning" (= your job is IO bound). Can you check with a tool like "iotop" how much disk IO Flink is producing? Then, I would set this number in relation with the theoret

Re: JVM Non Heap Memory

2016-12-05 Thread Daniel Santos
Hello, I have done some threads checking and dumps. And I have disabled the checkpointing. Here are my findings. I did a thread dump a few hours after I booted up the whole cluster. (@2/12/2016; 5 TM ; 3GB HEAP each ; 7GB total each as Limit ) The dump shows that most threads are of 3 sour

Re: Flink CEP dynamic patterns

2016-12-05 Thread Abdallah Ghdiri
thank you, i will investigate further On Mon, Dec 5, 2016 at 10:36 AM, Till Rohrmann wrote: > Hi Abdallah, > > I've answered your question on SO. For the sake of completeness here is a > copy: > > At the moment Flink's CEP library does not support dynamic pattern changes > out of the box. Thus,

Re: In 1.2-SNAPSHOT, EventTimeSessionWindows are not firing untill the whole stream is processed

2016-12-05 Thread Yassine MARZOUGUI
I forgot to mention : the watermark extractor is the one included in Flink API. 2016-12-05 11:31 GMT+01:00 Yassine MARZOUGUI : > Hi robert, > > Yes, I am using the same code, just swithcing the version in pom.xml to > 1.2-SNAPSHOT and the cluster binaries to the compiled lastest master (at > the

Re: In 1.2-SNAPSHOT, EventTimeSessionWindows are not firing untill the whole stream is processed

2016-12-05 Thread Yassine MARZOUGUI
Hi robert, Yes, I am using the same code, just swithcing the version in pom.xml to 1.2-SNAPSHOT and the cluster binaries to the compiled lastest master (at the time of the question)). Here is the watermark assignment : .assignTimestampsAndWatermarks(new AscendingTimestampExtractor>() { @Overr

Re: separation of JVMs for different applications

2016-12-05 Thread Till Rohrmann
The pro for the multi-tenant cluster mode is that you can share data between jobs and you don't have to spin up a new cluster for each job. This might be helpful for scenarios where you want to run many short-lived and light-weight jobs. But the important part is that you don't have to use this me

Re: In 1.2-SNAPSHOT, EventTimeSessionWindows are not firing untill the whole stream is processed

2016-12-05 Thread Robert Metzger
Hi Yassine, are you sure your watermark extractor is the same between the two versions. It sounds a bit like the watermarks for the 1.2 code are not generated correctly. Regards, Robert On Sat, Dec 3, 2016 at 9:01 AM, Yassine MARZOUGUI wrote: > Hi all, > > With 1.1-SNAPSHOT, EventTimeSessionWi

Re: microsecond resolution

2016-12-05 Thread Kostas Kloudas
Hi Jeff, Actually in Flink timestamps are simple longs. This means that you can assign anything you want as a timestamp, as long as it fits in a long. Hope this helps and if not, we can discuss to see if we can find a solution that fits your needs together. Cheers, Kostas > On Dec 4, 2016, a

Re: Flink CEP dynamic patterns

2016-12-05 Thread Till Rohrmann
Hi Abdallah, I've answered your question on SO. For the sake of completeness here is a copy: At the moment Flink's CEP library does not support dynamic pattern changes out of the box. Thus, once you've defined your pattern and started your job, it will only process this defined pattern. However,

Re: Update avro to 1.7.7 or later for flink 1.1.4

2016-12-05 Thread Timo Walther
Hi David, thanks for looking into this. Since you already looked into this issue and solved/tested your fix, it would be great if you could open a pull request for it. Every contribution is very welcome. Regards, Timo Am 04/12/16 um 23:35 schrieb Torok, David: I spent close to two days an