Re: Correlation between data streams/operators and threads

2017-11-13 Thread Shailesh Jain
Hi Piotrek, I tried out option 'a' mentioned above, but instead of separate jobs, I'm creating separate streams per device. Following is the test deployment configuration as a local cluster (8GB ram, 2.5 GHz i5, ubuntu machine): akka.client.timeout 15 min jobmanager.heap.mb 1024 jobmanager.rpc.ad

Re: Scale by the Bay Flink talks

2017-11-13 Thread Stephan Ewen
Hi Ken! I'm happy to chat with you after my talk on Saturday at 9am. http://sched.co/BLwI. I will be at Scale by the Bay until early afternoon. I am also giving talk and workshop at QCon and am part of a panel there. I am not joining a Meetup this time, because with the conferences and some more

Apache Flink - Question about TriggerResult.FIRE

2017-11-13 Thread M Singh
Hi Flink Users I have a few questions about triggers: If a trigger returns TriggerResult.FIRE from say the onProcessingTime method - the window computation is triggered but elements are kept in the window.  If there a second invocation of the onProcessingTime method will the elements from the pr

Re: Off heap memory issue

2017-11-13 Thread Flavio Pompermaier
Unfortunately the issue I've opened [1] was not a problem of Flink but was just caused by an ever increasing job plan. So no help from that..Let's hope to find out the real source of the problem. Maybe using -Djdk.nio.maxCachedBufferSize could help (but I didn't try it yet) Best, Flavio [1] http

Re: Correlation between data streams/operators and threads

2017-11-13 Thread Piotr Nowojski
Sure, let us know if you have other questions or encounter some issues. Thanks, Piotrek > On 13 Nov 2017, at 14:49, Shailesh Jain wrote: > > Thanks, Piotr. I'll try it out and will get back in case of any further > questions. > > Shailesh > > On Fri, Nov 10, 2017 at 5:52 PM, Piotr Nowojski

Re: Correlation between data streams/operators and threads

2017-11-13 Thread Shailesh Jain
Thanks, Piotr. I'll try it out and will get back in case of any further questions. Shailesh On Fri, Nov 10, 2017 at 5:52 PM, Piotr Nowojski wrote: > 1. It’s a little bit more complicated then that. Each operator chain/task > will be executed in separate thread (parallelism > Multiplies that).

Re: JobManager shows TaskManager was lost/killed while TaskManger Process is still running and the network is OK.

2017-11-13 Thread Nico Kruber
>From what I read in [1], simply add JVM options to env.java.opts as you would when you start a Java program yourself, so setting "-XX:+UseG1GC" should enable G1. Nico [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/ config.html#common-options On Friday, 15 September 2017

Re: Streaming : a way to "key by partition id" without redispatching data

2017-11-13 Thread Nico Kruber
Hi Gwenhaël, several functions in Flink require keyed streams because they manage their internal state by key. These keys, however, should be independent of the current execution and its parallelism so that checkpoints may be restored to different levels of parallelism (for re-scaling, see [1]).

Re: Flink drops messages?

2017-11-13 Thread Fabian Hueske
Thanks for the correction! :-) 2017-11-13 13:05 GMT+01:00 Kien Truong : > Getting late elements from side-output is already available with Flink 1.3 > :) > > Regards, > > Kien > On 11/13/2017 5:00 PM, Fabian Hueske wrote: > > Hi Andrea, > > you are right. Flink's window operators can drop message

Re: Metric Registry Warnings

2017-11-13 Thread ashish pok
Thanks Fabian! Sent from Yahoo Mail on Android On Mon, Nov 13, 2017 at 4:44 AM, Fabian Hueske wrote: Hi Ashish, this is a known issue and has been fixed for the next version [1]. Best, Fabian [1] https://issues.apache.org/jira/browse/FLINK-7100 2017-11-11 16:02 GMT+01:00 Ashish Pokhare

Re: Flink drops messages?

2017-11-13 Thread Kien Truong
Getting late elements from side-output is already available with Flink 1.3 :) Regards, Kien On 11/13/2017 5:00 PM, Fabian Hueske wrote: Hi Andrea, you are right. Flink's window operators can drop messages which are too late, i.e., have a timestamp smaller than the last watermark. This is e

Re: Flink HA Zookeeper Connection Timeout

2017-11-13 Thread Nico Kruber
Hi Sathya, have you checked this yet? https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/ jobmanager_high_availability.html I'm no expert on the HA setup, have you also tried Flink 1.3 just in case? Nico On Wednesday, 8 November 2017 04:02:47 CET Sathya Hariesh Prakash (sathypra)

Re: AvroParquetWriter may cause task managers to get lost

2017-11-13 Thread Nico Kruber
Hi Ivan, sure, the more work you do per record, the slower the sink will be. However, this should not influence (much) the liveness checks inside flink. Do you get some meaningful entries in the TaskManagers' logs indicating the problem? I'm no expert on Avro and don't know how much actual work

Re: AvroParquetWriter may cause task managers to get lost

2017-11-13 Thread Fabian Hueske
Hi Ivan, I don't have much experience with Avro, but extracting the schema and creating a writer for each record sounds like a pretty expensive approach. This might result in significant load and increased GC activity. Do all records have a different schema or might it make sense to cache the wri

Flink vs Spark streaming benchmark

2017-11-13 Thread G.S.Vijay Raajaa
Hi Guys, I have been using Flink for quite sometime now and recently I hit upon a benchmark result that was published in Data bricks. Would love to hear your thoughts - https://databricks.com/blog/2017/10/11/benchmarking-structured-streaming-on-databricks-runtime-against-state-of-the-art-streamin

Re: Getting java.lang.ClassNotFoundException: for protobuf generated class

2017-11-13 Thread Nico Kruber
Hi Shankara, can you give us some more details, e.g. - how do you run the job? - how do you add/include the jar with the missing class? - is that jar file part of your program's jar or separate? - is the missing class, i.e. "com.huawei.ccn.intelliom.ims.MeasurementTable $measurementTable" (an inner

Re: Flink drops messages?

2017-11-13 Thread Fabian Hueske
Hi Andrea, you are right. Flink's window operators can drop messages which are too late, i.e., have a timestamp smaller than the last watermark. This is expected behavior and documented at several places [1] [2]. There are a couple of options how to deal with late elements: 1. Use more conservat

Re: readFile, DataStream

2017-11-13 Thread Kostas Kloudas
Hi Juan, The problem is that once a file for a certain timestamp is processed and the global modification timestamp is modified, then all files for that timestamp are considered processed. The solution is not to remove the = from the modificationTime <= globalModificationTime; in ContinuousFil

Re: Metric Registry Warnings

2017-11-13 Thread Fabian Hueske
Hi Ashish, this is a known issue and has been fixed for the next version [1]. Best, Fabian [1] https://issues.apache.org/jira/browse/FLINK-7100 2017-11-11 16:02 GMT+01:00 Ashish Pokharel : > All, > > Hopefully this is a quick one. I enabled Graphite reporter in my App and I > started to see th

RE: Streaming : a way to "key by partition id" without redispatching data

2017-11-13 Thread Gwenhael Pasquiers
>From what I understood, in your case you might solve your issue by using >specific key classes instead of Strings. Maybe you could create key classes that have a user-specified hashcode that could take the previous key's hashcode as a value. That way your data shouldn't be sent over the wire a