Re: set flink yarn jvm options

2016-08-04 Thread Prabhu V
Thanks much Jamie, this works just like you mentioned. Sorry, my bad, I was mistaken about this parameter. On Thu, Aug 4, 2016 at 4:39 PM, Jamie Grier wrote: > The YARN client should pass these JVM options along to each of the > launched containers. Have you tried this? > > On Thu, Aug 4, 2016

Re: Having a single copy of an object read in a RichMapFunction

2016-08-04 Thread Sameer W
Theodore, Broadcast variables do that when using the DataSet API - http://data-artisans.com/how-to-factorize-a-700-gb-matrix-with-apache-flink/ See the following lines in the article- To support the above presented algorithm efficiently we had to improve Flink’s broadcasting mechanism since it ea

Having a single copy of an object read in a RichMapFunction

2016-08-04 Thread Theodore Vasiloudis
Hello all, for a prototype we are looking into we would like to read a big matrix from HDFS, and for every element that comes in a stream of vectors do one multiplication with the matrix. The matrix should fit in the memory of one machine. We can read in the matrix using a RichMapFunction, but tha

Re: set flink yarn jvm options

2016-08-04 Thread Jamie Grier
The YARN client should pass these JVM options along to each of the launched containers. Have you tried this? On Thu, Aug 4, 2016 at 4:07 PM, Prabhu V wrote: > > This property i believe affects only the yarn client. I want to set jvm > opts on application-manager and task-manager containers. > >

Re: set flink yarn jvm options

2016-08-04 Thread Prabhu V
This property, I believe, affects only the YARN client. I want to set JVM opts on the application-manager and task-manager containers. Thanks, Prabhu On Thu, Aug 4, 2016 at 3:07 PM, Jamie Grier wrote: > Use *env.java.opts* > > This will be respected by the YARN client. > > > > On Thu, Aug 4, 2016 at

Re: set flink yarn jvm options

2016-08-04 Thread Jamie Grier
Use *env.java.opts* This will be respected by the YARN client. On Thu, Aug 4, 2016 at 11:10 AM, Prabhu V wrote: > The docs mention that > > env.java.opts.jobmanager > env.java.opts.taskmanager > > parameters are available but are ignored by the yarn client, is there a > way to set the jvm opt

Re: set flink yarn jvm options

2016-08-04 Thread Prabhu V
The docs mention that the env.java.opts.jobmanager and env.java.opts.taskmanager parameters are available but are ignored by the YARN client. Is there a way to set the JVM opts for YARN? Thanks, Prabhu On Wed, Aug 3, 2016 at 7:03 PM, Prabhu V wrote: > Hi, > > Is there a way to set jvm options on the

RE: What is the recommended way to read AVRO data from Kafka using flink.

2016-08-04 Thread Alam, Zeeshan
Hi Stephan, my AvroDeserializationSchema worked fine with a different Kafka topic. It seems the previous Kafka topic had heterogeneous data, with both Avro- and JSON-formatted records. Thanks for your time ☺. Thanks & Regards Zeeshan Alam From: Stephan Ewen [mailto:se...@apache.org] Sen

Re: Generate timestamps in front of event for event time windows

2016-08-04 Thread Jason Brelloch
Thanks Aljoscha, Looking forward to the 1.1 release. I managed to solve my problem using this example code: https://bitbucket.org/snippets/vstoyak/o9Rqp (courtesy of Vladimir Stoyak) I had to create a custom window and window assigner. Hopefully that will help someone else. On Wed, Aug 3, 20

Re: What is the recommended way to read AVRO data from Kafka using flink.

2016-08-04 Thread Stephan Ewen
Hi! To read data from Kafka, you need a DeserializationSchema. You could create one that wraps the AvroInputFormat, but an AvroDeserializationSchema would simply be an adjustment of the AvroInputFormat to the interface of the DeserializationSchema. In your Avro DeserializationSchema, you can prob

Re: Regarding QueryableState

2016-08-04 Thread Vishnu Viswanath
Hi Ufuk, I was able to create a QueryableState stream and query it when running in cluster mode on my local machine, but I am unable to query the stream when running on an AWS cluster. I get an AbstractMethodError at kvState.getSerializedValue(serializedKeyAndNamespace) in KvStateServerHandler.j

Re: Regarding QueryableState

2016-08-04 Thread Ufuk Celebi
You can expect it to be merged by the end of this week. Note that the APIs are very low level at the moment though. In the PR branch you can have a look at the QueryableStateITCase for more details. – Ufuk On Thu, Aug 4, 2016 at 11:53 AM, vinay patil wrote: > Hi All, > > For one use case I wan

Regarding QueryableState

2016-08-04 Thread vinay patil
Hi All, For one use case I want to make use of this feature instead of querying an external store like Cassandra or ES. Can you please let me know if this is under development or present in the master branch? It would be of great help if you could provide examples or a link to the documentation. Thanks

Re: Flink Kafka Consumer Behaviour

2016-08-04 Thread Stephan Ewen
Hi! I have not used the Kafka Offset Checker before, maybe someone who worked with that can chime in. Greetings, Stephan On Wed, Aug 3, 2016 at 4:59 PM, Janardhan Reddy wrote: > I can see that offsets are stored in zookeeper and are not returned when i > query through kafka offset checker. >

Re: Container running beyond physical memory limits when processing DataStream

2016-08-04 Thread Stephan Ewen
Hi! The JVM is allowed 1448m of memory, and the JVM should never use more heap than that. The fact that the process is using more than 2GB memory in total means that some libraries are allocating memory outside the heap. You can activate the memory logger to diagnose that: https://ci.apache.org/
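
A minimal sketch of how heap and off-heap usage can be compared from inside a JVM, using only the standard java.lang.management beans. This is not Flink's memory logger; the class name MemoryReportSketch and the report format are made up for illustration. Note that memory allocated natively by a library (for example through JNI) is not visible to these beans, so the numbers are a lower bound on what the container actually uses.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryReportSketch {

    private static final long MB = 1024L * 1024L;

    // Builds a one-line summary of heap, JVM-managed non-heap, and NIO buffer pool usage.
    static String report() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        StringBuilder sb = new StringBuilder();
        // getMax() returns -1 when the maximum is undefined.
        sb.append("heap used=").append(heap.getUsed() / MB).append("MB")
          .append(" max=").append(heap.getMax() < 0 ? -1 : heap.getMax() / MB).append("MB");
        sb.append(", non-heap used=").append(nonHeap.getUsed() / MB).append("MB");

        // Direct and mapped byte buffers live outside the heap and are reported separately.
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            sb.append(", ").append(pool.getName())
              .append(" buffers=").append(pool.getMemoryUsed() / MB).append("MB");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

If the heap figure stays below the configured limit while the container is still killed, the difference is being allocated outside the heap, which is the situation described in the reply above.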

Re: Parsing source JSON String as Scala Case Class

2016-08-04 Thread Stephan Ewen
If the class has non-serializable members, you need to initialize them "lazily" when the objects are already in the distributed execution (after serializing / distributing them). Making a Scala 'val' a 'lazy val' often does the trick (at minimal performance cost). On Thu, Aug 4, 2016 at 3:56 AM,
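
The same idea can be sketched in plain Java, where the counterpart of a Scala 'lazy val' is a transient field that is created on first use. This is an illustration under assumptions, not Flink code: Parser, ParseFunction and roundTrip are hypothetical names, and Java serialization stands in for shipping the function to the cluster.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class LazyMemberSketch {

    // Hypothetical stand-in for a member that is not serializable.
    static final class Parser {
        String parse(String s) {
            return s.trim().toUpperCase();
        }
    }

    // Hypothetical user function: serializable, with a lazily created helper.
    static final class ParseFunction implements Serializable {
        private static final long serialVersionUID = 1L;

        // transient: skipped during serialization, so it is null after shipping.
        private transient Parser parser;

        // Creates the helper on first use, i.e. after deserialization.
        private Parser parser() {
            if (parser == null) {
                parser = new Parser();
            }
            return parser;
        }

        String apply(String in) {
            return parser().parse(in);
        }

        boolean isInitialized() {
            return parser != null;
        }
    }

    // Simulates distributing the function by serializing and deserializing it.
    static ParseFunction roundTrip(ParseFunction f) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(f);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ParseFunction) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ParseFunction shipped = roundTrip(new ParseFunction());
        System.out.println(shipped.isInitialized()); // false: nothing created yet
        System.out.println(shipped.apply("  hello ")); // HELLO
        System.out.println(shipped.isInitialized()); // true: created on first use
    }
}
```

The null check costs little per call, which matches the "minimal performance cost" mentioned in the reply.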

Re: How to avoid path conflict in zookeeper/HDFS

2016-08-04 Thread Hironori Ogibayashi
Ufuk, Thank you for your answer. I understand that only the ZooKeeper root path needs to be different, and I am glad to hear that YARN will automatically handle the root path in the next release. Regards, Hironori 2016-08-04 16:39 GMT+09:00 Ufuk Celebi : > Hey Hironori, > > the storage directories (recovery.z

Re: Parallel execution on AllWindows

2016-08-04 Thread Andrew Ge Wu
Thanks for the quick response, everything is clear! cheers! Andrew > On 03 Aug 2016, at 18:11, Aljoscha Krettek wrote: > > Hi, > "rebalance" simply specifies the strategy to use when sending elements > downstream to the next operator(s). There is no interaction or competition > between the pa

Re: How to avoid path conflict in zookeeper/HDFS

2016-08-04 Thread Ufuk Celebi
Hey Hironori, the storage directories (recovery.zookeeper.storageDir, state.backend.fs.checkpointdir) can stay the same I think (either random or jobID-specific sub folders should be created there). The ZooKeeper root path (recovery.zookeeper.path.root) needs to be unique per cluster for HA. If y