Re: About delta awareness caches

2016-12-21 Thread xingcan
Hi Aljoscha, First of all, sorry for that I missed your prompt reply : ( In these days, I've been learning the implementation mechanism of window in Flink. I think the main difference between the window in Storm and Flink (from the API level) is that, Storm maintains only one window while Flink

Re: Monitoring REST API

2016-12-21 Thread Ovidiu-Cristian MARCU
Hi Lydia, I have used sar monitoring (sar -u -n DEV -p -d -r 1) and plotted the average over multiple nodes. 1)So for each node you can collect the sar output, and obtain for example: Linux 3.2.0-4-amd64 (parasilo-4.rennes.grid5000.fr) 2016-01-27 _x86_64_(16 CPU) 12:54:09

Re: FlinkML and DataStream API

2016-12-21 Thread Márton Balassi
Thanks for mentioning it, Theo. Here it is: https://github.com/streamline-eu/ML-Pipelines/tree/stream-ml Look at these examples: https://github.com/streamline-eu/ML-Pipelines/commit/314e3d940f1f1ac7b762ba96067e13d806476f57 On Wed, Dec 21, 2016 at 9:38 PM, wrote: > I'm interested in that code y

Re: FlinkML and DataStream API

2016-12-21 Thread dromitlabs
I'm interested in that code you mentioned too, I hope you can find it. Regards, Matt > On Dec 21, 2016, at 17:12, Theodore Vasiloudis > wrote: > > Hello Mäki, > > I think what you would like to do is train a model using batch, and use the > Flink streaming API as a way to serve your model a

Re: FlinkML and DataStream API

2016-12-21 Thread Theodore Vasiloudis
Hello Mäki, I think what you would like to do is train a model using batch, and use the Flink streaming API as a way to serve your model and make predictions. While we don't have an integrated way to do that in FlinkML currently, I definitely think that's possible. I know Marton Balassi has been

Re: Stateful Stream Processing with RocksDB causing Job failure

2016-12-21 Thread abiybirtukan
Thanks, that helps. Sent from my iPhone > On Dec 21, 2016, at 12:08 PM, Ufuk Celebi wrote: > >> On Wed, Dec 21, 2016 at 2:52 PM, wrote: >> So the RocksDB state backend should only be used in conjunctions with hdfs >> or s3 on a multi node cluster? > > Yes. Otherwise there is no way to resto

Monitoring REST API

2016-12-21 Thread Lydia Ickler
Hi all, I have a question regarding the Monitoring REST API; I want to analyze the behavior of my program with regards to I/O MiB/s, Network MiB/s and CPU % as the authors of this paper did. (https://hal.inria.fr/hal-01347638v2/document ) From the

Re: static/dynamic lookups in flink streaming

2016-12-21 Thread Meghashyam Sandeep V
Thanks for the solution Fab. My map would be substantially large. So I wouldn't want to replicate it in each operator. I will probably add a layer of redis cache adn use it in streaming process. Do you foresee any problems with that? Thanks, Sandeep On Wed, Dec 21, 2016 at 9:52 AM, Fabian Hueske

Re: static/dynamic lookups in flink streaming

2016-12-21 Thread Fabian Hueske
You could read the map from a file in the open method of a RichMapFunction. The open method is called before the first record is processed and can put data into the operator state. The downside of this approach is that the data is replicated in each operator, i.e., each operator holds a full copy

Re: static/dynamic lookups in flink streaming

2016-12-21 Thread Meghashyam Sandeep V
As a follow up question, can we populate the operator state from an external source? My use case is as follows: I have a flink streaming process with Kafka as a source. I only have ids coming from kafka messages. My look ups () which is a static map come from a different source. I would like to us

Re: Stateful Stream Processing with RocksDB causing Job failure

2016-12-21 Thread Ufuk Celebi
On Wed, Dec 21, 2016 at 2:52 PM, wrote: > So the RocksDB state backend should only be used in conjunctions with hdfs > or s3 on a multi node cluster? Yes. Otherwise there is no way to restore the checkpoint on a different host.

FlinkML and DataStream API

2016-12-21 Thread Mäki Hanna
Hi, I'm wondering if there is a way to use FlinkML and make predictions continuously for test data coming from a DataStream. I know FlinkML only supports the DataSet API (batch) at the moment, but is there a way to convert a DataStream into DataSets? I'm thinking of something like (0. fit mod

Events are assigned to wrong window

2016-12-21 Thread Nico
Hi @all, I am using a TumblingEventTimeWindows.of(Time.seconds(20)) for testing. During this I found a strange behavior (at least for me) in the assignment of events. The first element of a new window is actually always part of the old window. I thought the events are late, but then they they wou

Re: static/dynamic lookups in flink streaming

2016-12-21 Thread Fabian Hueske
OK, I see. Yes, you can do that with Flink. It's actually a very common use case. You can store the names in operator state and Flink takes care of checkpointing the state and restoring it in case of a failure. In fact, the operator state is persisted in the state backends you mentioned before. B

Re: static/dynamic lookups in flink streaming

2016-12-21 Thread Meghashyam Sandeep V
Hi Fabian, I meant look ups like IDs to names. For example if I have IDs coming through the stream and if I want to replace them with corresponding names stored in cache or somewhere within flink. Thanks, Sandeep On Dec 21, 2016 12:35 AM, "Fabian Hueske" wrote: > Hi Sandeep, > > I'm sorry but

Re: Stateful Stream Processing with RocksDB causing Job failure

2016-12-21 Thread abiybirtukan
Thanks for the prompt response. I am persisting checkpoints to local file system. I have three node cluster, task managers running on two different hosts. At times the job runs for days with no issue but recently it started to fail when it takes snapshots. So the RocksDB state backend should o

Re:

2016-12-21 Thread Fabian Hueske
This issue was posted twice to the ML. The discussion should be continued on the other thread with the subject "Stateful Stream Processing with RocksDB causing Job failure" 2016-12-21 9:44 GMT+01:00 Fabian Hueske : > Hi Abiy, > > to which type of filesystem are you persisting your checkpoints? >

Re: Stateful Stream Processing with RocksDB causing Job failure

2016-12-21 Thread Fabian Hueske
Copying my reply from the other thread with the same issue to have the discussion in one place. -- Hi Abiy, to which type of filesystem are you persisting your checkpoints? We have seen problems with S3 and its consistency model. These issues have been addressed in newer versions of Flink.

Re: Stream Iterations

2016-12-21 Thread Ufuk Celebi
The delta iterations are a batch-only feature. Did you try the DataStream#iterate? https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/index.html#iterations On Mon, Dec 19, 2016 at 5:35 AM, Govindarajan Srinivasaraghavan wrote: > Hi All, > > I have a use case for which I

Re: Stateful Stream Processing with RocksDB causing Job failure

2016-12-21 Thread Ufuk Celebi
Hey Abiy! - Do all the task managers run on a single host? Only then using the local file system will work. - What does every now and then mean? Every time when the job tries to take a snapshot? After restarts? The JobManager logs will also help if we can't figure this out like this. Best, Ufu

Re: CodeAnalysisMode in Flink

2016-12-21 Thread Fabian Hueske
Hi Vinay, I had a look into the code. The code analysis is only performed for DataSet (batch) programs and not for DataStream (streaming) programs. If your program is a DataStream program, this would explain why no information is shown. The DataStream API does also not leverage semantic function

Re:

2016-12-21 Thread Fabian Hueske
Hi Abiy, to which type of filesystem are you persisting your checkpoints? We have seen problems with S3 and its consistency model. These issues have been addressed in newer versions of Flink. Not sure if the fix went into 1.1.3 already but release 1.1.4 is currently voted on and has tons of other

Re: static/dynamic lookups in flink streaming

2016-12-21 Thread Fabian Hueske
Hi Sandeep, I'm sorry but I think I do not understand your question. What do you mean by static or dynamic look ups? Do you want to access an external data store and cache data? Can you give a bit more detail about your use? Best, Fabian 2016-12-20 23:07 GMT+01:00 Meghashyam Sandeep V : > Hi t