Re: Continuing from the stackoverflow post

2015-11-27 Thread Nirmalya Sengupta
Hello Fabian, This is a slightly long mail; please bear with me. From your response: 'Let's start by telling me what you actually want to do ;-)' At a broad level, I want to write a tutorial (perhaps a series) on Flink, where these concepts are brought out by a mix of definition, elaboration,

Re: Interpretation of Trigger and Eviction on a window

2015-11-27 Thread Nirmalya Sengupta
Hello Fabian, From your reply to this thread: 'it is correct that the evictor is called BEFORE the window function is applied because this is required to support certain types of sliding windows.' This is clear to me now. However, my point was about the way it is described in the User Guide. T

Get an aggregator's value outside of an iteration

2015-11-27 Thread Truong Duc Kien
Hi, I'm looking for a way to get the value of aggregators outside of an iteration. Specifically, I want the final aggregators' values after the iteration has finished. Is there any API for that? Thanks, Kien Truong

Re: Insufficient number of network buffers running on cluster

2015-11-27 Thread Fabian Hueske
Hi Guido, please check this section of the configuration documentation: https://ci.apache.org/projects/flink/flink-docs-release-0.10/setup/config.html#configuring-the-network-buffers It should answer your questions. Please let us know, if not. Cheers, Fabian 2015-11-27 16:41 GMT+01:00 Guido :
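
For reference, the knob discussed on that page lives in flink-conf.yaml; a minimal sketch (key name as in the 0.10 configuration docs linked above; the value itself is only an illustrative guess and has to be sized for your cluster):

    # flink-conf.yaml
    # Total number of network buffers per TaskManager; raise this if jobs fail
    # with "Insufficient number of network buffers".
    taskmanager.network.numberOfBuffers: 4096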

Insufficient number of network buffers running on cluster

2015-11-27 Thread Guido
Hello, I would like to ask you something regarding an error I'm facing when running Java code on the cluster at DIMA. Caused by: java.io.IOException: Insufficient number of network buffers: required 300, but only 161 available. The total number of network buffers is currently set to 70480. You ca

Re: Working with State example /flink streaming

2015-11-27 Thread Lopez, Javier
Hi, Thanks for the example. We have done it with windows before and it works. We are using state because the data comes with a gap of several days and we can't handle a window size of several days. That's why we decided to use the state. On 27 November 2015 at 11:09, Aljoscha Krettek wrote: > H

Re: POJO Dataset read and write

2015-11-27 Thread Flavio Pompermaier
I was expecting Parquet + Thrift to perform faster, but I wasn't expecting that much of a difference; I just wanted to know whether my results were right or not. Thanks for the moment, Fabian! On Fri, Nov 27, 2015 at 4:22 PM, Fabian Hueske wrote: > Parquet is much cleverer than the TypeSerializer and applies column

Re: POJO Dataset read and write

2015-11-27 Thread Fabian Hueske
Parquet is much cleverer than the TypeSerializer and applies columnar storage and compression techniques. The TypeSerializerIOFs just use Flink's element-wise serializers to write and read binary data. I'd go with Parquet if it is working well for you. 2015-11-27 16:15 GMT+01:00 Flavio Pompermaier

Re: POJO Dataset read and write

2015-11-27 Thread Flavio Pompermaier
I made a simple test comparing Parquet + Thrift vs TypeSerializer IF/OF: the former outperformed the latter for a simple filter (not pushed down) and a map+sum (something like 2s vs 33s, not counting disk space usage, which is much worse). Is that normal or is TypeSerializer supposed

Re: Working with the Windowing functionality

2015-11-27 Thread Aljoscha Krettek
Hi, yes, you are right in your analysis. Did you try running it with always setting the timer? Maybe it’s not the bottleneck of the computation. I would be very interested in seeing how this behaves since I only did tests with regular time windows, where the first if statement almost always dire

Re: Doubt about window and count trigger

2015-11-27 Thread Stephan Ewen
Hi! The reason why trigger state is purged right now with the window is to make sure that no memory is occupied any more after the purge. Otherwise, memory consumption would just grow indefinitely, holding state of old triggers. Greetings, Stephan On Fri, Nov 27, 2015 at 4:05 PM, Fabian Hueske

Re: Doubt about window and count trigger

2015-11-27 Thread Fabian Hueske
When a window is purged, the Trigger and its state are also cleared. A new window comes with a new Trigger (and a new state). So yes, in your example the window will be fired after 30 secs again. Best, Fabian 2015-11-27 16:01 GMT+01:00 Anwar Rizal : > Thanks Fabian, > > Just for completion. > In

Re: Doubt about window and count trigger

2015-11-27 Thread Anwar Rizal
Thanks Fabian, Just for completeness: in that 1-minute window, is my modified count trigger still valid? Say, if in that one-minute window I have 100 events after 30 s, will it still fire at the 30th second? Cheers, Anwar. On Fri, Nov 27, 2015 at 3:31 PM, Fabian Hueske wrote: > Hi Anwar, > > You

Re: POJO Dataset read and write

2015-11-27 Thread Fabian Hueske
If you are just looking for an exchange format between two Flink jobs, I would go for the TypeSerializerInput/OutputFormat. Note that these are binary formats. Best, Fabian 2015-11-27 15:28 GMT+01:00 Flavio Pompermaier : > Hi to all, > > I have a complex POJO (with nexted objects) that I'd like
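
A self-contained sketch of that suggestion (MyPojo and the file path are placeholders of mine; written against the Java DataSet API, where the input format is configured with the type information — the 0.10 constructors may differ slightly, so check that release's Javadocs):

    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.io.TypeSerializerInputFormat;
    import org.apache.flink.api.java.io.TypeSerializerOutputFormat;
    import org.apache.flink.api.java.typeutils.TypeExtractor;

    public class PojoRoundTrip {

      // A simple Flink POJO: public fields and a public no-argument constructor.
      public static class MyPojo {
        public String id;
        public double value;
        public MyPojo() {}
        public MyPojo(String id, double value) { this.id = id; this.value = value; }
      }

      public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        String path = "file:///tmp/my-pojos";  // hypothetical exchange location

        // Job 1: write the DataSet in Flink's binary serializer format.
        DataSet<MyPojo> pojos = env.fromElements(new MyPojo("a", 1.0), new MyPojo("b", 2.0));
        pojos.write(new TypeSerializerOutputFormat<MyPojo>(), path);
        env.execute("write pojos");

        // Job 2: read it back; the input format needs the type information and the path.
        TypeInformation<MyPojo> typeInfo = TypeExtractor.getForClass(MyPojo.class);
        TypeSerializerInputFormat<MyPojo> inputFormat = new TypeSerializerInputFormat<>(typeInfo);
        inputFormat.setFilePath(path);
        env.createInput(inputFormat, typeInfo).print();
      }
    }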

Re: flink connectors

2015-11-27 Thread Stephan Ewen
The reason why the binary distribution does not contain all connectors is that this would add all libraries used by the connectors into the binary distribution jar. These libraries partly conflict with each other, and often conflict with the libraries used by the user's programs. Not including the

Re: Doubt about window and count trigger

2015-11-27 Thread Fabian Hueske
Hi Anwar, Your trigger looks good! I just want to make sure you know what exactly happens after a window has been evaluated and purged. Once a window is purged, the whole window is cleared and removed. If a new element arrives that would have fit into the purged window, a new window with exac
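
To make this behaviour concrete, here is my own sketch of such a trigger (not the one from this thread): it fires and purges as soon as a count is reached, or at the end of the window otherwise. It is written against the current Trigger API — the clear() method and the window argument on the time callbacks were added after 0.10, so the exact signatures differ in that release:

    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.api.common.state.ReducingState;
    import org.apache.flink.api.common.state.ReducingStateDescriptor;
    import org.apache.flink.api.common.typeutils.base.LongSerializer;
    import org.apache.flink.streaming.api.windowing.triggers.Trigger;
    import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

    public class CountOrEndOfWindowTrigger extends Trigger<Object, TimeWindow> {

      private final long maxCount;
      private final ReducingStateDescriptor<Long> countDesc =
          new ReducingStateDescriptor<>("count", new Sum(), LongSerializer.INSTANCE);

      public CountOrEndOfWindowTrigger(long maxCount) { this.maxCount = maxCount; }

      @Override
      public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) throws Exception {
        // Always keep a timer for the end of the window, so it still fires if the count is never reached.
        ctx.registerEventTimeTimer(window.maxTimestamp());
        ReducingState<Long> count = ctx.getPartitionedState(countDesc);
        count.add(1L);
        if (count.get() >= maxCount) {
          count.clear();
          return TriggerResult.FIRE_AND_PURGE;
        }
        return TriggerResult.CONTINUE;
      }

      @Override
      public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
        return time == window.maxTimestamp() ? TriggerResult.FIRE_AND_PURGE : TriggerResult.CONTINUE;
      }

      @Override
      public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
      }

      @Override
      public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
        ctx.getPartitionedState(countDesc).clear();
        ctx.deleteEventTimeTimer(window.maxTimestamp());
      }

      private static class Sum implements ReduceFunction<Long> {
        @Override
        public Long reduce(Long a, Long b) { return a + b; }
      }
    }

Attach it with .trigger(new CountOrEndOfWindowTrigger(100)) on a windowed stream; the semantics described above (a fresh trigger with fresh state after a purge) apply in the same way.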

Re: Cleanup of OperatorStates?

2015-11-27 Thread Stephan Ewen
Hey Niels! You may be able to implement this with windows anyway, depending on your setup. You can definitely implement state with a timeout yourself (using the more low-level state interface), or you may be able to use custom windows for that (they can trigger on every element and return elements im

POJO Dataset read and write

2015-11-27 Thread Flavio Pompermaier
Hi to all, I have a complex POJO (with nested objects) that I'd like to write and read with Flink (batch). What is the simplest way to do that? I can't find any example of it :( Best, Flavio

[ANNOUNCE] Flink 0.10.1 released

2015-11-27 Thread Robert Metzger
The Flink PMC is pleased to announce the availability of Flink 0.10.1. The official release announcement: http://flink.apache.org/news/2015/11/27/release-0.10.1.html Release binaries: http://apache.openmirror.de/flink/flink-0.10.1/ Please update your maven dependencies to the new 0.10.1 version a

Re: Interpretation of Trigger and Eviction on a window

2015-11-27 Thread Fabian Hueske
Hi Nirmalya, it is correct that the evictor is called BEFORE the window function is applied because this is required to support certain types of sliding windows. If you want to remove all elements from the window after the window function was applied, you need a trigger that purges the window. The
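
For the "trigger that purges the window" part, the streaming API already ships a wrapper (CountTrigger and PurgingTrigger in org.apache.flink.streaming.api.windowing.triggers); a fragment, with readings being a hypothetical (sensorId, temperature) tuple stream:

    // fire AND purge every 100 elements, instead of only firing
    readings
        .keyBy(0)
        .timeWindow(Time.seconds(5))
        .trigger(PurgingTrigger.of(CountTrigger.of(100)))
        .maxBy(1);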

Re: Continuing from the stackoverflow post

2015-11-27 Thread Fabian Hueske
Hi Nirmalya, can you describe the semantics that you want to implement? Do you want to find the max temperature every 5 milliseconds or the max of every 5 records? Right now, you are using a non-keyed timeWindow of 5 milliseconds. This will create a window for the complete stream every 5 msecs. H
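
The two readings translate to quite different programs; a sketch with a hypothetical (sensorId, temperature) tuple stream named readings:

    // (a) maximum of every 5 records, over the complete stream:
    readings.countWindowAll(5).maxBy(1);

    // (b) maximum every 5 milliseconds, computed per sensor rather than globally:
    readings.keyBy(0).timeWindow(Time.milliseconds(5)).maxBy(1);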

RE: flink connectors

2015-11-27 Thread Radu Tudoran
Hi, Thank you for the tips! For future reference, in case someone else wants to search for the binaries for this, I would like to share the link to the Maven repository http://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka Dr. Radu Tudoran Research Engineer IT R&D Division

Re: flink connectors

2015-11-27 Thread Ovidiu-Cristian MARCU
Hi, The main question here is why the release distribution doesn't contain the connector dependencies. It is fair to say that it does not have to (which connectors to include, or all of them?). So, just like Spark, Flink offers a binary distribution for Hadoop only, without bundling other dependencies

Re: Doubt about window and count trigger

2015-11-27 Thread Anwar Rizal
Thanks Fabian and Aljoscha, I tried to implement the trigger as you described, as follows: https://gist.github.com/anonymous/d0578a4d27768a75bea4 It works fine, indeed. Thanks, Anwar On Fri, Nov 27, 2015 at 11:49 AM, Aljoscha Krettek wrot

Re: flink connectors

2015-11-27 Thread Matthias J. Sax
If I understand the question right, you just want to download the jar manually? Just go to the maven repository website and download the jar from there. -Matthias On 11/27/2015 02:49 PM, Robert Metzger wrote: > Maybe there is a maven mirror you can access from your network? > > This site conta

Re: flink connectors

2015-11-27 Thread Robert Metzger
Maybe there is a maven mirror you can access from your network? This site contains a list of some mirrors http://stackoverflow.com/questions/5233610/what-are-the-official-mirrors-of-the-maven-central-repository You don't have to use the maven tool, you can also manually browse for the jars and dow

Continuing from the stackoverflow post

2015-11-27 Thread Nirmalya Sengupta
Hello Fabian/Matthias, Many thanks for showing interest in my query on SOF. That helps me sustain my enthusiasm. :-) After setting the parallelism of the environment to '1' and replacing _max()_ with _maxBy()_, I get a list of maximum temperatures, but I fail to explain to myself how Flink arrives at
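
The usual source of confusion between those two operators, as a fragment on a hypothetical (sensorId, temperature) tuple stream:

    // max(1): field 1 carries the running maximum, but the other fields are not
    // guaranteed to come from the record that produced that maximum.
    readings.keyBy(0).timeWindow(Time.seconds(5)).max(1);

    // maxBy(1): the complete record holding the maximum temperature is emitted.
    readings.keyBy(0).timeWindow(Time.seconds(5)).maxBy(1);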

Re: flink connectors

2015-11-27 Thread Fabian Hueske
You can always build Flink from source, but apart from that I am not aware of an alternative. 2015-11-27 14:42 GMT+01:00 Radu Tudoran : > Hi, > > > > Is there any alternative to avoiding maven? > > That is why I was curious if there is a binary distribution of this > available for download direct

RE: flink connectors

2015-11-27 Thread Radu Tudoran
Hi, Is there any alternative that avoids Maven? That is why I was curious whether there is a binary distribution of this available for download directly. Dr. Radu Tudoran Research Engineer IT R&D Division HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Cent

Re: flink connectors

2015-11-27 Thread Fabian Hueske
Hi Radu, the connectors are available in Maven Central. Just add them as a dependency in your project and they will be fetched and included. Best, Fabian 2015-11-27 14:38 GMT+01:00 Radu Tudoran : > Hi, > > > > I was trying to use flink connectors. However, when I tried to import this > > > > im
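
For completeness, the dependency block looks roughly like this (version taken from the 0.10.1 release announced in this digest; adjust artifactId and version to your setup):

    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-kafka</artifactId>
      <version>0.10.1</version>
    </dependency>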

flink connectors

2015-11-27 Thread Radu Tudoran
Hi, I was trying to use the Flink connectors. However, when I tried to import this: import org.apache.flink.streaming.connectors.*; I saw that they are not present in the binary distribution as downloaded from the website (flink-dist-0.10.0.jar). Is this intentional? Is there also a binary distributi

Re: Cleanup of OperatorStates?

2015-11-27 Thread Niels Basjes
Hi, Thanks for the explanation. I have clickstream data arriving in realtime and I need to assign the visitId and stream it out again (with the visitId now being part of the record) into Kafka with the lowest possible latency. Although the Window feature allows me to group and close the visit on a

Re: Working with the Windowing functionality

2015-11-27 Thread Niels Basjes
Hi, Thanks for all this input. I didn't know about the "// a trigger can only have 1 timer so we remove the old trigger when setting the new one" comment. This insight is of major importance to me. Let me explain: I found this code below in the WindowOperator. @Override public void registerEventTime

Interpretation of Trigger and Eviction on a window

2015-11-27 Thread Nirmalya Sengupta
Hello there. I have just started exploring Apache Flink, and it has immediately got me excited. Because I am a beginner, my questions may be a bit naive. Please bear with me. I refer to this particular sentence from Flink 0.10.0 Guide

Re: Cleanup of OperatorStates?

2015-11-27 Thread Stephan Ewen
Hi Niels! Currently, state is released by setting the value for the key to null. If you are tracking web sessions, you can try and send a "end of session" element that sets the value to null. To be on the safe side, you probably want state that is automatically purged after a while. I would look
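
A sketch of the "end of session element" idea, written against the keyed-state API of later Flink releases (in 0.10 the equivalent is setting the key's value to null, as described above; automatic timeouts are not available out of the box). Click, isEndOfSession() and setVisitId() are hypothetical:

    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.util.Collector;

    import java.util.UUID;

    // Assigns a visitId per key and releases the state when an explicit end-of-session marker arrives.
    public class VisitIdAssigner extends RichFlatMapFunction<Click, Click> {

      private transient ValueState<String> visitId;

      @Override
      public void open(Configuration parameters) {
        visitId = getRuntimeContext().getState(
            new ValueStateDescriptor<>("visitId", String.class));
      }

      @Override
      public void flatMap(Click click, Collector<Click> out) throws Exception {
        if (click.isEndOfSession()) {
          visitId.clear();  // release the state held for this key
          return;
        }
        String current = visitId.value();
        if (current == null) {
          current = UUID.randomUUID().toString();
          visitId.update(current);
        }
        click.setVisitId(current);
        out.collect(click);
      }
    }

Used as clicks.keyBy(...).flatMap(new VisitIdAssigner()), with the end-of-session marker playing the role of Stephan's "end of session" element.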

Re: Doubt about window and count trigger

2015-11-27 Thread Aljoscha Krettek
Hi Anwar, what Fabian wrote is completely right. I just want to give the reasoning for why the CountTrigger behaves as it does. The idea was to have Triggers that clearly focus on one thing and then at some point add combination triggers. For example, an OrTrigger that triggers if either of its

Re: graph problem to be solved

2015-11-27 Thread Stephan Ewen
Hi! Yes, looks like quite a graph problem. The best way to get started with that is to have a look at Gelly: https://ci.apache.org/projects/flink/flink-docs-release-0.10/libs/gelly_guide.html Beware: The problem you describe (all possible paths between all pairs of points) results in an exponenti

Re: Working with the Windowing functionality

2015-11-27 Thread Aljoscha Krettek
Hi Niels, do the records that arrive from Kafka already have the session ID or do you want to assign them inside your Flink job based on the idle timeout? For the rest of your problems you should be able to get by with what Flink provides: The triggering can be done using a custom Trigger that

Re: Working with State example /flink streaming

2015-11-27 Thread Aljoscha Krettek
Hi, I’ll try to go into a bit more detail about the windows here. What you can do is this: DataStream<...> input = … // fields are (id, sum, count), where count is initialized to 1, similar to word count DataStream<...> counts = input .keyBy(0) .timeWindow(Time.minutes(10)) .reduce(new MyCounting
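
The archive stripped the generic parameters from that snippet; a sketch of what it appears to describe (the Tuple3 field types are my assumption):

    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.api.java.tuple.Tuple3;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.time.Time;

    // fields are (id, sum, count), with count initialized to 1 per record, as in word count
    DataStream<Tuple3<String, Double, Long>> input = ...; // source omitted

    DataStream<Tuple3<String, Double, Long>> counts = input
        .keyBy(0)
        .timeWindow(Time.minutes(10))
        .reduce(new ReduceFunction<Tuple3<String, Double, Long>>() {
          @Override
          public Tuple3<String, Double, Long> reduce(Tuple3<String, Double, Long> a,
                                                     Tuple3<String, Double, Long> b) {
            return new Tuple3<>(a.f0, a.f1 + b.f1, a.f2 + b.f2);
          }
        });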

RE: Doubt about window and count trigger

2015-11-27 Thread fhueske
Hi, a regular tumbling time window of 5 seconds gets all elements within that period of time (the semantics of time vary for processing, ingestion, and event time modes) and triggers the execution after 5 seconds. If you define a custom trigger, the assignment policy remains the same, but the t
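
To make that last point concrete (a fragment; input is any keyed tuple stream): the assigner still builds 5-second tumbling windows, only the firing condition changes. Note that CountTrigger on its own fires without purging, which is exactly what the rest of this thread is about:

    input
        .keyBy(0)
        .timeWindow(Time.seconds(5))     // assignment: unchanged, tumbling 5-second windows
        .trigger(CountTrigger.of(100))   // triggering: fire every 100 elements instead of at the end of the window
        .sum(1);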