Re: Firing windows multiple times

2016-08-12 Thread Shannon Carey
Thanks Aljoscha, I didn't know about those. Yes, they look like handy changes, especially to enable flexible approaches for eviction. In particular, having the current watermark available to the evictor via EvictorContext is helpful: it will be able to evict the old data more easily without need

Ordering expectations of data

2016-08-12 Thread Bart Wyatt
​Hello all, We have a kafka topic with lots of partitions where data is partitioned by an upstream publisher on "session". In flink we read this topic and another single partition topic which contains configuration definitions for a little flatMap based operation. We also do a little bit

Re: Multiple Partitions (Source Functions) -> Event Time -> Watermarks -> Trigger

2016-08-12 Thread Sameer W
Thanks Max - I will advance watermarks when no event arrives for a while. But when using Kafka is it a good practice to assign events to partitions randomly instead say device id or region id where the devices are located. What I noticed is if devices sending to one of the partitions stop sending

Re: Within interval for CEP - Wall Clock based or Event Timestamp based?

2016-08-12 Thread Sameer W
Thanks Max - Especially the last part about late events. Sameer On Fri, Aug 12, 2016 at 4:23 AM, Maximilian Michels wrote: > Hi Sameer, > > That depends on the time characteristic you have chosen. If you have > set it to event time [1] then it will use event time, otherwise the > default is to

RE: flink - Working with State example

2016-08-12 Thread Ramanan, Buvana (Nokia - US)
Hi Kostas, I am trying to use FsStateBackend as the backend for storing state. And configure it as follows in the code: StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.setStateBackend(new FsStateBackend("file:///home/buvana/flink/checkp

1.1.1: JobManager config endpoint no longer supplies port

2016-08-12 Thread Shannon Carey
It appears that when running Flink 1.1.1 on Yarn, my previous method of making a request to the yarn AM proxy on the master node at http://{master_node}:20888/proxy/{app_id}/jobmanager/config doesn't work the same as it did. Previously, the returned JSON value would include an accurate value fo

Updating stored window data

2016-08-12 Thread Paul Joireman
Hello, I'd like to specify a long time duration window (say 1 day) but write a custom trigger to force processing on a more frequent interval (say 10 minutes). Similar, at least initially, to the thread recently started by S. Carey (Firing windows mulitple times). However, each time the win

Re: ValueState is missing

2016-08-12 Thread Stephan Ewen
Hi! So far we are not aware of a state loss bug in Flink. My guess is that it is some subtlety in the program. The check that logs also has other checks, like "handHistoryInfo.playType == PlayType.Cash" and "players.size > 1". Is one of them maybe the problem? To debug this, you can try and do

Re: ValueState is missing

2016-08-12 Thread Dong-iL, Kim
Hi. I checked order of data. but it is alright. Is there any other possibilities? Thank you. > On Aug 12, 2016, at 7:09 PM, Stephan Ewen wrote: > > Hi! > > Its not that easy to say at a first glance. > > One thing that is important to bear in mind is what ordering guarantees Flink > gives, an

Re: Using CustomPartitionerWrapper with KeyedStream

2016-08-12 Thread Philippe Caparroy
Hi Max, Thanks for the answer. I needed to ensure that in a parallel window operation (which relies on a KeyedStream) each partition contains a single key, in the output stream of the window. I can obtain this using a customPartitioner just after the window, but relying on the partitioner of th

Re: ValueState is missing

2016-08-12 Thread Stephan Ewen
Hi! Its not that easy to say at a first glance. One thing that is important to bear in mind is what ordering guarantees Flink gives, and where the ordering guarantees are not given. When you use keyBy() or redistribute(), order is preserved per parallel source/target pair only. Have a look here:

Re: Using CustomPartitionerWrapper with KeyedStream

2016-08-12 Thread Maximilian Michels
Hi Philippe, There is no particular reason other than hash partitioning is a sensible default for most users. It seems like this is rarely an issue. When the number of keys is close to the parallelism, having idle partitions is usually not a problem due to low data volume. I see that it could be a

Re: ValueState is missing

2016-08-12 Thread Dong-iL, Kim
Nope. I added log in End. but there is same log. is there any fault in my code? thank you. > On Aug 12, 2016, at 6:42 PM, Maximilian Michels wrote: > > You're clearing the "handState" on "GameEndHistory". I'm assuming this > event comes in before "CommCardHistory" where you check the state. >

Re: flink shaded jar in yarn

2016-08-12 Thread Aljoscha Krettek
Thanks for letting us know! On Thu, 11 Aug 2016 at 07:33 Janardhan Reddy wrote: > sorry my bad, i was using some other version. > > > > On Thu, Aug 11, 2016 at 4:47 AM, Janardhan Reddy < > janardhan.re...@olacabs.com> wrote: > >> Hi, >> >> the flink-dist_2.11-1.0.0.jar jar present in lib folder

Re: ValueState is missing

2016-08-12 Thread Maximilian Michels
You're clearing the "handState" on "GameEndHistory". I'm assuming this event comes in before "CommCardHistory" where you check the state. On Fri, Aug 12, 2016 at 6:59 AM, Dong-iL, Kim wrote: > in my code, is the config of ExecutionEnv alright? > > >> On Aug 11, 2016, at 8:47 PM, Dong-iL, Kim wro

Re: Unit tests failing, losing stream contents

2016-08-12 Thread Stephan Ewen
Hi David! I would guess that the first exception happens once in a while, as part of a rare race condition. As Max said, two executions happen simultaneously. We should fix that race condition, though. The second exception looks like it is purely part of your application code. Greetings, Stephan

Re: Unit tests failing, losing stream contents

2016-08-12 Thread Maximilian Michels
Hi David, You're starting two executions at the same time (in different threads). Here's why: Execution No 1 DataStreamUtils.collect(..) starts a Thread which executes your job and collects stream elements. It runs asynchronously. The collect(..) method returns after starting the thread. Executi

Re: Firing windows multiple times

2016-08-12 Thread Aljoscha Krettek
Hi, there is already this FLIP: https://cwiki.apache.org/confluence/display/FLINK/FLIP-4+%3A+Enhance+Window+Evictor which also links to a mailing list discussion. And this FLIP: https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending+Window+Function+Metadata. The former proposes to enhan

Re: Does Flink DataStreams using combiners?

2016-08-12 Thread Aljoscha Krettek
Hi, Sameer is right that Flink currently does not combine for any combination of assigner, trigger and window function. Technically, it would be possible to use a combiner for Triggers that don't observe individual elements but only fire on time. With triggers that observe elements, such as CountT

Re: flink - Working with State example

2016-08-12 Thread Kostas Kloudas
No problem! Regards, Kostas > On Aug 12, 2016, at 3:00 AM, Ramanan, Buvana (Nokia - US) > wrote: > > Kostas, > Good catch! That makes it working! Thank you so much for the help. > Regards, > Buvana > > -Original Message- > From: Kostas Kloudas [mailto:k.klou...@data-artisans.com] > S

Re: Window not emitting output after upgrade to Flink 1.1.1

2016-08-12 Thread Yassine Marzougui
Hi Kostas, Yes, that's the case. I will revert back to 1.0.3 until the bug is fixed. Thank you. Best, Yassine On Fri, Aug 12, 2016 at 10:34 AM, Kostas Kloudas < k.klou...@data-artisans.com> wrote: > Hi Yassine, > > Are you reading from a file and use ingestion time? > If yes, then the problem c

Re: Does Flink DataStreams using combiners?

2016-08-12 Thread Sameer Wadkar
Streaming cannot use windows. The aggregations happen on the trigger. The elements being aggregated are only known after the trigger delivers the elements to the evaluation function. Since windows can overlap and even assignment to a window is not done until the elements arrive at the sum ope

Using CustomPartitionerWrapper with KeyedStream

2016-08-12 Thread Philippe Caparroy
Hi there, It seems not possible to use some custom partitioner in the context of the KeyedStream, without modifying the KeyedStream. protected DataStream setConnectionType(StreamPartitioner partitioner) { throw new UnsupportedOperationException("Cannot override partitioning for

Re: Multiple Partitions (Source Functions) -> Event Time -> Watermarks -> Trigger

2016-08-12 Thread Maximilian Michels
Hi Sameer, If you use Event Time you should make sure to assign Watermarks and Timestamps at the source. As you already observed, Flink may get stuck otherwise because it waits for Watermarks to progress in time. There is no timeout for windows. However, you can implement that logic in your Water

Re: Window not emitting output after upgrade to Flink 1.1.1

2016-08-12 Thread Kostas Kloudas
Hi Yassine, Are you reading from a file and use ingestion time? If yes, then the problem can be related to this: https://issues.apache.org/jira/browse/FLINK-4329 Is this the case? Best, Kostas > On Aug 12, 2016, at 10:30 AM, Yassine Marzougui

Window not emitting output after upgrade to Flink 1.1.1

2016-08-12 Thread Yassine Marzougui
Hi all, The following code works under Flink 1.0.3, but under 1.1.1 it just switches to FINISHED and doesn't output any result. stream.map(new RichMapFunction() { private ObjectMapper objectMapper; @Override public void open(Configuration parameters) { object

Re: Within interval for CEP - Wall Clock based or Event Timestamp based?

2016-08-12 Thread Maximilian Michels
Hi Sameer, That depends on the time characteristic you have chosen. If you have set it to event time [1] then it will use event time, otherwise the default is to use processing time. When using event time, the element's timestamp is used to assign it to the specified time windows in the patterns,