Re: Windowing isn't applied per key

2017-10-11 Thread Tony Wei
Hi Marcus, Yes, each key would has it's own window managed, so the aggregation on window is sum of the value by each key, not sum of all element. You can imagine that each key has its own sliding window assignor that decides each element in each keyed stream belong to which windows, but all keyed

Re: Windowing isn't applied per key

2017-10-11 Thread mclendenin
Hi Tony, In the documentation on keyed windows vs non-keyed it says that it will split the stream into parallel keyed streams with windows being executed in parallel across the keys. I would think that this would mean that each key has it's own window managed independently. https://ci.apache.org/

Re: Windowing isn't applied per key

2017-10-10 Thread Tony Wei
Hi Marcus, I think that is an expected result for sliding window in Flink. You can see the example in the document for more details. [1] For your need, I will suggest to use ProcessFunction to implement the sliding window that you expected. You can use key state to buffer elements and onTimer to t

Re: Windowing isn't applied per key

2017-10-10 Thread mclendenin
Sure, I'm going to use a name as key in this example and just a number as the value aggregated. This is the sample input data 12:00 {"name": "Marcus", "value": 1} 12:01 {"name": "Suzy", "value": 2} 12:03 {"name": "Alicia", "value": 3} 12:04 {"name": "Ben", "value": 1} 12:06 {"name": "Alicia", "val

Re: Windowing isn't applied per key

2017-10-10 Thread Aljoscha Krettek
Hi, Could you maybe give an example of what you expect as output and what you actually get? Best, Aljoscha > On 9. Oct 2017, at 16:09, mclendenin wrote: > > I am using Processing Time, so it is using the default timestamps and > watermarks. I am running it with a parallelism of 3, I can see

Re: Windowing isn't applied per key

2017-10-09 Thread mclendenin
I am using Processing Time, so it is using the default timestamps and watermarks. I am running it with a parallelism of 3, I can see each operator running at a parallelism of 3 on the Web UI. I am pulling data from a Kafka topic with 12 partitions. -- Sent from: http://apache-flink-user-mailing

Re: Windowing isn't applied per key

2017-10-02 Thread Timo Walther
Hi Marcus, from a first glance your pipeline looks correct. It should not be executed with a parallelism of one, if not specified explicitly. Which time semantics are you using? If it is event-time, I would check your timestamps and watermarks assignment. Maybe you can also check in the web f

Windowing isn't applied per key

2017-09-29 Thread Marcus Clendenin
I have a job that is performing an aggregation over a time window. This windowing is supposed to be happening by key, but the output I am seeing is creating an overall window on everything coming in. Is this happening because I am doing a map of the data before I am running the keyBy command? This