Hi, Thanks for the reply. So I have 2 cases:
1. timeWindowAll (length, slide).reduce (...) (with parallelism = 1) 2. groupby(someField).timeWindow(length, slide). reduce(...) Lets say case-1 global window, case-2 partitioned window. If I have only one key (for case-2) and I set parallelism=1 for case-1, I would expect that both cases have similar performance both in terms of latency and throughput. However, partitioned windows outperform global ones by orders of magnitude in terms of throughput. I am using Flink 1.1.3. Thanks, Adrienne On Mon, May 8, 2017 at 3:55 PM, Stefan Richter <s.rich...@data-artisans.com> wrote: > Hi, > > to answer this question, we would first need to know what you mean by > „global windows“: using „windowAll()“ or „GlobalWindows“? Also, the answer > might depend on the Flink version that you are using. > > Best, > Stefan > > > Am 07.05.2017 um 23:23 schrieb Adrienne Kole <adrienneko...@gmail.com>: > > > > Hi, > > > > I am doing simple aggregation with a keyed and global windows in flink. > > When I compare the keyed window aggregation with 1 key and global window > (which has parallelism 1) I would expect that both of them would have > similar performance. > > > > However, keyed stream with 1 key performs with 2x more throughput than > global window. > > My configuration is 8 node cluster, 16 core in each node, parallelism = > 128. > > > > AFAIK, Flink doesn't manage skew by default and uses hash function to > assign keys to partitions. So if I have 1 key only, it should go to only > one partition always, which is semantically similar to global windows in > flink. > > > > What can be the reason behind this difference in performance? > > > > Thanks, > > Adrienne > >