So there is no way to do a countWindow(100) and preserve data locality? My use case is this: augment a data stream with new fields from DynamoDB lookup. DynamoDB allows batch get's of up to 100 records, so I am trying to collect 100 records before making that call. I have no other reason to do a repartitioning, so I am hoping to avoid incurring the cost of shipping all the data across the network to do this.
If I use countWindowAll, I am limited to parallelism = 1, so all data gets repartitioned twice. And if I use keyBy().countWindow(), then it gets repartitioned by key. So in both cases I lose locality. Am I missing any other options? -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Strange-behavior-of-DataStream-countWindow-tp7482p13981.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.