Hi all, I'm trying to understand the difference between PartitionBy and GroupBy in Flink Streaming. In my experience with Spark, GroupBy causes a shuffle across the network. Is the same true in Flink?
I've watched videos and read a couple of docs about Flink, and as I understand it, Flink compiles the user code into its own optimized graph structure, so I assume the Flink engine takes care of this? From the docs on partitioning: http://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html#partitioning Is it true that GroupBy is more advanced than PartitionBy? Can someone elaborate? This is really confusing for me, coming from the Spark world. Any help would be really appreciated.

Cheers