Dear Fabian,

Could you take a look at this issue? What actions would be required to resolve it?
https://issues.apache.org/jira/browse/FLINK-1725

Regards,
Anis

On Wed, Feb 15, 2017 at 6:36 PM, Fabian Hueske <fhue...@gmail.com> wrote:

> Hi Anis,
>
> Flink uses regular hash-partitioning to shuffle records and does not have a
> mechanism to counter data skew (other than scaling out).
> Heterogeneous hardware can (to some extent) be addressed by adapting the
> number of processing slots (or task managers) per machine, i.e., configure
> fewer slots on machines with lower performance.
>
> Best, Fabian
>
> 2017-02-15 2:12 GMT+01:00 Anis Nasir <aadi.a...@gmail.com>:
>
> > Dear All,
> >
> > I have a few use cases for Flink streaming where the cluster consists of
> > heterogeneous machines.
> >
> > Additionally, there is skew present in both the input distribution (e.g.,
> > each tuple is drawn from a Zipf distribution) and the service time (e.g.,
> > service time required for each tuple comes from a Zipf distribution).
> >
> > I want to know how Flink will handle such use cases, assuming that the
> > distribution of both workload and cluster is unknown a priori.
> >
> > Any help will be highly appreciated!
> >
> > Regards,
> > Anis
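
P.S. To make the skew point from the quoted thread concrete, here is a minimal sketch in plain Python. It is not Flink code and does not reproduce Flink's actual partitioner; it only assumes a generic "hash(key) modulo number of partitions" scheme. The function names (zipf_keys, hash_partition, imbalance) and all parameter values are hypothetical, chosen for illustration. It compares the per-partition load for Zipf-distributed keys against uniformly distributed keys.

```python
import random
from collections import Counter


def zipf_keys(num_keys, num_records, exponent=1.0, seed=42):
    """Draw record keys so that the key of rank r has weight 1 / r**exponent.

    exponent=0.0 gives a uniform key distribution; larger values give more skew.
    """
    rng = random.Random(seed)
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_keys + 1)]
    return rng.choices(range(num_keys), weights=weights, k=num_records)


def hash_partition(keys, num_partitions):
    """Count records per partition under plain hash-modulo partitioning.

    Assumption: this stands in for "regular hash-partitioning" in general,
    not for the partitioner Flink actually uses.
    """
    load = Counter({p: 0 for p in range(num_partitions)})
    for key in keys:
        load[hash(key) % num_partitions] += 1
    return load


def imbalance(load):
    """Ratio of the busiest partition's load to the mean load (1.0 = balanced)."""
    mean = sum(load.values()) / len(load)
    return max(load.values()) / mean


if __name__ == "__main__":
    skewed = hash_partition(zipf_keys(1000, 100_000, exponent=1.0), 8)
    uniform = hash_partition(zipf_keys(1000, 100_000, exponent=0.0), 8)
    print("skewed keys, max/mean load: %.2f" % imbalance(skewed))
    print("uniform keys, max/mean load: %.2f" % imbalance(uniform))
```

In this sketch the partition that receives the most frequent key ends up with noticeably more than its fair share of records, whereas the uniform case stays close to balanced. Adding partitions spreads the remaining keys more thinly, but all records for the hottest key still land on a single partition.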