Sebastian Kruse created FLINK-2193:
--------------------------------------

             Summary: Partial shuffling
                 Key: FLINK-2193
                 URL: https://issues.apache.org/jira/browse/FLINK-2193
             Project: Flink
          Issue Type: Improvement
            Reporter: Sebastian Kruse
            Priority: Minor

In some cases, it would come in handy to shuffle only some specific elements of a dataset instead of all elements. This is currently not achievable with a custom partitioner. Use cases for such a feature are:

* Load balancing: split up elements that require high processing load and distribute the splits among all task managers.
* Evolutionary algorithms: A well-suited EA model for Map/Reduce-like platforms is the island model, where each worker maintains and evolves its own population. From time to time, individuals among the population need to be exchanged. Shuffling all the complete populations is not necessary, though.

A presumably easy way to achieve this feature could be to provide the local partition number in deployed partitioners, similar to {{RichFunction#getRuntimeContext()#getIndexOfThisSubtask()}}.
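
To make the proposal more concrete, here is a rough sketch of the island-model use case in plain Java, without any Flink dependency. Everything in it is hypothetical and only illustrates the idea: {{SubtaskAwarePartitioner}}, {{Individual}}, {{RING_MIGRATION}} and {{shuffle}} are names made up for this example and are not part of Flink. The only assumption is that a partitioner is told the index of the subtask it runs on, in addition to the element and the number of partitions that Flink's {{Partitioner#partition(key, numPartitions)}} receives today.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PartialShuffleSketch {

    /** Hypothetical partitioner variant that also sees the index of the sending subtask. */
    interface SubtaskAwarePartitioner<T> {
        int partition(T element, int sourceSubtask, int numPartitions);
    }

    /** Island-model individual; only those flagged as migrants should move. */
    static final class Individual {
        final int id;
        final boolean migrant;

        Individual(int id, boolean migrant) {
            this.id = id;
            this.migrant = migrant;
        }
    }

    /** Non-migrants stay on their island; migrants move to the next island in a ring. */
    static final SubtaskAwarePartitioner<Individual> RING_MIGRATION =
        (individual, source, n) -> individual.migrant ? (source + 1) % n : source;

    /** Simulates the exchange: input.get(i) holds the elements currently on subtask i. */
    static <T> List<List<T>> shuffle(List<List<T>> input, SubtaskAwarePartitioner<T> p) {
        int n = input.size();
        List<List<T>> output = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            output.add(new ArrayList<>());
        }
        for (int source = 0; source < n; source++) {
            for (T element : input.get(source)) {
                // Knowing the source index lets the partitioner answer "stay here".
                int target = p.partition(element, source, n);
                output.get(target).add(element);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<List<Individual>> islands = Arrays.asList(
            Arrays.asList(new Individual(0, false), new Individual(1, true)),
            Arrays.asList(new Individual(2, false), new Individual(3, true)),
            Arrays.asList(new Individual(4, false), new Individual(5, true)));

        List<List<Individual>> result = shuffle(islands, RING_MIGRATION);
        for (int i = 0; i < result.size(); i++) {
            StringBuilder ids = new StringBuilder();
            for (Individual ind : result.get(i)) {
                ids.append(ind.id).append(' ');
            }
            System.out.println("island " + i + ": " + ids.toString().trim());
        }
    }
}
```

With three islands of two individuals each, only the three flagged individuals change islands; the others keep their original partition. The same pattern would presumably cover the load-balancing case, with the partitioner returning the source index for cheap elements and some other target for the splits of expensive ones.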