I'm writing a utility to split a data set randomly into several parts and return an Array of data sets. However, whenever I operate on any of these subsets, the program basically starts again from the original data set, and the split is performed again.
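To make the problem concrete, here is a minimal plain-Java sketch of what I think is happening. It does not use the Flink API: `SplitSketch` and `part` are hypothetical names, and the seeded `java.util.Random` with one draw per element is an assumption about how such a split could work, not the code in the pull request. Each call to `part` recomputes its subset from scratch, as a re-executed filter would.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not Flink code.
class SplitSketch {

    // Recomputes one half of a two-way random split from scratch:
    // one seeded random draw per element, consumed in arrival order.
    static List<Integer> part(List<Integer> data, long seed, boolean first) {
        Random rnd = new Random(seed);
        List<Integer> out = new ArrayList<>();
        for (int x : data) {
            boolean inFirst = rnd.nextDouble() < 0.5;
            if (inFirst == first) {
                out.add(x);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        List<Integer> reversed = new ArrayList<>(data);
        Collections.reverse(reversed);
        long seed = 42L;

        // Same seed, same order: both recomputations agree on every draw.
        System.out.println("same order:      "
                + part(data, seed, true) + " / " + part(data, seed, false));

        // Same seed, different order: the draws pair up with different elements.
        System.out.println("different order: "
                + part(data, seed, true) + " / " + part(reversed, seed, false));
    }
}
```

With the same seed and the same element order, the two parts are complementary. If the second recomputation sees the elements in a different order, the same draws are paired with different elements, so an element can land in both parts or in neither.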
To ensure that these subsets are mutually exclusive, we need not only to generate the exact same sequence of random numbers, but also to ensure that the elements arrive at the filter job in exactly the same order. How do I achieve this?

Here's the code I've written: https://github.com/apache/flink/pull/921/files

Regards
Sachin

-- 
Sachin Goel
Computer Science, IIT Delhi
m. +91-9871457685