I'm writing a utility to split a data set randomly into several parts and
return an Array of data sets. However, whenever I operate on any of
these *subsets*, the program basically starts from the original data set,
and the split is performed again.

To ensure that these subsets are mutually exclusive, we need not only to
generate the exact same sequence of random numbers each time, but also to
ensure that the elements arrive at the filter job in exactly the same
order. How do I achieve this?
Here's the code I've written: https://github.com/apache/flink/pull/921/files
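
To make the requirement concrete, here is a minimal sketch in plain Python.
It is not Flink code and not the code from the pull request. The names
`bucket` and `subset`, the fractions and the seed are made up for
illustration. Each subset is recomputed from scratch, the way a lazily
evaluated filter would be, and the parts only stay disjoint because every
pass uses the same seed and sees the elements in the same order.

```python
import random


def bucket(r, fractions):
    """Map a random draw r in [0, 1) to the index of the part it falls into."""
    cumulative = 0.0
    for i, f in enumerate(fractions):
        cumulative += f
        if r < cumulative:
            return i
    # Guard against floating-point rounding when the fractions sum to ~1.0.
    return len(fractions) - 1


def subset(data, fractions, index, seed):
    """Recompute a single part from the original data (hypothetical helper).

    A fresh generator with the same seed is created on every call, so the
    n-th element always receives the n-th random draw, provided the elements
    are iterated in the same order each time.
    """
    rng = random.Random(seed)
    return [x for x in data if bucket(rng.random(), fractions) == index]


data = list(range(1000))
fractions = [0.5, 0.3, 0.2]

# Each part is computed by an independent pass over the original data.
parts = [subset(data, fractions, i, seed=42) for i in range(len(fractions))]
```

With an identical seed and identical element order, every element lands in
exactly one part. If the elements reach one of the passes in a different
order, an element can receive a different draw in that pass, and the parts
may then overlap or miss elements.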

Regards
Sachin

-- Sachin Goel
Computer Science, IIT Delhi
m. +91-9871457685
