Hi,

If you're looking for an implementation, I wrote one in the context of random k-means initialization. Take a look at the weightedFit function:
https://github.com/sachingoel0101/flink/blob/clustering_initializations/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/clustering/KMeansRandomInit.scala
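
For readers who only want the idea, here is a minimal sketch of single-pass (reservoir) sampling in plain Python. It is not the code from the linked file and uses no Flink API; the function names, the parameter k (the sample size) and the representation of partitions as plain lists are all assumptions made for illustration. The second function shows one way to pick ONE element uniformly across partitions without shuffling: each partition keeps a single candidate plus its element count, and the final pick is weighted by those counts.

```python
import random


def reservoir_sample(items, k, rng=random):
    """Return up to k items chosen uniformly at random in a single pass."""
    reservoir = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def pick_one_from_partitions(partitions, rng=random):
    """Pick one element uniformly from several partitions (hypothetical helper)."""
    candidates = []  # one (element, partition size) pair per non-empty partition
    for part in partitions:
        chosen, count = None, 0
        for item in part:
            count += 1
            # Keep the current item with probability 1 / count.
            if rng.randrange(count) == 0:
                chosen = item
        if count > 0:
            candidates.append((chosen, count))
    if not candidates:
        return None
    elements, weights = zip(*candidates)
    # Weighting by partition size makes every element equally likely overall.
    return rng.choices(elements, weights=weights, k=1)[0]
```

Only one candidate and one count per partition have to be brought together at the end, so the full data set never moves. With k = 1, reservoir_sample covers the single-element case on one partition.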

Note that numCluster is the number of points you need in the sample.

Regards
Sachin Goel

On Tue, Jun 16, 2015 at 2:45 PM, Maximilian Alber <alber.maximil...@gmail.com> wrote:

> Thanks!
> Cheers,
> Max
>
> On Tue, Jun 16, 2015 at 11:01 AM, Till Rohrmann <trohrm...@apache.org> wrote:
>
>> This might help you [1].
>>
>> Cheers,
>> Till
>>
>> [1] http://stackoverflow.com/questions/2514061/how-to-pick-random-small-data-samples-using-map-reduce
>>
>> On Tue, Jun 16, 2015 at 10:16 AM Maximilian Alber <alber.maximil...@gmail.com> wrote:
>>
>>> Hi Flinksters,
>>>
>>> again a similar problem. I would like to choose ONE random element out
>>> of a data set, without shuffling the whole set. Again I would like to have
>>> the element (mathematically) randomly chosen.
>>>
>>> Thanks!
>>> Cheers,
>>> Max