Bisecting K-Means - Working with intermediate results as DataSets

2016-08-28 Thread Adrian Bartnik
Hi, I am working on implementing a variant of the k-means algorithm, namely Bisecting K-means [1]. The basic premise is to run the original k-means algorithm on increasingly smaller subsets of the original input data. In each step of the outer loop, it splits the current cluster in 2 new sma
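For readers unfamiliar with the outer loop described in this post, here is a minimal sketch in plain Scala collections, not the Flink DataSet API and not the poster's code. The preview is cut off, so several details are assumptions: the largest cluster is the one split at each step, the two starting centroids are the first point and the point farthest from it, a fixed number of iterations is used, and all input points are distinct. The names `twoMeans`, `bisect` and `BisectingKMeansSketch` are hypothetical.

```scala
object BisectingKMeansSketch {
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def dist2(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Component-wise mean of a non-empty set of points.
  def mean(ps: Seq[Point]): Point =
    ps.transpose.map(_.sum / ps.size).toVector

  // Assign every point to the nearer of two centroids.
  def assign(ps: Seq[Point], c0: Point, c1: Point): (Seq[Point], Seq[Point]) =
    ps.partition(p => dist2(p, c0) <= dist2(p, c1))

  // Ordinary k-means with k = 2 on one cluster (assumes >= 2 distinct points).
  def twoMeans(ps: Seq[Point], iterations: Int = 10): (Seq[Point], Seq[Point]) = {
    val c0 = ps.head
    val c1 = ps.maxBy(p => dist2(p, c0)) // assumed initialisation: farthest point
    var (left, right) = assign(ps, c0, c1)
    for (_ <- 1 until iterations) {
      val (l, r) = assign(ps, mean(left), mean(right))
      // Keep the previous split if an update would empty one side.
      if (l.nonEmpty && r.nonEmpty) { left = l; right = r }
    }
    (left, right)
  }

  // Outer loop: keep splitting until k clusters exist.
  def bisect(ps: Seq[Point], k: Int): Seq[Seq[Point]] = {
    require(ps.distinct.size == ps.size && k >= 1 && k <= ps.size,
      "sketch assumes distinct points and 1 <= k <= number of points")
    var clusters = Seq(ps)
    while (clusters.size < k) {
      val target = clusters.maxBy(_.size) // assumed choice: split the largest cluster
      val (a, b) = twoMeans(target)
      clusters = clusters.filterNot(_ eq target) ++ Seq(a, b)
    }
    clusters
  }

  def main(args: Array[String]): Unit = {
    val points: Seq[Point] = Seq(
      Vector(0.0, 0.0), Vector(0.0, 1.0),
      Vector(10.0, 10.0), Vector(10.0, 11.0),
      Vector(20.0, 0.0), Vector(21.0, 0.0))
    bisect(points, 3).foreach(c => println(c.mkString(", ")))
  }
}
```

Each intermediate cluster here is an ordinary in-memory `Seq`. In a Flink job the same intermediate results would be DataSets, which is the part the thread subject asks about and which this sketch does not address.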

Flink program with for loop yields wrong results when run in parallel

2016-07-04 Thread Adrian Bartnik
Hi, I have a Flink program that outputs wrong results once I set the parallelism to a value larger than 1. If I run the program with parallelism 1, everything works fine. The algorithm works on one input dataset, which will iteratively be split until the desired output split size is reach

NoSuchMethodError when using the Flink Gelly library with Scala

2016-05-06 Thread Adrian Bartnik
Hi, I am trying to run the code examples from the Gelly documentation, in particular this code: import org.apache.flink.api.scala._ import org.apache.flink.graph.generator.GridGraph object SampleObject { def main(args: Array[String]) { val env = ExecutionEnvironment.getExecutionEnviron