Hi Natia,

As I understand it, OnlineKMeans processes data in the same order as the input data.
Are you running OnlineKMeans one data point at a time, starting from a random initial KMeansModel? Could you use a fixed initial model, following [1], and try again?

[1] https://github.com/apache/flink-ml/blob/239788f2b1f1f3a4e55ca112517980b598705a15/flink-ml-lib/src/test/java/org/apache/flink/ml/clustering/OnlineKMeansTest.java#L354

Jing Ge <j...@ververica.com> wrote on Fri, Jun 3, 2022 at 17:04:

> Hi,
>
> It seems like an evaluation with a small dataset. In this case, would you
> like to share your data sample and code? In addition, have you tried KMeans
> with the same dataset and got inconsistent results too?
>
> Best regards,
> Jing
>
> On Fri, Jun 3, 2022 at 4:29 AM Natia Chachkhiani <
> natia.chachkhia...@gmail.com> wrote:
>
>> Hi,
>>
>> I am running OnlineKmeans from flink-ml repo on a small dataset. I've
>> noticed that I don't get consistent results, assignments to clusters,
>> across different runs. I have set both parallelism and globalBatchSize to 1.
>> I am doing simple fit and transform on each data point ingested. Is the
>> order of processing not guaranteed? Or am I missing something?
>>
>> Thanks,
>> Natia
>>
>

--
best,
Zhipeng
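
For illustration, here is a minimal sketch of why the initial model matters even when the processing order is fixed. It is a toy one-dimensional online k-means in plain Java, not the Flink ML implementation: the class name OnlineKMeansSketch, the run method, the update rule (a running mean in which the initial centroid counts as one point), and the sample data are all assumptions made for this example. The same input order with the same initial centroids gives the same assignments on every run, while a different initial model can change the cluster labels.

```java
import java.util.Arrays;

// Toy one-dimensional online k-means for illustration only.
// This is NOT the Flink ML OnlineKMeans implementation.
class OnlineKMeansSketch {

    // Processes the points one at a time, in input order, and returns the
    // cluster index assigned to each point.
    static int[] run(double[] points, double[] initialCentroids) {
        double[] centroids = initialCentroids.clone();
        // Each initial centroid is treated as if it were one observed point.
        int[] counts = new int[centroids.length];
        Arrays.fill(counts, 1);
        int[] assignments = new int[points.length];

        for (int i = 0; i < points.length; i++) {
            // Find the nearest centroid; ties go to the lower index.
            int nearest = 0;
            for (int c = 1; c < centroids.length; c++) {
                double candidate = Math.abs(points[i] - centroids[c]);
                double best = Math.abs(points[i] - centroids[nearest]);
                if (candidate < best) {
                    nearest = c;
                }
            }
            assignments[i] = nearest;

            // Running-mean update of the chosen centroid.
            counts[nearest]++;
            centroids[nearest] += (points[i] - centroids[nearest]) / counts[nearest];
        }
        return assignments;
    }

    public static void main(String[] args) {
        double[] points = {1.0, 1.2, 9.0, 9.3, 1.1, 8.8};

        // Fixed initial model: identical assignments on every run.
        System.out.println(Arrays.toString(run(points, new double[] {0.0, 10.0})));
        System.out.println(Arrays.toString(run(points, new double[] {0.0, 10.0})));

        // Different initial model: same grouping here, but the labels are swapped.
        System.out.println(Arrays.toString(run(points, new double[] {10.0, 0.0})));
    }
}
```

With a randomly drawn initial model, either labeling (or a different grouping entirely) can occur from run to run, which is one possible source of the inconsistency described in this thread.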