Hi Natia,

As I understand it, OnlineKMeans processes records in the same order as the
input data.

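Are you running OnlineKMeans on one data point at a time with a randomly
initialized KMeansModel? A random initial model may differ between runs, which
could explain the inconsistent cluster assignments. Could you try a fixed
initial model, following [1]?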

[1]
https://github.com/apache/flink-ml/blob/239788f2b1f1f3a4e55ca112517980b598705a15/flink-ml-lib/src/test/java/org/apache/flink/ml/clustering/OnlineKMeansTest.java#L354
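
To illustrate why the initial model matters, here is a minimal plain-Python
sketch. It does not use the Flink ML API, and the names online_kmeans and
random_init are made up for this example. It assumes a batch size of 1, a decay
factor of 1.0, and an initial weight of 1.0 per centroid. With the same input
order and the same initial centroids, the assignments are reproducible; changing
only the initial centroids changes the assignments.

```python
import random


def online_kmeans(points, init_centroids, decay=1.0):
    """Process points one at a time; return the cluster index chosen for each."""
    centroids = [list(c) for c in init_centroids]
    weights = [1.0] * len(centroids)  # assumed initial weight per centroid
    assignments = []
    for p in points:
        # Assign the point to the nearest centroid (squared Euclidean distance).
        k = min(
            range(len(centroids)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
        )
        assignments.append(k)
        # Decay the old weights, then move the chosen centroid toward the point.
        weights = [w * decay for w in weights]
        weights[k] += 1.0
        step = 1.0 / weights[k]
        centroids[k] = [c + step * (a - c) for a, c in zip(p, centroids[k])]
    return assignments


def random_init(k, dim, seed):
    """Hypothetical helper: k random centroids, reproducible only via the seed."""
    rng = random.Random(seed)
    return [[rng.uniform(0.0, 10.0) for _ in range(dim)] for _ in range(k)]


points = [(0.0, 0.0), (0.1, 0.1), (9.0, 9.0), (9.1, 9.1)]

# Same order, same fixed initial model: identical assignments on every run.
fixed = [(0.0, 0.0), (10.0, 10.0)]
print(online_kmeans(points, fixed))    # [0, 0, 1, 1]

# Same order, different initial model: the cluster labels come out swapped.
swapped = [(10.0, 10.0), (0.0, 0.0)]
print(online_kmeans(points, swapped))  # [1, 1, 0, 0]

# A random initial model is only reproducible if its seed is fixed.
print(online_kmeans(points, random_init(2, 2, seed=0)))
```

If the initial model is drawn without a fixed seed, each run can start from
different centroids, so the assignments can differ even though the processing
order is the same.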

On Fri, Jun 3, 2022 at 17:04, Jing Ge <j...@ververica.com> wrote:

> Hi,
>
> It seems like an evaluation with a small dataset. In this case, would you
> like to share your data sample and code? In addition, have you tried KMeans
> with the same dataset and got inconsistent results too?
>
> Best regards,
> Jing
>
> On Fri, Jun 3, 2022 at 4:29 AM Natia Chachkhiani <
> natia.chachkhia...@gmail.com> wrote:
>
>> Hi,
>>
>> I am running OnlineKmeans from flink-ml repo on a small dataset. I've
>> noticed that I don't get consistent results, assignments to clusters,
>> across different runs. I have set both parallelism and globalBatchSize to 1.
>> I am doing simple fit and transform on each data point ingested. Is the
>> order of processing not guaranteed? Or am I missing something?
>>
>> Thanks,
>> Natia
>>
>

-- 
best,
Zhipeng
