Hi All,

Any ideas on this?

Thanks
Stuti Awasthi

From: Stuti Awasthi
Sent: Wednesday, May 14, 2014 6:20 PM
To: user@spark.apache.org
Subject: Understanding epsilon in KMeans

Hi All,

I wanted to understand the functionality of epsilon in KMeans in Spark MLlib.

As per the documentation:
"distance threshold within which we've consider centers to have converged. If 
all centers move less than this Euclidean distance, we stop iterating one run."

I have assumed that clustering stops once the centers move less than the 
epsilon value, but then what is meant by "we stop iterating one run"? 
Suppose I set maxIterations=10 and epsilon=0.1, and after only 2 iterations 
the epsilon condition is met, i.e. the centers are now moving by less than 
0.1.

What happens then? Are all 10 iterations completed, or does the clustering 
stop?
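
The scenario above can be made concrete with a minimal sketch of the stopping 
rule as the quoted documentation describes it. This is a toy 1-D k-means in 
plain Scala, not MLlib's code; EpsilonSketch, runKMeans1D and the sample 
points are hypothetical names and data chosen only for illustration.

```scala
object EpsilonSketch {
  // Toy 1-D Lloyd iteration (hypothetical helper, not part of MLlib).
  // Returns the final centers and the number of iterations actually executed.
  def runKMeans1D(points: Seq[Double], init: Seq[Double],
                  maxIterations: Int, epsilon: Double): (Seq[Double], Int) = {
    var centers = init
    var iteration = 0
    var converged = false
    while (iteration < maxIterations && !converged) {
      // Assign each point to the index of its nearest center.
      val assigned = points.groupBy { p =>
        centers.indices.minBy(i => math.abs(p - centers(i)))
      }
      // Recompute each center as the mean of its points; keep it if it has none.
      val updated = centers.indices.map { i =>
        assigned.get(i).map(ps => ps.sum / ps.size).getOrElse(centers(i))
      }
      // Converged only if every center moved less than epsilon.
      converged = centers.zip(updated).forall { case (a, b) => math.abs(a - b) < epsilon }
      centers = updated
      iteration += 1
    }
    (centers, iteration)
  }

  def main(args: Array[String]): Unit = {
    val points = Seq(1.0, 1.2, 0.8, 9.0, 9.2, 8.8)
    val (centers, iterations) =
      runKMeans1D(points, Seq(0.0, 10.0), maxIterations = 10, epsilon = 0.1)
    println(s"centers=$centers iterations=$iterations")
  }
}
```

With these well-separated points the loop exits after 2 of the 10 allowed 
iterations, because in the second pass no center moves at all. The phrase 
"one run" presumably refers to a single run from one initialization, since 
KMeans.train() also accepts a runs count.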

My second question concerns Mahout, which has a configuration parameter 
"Convergence Threshold (cd)" described as: "in an iteration, the centroids 
don't move more than this distance, no further iterations are done and 
clustering stops."

So are epsilon and cd equivalent?

Third question:
How do I pass epsilon as a configurable parameter? KMeans.train() does not 
provide a way, but in the code I can see a "setEpsilon" method. So if I want 
to pass epsilon=0.1, how can I do that?
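
One way this might look is sketched below, assuming a Spark version (1.0 or 
later) where MLlib consumes RDD[Vector] and where the setters on the KMeans 
class are publicly callable. The object name KMeansEpsilonExample, the helper 
trainWithEpsilon and the sample points are hypothetical.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
import org.apache.spark.mllib.linalg.Vectors

object KMeansEpsilonExample {
  // Hypothetical helper: builds a KMeans instance with the setters instead of
  // calling KMeans.train(), so that epsilon can be supplied explicitly.
  def trainWithEpsilon(sc: SparkContext, epsilon: Double): KMeansModel = {
    // Tiny in-memory data set, for illustration only.
    val data = sc.parallelize(Seq(
      Vectors.dense(1.0, 1.0), Vectors.dense(1.2, 0.8),
      Vectors.dense(9.0, 9.0), Vectors.dense(8.8, 9.2)
    )).cache()

    new KMeans()
      .setK(2)
      .setMaxIterations(10)
      .setEpsilon(epsilon)   // convergence threshold from the question
      .run(data)
  }

  def main(args: Array[String]): Unit = {
    // Local mode, assumed here only so the sketch can run without a cluster.
    val sc = new SparkContext("local[2]", "KMeansEpsilonExample")
    try {
      val model = trainWithEpsilon(sc, 0.1)
      model.clusterCenters.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The setters return the KMeans instance itself, so the calls can be chained 
before run() is invoked on the input RDD.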

Pardon my ignorance

Thanks
Stuti Awasthi




