Hi All, any ideas on this?
Thanks
Stuti Awasthi

From: Stuti Awasthi
Sent: Wednesday, May 14, 2014 6:20 PM
To: user@spark.apache.org
Subject: Understanding epsilon in KMeans

Hi All,

I wanted to understand the functionality of epsilon in KMeans in Spark MLlib. As per the documentation:

"distance threshold within which we've consider centers to have converged. If all centers move less than this Euclidean distance, we stop iterating one run."

I have assumed that if the centers move less than the epsilon value, then clustering stops. But then what is meant by "we stop iterating one run"?

Suppose I have given maxIterations=10 and epsilon=0.1, and assume that after only 2 iterations the epsilon condition is met, i.e. the centers are now moving less than 0.1. What happens then? Are all 10 iterations completed, or does the clustering stop?

My 2nd query: in Mahout there is a configuration param "Convergence Threshold (cd)", which states: "in an iteration, the centroids don't move more than this distance, no further iterations are done and clustering stops." So are epsilon and cd similar?

3rd query: how do I pass epsilon as a configurable param? KMeans.train() does not provide a way, but in the code I can see "setEpsilon" as a method. So if I want to pass the parameter as epsilon=0.1, how may I do that?

Pardon my ignorance.

Thanks
Stuti Awasthi
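
To make the first question concrete, here is a minimal sketch of the early-stopping reading described above, with maxIterations=10 and epsilon=0.1. It is plain Python and is not Spark's implementation; the function names `closest` and `lloyd` and the sample points are made up for illustration only. It shows what "stop as soon as every center moves less than epsilon" would look like for a single run.

```python
import math


def closest(point, centers):
    """Index of the center nearest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def lloyd(points, centers, max_iterations, epsilon):
    """Hypothetical single-run k-means loop; returns (centers, iterations_used)."""
    centers = [tuple(c) for c in centers]
    for iteration in range(1, max_iterations + 1):
        # Assignment step: group each point with its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            groups[closest(p, centers)].append(p)

        # Update step: move each center to the mean of its group.
        # An empty group keeps its old center (an assumption of this sketch).
        new_centers = []
        for old, group in zip(centers, groups):
            if group:
                new_centers.append(tuple(sum(x) / len(group) for x in zip(*group)))
            else:
                new_centers.append(old)

        # Largest distance any center moved in this iteration.
        moved = max(math.dist(o, n) for o, n in zip(centers, new_centers))
        centers = new_centers

        # Early stop: every center moved less than epsilon.
        if moved < epsilon:
            return centers, iteration
    return centers, max_iterations


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
centers, used = lloyd(points, [(0.0, 0.0), (10.0, 0.0)],
                      max_iterations=10, epsilon=0.1)
print(used, centers)
```

Under this reading, the loop returns after the second iteration rather than running all 10, because the centers do not move at all in iteration 2. Whether Spark MLlib behaves the same way is exactly what the question asks.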