Re: Understanding epsilon in KMeans

2014-05-16 Thread Brian Gawalt
> > > *From:* Stuti Awasthi > *Sent:* Wednesday, May 14, 2014 6:20 PM > *To:* user@spark.apache.org > *Subject:* Understanding epsilon in KMeans > > > > Hi All, > > > > I wanted to understand the functionality of epsilon in KMeans in Spark > MLlib. >

Re: Understanding epsilon in KMeans

2014-05-16 Thread Krishna Sankar
Stuti, - The two numbers are used in different contexts, but finally end up on two sides of an && operator. - A parallel K-Means consists of multiple iterations, which in turn consist of moving centroids around. A centroid would be deemed stabilized when the root square distance between suc
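
To make the "two sides of an && operator" point concrete, here is a minimal plain-Python sketch of a single k-means run. It is not the MLlib source. It assumes the two numbers are an iteration cap and epsilon, and the names `lloyd`, `max_iterations` and `epsilon` are made up for illustration.

```python
import math

def lloyd(points, centers, max_iterations, epsilon):
    """One k-means run: stop on the iteration cap OR once centers stop moving."""
    centers = [tuple(c) for c in centers]
    iteration = 0
    converged = False
    # The two stopping criteria meet here, on either side of the "and".
    while iteration < max_iterations and not converged:
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its points
        # (an empty cluster keeps its old center in this sketch).
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(old)
        # Converged only if every center moved less than epsilon.
        converged = all(math.dist(o, n) < epsilon for o, n in zip(centers, new_centers))
        centers = new_centers
        iteration += 1
    return centers, iteration, converged

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, iterations, converged = lloyd(points, [(0, 0), (10, 10)], max_iterations=20, epsilon=1e-4)
print(centers, iterations, converged)
```

With a small iteration cap the loop can exit before the centers settle, in which case `converged` comes back False.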

Re: Understanding epsilon in KMeans

2014-05-16 Thread Long Pham
:29 PM, "Stuti Awasthi" wrote: > Hi All, > > > > Any ideas on this ?? > > > > Thanks > > Stuti Awasthi > > > > *From:* Stuti Awasthi > *Sent:* Wednesday, May 14, 2014 6:20 PM > *To:* user@spark.apache.org > *Subject:* Understanding ep

Re: Understanding epsilon in KMeans

2014-05-16 Thread Xiangrui Meng
2014 at 8:35 PM, Stuti Awasthi wrote: > Hi All, > > > > Any ideas on this ?? > > > > Thanks > > Stuti Awasthi > > > > From: Stuti Awasthi > Sent: Wednesday, May 14, 2014 6:20 PM > To: user@spark.apache.org > Subject: Understanding epsilon i

Re: Understanding epsilon in KMeans

2014-05-16 Thread Sean Owen
It is running k-means many times, independently, from different random starting points in order to pick the best clustering. Convergence ends one run, not all of them. Yes, epsilon should be the same as "convergence threshold" elsewhere. You can set epsilon if you instantiate KMeans directly. Mayb
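
A rough plain-Python sketch of the "many independent runs, keep the best" idea described above. It is an illustration only, not the MLlib implementation. The helper names (`one_run`, `best_of_runs`, `cost`), the default values, and the use of summed squared distance as the yardstick for "best" are all assumptions.

```python
import math
import random

def cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)

def one_run(points, k, rng, max_iterations=20, epsilon=1e-4):
    """A single k-means run from a random start."""
    centers = rng.sample(points, k)
    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = [
            tuple(sum(x) / len(m) for x in zip(*m)) if m else c
            for c, m in zip(centers, clusters)
        ]
        done = all(math.dist(c, n) < epsilon for c, n in zip(centers, new_centers))
        centers = new_centers
        if done:
            break  # convergence ends this run only, not the others
    return centers

def best_of_runs(points, k, runs, seed=0):
    """Run k-means several times independently and keep the cheapest result."""
    rng = random.Random(seed)
    candidates = [one_run(points, k, rng) for _ in range(runs)]
    return min(candidates, key=lambda cs: cost(points, cs))

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
best = best_of_runs(points, k=2, runs=5)
print(sorted(best), cost(points, best))
```

Each run checks epsilon on its own, so one run finishing early does not stop the remaining runs.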

RE: Understanding epsilon in KMeans

2014-05-15 Thread Stuti Awasthi
Hi All, Any ideas on this? Thanks Stuti Awasthi From: Stuti Awasthi Sent: Wednesday, May 14, 2014 6:20 PM To: user@spark.apache.org Subject: Understanding epsilon in KMeans Hi All, I wanted to understand the functionality of epsilon in KMeans in Spark MLlib. As per the documentation: distance

Understanding epsilon in KMeans

2014-05-14 Thread Stuti Awasthi
Hi All, I wanted to understand the functionality of epsilon in KMeans in Spark MLlib. As per the documentation: distance threshold within which we've consider centers to have converged. If all centers move less than this Euclidean distance, we stop iterating one run. Now I have assumed that if cent
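
For reference, the quoted documentation sentence can be read as the following check. This is a small plain-Python sketch, not code from Spark. The function names and the sample coordinates are made up for illustration.

```python
import math

def center_shift(old, new):
    """Euclidean distance between a center's old and new position."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(old, new)))

def has_converged(old_centers, new_centers, epsilon):
    """True only when every center moved less than epsilon."""
    return all(center_shift(o, n) < epsilon for o, n in zip(old_centers, new_centers))

old = [(0.0, 0.0), (5.0, 5.0)]
new = [(0.0, 0.00001), (5.0, 5.2)]

# The first center barely moved, but the second moved about 0.2,
# so a tight epsilon says "keep iterating" and a loose one says "stop".
print(has_converged(old, new, epsilon=1e-4))
print(has_converged(old, new, epsilon=0.5))
```

The check is over all centers at once: a single center that still moves more than epsilon keeps the run going.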