Re: K-means with large K

2014-04-28 Thread Dean Wampler
> > Dave
> >
> *From:* Chester Chen [mailto:chesterxgc...@yahoo.com]
> *Sent:* Monday, April 28, 2014 9:31 AM
> *To:* user@spark.apache.org
> *Cc:* user@spark.apache.org
> *Subject:* Re: K-means with large K
>
> David,
>
> Just curious to kn

Re: K-means with large K

2014-04-28 Thread Matei Zaharia
Try turning on the Kryo serializer as described at http://spark.apache.org/docs/latest/tuning.html. Also, are there any exceptions in the driver program’s log before this happens?
Matei
On Apr 28, 2014, at 9:19 AM, Buttler, David wrote:
> Hi,
> I am trying to run the K-means code in mllib, an
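For readers following the Kryo suggestion above, here is a minimal sketch of one way to switch the serializer on. It is not taken from the thread. It assumes a Spark release whose spark-submit accepts --conf, which may be newer than the release discussed here. The helper name kryo_submit_args and the script name kmeans_job.py are hypothetical. The property spark.serializer and the class org.apache.spark.serializer.KryoSerializer are the ones named in the Spark tuning guide.

```python
import shlex

# Fully qualified class name of Spark's Kryo-based serializer.
KRYO_SERIALIZER = "org.apache.spark.serializer.KryoSerializer"


def kryo_submit_args(app_path, extra_conf=None):
    """Build a spark-submit argument list that enables Kryo serialization.

    app_path and extra_conf are illustrative parameters (assumptions),
    not part of any Spark API.
    """
    conf = {"spark.serializer": KRYO_SERIALIZER}
    conf.update(extra_conf or {})
    args = ["spark-submit"]
    # Sort keys so the generated command line is deterministic.
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(kryo_submit_args("kmeans_job.py")))
```

The same property can also be set programmatically on the job's configuration object or in spark-defaults.conf. Building the command line is only one option, chosen here because it needs no Spark installation to try out.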

RE: K-means with large K

2014-04-28 Thread Buttler, David
@spark.apache.org
Cc: user@spark.apache.org
Subject: Re: K-means with large K
David,
Just curious to know what kind of use cases demand such large k clusters
Chester
Sent from my iPhone
On Apr 28, 2014, at 9:19 AM, "Buttler, David" <buttl...@llnl.gov> wrote:
Hi,
I am trying to

Re: K-means with large K

2014-04-28 Thread Chester Chen
David,
Just curious to know what kind of use cases demand such large k clusters
Chester
Sent from my iPhone
On Apr 28, 2014, at 9:19 AM, "Buttler, David" wrote:
> Hi,
> I am trying to run the K-means code in mllib, and it works very nicely with
> small K (less than 1000). However, when I

K-means with large K

2014-04-28 Thread Buttler, David
Hi, I am trying to run the K-means code in mllib, and it works very nicely with small K (less than 1000). However, when I try for a larger K (I am looking for 2000-4000 clusters), it seems like the code gets part way through (perhaps just the initialization step) and freezes. The compute nodes