(-incubator, +user) computeCost is a method of KMeansModel, not KMeans. On first glance it looks like model should be a KMeansModel, but Scala says it's not. The problem is...
val model = new KMeans()
  .setInitializationMode("k-means||")
  .setK(2)
  .setMaxIterations(2)
  .setEpsilon(1e-4)
  .setRuns(1)
  .run(train)

The first line is a complete statement, and you end up executing just 'val model = new KMeans()'. I forget in which cases the Scala compiler will interpret it the way you intend and when it won't, but, to avoid doubt, I put the periods on the end of the preceding line. That way the lines can't be interpreted as concluding a statement.

On Fri, Aug 8, 2014 at 3:18 PM, AlexanderRiggers <alexander.rigg...@gmail.com> wrote:
> Thanks for your answers. I added some lines to my code and it went through,
> but I get an error message for my computeCost call now...
>
> scala> val WSSSE = model.computeCost(train)
> 14/08/08 15:48:42 WARN BlockManagerMasterActor: Removing BlockManager BlockManagerId(<driver>,
> 192.168.0.33, 49242, 0) with no recent heart beats: 156207ms exceeds 45000ms
> 14/08/08 15:48:42 INFO BlockManager: BlockManager re-registering with master
> 14/08/08 15:48:42 INFO BlockManagerMaster: Trying to register BlockManager
> 14/08/08 15:48:42 INFO BlockManagerInfo: Registering block manager 192.168.0.33:49242 with 303.4 MB RAM
> 14/08/08 15:48:42 INFO BlockManagerMaster: Registered BlockManager
> 14/08/08 15:48:42 INFO BlockManager: Reporting 0 blocks to the master.
>
> <console>:30: error: value computeCost is not a member of
> org.apache.spark.mllib.clustering.KMeans
>        val WSSSE = model.computeCost(train)
>
> computeCost should be a member of KMeans, shouldn't it?
>
> My whole code is here:
>
> import org.apache.spark.SparkContext
> import org.apache.spark.SparkContext._
> import org.apache.spark.SparkConf
>
> val conf = new SparkConf()
>   .setMaster("local")
>   .setAppName("Kmeans")
>   .set("spark.executor.memory", "2g")
> val sc = new SparkContext(conf)
>
> import org.apache.spark.mllib.clustering.KMeans
> import org.apache.spark.mllib.clustering.KMeansModel
> import org.apache.spark.mllib.linalg.Vectors
>
> // Load and parse the data
> val data = sc.textFile("data/outkmeanssm.txt")
> val parsedData = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble)))
> val train = parsedData.repartition(20).cache()
>
> // Set model and run it
> val model = new KMeans()
>   .setInitializationMode("k-means||")
>   .setK(2)
>   .setMaxIterations(2)
>   .setEpsilon(1e-4)
>   .setRuns(1)
>   .run(train)
>
> // Evaluate clustering by computing Within Set Sum of Squared Errors
> val WSSSE = model.computeCost(train)
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/KMeans-Input-Format-tp11654p11788.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
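For anyone hitting the same thing: the trailing-period style described above can be seen without Spark at all. Below is a minimal sketch with a hypothetical Builder class standing in for KMeans (the names Builder/setK/setMaxIterations are made up for illustration):

```scala
// Hypothetical stand-in for KMeans, just to illustrate Scala's statement
// inference; this is not Spark code.
class Builder(val k: Int, val iters: Int) {
  def setK(n: Int): Builder = new Builder(n, iters)
  def setMaxIterations(n: Int): Builder = new Builder(k, n)
}

// With the period at the end of each line, no line is a complete statement,
// so even a line-by-line reader like the Spark shell keeps the chain open
// and the whole expression is evaluated as one statement.
val model = new Builder(0, 0).
  setK(2).
  setMaxIterations(2)

println(model.k + " " + model.iters)
```

If the same chain is pasted into the shell with the dots at the start of each line, the first line `new Builder(0, 0)` is already complete, gets evaluated on its own, and model ends up being just the bare builder, which is exactly why model above was a KMeans rather than a KMeansModel.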