Thanks for your response.  Is there a reason why this thread isn't
appearing on the mailing list?  So far, I only see my post, with no
answers, although I have received 2 answers via email.  It would be nice if
other people could see these answers as well.
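
In the meantime, for anyone who finds this thread in the archive: since my existing code only handles a plain R data.frame (as the reply quoted below explains), the simplest workaround I can see is to collect() the SparkR DataFrame back into a local data.frame before handing it to the clustering code. This is only a minimal sketch, assuming the same galileo() call and variable names from my original post; collect() is part of the standard SparkR API, and the local_again name is just for illustration. It pulls the whole dataset onto the driver, so it only helps while the data still fits in local memory.

# Sketch only: bring the distributed SparkR DataFrame back to the driver as a
# plain R data.frame (assumes the data fits in local memory).
local_again <- collect(rdd)   # local_again is an illustrative name
class(local_again)            # "data.frame", unlike the S4 DataFrame object

# The existing clustering call then works the same way it does with localDF.
result <- galileo(local_again, model='hclust', dist='euclidean', link='ward', K=5)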

On Thu, Sep 17, 2015 at 2:22 AM, Sun, Rui <rui....@intel.com> wrote:

> Existing algorithms that operate on an R data.frame can't simply operate on a
> SparkR DataFrame. They have to be re-implemented on top of the SparkR
> DataFrame API.
>
> -----Original Message-----
> From: ekraffmiller [mailto:ellen.kraffmil...@gmail.com]
> Sent: Thursday, September 17, 2015 3:30 AM
> To: user@spark.apache.org
> Subject: SparkR - calling as.vector() with rdd dataframe causes error
>
> Hi,
> I have a library of clustering algorithms that I'm trying to run in the
> SparkR interactive shell. (I am working on a proof of concept for a
> document classification tool.) Each algorithm takes a term-document matrix
> in the form of a data frame. When I pass the method a local data frame, the
> clustering algorithm works correctly, but when I pass it the SparkR
> DataFrame (the rdd variable below), it gives an error trying to coerce the
> data into a vector. Here is the code that I'm calling within SparkR:
>
> # get matrix from a file
> file <- "/Applications/spark-1.5.0-bin-hadoop2.6/examples/src/main/resources/matrix.csv"
>
> # read it into a variable
> raw_data <- read.csv(file, sep = ',', header = FALSE)
>
> # convert to a local data frame
> localDF <- data.frame(raw_data)
>
> # create the distributed SparkR DataFrame (called rdd here)
> rdd <- createDataFrame(sqlContext, localDF)
>
> # call the algorithm with the localDF - this works
> result <- galileo(localDF, model='hclust', dist='euclidean', link='ward', K=5)
>
> # call with the rdd - this produces an error
> result <- galileo(rdd, model='hclust', dist='euclidean', link='ward', K=5)
>
> Error in as.vector(data) :
>   no method for coercing this S4 class to a vector
>
>
> I get the same error if I call as.vector(rdd) directly.
>
> Is there a reason why this works for localDF and not rdd?  Should I be
> doing something else to coerce the object into a vector?
>
> Thanks,
> Ellen
>
