Thanks, I understand now. But I can't find mllib.clustering.GaussianMixture#vectorMean. What version of Spark are you using?
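
For reference, here is a minimal sketch of computing these quantities locally on the values of a Vector, without going through an RDD. It works on a plain Array[Double], assuming the array is obtained with the Vector's toArray method. The object name LocalVectorStats and its methods are hypothetical helpers for illustration, not part of Spark.

```scala
// Hypothetical helper object; LocalVectorStats is not part of Spark.
object LocalVectorStats {

  /** Arithmetic mean of the elements; NaN for an empty array. */
  def mean(values: Array[Double]): Double =
    if (values.isEmpty) Double.NaN else values.sum / values.length

  /** Population variance (divides by n, not n - 1); NaN for an empty array. */
  def variance(values: Array[Double]): Double =
    if (values.isEmpty) Double.NaN
    else {
      val m = mean(values)
      values.map(x => (x - m) * (x - m)).sum / values.length
    }
}
```

With Spark on the classpath and v an org.apache.spark.mllib.linalg.Vector, this could be called as LocalVectorStats.mean(v.toArray). For a sparse vector, toArray materializes the zeros as well, so the result covers every element and not only the stored ones.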
On Thu, Jul 9, 2015 at 1:16 AM, Feynman Liang <fli...@databricks.com> wrote:

> An RDD[Double] is an abstraction for a large collection of doubles,
> possibly distributed across multiple nodes. The DoubleRDDFunctions are
> there for performing mean and variance calculations across this distributed
> dataset.
>
> In contrast, a Vector is not distributed and fits on your local machine.
> You would be better off computing these quantities on the Vector directly
> (see mllib.clustering.GaussianMixture#vectorMean for an example of how to
> compute the mean of a vector).
>
> On Tue, Jul 7, 2015 at 8:26 PM, 诺铁 <noty...@gmail.com> wrote:
>
>> Hi,
>>
>> There are some useful functions in DoubleRDDFunctions, which I can use if
>> I have an RDD[Double], e.g. mean and variance.
>>
>> Vector doesn't have such methods. How can I convert a Vector to an
>> RDD[Double]? Or, maybe better, can I call mean directly on a Vector?