Thanks, I understand now.
But I can't find mllib.clustering.GaussianMixture#vectorMean. Which
version of Spark are you using?
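
For reference, the local computation suggested in the quoted reply can be
sketched without that helper. This is only a minimal sketch, assuming the
values are first pulled out of the Vector with Vector.toArray; the object
and method names (VectorStats, mean, variance) are hypothetical and not
part of Spark. The variance here divides by n (population variance).

```scala
// Sketch only: VectorStats, mean and variance are hypothetical names, not Spark APIs.
// The input is a plain Array[Double], e.g. what Vector.toArray returns.
object VectorStats {
  // Arithmetic mean of the values; NaN for an empty array.
  def mean(xs: Array[Double]): Double =
    if (xs.isEmpty) Double.NaN else xs.sum / xs.length

  // Population variance (divides by n, not n - 1); NaN for an empty array.
  def variance(xs: Array[Double]): Double =
    if (xs.isEmpty) Double.NaN
    else {
      val m = mean(xs)
      xs.map(x => (x - m) * (x - m)).sum / xs.length
    }
}
```

With a Vector v, this would be called as VectorStats.mean(v.toArray). For a
sparse vector, toArray also materializes the zero entries, so they count
toward the mean.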

On Thu, Jul 9, 2015 at 1:16 AM, Feynman Liang <fli...@databricks.com> wrote:

> An RDD[Double] is an abstraction for a large collection of doubles,
> possibly distributed across multiple nodes. The DoubleRDDFunctions are
> there for performing mean and variance calculations across this distributed
> dataset.
>
> In contrast, a Vector is not distributed and fits on your local machine.
> You would be better off computing these quantities on the Vector directly
> (see mllib.clustering.GaussianMixture#vectorMean for an example of how to
> compute the mean of a vector).
>
> On Tue, Jul 7, 2015 at 8:26 PM, 诺铁 <noty...@gmail.com> wrote:
>
>> Hi,
>>
>> There are some useful functions in DoubleRDDFunctions, e.g. mean and
>> variance, which I can use if I have an RDD[Double].
>>
>> Vector doesn't have such methods. How can I convert a Vector to an
>> RDD[Double]? Or, better, can I call mean directly on a Vector?
>>
>
>
