It will run distributed.
On Mar 2, 2016 3:00 PM, "Priya Ch" wrote:
> Hi All,
>
> I am running the k-means clustering algorithm. Now, when I run the
> algorithm as follows:
>
> val conf = new SparkConf
> val sc = new SparkContext(conf)
> .
> .
> val kmeans = new KMeans()
> val model = kmeans.run(RDD[
Hi Jia,
I think the example you provided is not well suited to illustrating what
the driver and executors do, because it does not show the internal
implementation of the KMeans algorithm.
You can refer to the source code of MLlib KMeans (
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org
Thanks, Yanbo.
The results became much more reasonable after I set the driver memory to 5GB
and increased the worker memory to 25GB.
So, my question is: for the following code snippet, extracted from the main
method of JavaKMeans.java in the examples, what will the driver do, and what
will the workers do?
I didn't unde
Hi Jia,
You can try inputRDD.persist(MEMORY_AND_DISK) and check whether it gives
stable performance. With the MEMORY_AND_DISK storage level, partitions that
don't fit in memory are stored on disk and read from there when they are
needed.
Actually, it's not necessary to set so large drive
I want to evaluate several different distance measures for time-space
clustering, so I need an API that lets me implement my own distance function
in Java.
2015-05-19 22:08 GMT+02:00 Xiangrui Meng :
> Just curious, what distance measure do you need? -Xiangrui
>
> On Mon, May 11, 2015 at 8:28 AM, Jaonary Rabarisoa
> wrote
Just curious, what distance measure do you need? -Xiangrui
On Mon, May 11, 2015 at 8:28 AM, Jaonary Rabarisoa wrote:
> take a look at this
> https://github.com/derrickburns/generalized-kmeans-clustering
>
> Best,
>
> Jao
>
> On Mon, May 11, 2015 at 3:55 PM, Driesprong, Fokko
> wrote:
>>
>> Hi Pa
take a look at this
https://github.com/derrickburns/generalized-kmeans-clustering
Best,
Jao
On Mon, May 11, 2015 at 3:55 PM, Driesprong, Fokko
wrote:
> Hi Paul,
>
> I would say that it should be possible, but you'll need a different
> distance measure which conforms to your coordinate system.
Hi Paul,
I would say that it should be possible, but you'll need a different
distance measure which conforms to your coordinate system.
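
As a rough illustration of a distance measure that conforms to a geographic
coordinate system, here is a minimal sketch in plain Java. It assumes points
are (latitude, longitude) pairs in degrees and uses the haversine great-circle
distance. The names GeoDistanceSketch, DistanceMeasure, HAVERSINE and
nearestCenter are hypothetical and are not part of the MLlib API; this only
shows the assignment step of k-means with a pluggable distance.

```java
import java.util.List;

/** Hypothetical sketch; none of these names come from Spark MLlib. */
final class GeoDistanceSketch {

    /** Pluggable distance between two points given as {latitudeDeg, longitudeDeg}. */
    interface DistanceMeasure {
        double distance(double[] a, double[] b);
    }

    // Mean Earth radius in kilometres (an approximation; the Earth is not a perfect sphere).
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle (haversine) distance in kilometres. */
    static final DistanceMeasure HAVERSINE = (a, b) -> {
        double lat1 = Math.toRadians(a[0]);
        double lat2 = Math.toRadians(b[0]);
        double dLat = lat2 - lat1;
        double dLon = Math.toRadians(b[1] - a[1]);
        double h = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(lat1) * Math.cos(lat2)
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding errors before asin.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(h)));
    };

    /** Assignment step of k-means: index of the center closest to the point. */
    static int nearestCenter(double[] point, List<double[]> centers, DistanceMeasure measure) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centers.size(); i++) {
            double d = measure.distance(point, centers.get(i));
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<double[]> centers = List.of(
                new double[] {48.8566, 2.3522},    // Paris
                new double[] {52.5200, 13.4050});  // Berlin
        double[] munich = {48.1351, 11.5820};
        System.out.println(nearestCenter(munich, centers, HAVERSINE));
    }
}
```

Note that swapping the distance only changes the assignment step. The usual
centroid update (arithmetic mean of the coordinates) is tied to Euclidean
distance, so it would likely need to be adapted as well for a measure like
this one.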
2015-05-11 14:59 GMT+02:00 Pa Rö :
> hi,
>
> is it possible to use a custom distance measure and another data type as
> the vector?
> I want to cluster temporal geo dat