Hi,

As part of a project, we are trying to create a parallel implementation
of the BIRCH clustering algorithm [1]. We are mostly taking our approach
from a paper that used CUDA to parallelize BIRCH [2]. ([2] is a short
paper; only Section 4 is relevant.)

We would like to implement BIRCH on Spark. Would this be an interesting
contribution to MLlib? Has anyone already tried to implement BIRCH on
Spark?

Any suggestions on the implementation itself would be very much appreciated!


[1] http://www.cs.sfu.ca/CourseCentral/459/han/papers/zhang96.pdf
[2] http://boyuan.global-optimization.com/Mypaper/IDEAL2013-88.pdf


Best,
Dzeno
