Re: K-NN by efficient sparse matrix product

2014-05-28 Thread Christian Jauvin
>> nished. In theory, this has complexity max(nnz(L)*log p, nnz(L)*n/p). I have to warn though: when I played with matrix multiplication, I was getting nowhere near serial performance.
>>
>> On Wed, May 28, 2014 at 11:00 AM, Christian Jauvin wrote:

K-NN by efficient sparse matrix product

2014-05-28 Thread Christian Jauvin
Hi, I'm new to Spark and Hadoop, and I'd like to know if the following problem is solvable in terms of Spark's primitives. To compute the K-nearest neighbours of an N-dimensional dataset, I can multiply my very large normalized sparse matrix by its transpose. As this yields all pairwise distance v
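
For readers following the thread, here is a minimal single-machine sketch of the idea in the question: normalize the rows, multiply the sparse matrix by its transpose, and keep the K largest entries per row. It is not Spark code and not the poster's implementation. It assumes rows are stored as plain dicts (column -> value) and that the product of unit-length rows is read as cosine similarity; the names `normalize` and `knn_by_sparse_product` are made up for illustration.

```python
import heapq
import math
from collections import defaultdict


def normalize(rows):
    """Scale each sparse row (dict: column -> value) to unit L2 norm."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row.values()))
        out.append({j: v / norm for j, v in row.items()} if norm > 0 else {})
    return out


def knn_by_sparse_product(rows, k):
    """For each row i, return up to k (similarity, index) pairs, best first."""
    rows = normalize(rows)

    # Inverted index: column -> [(row index, value)]. This plays the role
    # of the transposed matrix.
    columns = defaultdict(list)
    for i, row in enumerate(rows):
        for j, v in row.items():
            columns[j].append((i, v))

    neighbours = []
    for i, row in enumerate(rows):
        # Row i of A * A^T, accumulated only over the non-zero columns of
        # row i, so pairs that share no column are never touched.
        sims = defaultdict(float)
        for j, v in row.items():
            for other, w in columns[j]:
                if other != i:  # skip the row's similarity with itself
                    sims[other] += v * w
        top = heapq.nlargest(k, ((s, other) for other, s in sims.items()))
        neighbours.append(top)
    return neighbours


if __name__ == "__main__":
    data = [{0: 1.0, 1: 1.0}, {0: 1.0, 1: 0.9}, {2: 1.0}, {1: 1.0, 2: 0.1}]
    for i, top in enumerate(knn_by_sparse_product(data, 2)):
        print(i, top)
```

Rows that share no non-zero column never appear in each other's candidate list, so a row can end up with fewer than K neighbours. The per-row loop is independent of the other rows, which is the part one would presumably distribute, but how that maps onto Spark's primitives is exactly the open question of this thread.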