Hi,

I wrote a Scala implementation of Annoy (https://github.com/spotify/annoy),
an approximate nearest neighbor (ANN) library.

https://github.com/mskimm/annoy4s

Because Annoy builds its trees on a single node,
I came up with the following approach:
 - build the trees (the index file) on the driver using `toLocalIterator` of the RDD,
 - then query nearest neighbors on the executors using the `index file`, which is
distributed to them by `sc.addFile`
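
A minimal sketch of the two steps above, under stated assumptions. It does not use the annoy4s API: `writeIndex`, `loadIndex` and `squaredDistance` are hypothetical stand-ins that store vectors in a plain text file and search it by brute force, so only the Spark calls (`toLocalIterator`, `sc.addFile`, `SparkFiles.get`) reflect the idea itself. The sample data and object name are also made up for illustration.

```scala
import java.io.{File, PrintWriter}
import scala.io.Source
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object DriverBuiltIndexSketch {

  // Hypothetical stand-in for an ANN index: one "id v1 v2 ..." line per item.
  def writeIndex(items: Iterator[(Int, Array[Float])], path: String): Unit = {
    val out = new PrintWriter(new File(path))
    try items.foreach { case (id, v) =>
      out.println((id.toString +: v.map(_.toString)).mkString(" "))
    } finally out.close()
  }

  // Reads the whole index file back into memory.
  def loadIndex(path: String): Array[(Int, Array[Float])] = {
    val src = Source.fromFile(path)
    try src.getLines().map { line =>
      val parts = line.split(" ")
      (parts.head.toInt, parts.tail.map(_.toFloat))
    }.toArray
    finally src.close()
  }

  def squaredDistance(a: Array[Float], b: Array[Float]): Float =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("driver-built-index-sketch"))

    // Made-up sample data: (item id, vector).
    val items = sc.parallelize(Seq(
      (1, Array(0.0f, 0.0f)),
      (2, Array(0.0f, 1.0f)),
      (3, Array(5.0f, 5.0f)),
      (4, Array(5.0f, 6.0f))
    ))

    // Step 1: stream the RDD to the driver one partition at a time
    // and build the index file there.
    val indexFile = File.createTempFile("index", ".txt")
    writeIndex(items.toLocalIterator, indexFile.getAbsolutePath)

    // Step 2: ship the index file to every executor.
    sc.addFile(indexFile.getAbsolutePath)
    val indexName = indexFile.getName

    // Step 3: query on the executors, loading the index once per partition.
    val neighbors = items.mapPartitions { iter =>
      val index = loadIndex(SparkFiles.get(indexName))
      iter.map { case (id, v) =>
        val nns = index
          .filter(_._1 != id)
          .sortBy { case (_, w) => squaredDistance(v, w) }
          .take(2)
          .map(_._1)
        (id, nns.toSeq)
      }
    }

    neighbors.collect().foreach(println)
    sc.stop()
  }
}
```

Loading the index inside `mapPartitions` rather than `map` keeps the file from being reopened for every item. With a real index, the brute-force search would be replaced by the library's own query call.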

Could anybody review the code and the idea?

I tested this implementation on Spark 1.6.2, and it seems to work.

The code I tested was similar to this example:
```
https://github.com/mskimm/annoy4s#item-similarity-computation
```

Minseok
