like it
might not work much better than brute-force, even if you set a higher threshold.
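For a sense of the baseline being discussed: brute-force all-pairs similarity over n documents costs n*(n-1)/2 comparisons, and a threshold only filters the output, not the work. A minimal plain-Python sketch (toy term-frequency vectors, hypothetical data, not Spark code) of that brute-force computation:

```python
import math
from itertools import combinations

def cosine(a, b):
    # cosine similarity between two sparse dict vectors
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def brute_force_pairs(docs, threshold):
    # every pair is compared; the threshold only prunes the result,
    # which is why it may not beat brute-force on its own
    return [(i, j, cosine(docs[i], docs[j]))
            for i, j in combinations(range(len(docs)), 2)
            if cosine(docs[i], docs[j]) >= threshold]

# toy "address" documents as term-count dicts (made-up example)
docs = [
    {"123": 1, "main": 1, "st": 1},
    {"123": 1, "main": 1, "street": 1},
    {"99": 1, "oak": 1, "ave": 1},
]
print(brute_force_pairs(docs, 0.5))  # only the (0, 1) pair survives
```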
-
Liang-Chi Hsieh | @viirya
Spark Technology Center
http://www.spark.tc/
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Document-Similarity-Spark-Mllib-tp20196p20219.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
---
--
Hi all,
I am trying to implement a Spark MLlib job to find the similarity between
documents (in my case, basically home addresses).
I believe I cannot use DIMSUM for my use case, as DIMSUM works well only
with tall-and-skinny matrices: few columns and many rows.
matrix example format, for my us
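DIMSUM (exposed in MLlib as `RowMatrix.columnSimilarities`) computes similarities between *columns*, so document similarity means putting each document in a column, and many documents make the matrix wide rather than tall-and-skinny. A plain-Python sketch of the exact column-similarity computation (not DIMSUM's sampling) to show where the column count bites:

```python
import math

def column_similarities(rows):
    # cosine similarity between every pair of COLUMNS of a dense
    # row-oriented matrix; the pair loop is quadratic in the number
    # of columns n, which is why DIMSUM assumes few columns
    n = len(rows[0])
    norms = [math.sqrt(sum(r[j] ** 2 for r in rows)) for j in range(n)]
    sims = {}
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(r[i] * r[j] for r in rows)
            if norms[i] and norms[j]:
                sims[(i, j)] = dot / (norms[i] * norms[j])
    return sims

# tall-skinny toy matrix: 4 rows (terms), 2 columns (documents)
rows = [
    [1, 1],
    [1, 0],
    [0, 1],
    [1, 1],
]
print(column_similarities(rows))  # single column pair (0, 1)
```

With documents as columns, n here would be the document count, so the pair loop above is the part that blows up for a large corpus.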