Spark: How to find similar text title

Ascot Moss Tue, 20 Oct 2015 08:39:52 -0700

Hi,

I have an RDD that stores the titles of some articles:
1. "About Spark Streaming"
2. "About Spark MLlib"
3. "About Spark SQL"
4. "About Spark Installation"
5. "Kafka Streaming"
6. "Kafka Setup"
7. ....


I need to build a model that finds titles by similarity.
For example, given "About Spark", I hope to get:

"About Spark Installation", 0.98622 (where 0.98622 is the similarity
score, in the range 0 to 1)
"About Spark MLlib", 0.95394
"About Spark Streaming", 0.94332
"About Spark SQL", 0.9111
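
To make the target concrete, here is a minimal local sketch of the kind of scoring I mean. It assumes cosine similarity over lowercase word counts, which is only one possible choice and not necessarily the right model. The function names (tokenize, cosine_similarity, rank_titles) are hypothetical, and the scores listed above are illustrative: this sketch would not reproduce them (it gives all four "About Spark ..." titles the same score).

```python
import math
import re
from collections import Counter


def tokenize(title):
    """Split a title into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", title.lower())


def cosine_similarity(a, b):
    """Cosine similarity of the word-count vectors of two titles.

    Returns a value between 0 and 1; 0.0 if either title has no words.
    """
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    # Dot product over the words the two titles share.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_titles(query, titles):
    """Return (title, score) pairs with score > 0, highest score first."""
    scored = [(t, cosine_similarity(query, t)) for t in titles]
    matches = [pair for pair in scored if pair[1] > 0]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


titles = [
    "About Spark Streaming",
    "About Spark MLlib",
    "About Spark SQL",
    "About Spark Installation",
    "Kafka Streaming",
    "Kafka Setup",
]

for title, score in rank_titles("About Spark", titles):
    print(title, round(score, 5))
```

On the RDD itself, the same scoring function could presumably be applied with a plain map over the titles, followed by a filter and sort on the score. A weighting scheme such as TF-IDF would be needed for the titles to get distinct scores.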

Any ideas or references on how to do this?

Thanks
Ascot




