Dear Spark Users, If you want to search for a list of roughly 10,000 phrases, each 1 to 6 words long, in a large body of text (approximately 10 GB), how do you go about it?
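To make the question concrete, here is a minimal sketch of one possible per-line matching step, written in plain Python so it runs without a cluster. It assumes phrases and text are tokenized naively on whitespace and compared case-insensitively, and that a phrase never spans a line break. The names build_phrase_index and find_phrases are hypothetical and exist only for this illustration.

```python
def build_phrase_index(phrases):
    """Store each phrase as a tuple of lowercase words; also return the longest phrase length."""
    index = set()
    max_len = 0
    for phrase in phrases:
        words = tuple(phrase.lower().split())
        if words:
            index.add(words)
            max_len = max(max_len, len(words))
    return index, max_len


def find_phrases(line, index, max_len):
    """Return every indexed phrase that occurs in the line, in order of start position."""
    words = line.lower().split()  # assumption: whitespace tokenization, no punctuation handling
    hits = []
    for start in range(len(words)):
        # Try every window of 1..max_len words beginning at this position.
        for size in range(1, max_len + 1):
            if start + size > len(words):
                break
            window = tuple(words[start:start + size])
            if window in index:
                hits.append(" ".join(window))
    return hits


index, max_len = build_phrase_index(["apache spark", "rdd", "big data processing"])
matches = find_phrases("Apache Spark makes big data processing with an RDD simple", index, max_len)
print(matches)
```

With about 10,000 short phrases the index is small, so in Spark one option would be to share it with the executors via sc.broadcast and apply find_phrases to each line with rdd.flatMap. That wiring is an assumption on my part and is not shown here.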
I ended up writing a small RDD-based library and would like to get feedback on it: https://github.com/cloudxlab/phrasesearch
It is at a very early stage, is hacky, and probably needs more testing.
Regards,
Sandeep Giri, www.CloudxLab.com <http://www.cloudxlab.com/>