Hi all, I'm experimenting with Spark's Word2Vec implementation on a relatively large corpus (5B words, a vocabulary of 4M words, 400-dimensional vectors). Has anybody had success running it at this scale?
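For a sense of scale, here is a rough back-of-envelope estimate of the model size at these settings. It is only a sketch: it assumes single-precision (4-byte) floats and two weight matrices of vocabulary size by vector size, as in the reference word2vec design, and it assumes each matrix might be stored as one flat JVM array. None of these assumptions is confirmed against Spark's actual internals, and the names below are made up for illustration.

```python
# Back-of-envelope size estimate for the setup described above.
# All constants other than the vocabulary and vector sizes are assumptions.

VOCAB_SIZE = 4_000_000        # from the post: 4M-word vocabulary
VECTOR_SIZE = 400             # from the post: 400-dimensional vectors
BYTES_PER_FLOAT = 4           # assumption: single-precision floats
NUM_MATRICES = 2              # assumption: input and output weight matrices
JVM_MAX_ARRAY_LEN = 2**31 - 1 # JVM arrays are indexed by a 32-bit signed int


def estimate(vocab_size, vector_size):
    """Return element counts and approximate sizes in GB (10^9 bytes)."""
    elements = vocab_size * vector_size
    bytes_per_matrix = elements * BYTES_PER_FLOAT
    return {
        "elements_per_matrix": elements,
        "gb_per_matrix": bytes_per_matrix / 1e9,
        "gb_total": NUM_MATRICES * bytes_per_matrix / 1e9,
        # Only relevant if a matrix is held in a single flat array (assumption).
        "fits_in_one_jvm_array": elements <= JVM_MAX_ARRAY_LEN,
    }


est = estimate(VOCAB_SIZE, VECTOR_SIZE)
print(est)
```

Under these assumptions each matrix holds 1.6 billion values, or about 6.4 GB, which is below the JVM array-length limit but large for any single node to hold or ship around.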
Thanks in advance for your guidance!

-Shilad

--
Shilad W. Sen
Associate Professor
Mathematics, Statistics, and Computer Science Dept.
Macalester College
s...@macalester.edu
http://www.shilad.com
https://www.linkedin.com/in/shilad
651-696-6273