GitHub user kalmanchapman opened a pull request: https://github.com/apache/flink/pull/2735
[FLINK-2094] implements Word2Vec for FlinkML

This PR implements Word2Vec for FlinkML, addressing Jira issue [FLINK-2094](https://issues.apache.org/jira/browse/FLINK-2094).

Word2Vec is a word embedding algorithm that generates vectors reflecting the contextual and semantic values of words in a text. More detail about Word2Vec is available here: https://arxiv.org/pdf/1411.2738v4.pdf

This implementation uses an abstracted embedding algorithm, which I've called a ContextEmbedder, based on the original Word2Vec algorithms. The abstraction lets users extend embedding to problems outside of words. Word2Vec is an implementation of the ContextEmbedder.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kalmanchapman/flink FLINK-2094

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2735.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2735

----
commit 82f4d4fcc34872c79dcb4c83b5f4bf7ec94c7da7
Author: kalman <kalchap...@gmail.com>
Date:   2016-09-14T19:36:32Z

    [FLINK-2094] implements Word2Vec for FlinkML

    Word2Vec is a word embedding algorithm that generates vectors to reflect the contextual and semantic values of those words in a text

    This implementation uses an abstracted embedding algorithm - based on the original Word2Vec algorithms - to allow users to extend embedding to reflect problems outside of words

----
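
To illustrate the idea of embedding by context described in this pull request, here is a minimal sketch of how (target, context) training pairs might be drawn from any sequence of items, not only words. It is not code from the PR and does not use the ContextEmbedder API: the function name `context_pairs`, the `window` parameter, and the sample data are all assumptions made for illustration.

```python
from typing import Hashable, List, Sequence, Tuple


def context_pairs(
    sequence: Sequence[Hashable], window: int
) -> List[Tuple[Hashable, Hashable]]:
    """Pair every item with each neighbour at most `window` positions away.

    Hypothetical helper: skip-gram style pair generation, written for
    illustration only and not taken from the FlinkML implementation.
    """
    pairs = []
    for i, target in enumerate(sequence):
        lo = max(0, i - window)                  # clamp at the start of the sequence
        hi = min(len(sequence), i + window + 1)  # clamp at the end of the sequence
        for j in range(lo, hi):
            if j != i:                           # an item is not its own context
                pairs.append((target, sequence[j]))
    return pairs


# Words are one kind of item ...
word_pairs = context_pairs(["the", "quick", "brown", "fox"], window=1)

# ... but any sequence of hashable items is handled the same way
# (made-up product IDs, standing in for a non-word problem).
item_pairs = context_pairs([101, 202, 303], window=1)
```

Because the helper only relies on items being hashable, the same pair generation applies to sentences, click streams, or other ordered data. That is the kind of generalization an abstract embedder is meant to allow.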