GitHub user kalmanchapman commented on the issue:
https://github.com/apache/flink/pull/2735
@kateri1 - I agree that seeking a solution with Flink's data structures is
valuable.
I also think that Flink-ML is in a unique position to implement
streaming-first, iter
GitHub user kalmanchapman commented on a diff in the pull request:
https://github.com/apache/flink/pull/2735#discussion_r109329665
--- Diff:
flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/nlp/Word2Vec.scala
---
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache
GitHub user kalmanchapman commented on a diff in the pull request:
https://github.com/apache/flink/pull/2735#discussion_r109329660
--- Diff:
flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/nlp/Word2Vec.scala
---
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache
GitHub user kalmanchapman commented on a diff in the pull request:
https://github.com/apache/flink/pull/2735#discussion_r109329658
--- Diff:
flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/nlp/Word2Vec.scala
---
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache
GitHub user kalmanchapman commented on a diff in the pull request:
https://github.com/apache/flink/pull/2735#discussion_r109329636
--- Diff:
flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/nlp/Word2Vec.scala
---
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache
GitHub user kalmanchapman commented on a diff in the pull request:
https://github.com/apache/flink/pull/2735#discussion_r109329644
--- Diff:
flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/nlp/Word2Vec.scala
---
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache
GitHub user kalmanchapman commented on a diff in the pull request:
https://github.com/apache/flink/pull/2735#discussion_r109329639
--- Diff:
flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/nlp/Word2Vec.scala
---
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache
GitHub user kalmanchapman commented on the issue:
https://github.com/apache/flink/pull/2735
Hey Theodore,
Thanks for taking a look at my PR!
- I'll add docs shortly, per the examples you posted.
- I've tested against datasets in the hundreds-of-megabytes s
GitHub user kalmanchapman opened a pull request:
https://github.com/apache/flink/pull/2735
[FLINK-2094] Implements Word2Vec for FlinkML
This PR implements Word2Vec for FlinkML, addressing Jira issue
[FLINK-2094](https://issues.apache.org/jira/browse/FLINK-2094)
Word2Vec