Hi Jeremy, here are a few links about the recent efforts for ML on streams with Flink:
- Discussion on the dev mailing list [1]
- Announcement of a Slack channel [2]
- GDocs Design Doc [3]

IMO, anomaly detection is a great use case for ML on streams.

Cheers, Fabian

[1] https://lists.apache.org/thread.html/638fdee0c361a7fb362e050e8cc79ba1e8b4162b044bcbcca31d31ed@%3Cdev.flink.apache.org%3E
[2] https://lists.apache.org/thread.html/e2a1f974300bf1f1b3ff19317a6b7fc941ebedd013950307959cf830@%3Cdev.flink.apache.org%3E
[3] https://docs.google.com/document/d/1afQbvZBTV15qF3vobVWUjxQc49h3Ud06MIRhahtJ6dw

2017-07-21 21:57 GMT+02:00 Branham, Jeremy [IT] <jeremy.d.bran...@sprint.com>:

> Thanks Fabian,
>
> I'm interested in the early development of ML on streams.
>
> Harshith and I plan on doing some prototyping for NRT anomaly detection
> leveraging the stream API.
>
> It would be great if we could produce something reusable for the community.
>
> *From:* Fabian Hueske [mailto:fhue...@gmail.com]
> *Sent:* Wednesday, July 19, 2017 2:12 PM
> *To:* Branham, Jeremy [IT] <jeremy.d.bran...@sprint.com>
> *Cc:* user@flink.apache.org
> *Subject:* Re: Flink ML with DataStream
>
> Hi,
>
> unfortunately, it is not possible to convert a DataStream into a DataSet.
> Flink's DataSet and DataStream APIs are distinct APIs that cannot be used
> together.
>
> The FlinkML library is only available for the DataSet API.
> There is some ongoing work to add a machine learning library for streaming
> use cases as well, but this is still in an early stage and mostly focusing
> on model serving on streams, i.e., applying an externally trained model on
> streaming data.
>
> Best, Fabian
>
> 2017-07-19 19:07 GMT+02:00 Branham, Jeremy [IT] <jeremy.d.bran...@sprint.com>:
>
> Hello,
>
> I've been successful working with Flink in Java, but have some trouble
> trying to leverage the ML library, specifically with KNN.
>
> From my understanding, this is easier in Scala [1] so I've been converting
> my code.
>
> One issue I've encountered is: how do I get a DataSet[Vector] from a
> DataStream[MyClass]?
>
> I've attempted to use windowing, but Scala is completely new to me and I
> may need a push in the right direction.
>
> The below code executes properly; I'm just unsure of the next step.
>
> I've also seen an example [2] that looks like something I need to
> implement, especially the PartialModelBuilder.
>
> Am I on the right track?
>
> Thoughts?
>
> Thanks!
>
> [1] - https://stackoverflow.com/questions/44039857/is-there-a-apache-flink-machine-learning-tutorial-in-java-language/44040819#44040819
>
> [2] - https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/scala/org/apache/flink/streaming/scala/examples/ml/IncrementalLearningSkeleton.scala
>
> Jeremy D. Branham
> Technology Architect - Sprint
> O: +1 (972) 405-2970 | M: +1 (817) 791-1627
> jeremy.d.bran...@sprint.com
> #gettingbettereveryday
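
As a rough illustration of the windowing idea raised in this thread (scoring each incoming record against a window of recent records with KNN), here is a minimal sketch in plain Python. It does not use Flink or FlinkML. The function name `knn_anomaly_scores` and the parameters `window_size` and `k` are made up for this example, and the score used (mean distance to the k nearest recent points) is only one possible choice of anomaly score.

```python
from collections import deque
from math import dist  # Euclidean distance, available since Python 3.8


def knn_anomaly_scores(stream, window_size=5, k=2):
    """Yield (point, score) for every point in `stream`.

    The score is the mean Euclidean distance from the point to its k nearest
    neighbours among the previous `window_size` points. While the window
    holds fewer than k points, the score is None.
    """
    window = deque(maxlen=window_size)  # sliding window of recent points
    for point in stream:
        if len(window) >= k:
            nearest = sorted(dist(point, other) for other in window)[:k]
            score = sum(nearest) / k
        else:
            score = None  # not enough history yet
        yield point, score
        # Append after scoring so a point is never its own neighbour.
        window.append(point)


# Hypothetical 2-D readings; the last one lies far from the others.
readings = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (10.0, 10.0)]
results = list(knn_anomaly_scores(readings, window_size=4, k=2))
for point, score in results:
    print(point, score)
```

The last reading gets a much larger score than the earlier ones, so a simple threshold on the score would flag it. In a streaming job the same per-record logic would sit inside a windowed or stateful operator; how to wire that up in Flink is not shown here.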