Thanks, Fabian – I’m interested in the early development of ML on streams. Harshith and I plan to prototype near-real-time (NRT) anomaly detection using the stream API. It would be great if we could produce something reusable for the community.
From: Fabian Hueske [mailto:fhue...@gmail.com]
Sent: Wednesday, July 19, 2017 2:12 PM
To: Branham, Jeremy [IT] <jeremy.d.bran...@sprint.com>
Cc: user@flink.apache.org
Subject: Re: Flink ML with DataStream

Hi,

unfortunately, it is not possible to convert a DataStream into a DataSet. Flink's DataSet and DataStream APIs are distinct APIs that cannot be used together.

The FlinkML library is only available for the DataSet API. There is some ongoing work to add a machine learning library for streaming use cases as well, but this is still in an early stage and mostly focusing on model serving on streams, i.e., applying an externally trained model on streaming data.

Best, Fabian

2017-07-19 19:07 GMT+02:00 Branham, Jeremy [IT] <jeremy.d.bran...@sprint.com>:

Hello – I’ve been successful working with Flink in Java, but have some trouble trying to leverage the ML library, specifically with KNN. From my understanding, this is easier in Scala [1], so I’ve been converting my code.

One issue I’ve encountered is: how do I get a DataSet[Vector] from a DataStream[MyClass]? I’ve attempted to use windowing, but Scala is completely new to me and I may need a push in the right direction.

The below code executes properly; I’m just unsure of the next step.

[cid:image001.png@01D30231.89869CF0]

I’ve also seen an example [2] that looks like something I need to implement, especially the PartialModelBuilder. Am I on the right track? Thoughts?

Thanks!

[1] - https://stackoverflow.com/questions/44039857/is-there-a-apache-flink-machine-learning-tutorial-in-java-language/44040819#44040819
[2] - https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/scala/org/apache/flink/streaming/scala/examples/ml/IncrementalLearningSkeleton.scala

Jeremy D. Branham
Technology Architect - Sprint