I would also have a look at the Random Cut Forest algorithm. This is the base algorithm used for anomaly detection in several AWS services (QuickSight, Kinesis Data Analytics, etc.). It doesn’t help with getting it working with Flink, but it may be a good place to start for an algorithm.
https://github.com/aws/random-cut-forest-by-aws

Ryan

From: Marta Paes Moreira <ma...@ververica.com>
Sent: Friday, April 3, 2020 5:25 AM
To: Salvador Vigo <salvador...@gmail.com>
Cc: user <user@flink.apache.org>
Subject: RE: [EXTERNAL] Anomaly detection Apache Flink

Hi, Salvador.

You can find some more examples of real-time anomaly detection with Flink in these presentations from Microsoft [1] and Salesforce [2] at Flink Forward. This blog post [3] also describes how to build that kind of application using Kinesis Data Analytics (based on Flink).

Let me know if these resources help!

[1] https://www.youtube.com/watch?v=NhOZ9Q9_wwI
[2] https://www.youtube.com/watch?v=D4kk1JM8Kcg
[3] https://towardsdatascience.com/real-time-anomaly-detection-with-aws-c237db9eaa3f

On Fri, Apr 3, 2020 at 11:37 AM Salvador Vigo <salvador...@gmail.com> wrote:

Hi there,

I am working on an approach for running some experiments related to real-time anomaly detection with Apache Flink. I would like to know if there are already some open issues in the community. The only example I found was the one by Scott Kidder (https://mux.com/team/scott-kidder) and the Mux platform, from 2017. If anyone is already working on this topic or knows of some related work or publication, I will be grateful.

Best,