Re: Is Flink SQL a good fit for alerting?

Teoh, Hong Wed, 27 Jul 2022 10:51:50 -0700

Re-pasting from Slack

Hong Teoh, 7 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658917368496069?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
I can give some examples, but they all use the DataStream API:
https://aws.amazon.com/blogs/big-data/building-a-real-time-notification-system-with-amazon-kinesis-data-streams-for-amazon-dynamodb-and-amazon-kinesis-data-analytics-for-apache-flink/
https://aws.amazon.com/blogs/big-data/real-time-bushfire-alerting-with-complex-event[…]cessing-in-apache-flink-on-amazon-emr-and-iot-sensor-network/
Flink SQL is quite powerful though. Are there any operations you would like 
that are not currently supported in SQL?


salvalcantara, 8 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658916880292849?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
Thanks a lot @Hong Teoh! For my use case, Flink SQL should be capable 
enough... What worries me is how to manage/deploy those alerts if they are 
implemented as SQL scripts. In particular, having one SQL job per user alert 
looks impractical, even if they are all deployed on the same cluster 
(session mode?).

Hong Teoh, 7 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658917368496069?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
I see… Probably I’d try to design the job so that it does not have to change 
per user, but uses the user as a key. Or at least split it into typical job 
families, with filters for the “types” of users that should follow each 
code path. If you have to have a custom job graph per user, it sounds like you 
want to design some form of platform to run Flink jobs in general…

salvalcantara, 7 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658917749150269?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
yeah...the thing is that I need alerts to run as separate jobs so that I can 
enable/disable specific alerts without affecting the others...

salvalcantara, 7 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658917866840939?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
Or... maybe a user changes the definition of a given alert. In that case I 
just want to redeploy this specific alert definition without affecting the 
others, which should continue running without interruption.

Hong Teoh, 7 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658917941841549?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
Maybe consider having a control stream (with a user key and an enable/disable 
field) that can update an in-memory table? Or use a lookup join? 
https://github.com/ververica/flink-sql-cookbook/blob/main/joins/04_lookup_joins/04_lookup_joins.md
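To make the control-stream suggestion concrete, here is a minimal sketch in plain Python (not Flink code). It assumes a single merged stream of control and metric events; the names `Control`, `Metric`, `ALERTS` and `run` are hypothetical and only illustrate how one long-running job could switch individual alerts on and off per user.

```python
from dataclasses import dataclass


@dataclass
class Control:
    """Control-stream event: switch one alert of one user on or off."""
    user: str
    alert: str
    enabled: bool


@dataclass
class Metric:
    """Data-stream event: one measurement for one user."""
    user: str
    cpu: float
    mem: float


# Hypothetical alert predicates shared by all users (assumption for this sketch).
ALERTS = {
    "high_load": lambda m: m.cpu > 75 and m.mem > 75,
}


def run(events):
    """Process events in arrival order and return the alerts that fired."""
    enabled = {}  # in-memory table: (user, alert) -> bool
    fired = []
    for ev in events:
        if isinstance(ev, Control):
            # Control events only update the table; nothing is redeployed.
            enabled[(ev.user, ev.alert)] = ev.enabled
            continue
        for name, predicate in ALERTS.items():
            # An alert is evaluated only if it is currently enabled for this user.
            if enabled.get((ev.user, name), False) and predicate(ev):
                fired.append((ev.user, name))
    return fired


events = [
    Control("alice", "high_load", True),
    Metric("alice", 80, 90),   # fires: enabled and above both thresholds
    Metric("bob", 80, 90),     # ignored: bob never enabled the alert
    Control("alice", "high_load", False),
    Metric("alice", 99, 99),   # ignored: alert was disabled
]
print(run(events))
```

In this sketch, disabling one user's alert changes only one entry of the table, so the other alerts keep running untouched.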

salvalcantara, 7 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658917942044659?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
From what I'm seeing... Flink SQL is very good for doing ad hoc / low-code 
analytics here and there, but I don't think it could tackle my use case... 
Having said that, I might be wrong since I'm just getting started with Flink SQL...

Hong Teoh, 7 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658917963542199?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
That way you can adjust the “user-specific” configuration in the external 
database without redeploying the job
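As a rough illustration of that point, the following plain-Python sketch (again, not Flink code) uses a dictionary as a stand-in for the external database. The table layout, the `alert_config` name and the `check` function are assumptions made for this example; the idea is only that each event reads the user's current row when it is processed, so editing the row changes behaviour without touching the job.

```python
# Stand-in for an external table (hypothetical schema): user -> alert settings.
alert_config = {
    "alice": {"cpu": 75.0, "mem": 75.0, "enabled": True},
}


def check(event, config):
    """Look up the user's row at processing time and evaluate the alert."""
    row = config.get(event["user"])
    if row is None or not row["enabled"]:
        # Unknown users and disabled alerts never fire.
        return False
    return event["cpu"] > row["cpu"] and event["mem"] > row["mem"]


event = {"user": "alice", "cpu": 80.0, "mem": 80.0}
before = check(event, alert_config)

# The user raises their CPU threshold in the "database"; the job code is unchanged.
alert_config["alice"]["cpu"] = 90.0
after = check(event, alert_config)

print(before, after)
```

In a real lookup join, how quickly such a change becomes visible may depend on whether the connector caches lookups, so treat the immediate effect shown here as a simplification.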

salvalcantara, 7 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658918161944339?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
mmmm....there are two features that I like from Flink SQL that I thought could 
be very useful for alerting purposes: JSON Functions & MATCH_RECOGNIZE (CEP)

salvalcantara, 7 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658918220421169?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
coming back to your comment, I guess that I should not try to implement each 
alert as a separate (self-contained) job (SQL script) but instead, I should try 
to use one common job / SQL script...

Hong Teoh, 7 hours ago <https://apache-flink.slack.com/archives/C03G7LJTS2G/p1658918277004899?thread_ts=1658911135.622749&cid=C03G7LJTS2G>
Yeah I think that would be a good way forward!



From: Salva Alcántara <salcantara...@gmail.com>
Date: Wednesday, 27 July 2022 at 09:55
To: user <user@flink.apache.org>
Subject: Is Flink SQL a good fit for alerting?




I've recently been getting into Flink SQL, which I find great for low-code 
analytics. However, I was wondering whether it could be a good fit for 
alerting applications, too. Alerts of the form `cpu.usage > 75% and 
mem.usage > 75%` would be easy to translate into SQL, for example. For more 
complicated alerts, there are nice features such as JSON functions or the 
MATCH_RECOGNIZE clause that would come in very handy.

However, in a system where users can define their own alerts, that would mean 
having one SQL job per alert, meaning that one would end up with many such jobs 
in production. Would something like this work in practice? Or would it just be 
too expensive or impractical to manage?

The best alerting-related resource that I've found so far is this blog post 
series:
https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html
https://flink.apache.org/news/2020/03/24/demo-fraud-detection-2.html

but this is based on the DataStream API. Does that confirm that Flink SQL is 
unsuitable for such use cases?

Thanks in advance,

Salva
