Re: Notify on 0 events in a Tumbling Event Time Window

Dario Heinisch Mon, 09 May 2022 06:30:56 -0700

It depends on the use case. In Shilpa's use case it is about users, so the user ids are probably known beforehand.

https://dpaste.org/cRe3G <= This is an example without a window, but essentially, Shilpa, you would be re-registering the timers every time they fire. You would also have to ingest the user ids beforehand into your pipeline, so that if a user never has any event he still gets a notification. So probably on startup ingest the user ids with a single source from the DB.


My example is pretty minimal but the idea in your case stays the same:

- key by user
- have a co-process function to init the state with the user ids
- re-register the timers every time they fire
- use `env.getConfig().setAutoWatermarkInterval(1000)` to move the event time forward even if there is no data coming in (this is what you are probably looking for!!)
- then collect an Optional/CustomStruct/Null or so, depending on whether data is present or not
- and then you can check whether the event was triggered because there was data or because there wasn't data
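
To make the steps above a bit more concrete, here is a rough plain-Java simulation of the timer logic. It is not the dpaste example and not Flink API code; the class and method names (ZeroEventNotifier, registerUser, onEvent, advanceWatermark) are made up for this sketch. It assumes events arrive in timestamp order and uses a single watermark for all users. In a real job the same logic would presumably live in a keyed process function with event-time timers. Also keep in mind that the auto watermark interval only controls how often the watermark generator is asked for a watermark; event time moves forward without input only if the generator itself emits watermarks when idle.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Plain-Java simulation of per-user event-time timers (hypothetical names, not Flink API).
final class ZeroEventNotifier {
    private final long windowMs;                                  // tumbling window size
    private final Map<String, Long> counts = new HashMap<>();     // events seen in the current window
    private final Map<String, Long> timers = new HashMap<>();     // next window end per user
    private final List<String> output = new ArrayList<>();        // collected notifications

    ZeroEventNotifier(long windowMs) {
        this.windowMs = windowMs;
    }

    // Seed a known user id so a timer exists even if the user never sends an event.
    public void registerUser(String userId, long now) {
        counts.putIfAbsent(userId, 0L);
        timers.putIfAbsent(userId, now - (now % windowMs) + windowMs);
    }

    // Assumes in-order events: fire everything due up to this timestamp, then count the event.
    public void onEvent(String userId, long eventTime) {
        advanceWatermark(eventTime);
        registerUser(userId, eventTime);
        counts.merge(userId, 1L, Long::sum);
    }

    // Fire every timer that is due, emit a result, and re-register the timer for the next window.
    public void advanceWatermark(long watermark) {
        for (String user : new TreeSet<>(timers.keySet())) {
            long timer = timers.get(user);
            while (timer <= watermark) {
                long count = counts.get(user);
                output.add(user + "@" + timer + ":" + (count == 0 ? "NO_EVENTS" : "count=" + count));
                counts.put(user, 0L);   // reset for the next window
                timer += windowMs;      // re-register
            }
            timers.put(user, timer);
        }
    }

    public List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        ZeroEventNotifier notifier = new ZeroEventNotifier(1000L);
        notifier.registerUser("alice", 0L);
        notifier.registerUser("bob", 0L);   // bob never sends anything
        notifier.onEvent("alice", 500L);
        notifier.advanceWatermark(2000L);   // time moves on without new data
        notifier.output().forEach(System.out::println);
    }
}
```

The NO_EVENTS marker plays the role of the Optional/Null result mentioned above: downstream you can tell whether a timer fired with or without data for that user.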


Best regards,

Dario

On 09.05.22 15:19, Andrew Otto wrote:

This sounds similar to a non-streaming problem we had at WMF. We ingest all event data from Kafka into HDFS/Hive and partition the Hive tables in hourly directories. If there are no events in a Kafka topic for a given hour, we have no way of knowing if the hour has been ingested successfully. For all we know, the upstream producer pipeline might be broken.
We solved this by emitting artificial 'canary' events into each topic multiple times an hour. The canary events producer uses the same code pathways and services that (most) of our normal event producers do. Then, when ingesting into Hive, we filter out the canary events. The ingestion code has work to do and can mark an hour as complete, but still end up writing no events to it.
Perhaps you could do the same? Always emit artificial events, and filter them out in your windowing code? The window should still fire since it will always have events, even if you don't use them?
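
A minimal plain-Java sketch of the canary idea, assuming each event carries a flag that marks it as a canary. The names here (CanaryHourlyIngest, Event, canary) are hypothetical and this is not the actual WMF ingestion code. Canary events make an hour show up as "seen", but they are not counted as real events, so an hour can be complete and still have a count of 0.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch of hourly bucketing with canary events filtered out of the counts (hypothetical names).
final class CanaryHourlyIngest {
    static final long HOUR_MS = 3_600_000L;

    static final class Event {
        final long timestampMs;   // assumed non-negative epoch millis
        final boolean canary;     // hypothetical marker for artificial events

        Event(long timestampMs, boolean canary) {
            this.timestampMs = timestampMs;
            this.canary = canary;
        }
    }

    // Returns hour start -> number of real (non-canary) events.
    // An hour is present in the map as soon as any event, canary or real, was seen for it.
    static SortedMap<Long, Integer> ingest(List<Event> events) {
        SortedMap<Long, Integer> realCounts = new TreeMap<>();
        for (Event e : events) {
            long hourStart = e.timestampMs - (e.timestampMs % HOUR_MS);
            realCounts.merge(hourStart, e.canary ? 0 : 1, Integer::sum);
        }
        return realCounts;
    }
}
```

An hour that maps to 0 was ingested but had no real events. An hour that is missing from the map entirely received nothing, not even a canary, which points at a broken producer pipeline rather than a quiet period.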
On Mon, May 9, 2022 at 8:55 AM Shilpa Shankar <sshan...@bandwidth.com> wrote:
    Hello,
    We are building a flink use case where we are consuming from a
    kafka topic and performing aggregations and generating alerts
    based on average, max, min thresholds. We also need to notify the
    users when there are 0 events in a Tumbling Event Time Window. We
    are having trouble coming up with a solution to do the same. The
    options we considered are below, please let us know if there are
    other ideas we haven't looked into.

    [1] Queryable State: Save the keys in each of the Process Window
    Functions. Query the state from an external application and alert
    when a key is missing after the 20min time interval has expired.
    We see that the Queryable State feature is being deprecated in the
    future. We do not want to go down this path when we already know
    there is an EOL for it.

    [2] Use Processing Time Windows: Using Processing time instead
    of Event time would have been an option if our downstream
    applications would send out events in real time. Maintenance of
    the downstream applications, delays etc. would result in a lot of
    data loss, which is undesirable.

    Flink version : 1.14.3

    Thanks,
    Shilpa
