Hello,

I would like to propose extending the existing Flink SQL MATCH_RECOGNIZE syntax to cover the absence of an event. Such an extension would help our company solve a business case that involves handling timed-out patterns. An example from the Flink training exercises is identifying taxi rides with a START event that is not followed by an END event within two hours. Currently, this can be solved with CEP and a timeout handler. However, as far as I know, it cannot be expressed in Flink SQL.
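
To make the gap concrete, here is a minimal sketch of what I believe the current syntax can express for the taxi case. The table name TaxiRides and the columns rideId, isStart and rideTime are assumptions made up for illustration (rideTime is assumed to be a rowtime attribute); only the MATCH_RECOGNIZE clauses themselves are existing Flink SQL.

```sql
-- Assumed (hypothetical) table: TaxiRides(rideId, isStart, rideTime),
-- where rideTime is an event-time attribute.
SELECT *
FROM TaxiRides
    MATCH_RECOGNIZE (
        PARTITION BY rideId
        ORDER BY rideTime
        MEASURES
            S.rideTime AS startTime,
            E.rideTime AS endTime
        ONE ROW PER MATCH
        AFTER MATCH SKIP PAST LAST ROW
        -- Only a START followed by an END within two hours is a match.
        PATTERN (S E) WITHIN INTERVAL '2' HOUR
        DEFINE
            S AS S.isStart = TRUE,
            E AS E.isStart = FALSE
    ) AS T;
```

As far as I understand, this query emits a row only for rides that complete within two hours. A START whose END does not arrive in time is a partial match that is discarded once the interval passes, so the rides we actually want to find never reach the output.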
I can think of two ways to incorporate such a feature into the existing MATCH_RECOGNIZE syntax:

- By analogy with CEP, add a keyword that determines whether timed-out matches are dropped altogether or made available through either a side output or the main output. The SQL usage could be similar to the current WITHIN clause, e.g. "PATTERN (A B C) TIMEOUT INTERVAL '30' SECOND" would output partially matched patterns 30 seconds after the A event appears.
- Add the possibility to define the absence of an event inside the pattern definition. For example, "PATTERN (A B !C) WITHIN INTERVAL '30' SECOND" would output partial matches containing the A and B events 30 seconds after the A event appears.

In our company we did some basic testing of this concept: we modified the existing MatchCodeGenerator to add a processTimedOutMatch function based on a boolean trigger, and tested it against the aforementioned business case.

I'm interested to hear your thoughts on how we could help Flink SQL express these kinds of cases.

With regards,
Kosma Grochowski