[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16683618#comment-16683618 ]
ASF GitHub Bot commented on FLINK-10625:
----------------------------------------

twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r232605051

##########
File path: docs/dev/table/streaming/match_recognize.md
##########
@@ -0,0 +1,654 @@
+---
+title: 'Detecting event patterns <span class="label label-danger" style="font-size:50%">Experimental</span>'
+nav-parent_id: streaming_tableapi
+nav-title: 'Detecting event patterns'
+nav-pos: 5
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+It is a common use case to search for a set of event patterns, especially in the case of data streams. Apache Flink
+comes with a [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand, Flink's
+Table API & SQL provides a relational way to express queries that comes with multiple functions and
+optimizations that can be used out of the box. In December 2016, ISO released a new version of the
+international SQL standard (ISO/IEC 9075:2016) including Row Pattern Recognition for complex event processing,
+which allows those two APIs to be consolidated using the MATCH_RECOGNIZE clause.
+
+* This will be replaced by the TOC
+{:toc}
+
+Example query
+-------------
+
+Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks:
+* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses.
+* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause.
+  These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define.
+* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause.
+* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause.
+
+For example, to find periods of a constantly decreasing price of a ticker, one could write a query like this:
+
+{% highlight sql %}
+SELECT *
+FROM Ticker
+MATCH_RECOGNIZE (
+    PARTITION BY symbol
+    ORDER BY rowtime
+    MEASURES
+       STRT_ROW.rowtime AS start_tstamp,
+       LAST(PRICE_DOWN.rowtime) AS bottom_tstamp,
+       LAST(PRICE_UP.rowtime) AS end_tstamp
+    ONE ROW PER MATCH
+    AFTER MATCH SKIP TO LAST PRICE_UP
+    PATTERN (STRT_ROW PRICE_DOWN+ PRICE_UP+)
+    DEFINE
+       PRICE_DOWN AS PRICE_DOWN.price < LAST(PRICE_DOWN.price, 1) OR
+               (LAST(PRICE_DOWN.price, 1) IS NULL AND PRICE_DOWN.price < STRT_ROW.price),
+       PRICE_UP AS PRICE_UP.price > LAST(PRICE_UP.price, 1) OR LAST(PRICE_UP.price, 1) IS NULL
+    ) MR;
+{% endhighlight %}
+
+This query, given the following input data:
+
+{% highlight text %}
+SYMBOL  ROWTIME               PRICE
+======  ====================  =======
+'ACME'  '01-Apr-11 10:00:00'  12
+'ACME'  '01-Apr-11 10:00:01'  17
+'ACME'  '01-Apr-11 10:00:02'  19
+'ACME'  '01-Apr-11 10:00:03'  21
+'ACME'  '01-Apr-11 10:00:04'  25
+'ACME'  '01-Apr-11 10:00:05'  12
+'ACME'  '01-Apr-11 10:00:06'  15
+'ACME'  '01-Apr-11 10:00:07'  20
+'ACME'  '01-Apr-11 10:00:08'  24
+'ACME'  '01-Apr-11 10:00:09'  25
+'ACME'  '01-Apr-11 10:00:10'  19
+{% endhighlight %}
+
+will produce a
summary row for each found period in which the price was constantly decreasing.
+
+{% highlight text %}
+SYMBOL     START_TSTAMP        BOTTOM_TSTAMP       END_TSTAMP
+=========  ==================  ==================  ==================
+ACME       01-APR-11 10:00:04  01-APR-11 10:00:05  01-APR-11 10:00:09
+{% endhighlight %}
+
+The aforementioned query consists of the following clauses:
+
+* [PARTITION BY](#partitioning) - defines logical partitioning of the stream, similar to `GROUP BY` operations.

Review comment:
This should also be part of an `Overview` section, because it nicely summarizes the feature.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Add MATCH_RECOGNIZE documentation
> ---------------------------------
>
>                 Key: FLINK-10625
>                 URL: https://issues.apache.org/jira/browse/FLINK-10625
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Documentation, Table API & SQL
>    Affects Versions: 1.7.0
>            Reporter: Till Rohrmann
>            Assignee: Dawid Wysakowicz
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
>
> The newly added {{MATCH_RECOGNIZE}} functionality needs to be documented.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)