Note to other ML users: this seems to be a duplicate; let's respond only to the earlier message.
On Tue, Jul 6, 2021 at 3:28 PM Maciej Bryński <mac...@brynski.pl> wrote:
> Hi,
> I have a very strange bug when using MATCH_RECOGNIZE.
>
> I'm using some joins and unions to create an event stream. Sample
> event stream (for one user) looks like this:
>
> uuid                                  cif         event_type   v                        balance                ts
> 621456e9-389b-409b-aaca-bca99eeb43b3  0004091386  trx          4294.380000000000000000  74.524950000000000000  2021-05-01 04:42:57
> 7b2bc022-b069-41ca-8bbf-e93e3f0e85a7  0004091386  application  0E-18                    74.524950000000000000  2021-05-01 10:29:10
> 942cd3ce-fb3d-43d3-a69a-aaeeec5ee90e  0004091386  application  0E-18                    74.524950000000000000  2021-05-01 10:39:02
> 433ac9bc-d395-457n-986c-19e30e375f2e  0004091386  trx          4294.380000000000000000  74.524950000000000000  2021-05-01 04:42:57
>
> Then I'm using following MATCH_RECOGNIZE definition (trace function
> will be explained later)
>
> CREATE VIEW scenario_1 AS (
>     SELECT * FROM events
>     MATCH_RECOGNIZE(
>         PARTITION BY cif
>         ORDER BY ts
>         MEASURES
>             TRX.v as trx_amount,
>             TRX.ts as trx_ts,
>             APP_1.ts as app_1_ts,
>             APP_2.ts as app_2_ts,
>             APP_2.balance as app_2_balance
>         ONE ROW PER MATCH
>         PATTERN (TRX ANY_EVENT*? APP_1 NOT_LOAN*? APP_2) WITHIN INTERVAL '10' DAY
>         DEFINE
>             TRX AS trace(TRX.event_type = 'trx' AND TRX.v > 1000,
>                 'TRX', TRX.uuid, TRX.cif, TRX.event_type, TRX.ts),
>             ANY_EVENT AS trace(true,
>                 'ANY_EVENT', TRX.uuid, ANY_EVENT.cif, ANY_EVENT.event_type, ANY_EVENT.ts),
>             APP_1 AS trace(APP_1.event_type = 'application' AND APP_1.ts < TRX.ts + INTERVAL '3' DAY,
>                 'APP_1', TRX.uuid, APP_1.cif, APP_1.event_type, APP_1.ts),
>             APP_2 AS trace(APP_2.event_type = 'application' AND APP_2.ts > APP_1.ts
>                 AND APP_2.ts < APP_1.ts + INTERVAL '7' DAY AND APP_2.balance < 100,
>                 'APP_2', TRX.uuid, APP_2.cif, APP_2.event_type, APP_2.ts),
>             NOT_LOAN AS trace(NOT_LOAN.event_type <> 'loan',
>                 'NOT_LOAN', TRX.uuid, NOT_LOAN.cif, NOT_LOAN.event_type, NOT_LOAN.ts)
> ))
>
>
> This scenario could be matched by sample events because:
> - TRX is matched by event with ts 2021-05-01 04:42:57
> - APP_1 by ts 2021-05-01 10:29:10
> - APP_2 by ts 2021-05-01 10:39:02
> Unfortunately I'm not getting any data. And it's not watermarks fault.
>
> Trace function has following code and gives me some logs:
>
> public class TraceUDF extends ScalarFunction {
>
>     public Boolean eval(Boolean condition, @DataTypeHint(inputGroup = InputGroup.ANY) Object ... message) {
>         log.info((condition ? "Condition true: " : "Condition false: ") +
>             Arrays.stream(message).map(Object::toString).collect(Collectors.joining(" ")));
>         return condition;
>     }
> }
>
> And log from this trace function is following.
>
> 2021-07-06 13:09:43,762 INFO TraceUDF [] - Condition true: TRX 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T04:42:57
> 2021-07-06 13:12:28,914 INFO TraceUDF [] - Condition true: ANY_EVENT 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T15:28:34
> 2021-07-06 13:12:28,915 INFO TraceUDF [] - Condition false: APP_1 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T15:28:34
> 2021-07-06 13:12:28,915 INFO TraceUDF [] - Condition false: TRX 433ac9bc-d395-457n-986c-19e30e375f2e 0004091386 trx 2021-05-01T15:28:34
>
> As you can see 2 events are missing.
> What can I do ?
> I failed with create minimal example of this bug. Any other ideas ?
>
> Regards,
> --
> Maciek Bryński
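
As a sanity check on the expectation stated in the quoted message, the row-level DEFINE conditions can be evaluated outside Flink. The following is only a minimal sketch in plain Java: the class ExpectedMatchCheck, its Event holder, and all method names are hypothetical and not part of the original job. It assumes the three sample rows arrive in timestamp order and treats the WITHIN bound as a strict "less than 10 days" check. It does not reproduce Flink's CEP matching, watermarks, or the greedy/reluctant quantifiers, only the predicates.

```java
import java.math.BigDecimal;
import java.time.Duration;
import java.time.LocalDateTime;

public class ExpectedMatchCheck {

    /** Hypothetical stand-in for one row of the events view (only the columns the predicates use). */
    static final class Event {
        final String eventType;
        final BigDecimal v;
        final BigDecimal balance;
        final LocalDateTime ts;

        Event(String eventType, String v, String balance, String ts) {
            this.eventType = eventType;
            this.v = new BigDecimal(v);
            this.balance = new BigDecimal(balance);
            this.ts = LocalDateTime.parse(ts);
        }
    }

    // TRX: event_type = 'trx' AND v > 1000
    static boolean isTrx(Event e) {
        return "trx".equals(e.eventType) && e.v.compareTo(new BigDecimal("1000")) > 0;
    }

    // APP_1: event_type = 'application' AND ts < TRX.ts + 3 days
    static boolean isApp1(Event e, Event trx) {
        return "application".equals(e.eventType) && e.ts.isBefore(trx.ts.plusDays(3));
    }

    // APP_2: event_type = 'application' AND APP_1.ts < ts < APP_1.ts + 7 days AND balance < 100
    static boolean isApp2(Event e, Event app1) {
        return "application".equals(e.eventType)
                && e.ts.isAfter(app1.ts)
                && e.ts.isBefore(app1.ts.plusDays(7))
                && e.balance.compareTo(new BigDecimal("100")) < 0;
    }

    // WITHIN INTERVAL '10' DAY, assumed here to mean first-to-last span strictly below 10 days
    static boolean withinTenDays(Event first, Event last) {
        return Duration.between(first.ts, last.ts).compareTo(Duration.ofDays(10)) < 0;
    }

    /** Evaluates the conditions on the three sample rows that the quoted message expects to match. */
    public static boolean sampleMatches() {
        Event trx = new Event("trx", "4294.380000000000000000",
                "74.524950000000000000", "2021-05-01T04:42:57");
        Event app1 = new Event("application", "0E-18",
                "74.524950000000000000", "2021-05-01T10:29:10");
        Event app2 = new Event("application", "0E-18",
                "74.524950000000000000", "2021-05-01T10:39:02");
        return isTrx(trx) && isApp1(app1, trx) && isApp2(app2, app1) && withinTenDays(trx, app2);
    }

    public static void main(String[] args) {
        System.out.println("sample rows satisfy the pattern conditions: " + sampleMatches());
    }
}
```

For the three sample rows the check evaluates to true, so the predicates themselves are consistent with the expected match. This says nothing about which rows actually reach the MATCH_RECOGNIZE operator, or in what order.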