Hi Maciek,

could you bypass the MATCH_RECOGNIZE (i.e. comment it out) and check whether the records show up in the output of the shortened query?
I suspect that they may be filtered out before (for example because of number conversion issues with 0E-18).

On Tue, Jul 6, 2021 at 3:26 PM Maciek Bryński <mac...@brynski.pl> wrote:

> Hi,
> I have a very strange bug when using MATCH_RECOGNIZE.
>
> I'm using some joins and unions to create event stream. Sample event
> stream (for one user) looks like this:
>
> uuid                                  cif         event_type   v                        balance                ts
> 621456e9-389b-409b-aaca-bca99eeb43b3  0004091386  trx          4294.380000000000000000  74.524950000000000000  2021-05-01 04:42:57
> 7b2bc022-b069-41ca-8bbf-e93e3f0e85a7  0004091386  application  0E-18                    74.524950000000000000  2021-05-01 10:29:10
> 942cd3ce-fb3d-43d3-a69a-aaeeec5ee90e  0004091386  application  0E-18                    74.524950000000000000  2021-05-01 10:39:02
> 433ac9bc-d395-457n-986c-19e30e375f2e  0004091386  trx          4294.380000000000000000  74.524950000000000000  2021-05-01 04:42:57
>
> Then I'm using following MATCH_RECOGNIZE definition (trace function will
> be explained later)
>
> CREATE VIEW scenario_1 AS (
>     SELECT * FROM events
>     MATCH_RECOGNIZE(
>         PARTITION BY cif
>         ORDER BY ts
>         MEASURES
>             TRX.v as trx_amount,
>             TRX.ts as trx_ts,
>             APP_1.ts as app_1_ts,
>             APP_2.ts as app_2_ts,
>             APP_2.balance as app_2_balance
>         ONE ROW PER MATCH
>         PATTERN (TRX ANY_EVENT*? APP_1 NOT_LOAN*? APP_2) WITHIN INTERVAL '10' DAY
>         DEFINE
>             TRX AS trace(TRX.event_type = 'trx' AND TRX.v > 1000,
>                 'TRX', TRX.uuid, TRX.cif, TRX.event_type, TRX.ts),
>             ANY_EVENT AS trace(true,
>                 'ANY_EVENT', TRX.uuid, ANY_EVENT.cif, ANY_EVENT.event_type, ANY_EVENT.ts),
>             APP_1 AS trace(APP_1.event_type = 'application' AND APP_1.ts < TRX.ts + INTERVAL '3' DAY,
>                 'APP_1', TRX.uuid, APP_1.cif, APP_1.event_type, APP_1.ts),
>             APP_2 AS trace(APP_2.event_type = 'application' AND APP_2.ts > APP_1.ts
>                 AND APP_2.ts < APP_1.ts + INTERVAL '7' DAY AND APP_2.balance < 100,
>                 'APP_2', TRX.uuid, APP_2.cif, APP_2.event_type, APP_2.ts),
>             NOT_LOAN AS trace(NOT_LOAN.event_type <> 'loan',
>                 'NOT_LOAN', TRX.uuid, NOT_LOAN.cif, NOT_LOAN.event_type, NOT_LOAN.ts)
> ))
>
> This scenario could be matched by sample events because:
> - TRX is matched by event with ts 2021-05-01 04:42:57
> - APP_1 by ts 2021-05-01 10:29:10
> - APP_2 by ts 2021-05-01 10:39:02
> Unfortunately I'm not getting any data. And it's not watermarks fault.
>
> Trace function has following code and gives me some logs:
>
> public class TraceUDF extends ScalarFunction {
>
>     public Boolean eval(Boolean condition, @DataTypeHint(inputGroup = InputGroup.ANY) Object ... message) {
>         log.info((condition ? "Condition true: " : "Condition false: ") +
>             Arrays.stream(message).map(Object::toString).collect(Collectors.joining(" ")));
>         return condition;
>     }
> }
>
> And log from this trace function is following.
>
> 2021-07-06 13:09:43,762 INFO TraceUDF [] - Condition true: TRX 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T04:42:57
> 2021-07-06 13:12:28,914 INFO TraceUDF [] - Condition true: ANY_EVENT 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T15:28:34
> 2021-07-06 13:12:28,915 INFO TraceUDF [] - Condition false: APP_1 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T15:28:34
> 2021-07-06 13:12:28,915 INFO TraceUDF [] - Condition false: TRX 433ac9bc-d395-457n-986c-19e30e375f2e 0004091386 trx 2021-05-01T15:28:34
>
> As you can see 2 events are missing.
> What can I do ?
> I failed with create minimal example of this bug. Any other ideas ?
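
P.S. On the 0E-18 values in the two 'application' rows: that string is simply how java.math.BigDecimal prints a zero with scale 18. Whether anything in your pipeline actually trips over it is only a guess on my side, but the small sketch below shows why such a value can be surprising: its string form is in scientific notation, and equals() treats it as different from BigDecimal.ZERO because the scales differ. The class name ZeroScaleDemo and the helper isZero are made up for this illustration and are not part of Flink or of your job.

```java
import java.math.BigDecimal;

// Hypothetical demo class; not part of Flink or of the job discussed above.
final class ZeroScaleDemo {

    // Value-based zero check: compareTo ignores the scale, so 0E-18 counts as zero.
    static boolean isZero(BigDecimal d) {
        return d.compareTo(BigDecimal.ZERO) == 0;
    }

    public static void main(String[] args) {
        // Stand-in for the `v` column of the 'application' rows (assumption:
        // the value reaches Java code as a BigDecimal with scale 18).
        BigDecimal v = new BigDecimal("0E-18");

        System.out.println(v);                         // 0E-18
        System.out.println(v.toPlainString());         // 0.000000000000000000
        System.out.println(v.scale());                 // 18
        System.out.println(v.equals(BigDecimal.ZERO)); // false (scale 18 vs. scale 0)
        System.out.println(isZero(v));                 // true
    }
}
```

If some step before the MATCH_RECOGNIZE converts or compares v through its string form or through equals(), this is the kind of mismatch I would look for first.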