I am using only a time-windowed join. Here is a sample query:

SELECT a1.order_id, a2.`order`.restaurant_id
FROM awz_s3_stream1 a1
INNER JOIN awz_s3_stream2 a2
  ON CAST(a1.order_id AS VARCHAR) = a2.order_id
  AND a1.to_state = 'PLACED'
  AND a1.proctime BETWEEN a2.proctime - INTERVAL '2' HOUR AND a2.proctime + INTERVAL '2' HOUR
GROUP BY HOP(a2.proctime, INTERVAL '2' MINUTE, INTERVAL '1' HOUR), a2.`order`.restaurant_id
Just to simplify my question: suppose I have a TM with 4 slots and I deploy a Flink job with parallelism=4 on 2 containers, 1 JM and 1 TM. Each parallel instance will be deployed to its own task slot in the TM (the entire job pipeline runs in each slot). My job does a join (a SQL time-windowed join on a non-keyed stream) and buffers the last few hours of data. My question is: will the threads running in different task slots share the data buffered for the join? What data, if any, is shared across these threads?