First of all, if you want a stream output, you should add the 'STREAM' keyword after 'SELECT'.
There isn't quite enough information to make the query well-defined. I need to assume that each stream has an increasing rowtime column, and that the "OVER (ROWS 3 PRECEDING)" windows on each stream are intended to be linked on rowtime. I'd rather that the system doesn't assume anything, and that these linkages are explicit in either the stream definition or the query, but let's take it as a given for now. This query would not be equivalent to join queries with different windows. Consider the following queries: (1) semi-infinite x semi-infinite SELECT STREAM id, value, cost FROM Orders OVER (ORDER BY rowtime ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) JOIN Shipments OVER (ORDER BY rowtime ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) ON Orders.id = Shipments.id (2) semi-infinite x 1 SELECT STREAM id, value, cost FROM Orders OVER (ORDER BY rowtime BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) JOIN Shipments OVER (ORDER BY rowtime ROWS CURRENT ROW) ON Orders.id = Shipments.id (3) 3 x 1 SELECT STREAM id, value, cost FROM Orders OVER (ORDER BY rowtime ROWS 3 PRECEDING) JOIN Shipments OVER (ORDER BY rowtime ROWS CURRENT ROW) ON Orders.id = Shipments.id At any moment your 3 x 3 query will have up to 9 rows in the cartesian product, and any given Orders row might match 3 Shipments, and any given Shipments row might match 3 Orders. So, you get a multiplicative effect. And query (1) semi-infinite x semi-infinite could, if id is not unique in orders and shipments, produce an even bigger multiplier. For people writing practical queries I would only recommend having a multi-row window on one side of the join, and/or make sure that their id column is unique on one side of the join. To summarize: if we make assumptions about joining on rowtime, then your query and queries (1), (2), (3) are valid, but due to their different windows are not equivalent. Julian On Fri, Mar 6, 2015 at 4:28 PM, Yi Pan <nickpa...@gmail.com> wrote: > Hi, Julian, > > I am writing down some detailed examples of join and need your further help > in understanding the semantic meaning of the following example: > > SELECT id, value, cost FROM Orders OVER (ROWS 3 PRECEDING) JOIN Shipments > OVER (ROWS 3 PROCEDING) ON Orders.id = Shipments.id > > In this example, if id is not the row number, what would be the default > semantic meaning of the join? Does it mean that each sliding window in > Orders now need to join with each sliding window in Shipments? If that's > the case, that would be equivalent to removing all the windows and just do > a full stream-to-stream join, which is not possible for infinite streams. > Hence, does not make sense to define window in the query here. On the other > hand, if the semantic of the join is to join the "current windows" from > both streams, how do we align the windows from two streams? Because in this > case, when the query starts from two streams now determines the results. > > Does the above example make sense in SQL? Or should we just declare that > this is not supported as joins of two streams w/o time-bound? > > Thanks! > > -Yi