Re: Handling defaults and windowed aggregates in stream queries

2015-03-06 Thread Yi Pan
Yeah, I am still thinking about it. Jay pointed out that for event-time windows, the window start time may be derivable if we just keep a single starting value for fixed-length windows. I have yet to think about the tuple window case and windows with dynamic length (e.g. the session window example in MillWhee
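
The point about fixed-length event-time windows can be illustrated with a minimal Python sketch. It assumes integer millisecond timestamps; the names `window_start`, `origin_ms`, and `window_len_ms` are hypothetical and are not taken from Samza or Calcite.

```python
def window_start(event_ts_ms: int, origin_ms: int, window_len_ms: int) -> int:
    """Return the start of the fixed-length window that contains event_ts_ms.

    Only one stored starting value (origin_ms) and the window length are
    needed; no per-window start time has to be kept as state.
    """
    if window_len_ms <= 0:
        raise ValueError("window_len_ms must be positive")
    # Floor division also gives the right window for events before the origin.
    index = (event_ts_ms - origin_ms) // window_len_ms
    return origin_ms + index * window_len_ms
```

This covers fixed-length windows only; tuple windows and dynamic-length (session) windows would likely need additional per-window state.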

Re: Handling defaults and windowed aggregates in stream queries

2015-03-06 Thread Yi Pan
Hi, Milinda, Let me put my answers below: On Thu, Mar 5, 2015 at 11:42 AM, Milinda Pathirage wrote: > > a) Each window will have a closing policy: it would either be wall-clock based timeout, or the arrival of messages indicating that we have received all messages in the correspondin

Re: Handling defaults and windowed aggregates in stream queries

2015-03-06 Thread Milinda Pathirage
I think my previous comment about maintaining start and end offsets as the window state will not work when there are delays. We may need to keep multiple such offsets. But this may not be a clean solution. On Thu, Mar 5, 2015 at 2:42 PM, Milinda Pathirage wrote: > Hi Yi, > > Please find my comme

Re: Handling defaults and windowed aggregates in stream queries

2015-03-05 Thread Milinda Pathirage
Hi Yi, Please find my comments inline. On Thu, Mar 5, 2015 at 1:18 PM, Yi Pan wrote: > Hi, Milinda, > > We recently had some discussions on the MillWheel model: > http://www.infoq.com/presentations/millwheel. Yes, the above is a very interesting talk. I asked the above question regarding the lan

Re: Handling defaults and windowed aggregates in stream queries

2015-03-05 Thread Yi Pan
Hi, Milinda, We recently had some discussions on the MillWheel model: http://www.infoq.com/presentations/millwheel. It is a very interesting talk and has one striking point that we did not think about before: handling late arrivals as a "correction" to the earlier results. Hence, if we follow that
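
The idea of treating a late arrival as a "correction" can be sketched as below. This is a minimal illustration, assuming a per-window count aggregate and integer millisecond timestamps; `CorrectingCounter` and its methods are hypothetical names, not MillWheel or Samza APIs.

```python
from collections import defaultdict


class CorrectingCounter:
    """Counts events per fixed-length window and re-emits a corrected
    result when an event arrives for a window that was already closed."""

    def __init__(self, window_len_ms: int):
        self.window_len_ms = window_len_ms
        self.counts = defaultdict(int)  # window start -> running count
        self.closed = set()             # window starts already emitted

    def add(self, event_ts_ms: int):
        start = (event_ts_ms // self.window_len_ms) * self.window_len_ms
        self.counts[start] += 1
        if start in self.closed:
            # Late arrival: emit the updated total, flagged as a correction.
            return (start, self.counts[start], True)
        return None

    def close(self, start: int):
        # First emission for this window; later arrivals become corrections.
        self.closed.add(start)
        return (start, self.counts[start], False)
```

Each emitted tuple is `(window_start, count, is_correction)`, so a downstream consumer can overwrite the earlier result for that window instead of treating the correction as a new one.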

Re: Handling defaults and windowed aggregates in stream queries

2015-03-04 Thread Julian Hyde
I think that is something to be handled at the stream level, not in the query language. The stream basically needs to declare “all data timestamped before 11:00 has already arrived”. How it does that is a matter of policy. Reasonable policies could be: 1. The wall-clock has reached 11:15 2. A
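
The wall-clock policy mentioned above can be sketched as follows. This is a minimal illustration of that one policy only; the function name `is_complete` and the `grace` parameter are hypothetical, and the 15-minute value is inferred from the 11:00 and 11:15 figures in the message.

```python
from datetime import datetime, timedelta


def is_complete(bound: datetime, wall_clock: datetime, grace: timedelta) -> bool:
    """Return True once the stream may declare that all data timestamped
    before `bound` has arrived, under a wall-clock policy with a grace period."""
    return wall_clock >= bound + grace


# Data before 11:00 is treated as complete once the wall clock reaches 11:15.
bound = datetime(2015, 3, 4, 11, 0)
grace = timedelta(minutes=15)
ready = is_complete(bound, datetime(2015, 3, 4, 11, 15), grace)
```

Keeping the check outside the query means the same query can run against streams that use different completeness policies.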

Re: Handling defaults and windowed aggregates in stream queries

2015-03-04 Thread Milinda Pathirage
Hi Julian, I went through the draft and it covers most of our requirements. But aggregation over a window will not be as simple as mentioned in the draft. In the stream extension draft we have the following: 'How did Calcite know that the 10:00:00 sub-totals were complete at 11:00:00, so that it c

Re: Handling defaults and windowed aggregates in stream queries

2015-03-03 Thread Julian Hyde
Sorry to show up late to this party. I've had my head down writing a description of streaming SQL which I hoped would answer questions like this. Here is the latest draft: https://github.com/julianhyde/incubator-calcite/blob/chi/doc/STREAM.md I've been avoiding windows for now. They are not nee

Re: Handling defaults and windowed aggregates in stream queries

2015-03-02 Thread Milinda Pathirage
Hi Yi, As I understand it, rules and re-writes basically do the same thing (changing/re-writing the operator tree). But in the case of rules, this happens during planning, based on the query planner configuration, whereas re-writing is done on the planner output, after the query goes through the planner. In Cal

Re: Handling defaults and windowed aggregates in stream queries

2015-03-02 Thread Yi Pan
Hi, Milinda, +1 on your default window idea. One question: what's the difference between a rule and a re-write? Thanks! On Mon, Mar 2, 2015 at 7:14 AM, Milinda Pathirage wrote: > @Chris > Yes, I was referring to that mail. Actually I was wrong about the ‘Now’ > window, it should be a ‘Unbounde

Re: Handling defaults and windowed aggregates in stream queries

2015-03-02 Thread Milinda Pathirage
@Chris Yes, I was referring to that mail. Actually I was wrong about the ‘Now’ window, it should be an ‘Unbounded’ window for most of the default scenarios (Section 6.4 of https://cs.uwaterloo.ca/~david/cs848/stream-cql.pdf). Because applying a ‘Now’ window with size of 1 will double the number of even

Re: Handling defaults and windowed aggregates in stream queries

2015-03-01 Thread Yi Pan
Hi, Milinda, Sorry to reply late on this. Here are some of my comments: 1) In Calcite's model, it seems that there is no stream-to-relation conversion step. In the first example where the window specification is missing, I like your solution to add the default LogicalNowWindow operator s.t. it mak

Re: Handling defaults and windowed aggregates in stream queries

2015-03-01 Thread Chris Riccomini
Hey Milinda, Are you referring to this thread? http://mail-archives.apache.org/mod_mbox/incubator-calcite-dev/201502.mbox/%3CCACwebjTshFNi=es4qz1zkkqmuygzn+lwj_bapqpdrvsy2tq...@mail.gmail.com%3E It appears as though your question remains unanswered. :( > If we consider LogicalFilter as a relati