Hi, Julian,

Thanks for the explanation. I see your point that the physical-layer
"stream-scan" can be used to implement delta(filter(..)) from the logical
algebra.
My question about this model is:
If a window operation is implemented as filter(tuple.isInWindow(),
stream-scan(Orders)) in the physical layer, it still involves a
"stream-scan" on a stream and outputs a delta relation. And since the output
of the select operator is a delta relation, it should still need a
scan(<select-result>) operation to output it to the stream.
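
To make the question concrete, here is a minimal Python sketch of the
window-as-filter plan. All names in it (Order, stream_scan, filter_op,
is_in_window) and the (order_id, state, ts) tuple layout are assumptions for
illustration only; none of this is Samza or Calcite API. In this sketch the
filter consumes tuples one at a time and emits tuples one at a time, so no
intermediate relation is materialized between the scan and the output.

```python
from typing import Callable, Iterable, Iterator, NamedTuple


class Order(NamedTuple):
    order_id: int
    state: str
    ts: int  # event time in seconds (assumed field)


def stream_scan(source: Iterable[Order]) -> Iterator[Order]:
    """Hypothetical stream-scan: hands tuples downstream one at a time."""
    for t in source:
        yield t


def filter_op(pred: Callable[[Order], bool],
              inp: Iterable[Order]) -> Iterator[Order]:
    """Physical filter: passes through each tuple that satisfies pred."""
    for t in inp:
        if pred(t):
            yield t


def is_in_window(start: int, end: int) -> Callable[[Order], bool]:
    """Predicate for the half-open time window [start, end)."""
    return lambda t: start <= t.ts < end


orders = [Order(1, "CA", 5), Order(2, "NY", 12),
          Order(3, "CA", 17), Order(4, "CA", 25)]

# filter(tuple.isInWindow(), stream-scan(Orders)) for the window [10, 20)
windowed = list(filter_op(is_in_window(10, 20), stream_scan(orders)))
```

Here a finite list stands in for the stream only so that the example
terminates; the operators themselves never look at more than one tuple.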

Or were you referring to a model in which, even for non-stream SQL, the
physical operators operate on "streams" of tuples flowing into the operator
logic, where those "streams" are simply generated by scanning the regular
relational table during the operation? If so, I agree that the physical
operators for non-stream and stream queries can essentially be merged into
one model. Am I interpreting your idea correctly?
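
As a rough sketch of that merged model (again in Python, with hypothetical
names table_scan, stream_scan and filter_op that are not taken from any real
library), the same physical filter can sit on top of either kind of scan,
because both scans just hand it an iterator of tuples:

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def table_scan(table: List[Row]) -> Iterator[Row]:
    """Bounded scan: iterates once over a stored relation."""
    return iter(table)


def stream_scan(events: Iterable[Row]) -> Iterator[Row]:
    """Stand-in for an unbounded source; yields tuples as they arrive."""
    for e in events:
        yield e


def filter_op(pred: Callable[[Row], bool],
              rows: Iterable[Row]) -> Iterator[Row]:
    """One physical filter, shared by stream and non-stream plans."""
    return (r for r in rows if pred(r))


def is_ca(row: Row) -> bool:
    return row["state"] == "CA"


orders_table = [
    {"id": 1, "state": "CA"},
    {"id": 2, "state": "NY"},
    {"id": 3, "state": "CA"},
]

# Non-stream plan: filter(state = 'CA', scan(Orders))
from_table = [r["id"] for r in filter_op(is_ca, table_scan(orders_table))]

# Stream plan: filter(state = 'CA', stream-scan(Orders))
from_stream = [r["id"] for r in filter_op(is_ca, stream_scan(iter(orders_table)))]
```

The only difference between the two plans in this sketch is where the
iterator comes from; the filter itself is unchanged.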

On Wed, Jan 28, 2015 at 4:52 PM, Julian Hyde <jul...@hydromatic.net> wrote:

> Consider this simple query (I'll express in 3 equivalent ways):
>
> * select stream * from Orders where state = 'CA' (in streaming SQL)
> * istream [ select * from Orders where state = 'CA' ] (in CQL)
> * delta(filter(state = 'CA', scan(Orders))) (in logical algebra)
>
> In CQL there are no named streams, just streamable tables. So we have to
> ask for the istream of it.
>
> But in Samza or any other streaming system, Orders is a stream. You can
> simply convert the logical algebra
>
>   delta(filter(state = 'CA', scan(Orders)))
>
> to the physical algebra
>
>   filter(state = 'CA', stream-scan(Orders))
>
> In the physical algebra the data stays in streaming format all the way
> through.
>
> My point was that stream-to-relation and relation-to-stream occur in EVERY
> CQL query (and logical algebra) but do not necessarily occur in the
> physical algebra.
>
> Julian
>
>
> > On Jan 28, 2015, at 2:18 PM, Yi Pan <nickpa...@gmail.com> wrote:
> >
> > Hi, Julian,
> >
> > Thanks! I think we all agree on the point of keeping the SQL AST separate
> > from the logical algebra.
> >
> > Focusing on your comment below:
> > "The stream-to-relation and relation-to-stream operators are in the
> > logical algebra but very likely have disappeared by the time you get to
> > the physical algebra. And the physical algebra introduces new constructs
> > like lookups into time-varying materializations and partitioning."
> >
> > In our case, the physical algebra is the Samza operators. I find it hard
> > to understand how we can make the stream-to-relation and
> > relation-to-stream operators go away. For example, the window operator is
> > a construct to create a time-varying materialization of a relation, and
> > the istream operator is a construct to take the insertions of new rows in
> > a time-varying relation and output them to a stream of tuples. I agree
> > with your comments on rstream, which seems to have only academic
> > meaning. But I am not sure how we would implement the window operator
> > without physical operators performing the relation/stream conversions.
> >
> > -Yi
> >
> >
> > On Wed, Jan 28, 2015 at 2:01 PM, Julian Hyde <jul...@hydromatic.net> wrote:
> >
> >>
> >> On Jan 28, 2015, at 10:02 AM, Yi Pan <nickpa...@gmail.com> wrote:
> >>
> >>> I am trying to understand your comment below: "But there is not a
> >>> simple mapping between true SQL and a data-flow graph that you can
> >>> execute." What is the specific meaning of this statement? Could you
> >>> elaborate on this a bit more?
> >>
> >> The structure of a SQL query (and its AST) is different to the structure
> >> of the relational algebra that it translates to. The elements of a SQL
> >> query are its clauses (FROM, WHERE, GROUP BY, SELECT, HAVING, ORDER BY)
> >> and the elements of a relational algebra expression are the relational
> >> operators (scan, join, filter, aggregate, project, sort) and for simple
> >> queries there is a simple mapping. But the mapping becomes complex when
> >> there are sub-queries and especially correlations, but even a 3-way
> >> outer join can be complex. In Calcite, SqlToRelConverter, which performs
> >> this task, started off 100 lines long and is now 5,000.
> >>
> >> My point was that you shouldn’t conflate the SQL AST with the logical
> >> algebra. It sounds like the point is already taken.
> >>
> >> In non-streaming databases, it is almost possible to execute the logical
> >> algebra as is. (You need to use iterators, i.e. convert relations into
> >> streams, and when joining, you need to be careful not to create
> >> cartesian products before you start applying filters, but otherwise
> >> you’re safe.)
> >>
> >> But in streaming databases, the logical algebra is not implementable.
> >> You cannot literally implement the stream-to-relation or
> >> relation-to-stream operators, or, heaven forbid, the r-stream, that
> >> re-transmits the whole table every clock-tick. So in addition to the
> >> logical algebra you need a physical algebra. The stream-to-relation and
> >> relation-to-stream operators are in the logical algebra but very likely
> >> have disappeared by the time you get to the physical algebra. And the
> >> physical algebra introduces new constructs like lookups into
> >> time-varying materializations and partitioning.
> >>
> >> Julian
>
>
