Re: [DISCUSS] Some thoughts about unifying Stream SQL and Batch SQL grammar

Aljoscha Krettek Thu, 18 Aug 2016 07:34:42 -0700

Hi,
I personally would like it a lot if the SQL queries for batch and stream
programs looked the same. With the decision to move the Table API on top of
Calcite and also use the Calcite SQL parser, Flink is somewhat tied to
Calcite, so I don't know whether we can add our own window constructs and
teach the parser to read them properly.


Maybe Fabian and Timo have more insights here since they worked on the move
to Calcite.

Cheers,
Aljoscha

+Timo looping him in directly

On Tue, 16 Aug 2016 at 09:29 Jark Wu <wuchong...@alibaba-inc.com> wrote:

> Hi,
>
> Currently, Flink uses Calcite for SQL parsing, so we use the StreamSQL
> grammar proposed by Calcite[1], which requires the `STREAM` keyword in
> SQL. For example, `SELECT * FROM Orders` is regular standard SQL and will
> be translated to a batch job. If you want to declare a stream job, you
> have to add the `STREAM` keyword: `SELECT STREAM * FROM Orders`.
>
> I wonder why we distinguish between the StreamSQL and BatchSQL grammars.
> We already have separate high-level APIs for batch (DataSet) and
> stream (DataStream), and we have a unified Table API for batch and stream
> (that's great!). Why do we have to separate them again in SQL?
>
> I hope we can manipulate stream data like a table. Take
> `SELECT * FROM Orders`: if Orders is a table (or the query runs in a batch
> execution env), then it's a batch job. If Orders is a stream (or the query
> runs in a stream execution env), then it's a stream job. The grammar of
> StreamSQL and BatchSQL is exactly the same. That is what we did in Blink
> SQL.
>
> The benefits of unifying the grammar:
>
> 1. StreamSQL is easy to use for anyone who knows regular SQL. There is no
> difference between StreamSQL and regular SQL.
> 2. We are not blocked by Calcite. Currently, Calcite StreamSQL is not fully
> supported: no stream-to-stream JOIN, no window aggregates, no aggregates
> without a window, etc. We may need to wait for Calcite to support them
> before we start work. Regular SQL already supports all of these except
> windows, and we can implement windows via user-defined functions. So if we
> use regular SQL instead of StreamSQL, we can start working on it right now
> without waiting for Calcite.
> 3. Blink SQL can be merged back to the community to accelerate the
> evolution of Flink SQL. Blink SQL has already done most of this work: we
> implemented UDF/UDTF/UDAF, aggregates with and without windows,
> stream-to-stream JOIN, and so on.
> 4. Windows can also work in batch jobs.
>
> Just my thoughts :)
>
> What do you think about this?
>
> [1] https://calcite.apache.org/docs/stream.html
>
> - Jark Wu
>
>
