Re: SQL on Flink

Kostas Tzoumas Wed, 27 May 2015 07:35:07 -0700

I think Fabian's arguments make a lot of sense.

However, if Timo *really wants* to start SQL on top of Table, that is what
he will do a great job at :-) As usual, we can keep it in beta status in
flink-staging until it is mature... and it will help create issues for the
Table API and give direction to its development. Perhaps we will have a
feature-poor SQL for a bit, then switch to hardening the Table API to
support more features and then back to SQL.


I'm just advocating for "committer passion"-first here :-) Perhaps Timo
should weigh in.

On Wed, May 27, 2015 at 4:19 PM, Fabian Hueske <[email protected]> wrote:

> IMO, it is better to have one feature that is reasonably well developed
> instead of two half-baked features. That's why I proposed to advance the
> Table API a bit further before starting the next big thing. I played around
> with the Table API recently and I think it definitely needs a bit more
> contributor attention and more features to be actually usable. Also since
> all features of the SQL interface need to be included in the Table API
> (given we follow the SQL on Table approach) it makes sense IMO to push the
> Table API a bit further before going for the next thing.
>
> 2015-05-27 16:06 GMT+02:00 Stephan Ewen <[email protected]>:
>
> > I see no reason why a SQL interface cannot be "bootstrapped" concurrently.
> > It would initially not support many operations,
> > but would act as a good source to test and drive functionality from the
> > Table API.
> >
> >
> > @Ted:
> >
> > I would like to learn a bit more about the stack and internal abstractions
> > of Drill. It may make sense to
> > reuse some of the query execution operators from Drill. I especially like
> > the "learning schema on the fly" part of Drill.
> >
> > Flink DataSets and Streams have a schema, but it may in several cases be a
> > "schema lower bound", like the greatest common superclass.
> > Those cases may benefit big time from Drill's ability to refine schema on
> > the fly.
> >
> > That may be useful also in the Table API, making it again available to
> > LINQ-like programs, and SQL scripts.
> >
> > On Wed, May 27, 2015 at 3:49 PM, Robert Metzger <[email protected]>
> > wrote:
> >
> > > I didn't know that paper...  Thanks for sharing.
> > >
> > > I've worked on a SQL layer for Stratosphere some time ago, using Apache
> > > Calcite (called Optiq back then). I think the project provides a lot of
> > > very good tooling for creating a SQL layer. So if we decide to go for SQL
> > > on Flink, I would suggest using Calcite.
> > > I can also help you a bit with Calcite to get started with it.
> > >
> > > I agree with Fabian that it would probably make more sense for now to
> > > enhance the Table API.
> > > I think the biggest limitation right now is that it only supports POJOs.
> > > We should also support Tuples (I know that's difficult to do), data from
> > > HCatalog (that includes Parquet & ORC), JSON, ...
> > > Then, I would add filter and projection pushdown into the Table API.
> > >
> > >
> > >
> > > On Tue, May 26, 2015 at 10:03 PM, Ted Dunning <[email protected]>
> > > wrote:
> > >
> > > > It would also be relatively simple (I think) to retarget Drill to Flink
> > > > if Flink doesn't provide enough typing meta-data to do traditional SQL.
> > > >
> > > >
> > > >
> > > > On Tue, May 26, 2015 at 12:52 PM, Fabian Hueske <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Flink's Table API is pretty close to what SQL provides. IMO, the best
> > > > > approach would be to leverage that and build a SQL parser (maybe together
> > > > > with a logical optimizer) on top of the Table API. Parser (and optimizer)
> > > > > could be built using Apache Calcite, which provides exactly this.
> > > > >
> > > > > Since the Table API is still a fairly new component and not very
> > > > > feature-rich, it might make sense to extend and strengthen it before
> > > > > putting something major on top.
> > > > >
> > > > > Cheers, Fabian
> > > > >
> > > > > 2015-05-26 21:38 GMT+02:00 Timo Walther <[email protected]>:
> > > > >
> > > > > > Hey everyone,
> > > > > >
> > > > > > I would be interested in having a complete SQL API in Flink. What is the
> > > > > > status there? Is someone already working on it? If not, I would like to
> > > > > > work on it. I found http://ijcsi.org/papers/IJCSI-12-1-1-169-174.pdf but
> > > > > > I couldn't find anything on the mailing list or Jira. Otherwise I would
> > > > > > open an issue and start a discussion about it there.
> > > > > >
> > > > > > Regards,
> > > > > > Timo
> > > > > >
> > > > >
> > > >
> > >
> >
>
