Re: using the Hive SQL parser in Spark

Yin Huai Fri, 18 Dec 2015 13:18:51 -0800

Let me add Reynold to the thread.

On Fri, Dec 18, 2015 at 12:36 PM, Gopal Vijayaraghavan <gop...@apache.org>
wrote:


>
> >We have looked into various options, and it looks like the best option is
> >to copy the ANTLR grammar file from Hive into Spark. Because the grammar
> >file is tightly coupled with Hive's semantic analysis, we need to refactor
> >some code to use them so it will end up becoming the .g file plus some
> >coupled code.
>
> Is the eventual goal to contribute that fork back into Hive & have Hive
> devs maintain a compatible parser for SparkSQL?
>
> Would that affect Hive's ability to refactor the SQL parser in the future
> or is this a one-time only deal?
>
> >parser. From Hive's perspective this does not provide any immediate
> >benefits. From Spark's perspective, we iterate very quickly so having to
> >depend on an external component also slows down our development. We also
> >have some requirements that simply don't apply in other projects (e.g.
> >being able to parse DataFrame expressions).
>
> From that, I assume this involves some form of cut-paste duplication of
> the code into the SparkSQL project, with that version diverging away from
> Hive's.
>
> > Thanks a lot for developing this parser, and we will try our best to
> > contribute back as we fix bugs. I will also make sure we have the proper
> > acknowledgment when we do this.
>
>
> Under the Apache license, there's no actual restriction against a hostile
> embrace-extend by copying hive's code verbatim as long as the fork retains
> license notices.
>
> The maintainability concerns are mostly around whether this is intended as
> an ongoing relationship, including any compatibility commitments from
> hive-dev@.
>
>
> Cheers,
> Gopal
>
>
>
