Hi Julian,

Thanks for sharing your thoughts. I'm certainly on board with code sharing
among projects. However, I don't see immediate benefits for Hive in
splitting Beeline into two modules. Instead, it requires additional work
and potentially creates instability, while code sharing isn't achieved
until the proposed hive-sqlline module is promoted to an independent
project.

On the other hand, I wonder whether it makes more sense to fork sqlline
directly into Apache. Once that is complete, Hive gets rid of its copy of
sqlline and creates a dependency on the forked sqlline instead. I guess
this is a top-down approach, and the benefits are immediate across multiple
projects.

Thanks,
Xuefu


On Mon, Feb 3, 2014 at 10:49 AM, Julian Hyde <julianh...@gmail.com> wrote:

> As you probably know, Hive's SQL command-line interface Beeline was
> created by forking Sqlline [1] [2]. At the time it was a useful but
> low-activity project languishing on SourceForge without an active owner.
> Around the same time, I independently picked up the Sqlline code, moved it
> to github [3], put in place a maven build process, and gave it some love.
> Now several projects are using it, including Apache Drill, Apache Phoenix,
> Cascading Lingual and Optiq. So, now we have two active forks of Sqlline.
>
> I propose to merge these development forks.
>
> This will achieve a few things. We should be able to fix more bugs, and
> add more features, and get more people using sqlline. (Just today, someone
> ran into a bug that Drill was not saving/restoring command history, then
> noticed that it was fixed in sqlline-1.1.3 [4] [5]. It seems that that bug
> still exists in Hive's beeline.)
>
> I propose the following:
> 1. Move the parts of hive-beeline module that do not depend upon Hive
> (about 90% of the code) into a new module in the hive repo, hive-sqlline.
> 2. What remains in the hive-beeline module is Beeline.java (a derived
> class of Sqlline.java) and Hive-specific extensions. The hive-beeline
> module depends upon the hive-sqlline module.
> 3. Make sure that the new Hive sqlline module contains all fixes and
> useful changes from both forks.
> 4. Release sqlline as a maven artifact, say {groupId=org.apache.hive,
> artifactId=hive-sqlline} and tell clients of julianhyde-sqlline to migrate
> to it.
> 5. Longer term, consider moving hive-sqlline out of Hive, but still within
> Apache.
>
> This achieves continuity for Hive's users, gives the users of the non-Hive
> sqlline a version with minimal dependencies, unifies the two code lines,
> and brings everything under the Apache roof.
>
> Please let me know if this sounds like a good proposal. I'll log a jira
> case, then start work on a patch.
>
> Julian
>
> [1] https://issues.apache.org/jira/browse/HIVE-987
> [2] https://issues.apache.org/jira/browse/HIVE-3100
> [3] https://github.com/julianhyde/sqlline
> [4] https://github.com/julianhyde/sqlline/issues/19
> [5] https://issues.apache.org/jira/browse/DRILL-327
