In this case, maybe we can bring sqlparser-rs into the ASF umbrella following 
the arrow-datafusion model?

Once DataFusion becomes a top-level project, we could move it to
datafusion-sqlparser-rs. It would be a quasi-independent project, much as
DataFusion is today w.r.t. Arrow, but it would get most of the benefits of
having a community behind it.

> On Feb 27, 2024, at 2:11 AM, Andrew Lamb <al...@influxdata.com> wrote:
> 
> Julian, thank you for your insight. I very much agree with it.
> 
>> I think the ASF is wrong on this. I think it needs to provide a home
>> for medium-sized projects such as sqlparser-rs in an existing
>> top-level project;
> 
> It could be said that DataFusion fits this model  -- it isn't really an
> "Arrow" project but needed a place to live and grow, and the Arrow ASF
> community provided that.
> 
> Andrew
> 
> 
> 
> 
> On Mon, Feb 26, 2024 at 1:09 PM Julian Hyde <jh...@apache.org> wrote:
> 
>> I am torn on this.
>> 
>> On one hand, I am a big fan of components that are standalone - have
>> no more dependencies than necessary, and are self-evidently
>> standalone. So, I think that re-absorbing sqlparser-rs back into
>> DataFusion would not be a good step. It would reduce the perception
>> that it is standalone.
>> 
>> On the other hand, it sounds as if sqlparser-rs would benefit by
>> having an Apache-like community around it. DataFusion isn't a perfect
>> fit - there is not much overlap between DataFusion and sqlparser-rs
>> users - but it takes a lot of effort to create and run a top-level
>> project, and DataFusion is already up and running.
>> 
>> The tension is that people want to consume components that they
>> perceive to be standalone, and yet the ASF wants to create communities
>> that produce either a single large component or sets of highly-coupled
>> components. The ASF used to do 'umbrella projects' whose sub-projects
>> were in the same subject area but had little or no dependencies. For
>> example, Apache DB [ https://db.apache.org/ ] has JDO, Derby and
>> Torque. And commons included many useful Java libraries. Umbrella
>> projects caused problems during the Jakarta and Hadoop eras, and now
>> are strongly discouraged at the ASF.
>> 
>> I think the ASF is wrong on this. I think it needs to provide a home
>> for medium-sized projects such as sqlparser-rs in an existing
>> top-level project; maybe those projects grow into top-level projects,
>> or maybe they remain medium-sized projects. This is especially
>> necessary in the Rust community, where there are many exciting
>> projects, but they are almost all happening outside ASF. (This is
>> exactly where Java was in ~2005. Maybe we need a rust-commons or
>> rust-db?)
>> 
>> My conclusion is to leave sqlparser-rs where it is for now, but to
>> continue talking about what might be an attractive home for it in ASF.
>> 
>> Julian
>> 
>> On Mon, Feb 26, 2024 at 8:12 AM Andrew Lamb <al...@influxdata.com> wrote:
>>> 
>>> Sorry for the late reply,
>>> 
>>> I think sqlparser-rs users are quite a bit more varied than DataFusion's,
>>> and there is not a large overlap between the contributors of the two
>>> projects. I currently seem to be the one reviewing / merging most
>>> sqlparser-rs PRs, and I would definitely love some more help.
>>> 
>>> However, given that the project is not an Apache project, I did not have
>>> good luck attracting help.  A related discussion is here [1].
>>> 
>>> If the DataFusion community would like to accelerate releases, we can
>>> also try to do that without bringing it into Apache governance.
>>> Specifically, it would be great to have help reviewing the PRs -- the
>>> actual release process is pretty low overhead. The reviews are what take
>>> the vast majority of the maintenance time.
>>> 
>>> Andrew
>>> 
>>> [1]: https://github.com/sqlparser-rs/sqlparser-rs/issues/818
>>> 
>>> 
>>> 
>>> On Sat, Feb 17, 2024 at 4:44 PM Aldrin <octalene....@pm.me.invalid> wrote:
>>> 
>>>> Do users of sqlparser-rs mostly use DataFusion? I don't know the
>>>> community, but it seems like it would be an annoying change for users
>>>> who use it with a different query engine. Just a thought.
>>>> 
>>>> 
>>>> On Sat, Feb 17, 2024 at 10:26, Andy Grove <andygrov...@gmail.com> wrote:
>>>> 
>>>> I agree that it simplifies shipping new SQL features in DataFusion,
>>>> since we can develop the changes in the parser concurrently with the
>>>> changes in other DataFusion crates and then release them all together.
>>>> 
>>>> The name of the crate would not need to change, so downstream users
>>>> should see no impact.
>>>> 
>>>> We would need to decide if we want to keep a separate version number or
>>>> bring it in line with DataFusion version numbers (I have no preference
>>>> either way).
>>>> 
>>>> 
>>>> 
>>>> On Sat, Feb 17, 2024 at 11:09 AM Mehmet Ozan Kabak <o...@synnada.ai> wrote:
>>>> 
>>>>> Doing this will probably reduce the time-to-ship for DataFusion
>>>>> features that need parsing support due to increased convenience, so
>>>>> I’m inclined to see it in a positive light.
>>>>> 
>>>>> What would be the impact of doing this on people who use only
>>>>> sqlparser-rs, if any?
>>>>> 
>>>>>> On Feb 17, 2024, at 7:16 PM, Andy Grove <andygrov...@gmail.com> wrote:
>>>>>> 
>>>>>> The sqlparser-rs project [1] seems to have become the de-facto SQL
>>>>>> parser for Rust, with almost 4 million downloads so far. This was
>>>>>> originally part of DataFusion very early on, and I moved it into a
>>>>>> separate project because it seemed useful for other projects. This was
>>>>>> before DataFusion was known as a composable query engine, and with
>>>>>> hindsight, I probably should have left it as part of the DataFusion
>>>>>> project.
>>>>>> 
>>>>>> Now that DataFusion has a reputation as a composable query engine, I
>>>>>> think it would make sense to move this code back into DataFusion, where
>>>>>> it would benefit from a larger community of maintainers.
>>>>>> 
>>>>>> I would like to hear thoughts from the Apache Arrow / DataFusion
>>>>>> community. Does this seem like a good idea?
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Andy.
>>>>>> 
>>>>>> [1] https://github.com/sqlparser-rs/sqlparser-rs
>>>>> 
>>>>> 
>>>> 
>>>> 
>> 
