In this case, maybe we can bring sqlparser-rs into the ASF umbrella following the arrow-datafusion model?
Once DataFusion becomes a top-level project, we could move it to datafusion-sqlparser-rs. It would be a quasi-independent project, just as DataFusion is today w.r.t. Arrow, but it would get most of the benefits of having a community behind it.

> On Feb 27, 2024, at 2:11 AM, Andrew Lamb <al...@influxdata.com> wrote:
>
> Julian, thank you for your insight. I very much agree with it.
>
>> I think the ASF is wrong on this. I think it needs to provide a home
>> for medium-sized projects such as sqlparser-rs in an existing
>> top-level project;
>
> It could be said that DataFusion fits this model -- it isn't really an
> "Arrow" project but needed a place to live and grow, and the Arrow ASF
> community provided that.
>
> Andrew
>
> On Mon, Feb 26, 2024 at 1:09 PM Julian Hyde <jh...@apache.org> wrote:
>
>> I am torn on this.
>>
>> On one hand, I am a big fan of components that are standalone - have
>> no more dependencies than necessary, and are self-evidently
>> standalone. So, I think that re-absorbing sqlparser-rs back into
>> DataFusion would not be a good step. It would reduce the perception
>> that it is standalone.
>>
>> On the other hand, it sounds as if sqlparser-rs would benefit by
>> having an Apache-like community around it. DataFusion isn't a perfect
>> fit - there is not much overlap between DataFusion and sqlparser-rs
>> users - but it takes a lot of effort to create and run a top-level
>> project, and DataFusion is already up and running.
>>
>> The tension is that people want to consume components that they
>> perceive to be standalone, and yet the ASF wants to create communities
>> that produce either a single large component or sets of highly-coupled
>> components. The ASF used to do 'umbrella projects' whose sub-projects
>> were in the same subject area but had little or no dependencies. For
>> example, Apache DB [ https://db.apache.org/ ] has JDO, Derby and
>> Torque. And commons included many useful Java libraries.
>> Umbrella projects caused problems during the Jakarta and Hadoop eras,
>> and now are strongly discouraged at the ASF.
>>
>> I think the ASF is wrong on this. I think it needs to provide a home
>> for medium-sized projects such as sqlparser-rs in an existing
>> top-level project; maybe those projects grow into top-level projects,
>> or maybe they remain medium-sized projects. This is especially
>> necessary in the Rust community, where there are many exciting
>> projects, but they are almost all happening outside ASF. (This is
>> exactly where Java was in ~2005. Maybe we need a rust-commons or
>> rust-db?)
>>
>> My conclusion is to leave sqlparser-rs where it is for now, but to
>> continue talking about what might be an attractive home for it in ASF.
>>
>> Julian
>>
>> On Mon, Feb 26, 2024 at 8:12 AM Andrew Lamb <al...@influxdata.com> wrote:
>>>
>>> Sorry for the late reply,
>>>
>>> I think sqlparser-rs users are quite a bit more varied than DataFusion and
>>> there is not a large overlap between the contributors of the two projects.
>>> I currently seem to be the one reviewing / merging most sqlparser-rs
>>> reviews, and I would definitely love some more help.
>>>
>>> However, given that the project is not an Apache project, I did not have
>>> good luck attracting help. A related discussion is here [1].
>>>
>>> If the DataFusion community would like to accelerate releases, we can also
>>> try to do that without bringing it into Apache governance. Specifically, it
>>> would be great to have help reviewing the PRs -- the actual release process
>>> is pretty low overhead. The reviews are what take the vast majority of the
>>> maintenance time.
>>>
>>> Andrew
>>>
>>> [1]: https://github.com/sqlparser-rs/sqlparser-rs/issues/818
>>>
>>> On Sat, Feb 17, 2024 at 4:44 PM Aldrin <octalene....@pm.me.invalid> wrote:
>>>
>>>> do users of sqlparser-rs mostly use datafusion?
>>>> I don't know the community, but it seems like it would be an annoying
>>>> change for users who use it with a different query engine. Just a thought
>>>>
>>>> Sent from Proton Mail <https://proton.me/mail/home> for iOS
>>>>
>>>>
>>>> On Sat, Feb 17, 2024 at 10:26, Andy Grove <andygrov...@gmail.com> wrote:
>>>>
>>>> I agree that it simplifies shipping new SQL features in DataFusion since we
>>>> can develop the changes in the parser concurrently with the changes in
>>>> other DataFusion crates and then release them all together.
>>>>
>>>> The name of the crate would not need to change, so downstream users should
>>>> see no impact.
>>>>
>>>> We would need to decide if we want to keep a separate version number or
>>>> bring it in line with DataFusion version numbers (I have no preference
>>>> either way).
>>>>
>>>> On Sat, Feb 17, 2024 at 11:09 AM Mehmet Ozan Kabak <o...@synnada.ai> wrote:
>>>>
>>>>> Doing this will probably reduce the time-to-ship for DataFusion features
>>>>> that need parsing support due to increased convenience, so I’m inclined to
>>>>> see it in a positive light.
>>>>>
>>>>> What would be the impact of doing this on people who use only
>>>>> sqlparser-rs, if any?
>>>>>
>>>>>> On Feb 17, 2024, at 7:16 PM, Andy Grove <andygrov...@gmail.com> wrote:
>>>>>>
>>>>>> The sqlparser-rs project [1] seems to have become the de-facto SQL parser
>>>>>> for Rust, with almost 4 million downloads so far. This was originally part
>>>>>> of DataFusion very early on, and I moved it into a separate project because
>>>>>> it seemed useful for other projects. This was before DataFusion was known
>>>>>> as a composable query engine, and with hindsight, I probably should have
>>>>>> left it as part of the DataFusion project.
>>>>>>
>>>>>> Now that DataFusion has a reputation as a composable query engine, I think
>>>>>> it would make sense to move this code back into DataFusion, where it would
>>>>>> benefit from a larger community of maintainers.
>>>>>>
>>>>>> I would like to hear thoughts from the Apache Arrow / DataFusion community.
>>>>>> Does this seem like a good idea?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Andy.
>>>>>>
>>>>>> [1] https://github.com/sqlparser-rs/sqlparser-rs