I started a discussion about moving sqlparser into Apache Software Foundation governance[1].
Please provide any comments you may have there.

[1]: https://github.com/sqlparser-rs/sqlparser-rs/issues/1294

On Thu, Feb 29, 2024 at 5:02 PM Andy Grove <andygrov...@gmail.com> wrote:

> I will put this proposal on hold for now and restart the conversation later this year once DataFusion is a top-level ASF project.
>
> Thanks again for all the feedback.
>
> Andy.
>
> On Wed, Feb 28, 2024 at 9:58 AM Andy Grove <andygrov...@gmail.com> wrote:
>
> > Thanks for all the feedback so far.
> >
> > It does seem that the least contentious way to do this would be to follow Andrew's suggestion of having a separate apache/[arrow-]datafusion-sqlparser repository as this will ensure that we do not end up adding any DataFusion dependencies to the sqlparser project, and that it continues to have its own release process.
> >
> > The main benefit here is that it would bring it under ASF governance and allow those who have permission from their employers to contribute to Apache Arrow/DataFusion to be able to help with the maintenance burden.
> >
> > Andy.
> >
> > On Wed, Feb 28, 2024 at 4:28 AM Andrew Lamb <al...@influxdata.com> wrote:
> >
> >> One potential way "moving sqlparser-rs into DataFusion" could look is that code/repo is moved from the sqlparser-rs [1] organization to the apache organization.
> >> For example
> >>
> >> https://github.com/sqlparser-rs/sqlparser-rs
> >> to
> >> https://github.com/apache/datafusion-sqlparser
> >>
> >> We could continue development separately from any other code, release it as a separate artifact, but use the same overarching governance structure (voting on releases, committer access, etc)
> >>
> >> To follow this model, I think the largest work item would be to run the IP clearance process, and since sqlparser-rs has many distinct contributors that may take a while
> >>
> >> Andrew
> >>
> >> On Wed, Feb 28, 2024 at 1:45 AM Aldrin <octalene....@pm.me.invalid> wrote:
> >>
> >> > Maybe it would be valuable to more explicitly define "moving back into DataFusion project".
> >> >
> >> > I assumed it meant absorbing into the datafusion repo, but it occurs to me that may not be the case. Then, how would sqlparser-rs be "moved"?
> >> >
> >> > # ------------------------------
> >> > # Aldrin
> >> >
> >> > https://github.com/drin/
> >> > https://gitlab.com/octalene
> >> > https://keybase.io/octalene
> >> >
> >> > On Tuesday, February 27th, 2024 at 16:20, Chak-Pong Chung <chakpongch...@gmail.com> wrote:
> >> >
> >> > > There are cases where people need datafusion but not a SQL parser. For example, people building a composable query engine for graph or other data modality may not choose SQL as the DSL. Decoupling them seems to be a good idea.
> >> > >
> >> > > On Tue, Feb 27, 2024, 6:20 AM Mehmet Ozan Kabak o...@synnada.ai wrote:
> >> > >
> >> > > > In this case, maybe we can bring sqlparser-rs into the ASF umbrella following the arrow-datafusion model?
> >> > > >
> >> > > > Once DataFusion becomes a top-level project, we could move it to datafusion-sqlparser-rs; it would be a quasi-independent project just like how DataFusion is today w.r.t. Arrow. But it would get most benefits of having a community behind it.
> >> > > >
> >> > > > > On Feb 27, 2024, at 2:11 AM, Andrew Lamb al...@influxdata.com wrote:
> >> > > > >
> >> > > > > Julian, thank you for your insight. I very much agree with it.
> >> > > > >
> >> > > > > > I think the ASF is wrong on this. I think it needs to provide a home for medium-sized projects such as sqlparser-rs in an existing top-level project;
> >> > > > >
> >> > > > > It could be said that DataFusion fits this model -- it isn't really an "Arrow" project but needed a place to live and grow, and the Arrow ASF community provided that.
> >> > > > >
> >> > > > > Andrew
> >> > > > >
> >> > > > > On Mon, Feb 26, 2024 at 1:09 PM Julian Hyde jh...@apache.org wrote:
> >> > > > >
> >> > > > > > I am torn on this.
> >> > > > > >
> >> > > > > > On one hand, I am a big fan of components that are standalone - have no more dependencies than necessary, and are self-evidently standalone. So, I think that re-absorbing sqlparser-rs back into DataFusion would not be a good step. It would reduce the perception that it is standalone.
> >> > > > > >
> >> > > > > > On the other hand, it sounds as if sqlparser-rs would benefit by having an Apache-like community around it.
> >> > > > > > DataFusion isn't a perfect fit - there is not much overlap between DataFusion and sqlparser-rs users - but it takes a lot of effort to create and run a top-level project, and DataFusion is already up and running.
> >> > > > > >
> >> > > > > > The tension is that people want to consume components that they perceive to be standalone, and yet the ASF wants to create communities that produce either a single large component or sets of highly-coupled components. The ASF used to do 'umbrella projects' whose sub-projects were in the same subject area but had little or no dependencies. For example, Apache DB [ https://db.apache.org/ ] has JDO, Derby and Torque. And commons included many useful Java libraries. Umbrella projects caused problems during the Jakarta and Hadoop eras, and now are strongly discouraged at the ASF.
> >> > > > > >
> >> > > > > > I think the ASF is wrong on this. I think it needs to provide a home for medium-sized projects such as sqlparser-rs in an existing top-level project; maybe those projects grow into top-level projects, or maybe they remain medium-sized projects. This is especially necessary in the Rust community, where there are many exciting projects, but they are almost all happening outside ASF. (This is exactly where Java was in ~2005. Maybe we need a rust-commons or rust-db?)
> >> > > > > >
> >> > > > > > My conclusion is to leave sqlparser-rs where it is for now, but to continue talking about what might be an attractive home for it in ASF.
> >> > > > > >
> >> > > > > > Julian
> >> > > > > >
> >> > > > > > On Mon, Feb 26, 2024 at 8:12 AM Andrew Lamb al...@influxdata.com wrote:
> >> > > > > >
> >> > > > > > > Sorry for the late reply,
> >> > > > > > >
> >> > > > > > > I think sqlparser-rs users are quite a bit more varied than DataFusion and there is not a large overlap between the contributors of the two projects. I currently seem to be the one reviewing / merging most sqlparser-rs reviews, and I would definitely love some more help.
> >> > > > > > >
> >> > > > > > > However, given that the project is not an Apache project, I did not have good luck attracting help. A related discussion is here [1].
> >> > > > > > >
> >> > > > > > > If the DataFusion community would like to accelerate releases, we can also try to do that without bringing it into Apache governance. Specifically, it would be great to have help reviewing the PRs -- the actual release process is pretty low overhead. The reviews are what take the vast majority of the maintenance time.
> >> > > > > > >
> >> > > > > > > Andrew
> >> > > > > > >
> >> > > > > > > On Sat, Feb 17, 2024 at 4:44 PM Aldrin octalene....@pm.me.invalid wrote:
> >> > > > > > >
> >> > > > > > > > do users of sqlparser-rs mostly use datafusion? I don't know the community, but it seems like it would be an annoying change for users who use it with a different query engine.
> >> > > > > > > > Just a thought
> >> > > > > > > >
> >> > > > > > > > Sent from Proton Mail https://proton.me/mail/home for iOS
> >> > > > > > > >
> >> > > > > > > > On Sat, Feb 17, 2024 at 10:26, Andy Grove <andygrov...@gmail.com> wrote:
> >> > > > > > > >
> >> > > > > > > > I agree that it simplifies shipping new SQL features in DataFusion since we can develop the changes in the parser concurrently with the changes in other DataFusion crates and then release them all together.
> >> > > > > > > >
> >> > > > > > > > The name of the crate would not need to change, so downstream users should see no impact.
> >> > > > > > > >
> >> > > > > > > > We would need to decide if we want to keep a separate version number or bring it in line with DataFusion version numbers (I have no preference either way).
> >> > > > > > > >
> >> > > > > > > > On Sat, Feb 17, 2024 at 11:09 AM Mehmet Ozan Kabak o...@synnada.ai wrote:
> >> > > > > > > >
> >> > > > > > > > > Doing this will probably reduce the time-to-ship for DataFusion features that need parsing support due to increased convenience, so I’m inclined to see it in a positive light.
> >> > > > > > > > >
> >> > > > > > > > > What would be the impact of doing this on people who use only sqlparser-rs, if any?
> >> > > > > > > > >
> >> > > > > > > > > > On Feb 17, 2024, at 7:16 PM, Andy Grove andygrov...@gmail.com wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > The sqlparser-rs project [1] seems to have become the de-facto SQL parser for Rust, with almost 4 million downloads so far. This was originally part of DataFusion very early on, and I moved it into a separate project because it seemed useful for other projects. This was before DataFusion was known as a composable query engine, and with hindsight, I probably should have left it as part of the DataFusion project.
> >> > > > > > > > > >
> >> > > > > > > > > > Now that DataFusion has a reputation as a composable query engine, I think it would make sense to move this code back into DataFusion, where it would benefit from a larger community of maintainers.
> >> > > > > > > > > >
> >> > > > > > > > > > I would like to hear thoughts from the Apache Arrow / DataFusion community. Does this seem like a good idea?
> >> > > > > > > > > >
> >> > > > > > > > > > Thanks,
> >> > > > > > > > > >
> >> > > > > > > > > > Andy.
> >> > > > > > > > > >
> >> > > > > > > > > > [1] https://github.com/sqlparser-rs/sqlparser-rs