I started a discussion about moving sqlparser into Apache Software
Foundation governance[1].

Please provide any comments you may have there.

[1]: https://github.com/sqlparser-rs/sqlparser-rs/issues/1294

On Thu, Feb 29, 2024 at 5:02 PM Andy Grove <andygrov...@gmail.com> wrote:

> I will put this proposal on hold for now and restart the conversation later
> this year once DataFusion is a top-level ASF project.
>
> Thanks again for all the feedback.
>
> Andy.
>
> On Wed, Feb 28, 2024 at 9:58 AM Andy Grove <andygrov...@gmail.com> wrote:
>
> > Thanks for all the feedback so far.
> >
> > It does seem that the least contentious way to do this would be to follow
> > Andrew's suggestion of having a separate
> > apache/[arrow-]datafusion-sqlparser repository as this will ensure that
> we
> > do not end up adding any DataFusion dependencies to the sqlparser
> project,
> > and that it continues to have its own release process.
> >
> > The main benefit here is that it would bring it under ASF governance and
> > allow those who have permission from their employers to contribute to
> > Apache Arrow/DataFusion to be able to help with the maintenance burden.
> >
> > Andy.
> >
> >
> >
> > On Wed, Feb 28, 2024 at 4:28 AM Andrew Lamb <al...@influxdata.com>
> wrote:
> >
> >> One potential way "moving sqlparser-rs into DataFusion" could look is
> that
> >> code/repo is moved from the sqlparser-rs [1] organization to the apache
> >> organization. For example
> >>
> >> https://github.com/sqlparser-rs/sqlparser-rs
> >> to
> >> https://github.com/apache/datafusion-sqlparser
> >>
> >> We could continue development separately from any other code, release it
> >> as
> >> a separate artifact, but use the same overarching governance structure
> >> (voting on releases, committer access, etc)
> >>
> >> To follow this model, I think the largest work item would be to run the
> IP
> >> clearance process, and since sqlparser-rs has many distinct contributors
> >> that may take a while
> >>
> >> Andrew
> >>
> >>
> >>
> >> On Wed, Feb 28, 2024 at 1:45 AM Aldrin <octalene....@pm.me.invalid>
> >> wrote:
> >>
> >> > Maybe it would be valuable to more explicitly define "moving back into
> >> > DataFusion project".
> >> >
> >> > I assumed it meant absorbing into the datafusion repo, but it occurs
> to
> >> me
> >> > that may not be the case. Then, how would sqlparser-rs be "moved"?
> >> >
> >> >
> >> >
> >> > # ------------------------------
> >> > # Aldrin
> >> >
> >> >
> >> > https://github.com/drin/
> >> > https://gitlab.com/octalene
> >> > https://keybase.io/octalene
> >> >
> >> >
> >> > On Tuesday, February 27th, 2024 at 16:20, Chak-Pong Chung <
> >> > chakpongch...@gmail.com> wrote:
> >> >
> >> > > There are cases where people need datafusion but not a SQL parser.
> For
> >> > > example, people building a composable query engine for graph or
> other
> >> > data
> >> > > modality may not choose SQL as the DSL. Decoupling them seems to be
> a
> >> > good
> >> > > idea.
> >> > >
> >> >
> >> > > On Tue, Feb 27, 2024, 6:20 AM Mehmet Ozan Kabak o...@synnada.ai
> >> wrote:
> >> > >
> >> >
> >> > > > In this case, maybe we can bring sqlparser-rs into the ASF
> umbrella
> >> > > > following the arrow-datafusion model?
> >> > > >
> >> >
> >> > > > Once DataFusion becomes a top-level project, we could move it to
> >> > > > datafusion-sqlparser-rs — it would be a quasi-independent project
> >> just
> >> > like
> >> > > > how DataFusion is today w.r.t. Arrow. But it would get most
> >> benefits of
> >> > > > having a community behind it.
> >> > > >
> >> >
> >> > > > > On Feb 27, 2024, at 2:11 AM, Andrew Lamb al...@influxdata.com
> >> wrote:
> >> > > > >
> >> >
> >> > > > > Julian, thank you for your insight. I very much agree with it.
> >> > > > >
> >> >
> >> > > > > > I think the ASF is wrong on this. I think it needs to provide
> a
> >> > home
> >> > > > > > for medium-sized projects such as sqlparser-rs in an existing
> >> > > > > > top-level project;
> >> > > > >
> >> >
> >> > > > > It could be said that DataFusion fits this model -- it isn't
> >> really
> >> > an
> >> > > > > "Arrow" project but needed a place to live and grow, and the
> Arrow
> >> > ASF
> >> > > > > community provided that.
> >> > > > >
> >> >
> >> > > > > Andrew
> >> > > > >
> >> >
> >> > > > > On Mon, Feb 26, 2024 at 1:09 PM Julian Hyde jh...@apache.org
> >> wrote:
> >> > > > >
> >> >
> >> > > > > > I am torn on this.
> >> > > > > >
> >> >
> >> > > > > > > On one hand, I am a big fan of components that are
> standalone -
> >> > have
> >> > > > > > no more dependencies than necessary, and are self-evidently
> >> > > > > > standalone. So, I think that re-absorbing sqlparser-rs back
> into
> >> > > > > > DataFusion would not be a good step. It would reduce the
> >> perception
> >> > > > > > that it is standalone.
> >> > > > > >
> >> >
> >> > > > > > On the other hand, it sounds as if sqlparser-rs would benefit
> by
> >> > > > > > having an Apache-like community around it. DataFusion isn't a
> >> > perfect
> >> > > > > > fit - there is not much overlap between DataFusion and
> >> sqlparser-rs
> >> > > > > > users - but it takes a lot of effort to create and run a
> >> top-level
> >> > > > > > project, and DataFusion is already up and running.
> >> > > > > >
> >> >
> >> > > > > > The tension is that people want to consume components that
> they
> >> > > > > > perceive to be standalone, and yet the ASF wants to create
> >> > communities
> >> > > > > > that produce either a single large component or sets of
> >> > highly-coupled
> >> > > > > > components. The ASF used to do 'umbrella projects' whose
> >> > sub-projects
> >> > > > > > were in the same subject area but had little or no
> dependencies.
> >> > For
> >> > > > > > example, Apache DB [ https://db.apache.org/ ] has JDO, Derby
> >> and
> >> > > > > > Torque. And commons included many useful Java libraries.
> >> Umbrella
> >> > > > > > projects caused problems during the Jakarta and Hadoop eras,
> and
> >> > now
> >> > > > > > are strongly discouraged at the ASF.
> >> > > > > >
> >> >
> >> > > > > > I think the ASF is wrong on this. I think it needs to provide
> a
> >> > home
> >> > > > > > for medium-sized projects such as sqlparser-rs in an existing
> >> > > > > > top-level project; maybe those projects grow into top-level
> >> > projects,
> >> > > > > > or maybe they remain medium-sized projects. This is especially
> >> > > > > > necessary in the Rust community, where there are many exciting
> >> > > > > > projects, but they are almost all happening outside ASF. (This
> >> is
> >> > > > > > exactly where Java was in ~2005. Maybe we need a rust-commons
> or
> >> > > > > > rust-db?)
> >> > > > > >
> >> >
> >> > > > > > My conclusion is to leave sqlparser-rs where it is for now,
> but
> >> to
> >> > > > > > continue talking about what might be an attractive home for it
> >> in
> >> > ASF.
> >> > > > > >
> >> >
> >> > > > > > Julian
> >> > > > > >
> >> >
> >> > > > > > On Mon, Feb 26, 2024 at 8:12 AM Andrew Lamb
> >> al...@influxdata.com
> >> > > > > > wrote:
> >> > > > > >
> >> >
> >> > > > > > > Sorry for the late reply,
> >> > > > > > >
> >> >
> >> > > > > > > I think sqlparser-rs users are quite a bit more varied than
> >> > DataFusion
> >> > > > > > > and
> >> > > > > > > there is not a large overlap between the contributors of the
> >> two
> >> > > > > > > projects.
> >> > > > > > > I currently seem to be the one reviewing / merging most
> >> > sqlparser-rs
> >> > > > > > > reviews, and I would definitely love some more help.
> >> > > > > > >
> >> >
> >> > > > > > > However, given that the project is not an Apache project, I
> >> did
> >> > not
> >> > > > > > > have
> >> > > > > > > good luck attracting help. A related discussion is here 1.
> >> > > > > > >
> >> >
> >> > > > > > > If the DataFusion community would like to accelerate
> releases,
> >> > we can
> >> > > > > > > also
> >> > > > > > > try to do that without bringing it into Apache governance.
> >> > > > > > > Specifically,
> >> > > > > > > it
> >> > > > > > > would be great to have help reviewing the PRs -- the actual
> >> > release
> >> > > > > > > process
> >> > > > > > > is pretty low overhead. The reviews are what take the vast
> >> > majority of
> >> > > > > > > the
> >> > > > > > > maintenance time.
> >> > > > > > >
> >> >
> >> > > > > > > Andrew
> >> > > > > > >
> >> >
> >> > > > > > > On Sat, Feb 17, 2024 at 4:44 PM Aldrin
> >> octalene....@pm.me.invalid
> >> > > > > > > wrote:
> >> > > > > > >
> >> >
> >> > > > > > > > > Do users of sqlparser-rs mostly use datafusion? I don't
> know
> >> > the
> >> > > > > > > > community, but it seems like it would be an annoying
> change
> >> > for users
> >> > > > > > > > who
> >> > > > > > > > > use it with a different query engine. Just a thought.
> >> > > > > > > >
> >> >
> >> > > > > > > > > On Sat, Feb 17, 2024 at 10:26, Andy Grove <andygrov...@gmail.com> wrote:
> >> > > > > > > >
> >> >
> >> > > > > > > > I agree that it simplifies shipping new SQL features in
> >> > DataFusion
> >> > > > > > > > since we
> >> > > > > > > > can develop the changes in the parser concurrently with
> the
> >> > changes in
> >> > > > > > > > other DataFusion crates and then release them all
> together.
> >> > > > > > > >
> >> >
> >> > > > > > > > The name of the crate would not need to change, so
> >> downstream
> >> > users
> >> > > > > > > > should
> >> > > > > > > > see no impact.
> >> > > > > > > >
> >> >
> >> > > > > > > > We would need to decide if we want to keep a separate
> >> version
> >> > number
> >> > > > > > > > or
> >> > > > > > > > bring it in line with DataFusion version numbers (I have
> no
> >> > preference
> >> > > > > > > > either way).
> >> > > > > > > >
> >> >
> >> > > > > > > > On Sat, Feb 17, 2024 at 11:09 AM Mehmet Ozan Kabak
> >> > o...@synnada.ai
> >> > > > > > > > wrote:
> >> > > > > > > >
> >> >
> >> > > > > > > > > Doing this will probably reduce the time-to-ship for
> >> > DataFusion
> >> > > > > > > > > features
> >> > > > > > > > > that need parsing support due to increased convenience,
> so
> >> > I’m
> >> > > > > > > > > inclined
> >> > > > > > > > > to
> >> > > > > > > > > see it in a positive light.
> >> > > > > > > > >
> >> >
> >> > > > > > > > > What would be the impact of doing this on people who use
> >> only
> >> > > > > > > > > sqlparser-rs, if any?
> >> > > > > > > > >
> >> >
> >> > > > > > > > > > On Feb 17, 2024, at 7:16 PM, Andy Grove
> >> > andygrov...@gmail.com
> >> > > > > > > > > > wrote:
> >> > > > > > > > > >
> >> >
> >> > > > > > > > > > > The sqlparser-rs project [1] seems to have become the
> >> > de-facto SQL
> >> > > > > > > > > > parser
> >> > > > > > > > > > for Rust, with almost 4 million downloads so far. This
> >> was
> >> > > > > > > > > > originally
> >> > > > > > > > > > part
> >> > > > > > > > > > of DataFusion very early on, and I moved it into a
> >> > separate project
> >> > > > > > > > > > because
> >> > > > > > > > > > it seemed useful for other projects. This was before
> >> > DataFusion was
> >> > > > > > > > > > known
> >> > > > > > > > > > as a composable query engine, and with hindsight, I
> >> > probably should
> >> > > > > > > > > > have
> >> > > > > > > > > > left it as part of the DataFusion project.
> >> > > > > > > > > >
> >> >
> >> > > > > > > > > > Now that DataFusion has a reputation as a composable
> >> query
> >> > engine,
> >> > > > > > > > > > I
> >> > > > > > > > > > think
> >> > > > > > > > > > it would make sense to move this code back into
> >> > DataFusion, where
> >> > > > > > > > > > it
> >> > > > > > > > > > would
> >> > > > > > > > > > benefit from a larger community of maintainers.
> >> > > > > > > > > >
> >> >
> >> > > > > > > > > > I would like to hear thoughts from the Apache Arrow /
> >> > DataFusion
> >> > > > > > > > > > community.
> >> > > > > > > > > > Does this seem like a good idea?
> >> > > > > > > > > >
> >> >
> >> > > > > > > > > > Thanks,
> >> > > > > > > > > >
> >> >
> >> > > > > > > > > > Andy.
> >> > > > > > > > > >
> >> >
> >> > > > > > > > > > > [1] https://github.com/sqlparser-rs/sqlparser-rs
> >>
> >
>
