First, a +1000 on Will's blog post! [1]

Continue:

Building tools that benefit users of all languages, with particular kudos
to ADBC for providing an ABI-stable way to write database drivers that can
be used by practitioners in C++, Ruby, Python, Java, Go, and (soon!) R.

Start:

I wonder if this is the year that we can find a way to write compute
functions in such a way that separate implementations don't have to exist
for C++, Go, and Rust (and maybe others I don't know about).

Stop:

Will's comment that we should stop building data scientist-facing tools
under the Arrow name struck a particular chord with me: the R package is
very much data scientist-facing, and we have a rather large disconnect between
the technical capacity of our users and the technical capacity required to
contribute to the package (e.g., maintaining a development Arrow C++
install). The types of things we have to do to make RecordBatchReader,
Arrays, Buffer, RecordBatch and Table structures available to R users and
the types of things we have to do to provide an Acero dplyr backend are
vastly different.

[1] https://www.datawill.io/posts/apache-arrow-2022-reflection/

On Thu, Dec 29, 2022 at 4:09 PM Jacob Wujciak <ja...@voltrondata.com.invalid>
wrote:

> This is a great idea, I will add some thoughts later but just wanted to
> quickly add that the Zulip Chat [1] was recently switched to allow anyone
> to register without the need for an invite link!
> [1]:  https://ursalabs.zulipchat.com/
>
>
> On Wed, Dec 28, 2022 at 11:27 PM Will Jones <will.jones...@gmail.com>
> wrote:
>
> > Thanks for suggesting this Andrew.
> >
> > I just uploaded a blog post with my thoughts in long form [1]. Here are
> > some suggestions pulled from that:
> >
> > Continue:
> >
> > I hope we will continue prioritizing updating the spec for new array
> > formats. [2] I think this is very important for avoiding fragmentation
> > and may even open opportunities for consolidation in the C++ ecosystem.
> >
> > +1 on additional improvements for documentation, examples, no-invite
> > chats.
> > I am particularly keen on seeing evangelism for our protocols; existing
> > ones like C Data Interface aren't nearly as widely known as they ought to
> > be and I'm excited for new ones like ADBC.
> >
> > Start:
> >
> > Find ways for each subproject to publicly develop a clear roadmap.
> > Otherwise by default these discussions happen in private, either between
> > individual ICs or within corporate environments. Some subprojects, such
> > as Acero, could likely use their own sync call to help facilitate this,
> > even if on a slower cadence than the main biweekly call.
> >
> > Also, other sync calls might consider adopting the sync call note style
> > used in the Rust projects, where all notes are in one Google Doc [3]
> > rather than spread across main mailing list threads. That seems like a
> > format that would make it easy for new contributors to catch up on the
> > major focuses of the project.
> >
> > Stop:
> >
> > Don't create end-user (e.g. data scientist) facing tools under the name
> > Arrow; prefer keeping separate brand identities for those tools and
> > keeping Arrow libraries as developer-facing libraries.
> >
> > [1] https://www.datawill.io/posts/apache-arrow-2022-reflection/
> > [2] https://lists.apache.org/thread/49qzofswg1r5z7zh39pjvd1m2ggz2kdq
> > [3] https://docs.google.com/document/d/1atCVnoff5SR4eM4Lwf2M1BBJTY6g3_HUNR6qswYJW_U/edit#heading=h.qkuvi08gk4qa
> >
> > On Mon, Dec 26, 2022 at 10:12 AM Andrew Lamb <al...@influxdata.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > I am very excited and honored to help steer the Arrow Project this year
> > > as Arrow PMC Chair.
> > >
> > > Something Kou suggested, and the PMC thought would be valuable, is to
> > > have a small retrospective about the state of the project and where we
> > > want to take it. I would like to try doing so via a “state of the
> > > project” type discussion on this mailing list, inspired by an example
> > > from Apache Calcite [1].
> > >
> > > I welcome any / all comments on the following topics: What things /
> > > activities, if any, do you think the Apache Arrow Community should:
> > >
> > > 1. Continue
> > > 2. Start
> > > 3. Stop
> > >
> > > My thoughts are below.
> > >
> > > Andrew
> > >
> > > [1] https://lists.apache.org/thread/tx8gw3vxc4kwfzjs6q2gqwgywnsm1zbf
> > >
> > > Continue:
> > >
> > > I hope we can continue to encourage and support community growth,
> > > focused especially on supporting the sub projects and their leadership.
> > > I also would like to continue and grow the outward facing evangelism
> > > about the project with blog posts and presentations.
> > >
> > > Start:
> > >
> > > Lower the barrier to contributors and accepting those contributions
> > > even more, especially for casual contributors. The move to GitHub
> > > issues from JIRA I see as one example of lowering this barrier (by
> > > reducing the required account maintenance). I would love to see
> > > additional improvements in areas like documentation, examples,
> > > no-invite-needed chat, etc.
> > >
> > > Stop:
> > >
> > > It would be nice to stop (reduce) the reliance on the relatively small
> > > number of core contributors for code review. I don’t have any
> > > particular insight on how to accomplish this, and suspect we will
> > > always have less review capacity than we would like, but it would be
> > > nice to encourage the growth.
> > >
> >
>
