Hi,

Thanks for the detailed answer.

Unlike my previous email, this one is the opinionated part:

Generally I like the idea of smaller crates. It helps with a lot of things
(different targets, build time), but those benefits can be achieved with
feature gates too.
The one upside feature gates can't give us is out-of-sync crate releases.
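
To make the feature-gate side of that comparison concrete, here is a minimal sketch. The feature names `io` and `compute` and all module names are hypothetical (they are not existing arrow flags); a real crate would declare the features under `[features]` in its Cargo.toml.

```rust
// Sketch only: `io` and `compute` are hypothetical feature names that a
// Cargo.toml would declare under `[features]`; they are not real arrow flags.

/// Always compiled: the part every user needs.
pub mod core_arrays {
    pub fn sum(values: &[i64]) -> i64 {
        values.iter().sum()
    }
}

/// Compiled only when the hypothetical `io` feature is enabled.
#[cfg(feature = "io")]
pub mod io {
    pub fn supported_formats() -> &'static [&'static str] {
        &["ipc", "csv"]
    }
}

/// Compiled only when the hypothetical `compute` feature is enabled.
#[cfg(feature = "compute")]
pub mod compute {
    pub fn max(values: &[i64]) -> Option<i64> {
        values.iter().copied().max()
    }
}

/// Reports which parts were compiled into this build.
pub fn enabled_parts() -> Vec<&'static str> {
    let mut parts = vec!["core"];
    if cfg!(feature = "io") {
        parts.push("io");
    }
    if cfg!(feature = "compute") {
        parts.push("compute");
    }
    parts
}
```

All gated modules still ship in one crate under one version number, so this gives the build-time and target benefits but not independent releases.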

Maintenance is important, historically speaking I've seen it solved for
open source by private companies offering it as a paid service.
You are right that currently only 3 months of support is provided for free,
but personally I don't see that as an issue.
There are professional libraries and software with close to 100% market
share in their field which support the last or last two versions only
(Chrome, OS-es, compilers).
I find it hard to imagine we'd want to do it *better*; that sounds like an
illusion, but I'd like to be wrong on this one :)
Professionally speaking, when picking projects, having Apache (or other)
governance and community is more important for the businesses I worked
with, than the release schedule or API stability / versioning.


Based on the above and that there are about a dozen active Rust arrow
contributors, any promise for reliable maintenance over years would be a
lie in my eyes.
DataFusion, Polars, odbc2parquet and others had issues with the changes
being too slow, not too fast.

I'm a big advocate of middle grounds, and I still believe that your efforts
and ideal setup are compatible with arrow-rs. Nobody would stop you from
creating a 5.23.0 release next to the 6.1.0 if you wanted to backport
anything, and nobody would stop you from cutting an out-of-schedule 6.2 or
even 7.0 release if it's to ensure security. The frequent Apache release
process - which we were afraid of - has been smooth so far, with surprisingly
nice support from members of different languages / implementations.
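
As a rough illustration of why a 5.23.0 backport would reach users pinned to `^5.0` while a 6.x release would not, here is a minimal sketch of caret-style matching for versions at or above 1.0.0. The `Version` type and both functions are hypothetical helpers written for this email, not Cargo's implementation; pre-release tags and the special 0.x rules are deliberately left out.

```rust
/// A parsed `major.minor.patch` version (pre-release tags are not handled).
/// Field order matters: the derived ordering compares major, then minor,
/// then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Parses exactly three dot-separated numeric components.
pub fn parse(s: &str) -> Option<Version> {
    let mut it = s.split('.');
    let major = it.next()?.parse::<u64>().ok()?;
    let minor = it.next()?.parse::<u64>().ok()?;
    let patch = it.next()?.parse::<u64>().ok()?;
    if it.next().is_some() {
        return None;
    }
    Some(Version { major, minor, patch })
}

/// Caret-style matching for requirements with major >= 1: the candidate must
/// share the major version and must not be older than the requirement.
/// Returns false for 0.x requirements, which follow different rules that this
/// sketch does not model.
pub fn caret_matches(req: Version, candidate: Version) -> bool {
    req.major >= 1 && candidate.major == req.major && candidate >= req
}
```

With a requirement of 5.0.0, `caret_matches` accepts 5.23.0 and rejects 6.1.0, which is the whole point of cutting the backport as a 5.x release.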

Also I believe that any plan you have for turning arrow2 into arrow-rs 6.0
would be more than welcome in a public vote, along with the technical
changes you propose (e.g. cutting a separate arrow-io crate).
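
For readers skimming the thread, the re-export idea you describe below (`pub use arrow_core::array`) could look roughly like this. This is a single-file sketch in which modules stand in for separately versioned crates; the function names are made up for illustration and do not exist in arrow or arrow2.

```rust
// Stand-in for a separately versioned `arrow-core` crate (hypothetical contents).
pub mod arrow_core {
    pub mod array {
        /// Placeholder for real array functionality.
        pub fn len_of(values: &[i32]) -> usize {
            values.len()
        }
    }
}

// Stand-in for a separately versioned `arrow-io` crate (hypothetical contents).
pub mod arrow_io {
    /// Placeholder for real IO functionality.
    pub fn format_name() -> &'static str {
        "ipc"
    }
}

// The facade: the `arrow` crate itself would contain only re-exports, so
// existing users keep a single dependency and familiar paths.
pub mod arrow {
    pub use super::arrow_core::array;
    pub use super::arrow_io as io;
}
```

Callers would keep writing `arrow::array::len_of(..)` or `arrow::io::format_name()`, while the smaller crates underneath could be versioned on their own cadence.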


At least 6 key members showed their excitement for your changes in this
thread and even more on Slack/GitHub ;)

Best regards,
Adam Lippai

On Fri, Aug 6, 2021 at 10:07 AM Jorge Cardoso Leitão <
jorgecarlei...@gmail.com> wrote:

> Hi,
>
> Thanks for your input.
>
> Every time there is a new major release, all new development shifts towards
> that new API and users of previous APIs are left behind. It is not just a
> matter of SemVer and size of version numbers, there is a whole development
> shift to be on top of the new API.
>
> I disagree that a software that has a major release every 3 months and no
> maintenance window over previous versions is stable. I alluded to the Tokio
> example because Tokio 1.0 recently became the runtime of rust-based AWS
> lambda functions [1]; this commitment is only possible by enforcing API
> stability and maintenance beyond a 3 month period (at least 3 years in
> their case).
>
> Also, imo the current major version number is not meaningless: divided by
> the software age, it constitutes the historical release pattern and is
> usually a good predictor of the pattern used in future releases.
>
> The evidence is that we haven't been able to support any version for any
> period of time; recently, Andrew has been doing amazing work at supporting
> the latest version for a period of 3 months. I.e. an application that
> depends on `arrow = ^5.0` has a support window of 3 months. Given that we
> have not backported any security fixes to previous versions, it is
> reasonable to assume that security patches are also applied within a 3
> month period only.
>
> As contributor of arrow2, I would rather not have arrow2 under Apache Arrow
> than having to release it under its current versioning and scheduling (this
> is similar to some of Julia's concerns). As a contributor to the Apache
> Arrow, I currently cannot guarantee a maintenance window over arrow-rs for
> any period of time because it is unsafe by design and I do not have the
> motivation to fix it. As both, I am confident that the core arrow2 will
> soon reach a point where we can live with and develop on top of it for at
> least a year. This is not true to the whole API surface, though: there are
> APIs that we will need to change more often until stability can be
> promised.
>
> So, I am requesting that we tie the discussion of arrow2 to how it will be
> released.
>
> Could a middle ground be somewhere along the lines of splitting the crate
> in smaller crates that are versioned independently. I.e. continue to
> release `arrow` under the same versioning and cadence, and create 3 new
> crates, arrow-core, arrow-compute, and arrow-io (see also [2]) that would
> have their own versioning at 0.X until stability is achieved, based on
> arrow2's code base. The migration of the `arrow` crate to arrow2's API
> would be to re-export from the smaller crates (e.g. `pub use
> arrow_core::array`).
>
> [1] https://crates.io/crates/lambda_runtime/0.3.1/dependencies
> [2] https://github.com/jorgecarleitao/arrow2/issues/257
>
> Best,
> Jorge
>
>
> On Thu, Aug 5, 2021 at 11:53 PM Adam Lippai <a...@rigo.sk> wrote:
>
> > Not taking sides, just two technical notes below.
> >
> > Semver.org clearly defines (
> > https://semver.org/#how-do-i-know-when-to-release-100) the versions
> > >1.0.0.
> > * If it's used in production, it's 1.0.0.
> > * If it provides an API others depend on then it's 1.0.0.
> > * If you intend to keep backward compatibility, it's 1.0.0.
> > TL;DR: 1.0.0 marks the point from which we guarantee that
> > non-production releases are labelled (alpha, beta, rc) and that breaking
> > (API) or otherwise backwards-incompatible changes result in a major
> > version bump. This we already do, 4x per year.
> >
> > The second fact is that arrow2 uses the arrow name, but it doesn't have
> > apache governance. It's not released from GitHub.com/apache, there are no
> > formal releases, there are no votes. This is not correct or fair usage of
> > the brand (on the same level as DataFuse, or db-benchmark calling a
> > custom R implementation arrow) even if it's "unofficial". My
> > understanding is that arrow2 can be an unofficial implementation with a
> > different name or an arrow-rs experiment with the intention to merge the
> > code, but not both.
> >
> > I think both issues could be solved and I really value and like the
> > arrow2 work so far. That's the right way. I hope we'll see it in prod
> > either way as soon as it's ready.
> >
> > Best regards,
> > Adam Lippai
> >
> > On Wed, Aug 4, 2021, 08:25 QP Hou <houqp....@gmail.com> wrote:
> >
> > > Just my two cents.
> > >
> > > I think we all have the same goal here, which is to accelerate the
> > > transitioning of arrow to arrow2 as the official arrow rust
> > > implementation.
> > >
> > > In my opinion, the biggest gain we can get from merging two projects
> > > into one repo is to have some kind of a policy to enforce that every
> > > new feature/test added to the current arrow implementation also  needs
> > > to be added to the arrow2 implementation. This way, we can make sure
> > > the gap between arrow and arrow2 is closing on every iteration.
> > > Without this, I tend to agree with Jorge that merging two repos would
> > > add more overhead to his work and slow him down.
> > >
> > > For those who want to contribute to arrow2 to accelerate the
> > > transition, I don't think they would have problem sending PRs to the
> > > arrow2 repo. For those who are not interested in contributing to
> > > arrow2, merging the arrow2 code base into the current arrow-rs repo
> > > won't incentivize them to contribute. Merging arrow2 into current
> > > arrow-rs repo could help with discovery. But I think this can be
> > > achieved by adding a big note in the current arrow-rs README to
> > > encourage contributions to the arrow2 repo as well.
> > >
> > > At the end of the day, Jorge is currently the sole active contributor
> > > to the arrow2 implementation, so I think he would have the most say on
> > > what's the most productive way to push arrow2 forward. The only
> > > concern I have with regards to merging arrow2 into arrow-rs right now
> > > is Jorge spent all the efforts to do the merge, then it turned out
> > > that he is still the only active contributor to arrow2 within
> > > arrow-rs, but with more overhead that he has to deal with.
> > >
> > > As for maintaining semantic versioning for arrow2, Andy had a good
> > > point that we could still release arrow2 with its own versioning even
> > > if we merge it into the arrow-rs repo. So I don't think we should
> > > worry/focus too much about versioning in our discussion. Velocity to
> > > close the gap between arrow-rs and arrow2 is the most important thing.
> > >
> > > Lastly, I do agree with Andrew that it would be good to only maintain
> > > a single arrow crate in crates.io in the long run. As he mentioned,
> > > when the current arrow2 code base becomes stable, we could still
> > > release it under the arrow namespace in crates.io with a major version
> > > bump. The absolute value in the major version doesn't really matter as
> > > long as we stick to the convention that breaking change will result in
> > > a major version bump.
> > >
> > > Thanks,
> > > QP
> > >
> > >
> > >
> > > On Tue, Aug 3, 2021 at 5:31 PM paddy horan <paddyho...@hotmail.com>
> > wrote:
> > > >
> > > > Hi Jorge,
> > > >
> > > > > I see value in consolidating development in a single repo and
> > > > > releasing under the existing arrow crate.  Regarding versioning, I
> > > > > think once we follow semantic versioning we are fine.  I don't think
> > > > > it's worth migrating to a different repo and crate to comply with the
> > > > > de-facto standard you mention.
> > > >
> > > > Just one person's opinion though,
> > > > Paddy
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Jorge Cardoso Leitão <jorgecarlei...@gmail.com>
> > > > Sent: Tuesday, August 3, 2021 5:23 PM
> > > > To: dev@arrow.apache.org
> > > > Subject: Re: [Discuss] [Rust] Arrow2/parquet2 going foward
> > > >
> > > > Hi Paddy,
> > > >
> > > > > > What do you think about moving Arrow2 into the main Arrow repo
> > > > > > where it is only enabled via an "experimental" feature flag?
> > > >
> > > > AFAIK this is already possible:
> > > > * add `arrow2 = { version = "0.2.0", optional = true }` to Cargo.toml
> > > > * add `#[cfg(feature = "arrow2")]\npub mod arrow2;\n` to lib.rs
> > > >
> > > > > We do this kind of thing to expose APIs from non-arrow crates such as
> > > > > parts of the parquet-format-rs crate, and it is generally the way to
> > > > > go when a crate wants to expose a third-party API.
> > > > >
> > > > > I would not recommend doing this, though: by exposing arrow2 from
> > > > > arrow, we double the compilation time and binary size of all
> > > > > dependencies that activate the flag. Furthermore, there are users of
> > > > > arrow2 that do not need the arrow crate, which this model would not
> > > > > support.
> > > > >
> > > > > AFAIK where development happens is unrelated to this aspect; Rust
> > > > > enables this by design.
> > > >
> > > > > but also this would be a clear signal that Arrow2 is <1.0.
> > > > > the experimental flag will be a clear signal to the existing Arrow
> > > > community that Arrow2 is the future but that it is <1.0
> > > >
> > > > > arrow2 is already <1.0 <https://crates.io/crates/arrow2>.
> > > > > My argument is that the arrow/arrow-flight/parquet are not versioned
> > > > > according to the Rust community standards: It is a de facto practice
> > > > > in Rust to delay major releases until the API is stable. Tokio's blog
> > > > > post about their 1.0 <https://tokio.rs/blog/2020-12-tokio-1-0>
> > > > > (i.e. "[...] we commit to holding back on a Tokio 2.0 release for at
> > > > > least 3 years."). 10 most downloaded crates:
> > > > >
> > > > > * https://crates.io/crates/rand (0.8.4)
> > > > > * https://crates.io/crates/syn (1.0.74)
> > > > > * https://crates.io/crates/libc (0.2.98)
> > > > > * https://crates.io/crates/rand_core (0.6.3)
> > > > > * quote (1.0.9)
> > > > > * unicode-xid (0.2.2)
> > > > > * proc-macro2 (1.0.28)
> > > > > * cfg-if (1.0.0)
> > > > > * https://crates.io/crates/serde (1.0.126)
> > > > > * bitflags (1.2.1)
> > > > > These are small crates with a small scope, but even larger projects
> > > > > share the same pattern:
> > > > >
> > > > > * crossbeam <https://crates.io/crates/crossbeam> (0.8.1)
> > > > > * rocket <https://crates.io/crates/rocket> (0.5)
> > > > > * polars <https://crates.io/crates/polars> (0.14.8)
> > > > > * tower <https://crates.io/crates/tower> (0.4.8)
> > > > > * Tokio <https://crates.io/crates/tokio> (1.9.0)
> > > > > * hyper <https://crates.io/crates/hyper> (0.14.11)
> > > >
> > > > > Crates that arrow depends on
> > > > > <https://github.com/apache/arrow-rs/blob/master/arrow/Cargo.toml>,
> > > > > that DataFusion depends on
> > > > > <https://github.com/apache/arrow-datafusion/blob/master/datafusion/Cargo.toml>,
> > > > > all share the same pattern of being either 0.X, 1.X when their API is
> > > > > stable, and 2.X when they needed a large change in the API. This
> > > > > contrasts with Apache Arrow's releases where we are now at 5.0 (and
> > > > > we have yet to arrive at a safe design).
> > > >
> > > > > existing users will be well supported in this transition
> > > >
> > > > How so? imo people either PR to the arrow/arrow2 code base or they
> > won't.
> > > > This is largely independent of where the development of either arrow2
> > or
> > > arrow happens; people google the crate, click on the repository link
> and
> > > file an issue or field a PR.
> > > >
> > > > > In general, I think the longer that development proceeds in
> separate
> > > > repos the harder it will be to eventually merge the two in a way that
> > > supports existing users.
> > > >
> > > > > How so? I may be mistaken, but API design is unrelated to which repo
> > > > > the development happens in: it is primarily driven by who is
> > > > > designing it and from where or who they are inspired by. Both arrow
> > > > > and parquet's crate design are inspired by the C++ implementation and
> > > > > have gradually been migrated to "idiomatic" Rust, as "idiomatic" is
> > > > > becoming more well defined in Rust.
> > > > > Arrow2 is inspired by the current crate and the pains of using it in
> > > > > DataFusion. Datafuse, a fork of datafusion, recently migrated to
> > > > > arrow2 <https://github.com/datafuselabs/datafuse/pull/1239>:
> > > > > +1,947 −3,484, which shows that the crate is capturing important
> > > > > patterns from the arrow crate and exposing ones that are useful /
> > > > > result in less code for the same or higher performance.
> > > >
> > > > > On the opposite side, merging the development of crates under the
> > > > > same repo leads to: more triaging of PRs; more work for releases and
> > > > > changelogging; tagging based on crates; multiple READMEs in subpaths
> > > > > of the repo; curation of the CI to accommodate this; a workspace with
> > > > > many crates each with its own set of dependencies, increasing
> > > > > compilation and development time; mixed commit logs; difficulties in
> > > > > reverts and cherry-picks; more difficulty finding stuff in the repo.
> > > > > See e.g. how tokio-rs does it: <https://github.com/tokio-rs>, even
> > > > > for small crates like bytes <https://github.com/tokio-rs/bytes>.
> > > >
> > > > Best,
> > > > Jorge
> > > >
> > > > On Tue, Aug 3, 2021 at 3:13 PM paddy horan <paddyho...@hotmail.com>
> > > wrote:
> > > >
> > > > > Hi Jorge,
> > > > >
> > > > > What do you think about moving Arrow2 into the main Arrow repo
> where
> > > > > it is only enabled via an "experimental" feature flag?  This would
> > > > > allow development of Arrow2 to proceed in the main repo but also
> this
> > > > > would be a clear signal that Arrow2 is <1.0.  When we feel ready
> > (i.e.
> > > > > Arrow2 is 1.0) we can release it in the next main release with
> Arrow2
> > > > > being the default and move the existing implementation behind a
> > > "legacy" feature flag.
> > > > >
> > > > > Here is why I think this might work well:
> > > > >  - People contributing to the Arrow project will naturally
> contribute
> > > > > to Arrow2.  At the moment, some people will still contribute to
> Arrow
> > > > > instead of Arrow2 just by virtue of it being the "official"
> > > implementation.
> > > > > However, if both are in one repo people will want to contribute to
> > the
> > > > > "future", i.e. Arrow2.
> > > > >  - the experimental flag will be a clear signal to the existing
> Arrow
> > > > > community that Arrow2 is the future but that it is <1.0
> > > > >  - existing users will be well supported in this transition
> > > > >  - In general, I think the longer that development proceeds in
> > > > > separate repos the harder it will be to eventually merge the two
> in a
> > > > > way that supports existing users.
> > > > >
> > > > > Do you think would work?
> > > > >
> > > > > Paddy
> > > > >
> > > > > -----Original Message-----
> > > > > From: Jorge Cardoso Leitão <jorgecarlei...@gmail.com>
> > > > > Sent: Monday, August 2, 2021 1:59 PM
> > > > > To: dev@arrow.apache.org
> > > > > Subject: Re: [Discuss] [Rust] Arrow2/parquet2 going foward
> > > > >
> > > > > Hi,
> > > > >
> > > > > Sorry for the delay.
> > > > >
> > > > > If there is a path towards an official release under a <1.0.0
> > > > > versioning schema aligned with the rest of the Rust ecosystem and
> in
> > > > > line with the stability of the API, then IMO we should move all
> > > > > development to within Apache experimental asap (I can handle this
> and
> > > > > the likely IP clearance round). If we require a release >=1.X.Y to
> it
> > > > > and/or a schedule, then I prefer to keep expectations aligned and
> > > postpone any movement.
> > > > >
> > > > > Under the move situation, I was thinking in something as follows:
> > > > >
> > > > > > * gradually stop maintaining "arrow" in crates, offering a
> > > > > > maintenance window over which we release patches (*)
> > > > > > * work towards achieving feature parity on arrow2/parquet2 on the
> > > > > > experimental repos.
> > > > > > * keep releasing arrow2/parquet2 under a 0.X model during the step
> > > > > > above (**)
> > > > > > * migrate to arrow-rs and archive experimentals (***)
> > > > > > * break arrow2 in smaller crates so that we can version the APIs
> > > > > > at a different cadence
> > > > > > * once a crate reaches some stability (this is always opinionated,
> > > > > > but it is fine), we bump it to 1.0 and announce a maintenance plan
> > > > > > ala tokio <https://tokio.rs/blog/2020-12-tokio-1-0>.
> > > > >
> > > > > (*) e.g. "we will continue to patch the arrow crate up to at least
> 6
> > > > > months starting after the first release of arrow2 that supports
> > > > > a) nested parquet read and write
> > > > > b) union array (including IPC integration tests)
> > > > > c) map array (including IPC integration tests)"
> > > > >
> > > > > (**) officially or un-officially (I would suggest officially so
> that
> > > > > we can acknowledge everyone's work on it, but no strong feelings)
> > > > >
> > > > > > (***) something like:
> > > > > > 1. place arrow2 on top of a clear arrow repo so that the full
> > > > > > contribution history up to that point is preserved
> > > > > > 2. make arrow-rs the home of arrow2 (i.e. we start releasing arrow2
> > > > > > from arrow-rs) and archive the experimental repos; create
> > > > > > arrow-rs-parquet or something for parquet2.
> > > > >
> > > > > In summary, the core pain point for me is the current versioning of
> > > > > arrow, which I feel is incompatible with my goals for arrow2 and
> the
> > > > > ecosystem I envision it supporting :)
> > > > >
> > > > > Best,
> > > > > Jorge
> > > > >
> > > > > On Fri, Jul 30, 2021 at 8:44 PM Wes McKinney <wesmck...@gmail.com>
> > > wrote:
> > > > >
> > > > > > I think it would also be fine to push "beta" arrow2 crates out
> of a
> > > > > > repo under apache/ so long as they are not marked on crates.io
> as
> > > > > > being Apache-official releases. There's a possible slippery slope
> > > > > > there, but as long as we are on a path to formalizing the
> releases
> > I
> > > > > think it is okay.
> > > > > >
> > > > > > On Fri, Jul 30, 2021 at 1:07 PM Andrew Lamb <
> al...@influxdata.com>
> > > > > wrote:
> > > > > >
> > > > > > > > Jorge -- do you feel like we have a resolution on what to do
> > > > > > > > with arrow2 in the near term?
> > > > > > > >
> > > > > > > > The current state of affairs seems to me that arrow2 is
> > > > > > > > released from <https://github.com/jorgecarleitao/arrow2> to
> > > > > > > > crates.io (which is fine). Are you happy with keeping
> > > > > > > > development in the jorgecarleitao repo where you will retain
> > > > > > > > maximal control and flexibility until it is ready to start
> > > > > > > > integrating?
> > > > > > >
> > > > > > > Or would you prefer to put it into one of the apache repos and
> > > > > > > subject
> > > > > > its
> > > > > > > development and release to the normal Arrow governance model
> > > > > > > (tarball, vote, etc)?
> > > > > > >
> > > > > > > Since you are the primary author/architect I think you should
> > have
> > > > > > > a substantial say at this stage.
> > > > > > >
> > > > > > > Andrew
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Jul 27, 2021 at 7:16 PM Andrew Lamb <
> > al...@influxdata.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > I would be happy with this approach. Thank you for the
> > > > > > > > suggestion
> > > > > > > >
> > > > > > > > This hybrid approach of both arrow and arrow2 in the same
> repo
> > > > > > > > seems better to me than separate repos.
> > > > > > > >
> > > > > > > > What I really care about is ensuring we don't have two
> > > > > > > > crates/APIs indefinitely -- as long as we are continually
> > making
> > > > > > > > progress towards unification that is what is important to me.
> > > > > > > >
> > > > > > > > Andrew
> > > > > > > >
> > > > > > > > On Tue, Jul 27, 2021 at 1:40 PM Andy Grove
> > > > > > > > <andygrov...@gmail.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > >> Apologies for being late to this discussion.
> > > > > > > >>
> > > > > > > >> There is a hybrid option to consider here where we add the
> > > > > > > >> arrow2 code into the arrow crate as a separate module, so we
> > > > > > > >> release one crate
> > > > > > containing
> > > > > > > >> the "old" API (which we can mark as deprecated) as well as
> the
> > > > > > > >> new
> > > > > > API.
> > > > > > > >> Java did a similar thing a long time ago with "java.io"
> > versus
> > > > > > > "java.nio"
> > > > > > > >> (new IO).
> > > > > > > >>
> > > > > > > >> I agree that the versioning wouldn't be ideal, but this
> seems
> > > > > > > >> like it might be a pragmatic compromise?
> > > > > > > >>
> > > > > > > >> Thanks,
> > > > > > > >>
> > > > > > > >> Andy.
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> On Tue, Jul 20, 2021 at 5:41 AM Andrew Lamb
> > > > > > > >> <al...@influxdata.com>
> > > > > > > wrote:
> > > > > > > >>
> > > > > > > >> > What I meant is that when you decide arrow2 is suitable
> for
> > > > > > > >> > release
> > > > > > to
> > > > > > > >> > existing arrow users, I stand ready to help you
> incorporate
> > > > > > > >> > it into
> > > > > > > >> arrow.
> > > > > > > >> >
> > > > > > > >> > All the feedback I have heard so far from the rest of the
> > > > > > > >> > community
> > > > > > is
> > > > > > > >> that
> > > > > > > >> > we are ready. One might even say we are anxious to do so
> :)
> > > > > > > >> >
> > > > > > > >> > Andrew
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> >
>
