Hi, Thanks for the detailed answer.
In contrast to my previous email, here is my opinionated part:

Generally I like the idea of smaller crates. It helps with a lot of stuff (different targets, build time), but those benefits can be achieved by feature gates too. The upside would be out-of-sync crate releases.

Maintenance is important. Historically speaking, I've seen it solved for open source by private companies offering it as a paid service. You are right that currently only 3 months of support is provided for free, but personally I don't see that as an issue. There are professional libraries and software with close to 100% market share in their field which support only the last or last two versions (Chrome, OSes, compilers). I find it hard to imagine we'd want to do it *better*; that sounds like an illusion, but I'd like to be wrong on this one :)

Professionally speaking, when picking projects, having Apache (or other) governance and community has been more important for the businesses I worked with than the release schedule or API stability / versioning. Based on the above, and given that there are about a dozen active Rust arrow contributors, any promise of reliable maintenance over years would be a lie in my eyes. DataFusion, Polars, odbc2parquet and others had issues with the changes being too slow, not too fast.

I'm a big advocate of middle grounds and I still believe that your efforts and ideal setup are compatible with arrow-rs. Nobody would stop you from creating a 5.23.0 release next to 6.1.0 if you wanted to backport anything, and nobody would stop you from cutting an out-of-schedule 6.2 or even 7.0 release if it's to ensure security. The frequent Apache release process - which we were afraid of - has been smooth so far, with surprisingly nice support from members of different languages / implementations. Also I believe that any plan you'd have for turning arrow2 into arrow-rs 6.0 would be more than welcome in a public vote, along with the technical changes you propose (e.g. cutting a separate arrow-io crate).
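
To make the backport point concrete, here is a rough sketch of how a caret requirement such as `^5.0` would pick up a 5.23.0 release published next to 6.1.0. It is only an illustration under stated assumptions: the `Version` type, `caret_matches` and `resolve` are hypothetical helpers written for this email, not Cargo's or any crate's API, the release list is made up, and the stricter rules for 0.x versions and pre-release tags are left out.

```rust
/// A bare-bones semantic version: major.minor.patch, no pre-release tags.
/// Field order matters: the derived ordering compares major, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

impl Version {
    const fn new(major: u64, minor: u64, patch: u64) -> Self {
        Version { major, minor, patch }
    }
}

/// Caret rule for requirements with major >= 1 (e.g. `^5.0`):
/// same major version, and not older than the requirement.
/// 0.x requirements follow a stricter rule that is not modelled here.
fn caret_matches(req: Version, candidate: Version) -> bool {
    assert!(req.major >= 1, "0.x requirements are out of scope for this sketch");
    candidate.major == req.major && candidate >= req
}

/// Pick the newest published version that satisfies the caret requirement.
fn resolve(req: Version, published: &[Version]) -> Option<Version> {
    published
        .iter()
        .copied()
        .filter(|v| caret_matches(req, *v))
        .max()
}

fn main() {
    // Hypothetical release list: a 5.23.0 backport published next to 6.1.0.
    let published = [
        Version::new(5, 0, 0),
        Version::new(5, 23, 0),
        Version::new(6, 0, 0),
        Version::new(6, 1, 0),
    ];
    // An application pinned to `^5.0` keeps receiving 5.x releases only.
    let chosen = resolve(Version::new(5, 0, 0), &published);
    println!("{:?}", chosen);
}
```

In other words, users who stay on `arrow = ^5.0` would get such a backport on their next dependency update without touching their manifest, while 6.x remains opt-in.
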
At least 6 key members showed their excitement for your changes in this thread and even more on Slack/GitHub ;)

Best regards,
Adam Lippai

On Fri, Aug 6, 2021 at 10:07 AM Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:

> Hi,
>
> Thanks for your input.
>
> Every time there is a new major release, all new development shifts towards that new API and users of previous APIs are left behind. It is not just a matter of SemVer and size of version numbers, there is a whole development shift to be on top of the new API.
>
> I disagree that software that has a major release every 3 months and no maintenance window over previous versions is stable. I alluded to the Tokio example because Tokio 1.0 recently became the runtime of rust-based AWS lambda functions [1]; this commitment is only possible by enforcing API stability and maintenance beyond a 3 month period (at least 3 years in their case).
>
> Also, imo the current major version number is not meaningless: divided by the software age, it constitutes the historical release pattern and is usually a good predictor of the pattern used in future releases.
>
> The evidence is that we haven't been able to support any version for any period of time; recently, Andrew has been doing amazing work at supporting the latest version for a period of 3 months. I.e. an application that depends on `arrow = ^5.0` has a support window of 3 months. Given that we have not backported any security fixes to previous versions, it is reasonable to assume that security patches are also applied within a 3 month period only.
>
> As a contributor to arrow2, I would rather not have arrow2 under Apache Arrow than have to release it under its current versioning and scheduling (this is similar to some of Julia's concerns). As a contributor to Apache Arrow, I currently cannot guarantee a maintenance window over arrow-rs for any period of time because it is unsafe by design and I do not have the motivation to fix it. As both, I am confident that the core arrow2 will soon reach a point where we can live with and develop on top of it for at least a year. This is not true of the whole API surface, though: there are APIs that we will need to change more often until stability can be promised.
>
> So, I am requesting that we tie the discussion of arrow2 to how it will be released.
>
> Could a middle ground be somewhere along the lines of splitting the crate into smaller crates that are versioned independently? I.e. continue to release `arrow` under the same versioning and cadence, and create 3 new crates, arrow-core, arrow-compute, and arrow-io (see also [2]) that would have their own versioning at 0.X until stability is achieved, based on arrow2's code base. The migration of the `arrow` crate to arrow2's API would be to re-export from the smaller crates (e.g. `pub use arrow_core::array`).
>
> [1] https://crates.io/crates/lambda_runtime/0.3.1/dependencies
> [2] https://github.com/jorgecarleitao/arrow2/issues/257
>
> Best,
> Jorge
>
> On Thu, Aug 5, 2021 at 11:53 PM Adam Lippai <a...@rigo.sk> wrote:
>
> > Not taking sides, just two technical notes below.
> >
> > Semver.org clearly defines (https://semver.org/#how-do-i-know-when-to-release-100) the versions >1.0.0.
> > * If it's used in production, it's 1.0.0.
> > * If it provides an API others depend on then it's 1.0.0.
> > * If you intend to keep backward compatibility, it's 1.0.0.
> > Tl;Dr 1.0.0 represents a version from which point we guarantee that non-production releases are marked (alpha, beta, rc) and breaking (API) changes, backwards incompatible changes result in a major version bump. This we already do, 4x per year.
> >
> > The second fact is that arrow2 uses the arrow name, but it doesn't have apache governance. It's not released from GitHub.com/apache, there are no formal releases, there are no votes. This is not correct or fair usage of the brand (on the same level as DataFuse, or db-benchmark calling a custom R implementation arrow) even if it's "unofficial". My understanding is that arrow2 can be an unofficial implementation with a different name or an arrow-rs experiment with the intention to merge the code, but not both.
> >
> > I think both issues could be solved and I really value and like the arrow2 work so far. That's the right way. I hope we'll see it in prod either way as soon as it's ready.
> >
> > Best regards,
> > Adam Lippai
> >
> > On Wed, Aug 4, 2021, 08:25 QP Hou <houqp....@gmail.com> wrote:
> >
> > > Just my two cents.
> > >
> > > I think we all have the same goal here, which is to accelerate the transitioning of arrow to arrow2 as the official arrow rust implementation.
> > >
> > > In my opinion, the biggest gain we can get from merging two projects into one repo is to have some kind of a policy to enforce that every new feature/test added to the current arrow implementation also needs to be added to the arrow2 implementation. This way, we can make sure the gap between arrow and arrow2 is closing on every iteration. Without this, I tend to agree with Jorge that merging two repos would add more overhead to his work and slow him down.
> > >
> > > For those who want to contribute to arrow2 to accelerate the transition, I don't think they would have a problem sending PRs to the arrow2 repo. For those who are not interested in contributing to arrow2, merging the arrow2 code base into the current arrow-rs repo won't incentivize them to contribute. Merging arrow2 into the current arrow-rs repo could help with discovery. But I think this can be achieved by adding a big note in the current arrow-rs README to encourage contributions to the arrow2 repo as well.
> > >
> > > At the end of the day, Jorge is currently the sole active contributor to the arrow2 implementation, so I think he would have the most say on what's the most productive way to push arrow2 forward. The only concern I have with regards to merging arrow2 into arrow-rs right now is that Jorge spends all the effort to do the merge, and then it turns out that he is still the only active contributor to arrow2 within arrow-rs, but with more overhead that he has to deal with.
> > >
> > > As for maintaining semantic versioning for arrow2, Andy had a good point that we could still release arrow2 with its own versioning even if we merge it into the arrow-rs repo. So I don't think we should worry/focus too much about versioning in our discussion. Velocity to close the gap between arrow-rs and arrow2 is the most important thing.
> > >
> > > Lastly, I do agree with Andrew that it would be good to only maintain a single arrow crate in crates.io in the long run. As he mentioned, when the current arrow2 code base becomes stable, we could still release it under the arrow namespace in crates.io with a major version bump. The absolute value in the major version doesn't really matter as long as we stick to the convention that a breaking change will result in a major version bump.
> > >
> > > Thanks,
> > > QP
> > >
> > > On Tue, Aug 3, 2021 at 5:31 PM paddy horan <paddyho...@hotmail.com> wrote:
> > >
> > > > Hi Jorge,
> > > >
> > > > I see value in consolidating development in a single repo and releasing under the existing arrow crate. Regarding versioning, I think once we follow semantic versioning we are fine. I don't think it's worth migrating to a different repo and crate to comply with the de-facto standard you mention.
> > > >
> > > > Just one person's opinion though,
> > > > Paddy
> > > >
> > > > -----Original Message-----
> > > > From: Jorge Cardoso Leitão <jorgecarlei...@gmail.com>
> > > > Sent: Tuesday, August 3, 2021 5:23 PM
> > > > To: dev@arrow.apache.org
> > > > Subject: Re: [Discuss] [Rust] Arrow2/parquet2 going foward
> > > >
> > > > Hi Paddy,
> > > >
> > > > > What do you think about moving Arrow2 into the main Arrow repo where it is only enabled via an "experimental" feature flag?
> > > >
> > > > AFAIK this is already possible:
> > > > * add `arrow2 = { version = "0.2.0", optional = true }` to Cargo.toml
> > > > * add `#[cfg(feature = "arrow2")]\npub mod arrow2;\n` to lib.rs
> > > >
> > > > We do this kind of thing to expose APIs from non-arrow crates such as parts of the parquet-format-rs crate, and it is generally the way to go when a crate wants to expose a third-party API.
> > > >
> > > > I would not recommend doing this, though: by exposing arrow2 from arrow, we double the compilation time and binary size of all dependencies that activate the flag. Furthermore, there are users of arrow2 that do not need the arrow crate, which this model would not support.
> > > >
> > > > AFAIK where development happens is unrelated to this aspect, Rust enables this by design.
> > > >
> > > > > but also this would be a clear signal that Arrow2 is <1.0.
> > > > > the experimental flag will be a clear signal to the existing Arrow community that Arrow2 is the future but that it is <1.0
> > > >
> > > > arrow2 is already <1.0 <https://crates.io/crates/arrow2>. My argument is that the arrow/arrow-flight/parquet crates are not versioned according to the Rust community standards: it is a de facto practice in Rust to delay major releases until the API is stable. See Tokio's blog post about their 1.0 <https://tokio.rs/blog/2020-12-tokio-1-0> (i.e. "[...] we commit to holding back on a Tokio 2.0 release for at least 3 years.").
> > > >
> > > > 10 most downloaded crates:
> > > >
> > > > * https://crates.io/crates/rand (0.8.4)
> > > > * https://crates.io/crates/syn (1.0.74)
> > > > * https://crates.io/crates/libc (0.2.98)
> > > > * https://crates.io/crates/rand_core (0.6.3)
> > > > * quote (1.0.9)
> > > > * unicode-xid (0.2.2)
> > > > * proc-macro2 (1.0.28)
> > > > * cfg-if (1.0.0)
> > > > * https://crates.io/crates/serde (1.0.126)
> > > > * bitflags (1.2.1)
> > > >
> > > > These are small crates with a small scope, but even larger projects share the same pattern:
> > > >
> > > > * crossbeam <https://crates.io/crates/crossbeam> (0.8.1)
> > > > * rocket <https://crates.io/crates/rocket> (0.5)
> > > > * polars <https://crates.io/crates/polars> (0.14.8)
> > > > * tower <https://crates.io/crates/tower> (0.4.8)
> > > > * Tokio <https://crates.io/crates/tokio> (1.9.0)
> > > > * hyper <https://crates.io/crates/hyper> (0.14.11)
> > > >
> > > > Crates that arrow depends on <https://github.com/apache/arrow-rs/blob/master/arrow/Cargo.toml>, that DataFusion depends on <https://github.com/apache/arrow-datafusion/blob/master/datafusion/Cargo.toml>, all share the same pattern of being either 0.X, 1.X when their API is stable, and 2.X when they needed a large change in the API. This contrasts with Apache Arrow's releases where we are now at 5.0 (and we have yet to arrive at a safe design).
> > > >
> > > > > existing users will be well supported in this transition
> > > >
> > > > How so? imo people either PR to the arrow/arrow2 code base or they won't. This is largely independent of where the development of either arrow2 or arrow happens; people google the crate, click on the repository link and file an issue or field a PR.
> > > >
> > > > > In general, I think the longer that development proceeds in separate repos the harder it will be to eventually merge the two in a way that supports existing users.
> > > >
> > > > How so? I may be mistaken, but API design is unrelated to which repo the development happens on: it is primarily driven by who is designing it and from where or who they are inspired by. Both arrow and parquet's crate design are inspired by the C++ implementation and have gradually been migrated to "idiomatic" Rust, as "idiomatic" is becoming more well defined in Rust. Arrow2 is inspired by the current crate and the pains of using it in DataFusion. Datafuse, a fork of datafusion, recently migrated to arrow2 <https://github.com/datafuselabs/datafuse/pull/1239>: +1,947 −3,484, which shows that the crate is capturing important patterns from the arrow crate and exposing ones that are useful / result in less code for the same or higher performance.
> > > >
> > > > On the opposite side, merging the development of crates under the same repo leads to: more triaging of PRs; more work for releases and changelogging; tagging based on crates; multiple READMEs in subpaths of the repo, curation of the CI to accommodate this, a workspace with many crates each with its own set of dependencies, increasing compilation and development; mixed commit logs, difficulties in reverts and cherry-picks; more difficult to find stuff in the repo. See e.g.
> > > > how tokio-rs does it: <https://github.com/tokio-rs>, even for small crates like bytes <https://github.com/tokio-rs/bytes>.
> > > >
> > > > Best,
> > > > Jorge
> > > >
> > > > On Tue, Aug 3, 2021 at 3:13 PM paddy horan <paddyho...@hotmail.com> wrote:
> > > >
> > > > > Hi Jorge,
> > > > >
> > > > > What do you think about moving Arrow2 into the main Arrow repo where it is only enabled via an "experimental" feature flag? This would allow development of Arrow2 to proceed in the main repo but also this would be a clear signal that Arrow2 is <1.0. When we feel ready (i.e. Arrow2 is 1.0) we can release it in the next main release with Arrow2 being the default and move the existing implementation behind a "legacy" feature flag.
> > > > >
> > > > > Here is why I think this might work well:
> > > > > - People contributing to the Arrow project will naturally contribute to Arrow2. At the moment, some people will still contribute to Arrow instead of Arrow2 just by virtue of it being the "official" implementation. However, if both are in one repo people will want to contribute to the "future", i.e. Arrow2.
> > > > > - the experimental flag will be a clear signal to the existing Arrow community that Arrow2 is the future but that it is <1.0
> > > > > - existing users will be well supported in this transition
> > > > > - In general, I think the longer that development proceeds in separate repos the harder it will be to eventually merge the two in a way that supports existing users.
> > > > >
> > > > > Do you think this would work?
> > > > >
> > > > > Paddy
> > > > >
> > > > > -----Original Message-----
> > > > > From: Jorge Cardoso Leitão <jorgecarlei...@gmail.com>
> > > > > Sent: Monday, August 2, 2021 1:59 PM
> > > > > To: dev@arrow.apache.org
> > > > > Subject: Re: [Discuss] [Rust] Arrow2/parquet2 going foward
> > > > >
> > > > > Hi,
> > > > >
> > > > > Sorry for the delay.
> > > > >
> > > > > If there is a path towards an official release under a <1.0.0 versioning schema aligned with the rest of the Rust ecosystem and in line with the stability of the API, then IMO we should move all development to within Apache experimental asap (I can handle this and the likely IP clearance round). If we require a release >=1.X.Y for it and/or a schedule, then I prefer to keep expectations aligned and postpone any movement.
> > > > >
> > > > > Under the move situation, I was thinking of something as follows:
> > > > >
> > > > > * gradually stop maintaining "arrow" in crates, offering a maintenance window over which we release patches (*)
> > > > > * work towards achieving feature parity on arrow2/parquet2 on the experimental repos.
> > > > > * keep releasing arrow2/parquet2 under a 0.X model during the step above (**)
> > > > > * migrate to arrow-rs and archive experimentals (***)
> > > > > * break arrow2 into smaller crates so that we can version the APIs at a different cadence
> > > > > * once a crate reaches some stability (this is always opinionated, but it is fine), we bump it to 1.0 and announce a maintenance plan ala tokio <https://tokio.rs/blog/2020-12-tokio-1-0>.
> > > > >
> > > > > (*) e.g. "we will continue to patch the arrow crate up to at least 6 months starting after the first release of arrow2 that supports
> > > > > a) nested parquet read and write
> > > > > b) union array (including IPC integration tests)
> > > > > c) map array (including IPC integration tests)"
> > > > >
> > > > > (**) officially or un-officially (I would suggest officially so that we can acknowledge everyone's work on it, but no strong feelings)
> > > > >
> > > > > (***) something like:
> > > > > 1. place arrow2 on top of a clear arrow repo so that the full contribution history up to that point is preserved
> > > > > 2. make arrow-rs the home of arrow2 (i.e. we start releasing arrow2 from arrow-rs) and archive the experimental repos; create arrow-rs-parquet or something for parquet2.
> > > > >
> > > > > In summary, the core pain point for me is the current versioning of arrow, which I feel is incompatible with my goals for arrow2 and the ecosystem I envision it supporting :)
> > > > >
> > > > > Best,
> > > > > Jorge
> > > > >
> > > > > On Fri, Jul 30, 2021 at 8:44 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > > >
> > > > > > I think it would also be fine to push "beta" arrow2 crates out of a repo under apache/ so long as they are not marked on crates.io as being Apache-official releases. There's a possible slippery slope there, but as long as we are on a path to formalizing the releases I think it is okay.
> > > > > >
> > > > > > On Fri, Jul 30, 2021 at 1:07 PM Andrew Lamb <al...@influxdata.com> wrote:
> > > > > >
> > > > > > > Jorge -- do you feel like we have a resolution on what to do with arrow2 in the near term?
> > > > > > >
> > > > > > > The current state of affairs seems to me that arrow2 is released from https://github.com/jorgecarleitao/arrow2 to crates.io (which is fine). Are you happy with keeping development in the jorgecarleitao repo where you will retain maximal control and flexibility until it is ready to start integrating?
> > > > > > >
> > > > > > > Or would you prefer to put it into one of the apache repos and subject its development and release to the normal Arrow governance model (tarball, vote, etc)?
> > > > > > >
> > > > > > > Since you are the primary author/architect I think you should have a substantial say at this stage.
> > > > > > >
> > > > > > > Andrew
> > > > > > >
> > > > > > > On Tue, Jul 27, 2021 at 7:16 PM Andrew Lamb <al...@influxdata.com> wrote:
> > > > > > >
> > > > > > > > I would be happy with this approach. Thank you for the suggestion.
> > > > > > > >
> > > > > > > > This hybrid approach of both arrow and arrow2 in the same repo seems better to me than separate repos.
> > > > > > > >
> > > > > > > > What I really care about is ensuring we don't have two crates/APIs indefinitely -- as long as we are continually making progress towards unification that is what is important to me.
> > > > > > > >
> > > > > > > > Andrew
> > > > > > > >
> > > > > > > > On Tue, Jul 27, 2021 at 1:40 PM Andy Grove <andygrov...@gmail.com> wrote:
> > > > > > > >
> > > > > > > >> Apologies for being late to this discussion.
> > > > > > > >>
> > > > > > > >> There is a hybrid option to consider here where we add the arrow2 code into the arrow crate as a separate module, so we release one crate containing the "old" API (which we can mark as deprecated) as well as the new API. Java did a similar thing a long time ago with "java.io" versus "java.nio" (new IO).
> > > > > > > >>
> > > > > > > >> I agree that the versioning wouldn't be ideal, but this seems like it might be a pragmatic compromise?
> > > > > > > >>
> > > > > > > >> Thanks,
> > > > > > > >>
> > > > > > > >> Andy.
> > > > > > > >>
> > > > > > > >> On Tue, Jul 20, 2021 at 5:41 AM Andrew Lamb <al...@influxdata.com> wrote:
> > > > > > > >>
> > > > > > > >> > What I meant is that when you decide arrow2 is suitable for release to existing arrow users, I stand ready to help you incorporate it into arrow.
> > > > > > > >> >
> > > > > > > >> > All the feedback I have heard so far from the rest of the community is that we are ready. One might even say we are anxious to do so :)
> > > > > > > >> >
> > > > > > > >> > Andrew