On Wed, Feb 26, 2025 at 03:35:13AM +0100, Nicolas Graves wrote:
> 
> I have rewritten this email three times already, so I'm structuring it
> into numbered sections instead.
> Sorry if the style is inconsistent.
> 
> 
> 1) About Antioxidant/rust-build-system
> 
> Tremendous: back in the day I could rebuild half the Rust world in
> about a day.
> 
> Complexity was definitely an issue, however: you had to check the
> version, then discover by trial and error all the #:features required
> by every package in the chain up to the final package, and then
> sometimes patch things by hand... only to notice that we need 5
> different versions of the same package anyway.  The ideal alternative
> would be a cargo that doesn't require rebuilds in its computation
> model, because calling the compilers directly, as antioxidant does, is
> indeed low-level.
> 
> IIRC, I didn't have cross-building with rust-build-system.
> 
> To be fair, it is maintainable, but it requires more human time and
> expertise (although less compilation time).
> 
> 2) About Workspaces
> 
> One thing I did that was not in antioxidant, and that is applicable to
> the current build system, is building whole workspaces instead of only
> building individual packages.  Here workspace = a set of packages
> built together.
> 
> The idea is that the contents of some entire files (crates-gtk for
> instance, but crates-crypto too; basically every -impl or -types
> package is in a workspace with its dependent package) are developed
> and built in the same repository, but released as separate crates.
> 
> Antioxidant was able to build a whole workspace at once (I expect
> cargo can do this too).  To make that possible I used multiple
> outputs, one per crate.  I didn't go further than that, but if we want
> real packages out of a workspace, we can just use copy-build-system or
> trivial-build-system with a simple link, with the source set to the
> crate (one of the outputs of the workspace build).
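> 
> A rough sketch of what that could look like, assuming a hypothetical
> rust-gtk-workspace package with one output per crate (names and the
> version are made up; the usual Guix module imports are assumed):
> 
>   (define-public rust-gdk4-sys
>     (package
>       (name "rust-gdk4-sys")
>       (version "0.7.2")          ;follows the workspace release
>       (source #f)                ;everything comes from the workspace build
>       (build-system trivial-build-system)
>       (arguments
>        (list #:builder
>              (with-imported-modules '((guix build utils))
>                #~(begin
>                    (use-modules (guix build utils))
>                    ;; Re-expose the crate that was already built as one
>                    ;; output of the workspace, instead of rebuilding it.
>                    (copy-recursively #$rust-gtk-workspace:gdk4-sys
>                                      #$output)))))
>       (home-page "https://gtk-rs.org/")
>       (synopsis "GDK 4 FFI bindings, taken from the gtk-rs workspace build")
>       (description "One crate of the gtk-rs workspace, re-exported.")
>       (license license:expat)))
> 
> (A plain symlink instead of the copy would express the same idea.)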
> 
> My crates-gtk.scm was only about a hundred lines long, IIRC.  This
> would also help build times a bit: although we still rebuild a lot, if
> we produce 10 packages from a single build, we rebuild 10 times less.
> 
> I did end up folding my rust-workspace-build-system and
> rust-build-system together, based on this simple observation: a
> package can be described as a workspace with a single member, so I
> didn't even need to carry a separate build system.  (It made
> composition fun too: I could inject a few packages to build together
> into a virtually created workspace.)
> 
> It's also my objective to bring this idea to Node and possibly Go,
> which have the same concept.  Workspaces could make the maintenance
> burden in Guix significantly lower: we try to do too much compared to
> what these tools already do!
> 
> I haven't tried workspaces for this, though: the cases where we have
> circular dependencies, and which require #:cargo-inputs and
> #:cargo-development-inputs in the first place, might be handled well
> by cargo if we put all the problematic packages (problematic as seen
> by Guix) in the same workspace, let cargo handle the build, and then
> compose packages on the Guix side with copies/links to the outputs.  I
> don't know, however, whether we could build a suitable general
> solution out of that.  Something like: "if a circular dependency only
> affects packages using the node/rust/go build systems, put them all in
> the same workspace, and then separate them back out on the Guix side".
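> 
> A sketch of the detection rule, purely illustrative (the predicate is
> hypothetical; accessor names are from memory):
> 
>   (use-modules (srfi srfi-1)          ;every
>                (guix packages)
>                (guix build-system))
> 
>   (define (mergeable-cycle? packages)
>     ;; True if every package of the cycle uses a build system that has
>     ;; a native notion of workspaces, so the cycle can be collapsed.
>     (every (lambda (p)
>              (memq (build-system-name (package-build-system p))
>                    '(cargo node go)))
>            packages))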
> 
> 3) The current Rust situation in Guix is worse even than in a classic
> distribution
> 
> My point, however, is that "debundling" / the Guix approach is a poor
> description of what currently happens with our crates.
> 
> Truthfully, we most likely don't even care about intermediate Rust
> crates (roughly 90% of our Rust crates).  We spend enormous effort and
> energy building and rebuilding and rebuilding crates that only ever
> matter because they are part of a user-facing binary or library.  I
> would even argue, provocatively, that these 90% of our efforts are of
> no use to anyone.
> 
> For instance, take a crate that is a dependency of 3 user-facing
> packages, at depth 5 in each of their dependency chains: currently it
> is rebuilt at least 15 times (3 packages x 5 levels, since every
> package above it recompiles it), or possibly double that with tests.
> 
> Even in a classic Linux distribution it will be rebuilt multiple
> times, but only 3 times (or possibly double that with tests).
> 
> Let's face facts: either we want to go the full Guix way, or we don't
> and should take a different approach (Murilo's suggestion is one).
> 
> 
> 4) The full Guix way
> 
> Most likely we don't have the manpower in the short term to go the
> full Guix way because, and I second Murilo's point here, we quickly
> dive into dependency hell, with more effort required than is
> reasonable (I say this having implemented it myself, and despite being
> quite proud of it).
> 
> BUT: maybe antioxidant is still doable.  IIRC the hardest part was
> getting #:features right (I hope I recall correctly).  This is because
> the information about which features are required lives in the
> parent / dependent packages, not in the child packages.
> 
> Cargo chooses to rebuild everything; but there is at least one other
> solution: parse all the dependents of a package to record the
> different feature sets / versions that package would need to be built
> with.  When importing, this means also digging into crates.io's huge
> database, but we could construct an inverse (dependents) cache or
> something.  An importer could then generate not just more than one
> version of a package but several variants, if several variants are
> indeed required to build all of its transitive dependents.
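> 
> A tiny sketch of the variant idea (the helper and the data layout are
> hypothetical): given, for one crate, an alist from each dependent to
> the features it requests, the importer would emit one variant per
> distinct feature set rather than one rebuild per dependent:
> 
>   (use-modules (srfi srfi-1))          ;delete-duplicates
> 
>   (define (dependents->variants requirements)
>     ;; REQUIREMENTS is a list of (dependent . features) pairs.
>     (delete-duplicates
>      (map (lambda (entry)
>             (sort (cdr entry) string<?))  ;normalize feature order
>           requirements)))
> 
>   ;; (dependents->variants '(("ripgrep" . ("perf" "simd"))
>   ;;                         ("fd"      . ("simd" "perf"))
>   ;;                         ("bat"     . ("std"))))
>   ;; => (("perf" "simd") ("std"))  ;two variants instead of three builds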
> 
> To be even more precise, we could search only within blessed.rs or
> lib.rs.  But we need to pull that information from the Rust ecosystem,
> dependents included (and not compute it inside Guix the way
> antioxidant did).
> 
> This could lead to a channel like the ones we've seen for Emacs
> packages or CRAN, regenerated regularly from complete upstream
> information, and we could incorporate, piece by piece, packages that
> build in this channel into Guix.  With limitations: duplication of
> packages (having 4-5 versions of the same package) will probably still
> be necessary.
> 
> Otherwise there's just a lot of work by hand, which we would probably
> like to avoid.
> 
> 5) The classic distribution way, Guix flavored
> 
> Combining the transitive-package observation with the workspace
> approach, we could arrive at a workflow like this for a package:
> 
> - Extract all inputs from a Cargo.lock
> - If necessary, patch inputs in Guix (debundling)
> - Create a big workspace and inject all the inputs into it (on the
>   Guix side).  Here there is no need (because no benefit) to define
>   intermediary packages in Guix; just download the sources.  Not a
>   whole file with hundreds of lines to update regularly as Murilo
>   proposed, but rather a single <package> we could bump like any other
>   (as long as the inputs' checksums are recorded somewhere).
> - Let cargo handle the "build" part, patching if necessary
> - Only extract the final output package we want users to have access
>   to (binary or library)
> 
> Sounds a lot like what any classic distribution would do :p
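> 
> A very rough sketch of what that single <package> could look like (all
> names, versions and hashes are placeholders, the usual module imports
> are assumed, and the vendoring phase only illustrates the idea; in
> practice the build system would own that part, including the
> .cargo-checksum.json files cargo expects next to vendored crates):
> 
>   ;; One fixed-output origin per entry in the application's Cargo.lock,
>   ;; ideally generated by an importer; only the checksum is recorded.
>   (define serde-crate
>     (origin
>       (method url-fetch)
>       (uri (crate-uri "serde" "1.0.217"))
>       (sha256
>        (base32 "0000000000000000000000000000000000000000000000000000"))))
> 
>   (define-public some-rust-app
>     (package
>       (name "some-rust-app")
>       (version "1.2.3")
>       (source (origin
>                 (method url-fetch)
>                 (uri (crate-uri "some-rust-app" version))
>                 (sha256
>                  (base32
>                   "0000000000000000000000000000000000000000000000000000"))))
>       (build-system cargo-build-system)
>       (arguments
>        (list #:install-source? #f       ;users only get the binary
>              #:phases
>              #~(modify-phases %standard-phases
>                  (add-after 'unpack 'vendor-locked-crates
>                    (lambda _
>                      ;; Unpack each locked crate where cargo looks for
>                      ;; vendored sources, instead of defining one Guix
>                      ;; package per crate.
>                      (mkdir-p "guix-vendor")
>                      (invoke "tar" "-xzf" #$serde-crate
>                              "-C" "guix-vendor"))))))
>       (home-page "https://example.org")
>       (synopsis "User-facing Rust application")
>       (description "The only hand-maintained package; its crates are
> pinned by checksum above.")
>       (license license:expat)))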
> 
> I don't like the idea, but the environmental activist in me tells me
> it's a more frugal approach in terms of both human and computer
> resources.  As counter-intuitive as it might seem, as long as cargo's
> build model and ecosystem don't change, I don't see a better way.
> 
> We should of course still strive to check for binaries / nonfree code
> / bundled libraries in the inputs, replacing pristine crate sources
> with our unbundled/patched versions, but not build them just for the
> sake of it.

We currently have a 'check-for-pregenerated-files phase which searches
for some pre-compiled files with a common suffix:
(find-files "." "\\.(a|dll|dylib|exe|lib)$")
We also know which packages have bundled libraries that we unbundle, and
we should be able to create some sort of rule to check for them during a
build.  In general if a package is rust-foo-sys then we remove the file
'foo' from the sources.
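
Something along these lines could work as a generic rule (a sketch
only; the directory to delete differs between -sys crates, so "foo" and
the extra "vendor" entry are just examples):

    (add-after 'unpack 'remove-bundled-sources
      (lambda _
        ;; -sys crates usually ship the upstream library in a directory
        ;; named after it; drop it so a build can only ever use our
        ;; packaged copy.
        (for-each (lambda (dir)
                    (when (file-exists? dir)
                      (delete-file-recursively dir)))
                  '("foo" "vendor"))))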

> The main responsibility of the Rust team in this case would simply be
> to manage a huge file of checksums (which could be automated), and to
> check that inputs are not corrupted and don't bundle external
> libraries, that licenses are OK, and that the necessary patches (for
> linking to system libraries) are well maintained.

We have a couple of packages with the comment "Inspired by Debian's
patch for bzip2-sys." where we basically rewrite the crate completely to
just link to the C library.  There are others where we do some work to
link to packaged versions of libraries or to make the crate build with
newer versions of rust.  Sometimes we remove unneeded features.
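
The shape of those rewrites is roughly this (a sketch from memory, not
the actual patch we carry, and it still needs bzip2 among the inputs):

    (add-after 'unpack 'link-to-system-bzip2
      (lambda _
        ;; Drop the bundled C sources and replace the build script with
        ;; one that simply links against our libbz2.
        (delete-file-recursively "bzip2-1.0.8")
        (with-output-to-file "build.rs"
          (lambda ()
            (display "fn main() { println!(\"cargo:rustc-link-lib=bz2\"); }\n")))))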

> That would mean deleting 90% of our crate definitions, but if they are
> of no use to anyone (this is provocative, but prove me wrong here :p),
> that is probably the right thing to do.  We would basically keep
> rust-apps.scm, plus any crate that is used in another package without
> being rebuilt.
> 
> (I see now that tests might be an argument; yes, but I'm still pretty
> sure they're not worth the cost.)

I use the tests for those 90% of crates just to check that they have
the correct inputs.  If it's all automated then they shouldn't be needed
beyond their sources.

> Conclusion)
> 
> I think we should go with (5) for now, maybe some motivated individual
> (not me this time, but I can make some guidance based on memory) will be
> able to tackle (4), rebase antioxidant, branch it to a rust package
> provider, get it to work automatically WITH FEATURES SUPPORT in a
> channel and then, once it works, at some point we ~should~ reconsider
> having another rust-build-system in Guix and slowly switch (from the
> ground up).
> 
> But the promise of a potential Guix-aligned rust-build-system should
> not get in the way of a proper rethink of how we currently do things,
> and I think we should do (5) no matter what.  Even if it means
> deleting hard-earned crate definitions: they most likely won't be
> helpful in a (4) scenario anyway.
> 
> ---
> Below is a first version of my answer, written after the discussion on
> workspaces and as a response to Murilo.  While writing it, I stumbled
> on the obvious fact that it was still missing the point about the
> usefulness of such a "debundled mainline Rust", since everything would
> still be rebuilt everywhere.  Why bother building and maintaining
> things that benefit no one?
> ---
> 
> What about a mix of both?  (Haskell has the same issue, too.)
> 
> Like having a proper Guix way for "mainline Rust" (as seen here for
> instance: https://blessed.rs/crates, or later, if successful, lib.rs),
> and forcing importers to pick packages up from there when a suitable
> version is present (developing version comparison is needed in Guix
> anyway, at least for everything related to recursive imports, which
> are poorly handled now; I mean support for && || >= > < <= operations
> and conversions from semver notation), and trying to use Rust
> workspaces in our "mainline Rust", which should also cover everything
> that has an FFI dependency in Guix.
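> 
> On the version comparison point, something with roughly this shape is
> what I mean (hedged: real semver caret/tilde ranges need more care
> than this):
> 
>   (use-modules (guix utils)           ;version-compare
>                (ice-9 match))
> 
>   (define (satisfies? version requirement)
>     ;; REQUIREMENT is e.g. (>= "1.2.0"); && and || are then just
>     ;; `every' and `any' over a list of such requirements.
>     (match requirement
>       (('>= min) (memq (version-compare version min) '(> =)))
>       (('> min)  (eq?  (version-compare version min) '>))
>       (('<= max) (memq (version-compare version max) '(< =)))
>       (('< max)  (eq?  (version-compare version max) '<))))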
> 
> That way:
> - we debundle the bulk with our "mainline Rust" and bring down both
>   the maintenance burden and compilation times;
> - we give users an easier way to update their packages;
> - we could possibly even outsource to specialised channels whatever is
>   neither "mainline Rust" nor a user-facing application we want in
>   Guix (e.g. a guix-rust-past channel for old versions we no longer
>   support).
> 
> PS: I only see this now, but since you want to build everything in one
> file and then update that file, you might as well... do it in a cargo
> workspace!  I.e. inject all the inputs, let cargo do its thing, and
> have a single output.  On the Guix side, only care about the final
> package's description/version and so on.  No strings attached, no need
> to maintain hundreds of lines we don't care about!  (I still vomit
> when writing package descriptions for -impl crates.)
> 
> -- 
> Best regards,
> Nicolas Graves

-- 
Efraim Flashner   <efr...@flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
