On Mon, Jul 31, 2017 at 6:29 PM, vdelecroix <20100.delecr...@gmail.com> wrote:
> Hi Eric,
>
> Currently at a workshop in Leiden [1] we figured out another possible
> use case for your proposal. Some people develop PARI/GP in parallel with
> Sage. One simple way to have a testing environment would be to have:
> * a git repo for PARI/GP
> * a git repo for SAGE
> * telling SAGE to use the development version of PARI/GP (wherever it is
> installed)

Yes, this is exactly the kind of use case I had in mind for a "source"
origin for Sage packages (for myself, I wanted to do something similar,
but with Singular, and found it to be currently a bit more trouble than
it should be).

> Though, it triggers one question: how would one relaunch the chain
> compilation due to a PARI/GP update? Would it be automatically handled by
> the Makefile? (the same question holds for system packages of course)

It sort of depends on what you mean by "a PARI/GP update". The general
idea here is that you would install pari in Sage more or less the same
way one does now (although currently one almost never does so manually,
since it's a required standard package). You would configure (either
via the configure script or some other means) the origin of the "pari"
spkg to be a source tree, and provide the full path to where the pari
source code is. When pari is installed (or reinstalled), Sage would
install it more or less the way it installs the standard spkg. The main
difference is that instead of downloading and extracting a source
tarball, it would build/install from the source tree already provided.

This would use the same spkg-install script that it would use normally,
so it's *possible* that if you're trying to develop Sage against a
development version of PARI, the existing spkg-install script won't
work. In that case one needs to make a branch in Sage in order to make
the necessary edits to the pari spkg. But this is good, because one
would want to save those edits anyway in anticipation of eventually
updating the spkg.

Once pari is installed into Sage everything else works the same, so it
would still trigger possible rebuilds of any spkgs that depend on pari,
via Sage's Makefile.

Does that make sense / seem useful?

Erik

(P.S.
I am still working on the plan I described in this thread, or something
close to it, but there's been some preliminary work I've needed to do
first on making installation/uninstallation of Sage packages more
idempotent; see https://trac.sagemath.org/ticket/22510 and its
dependencies)

> On Friday, 26 May 2017 at 15:01:36 UTC+2, Erik Bray wrote:
>>
>> Hi folks interested in Sage packaging,
>>
>> Almost every time the topic comes up, I complain that it isn't easier
>> to use more system packages as both build- and run-time dependencies
>> of Sage. I'd like to make some progress on actually doing something
>> about that, and I have some ideas, but I'd like to bounce them off
>> anyone who's interested first before just going off and doing it.
>>
>> There is enough work involved in this that I believe it can and should
>> be broken up into a number of smaller tasks. I would also like to
>> approach this in a way that works well and integrates with the
>> existing "sage-the-distribution" infrastructure. I believe there are
>> advantages to being able to develop on Sage in the "normal" way we're
>> already used to, while also being able to take advantage of existing
>> system packages wherever possible.
>>
>> So I'm just going to try to organize my existing thoughts on this and
>> see what anyone thinks. Sorry if it's TL;DR, but I'm hoping that
>> having a detailed discussion about this will make it more likely that
>> something will actually be accomplished on it soon (because I think
>> the actual implementation, once decided on, is not terribly
>> difficult).
>>
>> Note: In this message I'm using "package" loosely to refer to any
>> program, library, database, or other collection of files that is
>> distributed and installed as a self-contained unit. It doesn't
>> necessarily relate to any particular "packaging system".
>>
>>
>> 1. Why?
>> =======
>>
>> The extent and scope to which Sage "vendors" its dependencies, in the
>> form of what some call "sage-the-distribution", is *not* particularly
>> normal in the open source world. Vendoring *some* dependencies is not
>> unusual, but Sage does nearly all (even down to gcc, in certain
>> cases). I've learned a lot of the history to this over the past year,
>> and agree that most of the time this has been done with good reasons.
>>
>> For example, I can't think of any other software that forces me to
>> build its own copy of ncurses just to build/install it. This was
>> added for good reasons [1], but not reasons that can't also be resolved
>> in part by installing the appropriate system packages, or that might
>> not be resolved by now in system packages that depend on ncurses (i.e.
>> that should be built with ncurses support). Point being, this issue
>> does not necessarily impact everyone, and building Sage's own ncurses
>> is overkill in that case. It would be one thing if we were just
>> talking one or two packages (I didn't pick on ncurses for any deep
>> reason), but now multiply that by around 250 (give or take, depending
>> on how many dependencies are even available as system packages) and it
>> becomes real overhead to getting started *and* making progress with
>> Sage development.
>>
>> I wouldn't propose *removing* any existing spkgs that are still
>> relevant. I think it's really useful that Sage has a list of
>> known-good pinned versions of its dependencies. Further,
>> "sage-the-distribution" makes it very easy to install those
>> dependencies in such a way that they can be used as build/runtime
>> dependencies by Sage without having to hunt the 'net for the right
>> source packages of the right versions of those dependencies, and
>> figure out how to configure and build them in a piecemeal fashion.
>> In other words, even if we do expand the ability to use system packages
>> for Sage's dependencies, it's still very nice that it's easy with a
>> few commands to use the spkg if something goes wrong with the system
>> package. It's also, of course, important for power users who wish to
>> compile some dependencies on their own--especially highly tuned
>> numerical libraries (but even those users usually only care about
>> being able to hand-configure a few dependencies, not most).
>>
>> To summarize: being able to more aggressively rely on system packages
>> can save a lot of time and frustration during normal development of
>> Sage, and is also less jarring especially to new developers, of whom
>> we would like to attract more. It should also decrease the time
>> required to regularly build binary distributions of Sage (e.g. for
>> Docker, Windows, and Linux distros).
>>
>>
>> 2. Overview of how Sage manages dependencies now (and what won't change)
>> ========================================================================
>>
>> For many of you this will be unnecessary review, but I want to discuss
>> a little about how dependencies are currently checked and installed in
>> Sage-the-distribution. Doing so is helpful for me too, to make sure I
>> understand it clearly (and correct me if I have any
>> misunderstandings).
>>
>> Sage-the-distribution uses *Make* itself (cleverly, IMO) to manage
>> dependencies insofar as making sure all dependencies are installed,
>> and that when a package changes all packages that depend (directly or
>> indirectly) on that package are rebuilt. Make works on files and
>> timestamps, which does not translate directly to entire software
>> packages, so to track whether or not an spkg is up to date, Sage uses
>> the common "stamp pattern" for Make [2]--that is, when an spkg is
>> installed it writes a file that effectively "represents" completion of
>> the installation of that spkg for Make's purposes.
>> These stamp files are the files typically stored under
>> $SAGE_LOCAL/var/lib/sage/installed/<spkg>-<version>. This directory
>> is also known in some places as SAGE_SPKG_INST. By including the
>> version number in the name we can also force rebuilds when an spkg's
>> version changes.
>>
>> When one runs `make <spkg>` with just the spkg name, this is actually
>> a phony target with the path to the stamp file for that package (at
>> its current version) as its sole prerequisite. So `make <spkg>` translates
>> to `make $SAGE_SPKG_INST/<spkg>-<version>` for the current version of
>> that spkg. The associated rule is to run the sage-spkg command for
>> that package, which also takes care of writing the stamp file.
>> sage-spkg also writes some information into each stamp file in a
>> somewhat loose format that I don't believe is parsed anywhere.
>> However the *existence* of these files is used by the (somewhat
>> controversial, for downstream packagers) `is_package_installed()`
>> function.* I'm actually going to propose later that we write and use
>> these stamp files (with some slight changes) even when installing
>> dependencies from a system package, so these files might be present
>> even in binary packages for Sage (though that might be up to
>> downstream packagers).
>>
>> When Sage's `./configure` script generates the main Makefile for all
>> of Sage's dependencies, it loops over all the spkgs in build/pkgs/ and
>> creates two make targets for each spkg: the aforementioned phony
>> target consisting of just the package name, and the *real* target for
>> the stamp file. It also creates a make variable named like
>> `$(inst_<spkg>)` (where <spkg> is just the package name, without the
>> version) referring to the full path of the stamp file for that
>> package. Each spkg may list its build dependencies in its
>> build/pkgs/<spkg>/dependencies file, in the format that it will appear
>> in the Makefile as dependencies for the make target of that package.
>> For convenience's sake, the `dependencies` file just contains the
>> package names, but the `./configure` script converts this to the
>> appropriate `$(inst_<spkg>)` variables, so that the stamp files become
>> the real dependencies (part of how the "stamp pattern" normally
>> works).
>>
>> When a package is upgraded (i.e. its version number changes) then the
>> Makefile is regenerated, but with the `$(inst_<spkg>)` for that
>> package pointing to a new stamp file, containing the new version
>> number. Thus any dependents of that package will see this as an
>> outdated dependency, and get rebuilt after the upgraded package is
>> built. When packages are rebuilt (even if their version didn't
>> change) their stamp files are touched, forcing further rebuilds of any
>> of their dependents and so on, in normal Make behavior.
>>
>> As far as I can tell this has worked quite well for Sage--especially
>> as it also allows leveraging Make's parallel build features. So I'm
>> proposing to keep this all pretty much as-is, with possibly only minor
>> tweaks in the details. Instead, many more of the changes will be at
>> configure time.
>>
>>
>> * There is proposed work already mostly done to replace use of
>> is_package_installed() within the Sage library with a way to do
>> runtime feature checks: https://trac.sagemath.org/ticket/20382 Some
>> of this work *might* be redundant with what I want to propose, but can
>> also coexist with it, as it is currently designed for runtime use by
>> the Python code itself, and not during builds.
>>
>>
>> 3. Case study--examples already in Sage
>> =======================================
>>
>> Sage-the-distribution already has a few examples of "spkgs" in the
>> system that *may* use a system package, rather than building from
>> source. As it is, this is done in an ad-hoc manner that can be
>> surprising and/or misleading.
>> But I think it's useful to look at them
>> to see how this is done currently and if there's anything we can learn
>> from it.
>>
>> a) BLAS
>> -------
>>
>> There are two different BLAS implementation packages to choose from
>> currently in Sage: OpenBLAS and ATLAS.* The selection can be made
>> currently at configure time with a --with-blas= flag which can take
>> either 'openblas' or 'atlas'. The selection is used to write a
>> variable called `$(BLAS)` in the makefile that points to the stamp
>> file path for the actual BLAS implementation spkg selected. Other
>> spkgs that have BLAS as a dependency list the `$(BLAS)` variable in
>> their dependencies, rather than writing "openblas" or "atlas"
>> explicitly.
>>
>> When openblas is selected (now the default) the openblas spkg is
>> installed unconditionally.
>>
>> However, when *atlas* is selected, there happens to be a mechanism for
>> using a system BLAS (why just with ATLAS I don't know--historical
>> reasons I guess). In this case it still runs the spkg-install for
>> ATLAS like for any other spkg, but its spkg-install checks for a
>> special environment variable, `SAGE_ATLAS_LIB` (the only way to
>> control this behavior). This invokes a search in standard locations
>> first for a "libatlas.so" (or equivalent) explicitly. If that's not
>> found, it will happily take whatever it does find as long as there's
>> *some* "libblas.so" and "liblapack.so" found on the system. It
>> doesn't do any feature checks or anything--it just takes what it
>> finds.
>>
>> If it does find something resembling either ATLAS specifically, or a
>> generic BLAS/LAPACK, then it skips installing the actual spkg, but
>> still writes a stamp file indicating that "ATLAS" was installed, with
>> whatever version is in the package-version.txt for the spkg, which can
>> of course be misleading.
>> (It also writes pkgconfig .pc files in
>> $SAGE_LOCAL/lib for blas/cblas/lapack indicating which libs it found,
>> along with a "fake" version of "1.0".)
>>
>> Thus, Sage will use these system libraries for all build and runtime
>> requirements of BLAS, and in my experience this has generally worked.
>>
>> * There is another issue I would like to address--slightly orthogonal
>> to supporting system packages--of having a regular way to support
>> "abstract" packages that can have multiple alternative implementations
>> (another example being GMP/MPIR). This has been talked about before,
>> such as in this recent thread [3]. I have some ideas about this that
>> integrate well with my ideas for system packages, but I will try to
>> save that for a separate message.
>>
>>
>> b) GCC
>> ------
>>
>> The GCC spkg is a bit of a different beast, since it is normally not
>> installed by default, and was only added to support cases where the
>> platform's GCC is broken or too old and has bugs that affect building
>> Sage or its dependencies.
>>
>> Although Sage's `configure` script is responsible for determining
>> whether or not GCC should be installed (in contrast to hacks in
>> spkg-install like for ATLAS), there is no *flag* for `configure` (e.g.
>> --with-gcc or something like that) for controlling this. Instead the
>> behavior is controlled solely by an environment variable
>> "SAGE_INSTALL_GCC" (this should probably be fixed, but we'll come to
>> that). If the environment variable is set to "yes"/"no" then that
>> forces the gcc installation behavior one way or the other. However,
>> if the environment variable is not set, then the configure script goes
>> through the necessary checks to see if the installed gcc is new
>> enough, and also if gfortran is installed, among others. If GCC
>> installation is deemed necessary then it sets a flag indicating as
>> much, called `need_to_install_gcc=yes`.
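
To make the GCC decision described above easier to follow, here is a
minimal sketch of the three-way logic in Python. It is only an
illustration: the real logic lives in the configure script, the function
name is made up, and `system_gcc_ok` is a hypothetical boolean standing in
for all of the configure-time checks (gcc new enough, gfortran present,
and so on).

```python
def need_to_install_gcc(env, system_gcc_ok):
    """Return "yes" or "no", mimicking the need_to_install_gcc flag.

    env            -- a mapping of environment variables (e.g. os.environ)
    system_gcc_ok  -- hypothetical summary of the configure-time checks
    """
    override = env.get("SAGE_INSTALL_GCC", "")
    if override in ("yes", "no"):
        # An explicit setting forces the behavior one way or the other.
        return override
    # Variable unset: fall back to the result of the system checks.
    return "no" if system_gcc_ok else "yes"


if __name__ == "__main__":
    import os
    # Pretend the system compiler passed every check.
    print(need_to_install_gcc(os.environ, system_gcc_ok=True))
```

The point of the sketch is just that the environment variable wins when it
is set to "yes" or "no", and the checks only matter otherwise.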
>>
>> This is used later (see next section) to set the `$(inst_gcc)` variable.
>>
>> c) git
>> ------
>>
>> Sage actually includes an spkg for git, and installs it
>> unconditionally (there is currently no way to control this) if a
>> working 'git' is not found on the system. This is one of the few
>> packages that just has a straightforward check for the system version
>> at configure time. If a working git is not found (where 'working'
>> here just means `git --version` works) the script sets a variable
>> (similar to the gcc case) called `need_to_install_git=yes`.
>>
>> (It also sets a similar variable for `need_to_install_yasm` on
>> x86-based systems.)
>>
>> Later, while writing the main Makefile, the configure script loops
>> over all spkgs that *might* be installed and checks for a
>> `need_to_install_<spkg>` variable. If not found, or not set to "no",
>> the script sets the `$(inst_<spkg>)` variable to point to the standard
>> stamp file for that package. Otherwise it sets `$(inst_<spkg>)` to a
>> dummy file that always exists (this way any dependencies for that
>> package are still satisfied, but the spkg is never actually
>> built/installed).
>>
>>
>> 4. Package sources
>> ==================
>>
>> One of the main changes I'm proposing is that stamp files for packages
>> will always be written to SAGE_SPKG_INST even for cases where the
>> system package is used, and the Sage spkg is not actually installed.
>>
>> That is, I want to change the meaning of "spkg" to more broadly
>> represent "a dependency of Sage that *may* be included in
>> Sage-the-distribution".
>>
>> To this end I want to define a concept of spkg "sources" (not to be
>> confused with source code). Instead, these are sources from which the
>> spkg dependency can be satisfied.
>> Three possible sources I have in
>> mind (and I'm not sure that there would be any others):
>>
>> a) sage-dist: This is the current notion of an "spkg", where the
>> source tarball is downloaded from one of the Sage mirrors, unpacked
>> and installed to $SAGE_LOCAL using sage-spkg + the spkg's spkg-install
>> script. The resulting stamp file, with the version taken from
>> package-version.txt is written to $SAGE_SPKG_INST.
>>
>> b) system: In this case a check is made to see if the dependency is
>> already satisfied by the system. How exactly this check is performed
>> depends heavily on the package. *If possible* the version of the
>> system package is also determined (will discuss the nuts-and-bolts of
>> this later). In this case a stamp file is still written to
>> $SAGE_SPKG_INST, but indicating somehow that the system package was
>> used, not the sage-dist package.
>>
>> c) source: This case is not necessary for supporting system packages,
>> but I think would be useful for testing new versions of a package. In
>> this case it would be possible to install an spkg from an existing
>> source tree for that package, which would be installed using the
>> spkg-install script. If possible the version number would be
>> determined from the package source code, and not assumed. I think
>> this would be useful, but won't discuss this case any further for now.
>> I just point it out as another possibility within this framework of
>> allowing different spkg "sources".
>>
>> To summarize, no matter how an spkg dependency is satisfied, a stamp
>> file for that spkg is written to $SAGE_SPKG_INST, possibly
>> indicating the *actual* version of the package being used by Sage, and
>> indicating how the dependency was satisfied.
>>
>>
>> 5. Nuts and bolts
>> =================
>>
>> a) New stamp file format
>> ------------------------
>>
>> As suggested in the previous section, no matter how an spkg dependency
>> was satisfied, a stamp file is written to the $SAGE_SPKG_INST
>> directory. In order to support multiple possible package "sources",
>> the source that was used should be included in the stamp file. This
>> way, it will also be possible to re-run `./configure` and specify a
>> different source for a package, thus forcing a rebuild. So I think
>> the stamp filename format should be something like:
>>
>> $SAGE_SPKG_INST/<name>-<source>-<version>
>>
>> where <name> would be the base package name, <source> would be
>> something like "sagedist" or "system", and <version> the *actual*
>> version of the package being used. I'll discuss in the next section
>> how this might be determined for system packages. There's plenty of
>> room for bikeshedding in this, but I think this makes sense. We could
>> also support the old filename format, if such files are found, for
>> backwards compatibility.
>>
>>
>> b) Checking packages
>> --------------------
>>
>> For any dependency that may be satisfied by system packages, there
>> needs to be a way to specify what the minimum dependency is for Sage
>> (be it a version number, or the presence of certain features), and there
>> needs to be a way for each package to check that the dependency is
>> satisfied.
>>
>> I've gone back and forth on exactly how this should be done, but I
>> think that the best way to do this is to allow per-package m4 files,
>> containing an m4 macro that checks that the dependency on that package is
>> satisfied (again, be it version number or some other check). Each
>> macro could be named something like
>>
>> SAGE_SPKG_CHECK_<name>
>>
>> Optionally the macro should set a variable indicating the package
>> *version* if the package dependency is satisfied. This is the version
>> string that can be used in the stamp file, for example.
>> If there is no clear way to determine the version (though in most
>> cases there will be), a string like "unknown" could still be allowed
>> for the version.
>> The macro would be defined in a file like sage_spkg_check.m4 under
>> each build/pkgs/<spkg> directory, and loaded on an as-needed basis
>> using the m4_include command in configure.ac.
>>
>> Writing an m4 macro for autoconf is not a common skill, which is why
>> I've hesitated on this. But I think it has a few justifications: It
>> allows one to take advantage of the many existing macros that come
>> with autoconf to perform common checks, such as whether a program is
>> installed, or a function is available in a library. For many packages
>> the SAGE_SPKG_CHECK_ macro would probably just wrap one or two
>> existing autoconf macros. Another justification is that for some
>> packages there may be existing macros to check for them that we can
>> borrow from other projects.
>>
>> We can also provide, in the documentation, a simple template macro
>> demonstrating how to wrap a few shell commands.
>>
>> *NOTE*: To be clear, I'm not proposing that, to implement this
>> proposal, we go through and write 250+ m4 macros for every Sage spkg.
>> This check will be optional, and we can write them one at a time on an
>> as-needed basis, starting with some of the most important ones. I'll
>> discuss more about how missing checks are handled in the next section.
>>
>> Obviously the packages that already have checks in configure.ac (gcc,
>> git, yasm) would have those checks moved out to their package-specific
>> macros.
>>
>>
>> c) Driving the system
>> ---------------------
>>
>> As previously noted, selecting the source for a package would be done
>> at ./configure time. My proposal would be to change very little about
>> the current default behavior.
>>
>> By default, all packages would be installed from the sage-dist source
>> as is the case now.
>> We could still make exceptions for build
>> dependencies like gcc and git. I don't care whether these exceptions
>> are hard-coded in configure.ac, or specified in some generic way.
>>
>> However, the configure script would support, for all spkgs, a
>> `--with-system-<spkg>` argument (e.g. `--with-system-zlib`).
>>
>> For each spkg to be installed (all standard packages, optional
>> packages if selected), if the `--with-system-<spkg>` argument is
>> given, it will attempt to load and run the SAGE_SPKG_CHECK_<spkg>
>> macro for that package. If the macro is not defined, there would be a
>> *warning* that a system package was selected for that package, but there
>> is no way to check if it was installed. The warning would make clear
>> that if the build fails it may be due to this dependency being
>> missing. Otherwise it runs the check, and if the check succeeds the
>> configure script would continue, while if the check fails the
>> configure would stop with an error.
>>
>> Optionally, we could add arguments to control all of this behavior.
>> For example, it might be useful to have an option to install the
>> sage-dist spkg if a check is not defined. This might even be better
>> as the default--a possible bikeshed issue.
>>
>> Another possible option is one that enables system packages, but
>> disables any checks. This might be useful for system packagers who
>> already have external guarantees that the dependencies have been met.
>>
>> Finally, there should be an option like `--with-system-all` to
>> automatically use system packages for all dependencies, so that
>> downstream packagers don't have to supply hundreds of `--with-system-`
>> flags.
>>
>> Otherwise, generation of the build/make/Makefile by the configure
>> script would proceed more or less as it does currently. It would just
>> take into account information gained through any `--with-system-`
>> flags to generate the new format stamp filenames. The .dummy stamp
>> file would not be used anymore.
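
As a rough illustration of the flow proposed in this section (nothing
like this exists yet), here is a small Python sketch that picks a source
for one spkg and builds the proposed `<name>-<source>-<version>` stamp
filename. The names `stamp_path` and `resolve`, the install directory
string, and the idea of a `check` callable returning `(ok, version)` are
all assumptions made for the example; in the proposal the check would be
an m4 macro run by configure, not a Python function.

```python
import warnings

# Illustrative stand-in for $SAGE_SPKG_INST; the real path may differ.
SAGE_SPKG_INST = "local/var/lib/sage/installed"


def stamp_path(name, source, version=None):
    """Build a stamp filename of the form <name>-<source>-<version>."""
    return f"{SAGE_SPKG_INST}/{name}-{source}-{version or 'unknown'}"


def resolve(name, dist_version, use_system, check=None):
    """Pick the source for one spkg and return its stamp path.

    `check` plays the role of a SAGE_SPKG_CHECK_<name> macro: a callable
    returning (ok, version), where version may be None if it cannot be
    determined.
    """
    if not use_system:
        # Default behavior: install from sage-dist at the pinned version.
        return stamp_path(name, "sagedist", dist_version)
    if check is None:
        # No check defined: warn, but still trust the system package.
        warnings.warn(f"no check defined for {name}; if the build fails, "
                      "this dependency may be missing")
        return stamp_path(name, "system")
    ok, version = check()
    if not ok:
        # A failed check stops configure with an error.
        raise RuntimeError(f"system package {name} does not satisfy Sage")
    return stamp_path(name, "system", version)
```

For example, `resolve("zlib", "1.2.11", True, lambda: (True, "1.2.8"))`
would name the stamp file `zlib-system-1.2.8` (the version numbers here
are made up). Because the source is part of the filename, switching a
package from "sagedist" to "system" changes the stamp path, which is what
would force the rebuild of its dependents.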
>> Also, the rule for building system
>> packages would be to simply write the stamp file.
>>
>>
>> 6. Q&A
>> ======
>>
>> Q: What if I install with --with-system-<spkg> but later want to
>> install the sage-dist version of that package?
>>
>> A: We should also support some way to deselect system packages.
>> Perhaps --without-system-<spkg> / --with-system-<spkg>=no (these are
>> two ways of saying the same thing in standard configure scripts).
>>
>> Q: The reverse: What if I install the sage-dist package, but want to
>> switch to the system package?
>>
>> A: Same thing, but this is a little trickier because we would need to
>> *uninstall* the package from $SAGE_LOCAL. I have a proposal for
>> improving spkg uninstallation written up at
>> https://trac.sagemath.org/ticket/22510
>>
>> Q: What if I use a system package when building Sage, but that package
>> is later upgraded, or worse, removed?
>>
>> A: There's no great solution to this. Certainly, I think the
>> ./configure time checks should be cached (since updates are not
>> usually *that* frequent). So there needs to be good documentation on
>> invalidating the cache when re-running ./configure. Still, that only
>> helps with configure-time detection. Sage can still break at runtime
>> if a system package it depends on changes. This is a generic problem
>> for *any* software development, however, and something developers
>> should be aware of if they're updating their system. Granted, most
>> people don't always closely examine what's changing when they install,
>> for example, OS updates. I certainly don't always check this with a
>> fine-toothed comb. But it's a general issue. Keeping the ability to
>> install the "standard", known-working sage-dist spkgs if needed is
>> also a big advantage of this proposal.
>>
>> Any other questions?
>>
>>
>> 7. Future concepts
>> ==================
>>
>> a) Platform hooks
>> -----------------
>>
>> It might be nice, when using system packages, for the underlying
>> OS/distribution system to hook into the SAGE_SPKG_CHECK_ system, both
>> to check if a package is installed, and to provide its version number.
>> For example, when building Sage on Debian, it might just hook into the
>> dpkg system to provide this information in a manner consistent with
>> the system.
>>
>> b) Abstract packages
>> --------------------
>>
>> Returning to the question of dependencies that can be satisfied by
>> more than one package (e.g. BLAS, GMP), I think it would be nice to
>> have a generic way of handling such cases that's a little cleaner than
>> the current ad-hoc system. I would like a way of specifying an
>> "abstract" package (which might be named "blas", for example).
>> Installing an abstract package would mean installing the concrete
>> package selected to satisfy it, but it would also include a system for
>> switching between concrete implementations. So for example it would
>> be possible to have multiple BLAS implementations installed
>> simultaneously, and installing "blas" with the current selection might
>> just be a matter of updating some symlinks.
>>
>> I think this concept fits in well with the proposal for handling
>> system packages, but doesn't necessarily need to be handled
>> simultaneously with it. For now we can just maintain the special
>> cases I think...
>>
>>
>> 8. Conclusion (for now)
>> =======================
>>
>> I've heard many valid concerns with going beyond sage-the-distribution
>> for building/running Sage. Sage's huge collection of dependencies can
>> lead to many fragilities: Version X of package Y might work with
>> dependency A, but completely break dependency B. And supporting
>> versions V, W, and X of package Y simultaneously is a lot of overhead
>> compared to always just using version X of that package for Sage.
>>
>> I do personally have a preference, when it comes to writing software,
>> for supporting as wide a range of versions for my dependencies as is
>> feasible. For some dependencies the versions supported may,
>> necessarily, be very narrow. But for other cases there can be a lot
>> more room for flexibility.
>>
>> Regardless, I think this proposal maintains the current stability of
>> Sage by keeping the current preference for sage-the-distribution in
>> all cases by default. It also maintains the ability to use
>> custom-built versions of some of Sage's dependencies. But I think this
>> will also provide more flexibility in experimenting with using
>> existing system packages in cases where that's sufficient, and avoid
>> Sage duplicating system packages unnecessarily.
>>
>> Best,
>> Erik
>>
>>
>> [1] https://trac.sagemath.org/ticket/14405
>> [2] https://www.technovelty.org/tips/the-stamp-idiom-with-make.html
>> [3] https://groups.google.com/d/msg/sage-devel/8MJBe_qxWJ0/fTzOPVzDAAAJ
>
> --
> You received this message because you are subscribed to the Google Groups
> "sage-packaging" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sage-packaging+unsubscr...@googlegroups.com.
> To post to this group, send email to sage-packag...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sage-packaging/4897e22f-c3d2-4ba5-8a88-aada683197e4%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.