Robert Bradshaw <rober...@math.washington.edu> writes:
>> But considering that we might one day want to make part of the Sage
>> library possible to install into your system Python distribution
>> (right?), it might be a good idea to keep it separate from the
>> "infrastructure" part of Sage.
>
> While that's a nice idea, there's much bigger technical hurdles to
> overcome than repository structure.

Well, I am certainly not knowledgeable about this, but William said the
following on IRC yesterday (when I was describing to Jason my recent
correspondence with you in this now way off-topic thread)::

    [2012-02-16 23:05:56] <wstein_> Kini -- you should increase your estimate that sage could be a standalone library.
    [2012-02-16 23:06:14] <wstein_> I've made it such more than once in the past. It is much, much easier than something like getting sage to build on cygwin.
    [2012-02-16 23:06:26] <wstein_> Which is another argument in your favor.
    [2012-02-16 23:07:31] <kini> wstein_: what about stuff like GAP integration of the group functionality, all the pari/gp integration, etc.?
    [2012-02-16 23:08:12] <wstein_> GAP is trivial, since it is via pexpect -- you just need it installed on your computer.
    [2012-02-16 23:08:14] <kini> er, maxima I guess is a better example than pari/gp there. maybe. I don't use Sage's number theory stuff :P
    [2012-02-16 23:08:22] <wstein_> Regarding PARI, you just include it in the library.
    [2012-02-16 23:08:22] <kini> true
    [2012-02-16 23:08:38] <wstein_> When I did this maxima only used pexpect.
    [2012-02-16 23:08:38] <kini> hmm
    [2012-02-16 23:08:44] <wstein_> Now you include it.
    [2012-02-16 23:08:55] <wstein_> Basically you include maybe 20% (ish) of the packages in Sage.
    [2012-02-16 23:09:04] <wstein_> E.g., obviously don't' include python, R, any python packages, etc.
    [2012-02-16 23:09:20] <kini> right...
    [2012-02-16 23:09:25] <wstein_> But core Sage math software gets included right in devel/sage/libs (say)
    [2012-02-16 23:09:25] <kini> that makes sense

So there's that.

>> I think there are some advantages to keeping our package management
>> separate from our mathematical library separate from our glue which
>> binds the two together, namely that we can import the first (from
>> Gentoo Prefix, say) and export the second (as a standalone Python package
>> with optional dependencies on stuff which we currently depend on, say).
>
> If the sage library could be used as a standalone Python package, and
> we could use Gentoo Prefix as is (with no modifications, but requiring
> it as a dependency(?)) this might make more sense. I see the "glue"
> code as being part of the library, neither is of much use without the
> other.

I don't understand what you mean by "use Gentoo Prefix as is (with no
modifications, but requiring it as a dependency)". Could you explain?

In the category of "glue code" I meant to include everything in
$SAGE_LOCAL/bin/sage-*. I see much of that stuff as more related to
maintenance of the entire distribution of software we ship than to the
actual Python/Cython code in the Sage library. Of course, if we do
switch to Prefix or something like it, most of it will become
unnecessary, or can actually be fit into the portage data tree as
"Tools", so I see your point.

> I think Sage will be monolithic and Windows be VM for the near future
> at least, with a larger percentage of people using a Sage install "in
> the cloud" on a university or otherwise hosted server for the near-mid
> term future. But as you said who knows...

Yes, I agree. That seems like the most likely future, at the moment.
However, William has asked me to write a Sage Enhancement Proposal for
switching to git, and I think we concluded on IRC that it might make
sense to make a long term timeline for other big changes as well, such
as what we're talking about now - or at least a proposal for such a
timeline :)

>> I'm not sure that's such a good idea. We should be able hotfix SPKGs
>> without having to hotfix Sage itself. Or to put it another way, the
>> development of build scripts for packages shouldn't really be in
>> lockstep with the development of Sage, the mathematical system, since we
>> have no control over upstream releases or bugfixes.
>
> Note that we don't maintain the actual build script for most upstream
> packages, unless there's something Sage-specific.
> Currently, spkgs are
> pristine upstream tarball + Sage customizations + metadata (e.g.
> contact information). Ideally, the customizations is empty, but not
> always.

Right, of course - by "build scripts" I meant what we currently call
spkg-install scripts, not the actual setup.py or Makefile or configure
or whatever that comes with the vanilla upstream source, or the setup.py
or Makefile or configure or whatever we dump into the upstream source
after extracting it, or patch in, or whatever. I also meant all the
other stuff that is currently tracked in the repository of an SPKG, but
more on that below...

> The question is where to put these customizations. Currently, they're
> in a separate repository for each spkg (which is also separate from
> the upstream project's repo). We push this upstream when it makes
> sense, but even when we can it's often a slow process. Often changes
> here (including bumping the version) involve a parallel set of changes
> to the library (though this need not always be the case, especially
> for the more stable/standard spkgs). Being able to do this actually
> helps with the lack of control over upstream releases and bugfixes.
> Where would you propose we stick this information?

These customizations would go into the "build scripts" repository!
Sorry, I guess it's possible you're not familiar with sage-on-gentoo_
and Prefix, and I should explain. This is *exactly* what is done there,
and by extension in lmonade_. The portage data tree primarily contains
ebuilds, which are shell scripts that don't do anything if you just run
them, but instead define sh functions that describe how to configure,
build, install, etc. the package. The package manager, `emerge`, then
runs these functions as necessary. But the data tree *also* contains
miscellaneous files such as patches or whatever else you want!
The functions in the ebuild can use these files to modify the source
after extracting it, or use those files to play soothing music while the
package is being installed, or whatever it wants really. So I don't
think we lose anything in terms of flexibility of patching upstream
source, if we use Prefix rather than SPKGs.

*Huge* thanks to François Bissey and Christopher Schwan for their work
on sage-on-gentoo, by the way. They have converted an amazing number of
SPKGs into ebuilds + misc files, as you can see in the sage-on-gentoo_
repository. Of course, they also had help from Gentoo itself, where many
of these packages already exist and have ebuilds, so all that was needed
was to port our customizations into the already existing ebuild and slap
those customizations with a "sage" nametag (or more precisely a "USE
flag" in Gentoo terminology).

USE flags allow you to build packages with certain features - for
example you might want to build a program disabling its Gtk+ interface
but enabling its Qt one, so you'd build the package with USE flags set
to "-gtk qt4". So what François and Christopher have been doing is
creating a "feature" of relevant programs called "sage" representing
Sage compatibility, and setting up the Sage package itself to force all
its dependencies to be compiled with this "sage compatibility feature"
enabled. Of course, enabling this USE flag just applies our patches. But
if you want the vanilla package for some reason, you can forcibly build
that instead! The Prefix system is really incredibly flexible. I highly
recommend you take a look at sage-on-gentoo and lmonade if you haven't
already.

.. _sage-on-gentoo: http://github.com/cschwan/sage-on-gentoo
.. _lmonade: http://www.lmona.de/

Regarding the following sentence:

> Often changes here (including bumping the version) involve a parallel
> set of changes to the library (though this need not always be the
> case, especially for the more stable/standard spkgs).

This is fine.
The user just won't be able to install the new downstream patch version
of the package until the new version of Sage is released and they
upgrade to it, that's all. But the new package will still be there if
they want to forcibly install it and play around. Contrast this with the
current situation where if you want to use an SPKG that is in
development, you need to go find it on the trac ticket for upgrading the
SPKG, download it, and then run `sage -f` on it, not to mention that you
can't even downgrade back to the version sanctioned for use with your
current Sage version.

>> I see a future build-script repository as being something that people
>> continually update from trunk to "check for updates". To avoid premature
>> upgrading to new SPKGs that don't work with an old version of Sage, the
>> Sage package itself would require certain versions of certain SPKGs and
>> no higher. Of course, they would not pull from the Sage repository
>> itself until an actual version was released. This is more or less how
>> many Linux distributions work, i.e. package metadata / build scripts /
>> etc. are updated constantly, and actual software packages have much more
>> infrequent stable version releases.
>
> This assumes the various packages can be upgraded independently, which
> is clearly not the case.

Sometimes, I would even venture to say semi-often, they can be upgraded
independently without problems, especially in the case of optional SPKGs
which don't have a lot of Sage library code interfacing with them and
with other packages. If there is something stopping them from being
upgraded independently, you can always write strict
specific-version-dependency requirements into the ebuilds and no harm
done.

>> So for this to work the Sage repository would have to be separate from
>> the packages repository.
>
> But for this to work we would need a separate pari-sage, separate from
> pari, in the global package repository if we made any modifications to
> pari.
>
> Lets assume for the moment that the glue+python library is a single
> repository called "sage-lib." From what I gather, what you want the
> sage distribution to be is:
>
>     package_manager (e.g. gentoo-prefix, a prerequisite)
>     mpir-x.y.z
>     pari-x.y.z
>     cython-x.y.z
>     python-x.y.z
>     ...
>     sage-lib-x.y.z
>
> Where sage-lib-x.y.z lists as its dependencies
>
>     mpir-x.y.z
>     pari-x.y.z
>     cython-x.y.z
>     python-x.y.z
>     ...
>
> Right? Then the package manager would just do its thing. The crux of
> the issue is that what we really have for the dependencies of
> sage-lib-x.y.z is
>
>     mpir-x.y.z
>     pari-x.y.z + epsilon
>     cython-x.y.z + epsilon
>     python-x.y.z
>     ...
>
> What I would like to see is the dependencies (epsilon's and specific
> version numbers) stored in the single repository with the code that
> depends on them, so a commit could describe a global sage state and
> all development could be expressed as "just a (set of) patches (=
> branch in a personal/public repo)" Changing this file would trigger a
> re-build (ideally with lots of transparent caching). That in a
> nutshell is my proposal.
>
> Currently the epsilons are scattered across various non-upstream
> repos, there's no explicit history of versions outside of what happens
> to ship in a release tarball, and there's several distinct repos for
> the various pieces of Sage. I think we can do better.

As I've mentioned above, we can easily pack the "+ epsilon" in with the
package metadata, and even separate it from the "pari-x.y.z" or
"cython-x.y.z" with a "sage" USE flag (and this has already been done
for us, mostly, though of course I should hasten to add that
sage-on-gentoo and especially lmonade are experimental projects).
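
To make the "+ epsilon behind a USE flag" idea a bit more concrete, here
is a toy Python model of it. This is only a sketch and not Portage code:
real ebuilds are shell scripts, and the function name `build_steps`, the
patch filenames and the step strings are all made up for illustration.

```python
# Toy model only: real ebuilds are bash scripts run by the package manager.
# All names here (build_steps, the patch filenames) are hypothetical.

def build_steps(package, sage_patches, use_flags):
    """Return the ordered build steps for `package`.

    The "+ epsilon" patches are applied only when the "sage" USE flag
    is enabled; otherwise the vanilla upstream source is built.
    """
    steps = [f"unpack {package}"]
    if "sage" in use_flags:
        steps += [f"apply {patch}" for patch in sage_patches]
    steps += ["configure", "compile", "install"]
    return steps


patches = ["fix-build.patch", "sage-compat.patch"]  # hypothetical filenames

# Without the flag we get the vanilla package; with it, our patches go in.
vanilla = build_steps("pari-x.y.z", patches, use_flags=set())
patched = build_steps("pari-x.y.z", patches, use_flags={"sage"})
print(vanilla)
print(patched)
```

The point is just that the patches live next to the package metadata and
are switched on by a flag, rather than being baked into a separate
"pari-sage" package.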
If I understand what you are saying, you want the patches to live in the
Sage repository itself, so that if you move around in the Sage library
history to various points where various different sets of patches are
needed for Sage's dependencies, then those packages will be
automatically rebuilt when you run `sage -b`.

I think we can get a pretty good alternative to this with the split
system I'm describing (nimble package management repo for everyone and
carefully reviewed sage repo for creating releases from), if I
understand correctly. We would include the ebuild for Sage itself inside
the Sage repository, and export it to our package repository when we
make a release of Sage. `sage -b`, on the other hand, would just build
the ebuild that existed inside the checked out revision. We could call
this "sage-9999.ebuild", following the Gentoo convention for ebuilds
with no fixed version which simply build the current source code, for
some value of "current".

sage-9999 would depend on exact versions of packages, including exact
-r[0-9]+ suffixes. Suppose the current version of Sage was shipping
pari-x.y.z-r3 , where "-r3" corresponds to the current ".p3" in the name
of an SPKG. Then if I wanted to change the patches on pari-x.y.z, I
would commit the new patches to our package repository, along with
pari-x.y.z-r4.ebuild, which would be pari-x.y.z-r3.ebuild updated to use
the new patches' filenames, and with any other necessary changes. Then I
would modify the Sage ebuild inside the Sage library source tree to
depend on pari-x.y.z-r4 instead of pari-x.y.z-r3. And then, of course,
we would just make `sage -b` run the package manager on
sage-9999.ebuild .

As for caching, Portage also integrates with ccache, fwiw, though I
don't know if the same can be said of Prefix. Would this be useful for
Cython caching?

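
Going back to the exact-version pinning for a moment, here is a rough
Python sketch of the check that `sage -b` would effectively be asking
the package manager to make. It is an illustration under my own
assumptions, not how Portage is implemented: `parse_atom`, `unmet_pins`
and the simplified "name-version-rN" format are hypothetical, and I
treat a missing suffix as "-r0".

```python
import re

# Hypothetical, simplified atom format: name-version[-rN].
# Package names containing hyphens are deliberately not handled here.
ATOM_RE = re.compile(
    r"^(?P<name>[a-z0-9_+]+)-(?P<version>[^-]+)(?:-r(?P<rev>\d+))?$"
)


def parse_atom(atom):
    """Split e.g. 'pari-x.y.z-r3' into ('pari', 'x.y.z', 3)."""
    m = ATOM_RE.match(atom)
    if m is None:
        raise ValueError(f"cannot parse package atom: {atom!r}")
    # Assumption: no -rN suffix means revision 0.
    return m["name"], m["version"], int(m["rev"] or 0)


def unmet_pins(pinned, installed):
    """Return the pins that no installed package satisfies exactly."""
    have = {parse_atom(a) for a in installed}
    return [a for a in pinned if parse_atom(a) not in have]


# What the checked-out sage-9999 ebuild pins (placeholder versions).
sage_9999_deps = ["pari-x.y.z-r4", "mpir-x.y.z-r0"]
# What is currently installed.
installed = ["pari-x.y.z-r3", "mpir-x.y.z"]

# Anything listed here would have to be rebuilt before building Sage.
print(unmet_pins(sage_9999_deps, installed))
```

With the data above, only pari is out of date, since the checkout asks
for -r4 and -r3 is installed. Checking out an older revision of Sage
that pins -r3 would make the list empty again.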
>> Another advantage of this is that people writing their own packages to
>> use with Sage could more easily and quickly get their stuff into the
>> Sage package repository as an official optional package, which goes well
>> with what William (?) was saying somewhere on sage-devel recently about
>> how the packaging system of Sage should allow people to write their own
>> packages without communication with us. Though of course those people
>> could always distribute files that would plug into our packaging system,
>> having the package repository separate from Sage would also encourage it
>> to be modular enough to make this feasible.
>
> Certainly a requirement to having a system that supports optional
> spkgs. Sharing an optional spkg would be as simple as sharing a patch
> (or just the new files themselves, which contains a pointer to the
> upstream package and any needed metadata) and rebuilding.
>
> Upgrading to a newer version of an upstream package would likewise be
> just a patch as well.

Yup, this is a standard thing to do on Gentoo or Prefix. Gentoo is
actually still using CVS for their tree, I think (gag), so the usual way
is to just put up a .ebuild file somewhere, along with any patches you
want the user to put into the files directory. But we could of course do
better with a DVCS - say, if you wanted to maintain a package, you could
fork our packaging repository on GitHub and insert your own stuff.

>> Just some thoughts.
>
> And very good food for thought.

Likewise! And on that note, since man shall not live by every word that
proceedeth out of the mouth of sage-devel alone, I'm off to dinner :)

-Keshav

----
Join us in #sagemath on irc.freenode.net !