[sage-devel] Re: log messages

Keshav Kini Fri, 17 Feb 2012 01:10:28 -0800

Robert Bradshaw <rober...@math.washington.edu> writes:
>> But considering that we might one day want to make part of the Sage
>> library possible to install into your system Python distribution
>> (right?), it might be a good idea to keep it separate from the
>> "infrastructure" part of Sage.
>
> While that's a nice idea, there's much bigger technical hurdles to
> overcome than repository structure.


Well, I am certainly not knowledgeable about this, but William said the
following on IRC yesterday (when I was describing to Jason my recent
correspondence with you in this now way off-topic thread)::

[2012-02-16 23:05:56] <wstein_> Kini -- you should increase your estimate that 
sage could be a standalone library.
[2012-02-16 23:06:14] <wstein_> I've made it such more than once in the past.  
It is much, much easier than something like getting sage to build on cygwin.
[2012-02-16 23:06:26] <wstein_> Which is another argument in your favor.
[2012-02-16 23:07:31] <kini> wstein_: what about stuff like GAP integration of 
the group functionality, all the pari/gp integration, etc.?
[2012-02-16 23:08:12] <wstein_> GAP is trivial, since it is via pexpect -- you 
just need it installed on your computer.
[2012-02-16 23:08:14] <kini> er, maxima I guess is a better example than 
pari/gp there. maybe. I don't use Sage's number theory stuff :P
[2012-02-16 23:08:22] <wstein_> Regarding PARI, you just include it in the 
library.
[2012-02-16 23:08:22] <kini> true
[2012-02-16 23:08:38] <wstein_> When I did this maxima only used pexpect.
[2012-02-16 23:08:38] <kini> hmm
[2012-02-16 23:08:44] <wstein_> Now you include it.
[2012-02-16 23:08:55] <wstein_> Basically you include maybe 20% (ish) of the 
packages in Sage.
[2012-02-16 23:09:04] <wstein_> E.g., obviously don't include python, R, any 
python packages, etc.
[2012-02-16 23:09:20] <kini> right...
[2012-02-16 23:09:25] <wstein_> But core Sage math software gets included right 
in devel/sage/libs (say)
[2012-02-16 23:09:25] <kini> that makes sense

So there's that.

>> I think there are some advantages to keeping our package management
>> separate from our mathematical library separate from our glue which
>> binds the two together, namely that we can import the first (from
>> Gentoo Prefix, say) and export the second (as a standalone Python package
>> with optional dependencies on stuff which we currently depend on, say).
>
> If the sage library could be used as a standalone Python package, and
> we could use Gentoo Prefix as is (with no modifications, but requiring
> it as a dependency(?)) this might make more sense. I see the "glue"
> code as being part of the library, neither is of much use without the
> other.

I don't understand what you mean by "use Gentoo prefix as is (with no
modifications, but requiring it as a dependency)". Could you explain?

In the category of "glue code" I meant to include everything in
$SAGE_LOCAL/bin/sage-*. I see much of that stuff as more related to
maintenance of the entire distribution of software we ship than to the
actual Python/Cython code in the Sage library. Of course, if we do
switch to Prefix or something like it, most of it will become
unnecessary, or can actually be fit into the portage data tree as
"Tools", so I see your point.

> I think Sage will be monolithic and Windows be VM for the near future
> at least, with a larger percentage of people using a Sage install "in
> the cloud" on a university or otherwise hosted server for the near-mid
> term future. But as you said who knows...

Yes, I agree. That seems like the most likely future, at the moment.
However, William has asked me to write a Sage Enhancement Proposal for
switching to git, and I think we concluded on IRC that it might make
sense to make a long term timeline for other big changes as well, such
as what we're talking about now - or at least a proposal for such a
timeline :)

>> I'm not sure that's such a good idea. We should be able to hotfix SPKGs
>> without having to hotfix Sage itself. Or to put it another way, the
>> development of build scripts for packages shouldn't really be in
>> lockstep with the development of Sage, the mathematical system, since we
>> have no control over upstream releases or bugfixes.
>
> Note that we don't maintain the actual build script for most upstream
> packages, unless there's something Sage-specific. Currently, spkgs are
> pristine upstream tarball + Sage customizations + metadata (e.g.
> contact information). Ideally, the customizations is empty, but not
> always.

Right, of course - by "build scripts" I meant what we currently call
spkg-install scripts, not the actual setup.py or Makefile or configure
or whatever that comes with the vanilla upstream source, or the setup.py
or Makefile or configure or whatever we dump into the upstream source
after extracting it, or patch in, or whatever.

I also meant all the other stuff that is currently tracked in the
repository of an SPKG, but more on that below...

> The question is where to put these customizations. Currently, they're
> in a separate repository for each spkg (which is also separate from
> the upstream project's repo). We push this upstream when it makes
> sense, but even when we can it's often a slow process. Often changes
> here (including bumping the version) involve a parallel set of changes
> to the library (though this need not always be the case, especially
> for the more stable/standard spkgs). Being able to do this actually
> helps with the lack of control over upstream releases and bugfixes.
> Where would you propose we stick this information?

These customizations would go into the "build scripts" repository!
Sorry, I guess it's possible you're not familiar with sage-on-gentoo_
and Prefix, and I should explain. This is *exactly* what is done there,
and by extension in lmonade_. The portage data tree primarily contains
ebuilds, which are shell scripts that don't do anything if you just run
them, but instead define sh functions that describe how to configure,
build, install, etc. the package. The package manager, `emerge`, then
runs these functions as necessary.
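
To make the "defines functions, runs nothing" idea concrete, here is a toy
sketch in Python. It is not real Portage code: the phase names are only
loosely modeled on Portage's src_* phases, and `make_recipe` and
`run_phases` are hypothetical names invented for this illustration.

```python
# Toy model of an ebuild: the recipe only *defines* phase functions.
# Nothing happens until a package manager decides to call them.
# All names here are hypothetical, not part of Portage.

PHASES = ("unpack", "prepare", "configure", "compile", "install")


def make_recipe():
    """Return the phase functions for a made-up package."""
    return {
        "unpack": lambda: "extract upstream tarball",
        "prepare": lambda: "apply downstream patches",
        "compile": lambda: "run make",
        "install": lambda: "copy files into the prefix",
    }


def run_phases(recipe, phases=PHASES):
    """Call each phase the recipe defines, in the manager's fixed order."""
    performed = []
    for name in phases:
        step = recipe.get(name)
        if step is None:
            # The recipe did not define this phase, so skip it.
            continue
        performed.append((name, step()))
    return performed


if __name__ == "__main__":
    for name, action in run_phases(make_recipe()):
        print(f"{name}: {action}")
```

The point of the split is that the order of the phases belongs to the
manager, while the content of each phase belongs to the recipe.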

But the data tree *also* contains miscellaneous files such as patches or
whatever else you want! The functions in the ebuild can use these files
to modify the source after extracting it, or use those files to play
soothing music while the package is being installed, or whatever it
wants really. So I don't think we lose anything in terms of flexibility
of patching upstream source, if we use Prefix rather than SPKGs.

*Huge* thanks to François Bissey and Christopher Schwan for their work
on sage-on-gentoo, by the way. They have converted an amazing number of
SPKGs into ebuilds + misc files, as you can see in the sage-on-gentoo_
repository. Of course, they also had help from Gentoo itself, where many
of these packages already exist and have ebuilds, so all that was needed
was to port our customizations into the already existing ebuild and slap
those customizations with a "sage" nametag (or more precisely a "USE
flag" in Gentoo terminology).

USE flags allow you to build packages with certain features - for
example you might want to build a program disabling its Gtk+ interface
but enabling its Qt one, so you'd build the package with USE flags set
to "-gtk +qt4". So what François and Christopher have been doing is
creating a "feature" of relevant programs called "sage" representing
Sage compatibility, and setting up the Sage package itself to force all
its dependencies to be compiled with this "sage compatibility feature"
enabled. Of course, enabling this USE flag just applies our patches. But
if you want the vanilla package for some reason, you can forcibly build
that instead!
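
As a rough illustration of how a "sage" flag could gate the patches, here is
a small Python sketch. It assumes flags are written as in the example above
("-gtk +qt4"); `parse_use` and `patches_to_apply` are hypothetical helpers,
and the patch filename is made up.

```python
# Toy model of USE-flag handling. This is not Portage's implementation;
# the function names and the patch name are invented for illustration.


def parse_use(spec):
    """Parse a string such as "-gtk +qt4" into {"gtk": False, "qt4": True}."""
    flags = {}
    for token in spec.split():
        if token.startswith("-"):
            flags[token[1:]] = False
        else:
            # A leading "+" is optional; a bare name also enables the flag.
            flags[token.lstrip("+")] = True
    return flags


def patches_to_apply(use_flags, sage_patches):
    """Return the Sage patches only when the "sage" flag is enabled."""
    if use_flags.get("sage", False):
        return list(sage_patches)
    return []


if __name__ == "__main__":
    patches = ["hypothetical-sage-fix.patch"]
    print(patches_to_apply(parse_use("sage -gtk"), patches))   # patched build
    print(patches_to_apply(parse_use("-sage -gtk"), patches))  # vanilla build
```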

The Prefix system is really incredibly flexible. I highly recommend you
take a look at sage-on-gentoo and lmonade if you haven't already.

.. _sage-on-gentoo: http://github.com/cschwan/sage-on-gentoo
.. _lmonade: http://www.lmona.de/

Regarding the following sentence:

> Often changes here (including bumping the version) involve a parallel
> set of changes to the library (though this need not always be the
> case, especially for the more stable/standard spkgs).

This is fine. The user just won't be able to install the new
downstream-patched version of the package until the new version of Sage
is released and they upgrade to it, that's all. But the new package will
still be there if they want to forcibly install it and play around.
Contrast this with
the current situation where if you want to use an SPKG that is in
development, you need to go find it on the trac ticket for upgrading the
SPKG, download it, and then run `sage -f` on it, not to mention that you
can't even downgrade back to the version sanctioned for use with your
current Sage version.

>> I see a future build-script repository as being something that people
>> continually update from trunk to "check for updates". To avoid premature
>> upgrading to new SPKGs that don't work with an old version of Sage, the
>> Sage package itself would require certain versions of certain SPKGs and
>> no higher. Of course, they would not pull from the Sage repository
>> itself until an actual version was released. This is more or less how
>> many Linux distributions work, i.e. package metadata / build scripts /
>> etc. are updated constantly, and actual software packages have much more
>> infrequent stable version releases.
>
> This assumes the various packages can be upgraded independently, which
> is clearly not the case.

Sometimes, I would even venture to say semi-often, they can be
upgraded independently without problems, especially in the case of
optional SPKGs which don't have a lot of Sage library code interfacing
with them and with other packages. If there is something stopping them
from being upgraded independently, you can always write strict
specific-version-dependency requirements into the ebuilds and no harm
done.

>> So for this to work the Sage repository would have to be separate from
>> the packages repository.
>
> But for this to work we would need a separate pari-sage, separate from
> pari, in the global package repository if we made any modifications to
> pari.
>
> Let's assume for the moment that the glue+python library is a single
> repository called "sage-lib." From what I gather, what you want the
> sage distribution to be is:
>
> package_manager (e.g. gentoo-prefix, a prerequisite)
> mpir-x.y.z
> pari-x.y.z
> cython-x.y.z
> python-x.y.z
> ...
> sage-lib-x.y.z
>
> Where sage-lib-x.y.z lists as its dependencies
>
> mpir-x.y.z
> pari-x.y.z
> cython-x.y.z
> python-x.y.z
> ...
>
> Right? Then the package manager would just do its thing. The crux of
> the issue is that what we really have for the dependencies of
> sage-lib-x.y.z is
>
> mpir-x.y.z
> pari-x.y.z + epsilon
> cython-x.y.z + epsilon
> python-x.y.z
> ...
>
> What I would like to see is the dependencies (epsilon's and specific
> version numbers) stored in the single repository with the code that
> depends on them, so a commit could describe a global sage state and
> all development could be expressed as "just a (set of) patches (=
> branch in a personal/public repo)" Changing this file would trigger a
> re-build (ideally with lots of transparent caching). That in a
> nutshell is my proposal.
>
> Currently the epsilons are scattered across various non-upstream
> repos, there's no explicit history of versions outside of what happens
> to ship in a release tarball, and there's several distinct repos for
> the various pieces of Sage. I think we can do better.

As I've mentioned above, we can easily pack the "+ epsilon" in with the
package metadata, and even separate it from the "pari-x.y.z" or
"cython-x.y.z" with a "sage" USE flag (and this has already been done
for us, mostly, though of course I should hasten to add that
sage-on-gentoo and especially lmonade are experimental projects).

If I understand what you are saying, you want the patches to live in the
Sage repository itself, so that if you move around in the Sage library
history to various points where various different sets of patches are
needed for Sage's dependencies, then those packages will be
automatically rebuilt when you run `sage -b`.
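
A minimal sketch of that behaviour, assuming the pinned versions live in a
plain mapping that is checked in alongside the library code. The function
name, the mapping format, and the version strings are all invented for
illustration; this is not how `sage -b` works today.

```python
# Toy model of "rebuild whatever no longer matches the checked-out pins".
# Everything here is hypothetical, including the version strings.


def packages_to_rebuild(pinned, installed):
    """Return the names whose installed version differs from the pinned one.

    A package that is pinned but not installed at all also needs building.
    """
    return sorted(
        name
        for name, version in pinned.items()
        if installed.get(name) != version
    )


if __name__ == "__main__":
    # Pins as they might look at the checked-out revision.
    pinned = {"pari": "1.0-r2", "cython": "0.5-r1", "mpir": "2.0"}
    # What is currently built in the local installation.
    installed = {"pari": "1.0-r1", "mpir": "2.0"}
    print(packages_to_rebuild(pinned, installed))
```

Moving to a different point in history changes `pinned`, and only the
packages whose pins changed show up in the result.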

I think we can get a pretty good alternative to this with the split
system I'm describing (nimble package management repo for everyone and
carefully reviewed sage repo for creating releases from), if I
understand correctly. We would include the ebuild for Sage itself inside
the Sage repository, and export it to our package repository when we
make a release of Sage. `sage -b`, on the other hand, would just build
the ebuild that existed inside the checked out revision. We could call
this "sage-9999.ebuild", following the Gentoo convention for ebuilds
with no fixed version which simply build the current source code, for
some value of "current".

sage-9999 would depend on exact versions of packages, including exact
-r[0-9]+ suffixes. Suppose the current version of Sage was shipping
pari-x.y.z-r3, where "-r3" corresponds to the current ".p3" in the name
of an SPKG. Then if I wanted to change the patches on pari-x.y.z, I
would commit the new patches to our package repository, along with
pari-x.y.z-r4.ebuild, which would be pari-x.y.z-r3.ebuild updated to use
the new patches' filenames, and with any other necessary changes. Then I
would modify the Sage ebuild inside the Sage library source tree to
depend on pari-x.y.z-r4 instead of pari-x.y.z-r3.
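
The naming part of this workflow can be sketched in a few lines of Python.
The helpers below (`split_revision`, `bump_revision`, `spkg_to_atom`) are
hypothetical and only handle the name shapes used in this message; they are
not part of Portage or Sage.

```python
import re

# Hypothetical helpers for the "-rN" / ".pN" naming discussed above.
_EBUILD_REVISION = re.compile(r"^(?P<base>.+?)-r(?P<rev>[0-9]+)$")
_SPKG_PATCHLEVEL = re.compile(r"^(?P<base>.+?)\.p(?P<rev>[0-9]+)$")


def split_revision(atom):
    """Split "pari-x.y.z-r3" into ("pari-x.y.z", 3).

    A name without a "-rN" suffix is treated as revision 0.
    """
    match = _EBUILD_REVISION.match(atom)
    if match is None:
        return atom, 0
    return match.group("base"), int(match.group("rev"))


def bump_revision(atom):
    """Return the name of the next downstream revision of a package."""
    base, rev = split_revision(atom)
    return f"{base}-r{rev + 1}"


def spkg_to_atom(spkg_name):
    """Translate an SPKG-style ".pN" suffix into a "-rN" suffix."""
    match = _SPKG_PATCHLEVEL.match(spkg_name)
    if match is None:
        return spkg_name
    return f"{match.group('base')}-r{int(match.group('rev'))}"


if __name__ == "__main__":
    current = spkg_to_atom("pari-x.y.z.p3")
    print(current, "->", bump_revision(current))
```

Changing the patches then amounts to adding the bumped ebuild to the
package repository and updating the one dependency line in the Sage ebuild.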

And then, of course, we would just make `sage -b` run the package
manager on sage-9999.ebuild. As for caching, Portage also integrates
with ccache, fwiw, though I don't know if the same can be said of
Prefix. Would this be useful for Cython caching?

>> Another advantage of this is that people writing their own packages to
>> use with Sage could more easily and quickly get their stuff into the
>> Sage package repository as an official optional package, which goes well
>> with what William (?) was saying somewhere on sage-devel recently about
>> how the packaging system of Sage should allow people to write their own
>> packages without communication with us. Though of course those people
>> could always distribute files that would plug into our packaging system,
>> having the package repository separate from Sage would also encourage it
>> to be modular enough to make this feasible.
>
> Certainly a requirement to having a system that supports optional
> spkgs. Sharing an optional spkg would be as simple as sharing a patch
> (or just the new files themselves, which contains a pointer to the
> upstream package and any needed metadata) and rebuilding.
>
> Upgrading to a newer version of an upstream package would likewise be
> just a patch as well.

Yup, this is a standard thing to do on Gentoo or Prefix. Gentoo is
actually still using CVS for their tree, I think (gag), so the usual way
is to just put up a .ebuild file somewhere, along with any patches you
want the user to put into the files directory. But we could of course do
better with a DVCS - say, if you wanted to maintain a package, you could
fork our packaging repository on GitHub and insert your own stuff.

>> Just some thoughts.
>
> And very good food for thought.

Likewise! And on that note, since man shall not live by every word that
proceedeth out of the mouth of sage-devel alone, I'm off to dinner :)

-Keshav

----
Join us in #sagemath on irc.freenode.net !
