On Fri, Mar 4, 2016 at 11:21 AM, Salvatore Stella <etn45...@gmail.com> wrote:
> Dear All,
> I have a philosophical question concerning how we implement things in sage.
> More precisely, I am concerned with how we deal with external packages.
>
> Right now we implement our own package management system and we install
> (almost) all our dependencies under whichever prefix the user gives to sage.
> On the plus side we now have complete control over which version of each
> package we are dealing with, we have the ability to patch each of them,
> diverging from their upstream version if we need to, and our users only need
> a single command to install the whole circus.
>
> On the minus side though, there is a lot of duplicated stuff around. For
> example on my machine I have 28 packages that are installed both as part of
> my system and as part of sage and 99 packages unique to sage. This looks to
> me like a complete waste of resources (especially when you consider that
> among the duplicated things there are some heavy packages to compile like
> gcc).
>
> I can see a few more drawbacks to this approach. For example we have to
> maintain the packages we ship with and it makes it harder to use sage as a
> library rather than a CAS.
>
> Is there any plan to remove this issue in the future? If so, who do I talk to
> in order to contribute?

Hi Salvatore,

This is exactly the problem I'm starting to tackle as well, though I'm
also quite new to this community.  My more immediate tasks have to do
with getting Sage working natively on Windows, but I think that the
package management aspect of the problem is important for making Sage
(the library) easier to install on other platforms.

The problem, as I see it, is that Sage has built up over the years its
own ad-hoc packaging system.  This is actually really cool, because
it's what's allowed Sage to be installable on a wide range of
platforms!  I think this happened sort of organically over time, as
various direct dependencies of Sage couldn't be installed easily on
some platforms due to poor support for *those* dependencies'
dependencies and so on.  I know Jeroen and others have a much better
handle on the exact history of this (though reading the SPKG.txt of
many of Sage's packages gives one a good idea...)

To me, right now, the biggest problem is lack of machine-digestible
metadata concerning the relationships and dependencies of Sage's
spkgs.  Many are direct dependencies of Sage the library and are
critical for core features.  Others are not direct dependencies at
all, but are assumed to be needed because they're a dependency of some
other dependency of Sage, or in some cases because they don't have
consistent implementations across platforms (my favorite example of
the latter is GNU patch--Sage builds its own copy because the 'patch'
command on many systems is not consistent--but for most Linux
distributions the 'patch' command provided by the platform should be
just fine).  So these dependency hierarchies are not tracked well
enough for my liking.  There are build dependencies listed in the
"dependencies" file of many spkgs, but there's otherwise no
distinction between build dependencies, runtime dependencies, test
dependencies, etc.  It's also hard to construct the reverse
dependencies--that is, each package's dependent packages.  This isn't
a criticism since I don't think it was ever the original intent for
Sage to come with its own packaging system [correct me if I'm
wrong though!], but that's my observation of where things are now :)
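
To make the metadata idea concrete, here is a rough sketch of what
"machine-digestible" could look like: each package lists its build,
runtime, and test dependencies separately, and the reverse dependencies
are derived from that.  The layout and the function name are made up
for illustration, and the relationships shown are NOT Sage's actual
dependency graph.

```python
from collections import defaultdict

# Hypothetical metadata: package -> {kind: [dependencies]}.
# The relationships below are illustrative only.
PACKAGES = {
    "sagelib": {"build": ["cython"], "runtime": ["pari", "gmp"], "test": []},
    "pari": {"build": ["patch"], "runtime": ["gmp"], "test": []},
    "cython": {"build": [], "runtime": [], "test": []},
    "gmp": {"build": [], "runtime": [], "test": []},
    "patch": {"build": [], "runtime": [], "test": []},
}


def reverse_dependencies(packages, kinds=("build", "runtime", "test")):
    """Map each package to the sorted list of packages that depend on it.

    Only dependency kinds listed in `kinds` are taken into account.
    """
    rdeps = defaultdict(set)
    for name, deps in packages.items():
        for kind in kinds:
            for dep in deps.get(kind, []):
                rdeps[dep].add(name)
    # Include packages that nothing depends on, with an empty list.
    return {name: sorted(rdeps.get(name, ())) for name in packages}


if __name__ == "__main__":
    for name, dependents in reverse_dependencies(PACKAGES).items():
        print(name, "<-", ", ".join(dependents) or "(nothing)")
```

With the kinds kept separate, a question like "which packages need
'patch' only at build time?" becomes a one-line query
(`reverse_dependencies(PACKAGES, kinds=("build",))`) instead of a
manual read through each spkg.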

Another related issue is the patches--some spkgs have patches that
affect runtime behavior in a way that's necessary to integrate with
Sage, or that otherwise fix bugs that are not fixed upstream. In
those cases it's important to rebuild that package for Sage.  Other
patches, however, only have to do with building that package in the
context of being a Sage spkg, and are not usually as relevant if that
package is being provided by the platform package repository.  It
would be great to have explicit metadata making these distinctions.
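
A minimal sketch of what such patch metadata might look like, assuming
each patch is tagged with a purpose.  The tag names ("runtime",
"bugfix", "build") and the file names are invented for this example and
are not an existing Sage convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    filename: str  # hypothetical patch file name
    purpose: str   # one of "runtime", "bugfix", "build" (made-up tags)


# Patches with these purposes change the installed package itself, so a
# platform-provided copy of the package would not be a drop-in substitute.
NEEDS_SAGE_BUILD = {"runtime", "bugfix"}


def must_rebuild_for_sage(patches):
    """Return True if any patch changes behavior rather than just the build."""
    return any(p.purpose in NEEDS_SAGE_BUILD for p in patches)


if __name__ == "__main__":
    build_only = [Patch("fix-makefile.patch", "build")]
    mixed = build_only + [Patch("sage-integration.patch", "runtime")]
    print(must_rebuild_for_sage(build_only))
    print(must_rebuild_for_sage(mixed))
```

A package whose patches are all build-only could then be taken from the
platform's repository, while one carrying a runtime or bugfix patch
would still be built specially for Sage.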

You'll find that the Debian Science community has done a lot of work
on this: https://wiki.debian.org/DebianScience/Sage  The work is
currently a little out of date, but they've helped identify many of
the packages that *must* be built specially for Sage, vs which
packages can come from Debian's repository (they've also packaged many
of Sage's dependencies for Debian that were not already packaged).
Currently they use some clever hacks to prevent Sage from building its
own packages where it's not necessary, but I think a few minor
upstream tweaks in Sage can make this a bit cleaner (i.e. a way to
explicitly tell Sage which packages to build for itself--Jeroen has
the right idea with the patch he linked to).  As for the packages for
which Sage needs special tweaks, there are experimental Sage-specific
forks of those packages.  For example, for PARI there's a pari-sage
package that duplicates PARI but with the Sage-specific patches.  I
think this is a good approach to handling this on the system packaging
side (though I don't think it's terribly relevant or important for
Sage itself to concern itself with).
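
As a rough illustration of "explicitly telling Sage which packages to
build for itself", the selection could be as simple as the sketch
below.  This is not the patch mentioned above and not an existing Sage
interface.  The function name and its inputs are hypothetical, and it
assumes that anything the system does not provide gets built.

```python
def split_build_plan(all_packages, build_for_sage, system_provided):
    """Split packages into (to_build, from_system), each sorted by name.

    A package is built by Sage if it is explicitly listed in
    `build_for_sage`, or if the system does not provide it.  Everything
    else is taken from the system.
    """
    to_build, from_system = [], []
    for name in sorted(all_packages):
        if name in build_for_sage or name not in system_provided:
            to_build.append(name)
        else:
            from_system.append(name)
    return to_build, from_system


if __name__ == "__main__":
    # Illustrative inputs only: pari is forced because it needs
    # Sage-specific patches, patch comes from the system, and gcc is
    # assumed to be missing from the system.
    to_build, from_system = split_build_plan(
        all_packages={"pari", "patch", "gcc"},
        build_for_sage={"pari"},
        system_provided={"pari", "patch"},
    )
    print("build:", to_build)
    print("system:", from_system)
```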

Anyways most of this is old news, but I thought I'd put my thoughts
out there after a week or so of exploring this issue on my own.

Erik
