Re: [sage-devel] Re: git integration repository, please test

Burcin Erocal Tue, 15 Jan 2013 13:54:38 -0800

On Tue, 15 Jan 2013 09:44:13 -0800
"R. Andrew Ohana" <andrew.oh...@gmail.com> wrote:

> On Tue, Jan 15, 2013 at 8:01 AM, Burcin Erocal <bur...@erocal.org>
> wrote:

> > Keshav, Jordi and many others have pointed these out before, but
> > our main problems seem to be:
> >
> >  - we are not really using a DVCS
> >  - the build system is showing its limits due to the rapid growth of
> >    Sage
> >
> 
> > I don't think "switching to GIT" will solve either of these problems
> > directly. Why don't we discuss the solutions to these problems
> > separately and put them into action?
> >
> 
> FYI, the whole switching to git is only looking at addressing the
> development model, not the build system -- but as switching to git
> gives us a chance to re-examine the directory structure, and modify
> it, the (current) build system will necessarily have to be modified.

IMHO, trying to solve all problems at once is not a good approach.

> >  - Development model:
> >
> >    I like the idea of keeping a branch for each issue on Trac. I
> >    have even suggested using mercurial's pbranch extension for this
> >    purpose privately to William about 2 years ago.
> >
> >    With a student, I even set up a patched roundup install with this
> >    feature. I can enable it on the lmonade issue tracker if anybody
> > is interested in testing this out. Though I believe using gitlab and
> >    not reinventing the wheel is far better.
> >
> 
> Please see http://wiki.sagemath.org/WorkflowSEP for a description of
> the issues, and a work in progress of the proposal.

I have seen this, of course. Like anyone else who has been involved
with Sage development for a long time, I am familiar with the problems
as well as the various unsuccessful solutions proposed in the past:
editors, stopgaps, etc.

Let's look at the items in the Problems Section:

#.  Sage has four major repositories and arbitrarily many SPKG
    repositories, instead of one repository like most software. This
    adds to complexity and may confuse new developers.

As William repeatedly states in his Sage talks, "Sage" stands for
many things at once. At least:

 - mathematical software distribution
 - Python/Cython library for computational algebra
 - graphical user interface
 - interfaces to mathematical software

Of these, the graphical user interface got its own repository.
Now we are trying to consolidate the rest?

The argument that most software has only one repository, so Sage should
too, is absurd.

The problem with the repositories is that they are not self-contained
enough. For example, the sage-doctest scripts should be in the Sage
library repository itself. They exist solely for testing Sage; when
they change, the library also changes.

If the library did not rely on environment variables being set in
specific ways, I don't see why else the scripts need to be merged with
it.

Similarly for packages. I don't understand why a computational algebra
library should include a package manager in its repository.

#.  Requiring human developers to manually create and upload patch files
    adds to the maintenance burden for coordinators.

+1

#.  The lack of a standardized machine-readable format in which to
    specify on a ticket which patches to apply where and in what order
    causes the patchbot to often guess the answers to these questions
    incorrectly, and causes developers to be uncertain as to how to
    influence the patchbot's guesses.

The fact that a ticket needs to have patches for several repositories
indicates that the files in these repositories should be shuffled a bit
to prevent this situation from happening again.

#.  The common practice of continually updating patches with new
    versions is confusing because one ends up with a soup of patches on
    a trac ticket, only the latest few of which are actually relevant
    anymore.
<snip>

+1

#.  In that vein, continually updating patches (as opposed to only
    adding new patches on top of existing ones) encourages history
    rewriting, which leads to a loss of granularity and larger
    individual commits in the final Mercurial history of the Sage
    repositories.
<snip>

+1

#.  Patch files by nature provide no information about what revision
    they should be applied to. This means that reviewers and the
    patchbot are forced to guess the correct revision to use.

+1

#.  If it becomes necessary to rebase a patch file on another patch
    file, it is often difficult to do so manually.
<snip>

The items above say that we should be using a DVCS, and that our
repositories are not separated logically.

#.  The fact that development versions of Sage have throwaway commits in
    them is extremely confusing and a bad practice, as commits that have
    been publicized (in a full alpha/beta/rc tarball no less, not just
    on a repository website), should *not* be rescinded if at all
    possible.

#.  The impossibility of upgrading from such a development version of
    Sage is a problem in and of itself.

These two are about the release management process. Keshav tried to
argue about these before. I agree with him completely. But these have
nothing to do with "switching to git."

I can't see how these justify the goals stated later on that page:

Primary goals:

* Switch from patches to branches

  - Consolidate *all* Sage repositories into a single repository

> >
> >  - Build system
> >
> >    This actually has two items:
> >
> >    - integrating the repositories
> >
> >      Why?
> >
> >      AFAICT, the only reason is to allow us to specify the
> > dependencies of the Sage library better. ATM, when a patch goes
> > into Sage, the corresponding patch to update the spkg is in a
> > separate repository. Putting these together would help coordinate
> > the two.
> >
> >      Do we really need to coordinate these? Why is Sage any
> > different from any other large software package out there?
> >
> >      The standard solution to this problem is to add a "configure"
> >      script to the library which checks if the dependencies are
> >      satisfied and sets various options for the compilation process
> >      accordingly.
> >
> >      The initial design of Sage separated the mathematics library
> > from the distribution system, then further separated the user
> > interface (notebook) from the mathematics library. Why are we now
> > trying to reconcile a bunch of shell scripts with a Python/Cython
> > library for mathematics?
> 
> I don't inherently disagree with you, but I don't necessarily see a
> path to getting to that point, as sage's library is so fixated on
> being part of the sage distribution at the moment. Maybe you have
> thoughts?

I applaud your work in this area. The sage-on-gentoo folk and I (for
lmonade) appreciate that you separated the environment variables, are
working to make the build system more robust, and so on.

These small steps, perhaps driven by efforts to fit Sage into proper
package management schemes used by distributions, will get us there.
This is what happens to most software anyway. A Debian developer points
out the inconsistencies in their system, and how they can fix things to
get in Debian, and they comply. :)
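
To make the "configure" idea from my earlier message concrete, here is
a minimal sketch in Python of a check that probes for dependencies and
turns the result into build options. The function names and the
"with_" option prefix are made up for illustration; nothing like this
exists in the Sage library today, and a real configure script would
also have to check versions, headers and so on.

```python
import importlib.util
import shutil


def check_dependencies(modules, programs):
    """Return a dict mapping each dependency name to True or False."""
    found = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        found[name] = importlib.util.find_spec(name) is not None
    for name in programs:
        # shutil.which returns None when the executable is not on PATH
        found[name] = shutil.which(name) is not None
    return found


def build_options(found, optional):
    """Fail on missing required dependencies; map optional ones to flags."""
    missing = sorted(
        name for name, ok in found.items() if not ok and name not in optional
    )
    if missing:
        raise RuntimeError("missing required dependencies: " + ", ".join(missing))
    # Hypothetical convention: one "with_<name>" flag per optional dependency
    return {"with_" + name: found[name] for name in sorted(optional) if name in found}
```

The library would call check_dependencies() once at build time and
compile optional features only when the corresponding flag is set,
instead of assuming it lives inside the Sage distribution.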

> >    - rewriting the build system
> >
> >      I agree that things can be done much better, but I don't
> >      understand why there is talk of rewriting. There are plenty of
> >      excellent package managers around. Why not just use one of
> >      them?
> >
> 
> I have probably overused the word "rewriting", when probably I should
> have been using "replacing".

> That said, I'm not at all convinced that
> switching to portage (the commonly proposed replacement) is a good
> decision. Portage is huge, monolithic, and (from my personal
> experience) relatively slow -- sage doesn't need something so
> complicated, and its monolithicity makes modifying it (when
> necessary) a relatively painful process.

My main proposal with lmonade is to switch to ebuilds. This is a well
defined standard [1] implemented by at least 3 different package
managers, portage, paludis, pkgcore.

[1] http://www.gentoo.org/proj/en/qa/pms.xml

I agree that Portage is a large beast, but it is actively developed and
thoroughly tested. Installing software seems like a simple problem.
However, it quickly becomes a nightmare when all the details are
considered. Portage solves this problem well.

Sage, for its usual distribution, does not need a full-fledged package
manager. The packages and the dependency tree are fixed for a release.
Having access to portage comes in handy if you are developing. I really
cannot understand why people would want to rebuild the whole Sage
distribution for every minor release.
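
As a rough illustration of why a fixed dependency tree does not need a
full package manager: once the graph is known, a valid build order is
just a topological sort. The sketch below uses graphlib from the Python
standard library (3.9 or later). The package names and edges are
illustrative only, not the actual dependency list of any Sage release.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_order(dependencies):
    """Return package names ordered so each one follows its dependencies."""
    # TopologicalSorter expects {node: set of nodes that must come first}
    return list(TopologicalSorter(dependencies).static_order())


# Illustrative graph only; not the real dependency list of a Sage release.
RELEASE_DEPENDENCIES = {
    "python": set(),
    "pari": set(),
    "cython": {"python"},
    "sage_library": {"python", "cython", "pari"},
}
```

Calling build_order(RELEASE_DEPENDENCIES) gives an order in which
"sage_library" comes last. What this does not cover is everything a
developer needs: resolving versions, upgrading single packages, or
pulling in packages outside the fixed set.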

> Also, portage was not
> initially designed for prefixed environments, but rather had that
> functionality added in later, and so it doesn't always behave the
> best in such use cases.

I haven't run into any problems. Do you have anything in mind?

> I would rather see sage use a lightweight,
> relatively minimally featured package manager that was designed from
> the ground up for prefixed environments, however, I am not aware of
> any package manager around that satisfies such criteria.

IMHO, the main point many people miss when looking at this problem is
that the tool (package manager) is not the only variable. The perfect
tool will not bring any advantages if it doesn't come with support, an
active community and a large package database.

Gentoo has an amazing collection of science packages [2]. Surely having
access to such a pool is better than creating an spkg for everything.

[2] https://github.com/gentoo-science/sci (this is only the science
overlay, main repository contains a lot more)

Cheers,
Burcin
