[sage-devel] Re: spkg refactoring and development model

Brian Granger Fri, 01 Jun 2007 11:59:09 -0700

> The point of spkg's, at least as I see them, is that they should be
> ridiculously simple, and that anybody can make one with about 5
> minutes of directions.


This was the critical factor for me when I looked at the different options.

> This is -- in my opinion -- in sharp contrast to the situation with
> .deb, .rpm's, cygwin packages, etc.  A person with no packaging
> experience could easily spend literally days of hard work to really
> understand how to create rpm's correctly.  Moreover, spkg's
> are supposed to be decentralized, in that anybody can just start
> creating them and sharing them with other people without any
> need to check them into any sort of central repository.   I know
> for a fact that they accomplish both the above goals very well.

They are _mostly_ decentralized, but it would be nice if they were
more so.  A few things prevent this.  One problem is when multiple
people create slightly different versions of the same spkg.  Having
the spkg scripts hosted (even decentrally) on hg repos would
dramatically improve this situation.  The other thing we want to
think about is that for people to use a new spkg, they have to update
two files: deps and the install script.  It would be really nice if we
could come up with a way that a user could simply drop the spkg in a
directory and type make.  I will try to think about how we could
enable that and still preserve the dependency info.
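
To make the drop-in idea concrete, here is a rough sketch, not a
proposal for the real implementation.  It assumes each spkg directory
carries a hypothetical plain-text file called spkg-deps that lists the
names of the packages it depends on (no such file exists today), and
it computes an order in which everything could be built:

```python
import os

def read_deps(spkg_root):
    """Map each package directory name to the names listed in its deps file."""
    graph = {}
    for name in sorted(os.listdir(spkg_root)):
        pkg_dir = os.path.join(spkg_root, name)
        if not os.path.isdir(pkg_dir):
            continue
        # "spkg-deps" is a made-up file name used only for this sketch.
        deps_file = os.path.join(pkg_dir, "spkg-deps")
        deps = []
        if os.path.exists(deps_file):
            with open(deps_file) as f:
                deps = f.read().split()
        graph[name] = deps
    return graph

def build_order(graph):
    """Return package names so that each one comes after its dependencies."""
    order, done, in_progress = [], set(), set()

    def visit(name):
        if name in done:
            return
        if name in in_progress:
            raise ValueError("circular dependency involving " + name)
        if name not in graph:
            raise ValueError("unknown dependency: " + name)
        in_progress.add(name)
        for dep in graph[name]:
            visit(dep)
        in_progress.discard(name)
        done.add(name)
        order.append(name)

    for name in sorted(graph):
        visit(name)
    return order
```

With something like this, the central deps file would not need to be
edited by hand: dropping a new directory in place would be enough for
build_order(read_deps(...)) to pick it up.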

> I do not find spkg's themselves to be a pain to maintain.  At least, the
> pain is *not* maintaining the spkg's, but actually solving
> the real problems of getting code to build and work correctly in
> the context of SAGE.  This is almost never related to spkg's.

For the most part, I agree, but there are a few things I find to be a pain:

1.  I have a number of spkgs that I maintain that other people also
make changes to.  Tracking these things without a repo is difficult
and very error prone.

2.  I have no source control for the spkgs I maintain.  This means I
have no history tracking, etc.  Also, it means that I don't have any
automated backups.  I can set up my own hg repo, but I might as well
work with everyone else to not duplicate efforts.

3.  The other aspect of maintenance is the actual scripts that run the
build process (outside the spkgs).  As I have started refactoring
those things, I have to manage everything by hand.  For instance for
others to benefit from my changes, I would have to post a tarball of
everything somewhere.  That is no fun.

> Regarding multiple people working on spkg's, the main problem
> is that one person will look at an spkg-install, not understand it,
> delete it and create a new one, not realizing that the things they
> didn't understand are very subtle, involve only certain platforms,
> and often took hours to figure out.

Yep.

> There are some real problems that Brian has pointed out
> which need to be solved:
>
>   (1) It would be preferable if the package naming structure were
>        standardized, though it should only be a warning if one violates
>        this.  E.g., "sage -pkg package-dir" will print non-fatal warnings
>        if there is no src subdirectory, etc.  I like this proposal, which
>        is a slight extension of the one Brian suggested:
>
> package_dir
>     spkg-install    -- (required) shell script run to install the package
>     spkg-rebuild    -- (optional) download latest version of package from
>                        web page and recreate the src directory
>     spkg-check      -- (optional) runs the package's test suite after
>                        spkg-install succeeds
>     README.txt      -- (optional) what used to be SAGE.txt
>     /patches        -- (optional) any files needed by spkg-install for
>                        patching
>     /src            -- (warning if not there) exact copy of what is
>                        available at some website.
>     /.hg            -- (optional) mercurial repository that has
>                        spkg-install, spkg-rebuild, spkg-check,
>                        README.txt and all contents of patches.

I like this a lot - especially the spkg-check.
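
As a small illustration of the "warning, not error" behaviour for the
layout above, here is a sketch of the kind of check "sage -pkg" could
run.  The function name and messages are made up for this example; the
only rules taken from the proposal are that spkg-install is required
and that a missing src directory is just a warning:

```python
import os

REQUIRED_FILES = ["spkg-install"]   # missing => error
EXPECTED_DIRS = ["src"]             # missing => non-fatal warning

def check_layout(package_dir):
    """Return (errors, warnings) describing how package_dir deviates."""
    errors, warnings = [], []
    for name in REQUIRED_FILES:
        if not os.path.isfile(os.path.join(package_dir, name)):
            errors.append("missing required file: " + name)
    for name in EXPECTED_DIRS:
        if not os.path.isdir(os.path.join(package_dir, name)):
            warnings.append("no %s subdirectory" % name)
    return errors, warnings
```

The optional pieces (spkg-rebuild, spkg-check, README.txt, patches,
.hg) are deliberately not checked, since leaving them out is allowed.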

>   (2) I'm not enthusiastic about having a version of spkg's that
>      requires an internet connection to use them.  I would prefer
>      simply improving the spkg-rebuild scripts.

I also agree with this fully.  One of the main benefits that I have
found is that I can give someone a tarball that i) has everything and
ii) just works.

>   (3) Versioning issues.  If one types "sage -i foo" then the newest version
>     of the foo package should be installed.  Currently one has to type the
>     version number.  Fixing this is just adding something to some shell
>     script.

It would be nice to have code that actually parses the version number
in an intelligent manner.  Unfortunately, I really don't want to write
that code in a bash script - python would be much nicer.  On the other
hand, I like not having python involved in the build process.
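
For what it is worth, here is roughly what I have in mind on the
python side, assuming python were acceptable at that point in the
build.  It is only a sketch: it assumes file names of the form
NAME-VERSION.spkg and versions made of dotted numbers with an optional
suffix such as p1, and the function names are my own invention:

```python
import re

def version_key(version):
    """Turn '2.10.p1' into a tuple that sorts numerically, not as text."""
    key = []
    for part in re.findall(r"\d+|[A-Za-z]+", version):
        if part.isdigit():
            key.append((1, int(part), ""))
        else:
            key.append((0, 0, part))
    return tuple(key)

def newest(package, filenames):
    """Pick the 'package-VERSION.spkg' file with the highest version."""
    prefix, suffix = package + "-", ".spkg"
    candidates = [
        f for f in filenames
        if f.startswith(prefix) and f.endswith(suffix)
        # Require the version to start with a digit, so that asking for
        # "foo" does not match a different package named "foo-bar".
        and f[len(prefix):len(prefix) + 1].isdigit()
    ]
    if not candidates:
        return None
    return max(candidates,
               key=lambda f: version_key(f[len(prefix):-len(suffix)]))

def at_least(version, minimum):
    """True if version is not older than minimum."""
    return version_key(version) >= version_key(minimum)
```

The point of version_key is that 2.10 sorts after 2.9, which a plain
string comparison gets wrong.  Pre-release suffixes such as rc1 are
not handled sensibly here and would need more thought.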

>   (4) Dependencies among optional packages.  Currently this is usually
>       dealt with by simply putting
>                     sage -i foo-2.3,
>       say in the top of spkg-install for dependent packages.
>       This is OK, but it would probably be better if the version
>       number were only a lower bound or didn't have to be specified.

>   (5) Regarding all the source/revision control discussions, I think we should
>       just put one (optional) .hg repository in each spkg.  Simple.
>       When somebody does a bunch of work to improve an spkg, they can
>       make available an hg bundle that anybody else can apply to their
>       spkg.

I think this is a start, but there are a few other things that I would
like to be in a repo:

1.  The deps file - like I said, it would be really nice if we could
get rid of this entirely though and have the spkgs declare their own
dependencies.

2.  The other scripts/makefiles involved in the spkg build process -
like newest_version, install, save-env, etc.

3.  The overall directory structure that contains the
scripts/makefiles.  The reason this is important is that most of the
scripts make strong assumptions about where they are run from.  Also,
as I have begun refactoring, some of the scripts have been moved
around (for instance newest_version has been moved to
base/spkg-newest-version).  Managing this without the associated
directory structure is a pain.

> > *  Because of this, I think mercurial queues only make sense when you
> > really need it - when there are multiple patches that are difficult to
> > diff/patch by hand.  But in these cases, it probably does make lots of
> > sense, and these sources should probably be in the hg repo.
> > Thankfully, the big packages (qt/enthought) are not heavily patched.
>
> Mercurial queues are really confusing.  I spent 2 hours once trying
> to figure them out and couldn't.  As is, they thus fail the "5 minutes"
> test for a new user mentioned above.  Of course, if it were worth it,
> we could probably figure out how to make them usable to people easily.

I agree.  Last night I spent about 15 minutes reading about them.
While I did not actually try anything, my conclusion was that I would
need to be really sick (figuratively speaking) before I would be
willing to take that medicine - but if I was that sick it would be
good medicine.

> > * There is still the issue that maintaining patches is a pain.  It
> > seems like there are two broad categories of patches:  1) patches that
> > simply fix a bug in the original source 2) patches that actually
> > provide new functionality for sage or allow the package to be
> > installed in the sage context.
> >
> > In the first case, I think we really need to try hard to make sure
> > those patches get back upstream to the original package maintainers.
> > I know of multiple spkgs that have such patches that have never been
> > contributed upstream.  Not doing this makes the spkgs more difficult
> > to maintain (because there are more patches) and prevents the upstream
> > projects from benefiting from our work.
>
> Regarding (1), I agree completely. Most stable packages in sage are
> released only once or twice a year, so this takes time.  And it's never
> my highest priority.  However, if somebody else could volunteer
> for some amount of time to regularly send patches upstream, I'd be
> very happy.

I can help with this, at least for the projects that I follow closely.

> Regarding (2), I can only (off the top of my head) think of 2 packages
> with _significant_ code that isn't upstream:
>
>     * singular
>     * pyrex
>
> With singular, they release only once a year at most, which explains
> that situation.  With pyrex, Greg Ewing has refused multiple SAGE
> developers' patches repeatedly, so upstreaming that is difficult.

Yep.  For packages like these, forking with hg makes total sense.

A few more questions:

If there is some level of consensus, I would like to begin to
implement the above things.  This would involve:

1.  Creating an hg repo of the appropriate things as they stand now.
Can this be hosted on sage/math?  What is the preferred way of setting
this up?  Do we want to have a centralized repo or go with a more
distributed model?

2.  Begin renaming things and creating the new structure for each spkg
as outlined above.  Once the repo is in place, I think everyone could
get this done in a hurry.  Also, my company is paying me to work on
this so I can do a lot of the work.

3.  Refactor the spkg infrastructure to address the other issues
brought up above.

Brian

