[sage-devel] Re: fast vs viable (offline post)

Bill Hart Thu, 01 May 2008 10:18:34 -0700

On 1 May, 16:32, "William Stein" <[EMAIL PROTECTED]> wrote:
> On Thu, May 1, 2008 at 6:57 AM, Bill Hart <[EMAIL PROTECTED]> wrote:
<snip>

> >  1) The adequacy of test code to detect corner cases.
>
> Same here.  I hope people will write more test code.  By the way, doctests
> are not the only testing code in Sage:
>
> sage: T = tests.modsym()
> sage: T.random()
>
> will run random tests of modular symbols computations attempting
> to find inconsistencies.  I would like to greatly expand code like the above.
>
> However, I view doctests as the obvious first thing you do before
> you do much more extensive sophisticated testing.
> Every user benefits from the doctests and they
> are incredibly easy to write, and easy to teach other people how
> to write.

Yes, doctests are good. I think I didn't express what I meant clearly.
I'm worried about finding corner cases the implementor/doctest writer
didn't think of. The strategies I mentioned are good for that.

>
> >  2) The way SAGE handles exceptions and special cases when interfacing
> >  with some of the underlying packages.
>
> This is a difficult problem.  That Sage is able to get any reuse at all
> out of underlying and external packages is already surprising.  The
> use of underlying packages is driven entirely by the goal of getting
> stuff done today and fostering a spirit of community and collaboration.

Right, in that sense it is achieving its goals.

One way may be to write patches for the package code which capture
these events and then doctests which ensure they are captured. Perhaps
this is already done in some parts of SAGE.
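
As a rough sketch of that idea (not how SAGE actually does it): suppose
a wrapped routine reports failure through a status code. The wrapper
turns the failure into a Python exception, and a doctest pins that
behaviour down. The names backend_inverse_mod and inverse_mod are
hypothetical stand-ins, not functions from SAGE or any real package.

```python
import doctest


def backend_inverse_mod(a, n):
    """Hypothetical stand-in for a wrapped library routine.

    Like many C libraries, it reports failure through a status code
    instead of raising: it returns (status, value), status 0 on success.
    """
    try:
        return 0, pow(a, -1, n)  # modular inverse, Python 3.8+
    except ValueError:
        return 1, 0


def inverse_mod(a, n):
    """Return the inverse of a modulo n, raising ValueError if none exists.

    >>> inverse_mod(3, 7)
    5
    >>> inverse_mod(2, 4)
    Traceback (most recent call last):
        ...
    ValueError: 2 is not invertible modulo 4
    """
    status, value = backend_inverse_mod(a, n)
    if status != 0:
        # The failure is captured here rather than leaking a junk value.
        raise ValueError(f"{a} is not invertible modulo {n}")
    return value


if __name__ == "__main__":
    doctest.testmod()
```

The second doctest is the important one: it fails loudly if a later
change to the wrapper stops capturing the backend's error.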

Perhaps I am being slightly naive about how difficult this is though.

Perhaps Tim is arguing for things to be specified more formally. I'm
happy for things to be specified somewhat informally, so long as they
are rigorous enough to be useful to someone who didn't write the
code.

More importantly, as a mathematician, I am very, very concerned that
SAGE is doing what it tells me it is doing. Having seen how subtle
corner cases can be (I found one in the polynomial addition code in
FLINT the other day) I have little to no confidence in any computer
algebra package out there. *Every single time* I have used a computer
algebra package to do a non-trivial computation, I have found a non-
trivial bug. In some cases, those bugs were so subtle that the result
of my algorithm looked correct, but was utter rubbish. The only reason
I picked up the bugs is that I wrote my own test code for the package
I was using to make sure it was doing what it said it was doing.
Mathematicians who use computer algebra systems but who are not
developers seem blissfully unaware of this. And this concerns me
greatly.
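
To give a flavour of the kind of test code I mean, here is a minimal
sketch in plain Python. It is not FLINT's test code, and poly_add is
just a toy routine standing in for whatever package function is under
test. The sum is checked against an independent computation
(evaluation at several points) on random inputs plus a few
hand-picked corner cases.

```python
import random


def poly_add(f, g):
    """Add polynomials given as coefficient lists, lowest degree first.

    The result is normalised: no trailing zeros, so zero is [].
    """
    n = max(len(f), len(g))
    h = [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
         for i in range(n)]
    while h and h[-1] == 0:
        h.pop()
    return h


def evaluate(f, x):
    """Evaluate f at x by Horner's rule (the independent check)."""
    acc = 0
    for c in reversed(f):
        acc = acc * x + c
    return acc


def random_poly(rng, max_len=6, bound=5):
    """Random polynomial with small coefficients, so cancellation is likely."""
    f = [rng.randint(-bound, bound) for _ in range(rng.randint(0, max_len))]
    while f and f[-1] == 0:
        f.pop()
    return f


def check_addition(trials=2000, seed=0):
    """Run the checks and return the number of cases examined."""
    rng = random.Random(seed)
    # Corner cases fed in deliberately: zero inputs and cancelling leading terms.
    cases = [([], []), ([1, 2, 3], [-1, -2, -3]), ([0, 0, 7], [5, 0, -7])]
    cases += [(random_poly(rng), random_poly(rng)) for _ in range(trials)]
    for f, g in cases:
        h = poly_add(f, g)
        assert not h or h[-1] != 0, (f, g, h)  # result must be normalised
        # Seven points are enough to pin down polynomials of degree <= 5.
        for x in (-2, -1, 0, 1, 2, 3, 7):
            assert evaluate(h, x) == evaluate(f, x) + evaluate(g, x), (f, g, x)
    return len(cases)
```

Small coefficient bounds are chosen on purpose: they make exact
cancellation of leading terms common, which is where normalisation
bugs tend to hide.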

I 100% totally and absolutely agree with Tim Daly that computer
algebra is at the handwaving stage. As a mathematician concerned with
rigour, I am interested to know what SAGE hopes to do about it.

>
> The goal is that when using Sage one never *has* to think about
> underlying packages (e.g., Maxima, GAP).  Everything should have
> a nice unified Python-ic interface.

That's fair enough, but if a function calls NTL for factoring a
polynomial and it assumes the polynomial is squarefree, you had better
hope the implementor knew that and implemented the nice python-ic
interface correctly. One is also making an assumption that the package
implements what it says it does correctly. In many cases, this is an
incorrect assumption. If SAGE wishes to be better and still rely on
underlying packages, the testing needs to assume the underlying
package is broken in some non-trivial way, which it probably is.
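
A minimal sketch of what "assume the package is broken" could look
like, in plain Python with polynomials as integer coefficient lists:
never return a factorisation without multiplying it back out. The
callable backend_factor is a hypothetical stand-in for a wrapped
library routine; nothing here is NTL's or SAGE's actual interface.

```python
def poly_mul(f, g):
    """Multiply coefficient lists (lowest degree first) over the integers."""
    if not f or not g:
        return []
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h


def checked_factor(f, backend_factor):
    """Call an untrusted factoring routine and verify its answer.

    backend_factor (hypothetical) returns a list of
    (factor, multiplicity) pairs for the polynomial f.
    """
    factors = backend_factor(f)
    product = [1]
    for g, e in factors:
        for _ in range(e):
            product = poly_mul(product, g)
    if product != list(f):
        raise ArithmeticError(
            "backend returned factors whose product is not the input")
    return factors


# A backend that silently assumes squarefree input drops multiplicities:
def squarefree_only(f):
    return [([1, 1], 1)]  # claims x^2 + 2x + 1 = (x + 1)

try:
    checked_factor([1, 2, 1], squarefree_only)
except ArithmeticError as err:
    print("caught:", err)
```

This only checks that the product is right, not that the factors are
irreducible, but it is cheap compared with factoring and it catches
exactly the squarefree mistake described above.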

<snip>

> >  4) The time and resources of a core group of SAGE volunteers is
> >  stretched thinner and thinner, largely on account of 3.
>
> Yep, some people love working on Sage so much that they work on
> Sage too much.

:-)

>
> >  5) I *perceive* that there is an attitude of "I've implemented
> >  algorithm x in SAGE", when in reality all that has been done is that
> >  algorithm x was previously implemented by someone else in package K
> >  and merely *wrapped* trivially in SAGE. Please don't mistake my
> >  meaning here. There is a lot of original code in SAGE implementing
> >  actual algorithms, and I'm not criticising the attitude of the hard
> >  working volunteers who contributed that. Also understand I am not
> >  criticising the ideal of not reinventing the wheel, which is again a
> >  strength of SAGE. I'm talking about distinguishing more carefully what
> >  is wrapped and what is implemented. There is also a major difference
> >  between implementing an algorithm in SAGE which might take just a few
> >  lines, based on what is already in SAGE, and then optimising that
> >  functionality, which might take hundreds or thousands of lines.
>
> People working on Sage
> development have their eyes on a very specific goal which is
> to create a viable open source free alternative to Maple, Magma,
> Mathematica, and Matlab.   With that as the goal, developers aren't
> thinking "look how cool I am -- I just implemented XYZ"; much
> more important is "look, we got XYZ working for Sage, since we
> need XYZ in order to accomplish our goal ... OK, what next."
> We are all on the same team.
>
> This perspective is dramatically different than the FLINT
> perspective, so I understand that it would be easy to misunderstand.

I only noticed the change in emphasis a few days ago. I was certainly
aware of the aim to "to create a viable open source free alternative
to Maple, Magma, Mathematica, and Matlab" a long time ago, but I was
unaware that the focus had evolved quite so much and how much this was
informing design decisions and the ways people are involved in SAGE.
Of course I certainly have seen all the work that goes on in calculus,
visualisation, graphing, etc, etc, from a distance. I try to keep up
with what has broadly been happening in SAGE. Of course I don't
participate on IRC which means I miss a lot, but that seems somewhat
haphazard as a way of informing oneself, and I also have little time
for it, especially when it occurs post midnight here.

The most interesting thing to me was your comment (William) a couple
of days ago to the effect that (apologies for the poor paraphrase) you
don't see yourself as a person in computer algebra, but a number
theorist who just wants to use the code and to this end you just want
to create a viable alternative to Ma*. This comment really surprised
me at the time in context, since at times I have seen you put quite
some effort into implementing algorithms (including new ones) and you
have been concerned with being a computer algebra person.

I do see however, that you (William) are perhaps uniquely qualified to
manage all the different aspects of a project like SAGE, and to be on
top of all the different developments and directions that are
requisite in such a package. Your ability to multitask and be all
things to all people is astonishing. You also have an awesome knack
for being enthusiastic about what *everyone* is doing and being upbeat
and also ensuring that those people have opportunities they otherwise
wouldn't have.

In terms of my interest in SAGE I primarily want to contribute to the
vision of a viable alternative to Magma and the overlapping
functionality of that package with the other Ma's, which means
performance, number theory, algebraic geometry, algebra, a better
testing regime, greater rigour and better documentation.

So I admit my focus is rather narrower and to some extent informs my
perspective when thinking about SAGE.

>
> >  6) The performance of some functions really sucks, and quite often
> >  this is functions that have been implemented afresh. I think Tim Daly
> >  mentioned this.
>
> The only way to fix this is for people to fix it.   Fix it and send me
> a patch.

It's not going to happen. I am not qualified to patch functionality in
SAGE, and have no time to do so.

>
> > There also seems to be little attempt at
> >  systematically monitoring performance and making sure that the
> >  algorithms in SAGE cover a wide range of input parameters and that the
> >  functions are not just efficient for the small examples in the doctest
> >  (which often just boil down to reducing python overhead).
>
> I wish there were more.   Work is underway in this direction though.
>
> >  No one is  going to do this for you. The original "implementor" must take
> >  responsibility for testing, profiling and documentation of their code
> >  (build testing is a separate matter).
>
> It's better to work with other people as a team.  Don't
> believe dogmatically that doing so isn't possible.  Also, other
> people are capable of amazing difficult superhuman tasks -- such
> as "testing, profiling and documenting" even code that you wrote.
> I'm not making this up.

Agreed. People are capable of doing this and in some cases it
happens.

But the converse is not true. One should never assume it is going to
happen. Unless you have the agreement of someone to explicitly do this
for you, you should assume it is your (the implementor's) job.

I sincerely think that it is human nature to think that our own
expertise uniquely qualifies us to not be interested in such lowly
matters and that others will surely do it for us. The mature attitude
is always to take responsibility for the ultimate quality of the part
of the project you are working on. The team work comes in ensuring you
are interfacing well with those in parts of the project which must
interface with yours. Of course if you are lucky enough to be in a
team where someone else has agreed to write test code for your
function, your ultimate responsibility is to the function, and to
interfacing with the people writing the test code, not to the test
code. But you need a clear understanding of this arrangement, not an
assumption that later on someone might get motivated to do something
and therefore it isn't your responsibility.

Just saying, "if my code is broken, send me a patch" is not
responsible. It probably has a benefit in terms of its social impact
on the project, I admit. But ultimately the implementor is uniquely
qualified to ensure the quality of the code they have written, not the
user or other developers.

I've certainly had the wrong attitude to taking care of my own mess in
FLINT at times (ask David Harvey), and I've put off making crucial
decisions until it is waaay too late, on account of being lazy (again
ask David Harvey). But it has become more clear to me as I've matured
as a developer, that ultimately I have a responsibility to ensure the
quality of what I've written.

<snip>

>
> >  8) I have found the documentation somewhat unhelpful so far. From the
> >  point of view of someone who does not use python it is so far hard to
> >  get familiar with SAGE compared to a much simpler (and simplistic)
> >  package such as Pari for example, though it is probably comparable
> >  with learning how to use Magma effectively.
>
> Learning Python is a prerequisite for using Sage.   There is no way
> around that.  Sorry.  (It's sort of like how learning C is a prerequisite
> for using FLINTlib.)

Agreed. But ultimately one needs a very small subset of the python
language to use SAGE, just as one needs a very small subset of C to
use FLINT. I did once sit down to try and write some documentation for
the part of *SAGE* that interested me at the time (number theory). But
I really didn't get very far. There were lots of questions outside the
limits of a basic knowledge of python that I couldn't find answers to.
A mathematician wants to know some basic things. How do I construct
the objects I want to work with? (I need lots of examples, both trivial
and non-trivial.) How do I pass these objects to the functions
available in SAGE? What do those functions precisely do (inputs,
semantics, algorithm, running time, output), and how do I retrieve the
output or parts thereof and print them or store them? The python part
fits somewhere in the middle there, essentially in controlling the
steps of the program I'm trying to write in SAGE. I may want to
iterate over something, and that might take some python.
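
For illustration only, this is roughly the level of detail I have in
mind, written as a docstring on a toy Python function. The function
integer_sqrt and the section headings are my own assumptions for the
sake of the example, not a description of any existing SAGE function
or of its documentation format.

```python
def integer_sqrt(n):
    """Return the floor of the square root of a non-negative integer.

    INPUT:
        n -- a non-negative Python int

    OUTPUT:
        The largest int r with r*r <= n.

    ALGORITHM:
        Binary search on the interval [0, n + 1).

    RUNNING TIME:
        About log2(n) iterations, each doing one big-integer squaring.

    EXAMPLES:
        >>> integer_sqrt(0)
        0
        >>> integer_sqrt(10)
        3
        >>> integer_sqrt(10**20 + 1)
        10000000000
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    lo, hi = 0, n + 1  # invariant: lo*lo <= n < hi*hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= n:
            lo = mid
        else:
            hi = mid
    return lo
```

The examples include a trivial case and a non-trivial one, and because
they are written as doctests they double as test code.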

There is certainly lots of documentation, but I personally find some
of it quite confusing, even setting aside my admittedly limited
knowledge of python.

>
> >  9) There are many bugs in the trac which will never be fixed, some
> >  have been fixed or have become irrelevant, some should have been fixed
> >  a long time ago but have not, and the list grows ever longer, which is
> >  in itself worrying. I know mabshoff does an excellent job of fixing
> >  bugs related to build testing, but numerous other bugs seem to get
> >  little attention. The main focus seems to be on fixing bugs related to
> >  failing doctests. My honest gut feeling is that these could be
> >  multiplied ad infinitum if careful tests were carried out, and this
> >  concerns me. I hope I'm wrong about this.
>
> We would love to have more bug reports!!!

This is the problem. I can get NTL to segfault, so I could certainly
generate bug reports for SAGE. But if I put lots of time into
generating such bugs, who would be writing test code for FLINT in my
absence? I am responsible for the code I wrote. When FLINT is used in
SAGE I will ensure it is tested fully through SAGE, and I also offered
to generate bug reports for the p-adics in SAGE when doctesting
reaches 100% for the p-adics. That offer still stands.

Bill.
--~--~---------~--~----~------------~-------~--~----~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://www.sagemath.org
-~----------~----~----~----~------~----~------~--~---