On 03/16/11 08:21 AM, Robert Bradshaw wrote:

> If the author can't justify the doctest, then that's a problem, and if
> the reviewer can't (after consulting the author) then either he
> shouldn't give a positive review or should convince the author that
> the test in question should be taken out/improved. I don't see this as
> anything special to tests, if something looks sketchy, call it out. We
> don't need a separate stamp of approval for tests, "positive review"
> encompasses that. I also don't see this happening often, as people
> shouldn't be writing or reviewing code they don't understand. Are you
> seeing such code being committed?

Since I'm not a mathematician, I would not so easily recognize analytical tests where the expected result is simply what Sage gave. If the integration of some complex bit of code gives an expected result of "foobar", I personally would not always know whether "foobar" is obvious to an expert, or is just "what Sage gave".

As such, I can't comment on such tests.

But I've seen a lot of tests giving numerical results where I'm pretty sure nobody has actually checked whether those numerical results are right or not.
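To make the distinction concrete, here's a rough sketch in plain Python using SciPy (a made-up example, nothing taken from any actual Sage doctest). Numerically integrating sin over [0, pi] can be checked against the exact answer 2, which we can justify without running any code at all:

from scipy.integrate import quad
import math

# Numerically integrate sin(x) over [0, pi]; the exact answer is 2.
value, err = quad(math.sin, 0.0, math.pi)

# A regression-style doctest would simply record whatever 'value' printed
# the last time the code ran. An analytic check instead anchors the test
# to a result we can justify independently of the implementation:
assert abs(value - 2.0) < 1e-9

A test of the first kind only tells you the output changed; a test of the second kind tells you whether it is right.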

It's difficult to give examples without appearing to pick on anyone, but here's just one specific example I happen to be involved in reviewing just now. It's far from the only one I've seen, and I doubt it's the worst.

Ethan Van Andel wrote what many say is impressive code which adds Riemann mapping and complex interpolation.

http://trac.sagemath.org/sage_trac/ticket/6648

This is fully doctested, with some numerical results.

Now a ticket to upgrade Numpy:

http://trac.sagemath.org/sage_trac/ticket/10792

has found fairly large changes in the numerical results from this Riemann mapping code.

The fact is, nobody knows what the correct result is. There are comments on

http://trac.sagemath.org/sage_trac/ticket/10792

which include:

1) "I am not sure. I don't understand the code very well in Riemann map."

2) "I don't understand the riemann code very well either"

3) "So one question is: are the results it returns now better or worse than 
before?"

4) "Hoo boy, testing it out definitely shows some serious numerical instability even for many more plot points."

The fact is, nobody knows what the correct values are, yet there are tests which check whether the results match the expected values. When a different version of Numpy is used, the results change significantly.

To be fair to Ethan, the documentation does say "Note that all the methods are Numeric rather than analytic; for unusual regions or insufficient collocation points may give very inaccurate results."

Now people on the Numpy ticket are suggesting that perhaps the Riemann algorithm has poor numerical stability.

I then asked whether there were some cases where the results are known analytically. Apparently there are, and those would make useful tests in my opinion. They would be better than tests where nobody knows what the correct result is anyway.
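For instance (purely a sketch with hypothetical names, not the actual Riemann_Map interface): for a disc of radius r centred at c, the Riemann map onto the unit disc normalised by f(c) = 0 and f'(c) > 0 is known exactly to be f(z) = (z - c)/r, so any numerical map could be checked against it at sample points with an explicit tolerance:

import cmath

def check_disc_case(numerical_map, c=0.5 + 0.5j, r=2.0, tol=1e-6):
    """Compare numerical_map(z) with the exact map (z - c)/r at sample
    points on a ring inside the disc |z - c| < r."""
    for k in range(16):
        z = c + 0.7 * r * cmath.exp(2j * cmath.pi * k / 16)
        exact = (z - c) / r
        if abs(numerical_map(z) - exact) > tol:
            return False
    return True

# Self-check of the harness using the exact map itself; in a real doctest
# one would pass in the map produced by the numerical code under test.
assert check_disc_case(lambda z: (z - (0.5 + 0.5j)) / 2.0)

A test like that fails for a reason anyone can point to, rather than simply because Numpy changed.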

This is not meant to be a dig at Ethan - he appears to be far more conscientious about his work than others I've known.

IMHO, if we permit code to be committed under these circumstances, then Sage 
will never be seen as a viable alternative to commercial products. I very much 
doubt the commercial products would permit code under such circumstances, but 
even if they do, at least they don't publicly admit it.

> Mathematica for one claims all the time to be able to compute things
> "no one else can" so failing 1 is certainly good enough for them.

I don't have a problem with any one of the three cases, so not having another tool to compare with is not a show stopper for me. But if that's combined with tests which neither the author nor reviewer can justify, then I do have a problem.

If Mathematica can compute a whole class of integrals nobody else can, but differentiating the results leads back to the original, then that at least shows consistency.
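That sort of internal consistency is at least something one can test mechanically. Here's a minimal sketch of the idea in Python, using SymPy rather than Sage's own integrator:

import sympy as sp

x = sp.symbols('x')
f = sp.exp(x) * sp.sin(x)      # integrand whose antiderivative is known
F = sp.integrate(f, x)         # compute the antiderivative symbolically

# Differentiating the antiderivative must give back the integrand;
# if the difference does not simplify to zero, something is wrong.
assert sp.simplify(sp.diff(F, x) - f) == 0

It doesn't prove the antiderivative is in the simplest or most useful form, but it does catch outright wrong results.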

What I have an issue with is when code exists which nothing else can compute, but the tests are not justified.

Many years ago I wrote some finite difference code for computing the properties of arbitrarily shaped transmission lines. There's no other free code able to do this, and I did not have access to any commercial code which could do it.

But I made numerous tests of my code comparing the results with cases where there are exact analytical results.

http://atlc.sourceforge.net/accuracy.html

So whilst I can't prove the finite difference code works for some given random shape, I do know it works for all cases where I could compute the result analytically.

(BTW, those run times were on hardware more than two orders of magnitude slower than today's, so it should be more practical now to use more points and improve the accuracy.)
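The style of check is simple enough to show in a few lines. Here's a rough, self-contained sketch (nothing to do with the actual atlc code) of validating a finite difference solver against a case with an exact answer: the radial Laplace equation for a coaxial line, whose potential is known analytically to be V(r) = ln(b/r)/ln(b/a):

import numpy as np

def fd_coax_potential(a, b, n):
    """Solve d/dr(r dV/dr) = 0 on [a, b] with V(a) = 1, V(b) = 0,
    using a conservative second-order scheme on n interior points."""
    r = np.linspace(a, b, n + 2)
    h = r[1] - r[0]
    main = -2.0 * r[1:-1]              # -(r_{i+1/2} + r_{i-1/2})
    upper = r[1:-2] + h / 2            # r_{i+1/2}, multiplying V_{i+1}
    lower = r[2:-1] - h / 2            # r_{i-1/2}, multiplying V_{i-1}
    A = np.diag(main) + np.diag(upper, 1) + np.diag(lower, -1)
    rhs = np.zeros(n)
    rhs[0] = -(r[1] - h / 2) * 1.0     # boundary condition V(a) = 1
    V_inner = np.linalg.solve(A, rhs)
    return r, np.concatenate(([1.0], V_inner, [0.0]))

a, b = 1.0, 3.0
r, V_num = fd_coax_potential(a, b, 200)
V_exact = np.log(b / r) / np.log(b / a)
assert np.max(np.abs(V_num - V_exact)) < 1e-3   # within discretisation error

Halving the step size and watching the error drop by roughly a factor of four is another cheap check that the discretisation behaves as expected.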

> No idea of their internal processes, but I'd hope that they don't have
> people writing and reviewing (if they do that) code and tests they
> don't understand.

I doubt Wolfram Research use many tests nobody can justify or understand. I suppose with code the size of Mathematica, there are bound to be some such tests.

One will never write a bug-free program the size of Sage or Mathematica, and one can never be 100% confident an algorithm will never fail under certain circumstances. But, in my opinion, Sage developers could do more to check Sage, increasing confidence in the results.

If prospective Sage users can see that not only do we test the code, but we justify the tests, that will give them more confidence in using Sage.

I'm in a difficult position now. I'd like to use Sage, but I'm reluctant because I don't feel I have sufficient confidence in Sage's quality control procedures. That's a bit annoying after spending a lot of time developing Sage.


Dave
