On 03/16/11 08:21 AM, Robert Bradshaw wrote:
> If the author can't justify the doctest, then that's a problem, and if
> the reviewer can't (after consulting the author) then either he
> shouldn't give a positive review or should convince the author that the
> test in question should be taken out or improved. I don't see this as
> anything special to tests: if something looks sketchy, call it out. We
> don't need a separate stamp of approval for tests; "positive review"
> encompasses that. I also don't see this happening often, as people
> shouldn't be writing or reviewing code they don't understand. Are you
> seeing such code being committed?
Since I'm not a mathematician, I would not so easily recognize analytical tests
where the expected result is simply whatever Sage gave. If the integration of some
complex bit of code gives an expected result of "foobar", I personally would not
always know whether "foobar" is obvious to an expert, or is just "what Sage gave".
So I can't comment on such tests.
But I've seen a lot of tests giving numerical results where I'm pretty sure
nobody has actually checked whether those numerical results are right.
It's difficult to give examples without appearing to pick on anyone, but here's
one specific example I happen to be involved in reviewing right now. It's far
from the only one I've seen, and I doubt it's the worst.
Ethan Van Andel wrote what many say is impressive code which adds Riemann
mapping and complex interpolation.
http://trac.sagemath.org/sage_trac/ticket/6648
This is fully doctested, with some numerical results.
Now a ticket to upgrade Numpy:
http://trac.sagemath.org/sage_trac/ticket/10792
has found fairly large changes in the numerical results from this Riemann
mapping code.
The fact is, nobody knows what the correct result is. There are comments on
http://trac.sagemath.org/sage_trac/ticket/10792
which include:
1) "I am not sure. I don't understand the code very well in Riemann map."
2) "I don't understand the riemann code very well either"
3) "So one question is: are the results it returns now better or worse than
before?"
4) "Hoo boy, testing it out definitely shows some serious numerical instability
even for many more plot points."
Again, nobody knows what the correct values are, but there are tests which
check whether the results match the expected values; when a different version of
Numpy is used, the results change significantly.
To be fair to Ethan, the documentation does say "Note that all the methods are
Numeric rather than analytic; for unusual regions or insufficient collocation
points may give very inaccurate results."
Now people on the Numpy ticket are suggesting that perhaps the Riemann mapping
algorithm has poor numerical stability.
I then asked whether there are some cases where the results are known
analytically. Apparently there are, and in my opinion those would make useful
tests, better than tests where nobody knows what the result is anyway.
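Something along these lines would do (just a sketch; "riemann_map" here is a
made-up stand-in for whatever numerical routine is under test, not the actual
Sage interface). For a disk of radius r centred at c, the Riemann map onto the
unit disk is known exactly: f(z) = (z - c)/r, provided the map is normalised so
that f(c) = 0 and f'(c) > 0.

    import cmath

    def analytic_disk_map(z, center, radius):
        # Exact Riemann map of the disk |z - center| < radius onto the unit
        # disk, normalised so f(center) = 0 and f'(center) > 0.
        return (z - center) / radius

    def check_disk_case(riemann_map, center=0.5 + 0.5j, radius=2.0, tol=1e-6):
        # 'riemann_map' is the hypothetical numerical solver under test: given
        # a boundary parametrisation, it should return a callable approximating
        # the map (with the same normalisation as above).
        boundary = lambda t: center + radius * cmath.exp(1j * t)
        f = riemann_map(boundary)
        for z in (center, center + 0.3, center - 0.7j):
            err = abs(f(z) - analytic_disk_map(z, center, radius))
            assert err < tol, "error %.2e at z = %s" % (err, z)

A test like that says nothing about oddly-shaped regions, but at least the
expected values can be justified.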
This is not meant to be a dig at Ethan - he appears to be far more conscientious
about his work than others I've known.
IMHO, if we permit code to be committed under these circumstances, then Sage
will never be seen as a viable alternative to commercial products. I very much
doubt the commercial products would permit code under such circumstances, but
even if they do, at least they don't publicly admit it.
> Mathematica for one claims all the time to be able to compute things
> "no one else can" so failing 1 is certainly good enough for them.
I don't have a problem with any one of the three cases, so not having another
tool to compare with is not a show stopper for me. But if that's combined with
tests which neither the author nor reviewer can justify, then I do have a problem.
If Mathematica can compute a whole class of integrals nobody else can, but
differentiating the results leads back to the original integrands, then that at
least shows consistency.
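That sort of round-trip check is cheap to automate. A sketch (I'm using SymPy
here purely for illustration; it has nothing to do with the tickets above):
integrate, differentiate the result, and confirm you get the integrand back.

    from sympy import symbols, integrate, diff, simplify, exp, sin

    x = symbols('x')

    for f in [sin(x) * exp(x), x / (1 + x**2)]:
        F = integrate(f, x)                  # antiderivative produced by the CAS
        residual = simplify(diff(F, x) - f)  # should simplify to zero
        assert residual == 0, "inconsistent antiderivative for %s" % f

It doesn't prove the antiderivative is in its simplest or most useful form, but
it does justify the expected value in a way anyone can check.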
What I have an issue with is when code exists which nothing else can compute,
but the tests are not justified.
Many years ago I wrote some finite difference code for computing the properties
of arbitrary shaped transmission lines. There's no other free code able to do
this, and I did not have access to commercial code which could do it.
But I made numerous tests of my code comparing the results with cases where
there are exact analytical results.
http://atlc.sourceforge.net/accuracy.html
So whilst I can't prove the finite difference code works for some given random
shape, I do know it works for all cases where I could compute the result
analytically.
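For instance (a sketch only; "solve_z0" stands for whatever numerical solver is
being tested, not code from atlc): the characteristic impedance of a lossless
coaxial line is known in closed form, Z0 = eta_0/(2*pi*sqrt(eps_r)) * ln(b/a),
so the solver's answer for a coaxial cross-section can be checked against it to
within a stated tolerance.

    import math

    ETA_0 = 376.730313668   # impedance of free space, in ohms

    def coax_z0_exact(inner_radius, outer_radius, eps_r=1.0):
        # Closed-form characteristic impedance of a lossless coaxial line.
        return ETA_0 / (2 * math.pi * math.sqrt(eps_r)) \
            * math.log(outer_radius / inner_radius)

    def check_coax(solve_z0, inner_radius=1.0, outer_radius=3.5, rel_tol=0.01):
        # 'solve_z0' is the hypothetical numerical solver under test.
        expected = coax_z0_exact(inner_radius, outer_radius)
        computed = solve_z0(inner_radius, outer_radius)
        assert abs(computed - expected) / expected < rel_tol, \
            "Z0 = %.3f ohms, expected %.3f" % (computed, expected)

The tolerance is explicit, so a reviewer can see exactly how much numerical
error is being accepted.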
(BTW, those run times were on hardware more than two orders of magnitude slower
than today's hardware, so it should be more practical now to use more points and
improve the accuracy.)
> No idea of their internal processes, but I'd hope that they don't have
> people writing and reviewing (if they do that) code and tests they
> don't understand.
I doubt Wolfram Research use many tests nobody can justify or understand. I
suppose with code the size of Mathematica, there are bound to be some such tests.
One will never write a bug-free program the size of Sage or Mathematica, and one
can never be 100% confident that an algorithm will never fail under certain
circumstances. But, in my opinion, Sage developers could do more to check Sage,
increasing the confidence in the results.
If prospective Sage users can see that not only do we test the code, but we
justify the tests, that will give them more confidence in using Sage.
I'm in a difficult position now. I'd like to use Sage, but I'm reluctant because
I don't feel I have sufficient confidence in Sage's quality control procedures.
That's a bit annoying after spending a lot of time developing Sage.
Dave