Hi David.

On Nov 4, 2:09 am, Dr David Kirkby <drkir...@gmail.com> wrote:
> On Nov 3, 5:29 am, Bill Hart <goodwillh...@googlemail.com> wrote:
>
> > Hi all,
>
> Hi Bill
>
> > Now the MPIR test code is pretty extensive and really ought to have
> > picked up this bug. We put a lot of time into the test code for that
> > MPIR release, so this is unfortunate.
>
> Bugs are inevitable. Anybody who believes one can write a non-trivial
> piece of bug-free code is mistaken. NASA can't do it. The aviation
> industry can't do it.
>
> So any thoughts of writing bug-free code are just pointless. It will
> never happen. The best we can do is to reduce the probability of
> bugs.
>
> > However, the entire Pari test suite and the entire Sage test suite
> > (with an older version of Pari) passed without picking up this pretty
> > serious bug in the MPIR division code!
>
> > I think this underscores something I have been saying for a long time.
> > Sage doesn't test the C libraries it uses well enough.
>
> I agree. There tends to be an assumption that the upstream developers
> are good mathematicians and that, since their code is widely used, it
> must be OK. But I'm afraid that in many cases in Sage, the upstream
> developers' skill set is in mathematics, not software.
>
> > As a result of
> > that, it is taking inordinate amounts of developers' time to track
> > down bugs turned up by Sage doctests when spkg's are updated. In some
> > cases there is actually woefully inadequate test code in the C library
> > itself.
>
> Agreed. I think, however, there is now an increased awareness of
> this, and the situation is improving. Recently someone proposed
> adding some code to Sage, and I asked if an audit had been done of
> the library. In that case the developers were clearly quite skilled
> at writing software: the code was well commented, clean, and came
> with test code. I did not check the maths of it, but it passed the
> first stage, in that the developers were evidently vigilant in what
> they were doing.
>
> > But even when this is not the case, it makes sense for Sage to
> > do some serious testing before assuming the library is bug free. This
> > is particularly easy to do in Python, and much harder to do at the
> > level of the C library itself, by the way.
>
> > I have been saying this for a very long time, to many people. *ALL*
> > mathematical libraries are broken and contain bugs.
>
> Of course. Any non-trivial piece of code will contain bugs.
>
> > If you don't test
> > the code you are using, it *is* broken.
>
> Even if you do test, if the code is non-trivial, it will contain bugs.
> There is no such thing as a non-trivial bug-free program.
>
> > The right ratio of test code
> > to code is really pretty close to 50/50. And if you think I don't do
> > this myself when I write code (even Sage code), well you'd be wrong.
>
> Where do you get this 50:50 figure from, Bill? Did you call rand(),
> or is it based on any hard facts?

Well, I only have to find one person who agrees with me :-)

http://blog.flipbit.co.uk/2009/06/what-code-coverage-percentage-should.html

But it is roughly based on the idea that each call to a function from
elsewhere in a project should correspond to one more test for that
function, and this seems to work out naturally to about that ratio.
Usually slightly less, I think. But it depends on the language and the
type of testing done, obviously.
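
To make that concrete, here's a toy sketch. The function and the call
sites are hypothetical (nothing to do with MPIR or FLINT): a helper
called from two places in a project gets at least two tests, one per
usage pattern, and the test code ends up roughly the same size as the
code under test.

import math
import unittest

def normalise_fraction(num, den):
    """Return (num, den) reduced to lowest terms with den > 0."""
    if den == 0:
        raise ZeroDivisionError("denominator is zero")
    g = math.gcd(num, den)
    num, den = num // g, den // g
    if den < 0:
        num, den = -num, -den
    return num, den

class TestNormaliseFraction(unittest.TestCase):
    # One test per (hypothetical) call site elsewhere in the project.

    def test_arithmetic_call_site(self):
        # Caller 1 produces unreduced results with negative denominators.
        self.assertEqual(normalise_fraction(2, -4), (-1, 2))

    def test_parser_call_site(self):
        # Caller 2 passes user input through verbatim.
        self.assertEqual(normalise_fraction(6, 9), (2, 3))

if __name__ == "__main__":
    unittest.main()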

>
> What appears to be a rather extreme example is SQLite, the database
> which is in Sage, where the test code is 647 times bigger than the
> library itself
>
> http://www.sqlite.org/testing.html

Now that's my kind of test suite!!
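
More seriously, the MPIR bug above is exactly the sort of thing a
cheap randomised identity check at the Python level could catch. Here
is a sketch of what I mean - mpir_tdiv_qr is a made-up name standing
in for a Python binding to the library's truncating division, stubbed
here with Python's divmod (the two agree for non-negative operands):

import random

def mpir_tdiv_qr(n, d):
    # Hypothetical binding to the C library's truncating division;
    # stubbed with divmod, which agrees with it for n >= 0, d > 0.
    return divmod(n, d)

def test_division_identity(trials=10000):
    for _ in range(trials):
        n = random.getrandbits(random.randint(1, 2048))
        d = random.getrandbits(random.randint(1, 1024)) + 1  # d >= 1
        q, r = mpir_tdiv_qr(n, d)
        # The defining property of division with remainder.
        assert n == q * d + r and 0 <= r < d, (n, d, q, r)

test_division_identity()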

>
> If you pick up any decent book on software engineering, you will find
> that the most expensive part of developing commercial software is
> maintenance. So it does not surprise me one bit that a lot of time in
> Sage is spent resolving bugs - that seems pretty normal in the
> computer industry.
>
> > One solution would be for everyone to test more widely. If you write
> > code that depends on feature Y of module X and module X doesn't
> > properly test feature Y, assume it is broken and write doctests for
> > that code as well as the code you are writing yourself.
>
> Unfortunately that will often be impractical, and I think it would
> easily consume more than the 50:50 mix you mention above.
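
What I had in mind is lighter than a full audit, though: the doctests
on your own code can deliberately probe the dependency's edge cases
rather than just your own plumbing. A sketch, with Python's math.gcd
standing in for a hypothetical wrapped C-library function:

import math

def lib_gcd(a, b):
    """Wrapper around a (hypothetical) C-library gcd; math.gcd stands in.

    The doctests probe the dependency's edge cases, not just the wrapper:

    >>> lib_gcd(0, 0)
    0
    >>> lib_gcd(0, 7)
    7
    >>> lib_gcd(-12, 18)
    6
    >>> lib_gcd(2**64, 2**32)
    4294967296
    """
    return math.gcd(a, b)

if __name__ == "__main__":
    import doctest
    doctest.testmod()
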
>
> > To give an example, Andy Novocin and I have been working on new
> > polynomial factoring code in FLINT for a couple of years now. Around 6
> > months ago we had a long test of some 100,000 or so polynomials
> > factoring correctly. We also had a long test of some 20 odd very
> > difficult polynomials factoring correctly. Thus there was no reason at
> > all to suppose there were *ANY* bugs in the polynomial factoring code
> > or any of the functions it made use of.
>
> That's just rubbish. If this code is non-trivial, then you must expect
> bugs.

Of course.

>
> > By Sage standards I think this
> > is an insane level of testing.
>
> It's certainly higher than normal in Sage, I would agree. There is a
> wide mix of Sage developers: some pay very scant attention to code
> quality, while others take it seriously.
>
> > But I insisted that every function we have written have its own test
> > code. This has meant 6 months more work (there was something like
> > 40,000 lines of new code to test). But I cannot tell you how many new
> > serious bugs (and also performance problems too) that we turned up.
> > There must be dozens of serious bugs we've fixed, many of which would
> > have led to incorrect factorisations of whole classes of polynomials.
>
> I'm really puzzled you are surprised at this.

I am puzzled you think I was expressing surprise. I was well aware of
what better test code would turn up.
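
To give a flavour of the kind of test code I mean: the real tests are
in C inside FLINT, but one useful pattern - the round trip - fits in a
few lines of Python. factor_poly below is a made-up stand-in for a
factoring routine, and polynomials are coefficient lists, lowest
degree first:

import random

def poly_mul(f, g):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def test_factor_roundtrip(factor_poly, trials=1000):
    """factor_poly is a hypothetical binding returning a list of factors."""
    for _ in range(trials):
        # Build a product of random monic factors, so the input is known
        # to be reducible.
        factors = [[random.randint(-10, 10) for _ in range(random.randint(1, 5))]
                   + [1] for _ in range(random.randint(2, 4))]
        product = [1]
        for f in factors:
            product = poly_mul(product, f)
        # Whatever factorisation comes back, it must multiply out to the input.
        check = [1]
        for f in factor_poly(product):
            check = poly_mul(check, f)
        assert check == product, product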

>
> > The lesson for me was: just because my very extensive 5 or 6 doctests
> > passed for the very complex new functionality I added, does not mean
> > there aren't incredibly serious bugs in the underlying modules I used,
> > nor does it mean that my new code is worth printing out and using as
> > toilet paper.
>
> You do not need 6 months' worth of work to prove this - you can pick
> up any decent book on software engineering, where these issues are
> discussed.
>
> > Detecting bugs in Sage won't make Sage a viable alternative to the
> > MA*'s (that's a whole other thread).
>
> Agreed. But leaving too many bugs undetected will make Sage
> non-viable for a large number of users.
>
> > After all, testing standards in
> > those other packages are quite possibly much worse.
>
> I do believe Wolfram Research take testing quite seriously. Whilst
> I'm aware that some of what is written at
>
> http://reference.wolfram.com/mathematica/tutorial/TestingAndVerificat...
>
> is total rubbish, I doubt they actually lie about the testing methods
> they use. They certainly claim to use a lot of techniques which Sage
> does not.
>
> > But testing more
> > thoroughly will mean less time is wasted trying to track down bugs
> > in an ad hoc manner, and eventually much more time is available for
> > addressing those issues that are relevant to becoming a viable
> > alternative.
>
> Agreed. But one has to balance the conflicting demands of testing code
> better, and adding more features. Finding the right balance point is
> not easy. Different developers each have their own ideas.
>

True.

Bill.
