Re: [sage-devel] Using valgrind to find segfaults

Mitesh Patel Wed, 03 Nov 2010 04:09:20 -0700

On 11/03/2010 12:29 AM, Bill Hart wrote:
> I've been down with the flu for a few days and so amusing myself in
> ways that don't make my head hurt too much. So, for fun, I just read


I hope you get better soon!

> through the *very* long trac ticket for getting the new Pari into
> Sage.
> 
> Firstly, thank you to all the people who took the time to work on
> putting the new MPIR and Pari into Sage.
> 
> (By the way, I don't understand why MPIR has been updated to 2.1.2 and
> not 2.1.3 which fixes a serious bug in the mpf functions. Nor do I
> understand why MPIR has been updated and the thread for this hasn't
> been closed. Also FLINT hasn't been updated, even though I explicitly
> stated it isn't safe to build the old flint against the new MPIR.)
> 
> Anyhow, whilst reading the long Pari trac ticket, and associated
> tickets, a few things stood out to me (a C programmer) that just might
> not be obvious to everyone. Apologies if this is already known to
> everyone here.
> 
> At some point the new Pari + new MPIR caused a segfault in one of the
> doctests. Now, segfaults are in some ways the easiest types of bugs to
> track down. Here's how:
> 
> You simply compile the relevant C libraries with gcc -g (this adds
> symbol information and line numbers to the compiled libraries). Next,
> you run the program valgrind. You don't need to do anything to run
> this program. It just works.
> 
> If you normally type "blah" at the command line to run your program,
> just type "valgrind blah" instead. It will take much longer to run
> (usually 25-100 times longer), but it will tell you precisely which
> lines of the C code caused the segfault and if it was reading or
> writing to an invalid memory address at the time! Its output is a bit
> like a stack trace in Python.
> 
> Note you can actually do all this with a Sage doctest, because after
> all, Sage is just a program you run from the command line.
> 
> Once you find out which lines of C code the segfault occurs at, you
> can put a trace in to see if the data being fed to the relevant
> function is valid or not. This tells you if the library is at fault or
> your higher level Python/Cython code is somehow responsible for
> feeding invalid data (e.g. some C object wasn't initialised).
> 
> Once upon a time, Michael Abshoff used to valgrind the entire Sage
> test suite and fix all the myriad bugs that showed up!
> 
> So valgrind is the bug hunter's friend.

Thanks for the tip.  I think we could add a valgrind builder to the Sage
buildbot.
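
A minimal sketch of what such a builder step might run, assuming valgrind
is on the PATH. The helper names are hypothetical, "./blah" is the
placeholder program from the quoted text, and the particular choice of
valgrind options is an assumption, not something Sage's buildbot does:

```python
import shutil
import subprocess


def valgrind_command(program_args, log_file=None):
    """Return the argv list that runs program_args under valgrind's memcheck."""
    cmd = [
        "valgrind",
        "--tool=memcheck",       # the default tool, named explicitly
        "--track-origins=yes",   # report where uninitialised values came from
        "--num-callers=30",      # deeper stack traces than the default
        "--error-exitcode=99",   # exit code to use if memcheck reports errors
    ]
    if log_file is not None:
        cmd.append("--log-file=" + log_file)
    return cmd + list(program_args)


def run_under_valgrind(program_args, log_file=None):
    """Run the command; return its exit code, or None if valgrind is missing."""
    if shutil.which("valgrind") is None:
        return None
    return subprocess.call(valgrind_command(program_args, log_file))


if __name__ == "__main__":
    # Hypothetical program name; replace with the real command line.
    print(" ".join(valgrind_command(["./blah"])))
```

The libraries under test still need to be built with gcc -g for the
reported stack frames to carry file names and line numbers.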

> A second observation, made by Leif I think, is spot on. This all quite
> possibly shows up a problem with insufficient doctesting in Sage.
> 
> Now the MPIR test code is pretty extensive and really ought to have
> picked up this bug. We put a lot of time into the test code for that
> MPIR release, so this is unfortunate.
> 
> However, the entire Pari test suite and the entire Sage test suite
> (with an older version of Pari) passed without picking up this pretty
> serious bug in the MPIR division code!
> 
> I think this underscores something I have been saying for a long time.
> Sage doesn't test the C libraries it uses well enough. As a result of
> that, it is taking inordinate amounts of developers' time to track
> down bugs turned up by Sage doctests when spkg's are updated. In some
> cases there is actually woefully inadequate test code in the C library
> itself. But even when this is not the case, it makes sense for Sage to
> do some serious testing before assuming the library is bug free. This
> is particularly easy to do in Python, and much harder to do at the
> level of the C library itself, by the way.
> 
> I have been saying this for a very long time, to many people. *ALL*
> mathematical libraries are broken and contain bugs. If you don't test
> the code you are using, it *is* broken. The right ratio of test code
> to code is really pretty close to 50/50. And if you think I don't do
> this myself when I write code (even Sage code), well you'd be wrong.

Does anyone have an estimate of this ratio for the Sage library?
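
One crude way to get a first estimate, sketched under the assumption
that a "test line" is any line starting with the "sage:" doctest prompt
and everything else that is non-blank counts as code or documentation.
The function name and the sample source are hypothetical, and expected
output lines are counted on the non-test side, so this undercounts tests:

```python
def doctest_ratio(source):
    """Return (test_lines, other_lines) for one Python source string.

    A line is a test line if it starts with the "sage:" prompt after
    stripping whitespace. Blank lines are ignored.
    """
    test_lines = 0
    other_lines = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("sage:"):
            test_lines += 1
        else:
            other_lines += 1
    return test_lines, other_lines


# Hypothetical sample input, not taken from the Sage library.
example = '''
def double(n):
    """
    EXAMPLES::

        sage: double(2)
        4
        sage: double(-3)
        -6
    """
    return 2 * n
'''

print(doctest_ratio(example))
```

Summing the two counts over every file in the library would give a rough
ratio; a more careful count would separate docstrings from code.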

> One solution would be for everyone to test more widely. If you write
> code that depends on feature Y of module X and module X doesn't
> properly test feature Y, assume it is broken and write doctests for
> that code as well as the code you are writing yourself.
> 
> To give an example, Andy Novocin and I have been working on new
> polynomial factoring code in FLINT for a couple of years now. Around 6
> months ago we had a long test of some 100,000 or so polynomials
> factoring correctly. We also had a long test of some 20-odd very
> difficult polynomials factoring correctly. Thus there was no reason at
> all to suppose there were *ANY* bugs in the polynomial factoring code
> or any of the functions it made use of. By Sage standards I think this
> is an insane level of testing.
> 
> But I insisted that every function we have written have its own test
> code. This has meant 6 months more work (there was something like
> 40,000 lines of new code to test). But I cannot tell you how many new
> serious bugs (and performance problems) we turned up.
> There must be dozens of serious bugs we've fixed, many of which would
> have led to incorrect factorisations of whole classes of polynomials.
> 
> The lesson for me was: just because my very extensive 5 or 6 doctests
> passed for the very complex new functionality I added, does not mean
> there aren't incredibly serious bugs in the underlying modules I used,
> nor does it mean that my new code is worth printing out and using as
> toilet paper.
> 
> Detecting bugs in Sage won't make Sage a viable alternative to the
> MA*'s (that's a whole 'nother thread). After all, testing standards in
> those other packages are quite possibly much worse. But testing more
> thoroughly will mean less time is wasted trying to track down
> bugs in an ad hoc manner, and eventually, much more time available for
> addressing those issues that are relevant to becoming a viable
> alternative.
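
As an illustration of how cheaply a library routine can be checked from
Python, here is a minimal sketch of a randomized test for integer
division. It is only an assumption about how one might do this: Python's
built-in divmod stands in for the wrapped library function, and
check_divmod and broken_divmod are hypothetical names:

```python
import random


def check_divmod(divmod_under_test, trials=1000, bits=256, seed=0):
    """Check n == q*d + r and 0 <= r < d on fixed and random inputs.

    Returns the list of (n, d) pairs for which the result was wrong.
    """
    rng = random.Random(seed)
    # Fixed edge cases first (zero dividend, exact multiples), then random ones.
    cases = [(0, 1), (12, 4), (2**64, 2**32)]
    for _ in range(trials):
        n = rng.getrandbits(bits)
        d = rng.getrandbits(rng.randint(1, bits)) or 1  # never divide by zero
        cases.append((n, d))
    failures = []
    for n, d in cases:
        q, r = divmod_under_test(n, d)
        if n != q * d + r or not (0 <= r < d):
            failures.append((n, d))
    return failures


def broken_divmod(n, d):
    """Deliberately wrong for exact multiples of d (illustration only)."""
    q, r = divmod(n, d)
    return (q, r) if r else (q - 1, d)


print(len(check_divmod(divmod)))           # built-in divmod as the stand-in
print(len(check_divmod(broken_divmod)) > 0)
```

The deliberately broken version still satisfies n == q*d + r, so it is
the range check on the remainder that catches it. Testing the identity
alone would let this class of bug through.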

