On Tue Aug 21 07:31:55 EDT 2012, cinap_len...@gmx.de wrote:
> nothing wrong with diffing the changes and see if there's a clue, but
> to solve this one really needs to find the underlying cause no matter
> what. changes can just hide bugs or make them more or less likely to
> appear. can anyone provide at least a stacktrace or process snapshot
> of the crashed dns processes? from that you try to build a theory of
> what might be going wrong by thinking really really hard... (the
> thinking should be directly proportional to the time it takes to
> reproduce the bug) and then you work on how to prove that theory.
> just changing stuff without knowing what exactly was the problem with
> the old code is sometimes tempting, but wrong and dangerous.

very good point.  in the past, much of the trouble has been that the
rr records get smashed and you crash much later on.  this makes debugging
from a crash frustrating.

but as it happily turns out, i keep all the snaps of all my broken dns
processes.  i've had ~35 since january, when i last looked at dns.  all
of the crashes were due to an off-by-one in dnresolv:/^serveraddrs.
the final loop in the function should be limited by Maxdest-1, since
Maxdest is an index, not a count.
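
to illustrate the kind of bound i mean, here is a minimal sketch.  it is
not the code from dnresolv: the Dest layout, the value of Maxdest, and
the filldest function are all made up for the example, and i'm assuming
a loop that needs one slot past the last entry it stores.

```c
#include <assert.h>

enum { Maxdest = 4 };	/* hypothetical size; not the value dns uses */

typedef struct Dest Dest;
struct Dest {
	int	addr;	/* 0 marks the end of the list (assumed convention) */
};

/*
 * copy up to n addresses from src into d, always leaving a zero
 * terminator after the last one stored.  returns the number stored.
 */
int
filldest(Dest d[Maxdest], const int *src, int n)
{
	int i;

	/* with i < Maxdest the terminator below could land on d[Maxdest] */
	for(i = 0; i < n && i < Maxdest-1; i++)
		d[i].addr = src[i];
	assert(i < Maxdest);
	d[i].addr = 0;
	return i;
}
```

with the bound written as Maxdest, the store after the loop would go one
element past the array and smash whatever sits next to it, which would
fit the pattern of crashing much later on.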

with this fixed, it will be interesting to see if we see better behavior
or not.  my big concern right now is the occasionally corrupted results we
see.  the crashes and whatnot are relatively easy to clean up after.

if anyone else is interested in my compendium of dnssnaps, i'd be happy
to send them along.  there's only a gb of them!  :-).

- erik
