On Fri, 14 Dec 2012, Alfred Perlstein wrote:
On 12/14/12 4:12 PM, Robert Watson wrote:
On Fri, 14 Dec 2012, John Baldwin wrote:
On Thursday, December 13, 2012 4:02:15 am Gleb Smirnoff wrote:
On Wed, Dec 12, 2012 at 04:53:48PM -0800, Alfred Perlstein wrote: A> The
problem again is that not all the KASSERTS are inviolable, if you A> want
to do a project to split them, then please do, it would really be A>
helpful, as for now, they are a mis-mash of death/warnings and there are
A> at least three vendors who approve of this as well as 3 long term A>
committers that approved my change (not including Adrian).
Can you show examples of not inviolable KASSERTs?
There are none. They are all assertions for a reason. However, in my
Not even one whose existence is a bug? :-)
experience at several large consumers of FreeBSD, no one wants to run with
INVARIANTS in production. Not because we don't want panics (believe me,
Yahoo! gets plenty of panics even with INVARIANTS disabled), but because
the performance overhead, and redefining INVARIANTS into printf doesn't
address that at all.
I mostly don't use INVARIANTS, but once used Debugger() a lot. Now I
only want to debug my own code and sometimes turn on INVARIANTS to try
to do this, but it usually exposes more bugs not in my own code so I
turn it off again :-).
In the past, FYI, the two major INVARIANTS hits were un-inlining of
locking, and UMA using additional global locking to perform heavier-weight
sanity checking. For me, at least, the former wasn't such a problem, but
the latter was extremely measurable. I have occasionally wondered if we
should have another option name for heavier-duty sanity checking, as is the
case with WITNESS, SOCKBUF_DEBUG, etc, so that INVARIANTS overhead can be
made tolerable for more real-world installations. As you observe, our
invariants tests are generally things where you catch critical problems in
much more debuggable states, rather than sources of additional fragility,
and it would be great if we could leave more in so that bug reports were
able to contain more information.
I think that KASSERT() is used too much:
- the heavyweight uses that you mention
- many lightweight unconditional panics in old code were turned into
conditional panics using KASSERT(). So production code that turns off
INVARIANTS no longer gets the benefit of the sanity checking for these.
- newer code tends to use KASSERT() even for things that could reasonably
be lightweight unconditional panics. So production code that turns off
INVARIANTS never got the benefit of the sanity checking for these.
- not many unconditional panics (with non-lightweight checking for them)
can't be left in production code, so there were a limited number of them,
with a few pushed under DIAGNOSTIC. In particular, ones for debugging
didn't grow unboundedly. Now, KASSERT()s grow unboundedly and the
cost of enabling INVARIANTS grows unboundedly.
- the source code becomes unreadable, with half of it tending to consist
ot misformatted KASSERT()s SHOUTING IN UPPER CASE. The latter bug is
missing in assert(3). assert(3) has a better designed API in other
respects too. For example it supplies the bloat for __func__, __FILE__
and __LINE__ whether you want this or not. This bloat was intentionally
left out for KASSERT(). But many uses of KASSERT() supply this bloat
in the caller, where it also bloat the source code.
The KTR system offers an interesting reference for a model that allows us to
make a compile time decision about which areas to trace.
There used to be a DIAGNOSTIC option for the more heavy checks, this is
either removed or just not used these days. Again, using a mask like KTR
might be a win if we bring back the equivalent of heavy weight DIAGNOSTIC
option.
DIGANOSTIC still exists. It is rarely used, and tends to be only used for
heavyweight checks that are too buggy or too heavyweight or just too old
to put in KASSERT().
Often I've been guilty of putting KASSERT(ptr != NULL) checks into the code
too. Those are really just to make it less painful when hitting that bug,
You mean to make it _more_ painful when hitting that bug.
so
instead of having to do a lot of homework to figure out the fault address and
line number, I can just get a pretty printed message under INVARIANTS. Maybe
those "pretty null checks" need to go under another option, perhaps
DIAGNOSTIC, or another .. ??PRETTY_PANIC.
Null pointer panics give restartable page faults, with the fault address
printed by trap_fatal() and the page fault often easy to restart using a
debugger. trap_fatal() calls the debugger, so the page fault isn't
completely transparent. This corresponds to gdb in userland stopping
on the signal for the page fault. You might have to use "handle SIGFOO
Stop/Print/Pass" to get back to the faulting instruction. In the kernel,
continuing from trap_fatal() gives this behaviour. The page fault will
repeat endlessly unless you fix it up. You can try various fixes until
you find one that doesn't fault, or let it fault a few times to look
at the context in different ways.
If panic() is used for null pointers, then you get an unrestartable
panic(), with the relevant context several frames back where it is
hard to see and much more difficult to restart from (practically
impossible now that panic() stops CPUs, etc., though it used to be
possible to unwind the frame in some cases).
The null pointer check doesn't even detect the common case where the
pointer is &p->bar where p is null. Now the bad pointer is a small
offset from a null pointer (normally at a small address). It would
take an earlier check of p to detect this.
Anyhow, I've always been a huge fan of FreeBSD due the additional debugging
checks it offered. I was the one that suggested splassert() back in the day
when we would continually find issues with spl calls. Additionally I found
SPLASSERT() was almost never used. The only uses that I can find of it
are all in netipsec.
doing filesystem work a ton easier with DEBUG_VFS_LOCKS under FreeBSD than
under Darwin which at the time did not have such a feature.
This was more useful I think :-). The main reason that SPLASSERT()
was useless was because spl was a stable API that went away at about
the same time that SPLASSERT() was added (SPLASSERT() was implemented
about 6 months before SMPng killed spl).
Bruce
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"