On Tue, Oct 4, 2016 at 7:43 PM, Paul Gortmaker
<paul.gortma...@windriver.com> wrote:
>
> A couple years ago Ingo had an idea to kill  BUG_ON abuse that I made
> a 1st pass at.  Back then it seemed nobody cared.  Maybe that has since
> changed?
>
> https://lkml.org/lkml/2014/4/30/359

So we actually have the checkpatch warning already:

      # avoid BUG() or BUG_ON()
                if ($line =~ /\b(?:BUG|BUG_ON)\b/) {
                        my $msg_type = \&WARN;
                        $msg_type = \&CHK if ($file);
                        &{$msg_type}("AVOID_BUG",
                                     "Avoid crashing the kernel - try
using WARN_ON & recovery code rather than BUG() or BUG_ON()\n" .
$herecurr);
                }

but it doesn't trigger on VM_BUG_ON().

And I'm not convinced about replacing things with BUG_ON_AND_HALT(),
it simply doesn't fix the existing issue we have: people use BUG_ON(),
and worse, _when_ they use BUG_ON(), they use it instead of error
handling, so the code _around_ the BUG_ON() tends to then very much
depend on what the BUG_ON() checks.

This is actually one way that VM_BUG_ON() is better: it's very much by
design something that can be compiled away, so at least hopefully
nobody thinks of it as a security measure. So we could just say that
we will treat VM_BUG_ON() as a WARN_ON_ONCE(), and just not kill the
machine.

Because I could easily imagine that somebody *does* treat BUG_ON()
that way, thinking "well, if that BUG_ON() triggers at least it won't
then go off the rails later".

In fact, right now we mark BUG() in such a way that gcc can even take
advantage of the "crash the machine" semantics, because we explicitly
mark the BUG() with "unreachable()".

So what I think we should think about is:

 - extending the checkpatch warning to VM_BUG_ON too, to discourage new users.

 - look at making BUG_ON() simply be less lethal. Remove the
unrechable(), reorganize how the string is stored, and make it act
more like WARN_ON_ONCE() instead (it's the "rewind_stack_do_exit()"
that ends up causing us to try to kill things, we *could* just try to
stop doing that).

 - Instead of adding a BUG_ON_AND_HALT(), we could perhaps add a new
FATAL_ERROR() thing that acts like the current BUG_ON, and *not* call
it something similar (we don't want people doing mindless
conversions!). And that's the one that would do the whole
rewind_stack_do_exit() to kill the process.

But apart from the checkpatch thing, it's actually a pretty big change.

               Linus

Reply via email to