Re: Question on -fwrapv and -fwrapv-pointer

Martin Uecker via Gcc Mon, 18 Sep 2023 08:23:25 -0700

Am Montag, dem 18.09.2023 um 10:47 +0200 schrieb Richard Biener via Gcc:
> On Mon, Sep 18, 2023 at 10:17 AM Martin Uecker <ma.uec...@gmail.com> wrote:
> > 
> > Am Montag, dem 18.09.2023 um 09:31 +0200 schrieb Richard Biener via Gcc:
> > > On Sat, Sep 16, 2023 at 10:38 AM Martin Uecker via Gcc <gcc@gcc.gnu.org> 
> > > wrote:
> > > > 
> > > > 
> > > > 
> > > > (moved to gcc@)
> > > > 
> > > > > On Fri, Sep 15, 2023 at 08:18:28AM -0700, Andrew Pinski wrote:
> > > > > > On Fri, Sep 15, 2023 at 8:12 AM Qing Zhao <qing.z...@oracle.com> 
> > > > > > wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao <xry...@xry111.site> 
> > > > > > > > wrote:
> > > > > > > > 
> > > > > > > > On Thu, 2023-09-14 at 21:41 +0000, Qing Zhao wrote:
> > > > > > > > > > > CLANG already provided 
> > > > > > > > > > > -fsanitize=unsigned-integer-overflow. GCC
> > > > > > > > > > > might need to do the same.
> > > > > > > > > > 
> > > > > > > > > > NO. There is no such thing as unsigned integer overflow. 
> > > > > > > > > > That option
> > > > > > > > > > is badly designed and the GCC community has rejected a few 
> > > > > > > > > > times now
> > > > > > > > > > having that sanitizer before. It is bad form to have a 
> > > > > > > > > > sanitizer for
> > > > > > > > > > well defined code.
> > > > > > > > > 
> > > > > > > > > Even though unsigned integer overflow is well defined, it 
> > > > > > > > > might be
> > > > > > > > > unintentional, shall we warn user about this?
> > > > > > > > 
> > > > > > > > *Everything* could be unintentional and should be warned then.  
> > > > > > > > GCC is a
> > > > > > > > compiler, not an advanced AI educating the programmers.
> > > > > > > 
> > > > > > > Well, you are right in some sense. -:)
> > > > > > > 
> > > > > > > However, overflow is one important source for security flaws, 
> > > > > > > it’s important  for compilers to detect
> > > > > > > overflows in the programs in general.
> > > > > > 
> > > > > > Except it is NOT an overflow. Rather it is wrapping. That is a big
> > > > > > point here. unsigned wraps and does NOT overflow. Yes there is a 
> > > > > > major
> > > > > > difference.
> > > > > 
> > > > > Right, yes. I will try to pick my language very carefully. :)
> > > > > 
> > > > > The practical problem I am trying to solve in the 30 million lines of
> > > > > Linux kernel code is that of catching arithmetic wrap-around. The
> > > > > problem is one of evolving the code -- I can't just drop -fwrapv and
> > > > > -fwrapv-pointer because it's not possible to fix all the cases at 
> > > > > once.
> > > > > (And we really don't want to reintroduce undefined behavior.)
> > > > > 
> > > > > So, for signed, pointer, and unsigned types, we need:
> > > > > 
> > > > > a) No arithmetic UB -- everything needs to have deterministic 
> > > > > behavior.
> > > > >    The current solution here is "-fno-strict-overflow", which 
> > > > > eliminates
> > > > >    the UB and makes sure everything wraps.
> > > > > 
> > > > > b) A way to run-time warn/trap on overflow/underflow/wrap-around. This
> > > > >    would work with -fsanitize=[signed-integer|pointer]-overflow except
> > > > >    due to "a)" we always wrap. And there isn't currently coverage like
> > > > >    this for unsigned (in GCC).
> > > > > 
> > > > > Our problem is that the kernel is filled with a mix of places where 
> > > > > there
> > > > > is intended wrap-around and unintended wrap-around. We can chip away 
> > > > > at
> > > > > fixing the intended wrap-around that we can find with static 
> > > > > analyzers,
> > > > > etc, but at the end of the day there is a long tail of finding the 
> > > > > places
> > > > > where intended wrap-around is hiding. But when the refactoring is
> > > > > sufficiently completely, we can move the wrap-around warning to a 
> > > > > trap,
> > > > > and the kernel will not longer have this class of security flaw.
> > > > > 
> > > > > As a real-world example, here is a bug where a u8 wraps around causing
> > > > > an under-allocation that allowed for a heap overwrite:
> > > > > 
> > > > > https://git.kernel.org/linus/6311071a0562
> > > > > https://elixir.bootlin.com/linux/v6.5/source/net/wireless/nl80211.c#L5422
> > > > > 
> > > > > If there were more than 255 elements in a linked list, the allocation
> > > > > would be too small, and the second loop would write past the end of 
> > > > > the
> > > > > allocation. This is a pretty classic allocation underflow and linear
> > > > > heap write overflow security flaw. (And it would be trivially stopped 
> > > > > by
> > > > > trapping on the u8 wrap around.)
> > > > > 
> > > > > So, I want to be able to catch that at run-time. But we also have code
> > > > > doing things like "if (ulong + offset < ulong) { ... }":
> > > > > 
> > > > > https://elixir.bootlin.com/linux/v6.5/source/drivers/crypto/axis/artpec6_crypto.c#L1187
> > > > > 
> > > > > This is easy for a static analyzer to find and we can replace it with 
> > > > > a
> > > > > non-wrapping test (e.g. __builtin_add_overflow()), but we'll not find
> > > > > them all immediately, especially for the signed and pointer cases.
> > > > > 
> > > > > So, I need to retain the "everything wraps" behavior while still being
> > > > > able to detect when it happens.
> > > > 
> > > > 
> > > > Hi Kees,
> > > > 
> > > > I have a couple of questions:
> > > > 
> > > > Currently, my thinking was that you would use signed integers
> > > > if you want the usual integer arithmetic rules we know from
> > > > elementary school and if you overflow this is clearly a bug
> > > > you can diagnose with UBsan.
> > > > 
> > > > There are people who think that signed overflow should be
> > > > defined to wrap, but I think this would be a severe
> > > > mistake because then code would start to rely on it, which
> > > > makes it then difficult to differentiate between bugs and
> > > > intended uses (e.g. the unfortunate situation you have
> > > > with the kernel).
> > > > 
> > > > I assume you want to combine UBSan plus wrapping for
> > > > production use?  Or only for testing?   Or in other words:
> > > > why would testing UBSan and production with wrapping
> > > > not be sufficient to find and fix all bugs?
> > > > 
> > > > Wrapping would not be correct because it may lead to
> > > > logic errors or use-after-free etc.  I assume it is still
> > > > preferred because it more deterministic than whatever comes
> > > > out of the optimizer assuming that overflow has UB. Is this
> > > > the reasoning applied here?
> > > > 
> > > > 
> > > > For unsigned the intended use case is modulo arithmetic
> > > > where wrapping is the correct behavior. At least, this
> > > > is what I thought so far.. This seems also to be the
> > > > position of the overall GCC community rejecting
> > > > -fsanitize=unsigned-integer-overflow.
> > > > 
> > > > But then there are also people like Richard Seacord that
> > > > have the position that one should use  "unsigned" for
> > > > every quantity which can not be negative, which implies
> > > > that then the modulo use case becomes the exception.
> > > > 
> > > > Related to this I have the following question: In the
> > > > bug you refer to above, an unsigned variable was used
> > > > for something that is not meant for modulo arithmetic.
> > > > Is this done in the kernel to save space?  I.e. because
> > > > 127 would not be enough as maximum but going to i16 takes
> > > > to much space? or is this for compatibility with some
> > > > on-the-wire protocol?
> > > > 
> > > > Would an attribute for variables help that tells the
> > > > compiler that if stores to the variable wrap around
> > > > then this is not intended and this is an error?
> > > > 
> > > > u8 x [[nowrap]];
> > > > 
> > > > x = 256; // can be diagnosed ?
> > > 
> > > There's a problem with representing this, it's a long-standing thing 
> > > standing
> > > in the way of handling -fwrapv in the IL.  Maybe C wants to have
> > > 'nonnegative int' in addition to 'unsigned int', or 'bit int' for the
> > > "just some two-complement bits".
> > > 
> > > For GCC we think of -fwrapv to perform signed integer arithmetic as if
> > > promoted to a large enough type first and then truncated to the original
> > > type in the implementation defined manner (modulo reducing).  But we
> > > end up doing what the actual hardware does (I'm for example unsure
> > > how that handles INT_MIN / -1 - x86 documents it as raising #DE
> > > which it indeed does, even with -fwrapv).
> > > 
> > > volatile int a, b;
> > > int main ()
> > > {
> > >   a = -__INT_MAX__ - 1;
> > >   b = -1;
> > >   int z = a / b;
> > >   __builtin_printf ("%d\n", z);
> > > }
> > > 
> > > raises SIGFPE with and without -fwrapv.  I'll note that we document
> > > -fwrapv to only affect addition, subtraction and multiplication
> > > (it also affects negation), matching the libgcc *v functions we have
> > > for -ftrapv.  -fsanitize=undefined instruments division but
> > > -fsanitize=recover doesn't help.  Maybe there's a bug about this
> > > recorded.  The documentation of -fwrapv should likely be clarified.


Thank you for the explanation. But note that I suggested the 
opposite: "nowrap" . A compiler would not need to be changed
at all and can simply ignore it this attribute. But if it sees 
that a too large value is stored in an unsigned type that
is too small, it could warn in the frontend. And we could add 
a sanitizer mode to diagnose wraparound at run-time. (which 
would then not have the problem with false  positives as 
-funsigned-integer-overflow would have)
> > > 

> > > For recent work I failed to spot the C standards language on
> > > singed overflow behavior for negation, addition, subtraction
> > > and multiplication.  I can find it for division though.  Can someone
> > > point me to where that is (moved?)?
> > 
> > I don't think it moved. It should follow from the general rule 6.5p5:
> > 
> > "If an exceptional condition occurs during the evaluation of an
> > expression (that is, if the result is not mathematically defined
> > or not in the range of representable values for its type), the
> > behavior is undefined."
> 
> Ah, and the unsigned integer type exception is then 6.2.5/9
> but using "can never overflow" wording rather than "in the range
> of representable values for its type".
> 
The "can never overflow" is consequence of modulo
arithmetic, because then the result is in the
representable range.

>   Integer "overflow" isn't defined anywhere and 

It might be defined in ISO/IEC 2382.

> I suppose the unsigned exception also
> applies to underflow for expressions like 0 - 1.

Yes, because the operation is performed modulo 2^N.

> 
> It would have been useful to repeat some of this in the sections
> documenting the relevant operators.

I agree.

Martin

Re: Question on -fwrapv and -fwrapv-pointer

Reply via email to