[uclibc-ng-devel] Re: [PATCH] fix possible overflow in pointer arithmetics strnlen()

Kjetil Oftedal Tue, 07 Jan 2025 09:18:19 -0800

On Tue, 7 Jan 2025 at 18:07, Frank Mehnert
<frank.mehn...@kernkonzept.com> wrote:
>
> On Tuesday, 7 January 2025 17:38:52 CET Kjetil Oftedal wrote:
> > On Tue, 7 Jan 2025 at 17:09, Frank Mehnert
> > <frank.mehn...@kernkonzept.com> wrote:
> > >
> > > On Tuesday, 7 January 2025 16:45:03 CET Kjetil Oftedal wrote:
> > > > On Tue, 7 Jan 2025 at 16:19, Frank Mehnert
> > > > <frank.mehn...@kernkonzept.com> wrote:
> > > > >
> > > > > On Tuesday, 7 January 2025 15:55:28 CET Kjetil Oftedal wrote:
> > > > > > On Tue, 7 Jan 2025 at 15:33, Frank Mehnert
> > > > > > <frank.mehn...@kernkonzept.com> wrote:
> > > > > > > [...]
> > > > > > >
> > > > > > > With offset=1 and x=0xffffffff'ffffffff, y will be 0, so y does not
> > > > > > > belong to the same object as x. There is no other value of x which
> > > > > > > could lead to a result y<x with offset=1!
> > > > > > >
> > > > > > > The comparison tests for the wrap-around and adjusts end_ptr in
> > > > > > > that case.
> > > > > > >
> > > > > > > Again: If there was a wrap-around, then the pointer arithmetic
> > > > > > > operation was wrong (undefined behavior). Hence, we can remove the
> > > > > > > test plus the code which is executed if the test succeeds.
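
For comparison, the same wrap-around test has fully defined behavior when
the arithmetic is done on integers instead of pointers. What follows is
only a sketch under assumptions: saturating_end is a made-up name, it is
not code from the patch, and it assumes that a pointer converted to
uintptr_t behaves like a flat address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: computes start + maxlen
 * in unsigned integer arithmetic, where wrap-around is well defined, and
 * saturates to the highest address if the sum wrapped. */
uintptr_t saturating_end(uintptr_t start, size_t maxlen)
{
    uintptr_t end;

    /* Assumption: every size_t value fits in uintptr_t. */
    assert(sizeof(uintptr_t) >= sizeof(size_t));

    end = start + (uintptr_t)maxlen;  /* may wrap modulo 2^N, no UB */
    if (end < start)                  /* wrapped: clamp instead */
        end = UINTPTR_MAX;
    return end;
}
```

A caller would pass (uintptr_t)str and compare addresses converted the same
way. Because the operands are unsigned integers, the compiler has no
licence to drop the (end < start) test.
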
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > I see the point. I just don't agree with the conclusion.
> > > > > > As I see it, the LLVM group is claiming that if there is any
> > > > > > possibility of UB, even if it depends on the input arguments, then
> > > > > > it is always UB, and the compiler can remove code as it sees fit.
> > > > > > (Which will be a nightmare for security code.)
> > > > >
> > > > > I admit I thought so too some time ago, but in my opinion you are
> > > > > drawing the wrong conclusion:
> > > > >
> > > > > There is a test in the code.
> > > > >
> > > > > The result of the test can _only_ be true if the code before the test
> > > > > triggered undefined behavior.
> > > > >
> > > > > In other words, if the test succeeded, then we can be sure that the
> > > > > test compared two pointers belonging to different objects -- which is
> > > > > undefined behavior.
> > > > >
> > > > > Therefore, the compiler is fine removing the test because the compiler
> > > > > does not support the case where the test result is true.
> > > > >
> > > > > If the test result is false, everything is fine: both pointers may or
> > > > > may not point into the same object, but the compiler assumes that they
> > > > > do. But: If the test result is false, end_ptr does not need an
> > > > > adjustment, therefore that line can be optimized out.
> > > > >
> > > > > [...]
> > > >
> > > > Let us flip the variables around and review:
> > > > E.g.
> > > > if ( str < end_ptr )
> > > > instead of
> > > > if ( end_ptr < str )
> > > >
> > > > Should the compiler then always evaluate the clause to true?
> > > > end_ptr might still be lower than str due to overflow in the arithmetic.
> > >
> > > Not sure what you mean by "then always evaluate the clause to true".
> > > Are you asking if the compiler may assume that (str < end_ptr) is always
> > > true because only this is "defined behavior"?
> > >
> > > No, the compiler cannot assume that. But this is no contradiction to what
> > > I said before.
> > >
> > > The compiler does the following steps for the original problem,
> > > (end_ptr < str):
> > >
> > >   1. Generate code for the test.
> > >   2. The test result can either be true or false. The compiler generates
> > >      code for both cases.
> > >   3. No code needs to be generated for the case "false".
> > >   4. Generate code for the case "true".
> > >   5. Optimize the code. The compiler observes that the "true" case can
> > >      only happen if the previous calculation triggered undefined behavior
> > >      (str and end_ptr point to different objects).
> > >
> > > In your case (str < end_ptr), the compiler still needs to generate code
> > > for the test because the compiler must assume that both pointers (str and
> > > end_ptr) belong to the same object, therefore the test is valid.
> > >
> > > In the original case (end_ptr < str), the compiler knows for sure that
> > > both pointers belong to different objects!
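
One way to sidestep the question entirely is never to form str + maxlen in
the first place. A minimal sketch, assuming a plain byte-wise loop is
acceptable; bounded_len is a made-up name for illustration and is not the
uClibc-ng implementation or the proposed patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the uClibc-ng code: counts bytes with an
 * index, so no pointer beyond the terminating NUL is ever formed,
 * however large maxlen is. */
size_t bounded_len(const char *s, size_t maxlen)
{
    size_t n = 0;

    assert(s != NULL);
    /* Stop at maxlen or at the terminator, whichever comes first. */
    while (n < maxlen && s[n] != '\0')
        n++;
    return n;
}
```

With this shape there is no end pointer and hence no comparison the
compiler could reason away; the price is giving up a word-at-a-time scan.
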
> >
> > [...]
> >
> > Is this true for all cases for strings?
> >
> > > In the original case (end_ptr < str), the compiler knows for sure that
> > > both pointers belong to different objects!
> >
> > ---
> > const char* str = "Hello World!";
> > const char* sub_str = strstr(str, "Wor");
> >
> > strnlen(sub_str, X);
> > ---
> >
> > For some values of X, end_ptr still points to a valid array entry
> > of the original "Hello World!" string, even if it is not within the
> > "World!" substring. Is it then not pointing into a valid array of objects?
> > (E.g. if X is the unsigned representation of -1, -2, ..., -6.)
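
The arithmetic behind that example can be checked without any pointer
overflow by doing it on integers. This is a sketch only: wrapped_end and
lands_on_start are made-up names, and the code assumes that size_t and
uintptr_t have the same width and that pointer-to-integer conversion yields
a flat address, which holds on common platforms but is not guaranteed by
the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the address p + x computed modulo 2^N in
 * uintptr_t, so a huge x acts like a small negative offset. */
uintptr_t wrapped_end(const char *p, size_t x)
{
    /* Assumption: same width, so (size_t)-6 wraps exactly like -6. */
    assert(sizeof(uintptr_t) == sizeof(size_t));
    return (uintptr_t)p + (uintptr_t)x;
}

/* Hypothetical helper: returns 1 if sub_str + x, computed as integers,
 * lands on the first byte of str, where sub_str = strstr(str, needle). */
int lands_on_start(const char *str, const char *needle, size_t x)
{
    const char *sub_str = strstr(str, needle);

    return sub_str != NULL && wrapped_end(sub_str, x) == (uintptr_t)str;
}
```

For "Hello World!" and "Wor", sub_str is str + 6, so X = (size_t)-6 brings
the integer result back to the address of str[0]. Whether the same sum is
valid as pointer arithmetic is exactly the open question.
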
>
> OK, I think you have a point.
>
> Basically this boils down to the question of whether adding a huge offset to
> a pointer is the same as subtracting a small offset from it. So far I
> couldn't find any C++ rule related to this problem.
>
> Kind regards,
>
> Frank
> --
> Dr.-Ing. Frank Mehnert, frank.mehn...@kernkonzept.com, +49-351-41 883 224
>
> Kernkonzept GmbH.  Registered office: Dresden.  District Court Dresden, HRB 31129.
> Managing director: Dr.-Ing. Michael Hohmuth
>
>
Hi,


I think it is purely academic at this point though :)

The change is probably fine. I am just a bit annoyed that clang forces
changes to the code in a lot of projects, instead of accepting de facto
standard patterns.

Best regards,
Kjetil Oftedal
