[uclibc-ng-devel] Re: [PATCH] fix possible overflow in pointer arithmetics strnlen()

Frank Mehnert Tue, 07 Jan 2025 09:07:21 -0800

On Tuesday, 7 January 2025 17:38:52 CET Kjetil Oftedal wrote:
> On Tue, 7 Jan 2025 at 17:09, Frank Mehnert
> <frank.mehn...@kernkonzept.com> wrote:
> >
> > On Tuesday, 7 January 2025 16:45:03 CET Kjetil Oftedal wrote:
> > > On Tue, 7 Jan 2025 at 16:19, Frank Mehnert
> > > <frank.mehn...@kernkonzept.com> wrote:
> > > >
> > > > On Tuesday, 7 January 2025 15:55:28 CET Kjetil Oftedal wrote:
> > > > > On Tue, 7 Jan 2025 at 15:33, Frank Mehnert
> > > > > <frank.mehn...@kernkonzept.com> wrote:
> > > > > > [...]
> > > > > >
> > > > > > With offset=1 and x=0xffffffff'ffffffff, y will be 0, so y does not
> > > > > > belong to the same object as x. There is no other value of x which
> > > > > > could lead to a result y<x with offset=1!
> > > > > >
> > > > > > The comparison tests for the wrap-around and adapts end_ptr in that
> > > > > > case.
> > > > > >
> > > > > > Again: If there was a wrap-around, then the pointer arithmetic
> > > > > > operation was wrong (undefined behavior). Hence, we can remove the
> > > > > > test plus the code which is executed if the test succeeds.
> > > > >
> > > > > [...]
> > > > >
> > > > > I see the point. I just don't agree with the conclusion.
> > > > > As I see it, the LLVM group is claiming that if there is any
> > > > > possibility of UB, even if it depends on input arguments, then it is
> > > > > always UB, and the compiler can remove code as it sees fit. (Which
> > > > > will be a nightmare for security code.)
> > > >
> > > > I admit I thought so some time ago, but in my opinion you draw the wrong
> > > > conclusion:
> > > >
> > > > There is a test in the code.
> > > >
> > > > The result of the test can _only_ be true if the code before the test
> > > > triggered undefined behavior.
> > > >
> > > > In other words, if the test succeeded, then we can be sure that the test
> > > > compared two pointers belonging to different objects -- which is
> > > > undefined behavior.
> > > >
> > > > Therefore, the compiler is fine removing the test because the compiler
> > > > does not support the case where the test result is true.
> > > >
> > > > If the test result is false, everything is fine: both pointers may or
> > > > may not point to different objects, but the compiler assumes that they
> > > > point to the same object. But: If the test result is false, end_ptr does
> > > > not need an adjustment, therefore that line can be optimized out.
> > > >
> > > > [...]
> > >
> > > Let us flip the variables around and review:
> > > E.g.
> > > if ( str < end_ptr )
> > > instead of
> > > if ( end_ptr < str )
> > >
> > > Should the compiler then always evaluate the clause to true?
> > > end_ptr might still be lower than str due to overflow in the arithmetic.
> >
> > Not sure what you mean by "then always evaluate the clause to true".
> > Are you asking if the compiler may assume that (str < end_ptr) is always
> > true because only this is "defined behavior"?
> >
> > No, the compiler cannot assume that. But this is no contradiction to what
> > I said before.
> >
> > The compiler does the following steps for the original problem,
> > (end_ptr < str):
> >
> >   1. Generate code for the test.
> >   2. The test result can either be true or false. The compiler generates
> >      code for both cases.
> >   3. No code needs to be generated for the case "false".
> >   4. Generate code for the case "true".
> >   5. Optimize the code. The compiler observes that the "true" case can
> >      only happen if the previous calculation triggered undefined behavior
> >      (str and end_ptr point to different objects).
> >
> > In your case (str < end_ptr), the compiler still needs to generate code for
> > the test because the compiler must assume that both pointers (str and
> > end_ptr) belong to the same object, therefore the test is valid.
> >
> > In the original case (end_ptr < str), the compiler knows for sure that both
> > pointers belong to different objects!
>
> [...]
>
> Is this true for all cases for strings?
>
> > In the original case (end_ptr < str), the compiler knows for sure that
> > both pointers belong to different objects!
> 
> ---
> const char* str = "Hello World!";
> const char* sub_str = strstr(str, "Wor");
> 
> strnlen(sub_str, X);
> ---
> 
> For some values of X, end_ptr still points to a valid array entry
> of the original "Hello World!" string,
> even if it is not within the "World!" substring. Is it then not
> pointing to a valid array of objects?
> (E.g. if X is the unsigned representation of -1, -2, ..., -6)


OK, I think you have a point.
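
To make the example concrete, here is a small sketch that does the
arithmetic on integers instead of pointers, so that the wrap-around itself
is well defined. The function name wrapped_end_is_inside is made up for this
mail, and the sketch assumes a flat address space in which converting a
pointer to uintptr_t preserves the ordering of addresses. It says nothing
about what the standard allows for the pointer expression itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns 1 if adding the unsigned representation of -k to the address of
 * sub_str yields an address inside the array of len bytes starting at base.
 * All arithmetic is done on uintptr_t, where wrap-around is well defined,
 * so no out-of-bounds pointer is ever formed.
 *
 * Assumption: the pointer to uintptr_t conversion preserves address order.
 */
static int wrapped_end_is_inside(const char *base, size_t len,
                                 const char *sub_str, unsigned k)
{
    uintptr_t lo  = (uintptr_t)base;
    uintptr_t sub = (uintptr_t)sub_str;
    uintptr_t x   = (uintptr_t)0 - (uintptr_t)k; /* "huge" offset, i.e. -k */
    uintptr_t end = sub + x;                     /* wraps modulo 2^N */

    return end >= lo && end < lo + len;
}
```

Under these assumptions, with base pointing to "Hello World!" and sub_str
being the result of strstr(base, "Wor"), the function should return 1 for
k = 1..6 and 0 for k = 7, which matches the values of X named above.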

Basically this boils down to the question of whether adding a huge offset to a
pointer is the same as subtracting a small offset from it. So far I couldn't
find any C++ rule related to this problem.
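
Independent of how that question is answered, the end pointer can be avoided
altogether. The following is only a minimal sketch, neither the current
uClibc-ng code nor the patch under discussion; the name strnlen_counted is
hypothetical. It counts with an index instead of computing str + maxlen, so
no pointer beyond the object is formed even if maxlen is close to SIZE_MAX.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not the uClibc-ng implementation.
 *
 * Returns the number of bytes before the terminating NUL, but at most
 * maxlen. Only str[n] with n < maxlen is ever accessed, and the loop stops
 * at the first NUL byte, so str + maxlen is never computed.
 */
static size_t strnlen_counted(const char *str, size_t maxlen)
{
    size_t n = 0;

    while (n < maxlen && str[n] != '\0')
        n++;

    return n;
}
```

A byte-wise loop like this is likely slower than a word-at-a-time
implementation, so it is meant to show the idea rather than to be a drop-in
replacement.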

Kind regards,

Frank
--
Dr.-Ing. Frank Mehnert, frank.mehn...@kernkonzept.com, +49-351-41 883 224

Kernkonzept GmbH.  Sitz: Dresden.  Amtsgericht Dresden, HRB 31129.
Geschäftsführer: Dr.-Ing. Michael Hohmuth


