On Thu, Feb 6, 2025 at 2:57 PM Qing Zhao <qing.z...@oracle.com> wrote:
>
>
>
> > On Feb 6, 2025, at 04:36, Richard Biener <richard.guent...@gmail.com> wrote:
> >
> > On Wed, Feb 5, 2025 at 4:40 PM Qing Zhao <qing.z...@oracle.com> wrote:
> >>
> >>
> >>
> >>> On Feb 5, 2025, at 07:46, Richard Biener <richard.guent...@gmail.com> 
> >>> wrote:
> >>>
> >>> On Tue, Feb 4, 2025 at 10:12 PM Qing Zhao <qing.z...@oracle.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> One of our big application segv in libgcc code while unwinding the stack.
> >>>>
> >>>> This is a random crash while the application throws a c++ exception and
> >>>> unwinds the stack. Incidents are random and never can be reproduced by 
> >>>> any
> >>>> test case.
> >>>>
> >>>> The libgcc that is used is based on GCC 8.
> >>>>
> >>>> Fortunately, we got some information through the stack trace, and 
> >>>> narrowed
> >>>> down the crash is in the routine "init_object" of 
> >>>> libgcc/unwind-dw2-pde.c:
> >>>>
> >>>> 777           count = classify_object_over_fdes (ob, ob->u.single);
> >>>>
> >>>> when loading the 2nd parameter of the call to 
> >>>> "classify_object_over_fdes".
> >>>> i.e, the address that is pointed by "ob->u.single" is invalid.
> >>>>
> >>>> And the outer caller we can track back is the following line 1066 of the 
> >>>> routine
> >>>> "_Unwind_Find_FDE":
> >>>>
> >>>> 1060   /* Classify and search the objects we've not yet processed.  */
> >>>> 1061   while ((ob = unseen_objects))
> >>>> 1062     {
> >>>> 1063       struct object **p;
> >>>> 1064
> >>>> 1065       unseen_objects = ob->next;
> >>>> 1066       f = search_object (ob, pc);
> >>>>
> >>>> Then we suspected that the construction of the static variable 
> >>>> "unseen_objects"
> >>>> might have some bug.
> >>>>
> >>>> Then after studying the source code that constructs "unseen_objects" in
> >>>> libgcc/unwind-dw2-pde.c, I found an issue in the routines 
> >>>> "__register_frame"
> >>>> and "__register_frame_table" as following:
> >>>>
> >>>> 127 void
> >>>> 128 __register_frame (void *begin)
> >>>> 129 {
> >>>> 130   struct object *ob;
> >>>> 131
> >>>> 132   /* If .eh_frame is empty, don't register at all.  */
> >>>> 133   if (*(uword *) begin == 0)
> >>>> 134     return;
> >>>> 135
> >>>> 136   ob = malloc (sizeof (struct object));
> >>>> 137   __register_frame_info (begin, ob);
> >>>> 138 }
> >>>>
> >>>> 181 void
> >>>> 182 __register_frame_table (void *begin)
> >>>> 183 {
> >>>> 184   struct object *ob = malloc (sizeof (struct object));
> >>>> 185   __register_frame_info_table (begin, ob);
> >>>> 186 }
> >>>> 187
> >>>>
> >>>> In the above, one obvious issue in the source code is at line 136, line 
> >>>> 137,
> >>>> and line 184 and line 185: the return value of call to "malloc" is not 
> >>>> checked
> >>>> against NULL before it was passed as the second parameter of the 
> >>>> follwoing call.
> >>>>
> >>>> This might cause unpredicted random behavior.
> >>>>
> >>>> I checked the latest trunk GCC libgcc, the same issue is there too.
> >>>>
> >>>> This patch is for the latest trunk GCC.
> >>>>
> >>>> Please let me know any comments on this?
> >>>
> >>> I think I've seen this elsewhere —
> >>
> >> Do you mean that you saw this problem (malloc return NULL during 
> >> register_frame) previously?
> >
> > It was probably reported to us (SUSE) from a customer/partner, also
> > from a (llvm?) JIT context.
>
> Okay.
> So, that bug has not been resolved yet?

It was resolved from our side IIRC as being a problem in the
application with the JIT and
otherwise a feature request (better unwind API for this use case,
allowing a failure mode).

Richard.

> Qing
> >
> >>> the issue is the unwind register API does
> >>> not allow for failures but I also think calling abort() is bad.
> >>
> >> Okay, I see.
> >>
> >>>
> >>> Are you calling this from a JIT context or so?
> >>
> >> I will ask the bug submitter on this to make sure.
> >>
> >>> We're assuming that at program
> >>> start malloc() will not fail.
> >>
> >> So, the routine __register_frame is only called at the _start_ of a 
> >> program?
> >>>
> >>> The proper solution is probably to add an alternate ABI which has a way 
> >>> to fail
> >>> during registration.
> >>
> >> Any suggestion on this (or any other routine I can refer to?)
> >>
> >>
> >> Thanks a lot for the help.
> >>
> >> Qing
> >>>
> >>> Richard.
> >>>
> >>>> thanks.
> >>>>
> >>>> Qing
> >>>>
> >>>>
> >>>> =======================
> >>>>
> >>>> Fix two issues in libgcc/unwind-dw2-fde.c:
> >>>>
> >>>> A. The returned value of call to malloc is not checked against NULL.
> >>>>  This might cause unpredicted behavior during stack unwinding.
> >>>>
> >>>> B. Check for null begin parameter (as well as pointer to null) in
> >>>>  __register_frame_info_table_bases and __register_frame_table.
> >>>>
> >>>> libgcc/ChangeLog:
> >>>>
> >>>>       * unwind-dw2-fde.c (__register_frame): Check the return value of 
> >>>> call
> >>>>       to malloc.
> >>>>       (__register_frame_info_table_bases): Check for null begin 
> >>>> parameter.
> >>>>       (__register_frame_table): Check the return value of call to malloc,
> >>>>       Check for null begin parameter.
> >>>> ---
> >>>> libgcc/unwind-dw2-fde.c | 12 ++++++++++++
> >>>> 1 file changed, 12 insertions(+)
> >>>>
> >>>> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
> >>>> index cdfd3974c99..f0bc29d682a 100644
> >>>> --- a/libgcc/unwind-dw2-fde.c
> >>>> +++ b/libgcc/unwind-dw2-fde.c
> >>>> @@ -169,6 +169,8 @@ __register_frame (void *begin)
> >>>>    return;
> >>>>
> >>>>  ob = malloc (sizeof (struct object));
> >>>> +  if (ob == NULL)
> >>>> +    abort ();
> >>>>  __register_frame_info (begin, ob);
> >>>> }
> >>>>
> >>>> @@ -180,6 +182,10 @@ void
> >>>> __register_frame_info_table_bases (void *begin, struct object *ob,
> >>>>                                  void *tbase, void *dbase)
> >>>> {
> >>>> +  /* If .eh_frame is empty, don't register at all.  */
> >>>> +  if ((const uword *) begin == 0 || *(const uword *) begin == 0)
> >>>> +    return;
> >>>> +
> >>>>  ob->pc_begin = (void *)-1;
> >>>>  ob->tbase = tbase;
> >>>>  ob->dbase = dbase;
> >>>> @@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct 
> >>>> object *ob)
> >>>> void
> >>>> __register_frame_table (void *begin)
> >>>> {
> >>>> +    /* If .eh_frame is empty, don't register at all.  */
> >>>> +  if (*(uword *) begin == 0)
> >>>> +    return;
> >>>> +
> >>>>  struct object *ob = malloc (sizeof (struct object));
> >>>> +  if (ob == NULL)
> >>>> +    abort ();
> >>>>  __register_frame_info_table (begin, ob);
> >>>> }
> >>>>
> >>>> --
> >>>> 2.31.1
>
>

Reply via email to