On Thu, Feb 6, 2025 at 2:57 PM Qing Zhao <qing.z...@oracle.com> wrote: > > > > > On Feb 6, 2025, at 04:36, Richard Biener <richard.guent...@gmail.com> wrote: > > > > On Wed, Feb 5, 2025 at 4:40 PM Qing Zhao <qing.z...@oracle.com> wrote: > >> > >> > >> > >>> On Feb 5, 2025, at 07:46, Richard Biener <richard.guent...@gmail.com> > >>> wrote: > >>> > >>> On Tue, Feb 4, 2025 at 10:12 PM Qing Zhao <qing.z...@oracle.com> wrote: > >>>> > >>>> Hi, > >>>> > >>>> One of our big application segv in libgcc code while unwinding the stack. > >>>> > >>>> This is a random crash while the application throws a c++ exception and > >>>> unwinds the stack. Incidents are random and never can be reproduced by > >>>> any > >>>> test case. > >>>> > >>>> The libgcc that is used is based on GCC 8. > >>>> > >>>> Fortunately, we got some information through the stack trace, and > >>>> narrowed > >>>> down the crash is in the routine "init_object" of > >>>> libgcc/unwind-dw2-pde.c: > >>>> > >>>> 777 count = classify_object_over_fdes (ob, ob->u.single); > >>>> > >>>> when loading the 2nd parameter of the call to > >>>> "classify_object_over_fdes". > >>>> i.e, the address that is pointed by "ob->u.single" is invalid. > >>>> > >>>> And the outer caller we can track back is the following line 1066 of the > >>>> routine > >>>> "_Unwind_Find_FDE": > >>>> > >>>> 1060 /* Classify and search the objects we've not yet processed. */ > >>>> 1061 while ((ob = unseen_objects)) > >>>> 1062 { > >>>> 1063 struct object **p; > >>>> 1064 > >>>> 1065 unseen_objects = ob->next; > >>>> 1066 f = search_object (ob, pc); > >>>> > >>>> Then we suspected that the construction of the static variable > >>>> "unseen_objects" > >>>> might have some bug. > >>>> > >>>> Then after studying the source code that constructs "unseen_objects" in > >>>> libgcc/unwind-dw2-pde.c, I found an issue in the routines > >>>> "__register_frame" > >>>> and "__register_frame_table" as following: > >>>> > >>>> 127 void > >>>> 128 __register_frame (void *begin) > >>>> 129 { > >>>> 130 struct object *ob; > >>>> 131 > >>>> 132 /* If .eh_frame is empty, don't register at all. */ > >>>> 133 if (*(uword *) begin == 0) > >>>> 134 return; > >>>> 135 > >>>> 136 ob = malloc (sizeof (struct object)); > >>>> 137 __register_frame_info (begin, ob); > >>>> 138 } > >>>> > >>>> 181 void > >>>> 182 __register_frame_table (void *begin) > >>>> 183 { > >>>> 184 struct object *ob = malloc (sizeof (struct object)); > >>>> 185 __register_frame_info_table (begin, ob); > >>>> 186 } > >>>> 187 > >>>> > >>>> In the above, one obvious issue in the source code is at line 136, line > >>>> 137, > >>>> and line 184 and line 185: the return value of call to "malloc" is not > >>>> checked > >>>> against NULL before it was passed as the second parameter of the > >>>> follwoing call. > >>>> > >>>> This might cause unpredicted random behavior. > >>>> > >>>> I checked the latest trunk GCC libgcc, the same issue is there too. > >>>> > >>>> This patch is for the latest trunk GCC. > >>>> > >>>> Please let me know any comments on this? > >>> > >>> I think I've seen this elsewhere — > >> > >> Do you mean that you saw this problem (malloc return NULL during > >> register_frame) previously? > > > > It was probably reported to us (SUSE) from a customer/partner, also > > from a (llvm?) JIT context. > > Okay. > So, that bug has not been resolved yet?
It was resolved from our side IIRC as being a problem in the application with the JIT and otherwise a feature request (better unwind API for this use case, allowing a failure mode). Richard. > Qing > > > >>> the issue is the unwind register API does > >>> not allow for failures but I also think calling abort() is bad. > >> > >> Okay, I see. > >> > >>> > >>> Are you calling this from a JIT context or so? > >> > >> I will ask the bug submitter on this to make sure. > >> > >>> We're assuming that at program > >>> start malloc() will not fail. > >> > >> So, the routine __register_frame is only called at the _start_ of a > >> program? > >>> > >>> The proper solution is probably to add an alternate ABI which has a way > >>> to fail > >>> during registration. > >> > >> Any suggestion on this (or any other routine I can refer to?) > >> > >> > >> Thanks a lot for the help. > >> > >> Qing > >>> > >>> Richard. > >>> > >>>> thanks. > >>>> > >>>> Qing > >>>> > >>>> > >>>> ======================= > >>>> > >>>> Fix two issues in libgcc/unwind-dw2-fde.c: > >>>> > >>>> A. The returned value of call to malloc is not checked against NULL. > >>>> This might cause unpredicted behavior during stack unwinding. > >>>> > >>>> B. Check for null begin parameter (as well as pointer to null) in > >>>> __register_frame_info_table_bases and __register_frame_table. > >>>> > >>>> libgcc/ChangeLog: > >>>> > >>>> * unwind-dw2-fde.c (__register_frame): Check the return value of > >>>> call > >>>> to malloc. > >>>> (__register_frame_info_table_bases): Check for null begin > >>>> parameter. > >>>> (__register_frame_table): Check the return value of call to malloc, > >>>> Check for null begin parameter. > >>>> --- > >>>> libgcc/unwind-dw2-fde.c | 12 ++++++++++++ > >>>> 1 file changed, 12 insertions(+) > >>>> > >>>> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c > >>>> index cdfd3974c99..f0bc29d682a 100644 > >>>> --- a/libgcc/unwind-dw2-fde.c > >>>> +++ b/libgcc/unwind-dw2-fde.c > >>>> @@ -169,6 +169,8 @@ __register_frame (void *begin) > >>>> return; > >>>> > >>>> ob = malloc (sizeof (struct object)); > >>>> + if (ob == NULL) > >>>> + abort (); > >>>> __register_frame_info (begin, ob); > >>>> } > >>>> > >>>> @@ -180,6 +182,10 @@ void > >>>> __register_frame_info_table_bases (void *begin, struct object *ob, > >>>> void *tbase, void *dbase) > >>>> { > >>>> + /* If .eh_frame is empty, don't register at all. */ > >>>> + if ((const uword *) begin == 0 || *(const uword *) begin == 0) > >>>> + return; > >>>> + > >>>> ob->pc_begin = (void *)-1; > >>>> ob->tbase = tbase; > >>>> ob->dbase = dbase; > >>>> @@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct > >>>> object *ob) > >>>> void > >>>> __register_frame_table (void *begin) > >>>> { > >>>> + /* If .eh_frame is empty, don't register at all. */ > >>>> + if (*(uword *) begin == 0) > >>>> + return; > >>>> + > >>>> struct object *ob = malloc (sizeof (struct object)); > >>>> + if (ob == NULL) > >>>> + abort (); > >>>> __register_frame_info_table (begin, ob); > >>>> } > >>>> > >>>> -- > >>>> 2.31.1 > >