> On Feb 6, 2025, at 04:36, Richard Biener <richard.guent...@gmail.com> wrote: > > On Wed, Feb 5, 2025 at 4:40 PM Qing Zhao <qing.z...@oracle.com> wrote: >> >> >> >>> On Feb 5, 2025, at 07:46, Richard Biener <richard.guent...@gmail.com> wrote: >>> >>> On Tue, Feb 4, 2025 at 10:12 PM Qing Zhao <qing.z...@oracle.com> wrote: >>>> >>>> Hi, >>>> >>>> One of our big application segv in libgcc code while unwinding the stack. >>>> >>>> This is a random crash while the application throws a c++ exception and >>>> unwinds the stack. Incidents are random and never can be reproduced by any >>>> test case. >>>> >>>> The libgcc that is used is based on GCC 8. >>>> >>>> Fortunately, we got some information through the stack trace, and narrowed >>>> down the crash is in the routine "init_object" of libgcc/unwind-dw2-pde.c: >>>> >>>> 777 count = classify_object_over_fdes (ob, ob->u.single); >>>> >>>> when loading the 2nd parameter of the call to "classify_object_over_fdes". >>>> i.e, the address that is pointed by "ob->u.single" is invalid. >>>> >>>> And the outer caller we can track back is the following line 1066 of the >>>> routine >>>> "_Unwind_Find_FDE": >>>> >>>> 1060 /* Classify and search the objects we've not yet processed. */ >>>> 1061 while ((ob = unseen_objects)) >>>> 1062 { >>>> 1063 struct object **p; >>>> 1064 >>>> 1065 unseen_objects = ob->next; >>>> 1066 f = search_object (ob, pc); >>>> >>>> Then we suspected that the construction of the static variable >>>> "unseen_objects" >>>> might have some bug. >>>> >>>> Then after studying the source code that constructs "unseen_objects" in >>>> libgcc/unwind-dw2-pde.c, I found an issue in the routines >>>> "__register_frame" >>>> and "__register_frame_table" as following: >>>> >>>> 127 void >>>> 128 __register_frame (void *begin) >>>> 129 { >>>> 130 struct object *ob; >>>> 131 >>>> 132 /* If .eh_frame is empty, don't register at all. */ >>>> 133 if (*(uword *) begin == 0) >>>> 134 return; >>>> 135 >>>> 136 ob = malloc (sizeof (struct object)); >>>> 137 __register_frame_info (begin, ob); >>>> 138 } >>>> >>>> 181 void >>>> 182 __register_frame_table (void *begin) >>>> 183 { >>>> 184 struct object *ob = malloc (sizeof (struct object)); >>>> 185 __register_frame_info_table (begin, ob); >>>> 186 } >>>> 187 >>>> >>>> In the above, one obvious issue in the source code is at line 136, line >>>> 137, >>>> and line 184 and line 185: the return value of call to "malloc" is not >>>> checked >>>> against NULL before it was passed as the second parameter of the follwoing >>>> call. >>>> >>>> This might cause unpredicted random behavior. >>>> >>>> I checked the latest trunk GCC libgcc, the same issue is there too. >>>> >>>> This patch is for the latest trunk GCC. >>>> >>>> Please let me know any comments on this? >>> >>> I think I've seen this elsewhere — >> >> Do you mean that you saw this problem (malloc return NULL during >> register_frame) previously? > > It was probably reported to us (SUSE) from a customer/partner, also > from a (llvm?) JIT context.
Okay. So, that bug has not been resolved yet? Qing > >>> the issue is the unwind register API does >>> not allow for failures but I also think calling abort() is bad. >> >> Okay, I see. >> >>> >>> Are you calling this from a JIT context or so? >> >> I will ask the bug submitter on this to make sure. >> >>> We're assuming that at program >>> start malloc() will not fail. >> >> So, the routine __register_frame is only called at the _start_ of a program? >>> >>> The proper solution is probably to add an alternate ABI which has a way to >>> fail >>> during registration. >> >> Any suggestion on this (or any other routine I can refer to?) >> >> >> Thanks a lot for the help. >> >> Qing >>> >>> Richard. >>> >>>> thanks. >>>> >>>> Qing >>>> >>>> >>>> ======================= >>>> >>>> Fix two issues in libgcc/unwind-dw2-fde.c: >>>> >>>> A. The returned value of call to malloc is not checked against NULL. >>>> This might cause unpredicted behavior during stack unwinding. >>>> >>>> B. Check for null begin parameter (as well as pointer to null) in >>>> __register_frame_info_table_bases and __register_frame_table. >>>> >>>> libgcc/ChangeLog: >>>> >>>> * unwind-dw2-fde.c (__register_frame): Check the return value of call >>>> to malloc. >>>> (__register_frame_info_table_bases): Check for null begin parameter. >>>> (__register_frame_table): Check the return value of call to malloc, >>>> Check for null begin parameter. >>>> --- >>>> libgcc/unwind-dw2-fde.c | 12 ++++++++++++ >>>> 1 file changed, 12 insertions(+) >>>> >>>> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c >>>> index cdfd3974c99..f0bc29d682a 100644 >>>> --- a/libgcc/unwind-dw2-fde.c >>>> +++ b/libgcc/unwind-dw2-fde.c >>>> @@ -169,6 +169,8 @@ __register_frame (void *begin) >>>> return; >>>> >>>> ob = malloc (sizeof (struct object)); >>>> + if (ob == NULL) >>>> + abort (); >>>> __register_frame_info (begin, ob); >>>> } >>>> >>>> @@ -180,6 +182,10 @@ void >>>> __register_frame_info_table_bases (void *begin, struct object *ob, >>>> void *tbase, void *dbase) >>>> { >>>> + /* If .eh_frame is empty, don't register at all. */ >>>> + if ((const uword *) begin == 0 || *(const uword *) begin == 0) >>>> + return; >>>> + >>>> ob->pc_begin = (void *)-1; >>>> ob->tbase = tbase; >>>> ob->dbase = dbase; >>>> @@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct >>>> object *ob) >>>> void >>>> __register_frame_table (void *begin) >>>> { >>>> + /* If .eh_frame is empty, don't register at all. */ >>>> + if (*(uword *) begin == 0) >>>> + return; >>>> + >>>> struct object *ob = malloc (sizeof (struct object)); >>>> + if (ob == NULL) >>>> + abort (); >>>> __register_frame_info_table (begin, ob); >>>> } >>>> >>>> -- >>>> 2.31.1