Xiaoran Wang <fanfuxiao...@gmail.com> 于2024年8月20日周二 11:32写道:
> > > Tom Lane <t...@sss.pgh.pa.us> 于2024年8月19日周一 22:12写道: > >> Robert Haas <robertmh...@gmail.com> writes: >> > On Mon, Aug 19, 2024 at 2:17 AM Xiaoran Wang <fanfuxiao...@gmail.com> >> wrote: >> >> If the code in PG_TRY contains any non local control flow other than >> >> ereport(ERROR) like goto, break etc., the PG_CATCH or PG_END_TRY cannot >> >> be called, then the PG_exception_stack will point to the memory whose >> >> stack frame has been released. So after that, when the pg_re_throw >> >> called, __longjmp() will crash and report Segmentation fault error. >> >> >> >> Addition to sigjmp_buf, add another field 'int magic' which is next to >> >> the sigjum_buf in the local stack frame memory. The magic's value is >> always >> >> 'PG_exception_magic 0x12345678'. And in 'pg_re_throw' routine, check if >> >> the magic's value is still '0x12345678', if not, that means the memory >> >> where the 'PG_exception_stack' points to has been released, and the >> 'sigbuf' >> >> must be invalid. >> >> > This is an interesting idea. I suspect if we do this we want a >> > different magic number and a different error message than what you >> > chose here, but those are minor details. >> >> I would suggest just adding an Assert; I doubt this check would be >> useful in production. >> > > Agree, an Assert is enough. > >> >> > I'm not sure how reliable this would be at actually finding problems, >> > though. >> >> Yeah, that's the big problem. I don't have any confidence at all >> that this would detect misuse. It'd require that the old stack >> frame gets overwritten, which might not happen for a long time, >> and it'd require that somebody eventually do a longjmp, which again >> might not happen for a long time --- and when those did happen, the >> error would be detected in someplace far away from the actual bug, >> with little evidence remaining to help you localize it. >> > > Exactly, it cannot tell you which PG_TRY left the invalid sigjmp_buf, > but to implement that is easy I think, recording the line num maybe. > > I think this is still useful, at least it tell you that the error is due > to the > PG_TRY. > >> >> Also, as soon as some outer level of PG_TRY is exited in the proper >> way, the dangling stack pointer will be fixed up. That means there's >> a fairly narrow time frame in which the overwrite and longjmp must >> happen for this to catch a bug. >> >> > Yes, if the outer level PG_TRY call pg_re_throw instead of ereport, > the dangling stack pointer will be fixed up. It's would be great to fix > that > up in any case. But I have no idea how to implement that now. > > Sorry, "Yes, if the outer level PG_TRY call pg_re_throw instead of ereport, " should be "Yes, if the outer level PG_TRY reset the PG_exception_stack." In pg_re_throw, if we could use '_local_sigjmp_buf' instead of the > global var PG_exception_stack, that would be great. We don't > need to worry about the invalid sigjum_buf. > > So on the whole I doubt this'd be terribly useful in this form; >> and I don't like the amount of code churn required. >> >> > Personally, I don't think I've ever made this particular mistake. I >> > think a much more common usage error is exiting the catch-block >> > without either throwing an error or rolling back a subtransaction. But >> > YMMV, of course. >> >> We have had multiple instances of code "return"ing out of a PG_TRY, >> so I fully agree that some better way to detect that would be good. >> But maybe we ought to think about static analysis for that. >> >> regards, tom lane >> > > > -- > Best regards ! > Xiaoran Wang > -- Best regards ! Xiaoran Wang