Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error

Xiaoran Wang Mon, 19 Aug 2024 20:33:18 -0700

Tom Lane <t...@sss.pgh.pa.us> 于2024年8月19日周一 22:12写道：


> Robert Haas <robertmh...@gmail.com> writes:
> > On Mon, Aug 19, 2024 at 2:17 AM Xiaoran Wang <fanfuxiao...@gmail.com>
> wrote:
> >> If the code in PG_TRY contains any non local control flow other than
> >> ereport(ERROR) like goto, break etc., the PG_CATCH or PG_END_TRY cannot
> >> be called, then the PG_exception_stack will point to the memory whose
> >> stack frame has been released. So after that, when the pg_re_throw
> >> called, __longjmp() will crash and report Segmentation fault error.
> >>
> >> Addition to sigjmp_buf, add another field 'int magic' which is next to
> >> the sigjum_buf in the local stack frame memory. The magic's value is
> always
> >> 'PG_exception_magic 0x12345678'. And in 'pg_re_throw' routine, check if
> >> the magic's value is still '0x12345678', if not, that means the memory
> >> where the 'PG_exception_stack' points to has been released, and the
> 'sigbuf'
> >> must be invalid.
>
> > This is an interesting idea. I suspect if we do this we want a
> > different magic number and a different error message than what you
> > chose here, but those are minor details.
>
> I would suggest just adding an Assert; I doubt this check would be
> useful in production.
>

Agree, an Assert is enough.

>
> > I'm not sure how reliable this would be at actually finding problems,
> > though.
>
> Yeah, that's the big problem.  I don't have any confidence at all
> that this would detect misuse.  It'd require that the old stack
> frame gets overwritten, which might not happen for a long time,
> and it'd require that somebody eventually do a longjmp, which again
> might not happen for a long time --- and when those did happen, the
> error would be detected in someplace far away from the actual bug,
> with little evidence remaining to help you localize it.
>

Exactly, it cannot tell you which PG_TRY left the invalid sigjmp_buf,
but to implement that is easy I think, recording the line num maybe.

I think this is still useful, at least  it tell you that the error is due
to the
PG_TRY.

>
> Also, as soon as some outer level of PG_TRY is exited in the proper
> way, the dangling stack pointer will be fixed up.  That means there's
> a fairly narrow time frame in which the overwrite and longjmp must
> happen for this to catch a bug.
>
>
Yes, if the outer level PG_TRY call pg_re_throw instead of ereport,
the dangling stack pointer will be fixed up.  It's would be great to fix
that
up in any case. But I have no idea how to implement that now.

In pg_re_throw, if we could use '_local_sigjmp_buf' instead of the
global var PG_exception_stack, that would be great. We don't
need to worry about the invalid sigjum_buf.

So on the whole I doubt this'd be terribly useful in this form;
> and I don't like the amount of code churn required.
>
> > Personally, I don't think I've ever made this particular mistake. I
> > think a much more common usage error is exiting the catch-block
> > without either throwing an error or rolling back a subtransaction. But
> > YMMV, of course.
>
> We have had multiple instances of code "return"ing out of a PG_TRY,
> so I fully agree that some better way to detect that would be good.
> But maybe we ought to think about static analysis for that.
>
>                         regards, tom lane
>


-- 
Best regards !
Xiaoran Wang

Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error

Reply via email to