Hi, I would like to update something about this idea. I attached a new patch 0003-Imporve-pg_re_throw-check-if-sigjmp_buf-is-valid-and.patch. Not too many updates in it: - replace the 'ereport' with Assert - besides checking the PG_exception_stack->magic, also check the address of PG_exception_stack, if it is lower than current stack, it means that it is an invalid longjmp_buf.
There are 2 other things I would like to update with you: - As you are concerned with that this method is not reliable as the PG_exception_stack.magic might not be rewritten even if the longjmp_buf is not invalid anymore. I have tested hat, you are right, it is not reliable. I tested it with the flowing function on my MacOS: ----------------------- wrong_pg_try(int i) { if (i == 100) ereport(ERROR,(errmsg("called wrong_pg_try"))); if ( i == 0) { PG_TRY(); { return; } PG_CATCH(); { } PG_END_TRY(); } else wrong_pg_try(i + 1) + j; } ------------------------ First call wrong_pg_try(0); then call wrong_pg_try(1); It didn't report any error. I found that is due to the content of PG_exception_stack is not rewritten. There is no variable saved on the "wrong_pg_try()" function stack, but the stacks of the two call are not continuous, looks they are aligned. Sure there are other cases that the PG_exception_stack.magic would not be rewritten - More details about the case that segmentfault occurs from __longjmp. I have a signal function added in PG, in it the PG_TRY called and returned. Then it left a invalid sigjmp_buf. It is a custom signal function handler, can be triggered by another process. Then another sql was running and then interrupted it. At that time, ProcessInterrupts ->ereport->pg_re_throw will called, it crashed Xing Guo <higuox...@gmail.com> 于2024年8月20日周二 22:21写道: > Hi > > On Mon, Aug 19, 2024 at 10:12 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > > > > We have had multiple instances of code "return"ing out of a PG_TRY, > > so I fully agree that some better way to detect that would be good. > > But maybe we ought to think about static analysis for that. > > I have some static analysis scripts for detecting this kind of problem > (of mis-using PG_TRY). Not sure if my scripts are helpful here but I > would like to share them. > > - A clang plugin for detecting unsafe control flow statements in > PG_TRY. > https://github.com/higuoxing/clang-plugins/blob/main/lib/ReturnInPgTryBlockChecker.cpp > - Same as above, but in CodeQL[^1] script. > https://github.com/higuoxing/postgres.ql/blob/main/return-in-PG_TRY.ql > - A CodeQL script for detecting the missing of volatile qualifiers > (objects have been changed between the setjmp invocation and longjmp > call should be qualified with volatile). > https://github.com/higuoxing/postgres.ql/blob/main/volatile-in-PG_TRY.ql > > Andres also has some compiler hacking to detect return statements in > PG_TRY[^2]. > > [^1]: https://codeql.github.com/ > [^2]: > https://www.postgresql.org/message-id/20230113054900.b7onkvwtkrykeu3z%40awork3.anarazel.de > -- Best regards ! Xiaoran Wang
0003-Imporve-pg_re_throw-check-if-sigjmp_buf-is-valid-and.patch
Description: Binary data