Re: [HACKERS] Strange Windows problem, lock_timeout test request

Boszormenyi Zoltan Sun, 17 Mar 2013 01:01:30 -0700

2013-03-17 04:48 keltezéssel, Tom Lane írta:

Boszormenyi Zoltan <z...@cybertec.at> writes:

  [ 2-lock_timeout-v37.patch ]

Applied after a fair amount of additional hacking.


Thank you, thank you, thank you! :-)

I was disappointed to find that the patch introduced a new race
condition into timeout.c, or at least broke a safety factor that had
been there.  The argument why enable_timeout() could skip disabling
the timer interrupt on entry, if num_active_timeouts is zero, doesn't
work for enable_timeouts(): as soon as you've incremented
num_active_timeouts for the first new timeout, the signal handler would
mess around with the data structure if it were to fire.  What I did
about that was to modify disable_alarm() to forcibly disable the
interrupt if we're adding multiple timeouts in enable_timeouts(), even
if we think no interrupt is pending.  This might be overly paranoid,
but since all of this is new code for 9.3 and hasn't been through any
beta testing, I felt it best to preserve that safety feature.  We can
revisit it later if it proves to be an issue.  (It's conceivable for
instance that we could avoid incrementing the stored value of
num_active_timeouts until we're done adding all the new timeouts;
but that seemed pretty messy.)


Your reasoning seems to be correct. However, if we take it to the
extreme, enable_timeout_at/enable_timeout_after/enable_timeouts
should also disable the interrupt handler at the beginning of the
function and enabled at the end with pqsignal(). An evil soul may
send SIGALRM externally to the backend processes at the proper
moment and create a race condition while enable_timeout*() is in
progress and the itimer is about to trigger at the same time (but
doesn't since it was disabled).

   For the current usage pattern I'm not
too sure that it matters anyway: a savings is only possible if you
have enabled lock_timeout but not statement_timeout, and I'm doubtful
that that will be a common use-case.


I am not too sure about it. With lock_timeout in place, code migrated
from Informix, MSSQL, etc. can have the same semantic behaviour.


I also rearranged the handling of the LOCK_TIMEOUT interrupt some more,
since I didn't see a need for it to be different from STATEMENT_TIMEOUT,
and got rid of some non-C89 coding practices that didn't seem to me to
be improving readability anyway.


Thanks for that, too. I was too blind to see that choice, even after
thinking about why the statement_timeout cannot be done from
SIGALRM context and why does the code need to also send SIGINT
to the process group. (To kill children, too, like scripts executed via
system()...)

Thanks again,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Strange Windows problem, lock_timeout test request

Reply via email to