Let me (rather shamelessly) extract a couple of patches from the patch set that was already shared in the fault injection framework proposal [1].
The first patch incorporates a new syntax in the isolation spec grammar to explicitly mark a step that is expected to block (due to reasons other than locks). E.g.

    permutation step1 step2& step3

The “&” suffix indicates that step2 is expected to block and that isolationtester should move on to step3 without waiting for step2 to finish.

The second patch implements the insert-conflict scenario that is being discussed here - one session waits (using a “suspend” fault) after inserting a tuple into the heap relation but before updating the index. Another session concurrently inserts a conflicting tuple in the heap and the index, and commits. Then the fault is reset so that the blocked session resumes and detects the conflict when updating the index.

> On 25-Aug-2020, at 9:34 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> I wrote:
>> I've spent the day fooling around with a re-implementation of
>> isolationtester that waits for all its controlled sessions to quiesce
>> (either wait for client input, or block on a lock held by another
>> session) before moving on to the next step. That was not a feasible
>> approach before we had the wait_event infrastructure, but it's
>> seeming like it might be workable now. Still have a few issues to
>> sort out though ...
>
> I wasted a good deal of time on this idea, and eventually concluded
> that it's a dead end, because there is an unremovable race condition.
> Namely, that even if the isolationtester's observer backend has
> observed that test session X has quiesced according to its
> wait_event_info, it is possible for the report of that fact to arrive
> at the isolationtester client process before test session X's output
> does.

The attached test evades this race condition by not depending on any output from the blocked session X. It queries the status of the injected fault to ascertain that a specific point in the code was reached during execution.

> I think what we have to do to salvage this test is to get rid of the
> use of NOTICE outputs, and instead have the test functions insert
> log records into some table, which we can inspect after the fact
> to verify that things happened as we expect.
>

+1 to getting rid of NOTICE outputs.

Please refer to https://github.com/asimrp/postgres/tree/faultinjector for the full patch set proposed in [1] that is now rebased against the latest master.

Asim

[1] https://www.postgresql.org/message-id/flat/CANXE4Tc%2BRYRC48%3DdKYn1PvAjE26Ew4hh%3DXUjBRGj%3DJ9eob-S6g%40mail.gmail.com#cd02fa3b461102e97bcdc97e62dcc6d3

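For illustration only, and not code from the patches: a minimal Python sketch of how a tester could interpret the “&” suffix in a permutation line such as the one quoted for the first patch. The names PermutationStep, parse_permutation and plan are hypothetical, and the sketch assumes unquoted, whitespace-separated step names as written in the example. The actual change lives in the isolation spec grammar and isolationtester, which are written in C.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PermutationStep:
    """One step of a permutation (hypothetical type, for illustration)."""
    name: str
    expect_block: bool  # True when the step was written with a trailing "&"


def parse_permutation(line: str) -> List[PermutationStep]:
    """Split a 'permutation ...' line into steps, noting any '&' suffix."""
    tokens = line.split()
    if not tokens or tokens[0] != "permutation":
        raise ValueError("expected a line starting with 'permutation'")
    steps = []
    for token in tokens[1:]:
        expect_block = token.endswith("&")
        name = token[:-1] if expect_block else token
        if not name:
            raise ValueError("'&' must directly follow a step name")
        steps.append(PermutationStep(name, expect_block))
    return steps


def plan(steps: List[PermutationStep]) -> List[str]:
    """Describe, in words, what a tester would do for each step."""
    actions = []
    for step in steps:
        if step.expect_block:
            # Step is declared as blocking: launch it and move on.
            actions.append(f"send {step.name}, do not wait")
        else:
            actions.append(f"send {step.name}, wait for completion")
    return actions


if __name__ == "__main__":
    for action in plan(parse_permutation("permutation step1 step2& step3")):
        print(action)
```

The point of the explicit marker is that the tester does not have to infer from lock waits that step2 is stuck; the spec author states it up front.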
0001-Add-syntax-to-declare-a-step-that-is-expected-to-blo.patch
0002-Speculative-insert-isolation-test-spec-using-fault-i.patch