While writing an injection point test, I encountered a variant of the race condition that f4083c4 fixed. It had three sessions and this sequence of events:
s1: local-attach to POINT
s2: enter InjectionPointRun(POINT), yield CPU just before injection_callback()
s3: detach POINT, deleting the InjectionPointCondition record
s2: wake up and run POINT as though it had been non-local

On Sat, Mar 16, 2024 at 08:40:21AM +0900, Michael Paquier wrote:
> On Fri, Mar 15, 2024 at 11:23:31AM +0200, Heikki Linnakangas wrote:
> > Wrt. the spinlock and shared memory handling, I think this would be
> > simpler if you could pass some payload in the InjectionPointAttach()
> > call, which would be passed back to the callback function:
> >
> > In this case, the payload would be the "slot index" in shared memory.
> >
> > Or perhaps always allocate, say, 1024 bytes of working area for every
> > attached injection point that the test module can use any way it wants.
> > Like for storing extra conditions, or for the wakeup counter stuff in
> > injection_wait(). A fixed size working area is a little crude, but would
> > be very handy in practice.

That would be one good way to solve it.  (Storing just a slot index has the
same race condition, but storing a struct that contains the PID fixes it.)

The best alternative I see is to keep an InjectionPointCondition forever
after creating it.  Give it a "bool valid" field that we set on detach.  I
don't see a major reason to prefer one of these over the other.  The
working-area idea puts a negligible amount of memory pressure on the main
segment, but it simplifies the module code.  I lean toward the "1024 bytes
of working area" idea.  Other ideas or opinions?

Separately, injection_points_cleanup() breaks the rules by calling
InjectionPointDetach() while holding a spinlock.  The latter has an
elog(ERROR), and reaching that elog(ERROR) leaves a stuck spinlock.  I
haven't given as much thought to solutions for this one.

Thanks,
nm
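
P.S. To make the working-area idea concrete, here is roughly the shape I have
in mind.  This is only a sketch: InjectionPointAttachWithData(), the
two-argument callback signature, and the struct below are hypothetical, not
anything that exists today.

#include "postgres.h"

#include "miscadmin.h"
#include "utils/injection_point.h"

/*
 * Hypothetical API, declared here only for the sketch: attach copies a
 * fixed-size payload into the point's entry, and the callback gets it back.
 */
extern void InjectionPointAttachWithData(const char *name,
                                         const char *library,
                                         const char *function,
                                         const void *data, Size data_size);

typedef struct InjectionPointWorkArea
{
    int         pid;            /* only this backend runs the point */
} InjectionPointWorkArea;

static void
attach_local(const char *name)
{
    InjectionPointWorkArea work;

    work.pid = MyProcPid;
    InjectionPointAttachWithData(name, "injection_points", "injection_error",
                                 &work, sizeof(work));
}

static void
injection_error(const char *name, void *workarea)
{
    InjectionPointWorkArea *work = (InjectionPointWorkArea *) workarea;

    /*
     * The PID lives in the point's own entry, so the callback never consults
     * a separate record that a concurrent detach could have deleted.
     */
    if (work->pid != MyProcPid)
        return;

    elog(ERROR, "error triggered for injection point %s", name);
}

Either way, the point is that the PID the callback consults is tied to the
attached point itself, not to a record that can disappear independently.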
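
The spinlock problem, for reference, is roughly this shape; the shared-state
layout and names below are illustrative, not the module's actual code:

#include "postgres.h"

#include "miscadmin.h"
#include "storage/spin.h"
#include "utils/injection_point.h"

#define INJ_MAX_CONDITION 32    /* illustrative sizing */

typedef struct InjectionPointCondition
{
    char        name[64];       /* injection point name, empty if unused */
    int         pid;            /* backend that local-attached it */
} InjectionPointCondition;

typedef struct InjectionPointSharedState
{
    slock_t     lock;
    InjectionPointCondition conditions[INJ_MAX_CONDITION];
} InjectionPointSharedState;

/* would point into the module's shared memory after startup */
static InjectionPointSharedState *inj_state;

/*
 * The problematic pattern: InjectionPointDetach() contains an elog(ERROR),
 * and error recovery does not release spinlocks, so reaching that ERROR
 * here leaves inj_state->lock stuck forever.
 */
static void
cleanup_with_lock_held(void)
{
    SpinLockAcquire(&inj_state->lock);
    for (int i = 0; i < INJ_MAX_CONDITION; i++)
    {
        InjectionPointCondition *cond = &inj_state->conditions[i];

        if (cond->pid == MyProcPid)
            InjectionPointDetach(cond->name);   /* may ERROR with lock held */
    }
    SpinLockRelease(&inj_state->lock);
}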