Hi Drew,

This fix looks fine to me. Just so I understand your shutdown model: you send SIGUSR1 to the processes in the procflow after you've handled the SIGINT/SIGTERM from kill/^C, right?
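
To make sure we mean the same thing, here is a minimal sketch of the ordering I have in mind. It is Python rather than FileBench's C, and every name in it is hypothetical: run_and_shutdown is made up, and the anonymous mmap only stands in for the real shared memory region. None of it is taken from your webrev. It picks up at the point where the parent has already handled SIGINT/SIGTERM and only assumes that workers stop on SIGUSR1 and that shared memory is torn down after all of them have been reaped.

```python
import mmap
import os
import signal


def run_and_shutdown(nworkers=3):
    """Fork workers, stop them with SIGUSR1, reap them, then free shared memory."""
    # Block SIGUSR1 before forking. Children inherit the mask, so a signal
    # sent before a child reaches sigwait() stays pending instead of
    # triggering the default action (terminate).
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})

    # Anonymous shared mapping; a stand-in for the real shared memory region.
    shm = mmap.mmap(-1, nworkers)

    pids = []
    for slot in range(nworkers):
        pid = os.fork()
        if pid == 0:
            try:
                # Child: wait for the stop request, touch shared memory one
                # last time, then exit without running the parent's cleanup.
                signal.sigwait({signal.SIGUSR1})
                shm[slot] = 1
                os._exit(0)
            finally:
                # Only reached if something above raised.
                os._exit(1)
        pids.append(pid)

    # Parent: ask every worker to stop...
    for pid in pids:
        os.kill(pid, signal.SIGUSR1)

    # ...and wait for all of them before touching the shared region.
    statuses = [os.waitpid(pid, 0)[1] for pid in pids]

    marks = list(shm[:nworkers])
    shm.close()  # only now is it safe to tear the region down
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return statuses, marks
```

The point of the sketch is the ordering: each worker still writes to the shared region while it is shutting down, so the parent must not release that region until waitpid() has returned for every child.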

Thanks,

-j

On Tue, Feb 10, 2009 at 10:20:25AM -0800, Andrew Wilson wrote:
> Dear OpenSolaris performance gurus,
> Several bugs related to problems shutting down FileBench have been
> reported, including the two cited here and one more that I may add to the
> push. Unfortunately the provided information is a bit sparse, and I can't
> reproduce them on my test machines. However, an examination of the code
> does reveal some potential shutdown races that could cause shared memory to
> be removed before all the child processes have stopped using it, which
> could lead to core dumps (as reported in these two CRs), or problems
> accessing locks and semaphores.
>
> So, I have gone ahead and redone the shutdown process to include some
> locking to stop the races and provide a more orderly shutdown. In the
> process I have managed to simplify the shutdown code as well. So, I can't
> be sure I fixed the problems, as I still don't see any dumps or other
> problems on my test rigs, but it should do a cleaner shutdown, especially
> of oltp.f, which seems to be the major culprit, and has been responsible
> for problems with shutting down that I have encountered before. The code
> changes are available here for your review:
>
> http://cr.opensolaris.org/~dreww/shutdown_fix/
>
> Drew
>
> _______________________________________________
> perf-discuss mailing list
> perf-discuss@opensolaris.org