Hi Drew,
This fix looks fine to me.  Just so I understand your shutdown model:
you send SIGUSR1 to the processes in the procflow after you've handled
the SIGINT/SIGTERM from kill/^C, right?

Thanks,

-j

On Tue, Feb 10, 2009 at 10:20:25AM -0800, Andrew Wilson wrote:
> Dear OpenSolaris performance gurus,
>    Several bugs related to problems shutting down FileBench have been 
> reported, including the two cited here and one more that I may add to the 
> push. Unfortunately the provided information is a bit sparse, and I can't 
> reproduce them on my test machines. However, an examination of the code 
> does reveal some potential shutdown races that could cause shared memory to 
> be removed before all the child processes have stopped using it, which 
> could lead to core dumps (as reported in these two CRs), or problems 
> accessing locks and semaphores.
>
> So, I have gone ahead and redone the shutdown process to include some 
> locking to stop the races and provide a more orderly shutdown. In the 
> process I have managed to simplify the shutdown code as well. So, I can't 
> be sure I fixed the problems, as I still don't see any dumps or other 
> problems on my test rigs, but it should do a cleaner shutdown, especially 
> of oltp.f, which seems to be the major culprit, and has been responsible 
> for problems with shutting down that I have encountered before. The code 
> changes are available here for your review:
>
> http://cr.opensolaris.org/~dreww/shutdown_fix/
>
> Drew
>
> _______________________________________________
> perf-discuss mailing list
> perf-discuss@opensolaris.org