http://marc.theaimsgroup.com/?t=109716966900003&r=1&w=2
And we ought to document that. Any takers to sum this thread up?
-------- Original Message --------
Subject: Bug: Apache hangs if script invokes fork/waitpid
Date: Wed, 6 Oct 2004 11:01:23 -0700
From: Naik, Roshan <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: <[EMAIL PROTECTED]>
The Problem:
---------------
I notice Apache 2 (worker MPM) is not able to correctly handle a fork/waitpid invoked by a script run under mod_perl.
Here is a simple CGI Perl script that reproduces the problem (run it under mod_perl).
    #!/opt/perl/bin/perl
    print "Content-Type: text/plain; charset=euc-jp\n\n";

    $pid = fork();
    if ($pid == 0) {
        sleep(15);
    }
    else {
        waitpid($pid, 0);
    }
Run this CGI Perl script under mod_perl (with the ExecCGI option). The call to fork in the Perl script actually creates a child process that is identical to the Apache process currently handling the request. This situation, I believe, is somewhat different if the CGI script is run under mod_cgi.
The forked child is not exactly identical to its parent, since the forked process has only the worker thread and no other threads (i.e. no listener, none of the other workers, and no main thread waiting on the POD).
Now the funny thing is that once the forked process finishes executing the remainder of the Perl script, it returns to the worker thread, which happily goes back to ap_queue_pop to wait for someone to feed it more requests... instead of terminating. And there is no one to feed it anything. Meanwhile the parent process (the Perl script that invoked fork()) is performing a waitpid() for this child and is thus blocked forever.
This effectively makes one worker thread useless. Invoking a second request sends another worker thread down the drain, and you can continue like this until all worker threads are put into a never-ending wait. If Apache has been constrained by MaxClients to some value N, then N requests will effectively cause Apache to stop responding to further requests.
Essentially the forked Apache process does not know that it is not a real Apache worker process.
Fixing it:
-------------
First solution: the natural fix is to somehow make the forked worker aware that it is not a real worker, so that it does not go back to waiting for more requests once it is done. Essentially, in the worker_thread function we check whether ap_my_pid == getpid() before continuing into the next iteration of the while loop around ap_queue_pop(). ap_my_pid holds the pid of the (real) worker process that forked the child.
    static void * APR_THREAD_FUNC worker_thread(apr_thread_t *thd, void *dummy)
    {
        /* ... snip ... */
        while (!workers_may_exit) {
            /* ... snip ... */
    worker_pop:
            if (workers_may_exit) {
                break;
            }
            /* Break out of the loop if this worker was forked by another worker. */
            if (ap_my_pid != getpid()) {
                break;  /* or apr_thread_exit(...) */
            }
            rv = ap_queue_pop(worker_queue, &csd, &ptrans);
            /* ... snip ... */
        }  /* end while */
        /* ... snip ... */
    }
However, there is a problem with this approach. The connection is closed (by the forked worker) even before the next iteration of the while loop. This causes a problem for the parent, which is blocked in waitpid(): when the child returns control to the parent, the parent can no longer talk to the client, so the parent's core_output_filter prints a "broken pipe" error message in the error log.

It seems that the connection is closed by the worker inside the apr_sendv() call itself! So once the child has sent all of its output, the connection is closed immediately.
I thought it might be a better idea to invoke apr_thread_exit() instead of break, to prevent the forked worker from cleaning up any of the data structures that actually belong to its parent. But apr_thread_exit() does this:

    ...
    apr_pool_destroy(thd->pool);  /* this cleans up stuff that belongs to the parent!! */
    pthread_exit(NULL);
    ...
Second solution: invoke exit(0) (if we are in the forked worker) just after ap_run_handler is invoked by ap_invoke_handler:
    AP_CORE_DECLARE(int) ap_invoke_handler(request_rec *r)
    {
        /* ... snip ... */
        result = ap_run_handler(r);
        if (ap_my_pid != getpid()) {  /* I am a forked worker */
            exit(0);  /* terminate at the earliest possible point after the request was processed */
        }
        /* ... snip ... */
    }
Unfortunately this leaves us with a different (although smaller) problem. The forked child does all of its request processing, but is never given a chance to send any data back to the client (if it has any). The catch-22 is that if we allow it to send all of its data out, then it will close the connection too.
In summary:
--------------------
The second solution seems preferable of the two I have suggested.
If I had to choose between disallowing the parent or the child from sending data back, I would disallow the child. Either way, the parent is only able to write back until the child closes the connection. Since forking is more useful for performing time-consuming background tasks than for performing concurrent writes back to the client, it seems preferable to disallow the child.
Of course, if there is a way to allow both the child and the parent to write back, that would be best. Then we would leave it to the script writers to decide how the parent and child synchronize between themselves in order to avoid garbled output going to the client.
-Roshan
--
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org    http://ticketmaster.com
--
Report problems: http://perl.apache.org/bugs/
Mail list info: http://perl.apache.org/maillist/modperl.html
List etiquette: http://perl.apache.org/maillist/email-etiquette.html