This may not be the correct list for this issue; if so, I would appreciate it if someone could forward it to the right one.

I ran into a number of problems with the standard library SocketServer module when implementing a forking TCP server under Python 2.6. I fixed each of them, including some timing problems (e.g. the request socket being closed before the last packet had been read), by overriding or extending methods as needed.

One issue may deserve wider attention, though. It concerns method collect_children(), which ForkingTCPServer inherits from ForkingMixIn. This method makes the following os library call:

    pid, result = os.waitpid(child, options=0)

Under some conditions the method breaks on this line with a message indicating that an unexpected keyword argument "options" was provided. In a continuous run reading thousands of packets from multiple client connections, this line fails at times, but not always. Unfortunately, I did not record the specific conditions that triggered the error, which occurred unpredictably, multiple times. As far as I can tell, os.waitpid() accepts positional arguments only, so the call should fail every time that line is reached; the intermittent failures suggest the line is only reached on some code paths.

To correct the problem, the keyword can be removed so that the line reads:

    pid, result = os.waitpid(child, 0)

This never fails.

Nonetheless, I believe that method collect_children() is too cumbersome as written, and I overrode it with the following simpler strategy. The developers of SocketServer may want to consider it as a replacement for the current collect_children() code.

    def collect_children(self):
        '''Collect Children - Overrides ForkingTCPServer collect_children method.

        The provided method in the SocketServer module does not work properly
        for the intended purpose. This implementation is a complete replacement.

        Each new child process id (pid) is added to list active_children by
        method process_request(). Each time that method creates a new
        connection, it calls this method to clean up child processes.

        Returns: None
        '''
        # Requires: import errno, os, signal, time
        # Iterate over a copy, because entries are removed inside the loop.
        for child in list(self.active_children or []):
            try:
                pid, status = os.waitpid(child, os.WNOHANG)
                if not pid:                         # pid 0: child has not exited yet
                    time.sleep(0.5)                 # give it time to finish
                    os.kill(child, signal.SIGKILL)  # make sure it is dead
                    os.waitpid(child, 0)            # reap it so no zombie is left
                self.active_children.remove(child)  # exited or killed: drop it
            except OSError, err:
                if err.errno == errno.ECHILD:       # child not found; do not report
                    self.active_children.remove(child)
                else:
                    msg = '\tOS error attempting to terminate child process {0}.'
                    self.errlog.warning(msg.format(child))
            except Exception:
                msg = '\tGeneral error attempting to terminate child process {0}.'
                self.errlog.exception(msg.format(child))
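For anyone who wants to check the os.waitpid() behaviour on their own system, here is a minimal sketch (POSIX only, and not taken from SocketServer). The helper names waitpid_rejects_keywords, poll_child and demo are made up for illustration. It assumes that os.waitpid() takes positional arguments only, and it shows that with os.WNOHANG a returned pid of 0 means the child is still running.

```python
import os
import time


def waitpid_rejects_keywords(pid):
    """Return True if os.waitpid() refuses the 'options' keyword."""
    try:
        # WNOHANG keeps this from blocking should the keyword be accepted.
        os.waitpid(pid, options=os.WNOHANG)
    except TypeError:
        return True
    return False


def poll_child(pid):
    """Non-blocking check: None while the child runs, else its wait status."""
    reaped_pid, status = os.waitpid(pid, os.WNOHANG)
    if reaped_pid == 0:
        return None  # the child exists but has not exited yet
    return status


def demo():
    """Fork a short-lived child and probe it (hypothetical test driver)."""
    pid = os.fork()
    if pid == 0:
        # Child process: stay alive briefly, then exit without cleanup.
        time.sleep(0.5)
        os._exit(0)
    rejects = waitpid_rejects_keywords(pid)
    still_running = poll_child(pid) is None  # child is still sleeping
    _, status = os.waitpid(pid, 0)           # blocking wait reaps the child
    return rejects, still_running, status


if __name__ == "__main__":
    print(demo())
```

The first element of the result tells you whether the keyword form is rejected, the second whether the WNOHANG poll reported a running child, and the third is the wait status from the final blocking call.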
Things to note about the replacement method are:

1. Using os.WNOHANG instead of 0 as the options value passed to os.waitpid, so the call never blocks.
2. Checking for a returned pid of 0, which with os.WNOHANG means the child has not exited yet.
3. Calling os.kill on such a child after a time.sleep(0.5), which allows dependent processes to complete their job before the request is shut down completely.
4. Logging OS errors as warnings, except errno.ECHILD (the child no longer exists), which is not reported; any other exception is logged with a traceback.

This is more succinct code and does the job. At least it does for me.

Thanks,
Boris
-- http://mail.python.org/mailman/listinfo/python-list