On Tue, 03 Jul 2007 08:54:25 -0700, Samuel <[EMAIL PROTECTED]> wrote:
>On Jul 3, 3:03 pm, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
>> EPIPE results when writing to a socket for which writing has been shutdown.
>> This most commonly occurs when the socket has closed. You need to handle
>> this exception, since you can't absolutely prevent the socket from being
>> closed.
>
>The exception is already caught and logged, but this is really not
>good enough. By "handling this exception", do you mean that there is a
>way to handle it such that the connection still works? I found some
>code that attempts to retry when SIGPIPE was received, but this only
>results in the same error all over again.

No, the exception indicates the connection is gone. There is no way to
continue to transfer data using it.

>Why can this not be prevented (in the general case)? Unless something
>fancy happened, what can cause the socket to close? Looking at the raw
>data received by the connected host, the connection gets lost in
>mid-stream; I can not see anything that might cause the remote side to
>close the connection (in which case I'd expect a "connection reset by
>peer" or something).

It is the nature of TCP/IP that the connection might disappear at any
moment. The reasons for this vary from someone explicitly calling the
close or shutdown API to a wire being unplugged somewhere between the two
communicating hosts to a traffic event which results in there being
insufficient physical resources to satisfy your particular connection,
and so on and so on.

It may be the case that whatever is causing your connection to be dropped
is entirely avoidable (I can't say, since I don't know what is causing
your connection to be dropped), but all of these other causes are not
avoidable, and your program might encounter one of them someday.

>
>> There might be some other change which would be appropriate, though,
>> if it is the case that something your application is doing is causing the
>> socket to be closed (for example, sending a message which the remote side
>> decides is invalid and causing it to close the socket explicitly from its
>> end).
>
>The program is doing the same thing repeatedly and it works 95% of the
>time, so I am fairly sure that nothing special is sent.
>
>> It's difficult to make any specific suggestions in that area without
>> knowing exactly what your program does.
>
>Unfortunately the application is rather complex and a simple test case
>is not possible.

I used to bother to spend days or weeks trying to track down a subtle bug
in a complex system, but I don't anymore. ;) It's much better to spend
that time simplifying the software which exhibits the problem until it is
simple enough to understand and make the bug obvious. Unit testing and
test-driven development have the advantage of pressuring you to write code
which is already split into simple enough pieces that this is usually a
relatively painless process. For systems not written with this in mind,
it can be quite unpleasant to produce a simple example, but it's
ultimately still worthwhile.

>
>Basically, it creates a number of daemon threads, each of which
>creates a (thread local, non-shared) instance of telnetlib and
>connects to a remote host. Are there any special conditions that must
>be taken care of when opening a number of sockets in threads? (The
>code runs on AIX 4.1, where Python supports native OS threads.)

Oops. Threads. So there's a million possible bugs. Oops AIX, heh, that
probably introduces another million. I don't know of anything
specifically broken in Python tied to telnetlib and threading on AIX, no,
but that just leaves you with the usual suspects.

Since I don't have any further specific advice to give you in tracking
down this problem, maybe I'll just recommend that you take a look at
Twisted, which has a better (although probably somewhat harder to grasp)
telnet library, and will let you manage numerous connections without
threads. (Of course, if you have a large existing system then switching
to something as drastically different as Twisted might not be an option,
but it doesn't hurt to suggest it.)

Jean-Paul
-- 
http://mail.python.org/mailman/listinfo/python-list
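
P.S. To make "handling this exception" concrete, here is a minimal sketch,
not taken from the application discussed above. It treats EPIPE (and
ECONNRESET) as "this connection is gone", discards the socket, and opens a
new one instead of retrying on the dead one. The names send_or_reconnect,
connect and DEAD_CONNECTION_ERRNOS are made up for illustration, and the
syntax assumes a current Python 3 rather than the 2007-era interpreter.

```python
import errno
import socket

# errno values treated as "the connection is gone". This set is an
# assumption for illustration and is not exhaustive.
DEAD_CONNECTION_ERRNOS = {errno.EPIPE, errno.ECONNRESET}


def send_or_reconnect(sock, data, connect):
    """Send data on sock; if the connection is dead, use a fresh one.

    connect is a hypothetical zero-argument callable that returns a newly
    connected socket. The socket that is usable afterwards is returned.
    """
    try:
        sock.sendall(data)
        return sock
    except OSError as exc:
        if exc.errno not in DEAD_CONNECTION_ERRNOS:
            raise  # some other failure; let the caller see it
        # The old socket cannot be revived, and retrying on it would only
        # raise the same error again, so drop it and start over.
        sock.close()
        new_sock = connect()
        new_sock.sendall(data)
        return new_sock
```

Whether resending on a new connection is appropriate depends on the
protocol. For a telnet session the login and any other session state
would have to be re-established first, and some of the earlier data may
already have reached the remote side before the connection was lost.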