This is very valuable. In trunk the socket timeout is 60 seconds, and this caused another problem: Ctrl-C waits up to 60 seconds while joining the worker processes. Perhaps we should increase the socket timeout, catch Ctrl+C, and then kill the process instead of joining the workers.
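To make the Ctrl+C idea concrete, here is a rough sketch in Python 3. It is not Rocket's actual shutdown code: `join_with_deadline`, `serve_until_interrupted`, and the `grace` parameter are hypothetical names, and it assumes the workers are `threading.Thread` objects. The idea is to give the workers a short, bounded time to finish and then exit the process outright instead of waiting out the socket timeout.

```python
import os
import threading
import time


def join_with_deadline(workers, grace=2.0):
    """Join all workers, waiting at most `grace` seconds in total.

    Returns the list of workers that are still alive afterwards.
    """
    deadline = time.monotonic() + grace
    for worker in workers:
        remaining = deadline - time.monotonic()
        worker.join(max(0.0, remaining))
    return [w for w in workers if w.is_alive()]


def serve_until_interrupted(workers, grace=2.0):
    """Block until Ctrl+C, then shut down without waiting on stuck workers."""
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        stuck = join_with_deadline(workers, grace)
        if stuck:
            # Assumption: abandoning in-flight responses is acceptable here.
            # os._exit() ends the process immediately, skipping cleanup handlers.
            os._exit(1)
```

With something like this, a longer socket timeout would no longer delay shutdown, because the wait on Ctrl+C is bounded by `grace` rather than by the timeout.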
On Jan 31, 12:16 am, nick name <i.like.privacy....@gmail.com> wrote:
> Ok, the culprit is definitely ignoring exceptions raised in sendall. In my
> humble opinion this is serious enough to be on the 2.0 blocker list.
>
> How to reproduce: you have to have a wsgi worker that produces output in
> parts (that is, returns a list or yields parts as a generator), e.g. use
> web2py's "static" file server (which uses wsgi and does not use the
> FileSystemWorker).
>
> 1. Make sure that a large payload is produced, and that it is made
> of a lot of small parts, e.g. put a 10MB file in
> web2py/applications/welcome/static/file10mb.data (web2py will use 64K parts
> by default).
> 2. Consume the file slowly, e.g. wget --limit-rate=100k
> http://localhost:8000/welcome/static/file10mb.data; this would take 100
> seconds to download the whole file even on localhost.
> 3. Let the file download for 10 seconds, then pause wget (e.g. suspend it by
> using Ctrl-Z on linux/osx).
> 4. Wait 20 seconds.
> 5. Let it continue (e.g. type 'fg' if you suspended it with Ctrl-Z).
> 6. Notice that when it reaches the end, wget will complain about missing
> bytes, reconnect and download the rest of the file (and will be happy with
> it). However, the file will be corrupt: a block (or many blocks) will be
> missing from the middle, and the last few blocks will be repeated (by the
> 2nd wget connection; if you disallow wget from resuming, the file will just
> be shorter).
>
> A better idea of where the problem is can be seen from the following ugly
> patch (applied against web2py's "one file" rocket.py):
>
> @@ -1929,6 +1929,9 @@ class WSGIWorker(Worker):
>                  self.conn.sendall(b('%x\r\n%s\r\n' % (len(data), data)))
>              else:
>                  self.conn.sendall(data)
> +        except socket.timeout:
> +            self.closeConnection = True
> +            print 'Exception lost'
>          except socket.error:
>              # But some clients will close the connection before that
>              # resulting in a socket error.
>
> Running the same experiment with the patched rocket.py will show that files
> get corrupted if 'Exception lost' is printed to web2py's terminal.
>
> Discussion: The only way to use sendall() reliably is to immediately
> terminate the connection upon any error (including timeout), as there is no
> way to know how many bytes were sent. (That there is no way to know how
> many bytes were sent is clearly stated in the documentation; the
> implication that it is impossible to reliably recover from this is not.)
> However, there are sendall() calls all over rocket.py, and some will result
> in additional sendall() calls following a failed sendall(). The worst
> offender seems to be WSGIWorker.write(), but I'm not sure the other
> sendall() calls are safe either.
>
> Temporary workaround: increase SOCKET_TIMEOUT significantly (default is 1
> second; bump to e.g. 10), and do not swallow socket.timeout in
> WSGIWorker.write().
>
> Increasing the chunk size is NOT helpful, because it only changes the
> number of bytes before the first loss (at a given bandwidth), but from that
> point on the problem is the same.
>
> Cross reference: https://github.com/explorigin/Rocket/issues/1#issuecomment-3734231
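To illustrate the rule from the discussion above (after any failed sendall(), stop writing and drop the connection), here is a minimal Python 3 sketch. It is not Rocket's code: the `Connection` class and its `write`/`close` methods are hypothetical, made up only for this example. It relies on the fact that in Python 3 `socket.timeout` and `socket.error` are both subclasses or aliases of `OSError`.

```python
import socket


class Connection:
    """Hypothetical wrapper that refuses to send anything after a failed sendall()."""

    def __init__(self, sock):
        self.sock = sock
        self.closed = False

    def write(self, data):
        """Send `data` in full, or close the connection. Returns True on success."""
        if self.closed:
            # A previous sendall() failed part-way; sending more would leave
            # a hole in the stream, so every later write is refused.
            return False
        try:
            self.sock.sendall(data)
            return True
        except OSError:
            # Covers socket.timeout as well as other socket errors. There is
            # no way to know how many bytes went out, so the stream is dead.
            self.close()
            return False

    def close(self):
        self.closed = True
        try:
            self.sock.close()
        except OSError:
            pass
```

A worker loop built on this would stop iterating over the response parts as soon as `write()` returns False, so a slow or paused client ends up with a truncated response it can detect instead of a silently corrupted one.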