Bugs item #756104, was opened at 2003-06-17 15:25
Message generated for change (Comment added) made by facundobatista
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=756104&group_id=5470

>Category: Documentation
>Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Alex R.M. Turner (plexq)
Assigned to: Nobody/Anonymous (nobody)
Summary: Calling socket.recv() with a large number breaks

Initial Comment:
I have a backup script that calls socket.recv() passing the amount of data that is left to get for a given file as the argument. For very large files (I have one here that is 1.9Gig) it seems to break horribly after receiving about 130Meg. I have tried to track it down precisely with basic debugging (putting in print statements in the python code) but it just seems to freak out around the 130meg mark, and it just doesn't make sense. If I change the argument passed to recv to 32767 it works just fine. I have attached the loop that reads data from the socket (in a working state; the bad code is commented out).

----------------------------------------------------------------------

>Comment By: Facundo Batista (facundobatista)
Date: 2005-01-15 16:07

Message:
Logged In: YES
user_id=752496

This is a documentation bug; something the user should be "warned" about. This caught me once, and two different people asked about this in #python, so maybe we should put something like the following in the recv() docs.

"""
For best match with hardware and network realities, the value of "buffer" should be a relatively small power of 2, for example, 4096.
"""

If you think the wording is right, just assign the bug to me, I'll take care of it.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-09-13 11:11

Message:
Logged In: YES
user_id=31435

As Jeremy implied at the start, someone needs to demonstrate that "the bug" is actually in Python, not in your platform's implementation of sockets. If a C program displays the same behavior, your complaint is with your platform socket implementation. Sockets are low-level gimmicks, which is why Jeremy expected a C program to fail the same way.
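
A minimal sketch of the chunked-read workaround described above, that is, passing recv() a small value such as 32767 or 4096 instead of the full amount still expected. The loop attached to the original report is not reproduced in this message, so this is only an illustration: the names recv_into_file and CHUNK_SIZE are hypothetical, and it assumes the total size is known in advance and that the data is written straight to a file-like object.

```python
import socket

# Hypothetical default, following the "relatively small power of 2" advice.
CHUNK_SIZE = 4096


def recv_into_file(sock, out, total, chunk_size=CHUNK_SIZE):
    """Copy exactly `total` bytes from `sock` into the file-like `out`.

    recv() is never asked for more than `chunk_size` bytes at a time,
    so no large buffer is requested from the platform.
    """
    remaining = total
    while remaining > 0:
        # recv() may return fewer bytes than requested; loop until done.
        data = sock.recv(min(chunk_size, remaining))
        if not data:
            # An empty result means the peer closed the connection.
            raise EOFError(
                "connection closed with %d bytes still expected" % remaining
            )
        out.write(data)
        remaining -= len(data)
```

Because each chunk is written out as soon as it arrives, memory use stays near chunk_size regardless of the file size.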

----------------------------------------------------------------------

Comment By: Dmitry Dvoinikov (targeted)
Date: 2004-09-13 08:17

Message:
Logged In: YES
user_id=1120792

I've also been hit by this problem, not at a 130Meg read, but at a mere 10Meg. Because of that I had to change my code to read many small chunks rather than a single blob, and that fixed it.

Still, I disagree with tim_one. If recv() is limited in the amount of data it can read per call, the limit should be set and documented; otherwise it is a bug and the call is unreliable. Nobody has to follow the described "best practice" of reading small chunks; it actually worsens code quality, and what's worse, it makes other stuff break.

Example: I was using the SimpleXMLRPCServer (Lib/SimpleXMLRPCServer.py), and it contains the following line at its heart:

    data = self.rfile.read(int(self.headers["content-length"]))

Guess what was happening with requests with content-length larger than 10Meg (in my case).

I also thought the problem was with max() instead of min(), but as tim_one rightfully pointed out, this was the desired behaviour. Ironically though, replacing max() with min() fixes the problem, as there are no more huge reads at the lower level.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-07-26 21:41

Message:
Logged In: YES
user_id=31435

potterru, I don't believe plexq was using _fileobject.read() -- he said he was using socket.recv(), and all the comments later were consistent with that.

The code you found does appear to *intend* max(): code following the snippet you quoted clearly expects that it may have read more than "left" bytes, and it would not be worrying about that had min() been intended. I agree the code is pretty inscrutable regardless, but we'd have to ask Guido why he wrote it that way.
In any case, since this bug report is about socket.recv(), if you want to make a case that _fileobject.read() is buggy you should open a new report for that.

----------------------------------------------------------------------

Comment By: Igor E. Poteryaev (potterru)
Date: 2004-07-26 08:23

Message:
Logged In: YES
user_id=48596

It looks like a bug in module socket.py, in the _fileobject.read method.

    ...
    while True:
        left = size - buf_len
        recv_size = max(self._rbufsize, left)
        data = self._sock.recv(recv_size)

This code should read no more than *left* or *buffer size* bytes, so it should be min instead of max!

Should I file a bug/patch for this?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-06-18 15:51

Message:
Logged In: YES
user_id=31435

I assume you're running on Linux. Python's implementation of recv asks the platform malloc for a buffer of the size requested, and raises an exception if malloc returns NULL. Unfortunately, malloc on Linux has a nasty habit of returning non-NULL even if there's no chance you can actually use the amount of memory requested. There's really nothing Python can do about that.

So back to Jeremy's comment: try the same thing in C. If you get ridiculous behavior there too, it's a platform C/OS bug, and Python won't be able to hide it from you.

----------------------------------------------------------------------

Comment By: Alex R.M. Turner (plexq)
Date: 2003-06-18 15:22

Message:
Logged In: YES
user_id=60034

That may be - it might be worth putting a suggested maximum in the docs. However, I would say that given that an IPv6 packet could be as large as 2Gig, it's not unreasonable to ask for a packet as large as 1 Gig. Whether the problem is in glibc or python I don't know, although it seems that asking for a buffer of 1.3 Gig in size, and passing that to recv(), would be odd behaviour on a current system in C, given that most systems couldn't allocate that much memory to a buffer ;).
I have written fairly extensive socket code in C/C++ before, and I never used anything larger than 65536 for the obvious reason that you can't receive anything bigger than that in IPv4 (and most NICs can't handle anything that big either). I figured it would be interesting to see what happened :). I have a penchant for being the only person in history to do quite a few things apparently!

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-06-18 13:29

Message:
Logged In: YES
user_id=31435

BTW, if you're new to socket programming, do read Python's socket programming HOWTO:

http://www.amk.ca/python/howto/sockets/

I expect you're the only person in history to try passing such a large value to recv() -- even if it worked, you'd almost certainly run out of memory trying to allocate buffer space for 1.9GB. Sockets are a low-level facility, and it's common to pass a relatively small power of 2 (for best match with hardware and network realities).

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2003-06-18 13:02

Message:
Logged In: YES
user_id=31392

What happens when you write a C program that does the same thing? I expect you'll see similar problems.

----------------------------------------------------------------------

_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com