Inada Naoki <songofaca...@gmail.com> added the comment:
issue1296004 is too old (512MB RAM machine!) and I cannot confirm it. But I think it was caused by inefficient realloc() in the CRT.

See https://github.com/python/cpython/blob/6c52d76db8b1cb836c136bd6a1044e85bfe8241e/Lib/socket.py#L298-L303

_fileobject called socket.recv with the remaining size. Typically, a socket can't return several MBs at once, so this causes the following loop:

1. A large string (bytes) of at most `amt` bytes (some MBs) is allocated. (malloc)
2. recv is called.
3. _PyString_Resize() (realloc) is called to shrink the string to the bytes actually received (typically ~128KiB).
4. amt -= received
5. If amt == 0, stop; otherwise go to 1.

This might stress malloc and realloc in the CRT. It caused fragmentation and MemoryError.

---

By now, almost everything has been rewritten.

In the case of _pyio, BufferedReader calls RawIOBase.read(), which copies from a bytearray to bytes. So only malloc and free are called, and the stress on realloc is reduced.

https://github.com/python/cpython/blob/50866e9ed3e4e0ebb60c20c3483a8df424c02722/Lib/_pyio.py#L586-L591

The C _io module is more efficient. It allocates a PyBytes object once and calls SocketIO.readinto directly. No temporary bytes objects are created.

https://github.com/python/cpython/blob/50866e9ed3e4e0ebb60c20c3483a8df424c02722/Modules/_io/bufferedio.c#L1632
https://github.com/python/cpython/blob/50866e9ed3e4e0ebb60c20c3483a8df424c02722/Modules/_io/bufferedio.c#L1658
https://github.com/python/cpython/blob/50866e9ed3e4e0ebb60c20c3483a8df424c02722/Modules/_io/bufferedio.c#L1470

----------
type:  -> performance
versions: +Python 3.8 -Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36050>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
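
The two read strategies contrasted in the comment above can be sketched roughly as follows. This is only an illustration, not the actual code of socket.py or the _io module: the function names read_exact_chunks, read_exact_into and demo are hypothetical, and socket.socketpair() stands in for a real network connection. The first function requests the remaining size on every recv() call, which is the pattern the comment suspects of stressing malloc/realloc; the second allocates one buffer up front and fills it in place with recv_into().

```python
import socket
import threading


def read_exact_chunks(sock, amt):
    """Old-style pattern: ask recv() for the whole remaining size each time."""
    chunks = []
    remaining = amt
    while remaining > 0:
        # The request size is `remaining`, but the socket usually returns
        # far less, so a large request yields only a small bytes object.
        chunk = sock.recv(remaining)
        if not chunk:  # peer closed the connection early
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_exact_into(sock, amt):
    """Newer pattern: allocate one buffer, then fill it with recv_into()."""
    buf = bytearray(amt)  # single allocation for the whole read
    view = memoryview(buf)
    received = 0
    while received < amt:
        # Write directly into the unused tail of the buffer.
        n = sock.recv_into(view[received:])
        if n == 0:  # peer closed the connection early
            break
        received += n
    return bytes(view[:received])


def demo(size=1 << 20):
    """Send `size` bytes twice over a socket pair and read them both ways."""
    a, b = socket.socketpair()
    payload = bytes(range(256)) * (size // 256)

    def send_all():
        a.sendall(payload + payload)
        a.close()

    # Send from a separate thread so sendall() cannot block the reader.
    sender = threading.Thread(target=send_all)
    sender.start()
    first = read_exact_chunks(b, size)
    second = read_exact_into(b, size)
    sender.join()
    b.close()
    return first == payload and second == payload
```

Both functions return the same data; the difference is only in how many buffers are allocated and resized along the way, which is the point the comment makes about reducing pressure on realloc.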