Inada Naoki <songofaca...@gmail.com> added the comment:

issue1296004 is too old (it was reported on a 512MB RAM machine!), so I cannot
reproduce it.

But I think it was caused by an inefficient realloc() in the CRT.

See 
https://github.com/python/cpython/blob/6c52d76db8b1cb836c136bd6a1044e85bfe8241e/Lib/socket.py#L298-L303

_fileobject called socket.recv with the remaining size.
Typically, a socket can't return MBs at once.  So it caused this loop:

1. A large string (bytes) is allocated: at most `amt`, some MBs. (malloc)
2. recv is called.
3. _PyString_Resize() (realloc) is called to shrink the string to the bytes
   actually received (typically ~128KiB).
4. amt -= received
5. If amt == 0, exit; otherwise go to 1.

This might have stressed malloc and realloc in the CRT, causing fragmentation
and eventually MemoryError.
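
A minimal Python 3 sketch of the loop described above, for illustration only.
It is not the original Python 2 `_fileobject` code; the names
`read_exact_old_style` and `demo` are hypothetical, and the list `requested`
exists only to show how large each recv() request is compared with what a
socket usually delivers in one call.

```python
import socket
import threading


def read_exact_old_style(sock, amt):
    """Read `amt` bytes, asking recv() for everything still missing."""
    chunks = []
    requested = []                # sizes passed to recv(), kept for inspection
    left = amt
    while left > 0:
        requested.append(left)
        data = sock.recv(left)    # request sized for `left`, result is smaller
        if not data:              # peer closed the connection early
            break
        chunks.append(data)
        left -= len(data)
    return b"".join(chunks), requested


def demo(total=4 * 1024 * 1024):
    """Send `total` bytes over a local socket pair and read them back."""
    sender, receiver = socket.socketpair()
    payload = b"x" * total
    worker = threading.Thread(target=sender.sendall, args=(payload,))
    worker.start()
    try:
        return read_exact_old_style(receiver, total)
    finally:
        worker.join()
        sender.close()
        receiver.close()
```

Calling `demo()` returns the data together with the request sizes. Every
request after the first is smaller than the one before it, but each one still
asks for all remaining bytes, which is the allocate-large-then-shrink pattern
from steps 1 to 3.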

---

By now, almost everything has been rewritten.

In the case of _pyio, BufferedReader calls RawIOBase.read().  It reads into a
bytearray and then copies the result to bytes.  So only malloc and free are
called, and the stress on realloc is reduced.

https://github.com/python/cpython/blob/50866e9ed3e4e0ebb60c20c3483a8df424c02722/Lib/_pyio.py#L586-L591
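
A rough sketch of that bytearray-then-copy approach, written as a standalone
function. The name `raw_read` is hypothetical and the body only approximates
the linked lines; it assumes `size` is a non-negative integer and that `raw`
provides `readinto()`.

```python
import io


def raw_read(raw, size):
    """Approximate what RawIOBase.read(size) does for size >= 0."""
    buf = bytearray(size)     # one allocation of the requested size
    n = raw.readinto(buf)     # fill the buffer in place
    if n is None:             # non-blocking stream with no data available
        return None
    del buf[n:]               # drop the unused tail
    return bytes(buf)         # copy into an immutable bytes object


# io.BytesIO stands in for a raw stream here; it also provides readinto().
sample = raw_read(io.BytesIO(b"hello"), 16)
```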

In the case of the C _io module, it is more efficient.  It allocates a PyBytes
object once and calls SocketIO.readinto directly.  No temporary bytes objects
are created.

https://github.com/python/cpython/blob/50866e9ed3e4e0ebb60c20c3483a8df424c02722/Modules/_io/bufferedio.c#L1632
https://github.com/python/cpython/blob/50866e9ed3e4e0ebb60c20c3483a8df424c02722/Modules/_io/bufferedio.c#L1658
https://github.com/python/cpython/blob/50866e9ed3e4e0ebb60c20c3483a8df424c02722/Modules/_io/bufferedio.c#L1470
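
The same idea can be sketched in Python: allocate the output once and let each
readinto() call fill the next free part of it. This is only an approximation
of the C code linked above. The name `read_exact_into` is hypothetical, and
the final `bytes()` copy exists only because pure Python cannot fill an
immutable bytes object in place.

```python
import io


def read_exact_into(raw, n):
    """Read up to `n` bytes from `raw` into a single preallocated buffer."""
    out = bytearray(n)                      # allocated once, never resized
    view = memoryview(out)                  # slicing a view does not copy
    written = 0
    while written < n:
        got = raw.readinto(view[written:])  # write directly into the free tail
        if not got:                         # 0 at EOF, None if it would block
            break
        written += got
    return bytes(view[:written])


# io.BytesIO stands in for SocketIO here; both provide readinto().
sample = read_exact_into(io.BytesIO(b"abcdef"), 4)
```

No per-chunk bytes objects are created inside the loop, so the allocator sees
one allocation for the buffer and one for the result.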

----------
type:  -> performance
versions: +Python 3.8 -Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36050>
_______________________________________