New submission from Paul Ellenbogen <p...@cs.princeton.edu>:
Python encounters significant memory fragmentation when unpickling many small objects. I have attached two scripts that I believe demonstrate the issue.

When you run "dump.py", it generates a large list of namedtuples, then writes that list to a file using pickle. Before it does so, it pauses for user input, so you can view the memory usage in htop or your preferred tool before the script exits. The "load.py" script loads the file written by dump.py. Once loading is complete, it waits for user input. At the point where each script is waiting for user input, memory usage is more than twice as high in the "load" case as in the "dump" case.

The small objects in the list each hold 3 values, and I have tested three alternative representations: tuple, namedtuple, and a custom class. The namedtuple and the custom class both show the memory use/fragmentation issue. The built-in tuple type does not. Using optimize from pickletools doesn't seem to make a difference.

Matthew Cowles from the Python help list had some good suggestions, and found that the sizes of the objects themselves, as reported by sys.getsizeof, were different before and after pickling. Perhaps this is something other than memory fragmentation, or something in addition to it.

Although the high-water mark is similar for both scripts, the pickling script settles down to a noticeably smaller memory footprint. I would still consider the long-run memory waste of unpickling a bug. For example, in my use case I will run one instance of the equivalent of the pickling script, then run many, many instances of the script that unpickles.

These scripts were run with Python 3.6.7 (GCC 8.2.0) on Ubuntu 18.10.
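The attached scripts are not reproduced in this message. As a rough, single-process sketch of the comparison described above, something like the following could be used. The `Record` type, its field names, and the helper functions are hypothetical, and tracemalloc stands in for htop as an assumption: it counts live Python-level allocations rather than process RSS, so it may show a size difference but cannot show allocator fragmentation.

```python
import pickle
import sys
import tracemalloc
from collections import namedtuple

# Hypothetical record type with three values; the attached scripts' real
# field names and contents are not shown in this message.
Record = namedtuple("Record", ["a", "b", "c"])


def build(n):
    """Create a list of n small namedtuple objects."""
    return [Record(i, i + 1, i + 2) for i in range(n)]


def traced_size(make):
    """Call make() and return (result, bytes still allocated by it).

    tracemalloc counts live Python-level allocations only. It does not
    report process RSS, so it cannot show allocator fragmentation.
    """
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    result = make()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, after - before


if __name__ == "__main__":
    n = 100_000
    original, built_bytes = traced_size(lambda: build(n))
    blob = pickle.dumps(original, protocol=pickle.HIGHEST_PROTOCOL)
    loaded, loaded_bytes = traced_size(lambda: pickle.loads(blob))

    print("built list:    ", built_bytes, "bytes traced")
    print("unpickled list:", loaded_bytes, "bytes traced")
    print("getsizeof of one object before/after:",
          sys.getsizeof(original[0]), sys.getsizeof(loaded[0]))
```

To compare against htop as in the report, the build and load steps would need to run in separate processes, each ending with a call to input() so the process stays alive while it is inspected.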

----------
components: Library (Lib)
files: dump.py
messages: 340615
nosy: Ellenbogen, alexandre.vassalotti
priority: normal
severity: normal
status: open
title: Excessive memory use or memory fragmentation when unpickling many small objects
type: resource usage
versions: Python 3.6
Added file: https://bugs.python.org/file48278/dump.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36694>
_______________________________________