New submission from Mark Shannon:

If a dict is used a cache, e.g. in functools.lru_cache, 
the reduced resize factor in 3.3 can cause excessive resizing.
This can lead to a significant performance regression.

When the the number of deletions and insertions is roughly in balance
the reduced head room in the dict (compare to 3.2) causes a large increase in 
the number of resizes.

The reason for this above-linear increase is that with fewer dummy keys, the 
chance of a dummy being overwritten is reduced *and* is there is less overhead 
as well.
A dictionary with 128 items will have a capacity of 256 and only 43 dummy keys. 
If it had a capacity of 512 (as it would have done in 3.2) then it will have 
214 keys, making a resize at least 10 times less frequent.

Changing the growth function from round_up_to_power_2(used*2) to
round_up_to_power_2(used*2+capacity/2) the desirable property of only doubling 
in size when growing can be preserved, yet ensuring sufficient overhead when 
used as a cache.

Consider a dict which grows to n items and then remains that size, with 
frequent deletions and insertions, using the proposed growth function:

Items    Capacity         Steady state     Capacity
         on reaching n    capacity         under 3.2
  2        8                  8               8
  4        8                  16              16 
  6        16                 32              32 
  8        16                 32              32 
  10       16                 64              64 
  12       32                 64              64 
  15       32                 64              64 
  20       32                 128             128
  30       64                 128             128
  50       128                256             256
  80       128                512             512
  128      256                512             512


Thanks to Raymond Hettinger for bringing this to my attention.

Patch attached.

----------
components: Interpreter Core
files: resize.patch
keywords: patch
messages: 185378
nosy: Mark.Shannon
priority: normal
severity: normal
status: open
title: Excessive resizing of dicts when used as a cache
type: performance
versions: Python 3.3
Added file: http://bugs.python.org/file29598/resize.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17563>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to