New submission from Tim Holy <tim.h...@gmail.com>:
This is a lightly-edited reposting of
https://stackoverflow.com/questions/69164027/unreliable-results-from-timeit-cache-issue

I am interested in timing certain operations in numpy and skimage, but I'm
occasionally seeing surprising transitions (not entirely reproducible) in the
reported times. Briefly, sometimes timeit returns results that differ by about
5-fold from earlier runs.

Here's the setup:

    import skimage
    import numpy as np
    import timeit

    nrep = 16

    def run_complement(img):
        def inner():
            skimage.util.invert(img)
        return inner

    img = np.random.randint(0, 65535, (512, 512, 3), dtype='uint16')

and here's an example session:

    In [1]: %run python_timing_bug.py

    In [2]: t = timeit.Timer(run_complement(img))

    In [3]: t.repeat(nrep, number=1)
    Out[3]:
    [0.0024439050030196086,
     0.0020311699918238446,
     0.00033007100864779204,
     0.0002889479947043583,
     0.0002851780009223148,
     0.0002851030003512278,
     0.00028487699455581605,
     0.00032116699730977416,
     0.00030912700458429754,
     0.0002877369988709688,
     0.0002840430097421631,
     0.00028515000303741544,
     0.00030791999597568065,
     0.00029302599432412535,
     0.00030723700183443725,
     0.0002916679950430989]

    In [4]: t = timeit.Timer(run_complement(img))

    In [5]: t.repeat(nrep, number=1)
    Out[5]:
    [0.0006320849934127182,
     0.0004014919977635145,
     0.00030359599622897804,
     0.00029224599711596966,
     0.0002907510061049834,
     0.0002920039987657219,
     0.0002918920072261244,
     0.0003095199936069548,
     0.00029789700056426227,
     0.0002885590074583888,
     0.00040198900387622416,
     0.00037131100543774664,
     0.00040271600300911814,
     0.0003492849937174469,
     0.0003378120018169284,
     0.00029762100894004107]

    In [6]: t = timeit.Timer(run_complement(img))

    In [7]: t.repeat(nrep, number=1)
    Out[7]:
    [0.00026428700948599726,
     0.00012682100350502878,
     7.380900206044316e-05,
     6.346100417431444e-05,
     6.29679998382926e-05,
     6.278700311668217e-05,
     6.320899410638958e-05,
     6.25409884378314e-05,
     6.262199894990772e-05,
     6.247499550227076e-05,
     6.293901242315769e-05,
     6.259800284169614e-05,
     6.285199197009206e-05,
     6.293600017670542e-05,
     6.309800664894283e-05,
     6.248900899663568e-05]

Notice that in the final run the minimum times were on the order of 0.6e-4 s,
versus the previous minimum of ~3e-4 s: about 5x smaller than the times
measured in the earlier runs. It's not entirely predictable when this
"kicks in."

The faster time corresponds to about 0.08 ns per element of the array. Given
that the 2.6 GHz clock on my i7-8850H CPU ticks every ~0.4 ns, this seems to
be pushing the limits of credibility (though thanks to SIMD on my AVX2 CPU,
it cannot be entirely ruled out). My understanding is that this operation is
implemented as a subtraction and most likely gets reduced to a bitwise-not by
the compiler. So you do indeed expect this to be fast, but it's not certain
it should be this fast, and in either event the non-reproducibility is
problematic.

It may be relevant to note that the total amount of data is

    In [15]: img.size * 2
    Out[15]: 1572864

and lshw reports that I have 384 KiB of L1 cache and 1536 KiB of L2 cache:

    In [16]: 384*1024
    Out[16]: 393216

    In [17]: 1536*1024
    Out[17]: 1572864

So it seems possible that this result is being influenced by the array just
fitting in L2 cache. (Note that even in the "fast" block, the first run was
not fast.)

If I increase the size of the image:

    img = np.random.randint(0, 65535, (2048, 2048, 3), dtype='uint16')

then my results seem more reproducible, in the sense that I have not yet seen
one of these transitions.
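As a sanity check on the bitwise-not guess above, the following sketch
compares skimage.util.invert against an explicit np.bitwise_not. The
assertion rests on my reading that invert computes dtype-max minus the image
for unsigned dtypes (which equals bitwise NOT for unsigned integers); that is
an assumption of the sketch, not something this report verifies:

    import timeit
    import numpy as np
    import skimage.util

    img = np.random.randint(0, 65535, (512, 512, 3), dtype='uint16')

    # For unsigned integers, 65535 - x and ~x are the same operation,
    # so invert() and bitwise_not() should agree element-for-element.
    assert np.array_equal(skimage.util.invert(img), np.bitwise_not(img))

    # If the subtraction inside invert() really lowers to a bitwise NOT,
    # the two timings should land in the same ballpark.
    for label, fn in [("invert", lambda: skimage.util.invert(img)),
                      ("bitwise_not", lambda: np.bitwise_not(img))]:
        times = timeit.repeat(fn, repeat=16, number=1)
        print(label, min(times))

If the two track each other, the absolute speed is at least self-consistent,
even if the run-to-run transitions remain unexplained.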
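And to probe the L2 hypothesis directly, here is a minimal sketch that
prints the array's byte footprint against the cache sizes and does a few
warm-up passes before timing. The cache sizes are hard-coded from the lshw
output above, and the warm-up count of 4 is arbitrary:

    import timeit
    import numpy as np
    import skimage.util

    # Cache sizes as reported by lshw on my machine; adjust for yours.
    L1_BYTES = 384 * 1024    # 393216
    L2_BYTES = 1536 * 1024   # 1572864

    img = np.random.randint(0, 65535, (512, 512, 3), dtype='uint16')
    print(img.nbytes, "bytes;",
          "fits in L2" if img.nbytes <= L2_BYTES else "exceeds L2")

    def inner():
        skimage.util.invert(img)

    # Warm-up passes, so the first timed repetition does not pay the
    # cost of pulling img into cache.
    for _ in range(4):
        inner()

    times = timeit.Timer(inner).repeat(repeat=16, number=1)
    print("min", min(times), "max", max(times))

If the warm-up removes the slow first repetition but the 5x block-to-block
transitions persist, then cache residency alone does not explain them.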
----------
components: Library (Lib)
messages: 402411
nosy: timholy
priority: normal
severity: normal
status: open
title: Unreliable (?) results from timeit (cache issue?)
type: behavior
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45261>
_______________________________________