Terry J. Reedy <tjre...@udel.edu> added the comment:

I settled on the following to compare ParseMap implementations.

from idlelib.pyparse import Parser
import timeit

class ParseGet(dict):
    def __getitem__(self, key): return self.get(key, ord('x'))
class ParseMis(dict):
    def __missing__(self, key): return ord('x')

for P in (ParseGet, ParseMis):
    print(P.__name__, 'hit', 'miss')
    p = P({i:i for i in (10, 34, 35, 39, 40, 41, 91, 92, 93, 123, 125)})
    print(timeit.timeit(
        "p[10],p[34],p[35],p[39],p[40],p[41],p[91],p[92],p[93],p[125]",
        number=100000, globals = globals()))
    print(timeit.timeit(
        "p[11],p[33],p[36],p[45],p[50],p[61],p[71],p[82],p[99],p[125]",
        number=100000, globals = globals()))

ParseGet hit miss
1.104342376
1.112531999
ParseMis hit miss
0.3530207070000002
1.2165967760000003

ParseGet hit miss
1.185322191
1.1915449519999999
ParseMis hit miss
0.3477272720000002
1.317010653

Avoiding custom code for all ASCII chars will be a win: in the runs above, 
hits with ParseMis take roughly a third of the time they take with ParseGet.  
I am sure that calling __missing__ for non-ASCII will be at least as fast as 
it is presently.  I will commit a revision tomorrow.
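
For reference, a minimal sketch of why hits are cheap with the __missing__ 
version: Python only calls __missing__ when the key is absent, so present keys 
go through the plain dict lookup. The sample string and the names keep, tran, 
sample, and result are made up for illustration and are not taken from pyparse.

```python
class ParseMis(dict):
    """Map every absent code point to ord('x')."""
    def __missing__(self, key):
        # Only reached when key is not in the dict.
        return ord('x')

# Same code points as in the benchmark above; each maps to itself.
keep = (10, 34, 35, 39, 40, 41, 91, 92, 93, 123, 125)
tran = ParseMis({i: i for i in keep})

# Hypothetical sample line, including one non-ASCII character.
sample = "a = f(b)  # é\n"
result = sample.translate(tran)

# Kept characters survive; everything else becomes 'x'.
print(repr(result))
```

str.translate indexes the table with each code point, so a dict subclass works 
as the table without any wrapper function.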

I may then compare to Serhiy's sub/replace suggestion.  My experiments with 
'code.translate(tran)' indicate that time grows sub-linearly up to 1000 or 
10000 chars.  This suggests that there are significant constant or log-like 
terms.
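
A rough sketch of how such a scaling check might be run, not the exact 
experiment I used.  The sample line, the helper name time_translate, and the 
sizes are assumptions chosen for illustration.  If the per-char figure keeps 
dropping as the input grows, fixed per-call overhead is dominating.

```python
import timeit

class ParseMis(dict):
    def __missing__(self, key):
        return ord('x')

tran = ParseMis({i: i for i in (10, 34, 35, 39, 40, 41, 91, 92, 93, 123, 125)})

# Hypothetical sample line, repeated to build inputs of a given size.
line = "def f(a, b=[1, 2]):  # comment\n"

def time_translate(n_chars, number=1000):
    """Return average seconds per code.translate(tran) call on n_chars of text."""
    code = (line * (n_chars // len(line) + 1))[:n_chars]
    total = timeit.timeit("code.translate(tran)", number=number,
                          globals={"code": code, "tran": tran})
    return total / number

for n in (10, 100, 1000, 10000):
    per_call = time_translate(n)
    print(f"{n:>6} chars  {per_call * 1e6:10.2f} us/call"
          f"  {per_call / n * 1e9:8.2f} ns/char")
```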

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32940>
_______________________________________