Hi Casey

Casey Webster wrote:
> On Jul 2, 7:30 am, Nils Rüttershoff <n...@ccsg.de> wrote:
>
>> Rec = re.compile(r"^\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\s-\s\d+\s\[(\d{2}/\w+/\d{4}:\d{2}:\d{2}:\d{2})\s\+\d{4}\].*")
>> Line = '1.2.3.4 - 4459 [02/Jul/2009:01:50:26 +0200] "GET /foo HTTP/1.0" 200 - "-" "www.example.org" "-" "-" "-"'
>
> I'm not sure how much it will help but if you are only using the regex
> to get the date/time group element, it might be faster to replace the
> regex with:
>
> >>> date_string = Line.split()[3][1:-1]
> >>>
Indeed, this would give a small speed-up (approx. 3-4 sec over 1000000 iterations). But this would be only a small piece of the cake. Thanks anyway :)

The problem is that time.strptime() consults locale.py on every iteration. Here is the whole cProfile trace, first with epoch and second with strptime (condensed):

         5000009 function calls in 33.084 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   33.084   33.084 <string>:1(<module>)
        1    2.417    2.417   33.084   33.084 <timeit-src>:2(inner)
  1000000    9.648    0.000   30.667    0.000 time_test.py:30(epoch)
        1    0.000    0.000   33.084   33.084 timeit.py:177(timeit)
  1000000    3.711    0.000    3.711    0.000 {built-in method groupdict}
  1000000    4.318    0.000    4.318    0.000 {built-in method match}
        1    0.000    0.000    0.000    0.000 {gc.disable}
        1    0.000    0.000    0.000    0.000 {gc.enable}
        1    0.000    0.000    0.000    0.000 {gc.isenabled}
  1000000    7.764    0.000    7.764    0.000 {map}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  1000000    5.225    0.000    5.225    0.000 {time.mktime}
        2    0.000    0.000    0.000    0.000 {time.time}

################################################################

         29000009 function calls in 124.449 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000  124.449  124.449 <string>:1(<module>)
        1    2.244    2.244  124.449  124.449 <timeit-src>:2(inner)
  1000000    3.500    0.000   33.559    0.000 _strptime.py:27(_getlang)
  1000000   41.814    0.000  100.754    0.000 _strptime.py:295(_strptime)
  1000000    4.010    0.000  104.764    0.000 _strptime.py:453(_strptime_time)
  1000000   11.647    0.000   19.529    0.000 locale.py:316(normalize)
  1000000    3.638    0.000   23.167    0.000 locale.py:382(_parse_localename)
  1000000    5.120    0.000   30.059    0.000 locale.py:481(getlocale)
  1000000    7.242    0.000  122.205    0.000 time_test.py:37(strptime)
        1    0.000    0.000  124.449  124.449 timeit.py:177(timeit)
  1000000    1.771    0.000    1.771    0.000 {_locale.setlocale}
  1000000    1.735    0.000    1.735    0.000 {built-in method __enter__}
  1000000    1.626    0.000    1.626    0.000 {built-in method end}
  1000000    3.854    0.000    3.854    0.000 {built-in method groupdict}
  1000000    1.646    0.000    1.646    0.000 {built-in method group}
  2000000    8.409    0.000    8.409    0.000 {built-in method match}
        1    0.000    0.000    0.000    0.000 {gc.disable}
        1    0.000    0.000    0.000    0.000 {gc.enable}
        1    0.000    0.000    0.000    0.000 {gc.isenabled}
  2000000    2.942    0.000    2.942    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  3000000    4.552    0.000    4.552    0.000 {method 'get' of 'dict' objects}
  1000000    2.072    0.000    2.072    0.000 {method 'index' of 'list' objects}
  1000000    1.517    0.000    1.517    0.000 {method 'iterkeys' of 'dict' objects}
  2000000    3.113    0.000    3.113    0.000 {method 'lower' of 'str' objects}
  2000000    3.233    0.000    3.233    0.000 {method 'replace' of 'str' objects}
  2000000    2.953    0.000    2.953    0.000 {method 'toordinal' of 'datetime.date' objects}
  1000000    1.476    0.000    1.476    0.000 {method 'weekday' of 'datetime.date' objects}
  1000000    4.332    0.000  109.097    0.000 {time.strptime}
        2    0.000    0.000    0.000    0.000 {time.time}
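
In the second trace, about 33.6 of the 124.4 seconds go to _getlang and the locale.py calls beneath it. For illustration, here is a minimal sketch of a hand-rolled conversion that never touches the locale machinery. It is not the actual epoch() from time_test.py (that code is not shown in this mail). The MONTHS table and the function body are assumptions, and the sketch uses str.split() instead of the regex. Note that Line.split()[3] has no trailing bracket, so the slice is [1:] rather than [1:-1].

```python
import time

# Hypothetical lookup table: English month abbreviations as they appear
# in the log line, so no locale lookup is needed.
MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}


def epoch(date_string):
    """Convert '02/Jul/2009:01:50:26' to seconds since the epoch.

    The fields are interpreted as local time; the '+0200' offset from
    the log line is not part of date_string and is ignored here.
    """
    date_part, hour, minute, second = date_string.split(":")
    day, month_name, year = date_part.split("/")
    # weekday and yearday are ignored by mktime; -1 lets it work out DST.
    time_tuple = (int(year), MONTHS[month_name], int(day),
                  int(hour), int(minute), int(second), 0, 0, -1)
    return time.mktime(time_tuple)


Line = ('1.2.3.4 - 4459 [02/Jul/2009:01:50:26 +0200] "GET /foo HTTP/1.0" '
        '200 - "-" "www.example.org" "-" "-" "-"')

# Fourth whitespace-separated field is '[02/Jul/2009:01:50:26';
# strip only the leading bracket.
date_string = Line.split()[3][1:]
seconds = epoch(date_string)
```

The trade-off is that the month names are hard-coded in English, which should be fine as long as the log format itself does not change with the locale.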