I have a command-line script that loads about 100 YAML files. It takes 2 or 3 seconds. I profiled my code and I'm using pstats to find the bottleneck.
Here are the top 10 functions, sorted by internal time:

In [5]: _3.sort_stats('time').print_stats(10)
Sat Jul 4 13:25:40 2009    pitz_prof

         756872 function calls (739759 primitive calls) in 8.621 CPU seconds

   Ordered by: internal time
   List reduced from 1700 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    15153    0.446    0.000    0.503    0.000 build/bdist.linux-i686/egg/yaml/reader.py:134(forward)
    30530    0.424    0.000    0.842    0.000 build/bdist.linux-i686/egg/yaml/scanner.py:142(need_more_tokens)
    98037    0.423    0.000    0.423    0.000 build/bdist.linux-i686/egg/yaml/reader.py:122(peek)
     1955    0.415    0.000    1.265    0.001 build/bdist.linux-i686/egg/yaml/scanner.py:1275(scan_plain)
    69935    0.381    0.000    0.381    0.000 {isinstance}
    18901    0.329    0.000    3.908    0.000 build/bdist.linux-i686/egg/yaml/scanner.py:113(check_token)
     5414    0.277    0.000    0.794    0.000 /home/matt/projects/pitz/pitz/__init__.py:34(f)
    30935    0.258    0.000    0.364    0.000 build/bdist.linux-i686/egg/yaml/scanner.py:276(stale_possible_simple_keys)
    18945    0.192    0.000    0.314    0.000 /usr/local/lib/python2.6/uuid.py:180(__cmp__)
     2368    0.172    0.000    1.345    0.001 build/bdist.linux-i686/egg/yaml/parser.py:268(parse_node)

I expected to see a bunch of my file-reading IO code in there, but I don't. This makes me think that the profiler measures CPU time, not wall-clock time.

I'm not an expert on Python profiling, and the docs seem sparse. Can I rule out IO as the bottleneck here? How do I see the cost of the IO?

TIA

Matt
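P.S. One check I could probably run myself is to time the raw file reads on their own and compare wall-clock time with CPU time. Below is a rough sketch, not taken from my real script: measure and read_files are names made up for illustration, and read_files only reads the text without doing any YAML parsing.

```python
import os
import time


def measure(func, *args):
    """Run func(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.time()
    cpu_start = os.times()
    result = func(*args)
    cpu_end = os.times()
    wall_end = time.time()
    # Fields 0 and 1 of os.times() are the user and system CPU time
    # consumed by this process so far.
    cpu = (cpu_end[0] - cpu_start[0]) + (cpu_end[1] - cpu_start[1])
    return result, wall_end - wall_start, cpu


def read_files(paths):
    """Hypothetical stand-in for the loading step: read text, parse nothing."""
    contents = []
    for path in paths:
        with open(path) as f:
            contents.append(f.read())
    return contents
```

Calling measure(read_files, paths) with the list of YAML file paths gives both numbers for the read step alone. If the wall-clock figure is much larger than the CPU figure, the process is mostly waiting, which would point at IO; if the two are close and both small, the reads are probably not the problem. Running it a second time may look faster simply because the operating system has cached the files.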