Miguel P <prosper.spur...@gmail.com> wrote:
> On Sep 12, 2:54 pm, Ned Deily <n...@acm.org> wrote:
> > In article
> > <da2362e0-ec68-467b-b50b-6067057d7...@y36g2000yqh.googlegroups.com>,
> > Miguel P <prosper.spur...@gmail.com> wrote:
> > > I've been working on parsing (tailing) a named pipe which is the
> > > syslog output of the traffic for a rather busy haproxy instance. It's
> > > a fair bit of traffic (upto 3k hits/s per server), but I am finding
> > > that simply tailing the file in python, without any processing, is
> > > taking up 15% of a CPU core. In contrast HAProxy takes 25% and syslogd
> > > takes 5% with the same load. `cat < /named.pipe` takes 0-2%
> > >
> > > Am I just doing things horribly wrong or is this normal?
> > >
> > > Here is my code:
> > >
> > > from collections import deque
> > > import io, sys
> > >
> > > WATCHED_PIPE = '/var/log/haproxy.pipe'
> > >
> > > if __name__ == '__main__':
> > >     try:
> > >         log_pool = deque([],10000)
> > >         fd = io.open(WATCHED_PIPE)
> > >         for line in fd:
> > >             log_pool.append(line)
> > >     except KeyboardInterrupt:
> > >         sys.exit()
> > >
> > > Deque appends are O(1) so that's not it. And I am using 2.6's io
> > > module because it's supposed to handle named pipes better. I have
> > > commented the deque appending line and it still takes about the same
> > > CPU.
> >
> > Be aware that the io module in Python 2.6 is written in Python and was
> > viewed as a prototype. In the current svn trunk, what will be Python
> > 2.7 has a much faster C implementation of the io module backported from
> > Python 3.1.
>
> Aha, I will test with trunk and see if the performance is better, if
> so I'll use 2.6 in production until 2.7 comes out. I will report back
> when I have made the tests.

Why don't you try just using the builtin open() and reading in big
chunks?

Something like this (tested with named pipes). Tweak BUFFERSIZE and
SLEEP_INTERVAL for maximum performance!

import time

BUFFERSIZE = 1024*1024
SLEEP_INTERVAL = 0.1

def tail(path):
    fd = open(path)
    buf = ""
    while True:
        chunk = fd.read(BUFFERSIZE)
        if chunk:
            buf += chunk
            lines = buf.splitlines(True)
            # Everything except the last piece is a complete line
            for line in lines[:-1]:
                yield line
            # The last piece may be a partial line; keep it for the next read
            buf = lines[-1]
            if buf.endswith("\n"):
                yield buf
                buf = ""
        else:
            # No new data: sleep instead of spinning on a buffered partial line
            time.sleep(SLEEP_INTERVAL)

def main(path):
    for line in tail(path):
        print "%r:%r" % (len(line), line)

if __name__ == "__main__":
    import sys
    main(sys.argv[1])

-- 
Nick Craig-Wood <n...@craig-wood.com> -- http://www.craig-wood.com/nick
-- 
http://mail.python.org/mailman/listinfo/python-list
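
The line-splitting part of tail() can be checked without a pipe at all. Here is a minimal sketch of just that step, pulled out into two helpers. The names split_lines() and feed() are made up for this illustration and are not part of the code above, and the sketch assumes lines are terminated by "\n" (as syslog output normally is). It is written so that it also runs under Python 3.

```python
def split_lines(buf, chunk):
    """Append chunk to buf and return (complete_lines, leftover).

    Hypothetical helper for illustration; assumes "\n"-terminated lines.
    """
    buf += chunk
    lines = buf.splitlines(True)  # True keeps the line endings
    # A last piece without a trailing newline is an incomplete line:
    # hold it back until more data arrives.
    if lines and not lines[-1].endswith("\n"):
        leftover = lines.pop()
    else:
        leftover = ""
    return lines, leftover


def feed(chunks):
    """Yield complete lines from an iterable of arbitrarily sized chunks."""
    buf = ""
    for chunk in chunks:
        lines, buf = split_lines(buf, chunk)
        for line in lines:
            yield line


if __name__ == "__main__":
    # Chunk boundaries deliberately fall in the middle of lines.
    for line in feed(["ab", "c\nde", "f\n", "g"]):
        print(repr(line))
```

Keeping the split separate from the read loop makes it easy to try different chunk boundaries by hand; the trailing "g" above is never yielded because no newline follows it.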