On Sep 20, 7:13 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> On Sep 20, 5:46 pm, Paul Hankin <[EMAIL PROTECTED]> wrote:
> > On Sep 20, 10:59 pm, Python Maniac <[EMAIL PROTECTED]> wrote:
> > > I am new to Python however I would like some feedback from those who
> > > know more about Python than I do at this time.
> > >
> > > def scrambleLine(line):
> > >     s = ''
> > >     for c in line:
> > >         s += chr(ord(c) | 0x80)
> > >     return s
> > >
> > > def descrambleLine(line):
> > >     s = ''
> > >     for c in line:
> > >         s += chr(ord(c) & 0x7f)
> > >     return s
> > > ...
> >
> > Well, scrambleLine will remove line-endings, so when you're
> > descrambling you'll be processing the entire file at once. This is
> > particularly bad because of the way your functions work, adding a
> > character at a time to s.
> >
> > Probably your easiest bet is to iterate over the file using read(N)
> > for some small N rather than doing a line at a time. Something like:
> >
> > process_bytes = (descrambleLine, scrambleLine)[action]
> > while 1:
> >     r = f.read(16)
> >     if not r: break
> >     ff.write(process_bytes(r))
> >
> > In general, rather than building strings by starting with an empty
> > string and repeatedly adding to it, you should use ''.join(...)
> >
> > For instance...
> > def descrambleLine(line):
> >     return ''.join(chr(ord(c) & 0x7f) for c in line)
> >
> > def scrambleLine(line):
> >     return ''.join(chr(ord(c) | 0x80) for c in line)
> >
> > It's less code, more readable and faster!
>
> I would have thought that also from what I've heard here.
>
> def scrambleLine(line):
>     s = ''
>     for c in line:
>         s += chr(ord(c) | 0x80)
>     return s
>
> def scrambleLine1(line):
>     return ''.join([chr(ord(c) | 0x80) for c in line])
>
> if __name__=='__main__':
>     from timeit import Timer
>     t = Timer("scrambleLine('abcdefghijklmnopqrstuvwxyz')",
>               "from __main__ import scrambleLine")
>     print t.timeit()
>
> ## scrambleLine
> ## 13.0013366039
> ## 12.9461998318
> ##
> ## scrambleLine1
> ## 14.4514098748
> ## 14.3594400695
>
> How come it's not? Then I noticed you don't have brackets in
> the join statement. So I tried without them and got
>
> ## 17.6010847978
> ## 17.6111472418
>
> Am I doing something wrong?
It has to do with the input string length; try multiplying it by 10 or
100. With only 26 characters, the fixed per-call overhead presumably
outweighs whatever join saves. Below is a more complete benchmark; for
largish strings, the imap version is the fastest among those using the
original algorithm. Of course, using a lookup table as Diez showed is
even faster.

FWIW, here are some timings (Python 2.5, WinXP; seconds per 1000 calls
on a 2600-character string, best of 3 runs):

    scramble:            1.818
    scramble_listcomp:   1.492
    scramble_gencomp:    1.535
    scramble_map:        1.377
    scramble_imap:       1.332
    scramble_dict:       0.817
    scramble_dict_map:   0.419
    scramble_dict_imap:  0.410

And the benchmark script:

    from itertools import imap

    # Variants of the original algorithm: chr(ord(c) | 0x80) per character.

    def scramble(line):
        s = ''
        for c in line:
            s += chr(ord(c) | 0x80)
        return s

    def scramble_listcomp(line):
        return ''.join([chr(ord(c) | 0x80) for c in line])

    def scramble_gencomp(line):
        return ''.join(chr(ord(c) | 0x80) for c in line)

    def scramble_map(line):
        return ''.join(map(chr, map(0x80.__or__, map(ord,line))))

    def scramble_imap(line):
        return ''.join(imap(chr, imap(0x80.__or__,imap(ord,line))))

    # Lookup-table variants: precompute the result for every character.
    # xrange(256), not 255, so that chr(255) is in the table as well.
    scramble_table = dict((chr(i), chr(i | 0x80)) for i in xrange(256))

    def scramble_dict(line):
        s = ''
        for c in line:
            s += scramble_table[c]
        return s

    def scramble_dict_map(line):
        return ''.join(map(scramble_table.__getitem__, line))

    def scramble_dict_imap(line):
        return ''.join(imap(scramble_table.__getitem__, line))

    if __name__=='__main__':
        funcs = [scramble, scramble_listcomp, scramble_gencomp,
                 scramble_map, scramble_imap,
                 scramble_dict, scramble_dict_map, scramble_dict_imap]
        s = 'abcdefghijklmnopqrstuvwxyz' * 100
        # Sanity check: every variant must produce the same output.
        assert len(set(f(s) for f in funcs)) == 1
        from timeit import Timer
        setup = "import __main__; line = %r" % s
        for name in (f.__name__ for f in funcs):
            timer = Timer("__main__.%s(line)" % name, setup)
            # Best of 3 runs, 1000 calls each.
            print '%s:\t%.3f' % (name, min(timer.repeat(3,1000)))

George
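
A footnote on the lookup-table approach: the table can also be handed to the built-in translate method, so the per-character lookup happens inside the interpreter rather than in a Python-level loop. The following is only a minimal sketch, assuming Python 3 and bytes input (the thread itself uses Python 2.5 str). The names SCRAMBLE_TABLE, DESCRAMBLE_TABLE, scramble_translate and descramble_translate are made up for illustration, and this is not necessarily what Diez posted. No timings are claimed for it.

```python
# Assumption: Python 3, operating on bytes rather than Python 2 str.
# Each table maps every possible byte value (0..255) to its replacement.
SCRAMBLE_TABLE = bytes(i | 0x80 for i in range(256))    # set the high bit
DESCRAMBLE_TABLE = bytes(i & 0x7f for i in range(256))  # clear the high bit


def scramble_translate(data):
    """Return data with the high bit set on every byte."""
    return data.translate(SCRAMBLE_TABLE)


def descramble_translate(data):
    """Return data with the high bit cleared on every byte."""
    return data.translate(DESCRAMBLE_TABLE)


if __name__ == '__main__':
    original = b'abcdefghijklmnopqrstuvwxyz\n' * 100
    scrambled = scramble_translate(original)
    # Round-trips only for 7-bit input, same as the original functions.
    assert descramble_translate(scrambled) == original
    print(scrambled[:4])
```

Like the original pair of functions, this only round-trips 7-bit input, since descrambling clears the high bit unconditionally.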