I could use some help making this Python code run faster using only Python code.
I am new to Python, however I would like some feedback from those who know more about Python than I do at this time.

import os
from os import path

def scrambleLine(line):
    s = ''
    for c in line:
        s += chr(ord(c) | 0x80)
    return s

def descrambleLine(line):
    s = ''
    for c in line:
        s += chr(ord(c) & 0x7f)
    return s

def scrambleFile(fname, action=1):
    if path.exists(fname):
        f = ff = None
        try:
            f = open(fname, "r")
            toks = fname.split('.')
            while len(toks) > 2:
                toks.pop()
            fname = '.'.join(toks)
            if action == 1:
                _fname = fname + '.scrambled'
            elif action == 0:
                _fname = fname + '.descrambled'
            if path.exists(_fname):
                os.remove(_fname)
            ff = open(_fname, "w+")
            if action == 1:
                for l in f:
                    ff.write(scrambleLine(l))
            elif action == 0:
                for l in f:
                    ff.write(descrambleLine(l))
        except Exception, details:
            print 'ERROR :: (%s)' % details
        finally:
            if f:
                f.close()
            if ff:
                ff.close()
    else:
        print 'WARNING :: Missing file "%s" - cannot continue.' % fname

--
http://mail.python.org/mailman/listinfo/python-list
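For reference, the same set-the-MSB / clear-the-MSB idea can be written much more compactly in modern Python 3 by working on bytes, so the bit masks apply directly to each byte value. This is my adaptation for illustration, not the original poster's code:

```python
# Sketch of the original scramble/descramble in modern Python 3:
# operate on bytes so each element is already an int 0-255.

def scramble_line(data: bytes) -> bytes:
    # Set the most significant bit of every byte.
    return bytes(b | 0x80 for b in data)

def descramble_line(data: bytes) -> bytes:
    # Clear the most significant bit of every byte.
    return bytes(b & 0x7F for b in data)
```

Descrambling a scrambled buffer returns the original, since `(b | 0x80) & 0x7F == b & 0x7F` for any 7-bit ASCII byte.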
Re: I could use some help making this Python code run faster using only Python code.
On Sep 20, 3:57 pm, "Matt McCredie" <[EMAIL PROTECTED]> wrote:
> On 9/20/07, Python Maniac <[EMAIL PROTECTED]> wrote:
>
> > I am new to Python however I would like some feedback from those who
> > know more about Python than I do at this time.
>
> Well, you could save some time by not applying the scramble one line
> at a time (that is, if you don't mind losing the line endings in the
> scrambled version). For that to be effective, though, you probably want
> to open the file in binary mode. Also, your scramble can be written
> using a list comprehension:
>
> [code]
> def scramble(s, key=0x80):
>     return ''.join([chr(ord(c) ^ key) for c in s])
>
> output = scramble(f.read())
> [/code]
>
> If you use xor (^) as above, you can use the same method for scramble
> as for descramble (running it again with the same key will descramble),
> and you can use an arbitrary key. Though, with 255 combinations, it
> isn't very strong encryption.
>
> If you want stronger encryption you can use the following AESish
> algorithm:
>
> [code]
> import random
> def scramble(s, key):
>     random.seed(key)
>     return ''.join([chr(ord(c) ^ random.randint(0, 255)) for c in s])
> [/code]
>
> This allows you to use much larger keys, with a similar effect. Still
> not strong enough to be unbreakable, but much better than the original.
> It is strong enough that someone knowing how you scrambled it will have
> trouble unscrambling it even if they don't know the key.
>
> Matt

So far I like what was said in this reply, however my ultimate goal is to allow the scramble method to be more than what is shown above. I considered using XOR, however XOR only works for the case shown above, where the scramble method simply sets or removes the MSB. What I want is to be able to set or reset the MSB in addition to performing a series of additional steps, without negatively impacting performance for the later cases when the MSB is not the only technique being employed. Hopefully this sheds more light on the goal I have in mind.
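Matt's keystream idea above translates naturally to modern Python 3 bytes. A minimal sketch of the same XOR-with-seeded-keystream technique (my adaptation; the function names are mine), showing the self-inverse property he describes:

```python
import random

def xor_scramble(data: bytes, key) -> bytes:
    # XOR each byte with a pseudo-random keystream seeded by the key.
    # Because XOR is its own inverse, applying the function twice with
    # the same key restores the original input.
    rng = random.Random(key)  # local RNG instead of random.seed()
    return bytes(b ^ rng.randrange(256) for b in data)
```

Running `xor_scramble(xor_scramble(data, key), key)` yields `data` again, so one function serves as both scramble and descramble.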
BTW - My original code is able to scramble a 20 MB file in roughly 40 secs on a Dell E6600 2.4 GHz. When I began writing the code for this problem my best runtime was about 65 secs, so I knew I was heading in the right direction. I was about to begin using the D language and pyd to gain better performance, but then I thought I might also take this opportunity to learn something about Python before delving into D. Obviously I could simply code the whole process in D, but that defeats the purpose of using Python in the first place, so I would tend to limit my low-level coding to the task of scrambling each line of text. The ironic thing about this exercise was that the optimization techniques that worked for Python actually decreased performance in the Ruby version I coded. It seems Ruby has some definite ideas about what it considers optimal, and they don't have much to do with what I would consider traditional optimization techniques. For instance, Ruby treated the putc method as less optimal than the write method, which makes no sense to me, but that's life with Ruby.
Re: I could use some help making this Python code run faster using only Python code.
On Sep 21, 12:56 am, Duncan Booth <[EMAIL PROTECTED]> wrote:
> George Sakkis <[EMAIL PROTECTED]> wrote:
>
> > It has to do with the input string length; try multiplying it by 10 or
> > 100. Below is a more complete benchmark; for largish strings, the imap
> > version is the fastest among those using the original algorithm. Of
> > course using a lookup table as Diez showed is even faster. FWIW, here
> > are some timings (Python 2.5, WinXP):
>
> > scramble:            1.818
> > scramble_listcomp:   1.492
> > scramble_gencomp:    1.535
> > scramble_map:        1.377
> > scramble_imap:       1.332
> > scramble_dict:       0.817
> > scramble_dict_map:   0.419
> > scramble_dict_imap:  0.410
>
> I added another one:
>
> import string
> scramble_translation = string.maketrans(
>     ''.join(chr(i) for i in xrange(256)),
>     ''.join(chr(i | 0x80) for i in xrange(256)))
>
> def scramble_translate(line):
>     return string.translate(line, scramble_translation)
>
> ...
> funcs = [scramble, scramble_listcomp, scramble_gencomp,
>          scramble_map, scramble_imap,
>          scramble_dict, scramble_dict_map, scramble_dict_imap,
>          scramble_translate]
>
> and I think I win:
>
> scramble:            1.949
> scramble_listcomp:   1.439
> scramble_gencomp:    1.455
> scramble_map:        1.470
> scramble_imap:       1.546
> scramble_dict:       0.914
> scramble_dict_map:   0.415
> scramble_dict_imap:  0.416
> scramble_translate:  0.007

Wow! Now I am very impressed with Python! The difference between where I began (70.155 secs) and where we ended (2.278 secs) is a whopping 30.8x speedup, using some rather simple techniques that are nothing more than variations on the theme of hoisting function calls out of loops, along with some very powerful iterator functions from Python. My best runtime with Ruby on the same machine and OS was 67.797 secs, which is 29.8x slower than the fastest Python runtime - in other words, Ruby's best is roughly on par with my unoptimized Python. The irony with Ruby was that the use of a hash actually made the Ruby code run slower than when a hash was not used.
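Duncan's `string.maketrans`/`string.translate` trick has a direct Python 3 analogue: `bytes.translate` with a 256-entry table. A sketch of the same lookup-table approach (my adaptation for Python 3; names are mine):

```python
# Python 3 analogue of the winning string.translate approach: build the
# 256-entry translation tables once, then translate whole buffers at
# C speed with bytes.translate.
SCRAMBLE_TABLE = bytes(i | 0x80 for i in range(256))
DESCRAMBLE_TABLE = bytes(i & 0x7F for i in range(256))

def scramble_translate(data: bytes) -> bytes:
    return data.translate(SCRAMBLE_TABLE)

def descramble_translate(data: bytes) -> bytes:
    return data.translate(DESCRAMBLE_TABLE)
```

The per-character work is all done once, when the tables are built; each call afterwards is a single pass in C, which is why this variant dominates the benchmarks above.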
Now I think I will code this little scrambler using nothing but the D language, just to see whether there is any benefit in using D over Python for this sort of problem.
Re: I could use some help making this Python code run faster using only Python code.
On Sep 21, 3:02 pm, "Matt McCredie" <[EMAIL PROTECTED]> wrote:
> > Now I think I will code this little scrambler using nothing but the D
> > Language just to see whether there is any benefit in using D over
> > Python for this sort of problem.
>
> Isn't D compiled to machine code? I would expect it to win hands down.
> That is, unless it is horribly unoptimized.
>
> Matt

Well, D code is compiled into machine code that runs via a VM. My initial D code ran in about 6 secs, as compared with the 2.278 secs for the optimized Python code. If I wanted the D code to run faster than the optimized Python, I would have to apply the same Pythonic optimizations when crafting the D code, and even then I would guess the optimized D code might run only 2x faster than the optimized Python code. In real terms, < 3 secs to process a 20 MB file is more than reasonable performance with no need for any additional optimizations. For this particular problem, Python performs as well as the D-powered machine code with far less effort, for me, than what it would take to make the D code run faster than the Python code. All this tells me the following:

* List comprehensions are very powerful in Python.
* String translation is a must when processing string-based data in an iterative manner.
* Ruby has no hope of beating Python for this type of problem, given that the proper Python optimizations are used.
* There is no value in wasting time with lower-level languages to make Python run faster for this type of problem.

It would be nice if Python could automatically detect the LC and string-translation patterns used by the unoptimized Python code and turn them into optimized Python code on the fly at runtime. I am more than a little amazed nobody has chosen to build a JIT (Just-In-Time) compiler or cached compiler into Python, but maybe that sort of thing is just not needed, given that Python code can be easily optimized to run 30x faster.
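Relative numbers like the ones quoted in this thread are easy to reproduce with the standard `timeit` module. A small, self-contained harness (my illustration; absolute timings will of course vary by machine) comparing the per-byte loop against the translation-table version:

```python
import timeit

# Translation table built once, outside the hot loop.
TABLE = bytes(i | 0x80 for i in range(256))

def scramble_loop(data: bytes) -> bytes:
    # Per-byte Python-level loop (analogous to the original scrambleLine).
    return bytes(b | 0x80 for b in data)

def scramble_translate(data: bytes) -> bytes:
    # Single C-level pass using the precomputed table.
    return data.translate(TABLE)

data = b"some sample line of text\n" * 1000

for fn in (scramble_loop, scramble_translate):
    t = timeit.timeit(lambda: fn(data), number=100)
    print("%-20s %.4f s" % (fn.__name__, t))
```

Both functions produce identical output; only the per-call overhead differs, which is the point the thread's benchmarks are making.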
Re: I could use some help making this Python code run faster using only Python code.
On Sep 21, 4:48 pm, "Matt McCredie" <[EMAIL PROTECTED]> wrote:
> > It would be nice if Python could be made to automatically detect the
> > LC and string translation patterns used by the unoptimized Python code
> > and make them into optimized Python code on the fly at runtime. I am
> > more than a little amazed nobody has chosen to build a JIT (Just-In-
> > Time compiler) or cached-compiler into Python but maybe that sort of
> > thing is just not needed given the fact that Python code can be easily
> > optimized to run 30x faster.
>
> See PyPy http://codespeak.net/pypy/ for a JIT compiler for Python.
> It is still in the research phase, but worth taking a look at.
>
> Matt

You need to check out a project called Psyco (a forerunner of PyPy). I was able to get almost 2x better performance by adding 3 lines of code for Psyco. See also: http://psyco.sourceforge.net/download.html

I am rather amazed! Psyco was able to give much better performance, above and beyond the already optimized Python code, without negatively impacting performance during its analysis at runtime.
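The poster doesn't show his exact 3 lines, but a typical Psyco binding looked roughly like the sketch below (my reconstruction; Psyco was Python-2-only and is long unmaintained, so the import is guarded and the code still runs without it):

```python
def scramble(line):
    # Hot function to be JIT-specialized: set the MSB of every character.
    return ''.join(chr(ord(c) | 0x80) for c in line)

try:
    import psyco          # available on Python 2 only; no-op elsewhere
    psyco.bind(scramble)  # ask Psyco to compile this function at runtime
except ImportError:
    pass                  # fall back to the plain interpreter
```

`psyco.bind` compiled machine-code specializations of the function on the fly, which is why it could speed up the hand-optimized variants further without any source changes.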
Re: I could use some help making this Python code run faster using only Python code.
On Sep 21, 11:39 pm, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> On Fri, 21 Sep 2007 16:25:20 -0700, Python Maniac wrote:
>
> > On Sep 21, 3:02 pm, "Matt McCredie" <[EMAIL PROTECTED]> wrote:
>
> >> Isn't D compiled to machine code? I would expect it to win hands down.
> >> That is, unless it is horribly unoptimized.
>
> > Well D code is compiled into machine code that runs via a VM.
>
> About which D are we talking here? Not Digital Mars' successor to C++,
> right!?
>
> Ciao,
> Marc 'BlackJack' Rintsch

Yes, Digital Mars D is what I was referring to, and yes, I know D is not as efficient as C++. If I knew of a good C++ compiler, not from Microsoft, that works natively with Windows, I would be happy to consider using it, but alas the only one I have found so far is from Digital Mars. Digital Mars D also has nice integration with Python via pyd, which is another plus in my mind.
Re: I could use some help making this Python code run faster using only Python code.
On Sep 21, 12:56 am, Duncan Booth <[EMAIL PROTECTED]> wrote:
> [snip - the same scramble_translate benchmark post quoted earlier]

Some benchmarks showing the effectiveness of using Psyco. First, without Psyco:

scramble:            4.210
scramble_listcomp:   2.343
scramble_gencomp:    2.599
scramble_map:        1.960
scramble_imap:       2.231
scramble_dict:       2.387
scramble_dict_map:   0.535
scramble_dict_imap:  0.726
scramble_translate:  0.010

Now with Psyco... psyco.bind(scramble)...
                     with Psyco   time saved   speedup
scramble:              0.121        4.088      34.670x faster
scramble_listcomp:     0.215        2.128      10.919x faster
scramble_gencomp:      2.563        0.036       1.014x faster
scramble_map:          2.002       -0.043       0.979x (slower)
scramble_imap:         2.175        0.056       1.026x faster
scramble_dict:         0.199        2.188      11.983x faster
scramble_dict_map:     0.505        0.029       1.058x faster
scramble_dict_imap:    0.728       -0.001       0.998x (slower)
scramble_translate:    0.009        0.001       1.111x faster

Overall, Psyco made my little process of converting 20 MB worth of ASCII into what may not look like the original run 6x faster than without Psyco.