On 13 Jan 2006 23:17:05 -0800, [EMAIL PROTECTED] wrote:
>
>fynali wrote:
>> $ cat cleanup_ray.py
>> #!/usr/bin/python
>> import itertools
>>
>> b = set(file('/home/sajid/python/wip/stc/2/CBR0000333'))
>>
>> file('PSP-CBR.dat,ray','w').writelines(itertools.ifilterfalse(b.__contains__,file('/home/sajid/python/wip/stc/2/PSP0000333')))
>>
>> --
>> $ time ./cleanup_ray.py
>>
>> real 0m5.451s
>> user 0m4.496s
>> sys  0m0.428s
>>
>> (-: Damn! That saves a bit more time! Bravo!
>>
>> Thanks to you Raymond.
>
>Have you tried the explicit loop variant with psyco? My experience is
>that psyco is pretty good at optimizing for loops, which usually results
>in code that is faster than even the built-in map/filter variants.
>
>Though it would only be a 1 or 2 second difference (given what you
>already have), so it may not be important, but it could be fun.

OTOH, when you are dealing with large files and near-optimal simple
processing, you are likely to be comparing i/o-bound processes, meaning
that the differences you observe will be symptoms of OS and file system
performance more than of the algorithms.
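(For concreteness, the explicit-loop-plus-psyco variant being suggested
might look something like the untested sketch below; it reuses the paths
from the script above and assumes psyco is installed. psyco.full()
asks psyco to specialize every function it can.

    #!/usr/bin/python
    import psyco
    psyco.full()

    def cleanup(cbr_path, psp_path, out_path):
        # lines present in the CBR file are to be excluded
        b = set(file(cbr_path))
        out = file(out_path, 'w')
        for line in file(psp_path):
            if line not in b:
                out.write(line)
        out.close()

    cleanup('/home/sajid/python/wip/stc/2/CBR0000333',
            '/home/sajid/python/wip/stc/2/PSP0000333',
            'PSP-CBR.dat,ray')

Putting the loop inside a function matters: psyco, like CPython itself,
handles local variable access much faster than module-level globals.)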
An exception is when a slight variation in the algorithm causes a large
change in i/o performance, e.g. if it provokes physical seek and read
patterns of disk access that the OS/file_system and disk interface
hardware can't entirely optimize away with smart buffering etc. Not to
mention possible interactions with all the other things an OS may be
doing "simultaneously", switching between things that it accounts for
as real/user/sys.
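A quick sanity check for i/o-boundedness is to compare wall-clock time
against the CPU time the process actually consumed; os.times() reports
the user and system CPU used so far. A rough sketch (the variable names
are mine, not anything from the thread):

    import os, time

    t0 = time.time()
    u0, s0 = os.times()[:2]

    # ... run the filtering work here ...

    u1, s1 = os.times()[:2]
    # if real greatly exceeds user + sys, the run was i/o-bound
    print 'real %.2fs  user %.2fs  sys %.2fs' % (
        time.time() - t0, u1 - u0, s1 - s0)

In fynali's timing above, user + sys is close to real, which suggests
that run was still mostly CPU-bound rather than waiting on the disk.

Regards,
Bengt Richter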