thanks for the feedback! I'll work on your suggestions.
bruno at modulix wrote: > [EMAIL PROTECTED] wrote: > > i put this together to fix a bunch of files with wierd names, please > > gimme feedback, i am a newbie > > Ok, so let's go... Hope you won't hate me too much !-) > > > > > #!/usr/bin/env python > > import os > > import sys > > import string > > import platform > > dir = sys.argv[1] > This will shadow the builtin 'dir' function. > > > noworky = sys.argv[2] > > If the user fails to provide args, the program will crash with an > IndexError - which may not be very helpful. > > Also, a better scheme would be > myprog [-opt1 [-opt2=val [-optN]]] arg1 arg2 argN > > Hint: the optparse module is your friend > http://www.python.org/doc/2.4.2/lib/module-optparse.html > > > if platform.system() == 'Linux': > > uglychars = ''.join( set(string.punctuation+' ') - set('/_.') ) > > else: > > if platform.system() == 'Windows':#this is broken because windows > > is gay with case > > uglychars = ''.join( set(string.punctuation+' ') - > > set(':\\/_.') ) > > else: > > print "wtf... what platform is this anyway?" > > May be MacOS Classic, MacOS X or any *n*x or *BSD variant, or any other > platform supporting Python - are there are some... > > > underscore = '_' > > underscore = underscore * len(uglychars) > > You don't need the intermediate value: > underscores = '_' * len(uglychars) > > > chars = string.maketrans(uglychars, underscore) > > > print "# PHASE I, DIRECTORIES" > > Why not processing dirs and files in one pass ? > > > for path, subdirs, files in os.walk(dir, topdown=True): > > Err... is the 'dir' argument supposed to be an absolute path or a > relative path ? > > And why using the topdown option ? > > > oldname = path > > woops ! this may be the absolute path. Are you sure you want to process > an absolute path ? > > I think you'd better process files and dirs in one path, walking bottom > up (so you don't process any file twice). > > > newname = oldname.translate(chars) > > newname = string.lower(newname) > > Use the 'str' object methods instead of functions from the string module: > > newname = newname.lower() > > You can also chain method/function calls: > newname = oldname.translate(chars).lower() > > > > while string.count(newname, "__") > 0: > > in the context of a boolean expression, 0 evaluate to False, non-zero to > True. So you don't have to be so explicit: > while newname.count('__'): > > > newname = string.replace(newname,"__","_") > > You don't need to actually *count* the occurrences of '__' - if there's > one, that's enough: > while '__' in newname: > # proceed > > Also, a regexp may be more effective here. > > > while string.count(newname, "..") > 0: > > newname = string.replace(newname,"..",".") > > Don't forget the '..' and '.' special directories in unix filesystems... > > > if oldname != newname: > > if os.path.isfile(newname) or os.path.isdir(newname): > > And if there's a special file (device etc) ? > hint : os.path.exists() > > > > print oldname, "-->\n", newname, "\t\t\tERROR: file/dir > > exists\n" > > stdout is for 'normal' program outputs (ie: outputs that may be used as > inputs to another program). This kind of output should go to stderr: > print >> sys.stdout, "%s --> %s : \t\t\tERROR: " > "file/dir exists" % (oldname, > newname,) > > > else: > > print oldname, "-->\n", newname, "\t\t\tYAY: file > > renamed\n" > > if noworky == "doit": > > os.renames(oldname, newname) > > How is the user supposed to know that he has to pass the string "doit" > as a second arg ? > > There are some (more or less agreed upon) conventions about cli options. > Like '--dry-run' to express the fact that the program shouldn't actually > do more than simulate it's execution. > > > print "# PHASE II, FILES" > > for path, subdirs, files in os.walk(dir, topdown=True): > > for oldname in files: > > oldname = os.path.join(path, oldname) > > Are you *sure* you want to operate on the *whole* *absolute* path ? > (cf my thoughts about the "first phase" and the whole algorithm) > > > newname = oldname.translate(chars) > > newname = string.lower(newname) > > Aren't you repeating some code here ? > hint : all duplicated code should be factored out into a function. > > > newname = string.replace(newname,".mpeg",".mpg") > > newname = string.replace(newname,".ram",".rm") > > newname = string.replace(newname,".jpeg",".jpg") > > newname = string.replace(newname,".qt",".mov") > > # outside the loop, define a dict like: > ext_map = {'mpeg': 'mpg', > 'ram' : 'rm', > 'jpeg' : 'jpg', > # etc > } > > # then in the loop: > base, ext = os.path.split(newname) > if ext in ext_map: > newname = "%s.%s" % (base, ext_map[ext]) > > > > while string.count(newname, "__") > 0: > > newname = string.replace(newname,"__","_") > > duplicated code... > > > while string.count(newname, "..") > 0: > > newname = string.replace(newname,"..",".") > > newname = string.replace(newname,"._","_") > > newname = string.replace(newname,"_.",".") > > all this is a perfect usecase for regexps... > > > if oldname != newname: > > if os.path.isfile(newname) or os.path.isdir(newname): > > print oldname, "-->\n", newname, "\t\t\tERROR: file/dir > > exists\n" > > else: > > print oldname, "-->\n", newname, "\t\t\tYAY: file > > renamed\n" > > if noworky == "doit": > > os.renames(oldname, newname) > > duplicated code. > > > Still alive ?-) > > I thing the first step would be to correct your algorithm, then use some > more efficient or idiomatic constructs where possible (like using dicts > instead of repeated tests, sending messages to stderr etc). > > Then here are some more hints: > > - put the processing algorithm in a dedicated function, if possible one > that doesn't rely on any global variable, > - factor out any duplicated code into helper functions > > - then add a 'main' function that take cares of options and call the > 'processing' function if everything's ok. The main function should > return 0 if ok, non-zero if errors (usually 2 for invalid cli options or > args, 1 for other errors) > > - and finally add this at the end of your module: > > if __name__ == '__main__': > sys.exit(main(sys.argv)) > > > This will allow your script to be used either as a program or as a module. > > > -- > bruno desthuilliers > python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for > p in '[EMAIL PROTECTED]'.split('@')])" -- http://mail.python.org/mailman/listinfo/python-list