In article <37ba7b40-3663-4094-b507-696fc598b...@l26g2000yqm.googlegroups.com> ad <adsquai...@gmail.com> wrote: >Please review the code pasted below. I am wondering what other ways >there are of performing the same tasks. ... I imagine one enhancement >would be to create a function out of some of this.
Indeed -- "recommended styles" include defining a main() function and calling main at the bottom, e.g., with: if __name__ == '__main__': main() or: if __name__ == '__main__': main(sys.argv) Something has also double-spaced your code; I will undo that below. >import os >import time >import shutil >import argparse So far so good. Let me start by putting the next parts into main(). I will use the no-argument format since we can simply let parser.parse_args() fetch sys.argv the same way you did (we could def main(argv) and pass argv; this is sometimes useful for testing purposes, but here it makes little difference one way or another). def main(): > CurrentTime = time.time() > epocDay = 86400 # seconds (You could write 24 * 60 * 60, but again this is sort of a six-of-one half-a-dozen-of-the-other situation.) What is not clear is why you are setting these now, rather than later, closer to where they are going to be used. Many style guides, including Guido's and pylint's, dislike UppercaseVariableName and camelCase style names within functions. (UppercaseNames are reserved to classes.) I am leaving these alone for now though. > parser = argparse.ArgumentParser(description = "Delete files and > folders in a directory N days old", add_help=False, > prog='directorycleaner', usage='%(prog)s 7 c:\\temp') Line wrap has done bad things to this. Still, there is something odd about it other than the line wrap: - sometimes you write: param_name = value but sometimes you use: param_name=value and in one case you even use both: usage= [string] - you switch back and forth between single and double quoted strings, without much reason. Consistency is not *required* but it is generally a good idea. (This is also minor, like deciding between 86400 or 24*60*60. Eventually, though, lots of minor things add up.) (I have to admit here that I tend to switch back and forth on quotes too. :-) ) Another "sixes" case occurs here with the backslashes: to get a literal backslash in a string, you can use "raw strings". Here it makes no difference but I will go ahead and do it just for illustration: parser = argparse.ArgumentParser( description='Delete files and folders in a directory N days old', add_help=False, prog='directorycleaner', usage=r'%(prog)s 7 c:\temp') Finally, I am not sure why you do not want to allow a -h / --help option (but if you take out the add_help=False, it's probably best to take out the call to parser.print_help() too), and there is no need to supply a usage= argument at all -- argparse() will build one for you. > parser.add_argument('days', type=int, help="Numeric value: delete > files and folders older then N days") > parser.add_argument('directory', help="delete files and folders in > this directory") (Again, line wrap has broken these; the fixes are obvious so I skip over them here.) > parser.print_help() > args = parser.parse_args() (So far so good, although again, you probably want to remove the call to print_help() if you allow argparse to add a -h / --help option, at least.) > dictKeys = (vars(args)) There is no *need* to do this, although it is documented as allowed. I prefer to just use args.<field> myself: > HowManyDays = dictKeys['days'] > WhatDirectory = dictKeys['directory'] so this would become: HowManyDays = args.days WhatDirectory = args.directory > print (HowManyDays) > print (WhatDirectory) These are presumably debug statements and should be removed, but until then, it might be good to prefix the output with what is being printed (i.e., a debug message). (I have taken them out of my copy, for output shown below.) (In a fancier program, you could use the logging module and logging.debug().) > DaysToDelete = HowManyDays * epocDay Right before this would be a good place to create epocDay. > dirExists = os.path.exists(WhatDirectory) > if dirExists == False: print ("The directory is missing") An explicit "if expr == False" is generally a bad idea -- if an expression can be "considered boolean" (and the return value of os.path.exists certainly can), just write "if not expr". Most style guides suggest putting subsequent statements on new lines, rather than right after the ":". Checking that the directory exists seems reasonable enough. However, after printing that it does not, you continue on with code that is going to immediately raise an OSError exception: > DirListing = os.listdir(WhatDirectory) In general, it is better to try to do the operation, and catch the failure and do something about it at that point, than to test to see if the operation is going to succeed. (Among other things, this avoids a "race" between program 1 that says "does some directory exist" and program 2 that says "delete the directory". If program 1 "wins" this race, the directory does exist at the point of the test, then program 2 deletes it, then program 1 goes on to access the now-deleted directory ... and crashes.) I am using a Unix-like system so what I get may not be quite the same as what you would get on a Windows-like system, but: % cd /tmp % python foo.py 1 /tmp/nosuchdir The directory is missing Traceback (most recent call last): ... OSError: [Errno 2] No such file or directory: '/tmp/nosuchdir' % More significantly, consider this: % python foo.py 1 /tmp/foo.py Traceback (most recent call last): ... OSError: [Errno 20] Not a directory: '/tmp/foo.py' % So instead of the three previous lines, consider: try: DirListing = os.listdir(WhatDirectory) except OSError as err: sys.exit("can't read %s: %s" % (WhatDirectory, err)) (you will need to "import sys", and I am using an older version of Python and the "except OSError, err" syntax, but the effect is the same). Now the second example results in: % python foo.py 1 /tmp/foo.py can't read /tmp/foo.py: [Errno 20] Not a directory: '/tmp/foo.py' % > for files in DirListing: > # Get the absolute path of the file name > abspath = (os.path.join(WhatDirectory, files)) This is not necessarily an absolute path -- for instance, if the program is run as: python foo.py 7 rel\ative\path the joined file names (on a Windows-like system) will be things like "rel\ative\path\file.txt" and so on. I would suggest shortening the variable name to just "path". The outermost set of parentheses are unnecessary, too. > # Get the current creation time of the file in epoc format > # (midnight 1/1/1970) > FileCreationTime = (os.path.getctime(abspath)) Beware: on Unix-like systems, this gets a "time of last change" rather than a "time of create". Even if you are sure you will use Windows you may wish to use the file's mtime (time of last "modification"; on Unix-like systems, the distinction between a "modification" and a "change" is that "change" covers alterations to the file's meta-data as well as the data, e.g., "chmod a-w file", making it read-only, changes its ctime but not its mtime, while "echo foo >> file" changes both its ctime and its mtime -- provided it is not read-only, of course :-) ). Again, the outermost parentheses here are unnecessary. > # time.ctime converts epoch to a normal date > #print (time.ctime(CurrentTime)) > # Get the date from seven days ago > WeekOldFileDate = CurrentTime - DaysToDelete > #print (CurrentTime) > #print (FileCreationTime) > #print (WeekOldFileDate) > > #If the file is older than seve days doe something Apparently, the program has evolved somewhat: originally you had "seven days" hardcoded, now you have a variable number. The comments, however, have not changed -- and the final variable name is no longer appropriate. It is probably also time to ditch the commented-out debug print statements (and fix the comments, including the typo on the last one above). > if FileCreationTime < WeekOldFileDate: > #check if the object is a file > if os.path.isfile(abspath): os.remove(abspath) > # It is not a file it is a directory > elif os.path.isdir(abspath): shutil.rmtree(abspath) Again, the comment and code do not quite agree: the comment says "if it is not a file it *is* a directory" but the code says "if it is not a file, check to see if it is a directory", which leaves open the possibility that it is some other kind of entity (this is definitely possible on a Unix-like system, where it could be a socket, symbolic link, or device node, for instance). In this particular program, written the way it is, there is no actual benefit (yet) to doing this, but I suggest moving the guts of the "clean out a directory" process into a function. What this allows is the ability to list more than one directory, provided of course you also change the argument parser a bit. Having put this all together (and neutered the actual file-or-directory removing code) gives me the code below. There are still plenty of things you could do with it -- for instance, exiting partway through processing a list of directories if one in the middle does not exist is perhaps not optimal: % python foo.py 7 /tmp/dir1 /tmp/oops /tmp/dir2 (where /tmp/dir1 and /tmp/dir2 do exist, but /tmp/oops does not) will clean out /tmp/dir1 but then exit without ever processing /tmp/dir2. (There are lots of ways to handle this; you would have to choose one and implement it.) Or, instead of the kind of processing done here, you could generalize it into a Unix-like "find" command. (Perhaps with a less-ugly syntax. :-) ) The "find" command can do what this script does: find DIRLIST -ctime +N ( -type d -o -type f ) -exec rm -rf {} \; but can also a great deal more since (a) it has many other options than just -ctime, and (b) -exec will execute any arbitrary command. --------------------------- import os import time import shutil import argparse import sys def main(): """ main program: parse arguments, and clean out directories. """ parser = argparse.ArgumentParser( description="Delete files and folders in a directory N days old", prog="directorycleaner") parser.add_argument("days", type=int, help="Numeric value: delete files and folders older than N days") parser.add_argument("directory", nargs="+", help="delete files and folders in this directory") args = parser.parse_args() for dirname in args.directory: clean_dir(dirname, args.days) def clean_dir(dirname, n_days): """ Clean one directory of files / subdirectories older than the given number of days. """ time_to_live = n_days * 86400 # 86400 = seconds-per-day current_time = time.time() try: contents = os.listdir(dirname) except OSError, err: sys.exit("can't read %s: %s" % (dirname, err)) for filename in contents: # Get the path of the file name path = os.path.join(dirname, filename) # Get the creation time of the file # NOTE: this only works on Windows-like systems when_created = os.path.getctime(path) # If the file/directory has expired, remove it if when_created + time_to_live < current_time: if os.path.isfile(path): print "os.remove(%s)" % path # It is not a file it is a directory elif os.path.isdir(path): print "shutil.rmtree(%s)" % path if __name__ == "__main__": main() -- In-Real-Life: Chris Torek, Wind River Systems Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603 email: gmail (figure it out) http://web.torek.net/torek/index.html
-- http://mail.python.org/mailman/listinfo/python-list