On Wednesday, April 1, 2015 at 7:52:27 PM UTC-4, Dave Angel wrote: > On 04/01/2015 09:43 AM, Saran A wrote: > > On Tuesday, March 31, 2015 at 9:19:37 AM UTC-4, Dave Angel wrote: > >> On 03/31/2015 07:00 AM, Saran A wrote: > >> > >> > @DaveA: This is a homework assignment. .... Is it possible that you > >> could provide me with some snippets or guidance on where to place your > >> suggestions (for your TO DOs 2,3,4,5)? > >> > > >> > >> > >>> On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote: > >> > >>>> > >>>> It's missing a number of your requirements. But it's a start. > >>>> > >>>> If it were my file, I'd have a TODO comment at the bottom stating known > >>>> changes that are needed. In it, I'd mention: > >>>> > >>>> 1) your present code is assuming all filenames come directly from the > >>>> commandline. No searching of a directory. > >>>> > >>>> 2) your present code does not move any files to success or failure > >>>> directories > >>>> > >> > >> In function validate_files() > >> Just after the line > >> print('success with %s on %d reco... > >> you could move the file, using shutil. Likewise after the failure print. > >> > >>>> 3) your present code doesn't calculate or write to a text file any > >>>> statistics. > >> > >> You successfully print to sys.stderr. So you could print to some other > >> file in the exact same way. > >> > >>>> > >>>> 4) your present code runs once through the names, and terminates. It > >>>> doesn't "monitor" anything. > >> > >> Make a new function, perhaps called main(), with a loop that calls > >> validate_files(), with a sleep after each pass. Of course, unless you > >> fix TODO#1, that'll keep looking for the same files. No harm in that if > >> that's the spec, since you moved the earlier versions of the files. > >> > >> But if you want to "monitor" the directory, let the directory name be > >> the argument to main, and let main do a dirlist each time through the > >> loop, and pass the corresponding list to validate_files. > >> > >>>> > >>>> 5) your present code doesn't check for zero-length files > >>>> > >> > >> In validate_and_process_data(), instead of checking filesize against > >> ftell, check it against zero. > >> > >>>> I'd also wonder why you bother checking whether the > >>>> os.path.getsize(file) function returns the same value as the os.SEEK_END > >>>> and ftell() code does. Is it that you don't trust the library? Or that > >>>> you have to run on Windows, where the line-ending logic can change the > >>>> apparent file size? > >>>> > >>>> I notice you're not specifying a file mode on the open. So in Python 3, > >>>> your sizes are going to be specified in unicode characters after > >>>> decoding. Is that what the spec says? It's probably safer to > >>>> explicitly specify the mode (and the file encoding if you're in text). > >>>> > >>>> I see you call strip() before comparing the length. Could there ever be > >>>> leading or trailing whitespace that's significant? Is that the actual > >>>> specification of line size? > >>>> > >>>> -- > >>>> DaveA > >>> > >>> > >> > >>> I ask this because I have been searching fruitlessly through for some > >>> time and there are so many permutations that I am bamboozled by which is > >>> considered best practice. > >>> > >>> Moreover, as to the other comments, those are too specific. The scope of > >>> the assignment is very limited, but I am learning what I need to look out > >>> or ask questions regarding specs - in the future. > >>> > >> > >> > >> -- > >> DaveA > > > > @DaveA > > > > My most recent commit > > (https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py) > > has more annotations and comments for each file. > > Perhaps you don't realize how github works. The whole point is it > preserves the history of your code, and you use the same filename for > each revision. > > Or possibly it's I that doesn't understand it. I use git, but haven't > actually used github for my own code. > > > > > I have attempted to address the functional requirements that you brought up: > > > > 1) Before, my present code was assuming all filenames come directly from > > the commandline. No searching of a directory. I think that I have > > addressed this. > > > > Have you even tried to run the code? It quits immediately with an > exception since your call to main() doesn't pass any arguments, and main > requires one. > > > def main(dirslist): > > while True: > > for file in dirslist: > > return validate_files(file) > > time.sleep(5) > > In addition, you aren't actually doing anything to find what the files > in the directory are. I tried to refer to dirlist, as a hint. A > stronger hint: look up os.listdir() > > And that list of files has to change each time through the while loop, > that's the whole meaning of scanning. You don't just grab the names > once, you look to see what's there. > > The next thing is that you're using a variable called 'file', while > that's a built-in type in Python. So you really want to use a different > name. > > Next, you have a loop through the magical dirslist to get individual > filenames, but then you call the validate_files() function with a single > file, but that function is expecting a list of filenames. One or the > other has to change. > > Next, you return from main after validating the first file, so no others > will get processed. > > > 2) My present code does not move any files to success or failure > > directories (requirements for this assignment1). I am still wondering if > > and how I should use shututil() like you advised me to. I keep receiving a > > syntax error when declaring this below the print statement. > > You had a function to do that in your very first post to this thread. > Have you forgotten it already? As for syntax errors, you can't expect > any help on that when you don't show any code nor the syntax error > traceback. Remember that when a syntax error is shown, it's frequently > the previous line or two that actually was wrong. > > > > > 3) You correctly reminded me that my present code doesn't calculate or > > write to a text file any statistics or errors for the cause of the error. > > > #I wrote an error class in case the I needed such a specification for > a notifcation - this is not in the requirements > > class ErrorHandler: > > > > def __init__(self): > > pass > > > > def write(self, string): > > > > # write error to file > > fname = " Error report (" + time.strftime("%Y-%m-%d %I-%M%p") > + ").txt" > > handler = open(fname, "w") > > handler.write(string) > > handler.close() > > > > > This looks like Java code. With no data attributes, and only one > method(function), what's the reason you don't just use a function? > > > original_stderr = sys.stderr # stored in case want to revert > sys.stderr = ErrorHandler() > > This is just bogus. There are rare times when a programmer might want > to replace stderr, but that's just unnecessarily confusing in a program > like this one. When you want to write printable data to another file, > just use that file's handle as the argument to the file= argument of > print(). You could use an instance of ErrorHandler as a file handle. > Or do it one of a dozen other straightforward ways. > > While we're at it, why do you keep creating new files for your error > report? And overwriting all the old messages with whatever new ones > come in the same minute? > > > > (Should I use the copy or copy2 method in order provide metadata? If so, > > should I wrap it into a try and except logic?) > > No idea what you mean here. What metadata? And what do copy and copy2 > have to do with anything here? > > > > > 4) Before, my present code runs once through the names, and terminates. It > > doesn't "monitor" anything. I think I have addressed this with the main > > function - correct? > > > Not even close. > > > 5) Before, my present code doesn't check for zero-length files - I have > > added a comment there in case that is needed) > > > > I realize appreciate your invaluable feedback. I have grown a lot with this > > assignment! > > > > Sincerely, > > > > Saran > > > > > -- > DaveA
@DaveA Thanks for your help on this homework assignment. I started from scratch last night. I have added some comments that will perhaps help clarify my intentions and my thought process. Thanks again. from __future__ import print_function import os import time import glob import sys def initialize_logger(output_dir): logger = logging.getLogger() logger.setLevel(logging.DEBUG) # create console handler and set level to info handler = logging.StreamHandler() handler.setLevel(logging.INFO) formatter = logging.Formatter("%(levelname)s - %(message)s") handler.setFormatter(formatter) logger.addHandler(handler) # create error file handler and set level to error handler = logging.FileHandler(os.path.join(output_dir, "error.log"),"w", encoding=None, delay="true") handler.setLevel(logging.ERROR) formatter = logging.Formatter("%(levelname)s - %(message)s") handler.setFormatter(formatter) logger.addHandler(handler) # create debug file handler and set level to debug handler = logging.FileHandler(os.path.join(output_dir, "all.log"),"w") handler.setLevel(logging.DEBUG) formatter = logging.Formatter("%(levelname)s - %(message)s") handler.setFormatter(formatter) logger.addHandler(handler) #Helper Functions for the Success and Failure Folder Outcomes, respectively def file_len(filename): with open(filename) as f: for i, l in enumerate(f): pass return i + 1 def copy_and_move_File(src, dest): try: shutil.rename(src, dest) # eg. src and dest are the same file except shutil.Error as e: print('Error: %s' % e) # eg. source or destination doesn't exist except IOError as e: print('Error: %s' % e.strerror) # Call main(), with a loop that calls # validate_files(), with a sleep after each pass. Before, my present #code was assuming all filenames come directly from the commandline. There was no actual searching #of a directory. # I am assuming that this is appropriate since I moved the earlier versions of the files. # I let the directory name be the argument to main, and let main do a dirlist each time through the loop, # and pass the corresponding list to validate_files. path = "/some/sample/path/" dirlist = os.listdir(path) before = dict([(f, None) for f in dirlist) #####Syntax Error? before = dict([(f, None) for f in dirlist) ^ SyntaxError: invalid syntax def main(dirlist): while True: time.sleep(10) #time between update check after = dict([(f, None) for f in dirlist) added = [f for f in after if not f in before] if added: print('Sucessfully added new file - ready to validate') ####add return statement here to pass to validate_files if __name__ == "__main__": main() #check for record time and record length - logic to be written to either pass to Failure or Success folder respectively def validate_files(): creation = time.ctime(os.path.getctime(added)) lastmod = time.ctime(os.path.getmtime(added)) #Potential Additions/Substitutions - what are the implications/consequences for this def move_to_failure_folder_and_return_error_file(): os.mkdir('Failure') copy_and_move_File(filename, 'Failure') initialize_logger('rootdir/Failure') logging.error("Either this file is empty or there are no lines") def move_to_success_folder_and_read(f): os.mkdir('Success') copy_and_move_File(filename, 'Success') print("Success", f) return file_len() #This simply checks the file information by name------> is this needed anymore? def fileinfo(file): filename = os.path.basename(f) rootdir = os.path.dirname(f) filesize = os.path.getsize(f) return filename, rootdir, filesize if __name__ == '__main__': import sys validate_files(sys.argv[1:]) # -- end of file -- https://mail.python.org/mailman/listinfo/python-list