On Friday, April 3, 2015 at 8:05:14 AM UTC-4, Dave Angel wrote: > On 04/02/2015 07:43 PM, Saran A wrote: > <snip> > > > > I debugged and rewrote everything. Here is the full version. Feel free to > > tear this apart. The homework assignment is not due until tomorrow, so I am > > currently also experimenting with pyinotify as well. I do have questions > > regarding how to make this function compatible with the ProcessEvent Class. > > I will create another post for this. > > > > What would you advise in regards to renaming the inaptly named dirlist? > > Asked and answered. You called it path at one point. dir_name would > also be good. The point is that if you have a good name, you're less > likely to have unreasonable code processing that name. For example, > > > > > # # # Without data to examine here, I can only guess based on this > > requirement's language that > > # # fixed records are in the input. If so, here's a slight revision to the > > helper functions that I wrote earlier which > > # # takes the function fileinfo as a starting point and demonstrates > > calling a function from within a function. > > # I tested this little sample on a small set of files created with MD5 > > checksums. I wrote the Python in such a way as it > > # would work with Python 2.x or 3.x (note the __future__ at the top). > > > > # # # There are so many wonderful ways of failure, so, from a development > > standpoint, I would probably spend a bit > > # # more time trying to determine which failure(s) I would want to report > > to the user, and how (perhaps creating my own Exceptions) > > > > # # # The only other comments I would make are about safe-file handling. > > > > # # # #1: Question: After a user has created a file that has failed (in > > # # # processing),can the user create a file with the same name? > > # # # If so, then you will probably want to look at some sort > > # # # of file-naming strategy to avoid overwriting evidence of > > # # # earlier failures. > > > > # # # File naming is a tricky thing. I referenced the tempfile module [1] > > and the Maildir naming scheme to see two different > > # # types of solutions to the problem of choosing a unique filename. > >> ## I am assuming that all of my files are going to be specified in unicode > > > > ## Utilized Spyder's Scientific Computing IDE to debug, check for > > indentation errors and test function suite > > > > from __future__ import print_function > > > > import os.path > > import time > > import logging > > > > > > def initialize_logger(output_dir): > <snip> > > I didn't ever bother to read the body of this function, since a simple > > print(mydata, file=mylog_file) > > will suffice to add data to the chosen text file. Why is it constantly > getting more complex, to solve a problem that was simple in the beginning? > > > > > > > #Returns filename, rootdir and filesize > > > > def fileinfo(f): > > filename = os.path.basename(f) > > rootdir = os.path.dirname(f) > > filesize = os.path.getsize(f) > > return filename, rootdir, filesize > > > > #returns length of file > > def file_len(f): > > with open(f) as f: > > for i, l in enumerate(f): > > pass > > return i + 1 > > Always returns 1 or None. Check the indentation, and test to see what > it does for empty file, for a file with one line, and for a file with > more than one line. > > > > #attempts to copy file and move file to it's directory > > def copy_and_move_file(src, dest): > > Which is it, are you trying to copy it, or move it? Pick one and make a > good function name that shows your choice. > > > try: > > os.rename(src, dest) > > Why are you using rename, when you're trying to move the file? Take a > closer look at shutil, and see if it has a function that does it safer > than rename. The function you need uses rename, when it'll work, and > does it other ways when rename will not. > > > # eg. src and dest are the same file > > except IOError as e: > > print('Error: %s' % e.strerror) > > A try/except that doesn't try to correct the problem is not generally > useful. Figure out what could be triggering the exception, and how > you're going to handle it. If it cannot be handled, terminate the program. > > For example, what if you don't have permissions to modify one of the > specified directories? You can't get any useful work done, so you > should notify the user and exit. An alternative is to produce 50 > thousand reports of each file you've got, telling how it succeeded or > failed, over and over. > > > > > path = "." > > dirlist = os.listdir(path) > > > > def main(dirlist): > > before = dict([(f, 0) for f in dirlist]) > > Since dirlist is a path, it's a string. So you're looping through the > characters of the name of the path. I still don't have a clue what the > dict is supposed to mean here. > > > while True: > > time.sleep(1) #time between update check > > That loop goes forever, so the following code will never run. > > > after = dict([(f, None) for f in dirlist]) > > Once again, you're looping through the letters of the directory name. > Or if dirlist is really a list, and you're deciding that's what it > should be, then of course after will be identical to before. > > > added = [f for f in after if not f in before] > > if added: > > f = ''.join(added) > > print('Sucessfully added %s file - ready to validate') %() > > return validate_files(f) > > else: > > return move_to_failure_folder_and_return_error_file(f) > > > > > > > > def validate_files(f): > > You completely changed the scope of this function. Why does it still > have an s in its name? > > > creation = time.ctime(os.path.getctime(f)) > > lastmod = time.ctime(os.path.getmtime(f)) > > What do timestamps have to do with anything? > > > if creation == lastmod and file_len(f) > 0: > > return move_to_success_folder_and_read(f) > > How have you decided that the records are all the same length? > > > if file_len < 0 and creation != lastmod: > > file_len cannot be safely compared to 0, it's a function. If this were > Python 3, it would give you a runtime error for trying. Even if you had > successfully called the function here, how is it possible for it to > return a negative value? > > > return move_to_success_folder_and_read(f) > > else: > > return move_to_failure_folder_and_return_error_file(f) > > > > > > #Potential Additions/Substitutions > > > > def move_to_failure_folder_and_return_error_file(): > > filename, rootdir, lastmod, creation, filesize = fileinfo(file) > > fileinfo() returns 3 things, and you're stuffing them in 5 variables. > > > os.mkdir('Failure') > > What do you do the second time through this function, when that > directory is already existing? > > > copy_and_move_file( 'Failure') > > The function takes two arguments, neither of which is likely to be that > string. > > > initialize_logger('rootdir/Failure') > > logging.error("Either this file is empty or there are no lines") > > > > > > def move_to_success_folder_and_read(): > > filename, rootdir, lastmod, creation, filesize = fileinfo(file) > > os.mkdir('Success') > > copy_and_move_file(rootdir, 'Success') #file name > > print("Success", file) > > return file_len(file) > > > > > > > > if __name__ == '__main__': > > main(dirlist) > > So now you're passing a list of filenames to main()? How then is it > going to know when new files arrive? You were originally going to tell > it a directory name. That's why I suggested that you call the parameter > 'path' or 'dirname'. > > > > -- > DaveA
@DaveA I addressed most of the issues. I do admit that, as a novice, I feel beholden to the computer - hence the over-engineering. As you correctly stated: " > What do you do the second time through this function, when that > directory is already existing? > > > copy_and_move_file( 'Failure') > > The function takes two arguments, neither of which is likely to be that > string. > > > initialize_logger('rootdir/Failure') > > logging.error("Either this file is empty or there are no lines")" How would I ensure that this s directory is made only once and every file that is passeed goes only to 'success' or 'failure'? -- https://mail.python.org/mailman/listinfo/python-list