On 05.11.2012 11:54, andrea crotti wrote:
Quite often I find it convenient to accept either a filename or a file
object as an argument of a function, and do something like the following:

import re


def grep_file(regexp, filepath_obj):
    """Check whether the given regexp matches any line of the file.

    Takes either a path to a file or an opened file object.
    """
    # basestring exists only on Python 2; on Python 3 this would be str.
    if isinstance(filepath_obj, basestring):
        fobj = open(filepath_obj)
    else:
        fobj = filepath_obj

    for line in fobj:
        if re.search(regexp, line):
            return True

    return False

This also makes it more convenient to unit-test, since I can just pass
a StringIO.

I do the same, for the same reason, except that I pass in either a file object or the actual data contained in the file, never a path.
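
To illustrate the testing convenience, here is a minimal sketch. It assumes Python 3, where io.StringIO takes the place of the Python 2 StringIO module, and grep_lines is a hypothetical file-object-only variant of grep_file, not code from the original post:

```python
import io
import re


def grep_lines(regexp, fobj):
    """Return True if regexp matches any line of the open file object."""
    # Hypothetical variant: accepts only a file-like object, never a path.
    return any(re.search(regexp, line) for line in fobj)


# In a test, an in-memory buffer stands in for a real file.
fake_file = io.StringIO("first line\nsecond line with foo\n")
found = grep_lines("foo", fake_file)
```

Because the function never opens anything itself, the test needs no temporary files and no cleanup.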


But then there are other problems: for example, if I pass a file
object, it is the caller that has to make sure to close the file
handle.

I don't consider that a problem. If you open a file, you should do that in a with statement:

  with open(..) as f:
      found = grep_file(regex, f)

That is also my biggest criticism of your code: when it is given a path, it opens the file but never closes it after use. Another thing is the readability of a call to it:

  grep_file("foo", "bar")

The biggest problem there is that I can't tell which of the two arguments is which. I personally would expect the file to come first, although POSIX grep has it the other way round on the command line. Consider as an alternative:

  grep("foo", path="bar")
  with open(..) as f:
    grep("foo", file=f)
  with open(..) as f:
    grep("foo", data=f.read())

Using **kwargs, you could switch inside the function depending on the mode that was used, extract the lines accordingly, and match them against the regex.
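
A rough sketch of what I mean, assuming Python 3. The keyword names path, file and data are taken from the calls above; the error handling and the exact structure are only one possible choice:

```python
import re


def grep(regexp, **kwargs):
    """Return True if regexp matches any line of the given input.

    Exactly one of the keyword arguments path, file or data is expected.
    """
    if len(kwargs) != 1:
        raise TypeError("expected exactly one of: path, file, data")
    mode, value = next(iter(kwargs.items()))

    if mode == "path":
        # We opened the file here, so we are responsible for closing it.
        with open(value) as fobj:
            return any(re.search(regexp, line) for line in fobj)
    if mode == "file":
        # The caller opened the file and remains responsible for closing it.
        return any(re.search(regexp, line) for line in value)
    if mode == "data":
        # The caller passed the file contents as a single string.
        return any(re.search(regexp, line) for line in value.splitlines())
    raise TypeError("unexpected keyword argument: %r" % mode)
```

This way the question of who closes the file answers itself: whoever opened it. In the path case that is grep, in the file case it is the caller's with block.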


Greetings!

Uli

--
http://mail.python.org/mailman/listinfo/python-list
