Douglas Alan wrote:
> "John Machin" <[EMAIL PROTECTED]> writes:
>
> >> lines = (partialLine + charsJustRead).split(newline)
>
> > The above line is prepending a short string to what will typically be a
> > whole buffer full. There's gotta be a better way to do it.
>
> If there is, I'm all ears. In a previous post I provided code that
> doesn't concatenate any strings together until the last possible
> moment (i.e. when yielding a value). The problem with that code
> was that it was complicated and didn't work right in all cases.
>
> One way of solving the string concatenation issue would be to write a
> string find routine that will work on lists of strings while ignoring
> the boundaries between list elements. (I.e., it will consider the
> list of strings to be one long string for its purposes.) Unless it is
> written in C, however, I bet it will typically be much slower than the
> code I just provided.
>
> > Perhaps you might like to refer back to CdV's solution which was
> > prepending the residue to the first element of the split() result.
>
> The problem with that solution is that it doesn't work in all cases
> when the line-separation string is more than one character.
>
> >> for line in lines: yield line + outputLineEnd
>
> > In the case of leaveNewline being false, you are concatenating an empty
> > string. IMHO, to quote Jon Bentley, one should "do nothing gracefully".
>
> In Python,
>
>     longString + "" is longString
>
> evaluates to True. I don't know how you can do nothing more
> gracefully than that.
And also

    "" + longString is longString

The string + operator provides those graceful *external* results by ugly
special-case testing internally.

It is not graceful IMHO to concatenate a variable which you already know
refers to a null string.

Let's go back to the first point, and indeed further back to the use cases:

(1) multi-byte separator for lines in text files: never heard of one
apart from '\r\n'; presume this is rare, so test for length of 1 and use
Chris's simplification of my effort in this case.

(2) keep newline: with the standard file reading routines, if one is
going to do anything much with the line other than write it out again,
one does buffer = buffer.rstrip('\n') anyway. In the case of a
non-standard separator, one is likely to want to write the line out with
the standard '\n'. So, specialisation for this is indicated:

! if keepNewline:
!     for line in lines: yield line + newline
! else:
!     for line in lines: yield line

Cheers,
John
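
For reference, the keepNewline specialisation can be dropped into a complete chunked reader along the following lines. This is only a minimal sketch, not the code posted earlier in the thread: the function name read_lines, the chunk_size parameter and the snake_case argument names are assumptions made for illustration, and it uses the plain residue + chunk concatenation rather than a find routine over a list of strings. Because the residue is always re-split together with the next chunk, a multi-character separator that straddles a chunk boundary should still be found.

```python
import io


def read_lines(stream, newline="\n", keep_newline=False, chunk_size=4096):
    """Yield lines from a text stream, split on an arbitrary separator.

    Hypothetical helper: name and parameters are illustrative only.
    """
    residue = ""  # text after the last complete separator seen so far
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        lines = (residue + chunk).split(newline)
        # The last piece has no separator after it yet, so hold it back.
        residue = lines.pop()
        if keep_newline:
            for line in lines:
                yield line + newline
        else:
            # Do nothing gracefully: no concatenation with an empty string.
            for line in lines:
                yield line
    if residue:
        # Final line of the stream that was not terminated by a separator.
        yield residue


if __name__ == "__main__":
    data = io.StringIO("a\r\nb\r\nc")
    for line in read_lines(data, newline="\r\n", chunk_size=2):
        print(repr(line))
```

The branch on keep_newline is taken once per chunk rather than once per line, which is the point of the specialisation: the common strip-the-separator case never touches the + operator at all.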