On Apr 10, 10:16 pm, Steven D'Aprano <st...@remove-this-cybersource.com.au> wrote:
> After converting a text file containing doctests to use Windows line
> endings, I'm getting spurious errors:
>
> ValueError: line 19 of the docstring for examples.txt has inconsistent
> leading whitespace: '\r'
>
> I don't believe that doctest.testfile is documented as requiring Unix
> line endings, and the line endings in the file are okay. I've checked in
> a hex editor, and they are valid \r\n line endings.
>
> In doctest._load_testfile, I find this comment and code:
>
>     # get_data() opens files as 'rb', so one must do the equivalent
>     # conversion as universal newlines would do.
>     return file_contents.replace(os.linesep, '\n'), filename
>
> which I read as an attempt to normalise line endings in the file to \n.
>
> (But surely this will fail? If you're running, say, Linux or MacOS,
> linesep will already be '\n' not '\r\n', and consequently the replace
> does nothing, any Windows line endings aren't normalised, and doctest
> will choke on the \r characters. It's only useful if running on Windows.)
>
> But the above only occurs when using a package loader. Otherwise,
> _load_testfile executes:
>
>     return open(filename).read(), filename
>
> which doesn't do any line ending normalisation at all.
>
> To my mind, this is a bug in doctest. Does anyone disagree? I think the
> simplest fix is to change it to:
>
>     return open(filename, 'rU').read(), filename
>
> Comments?
>
> --
> Steven

Seems like a bug to me. I often assume that I don't know where a string
is coming from, so one of the first steps I usually take when parsing a
string is:

    s = s.replace('\r\n', '\n').replace('\r', '\n')

And, out of long-standing pre-Python habit, I always open files in binary
mode and then have my way with them. I know universal mode is available,
but honestly, I don't care for all the bookkeeping on what kinds of line
endings have been seen -- I just want to normalize the data.

Regards,
Pat
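
P.S. Here is a minimal sketch of the difference between the two approaches, in case it helps. The names linesep_replace and normalize_newlines are made up for illustration and are not part of doctest; the linesep parameter stands in for os.linesep so that both platforms can be simulated on one machine.

```python
def normalize_newlines(text):
    """Convert CRLF and bare CR line endings to LF, on any platform."""
    # Order matters: handle '\r\n' first so it does not turn into '\n\n'.
    return text.replace('\r\n', '\n').replace('\r', '\n')


def linesep_replace(text, linesep):
    """Mimic file_contents.replace(os.linesep, '\\n') for a given linesep.

    `linesep` is a hypothetical stand-in for os.linesep.
    """
    return text.replace(linesep, '\n')


# A tiny doctest-style snippet saved with Windows line endings.
windows_text = ">>> 1 + 1\r\n2\r\n"

# Where os.linesep is '\r\n' (Windows), the replace does its job.
on_windows = linesep_replace(windows_text, '\r\n')

# Where os.linesep is '\n' (Linux, Mac OS X), replacing '\n' with '\n'
# changes nothing, so the '\r' characters survive.
on_linux = linesep_replace(windows_text, '\n')

# The unconditional version does not depend on the platform at all.
normalized = normalize_newlines(windows_text)

print(repr(on_windows))
print(repr(on_linux))
print(repr(normalized))
```

The second case is the one Steven describes: the text comes back unchanged, and the leftover '\r' is what doctest then complains about.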