Re: Detecting line endings

Bengt Richter Tue, 07 Feb 2006 07:45:47 -0800

On 6 Feb 2006 06:35:14 -0800, "Fuzzyman" <[EMAIL PROTECTED]> wrote:


>Hello all,
>
>I'm trying to detect line endings used in text files. I *might* be
>decoding the files into unicode first (which may be encoded using
>multi-byte encodings) - which is why I'm not letting Python handle the
>line endings.
>
>Is the following safe and sane:
>
>text = open('test.txt', 'rb').read()
>if encoding:
>    text = text.decode(encoding)
>ending = '\n' # default
>if '\r\n' in text:
>    text = text.replace('\r\n', '\n')
>    ending = '\r\n'
>elif '\n' in text:
>    ending = '\n'
>elif '\r' in text:
>    text = text.replace('\r', '\n')
>    ending = '\r'
>
>
>My worry is that if '\n' *doesn't* signify a line break on the Mac,
>then it may exist in the body of the text - and trigger ``ending =
>'\n'`` prematurely ?
>
Are you guaranteed that the text bodies contain no escaping or quoting
mechanisms for binary data, where it would be a mistake to convert
or delete a '\r'? (E.g., I think XML CDATA might be an example.)
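
On the detection question itself, one possible approach is to count each
kind of terminator instead of testing membership in a fixed order, so that
a stray '\n' in the body of a '\r'-terminated file does not decide the
result. This is only a rough sketch under some assumptions: Python 3, text
that has already been decoded, and a file whose most frequent terminator is
the intended one. The name detect_line_ending is made up for illustration.

```python
import re


def detect_line_ending(text, default='\n'):
    """Return the most common line terminator in text, or default if none."""
    counts = {'\r\n': 0, '\n': 0, '\r': 0}
    # The alternation tries '\r\n' first, so a CRLF pair is counted once
    # and never as a lone '\r' followed by a lone '\n'.
    for match in re.finditer(r'\r\n|\r|\n', text):
        counts[match.group()] += 1
    if not any(counts.values()):
        return default
    # On a tie, max() returns the first key in insertion order:
    # '\r\n', then '\n', then '\r'.
    return max(counts, key=counts.get)
```

This only reports the dominant ending and leaves the text untouched.
Whether it is then safe to replace anything still depends on the question
above about embedded binary data.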

Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list
