Re: UTF16, BOM, and Windows Line endings

Neil Hodgson Mon, 06 Feb 2006 16:50:42 -0800

Fuzzyman:

> Thanks - so I need to decode to unicode and *then* split on line
> endings. Problem is, that means I can't use Python to handle line
> endings where I don't know the encoding in advance.
> 
> In another thread I've posted a small function that *guesses* line
> endings in use.


    You can normalise line endings:

>>> x = "a\r\nb\rc\nd\n\re"
>>> y = x.replace("\r\n", "\n").replace("\r", "\n")
>>> y
'a\nb\nc\nd\n\ne'
>>> print(y)
a
b
c
d

e

    The empty line appears because "\n\r" is not a single line end:
    the "\n" and the "\r" each count as one, so the pair normalises
    to "\n\n".
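
    For the original problem, a rough sketch of the order of operations
    might look like the following: decode first, then normalise, then
    split. It assumes the input is UTF-16 with a BOM, and the helper
    names normalise_newlines and decode_and_split are made up for
    illustration; they are not from any library.

```python
import codecs


def normalise_newlines(text):
    # Replace the two-character CR LF pair first, so that it is not
    # turned into two separate line ends by the second replace.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def decode_and_split(raw):
    # Assumption: raw is UTF-16 data that starts with a BOM.
    # The "utf-16" codec reads the BOM to pick the byte order and
    # drops it from the decoded text.
    text = raw.decode("utf-16")
    return normalise_newlines(text).split("\n")


# Sample data: a little-endian BOM followed by mixed line endings.
raw = codecs.BOM_UTF16_LE + u"a\r\nb\rc\n".encode("utf-16-le")
lines = decode_and_split(raw)
```

    Splitting the raw bytes before decoding would not be safe here,
    because in UTF-16 each line-end character occupies two bytes.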

    Neil
-- 
http://mail.python.org/mailman/listinfo/python-list
