Re: Newby: how to transform text into lines of text

John Machin Sun, 25 Jan 2009 16:45:39 -0800

On 26/01/2009 10:34 AM, Tim Chase wrote:

I believe that using the formulaic "for line in file(FILENAME)" iteration guarantees that each "line" will have at most only one '\n' and it will be at the end (again, a malformed text-file with no terminal '\n' may cause it to be absent from the last line)

It seems that you are right -- not that I can find such a guarantee written anywhere. I had armchair-philosophised that writing "foo\n\r\nbar\r\n" to a file in binary mode and reading it on Windows in text mode would be strict and report the first line as "foo\n\n"; I was wrong.
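A minimal sketch of that experiment follows. It assumes Python 3, where open() with newline=None performs universal-newline translation on every platform, rather than the Python 2 file() call on Windows discussed above; the helper name read_lines_text_mode is made up for illustration.

```python
import os
import tempfile


def read_lines_text_mode(raw_bytes):
    """Write raw_bytes in binary mode, then iterate over the file in text mode."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw_bytes)
        # newline=None turns '\r\n' and a lone '\r' into '\n' while reading.
        with open(path, "r", newline=None, encoding="ascii") as f:
            return [line for line in f]
    finally:
        os.remove(path)


lines = read_lines_text_mode(b"foo\n\r\nbar\r\n")
print(lines)  # ['foo\n', '\n', 'bar\n']
```

The first line comes back as "foo\n", not "foo\n\n": the "\r\n" that follows it becomes a separate, empty line.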

So, we are left with the unfortunately awkward
    if line.endswith('\n'):
        line = line[:-1]
You're welcome to it, but I'll stick with my more DWIM solution of "get rid of anything that resembles an attempt at a CR/LF".
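To make the difference between the two approaches concrete, here is a small side-by-side sketch. The function names chomp and strip_eol are hypothetical, and reading the DWIM version as rstrip('\r\n') is an assumption about what "get rid of anything that resembles an attempt at a CR/LF" means.

```python
def chomp(line):
    """Strict version: remove exactly one trailing '\n', if present."""
    if line.endswith('\n'):
        line = line[:-1]
    return line


def strip_eol(line):
    """DWIM version (assumed): remove any run of trailing CR/LF characters."""
    return line.rstrip('\r\n')


for sample in ("foo\n", "foo", "foo\r\n", "foo\n\r"):
    print(repr(sample), repr(chomp(sample)), repr(strip_eol(sample)))
```

The strict version leaves a stray '\r' behind when the input has "\r\n" endings that were not translated, and it does nothing for "\n\r"; the DWIM version removes both but would also swallow a trailing '\r' that was meant as data.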

Thanks, but I don't want it. My point was that you didn't TTOPEWYM (tell the OP exactly what you meant).


My approach to DWIM with data is, given
   norm_space = lambda s: u' '.join(s.split())

to break up the line into fields first (just in case the field delimiter == '\t') then apply norm_space to each field. This gets rid of your '\r' at end (or start!) of line, and multiple whitespace characters are replaced by a single space. Whitespace includes NBSP (U+00A0) as an added bonus for being righteous and using Unicode :-)
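As a rough illustration of that field-by-field approach, the sketch below assumes Python 3 (where every str is Unicode, so the u'' prefix is optional) and a tab delimiter. The function normalize_record and the sample line are invented for the example.

```python
def norm_space(s):
    # split() with no argument splits on runs of whitespace, which for
    # Unicode strings includes NBSP (U+00A0), '\r' and '\n'.
    return ' '.join(s.split())


def normalize_record(line, delimiter='\t'):
    """Split on the delimiter first, then normalise whitespace per field."""
    return [norm_space(field) for field in line.split(delimiter)]


record = normalize_record("\r  Smith,\u00a0 John \t 42   Main  St\r\n")
print(record)  # ['Smith, John', '42 Main St']
```

Splitting on the delimiter before normalising matters when the delimiter is itself whitespace: calling norm_space on the whole line first would collapse the tabs and lose the field boundaries.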

Thank goodness I haven't found any of my data-sources using "\n\r" instead, which would require me to left-strip '\r' characters as well. Sigh. My kingdom for competency. :-/

Indeed. I actually got data in that format once from a *x programmer who was so kind as to do it that way just for me because he knew that I use Windows and he thought that's what Windows text files looked like. No kidding.


Cheers,
John
--
http://mail.python.org/mailman/listinfo/python-list
