John Machin wrote:
> On Jan 19, 11:00 pm, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
>> John Machin wrote:
>>> I'm happy enough with reassembling the second item. The problem is in
>>> reliably and correctly collapsing the whitespace in each of the above
>>> five elements. The standard Python idiom of u' '.join(text.split())
>>> won't work because the text is Unicode and u'\xa0' is whitespace
>>> and would be converted to a space.
>> would this (or some variation of it) work?
>>
>> >>> re.sub("[ \n\r\t]+", " ", u"foo\n frab\xa0farn")
>> u'foo frab\xa0farn'
>>
>> </F>
>
> Yes, partially. Leading and trailing whitespace has to be removed
> entirely, not replaced by one space.

Sounds like adding a .strip() to me ...

Stefan
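
For reference, a minimal sketch of how the two suggestions in this thread might be combined. The function name collapse_ascii_whitespace is made up for illustration, and the sketch assumes that a u'\xa0' at the very start or end of the text should be kept as well. That is why it calls .strip(" ") rather than a bare .strip(): on a Unicode string, .strip() with no argument treats u'\xa0' as whitespace and would remove it.

```python
import re

# Only these four ASCII characters count as collapsible whitespace here.
# u'\xa0' (no-break space) is deliberately left out of the class.
_ASCII_WS = re.compile(r"[ \n\r\t]+")


def collapse_ascii_whitespace(text):
    """Collapse runs of ASCII whitespace to one space and trim the ends.

    Hypothetical helper, not from the thread. After the substitution the
    only whitespace left from the class above is single spaces, so
    strip(" ") is enough to trim both ends without touching u'\xa0'.
    """
    return _ASCII_WS.sub(" ", text).strip(" ")


if __name__ == "__main__":
    print(repr(collapse_ascii_whitespace(u"  foo\n  frab\xa0farn \t")))
```

If leading and trailing no-break spaces should be dropped too, a plain .strip() in place of .strip(" ") would do that.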