Re: Excess whitespace in my soup

Stefan Behnel Sat, 19 Jan 2008 05:40:57 -0800

John Machin wrote:
> On Jan 19, 11:00 pm, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
>> John Machin wrote:
>>> I'm happy enough with reassembling the second item. The problem is in
>>> reliably and correctly collapsing the whitespace in each of the above
>>> five elements. The standard Python idiom of u' '.join(text.split())
>>> won't work because the text is Unicode and u'\xa0' is whitespace
>>> and would be converted to a space.
>> would this (or some variation of it) work?
>>
>>  >>> re.sub("[ \n\r\t]+", " ", u"foo\n  frab\xa0farn")
>> u'foo frab\xa0farn'
>>
>> </F>
> 
> Yes, partially. Leading and trailing whitespace has to be removed
> entirely, not replaced by one space.


Sounds like adding a .strip() to me ...
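
As a rough sketch, combining Fredrik's re.sub() with a strip could look
like the following. The function name collapse_ws is made up for
illustration, and the code assumes Python 3. One caveat: a bare .strip()
also removes u'\xa0' at either end, because it counts as whitespace for
Unicode strings, so this sketch strips only the plain space that the
substitution can leave behind.

```python
import re

# Assumption: only these four ASCII characters count as collapsible
# whitespace; u'\xa0' (no-break space) must survive untouched.
_WS_RUN = re.compile("[ \n\r\t]+")


def collapse_ws(text):
    """Collapse runs of space/newline/CR/tab to a single space and
    drop them entirely at the start and end of the string."""
    collapsed = _WS_RUN.sub(" ", text)
    # After the substitution, any leading/trailing run is exactly one
    # space, so stripping " " is enough and leaves u'\xa0' alone.
    return collapsed.strip(" ")


if __name__ == "__main__":
    print(repr(collapse_ws(u"  foo\n  frab\xa0farn \t\n")))
```

If leading and trailing u'\xa0' should go as well, a plain .strip()
would do that instead.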

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list
