[issue13391] string.strip Does Not Remove Zero-Width-Space (ZWSP)

2011-11-15 Thread Dave Mankoff
Dave Mankoff added the comment: "Use regular expressions for more advanced stripping than what the .strip method provides." So I guess this brings me back to my original issue. I'm not looking for particularly advanced stripping. I just want to remove all whitespace and othe
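
For readers who want the behavior asked for here, the following is a minimal sketch, not something the thread settled on. It assumes that "other invisible characters" can be approximated by the Unicode general category Cf (format characters, which includes U+200B). The helper name strip_invisible is hypothetical and is not part of the standard library.

```python
import unicodedata


def strip_invisible(text):
    """Strip leading/trailing whitespace and format (Cf) characters.

    Hypothetical helper: treating category Cf as "invisible" is an
    assumption of this sketch, not a rule taken from str.strip().
    """
    def is_strippable(ch):
        return ch.isspace() or unicodedata.category(ch) == "Cf"

    start, end = 0, len(text)
    # Advance past strippable characters at the front.
    while start < end and is_strippable(text[start]):
        start += 1
    # Retreat past strippable characters at the back.
    while end > start and is_strippable(text[end - 1]):
        end -= 1
    return text[start:end]
```

Only the ends of the string are touched, so a ZWSP between two words is left in place, mirroring how .strip() treats interior whitespace.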

[issue13391] string.strip Does Not Remove Zero-Width-Space (ZWSP)

2011-11-14 Thread Dave Mankoff
Dave Mankoff added the comment: So I contacted the Unicode Technical Committee about the issue and promptly received a response back. They pointed out that the ZWSP was, once upon a time, considered white space, but that was changed in Unicode 4.0.1 http://www.unicode.org/review

[issue13391] string.strip Does Not Remove Zero-Width-Space (ZWSP)

2011-11-14 Thread Dave Mankoff
Dave Mankoff added the comment: But why are they not a space? I mean, they literally have the word space in their name and are used as separators between words. I can't really see any reason why you wouldn't want this behavior - there's no time when I would be thankful tha

[issue13391] string.strip Does Not Remove Zero-Width-Space (ZWSP)

2011-11-14 Thread Dave Mankoff
Dave Mankoff added the comment: I appreciated the quick turnaround on this. Perhaps I am misunderstanding the resolution. I understand that strip uses _PyUnicode_IsWhitespace, and that _PyUnicode_IsWhitespace "Returns 1 for Unicode characters having the bidirectional type 'WS'
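
As a rough way to see what the whitespace test is looking at, the sketch below prints the relevant Unicode properties of a character using the standard unicodedata module. The helper name describe is hypothetical, and the values depend on the Unicode database shipped with the interpreter.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE


def describe(ch):
    """Return the properties that matter for whitespace detection."""
    return {
        "isspace": ch.isspace(),
        "category": unicodedata.category(ch),
        "bidirectional": unicodedata.bidirectional(ch),
    }


if __name__ == "__main__":
    # Compare an ordinary space with the zero-width space.
    for label, ch in (("SPACE", " "), ("ZWSP", ZWSP)):
        print(label, describe(ch))
```

On a current Python 3, the ordinary space reports bidirectional class WS while the ZWSP does not, which is consistent with .strip() leaving it in place.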

[issue13391] string.strip Does Not Remove Zero-Width-Space (ZWSP)

2011-11-12 Thread Dave Mankoff
New submission from Dave Mankoff: Title pretty much says it all. Simple test case:
>>> len(u' \t\r\n\u200B'.strip())
1
Should be zero. Same problem in Python3:
>>> len(' \t\r\n\u200B'.strip())
1
-- components: Unicode messages: 147538 nosy: