[issue25760] TextWrapper fails to split 'two-and-a-half-hour' correctly

Samwyse Sat, 28 Nov 2015 18:08:34 -0800

New submission from Samwyse:

Single-character words in a hyphenated phrase are not split correctly.  The 
root issue is the wordsep_re class variable.  To reproduce, run the following:


>>> import textwrap
>>> textwrap.TextWrapper.wordsep_re.split('two-and-a-half-hour')
['', 'two-', 'and-a', '-half-', 'hour']

It works if 'a' is replaced with two or more alphabetic characters.

>>> textwrap.TextWrapper.wordsep_re.split('two-and-aa-half-hour')
['', 'two-', '', 'and-', '', 'aa-', '', 'half-', 'hour']

The problem is in this part of the pattern:  (?=\w+[^0-9\W])

I confess that I don't understand what situation would require so 
complicated a pattern.  Why wouldn't (?=\w) work?
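
The effect of that lookahead can be isolated from the rest of wordsep_re.  
The following is a minimal sketch, not the library's actual pattern: the 
names strict, loose, and hyphen_positions are made up for illustration, and 
each regex is only a hyphen plus the lookahead in question.

```python
import re

# Hypothetical, simplified patterns: just a hyphen plus the lookahead under
# discussion, not the full textwrap.TextWrapper.wordsep_re.
strict = re.compile(r'-(?=\w+[^0-9\W])')  # lookahead quoted in the report
loose = re.compile(r'-(?=\w)')            # proposed simpler lookahead

def hyphen_positions(pattern, text):
    """Return the indexes of the hyphens that the pattern accepts."""
    return [m.start() for m in pattern.finditer(text)]

phrase = 'two-and-a-half-hour'
strict_hits = hyphen_positions(strict, phrase)  # [3, 9, 14]
loose_hits = hyphen_positions(loose, phrase)    # [3, 7, 9, 14]

# The two lookaheads also differ when only digits follow the hyphen.
strict_digits = hyphen_positions(strict, 'page-42')  # []
loose_digits = hyphen_positions(loose, 'page-42')    # [4]
```

The strict lookahead needs at least one word character followed by a 
non-digit word character, so the hyphen at index 7 (before the lone 'a') is 
rejected, which matches the 'and-a' chunk in the output above.  Whether the 
digit case is the reason for the original pattern is only a guess.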

----------
components: Library (Lib)
messages: 255558
nosy: samwyse
priority: normal
severity: normal
status: open
title: TextWrapper fails to split 'two-and-a-half-hour' correctly
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25760>
_______________________________________