Re: Should nonbreakable space belong to whitespace class?

Jacob Sparre Andersen Fri, 24 Feb 2006 11:37:46 -0800

Denis Barbier wrote:

Miroslav Kure wrote:

Unfortunately, nonbreakable space is not included in character class \s or [:space:] (aka whitespace). As it is usually not distinguishable from the ordinary space in most of the fonts, I would say that nonbreakable space should be added to the whitespace class in regexp libraries.
No, that would defeat its purpose; a non-breaking space is used to glue two words together.
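
To make the disagreement above concrete, here is a minimal sketch in Python. It is only an illustration: the thread is about locale-driven regexp libraries, and Python's `re` module is an assumption swapped in for demonstration. With the `re.ASCII` flag, `\s` is limited to `[ \t\n\r\f\v]` and skips the no-break space, which is the behaviour Miroslav describes; without the flag, Python 3 treats U+00A0 as whitespace, which is the behaviour he asks for. The sample string is made up.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE (hypothetical sample data below)

text = f"10{NBSP}km"

# ASCII-only whitespace class: \s is [ \t\n\r\f\v], so the NBSP is not a separator.
ascii_parts = re.split(r"\s+", text, flags=re.ASCII)

# Unicode-aware whitespace class (the default for str patterns in Python 3):
# here the NBSP does count as whitespace and the string is split in two.
unicode_parts = re.split(r"\s+", text)

print(ascii_parts)
print(unicode_parts)
```

Which of the two behaviours a given library shows depends on its character tables or locale data, so this should be checked per tool rather than assumed.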

But only in the graphical sense. In the logical sense they are still two separate words.

Isn't the real problem that Miroslav should have used '\b' to identify the boundary between word and non-word text or '[:^word:]' to identify all non-word characters? Ideally this should even catch the invisible word separators used in some cases in some languages. This only has one problem; the soft hyphen is for some reason not classified as a word character (in da_DK and fo_FO), which it logically should be.
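
A rough sketch of the word-based approach, again using Python's `re` module as a stand-in (an assumption; the locales da_DK and fo_FO mentioned above are not involved here, since Python classifies characters from Unicode data). The helper name `words` and the sample strings are hypothetical. Runs of `\w` are exactly the stretches of text delimited by `\b` boundaries, so the no-break space separates words without needing to be in the whitespace class. The second call shows the soft hyphen problem: U+00AD is not a word character in Python either, so one logical word is cut in two.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE
SHY = "\u00ad"   # SOFT HYPHEN

def words(text):
    """Return the maximal runs of word characters in text (hypothetical helper)."""
    return re.findall(r"\w+", text)

# The NBSP is a non-word character, so it separates words like any other space.
glued = words(f"10{NBSP}km away")

# The soft hyphen is also a non-word character, so the word is split at it.
hyphenated = words(f"co{SHY}operate")

print(glued)
print(hyphenated)
```

One possible workaround, if the soft hyphen should not split words, is to strip U+00AD from the text before matching; whether that is acceptable depends on the application.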


Jacob
--
"... there may be many others,
 but they haven't been discovered"             -- Tom Lehrer

