Hi Jean-Jacques,

On 2022-06-03 14:56, Jean-Jacques Wagner via use-livecode wrote:
> Hi,
> Version 6.7: word boundaries are char numbers 09, 10, 11, 12, 13, 32
> Version 9.67: word boundaries are char numbers 09, 10, 11, 12, 13, 32, 202
>
> HyperCard and LiveCode 6.7: the number of words of (numtochar(32) &
> numtochar(202) & numtochar(32) & numtochar(202) & numtochar(32)) = 2
> LiveCode 9.67: the number of words of
> (numtochar(32) & numtochar(202) & numtochar(32) &
> numtochar(202) & numtochar(32)) = 0
>
> Is it a change or a bug that numtochar(202) is now considered a word
> boundary, as numtochar(32) is?

This is something we will need to consider - please do file a bug about it at quality.livecode.com (so you can track any further discussion about it).

I can see how this change occurred, and it is perhaps more a 'side-effect of implementation' than an intended change.

Prior to 7.0 - the word chunk used the C library 'ctype' isspace function - which returns true if a character is 'whitespace'. However, the engine *also* tweaked the C library character tables to make it so that NBSP (202 on MacRoman - something else on Windows/Linux - 160 maybe?) was *not* a space character. This was primarily a very dirty hack (which was done before my time!) to allow non-breaking spaces to prevent word breaks in fields (I strongly suspect the effect on the word chunk was never considered!).
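As a rough illustration of the pre-7.0 rule described above, here is a minimal sketch in Python (not LiveCode script, and not the engine's actual code). The function name and the whitespace table are assumptions made for the example; it also ignores the word chunk's handling of quoted strings.

```python
# Hypothetical model of the pre-7.0 word chunk; names are illustrative only.
NBSP_MACROMAN = 202  # NBSP byte value in MacRoman, as mentioned above

# Byte values treated as whitespace (the list from the original question).
# NBSP is deliberately absent, mimicking the tweaked character table.
LEGACY_WHITESPACE = {9, 10, 11, 12, 13, 32}


def legacy_word_count(data: bytes) -> int:
    """Count runs of non-whitespace bytes; NBSP counts as a word character."""
    count = 0
    in_word = False
    for byte in data:
        if byte in LEGACY_WHITESPACE:
            in_word = False
        elif not in_word:
            # First non-whitespace byte after whitespace starts a new word.
            in_word = True
            count += 1
    return count


# space, NBSP, space, NBSP, space: each NBSP forms a one-character word.
sample = bytes([32, NBSP_MACROMAN, 32, NBSP_MACROMAN, 32])
print(legacy_word_count(sample))  # 2
```

Under this model each NBSP is a word in its own right, which matches the result of 2 reported for 6.7.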

When we moved to Unicode - we changed the word-breaking detection in fields to use a simplified version of the Unicode algorithm and Unicode character properties (NBSP, unsurprisingly, has the no-break property!). Similarly, we changed the word chunk to use the Unicode 'whitespace' property. In the Unicode world - being whitespace and being non-breaking are two separate properties... Hence the difference in behavior since 7.0.
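The two separate properties can be seen from Python's standard library, used here only as a stand-in for the engine's Unicode tables. This is a sketch under assumptions: str.isspace() approximates the Unicode whitespace property, the "<noBreak>" decomposition tag stands in for the no-break property, and the helper names are made up for the example.

```python
import unicodedata

# Decode the MacRoman bytes from the original question into Unicode text.
sample = bytes([32, 202, 32, 202, 32]).decode("mac_roman")
NBSP = "\u00a0"


def is_whitespace(ch: str) -> bool:
    """Approximation of the Unicode whitespace property."""
    return ch.isspace()


def is_no_break(ch: str) -> bool:
    """True if the character's decomposition carries the <noBreak> tag."""
    return unicodedata.decomposition(ch).startswith("<noBreak>")


def unicode_word_count(text: str) -> int:
    """Count words by splitting on every whitespace character."""
    return len(text.split())


print(is_whitespace(NBSP), is_no_break(NBSP))  # True True
print(is_whitespace(" "), is_no_break(" "))    # True False
print(unicode_word_count(sample))              # 0
```

Because NBSP is both whitespace and non-breaking, a word chunk that only asks "is this whitespace?" splits on it, giving the count of 0 reported for 9.x.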

The reason this is 'of interest' is that the word chunk has had quite a hefty performance regression since 7.0 due to the switch to Unicode, so re-looking at what it should *actually* do (taking into account what would be most useful in the widest possible circumstances) is definitely on the cards.

Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps
