I didn't realize until now that offset() simply fails with some Unicode strings:

put offset("a","↘𠜎qeiuruioqeaaa↘𠜎qeiuar",13) -- puts 0

On Mon, Nov 12, 2018 at 9:17 PM Geoff Canyon <gcan...@gmail.com> wrote:

> A few things:
>
> 1. It seems codepointOffset can only find a single character? So it
> won't work for any search for a multi-character string?
> 2: codepointOffset seems to work differently for multi-byte characters and
> regular characters:
>
> put codepointoffset("e","↘ndatestest",6) -- puts 3
> put codepointoffset("e","andatestest",6) -- puts 9
>
> 3: It seems that when multi-byte characters are involved, codepointOffset
> suffers from the same sort of slow-down as offset does. For example, in a
> 145K string with about 20K hits for a single character, a simple
> codepointOffset routine (below) takes over 10 seconds, while the item-based
> routine takes about 0.3 seconds for the same results.
>
> On Mon, Nov 12, 2018 at 4:21 PM Monte Goulding via use-livecode <
> use-livecode@lists.runrev.com> wrote:
>
>> Hi Folks
>>
>> I was a bit perplexed by this so I had a quick look about the engine and
>> I see the issue. The problem is you are using `offset` which works on
>> characters. Characters in LiveCode are neither unicode codepoints or bytes.
>> They are graphemes. This means that when you have chars to skip the entire
>> string needs to be parsed to find the grapheme boundaries so that the index
>> can be translated into graphemes to skip. Note that if the strings you were
>> dealing with weren’t unicode then the translation of chars to graphemes is
>> 1 -> 1 so there’s no big cost which is why things are much faster when you
>> textEncode and offset that.
>>
>> So! Change to using codepointOffset and hopefully it will be much
>> speedier!
>>
>> Cheers
>>
>> Monte
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode@lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
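
For anyone following along, here is a rough sketch of the distinction Monte describes, showing why an index means different things depending on the unit it counts. It is written in Python rather than LiveCode, purely as an illustration: it says nothing about how the engine is implemented. The helper names unit_counts and offset_after_skip are made up for this example, and offset_after_skip assumes the result is counted from the end of the skipped portion, which is only my reading of how a skip parameter is meant to behave. Python strings index by codepoint, so graphemes are not covered here.

```python
# Illustration only: Python, not LiveCode. Helper names are hypothetical.

def unit_counts(text: str) -> dict:
    """Measure the same text in three different units."""
    return {
        "codepoints": len(text),  # a Python str is indexed by codepoint
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "utf8_bytes": len(text.encode("utf-8")),
    }


def offset_after_skip(needle: str, haystack: str, skip: int) -> int:
    """Search by codepoint, starting after `skip` codepoints.

    Assumption: the result is 1-based and relative to the skip point,
    with 0 meaning "not found".
    """
    found = haystack.find(needle, skip)
    return 0 if found < 0 else found - skip + 1


# The test string from this thread, written with escapes so the code does
# not depend on the file encoding: U+2198 is the arrow, U+2070E is the
# CJK character outside the Basic Multilingual Plane.
haystack = "\u2198\U0002070Eqeiuruioqeaaa\u2198\U0002070Eqeiuar"

print(unit_counts(haystack))
# {'codepoints': 23, 'utf16_code_units': 25, 'utf8_bytes': 33}

print(offset_after_skip("a", haystack, 13))               # 1
print(offset_after_skip("e", "\u2198ndatestest", 6))      # 3
print(offset_after_skip("e", "andatestest", 6))           # 3
```

Under that assumption, a codepoint-based search gives 1 for the first example above and 3 for both codepointOffset examples, regardless of whether the string contains multi-byte characters. The three counts for the same string (23, 25 and 33) show why any skip value has to be translated into the unit the search actually uses.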