Good catch, Alex. My code was closer, but it didn't handle repeating characters correctly. Here is an updated version.

function allOffsets2 D,S,pCase
   local dLength, C, R
   -- returns a comma-delimited list of the offsets of D in S
   set the caseSensitive to pCase is true
   set the itemDel to D
   put length(D) into dLength
   put 1 - dLength into C
   if dLength > 1 then
      local n, i, j, D2, L2
      put 0 into n
      repeat with i = 2 to dLength
         if char i to -1 of D is char 1 to -i of D then
            add 1 to n
            put char (1-i) to -1 of D into D2[n]
            put i-1 into L2[n]
         end if
      end repeat
   end if
   repeat for each item i in S
      if C > 0 and n > 0 then
         repeat with j = 1 to n
            if i&D begins with D2[j] then
               put C+L2[j],"" after R
            end if
         end repeat
      end if
      add length(i) + dLength to C
      put C,"" after R
   end repeat
   set the itemDel to comma
   delete char -1 of R
   if item -1 of R > len(S) then
      if the number of items of R is 1 then
         return 0
      else
         delete item -1 of R
      end if
   end if
   if len(i) > 0 then
      repeat with j = n down to len(i)+1
         if char -len(D2[j]) to -1 of S is D2[j] then
            delete item -1 of R
         end if
      end repeat
   end if
   return R
end allOffsets2

On Sat, Nov 3, 2018 at 8:33 AM Alex Tweedly via use-livecode <use-livecode@lists.runrev.com> wrote:

> Hi Geoff,
>
> unfortunately the impact of overlapping delimiter strings is more severe
> than simply not finding them. The code on github gets the wrong answer
> if there is an overlapping string at the very end of the search string,
> e.g.
>
> alloffsets("aaaa", "aaaaaaaaa") wrongly gives 1,5,10
>
> I suspect the test for
>
> if char -dLength to -1 of S is D then return char 1 to -2 of R
> should be (something like)
> if item -1 of S is empty then return char 1 to -2 of R
> but to be honest, I'm not 10% certain of that.
>
> Alex.
>
> On 03/11/2018 00:43, Geoff Canyon via use-livecode wrote:
> > I like that, changing it. Now available at
> > https://github.com/gcanyon/alloffsets
> >
> > One thing I don't see how to do without significantly impacting performance
> > is to return all offsets if there are overlapping strings.
> > For example:
> >
> > allOffsets("aba","abababa")
> >
> > would return 1,5, when it might be reasonable to expect it to return 1,3,5.
> > Using the offset function with numToSkip would make that easy; adapting
> > allOffsets to do so would be harder to do cleanly I think.
> >
> > gc
> >
> > On Fri, Nov 2, 2018 at 12:17 PM Bob Sneidar via use-livecode <use-livecode@lists.runrev.com> wrote:
> >
> >> how about allOffsets?
> >>
> >> Bob S
> >>
> >>> On Nov 2, 2018, at 09:16 , Geoff Canyon via use-livecode <use-livecode@lists.runrev.com> wrote:
> >>> All of those return a single value; I wanted to convey the concept of
> >>> returning multiple values. To me listOffset implies it does the same thing
> >>> as itemOffset, since items come in a list. How about:
> >>>
> >>> offsets -- not my favorite because it's almost indistinguishable from offset
> >>> offsetsOf -- seems a tad clumsy
> >>>
> >>> On Fri, Nov 2, 2018 at 7:41 AM Bob Sneidar via use-livecode <use-livecode@lists.runrev.com> wrote:
> >>>
> >>>> It probably should be named listOffset, like itemOffset or lineOffset.
> >>>>
> >>>> Bob S
> >>>>
> >>>>> On Nov 1, 2018, at 17:04 , Geoff Canyon via use-livecode <use-livecode@lists.runrev.com> wrote:
> >>>>> Nice! I *just* finished creating a github repository for it, and adding
> >>>>> support for multi-char search strings, much as you did. I was coming to the
> >>>>> list to post the update when I saw your post.
> >>>>>
> >>>>> Here's the GitHub link: https://github.com/gcanyon/offsetlist
> >>>>>
> >>>>> Here's my updated version:
> >>>>>
> >>>>> function offsetList D,S,pCase
> >>>>>    -- returns a comma-delimited list of the offsets of D in S
> >>>>>    set the caseSensitive to pCase is true
> >>>>>    set the itemDel to D
> >>>>>    put length(D) into dLength
> >>>>>    put 1 - dLength into C
> >>>>>    repeat for each item i in S
> >>>>>       add length(i) + dLength to C
> >>>>>       put C,"" after R
> >>>>>    end repeat
> >>>>>    set the itemDel to comma
> >>>>>    if char -dLength to -1 of S is D then return char 1 to -2 of R
> >>>>>    put length(C) + 1 into lenC
> >>>>>    put length(R) into lenR
> >>>>>    if lenC = lenR then return 0
> >>>>>    return char 1 to lenR - lenC - 1 of R
> >>>>> end offsetList
> >>>>>
> >>>>> On Thu, Nov 1, 2018 at 8:28 AM Niggemann, Bernd via use-livecode <use-livecode@lists.runrev.com> wrote:
> >>>>>
> >>>>>> Hi Geoff,
> >>>>>>
> >>>>>> thank you for this beautiful script.
> >>>>>>
> >>>>>> I modified it a bit to accept multi-character search string and also for
> >>>>>> case sensitivity.
> >>>>>>
> >>>>>> It definitely is a lot faster for unicode text than anything I have seen.
> >>>>>> -----------------------------
> >>>>>> function offsetList D,S, pCase
> >>>>>>    -- returns a comma-delimited list of the offsets of D in S
> >>>>>>    -- pCase is a boolean for caseSensitive
> >>>>>>    set the caseSensitive to pCase
> >>>>>>    set the itemDel to D
> >>>>>>    put the length of D into tDelimLength
> >>>>>>    repeat for each item i in S
> >>>>>>       add length(i) + tDelimLength to C
> >>>>>>       put C - (tDelimLength - 1),"" after R
> >>>>>>    end repeat
> >>>>>>    set the itemDel to comma
> >>>>>>    if char -1 of S is D then return char 1 to -2 of R
> >>>>>>    put length(C) + 1 into lenC
> >>>>>>    put length(R) into lenR
> >>>>>>    if lenC = lenR then return 0
> >>>>>>    return char 1 to lenR - lenC - 1 of R
> >>>>>> end offsetList
> >>>>>> ------------------------------
> >>>>>>
> >>>>>> Kind regards
> >>>>>> Bernd
> >>>>>>
> >>>>>>> Date: Thu, 1 Nov 2018 00:15:37 -0700
> >>>>>>> From: Geoff Canyon
> >>>>>>> To: How to use LiveCode <use-livecode@lists.runrev.com>
> >>>>>>> Subject: Re: How to find the offset of the last instance of a
> >>>>>>> repeating character in a string?
> >>>>>>>
> >>>>>>> I was curious if using the itemDelimiter might work for this, so I wrote
> >>>>>>> the below code out of curiosity; but in my quick testing with single-byte
> >>>>>>> characters it was only about 30% faster than the above methods, so I didn't
> >>>>>>> bother to post it.
> >>>>>>>
> >>>>>>> But Ben Rubinstein just posted about a terrible slow-down doing pretty much
> >>>>>>> this same thing for text with unicode characters. So I ran a simple test
> >>>>>>> with 8000 character long strings that start with a single unicode
> >>>>>>> character, this is about 15x faster than offset() with skip. For
> >>>>>>> 100,000-character lines it's about 300x faster, so it seems to be immune to
> >>>>>>> the line-painter issues skip is subject to.
> >>>>>>> So for what it's worth:
> >>>>>>>
> >>>>>>> function offsetList D,S
> >>>>>>>    -- returns a comma-delimited list of the offsets of D in S
> >>>>>>>    set the itemDel to D
> >>>>>>>    repeat for each item i in S
> >>>>>>>       add length(i) + 1 to C
> >>>>>>>       put C,"" after R
> >>>>>>>    end repeat
> >>>>>>>    set the itemDel to comma
> >>>>>>>    if char -1 of S is D then return char 1 to -2 of R
> >>>>>>>    put length(C) + 1 into lenC
> >>>>>>>    put length(R) into lenR
> >>>>>>>    if lenC = lenR then return 0
> >>>>>>>    return char 1 to lenR - lenC - 1 of R
> >>>>>>> end offsetList
> >>>>>>>
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> use-livecode mailing list
> >>>>>> use-livecode@lists.runrev.com
> >>>>>> Please visit this url to subscribe, unsubscribe and manage your
> >>>>>> subscription preferences:
> >>>>>> http://lists.runrev.com/mailman/listinfo/use-livecode