On Nov 9, 2016, at 7:46 AM, Alejandro Tejada <capellan2...@gmail.com> wrote:
>
> Hi all,
>
> Recently I made a very long script for searching 3 words among a list
> of 1080 lines of words.
>
> Download the zipped stack MatchingPatternsv02
> from this forum thread:
> http://forums.livecode.com/viewtopic.php?f=7&t=28288
>
> I suspect that LiveCode provides better tools for this task, but I
> don't know which they are or how to use them. Maybe a simpler solution is
> to employ a regex, array operations or a really clever handler.
>
> How many different methods (functions and commands) does LiveCode
> provide for this task of comparing and finding 3 words (taken from a list
> of 12 words) among 1080 lines of 4 words?

Here’s one way, using some text-munching utility functions. The following is a longish list of handlers, but as you can see, the basic function find3words() is pretty compact. Once you have the utilities in place in a library, they can be used in all kinds of contexts to shortcut things. You might be able to do this with a regex, but I don’t know; I’m allergic to regex. I like to work in pure LC.

-- Peter

Peter M. Brigham
pmb...@gmail.com

--------

-- the following function is not tested:

function find3words pList, pWords
   -- pWords holds the three search words, space-separated
   -- assumes the words in each line of pList are also space-separated;
   -- use "item" / "the items of" instead if the data is comma-delimited
   repeat with w = 1 to 3
      put lineOffsets(word w of pWords,pList) into A[w]
   end repeat
   put A[1] into f1
   put A[2] into f2
   put A[3] into f3
   -- pass comma explicitly: a one-item list has no delimiter to detect
   put intersectLists(f1,f2,comma) into out1
   put intersectLists(out1,f3,comma) into outlist
   -- now have to check that we have found whole words, not just part of a word
   repeat for each item i in outlist
      put true into allFound
      repeat with w = 1 to 3
         if word w of pWords is not among the words of line i of pList then
            put false into allFound
            exit repeat
         end if
      end repeat
      -- add the line number once, and only if all three words matched
      if allFound then put i & comma after finalList
   end repeat
   if finalList = empty then put 0 into finalList
   return item 1 to -1 of finalList -- delete trailing comma
end find3words

function lineOffsets str, pContainer, matchWhole
   -- returns a comma-delimited list of all the lineOffsets of str
   --    in pContainer
   -- if matchWhole = true then only whole lines are located
   --    else finds line matches everywhere str is part of a line in pContainer
   -- duplicates are stripped out
   -- note: to get the last lineOffset of a string in a container (often useful)
   --    use "item -1 of lineOffsets(...)"
   -- requires offsets()

   if matchWhole = empty then put false into matchWhole
   put offsets(str,pContainer) into offList
   if offList = "0" then return "0"
   repeat for each item i in offList
      put the number of lines of (char 1 to i of pContainer) into lineNbr
      if matchWhole then
         if line lineNbr of pContainer <> str then next repeat
      end if
      put 1 into A[lineNbr]
      -- using an array avoids duplicates
   end repeat
   put the keys of A into lineList
   sort lines of lineList ascending numeric
   replace cr with comma in lineList
   return lineList
end lineOffsets

function offsets str, pContainer
   -- returns a comma-delimited list of all the offsets of str in pContainer
   -- returns 0 if str is not found
   -- note: offsets("xx","xxxxxx") returns "1,3,5" not "1,2,3,4,5"
   --    i.e., overlapping offsets are not counted
   -- note: to get the last occurrence of a string in a container (often useful)
   --    use "item -1 of offsets(...)"

   if str is not in pContainer then return 0
   put 0 into startPoint
   repeat
      put offset(str,pContainer,startPoint) into thisOffset
      if thisOffset = 0 then exit repeat
      add thisOffset to startPoint
      put startPoint & comma after offsetList
      add length(str)-1 to startPoint
   end repeat
   return item 1 to -1 of offsetList -- delete trailing comma
end offsets

function intersectLists listA, listB, pDelim
   -- returns the intersection of two lists, i.e., a list of items/lines common to both
   -- if pDelim = empty then looks first for the presence of cr in the lists,
   --    if found, defaults to cr as the delimiter
   --    if no cr found, looks for the presence of comma in the lists,
   --    if found, defaults to comma as the delimiter
   --    if neither found, returns empty (user should have specified another delim)
   -- order of items may be changed, result may require sorting
   -- by Peter M. Brigham, pmb...@gmail.com -- freeware
   -- the idea of using "split tArray with pDelim and pDelim"
   --    comes from Peter Hayworth on the use-LC list
   --    it's very clever!
   -- requires getDelimiters(), noDupes

   if listA = empty or listB = empty then return empty
   if pDelim = empty then
      if listA & listB contains cr then
         put cr into pDelim
      else if listA & listB contains comma then
         put comma into pDelim
      else
         return empty
      end if
   end if
   noDupes listA,pDelim
   noDupes listB,pDelim
   put getDelimiters(listA & listB) into tempDelim
   if tempDelim begins with "Error" then return "Error in getDelimiters()"
   split listA with pDelim and pDelim
   split listB with pDelim and pDelim
   intersect listA with listB
   combine listA with pDelim and tempDelim
   replace tempDelim with empty in listA
   return listA
end intersectLists

function getDelimiters pText, nbrNeeded
   -- returns a cr-delimited list of <nbrNeeded> characters
   --    none of which are found in the variable pText
   -- use for delimiters for, eg, parsing text files, manipulating arrays, etc.
   -- usage: put getDelimiters(pText,2) into tDelims
   --        if tDelims begins with "Error" then exit to top -- or whatever
   --        put line 1 of tDelims into lineDivider
   --        put line 2 of tDelims into itemDivider
   --        etc.

   if pText = empty then return "Error: no text specified."
   if nbrNeeded = empty then put 1 into nbrNeeded -- default 1 delimiter
   put "2,3,4,5,6,7,8,16,17,18,19,20,21,22,23,24,25,26" into baseList
   -- low ASCII values, excluding CR, LF, tab, etc.
   put the number of items of baseList into maxNbr
   if nbrNeeded > maxNbr then return "Error: max" && maxNbr && "delimiters."
   repeat for each item testCharNbr in baseList
      put numtochar(testCharNbr) into testChar
      if testChar is not in pText then
         -- found one, store and get next delim
         put testChar & cr after delimList
         if the number of lines of delimList = nbrNeeded then return line 1 to -1 of delimList
         -- done
      end if
   end repeat
   -- if we got this far, there was an error
   put the number of lines of delimList into totalFound
   if totalFound = 0 then
      return "Error: cannot get any delimiters."
   else if totalFound = 1 then
      return "Error: can only get 1 delimiter."
   else
      return "Error: can only get" && totalFound && "delimiters."
   end if
end getDelimiters

on noDupes @pList, pDelim
   -- strips duplicate (and empty) lines/items from a list
   -- note: pList is referenced, so the original list will be changed.
   -- if pDelim = empty then looks first for the presence of cr in pList,
   --    if found, defaults to cr as the delimiter
   --    if no cr found, looks for the presence of comma in pList,
   --    if found, defaults to comma as the delimiter
   --    if neither found, exits without changing pList
   --    (user should have specified another delim)
   -- note: the order of the list will likely be changed, may require sorting
   -- note: the split command is inherently case-sensitive
   --    (irrespective of the value of the caseSensitive property),
   --    so "Chuck" and "chuck" will not be considered duplicates
   --    if you need case insensitive, use the noDupes() function instead
   --    this command scales better with very large lists than noDupes()
   -- note: pDelim could be a string of characters, so you could do:
   --    put "apple and orange and pear and orange and banana and apple" into pList
   --    noDupes pList," and "
   --    after which pList will be: "pear and banana and apple and orange"
   -- thanks to Peter Hayworth of the use-LC mailing list --
   --    the idea of using "split tArray with pDelim and pDelim" is very clever!
   -- adjusted by Peter M. Brigham, pmb...@gmail.com
   -- requires getDelimiters()

   if pDelim = empty then
      if cr is in pList then
         put cr into pDelim
      else if comma is in pList then
         put comma into pDelim
      else
         answer "noDupes: no delimiter specified" as sheet
         exit noDupes
      end if
   end if
   put getDelimiters(pList) into tempDelim
   -- leave pList untouched if no safe temporary delimiter is available
   if tempDelim begins with "Error" then exit noDupes
   replace pDelim with tempDelim in pList
   split pList by tempDelim and tempDelim
   put the keys of pList into pList
   filter pList without empty
   replace cr with pDelim in pList
end noDupes

_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
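
For comparison with the offsets-and-intersection approach in the message above, here is a minimal sketch of a more direct scan: it walks the list once and keeps the number of every line that contains all of the search words as whole words. This is not Peter's method, just a plainer alternative in pure LC. The handler name linesWithAllWords and the sample data are hypothetical, and the sketch assumes the words in each line are space-separated (swap "word" for "item" if the data is comma-delimited).

```livecode
function linesWithAllWords pList, pWords
   -- pList: cr-delimited lines of space-separated words (assumption)
   -- pWords: the search words, space-separated
   -- returns a comma-delimited list of matching line numbers, or 0 if none
   put empty into tFound
   put 0 into tLineNbr
   repeat for each line tLine in pList
      add 1 to tLineNbr
      put true into tAllThere
      repeat for each word tWord in pWords
         -- "is among the words of" matches whole words only
         if tWord is not among the words of tLine then
            put false into tAllThere
            exit repeat
         end if
      end repeat
      if tAllThere then put tLineNbr & comma after tFound
   end repeat
   if tFound is empty then return 0
   return char 1 to -2 of tFound -- drop the trailing comma
end linesWithAllWords

on mouseUp
   -- hypothetical sample data: three lines of four words each
   put "red green blue white" into tList
   put cr & "green black red blue" after tList
   put cr & "white black red pink" after tList
   -- lines 1 and 2 contain all three words, so this gives "1,2"
   put linesWithAllWords(tList, "red green blue") into tHits
   answer "Matching lines:" && tHits
end mouseUp
```

Because the scan compares whole words directly, it needs no separate pass to rule out partial-word matches. With only 1080 lines, a single "repeat for each" pass should be quick, though that has not been timed here.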