Hi all, let me explain what I am doing.
I am working on a tool that checks whether two folders or hard drives are in sync. I create a detailed file list, including all subfolders, for each folder/drive. These two lists are then compared.

For my tests I created a stack with two DataGrids. I read in the complete detailed file/folder structure of my hard disk drive (filename, size and modification date/time, tab-separated) and put this list into both DataGrids. A line looks like this, for example:

/Xcode 4.3/Applications/Audio/AU Lab Documentation/AULabHelp/AddingBuses.html 5284 17.08.10 18:49

In one of the DataGrids I changed some lines, just to get some "missing/wrong" lines.

The file/folder list had 173910 lines. With my script I did not get a result even after 30 minutes. With Pete's script I get a result after about 10 seconds, and that is for both repeat loops. I get all wrong/missing lines listed.

Is such a performance improvement really possible? Maybe this is a cache thing?

I also tried with my iTunes folder, which has 4179 files in it. My script needs about 6 seconds to finish; Pete's script needs less than 4 seconds.

But anyway, the array solution is definitely faster. Here are the scripts I used for testing.
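Before the scripts themselves, a note on where the difference probably comes from. As far as I understand it, "is among the lines of" has to scan the second list again for every line of the first list, so the work grows with the product of the two list lengths, while looking up an array key takes roughly the same time however many keys there are. A minimal sketch of that idea for one direction only, assuming plain return-delimited lists (the function name missingLines and its parameters are made up for illustration and are not part of either script below):

```livecode
-- Hypothetical helper (illustration only): returns the lines of pListA
-- that do not occur in pListB.
function missingLines pListA, pListB
   local tSeen, tMissing
   -- One pass over pListB builds a lookup array keyed by line content.
   repeat for each line tLine in pListB
      if tLine is empty then next repeat -- skip blank lines (assumption)
      put true into tSeen[tLine]
   end repeat
   -- Each membership test is now one array lookup, not a scan of pListB.
   repeat for each line tLine in pListA
      if tLine is empty then next repeat
      if tSeen[tLine] is not true then put tLine & return after tMissing
   end repeat
   return tMissing
end missingLines
```

It would be called as, for example, put missingLines(tHDD1, tHDD2) into tMissingInHDD2, and a second time with the arguments swapped for the other direction.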
My script

put the dgtext of grp "Festplatte 1" into tHDD1
put the dgtext of grp "Festplatte 2" into tHDD2
REPEAT FOR each line i in tHDD1
   IF i is not among the lines of tHDD2 THEN put i & return after tMissingInHDD1
END REPEAT
answer the number of lines of tHDD1 & return & tMissingInHDD1

Pete's script (slightly adjusted)

put the dgtext of grp "Festplatte 1" into tHDD1
put the dgtext of grp "Festplatte 2" into tHDD2
REPEAT FOR each line i in tHDD1
   put true into myArray[i]["A"]
END REPEAT
REPEAT FOR each line i in tHDD2
   put true into myArray[i]["B"]
END REPEAT
REPEAT FOR each line k in the keys of myArray
   IF myArray[k]["A"] is not true THEN put k & return after tMissingInHDD1
   IF myArray[k]["B"] is not true THEN put k & return after tMissingInHDD2
END REPEAT
answer the number of lines of tHDD1 & return & tMissingInHDD1 & return & tMissingInHDD2

Regards,

Matthias
_____________________________________
Matthias Rebbe
Bramkampsieke 13
D-32312 Lübbecke
Tel +49 57 41 - 31 00 00
Mobile +49 160 - 550 44 62
Fax +49 57 41 - 310 0 02
E-Mail matth...@matthiasrebbe.eu
http://www.matthiasrebbe.eu

On 06.10.2011 at 21:32, Pete wrote:

> Glad it worked Matthias. Could you give us an idea of the new timing using
> the arrays?
> Pete
> Molly's Revenge <http://www.mollysrevenge.com>
>
> On Thu, Oct 6, 2011 at 12:17 PM, Matthias Rebbe <
> matthias_livecode_150...@m-r-d.de> wrote:
>
>> Hi Pete,
>>
>> thank you very much. It's so much faster.
>>
>> It seems, i should look closer to arrays.
>>
>> Regards,
>>
>> Matthias
>>
>> On 06.10.2011 at 01:13, Pete wrote:
>>
>>> I've used an array to do this type of operation in the past. Haven't tried
>>> this code but it might work better.
>>>
>>> repeat for each line i in tTextA
>>>    put true into myArray[i]["A"]
>>> end repeat
>>>
>>> repeat for each line i in tTextB
>>>    put true into myArray[i]["B"]
>>> end repeat
>>>
>>> repeat for each line k in the keys of myArray
>>>    if myArray[k]["A"] is not true then put k & return after tMissingInA
>>>    if myArray[k]["B"] is not true then put k & return after tMissingInB
>>> end repeat
>>>
>>> Pete
>>> Molly's Revenge <http://www.mollysrevenge.com>
>>>
>>> On Wed, Oct 5, 2011 at 3:00 PM, Matthias Rebbe <
>>> matthias_livecode_150...@m-r-d.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> i need to compare two very large text files with about 5000 - 7000 lines
>>>> each with a lines size of up to 256 chars.
>>>>
>>>> I need to find out if there are lines missing in either file a or file b.
>>>>
>>>> What is the best way to do this with good speed?
>>>>
>>>> I tried to check each line in file a and if the line is in file b.
>>>> And after that, i check for each line in file b and try to find out
>>>> if the line is in file a.
>>>>
>>>> With large files it takes about 10 to 15 minutes to do the complete check.
>>>>
>>>> My script looks like this
>>>>
>>>> repeat for each line i in tTextA
>>>>    if i is not among the lines of tTextB then put i & return after tMissingInB
>>>> end repeat
>>>>
>>>> repeat for each line i in tTextB
>>>>    if i is not among the lines of tTextA then put i & return after tMissingInA
>>>> end repeat
>>>>
>>>> Is there a better (faster) way?
>>>>
>>>> Regards,
>>>>
>>>> Matthias

_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode