On 22/08/2016 15:47, Richard Gaskin wrote:
> Alex Tweedly wrote:
>
>> Would caseSensitive make it faster?
>
> In theory yes, since it avoids having to run the internal equivalent of toLower on each thing being compared.
>
> But since these are bytes, not chars, that doesn't apply.
>
> However, in some recent experiments involving pattern matching on text I was unable to measure a difference. That shouldn't be taken as definitive; there are a lot of distracting things going on in the routine I was testing with. I haven't yet done a good isolated test of caseSensitive.
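
For anyone who wants a quick isolated test, something along these lines should do it - a rough, untested sketch with made-up variable names (tNeedle and tHaystack would be set up beforehand):

   set the caseSensitive to true   -- flip to false for the comparison run
   put the milliseconds into tStart
   repeat 100000 times
      get (tNeedle is in tHaystack)   -- any text comparison affected by caseSensitive
   end repeat
   put the milliseconds - tStart into tElapsed   -- elapsed time in ms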


>> Re md5 for repeated use - yes, it probably is worth doing.
>
> The rsync algo offers an md5 option, but by default it compares files based only on mod date and size. The thinking is that if both of those match, the odds of having a changed file are very low.
>
> Perhaps an optimal algo in your system would reserve md5 for those cases where size and mod date match, which will eliminate most cases with less CPU time.
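
For concreteness, the staged check you describe might look something like this - a sketch only, assuming the size and mod date for each file have already been gathered (e.g. from "the detailed files"), and with made-up handler and parameter names:

   function probableDuplicate pPathA, pSizeA, pModA, pPathB, pSizeB, pModB
      -- cheap checks first: size, then mod date
      if pSizeA is not pSizeB then return false
      if pModA is not pModB then return false
      -- only pay for the hashes when both cheap checks match
      return md5Digest(URL ("binfile:" & pPathA)) is md5Digest(URL ("binfile:" & pPathB))
   end probableDuplicate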

Thanks Richard, but this is a very different context. In my case, the mod dates will never match; the duplicate files arise because the user has imported the same photos from a camera more than once (into different folders, or into the same one using auto-renaming), or has copied a folder of files to trim out the ones to be copied to another machine, or ... any of a number of things, but all of them leave the copied file with a different mod date from the original.

My original benchmarking was faulty; in fact, taking the md5 hash of the two files is only about 50% more expensive than simply comparing them directly (higher if they are actually different, since a direct comparison can stop at the first mismatch), but that leaves the conclusion unchanged - it's not worth the extra complexity. There is an assumption underlying this: that in real life (unlike during my development phase), the majority of genuine duplicates will be dealt with (i.e. one copy deleted or moved elsewhere) fairly quickly, so the same comparisons won't be run repeatedly. The remaining cases of matching file size are so rare (around 80 in my full 50,000-file set) that pair-wise comparisons take only 4 seconds (or 2 seconds if I use an older version of LC), so there is no great impact on the user experience.
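
The direct comparison itself is trivial - a minimal sketch, with made-up handler and variable names (finding the same-size pairs to feed it is a separate step):

   function filesIdentical pPathA, pPathB
      -- read both files as binary data and compare them byte for byte
      put URL ("binfile:" & pPathA) into tDataA
      put URL ("binfile:" & pPathB) into tDataB
      return tDataA is tDataB
   end filesIdentical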

(The other parts of the overall workflow - where I would like to gather and use the exif data - are more strongly impacted by the performance issue, but my desire to use the latest LC8 rather than an obsolete version is probably strong enough to override that, and I'll just be more patient - even though patience is not my natural state :-)


