On 9 Feb 2014, at 15:53, Greg Parker <gpar...@apple.com> wrote:

> On Feb 9, 2014, at 12:19 AM, Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:
>> The real app (which I am trying to optimise) has actually two loops: one is
>> counting, the other one is modifying. Which seems to be good news.
>>
>> But I would really like to understand what I should do. Trial and error (or
>> blindly groping in the mist) is not really my preferred way of working.
>
> Optimizing small loops like this is a black art. Very small effects become
> critically important, such as the alignment of your loop instructions or the
> associativity of that CPU's L1 cache.
> The code would likely be faster if each thread maintained its own sum in a
> local variable, and wrote to the array of sums only at the end. That should
> reduce cache line contention and also make it more likely that the compiler
> optimizer can keep the sum in a register, avoiding memory entirely.

This made a BIG improvement in my test app. Thanks a lot.
Now I will add this idea to my real app as well.

Kind regards,

Gerriet.
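
For anyone finding this thread in the archive, here is a minimal sketch of the suggestion quoted above: each thread sums into a local variable and writes to the shared array of sums only once, at the end. This is not the code from the original app, which was not posted. The names slice_job, count_slice, count_nonzero_parallel and MAX_THREADS are hypothetical, the "count non-zero bytes" task is only a stand-in for the real counting loop, and plain pthreads are assumed here rather than whatever threading mechanism the app uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 16   /* hypothetical upper bound for this sketch */

/* Hypothetical work item: one contiguous slice of the input per thread. */
typedef struct {
    const uint8_t *data;   /* start of this thread's slice */
    size_t length;         /* number of bytes in the slice */
    size_t *result;        /* this thread's slot in the shared array of sums */
} slice_job;

static void *count_slice(void *arg)
{
    slice_job *job = arg;
    size_t local = 0;      /* private accumulator; the compiler can keep it in a register */

    for (size_t i = 0; i < job->length; i++) {
        local += (job->data[i] != 0);
    }

    *job->result = local;  /* the only write to shared memory, done once at the end */
    return NULL;
}

/* Counts non-zero bytes in data[0..length) using up to MAX_THREADS threads. */
size_t count_nonzero_parallel(const uint8_t *data, size_t length, size_t thread_count)
{
    assert(data != NULL || length == 0);
    if (thread_count == 0) thread_count = 1;
    if (thread_count > MAX_THREADS) thread_count = MAX_THREADS;

    pthread_t threads[MAX_THREADS];
    slice_job jobs[MAX_THREADS];
    size_t sums[MAX_THREADS] = {0};
    int started[MAX_THREADS] = {0};

    size_t chunk = length / thread_count;
    for (size_t t = 0; t < thread_count; t++) {
        size_t begin = t * chunk;
        /* The last thread also takes the remainder of the division. */
        size_t end = (t == thread_count - 1) ? length : begin + chunk;
        jobs[t] = (slice_job){ data + begin, end - begin, &sums[t] };

        if (pthread_create(&threads[t], NULL, count_slice, &jobs[t]) == 0) {
            started[t] = 1;
        } else {
            count_slice(&jobs[t]);   /* fall back to running this slice inline */
        }
    }

    size_t total = 0;
    for (size_t t = 0; t < thread_count; t++) {
        if (started[t]) pthread_join(threads[t], NULL);
        total += sums[t];
    }
    return total;
}
```

The slow variant would do the increment directly on *job->result inside the loop. Neighbouring slots of sums usually share a cache line, so the threads would then keep invalidating each other's copy of that line on every iteration. With the local accumulator, each thread touches the shared array exactly once.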