On 9 Feb 2014, at 15:53, Greg Parker <gpar...@apple.com> wrote:

> On Feb 9, 2014, at 12:19 AM, Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:
>> The real app (which I am trying to optimise) has actually two loops: one is 
>> counting, the other one is modifying. Which seems to be good news.
>> 
>> But I would really like to understand what I should do. Trial and error (or 
>> blindly groping in the mist) is not really my preferred way of working.
> 
> Optimizing small loops like this is a black art. Very small effects become 
> critically important, such as the alignment of your loop instructions or the 
> associativity of that CPU's L1 cache. 

So it seems that my test app is not of much use.

The real loop looks like:

NSUInteger      nbrBytes = ...;  // big, some GB
unsigned char *bitField = calloc( nbrBytes, sizeof( unsigned char) );

NSUInteger len = ...;         // might be rather big, so I tried to use dispatch_apply
NSUInteger incr = ...;        // might be as small as 3, or much bigger
NSUInteger bitPointer = ...;  // bitPointer + len * incr <  nbrBytes * 8

for( NSUInteger i = 0; i < len; i++ ) 
{
        unsigned char bitIndex = bitPointer & 0x7;      // bit position within the byte
        NSUInteger byteIndex = bitPointer >> 3;         // byte that holds this bit
        unsigned char mask = maskP[ bitIndex ]; //      mask = 0x1 << bitIndex;
        bitField[byteIndex] |= mask;
        bitPointer += incr;                             // advance to the next bit to set
}
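
For anyone who wants to experiment, here is a minimal self-contained sketch of that loop in plain C. It assumes size_t in place of NSUInteger and computes the mask with a shift instead of the maskP table. The function names, the bounds assert and the chunked variant are hypothetical additions for illustration, not code from the real app. The chunked version runs serially and only shows how the index range might be split into the pieces one would hand to something like dispatch_apply.

```c
#include <assert.h>
#include <stddef.h>

/* Set `len` bits in bitField, starting at bit `start` and stepping by `incr`.
   Hypothetical helper; the assert checks that the last bit touched is in range. */
static void set_strided_bits(unsigned char *bitField, size_t nbrBytes,
                             size_t start, size_t incr, size_t len)
{
    assert(len == 0 || start + (len - 1) * incr < nbrBytes * 8);

    size_t bitPointer = start;
    for (size_t i = 0; i < len; i++) {
        size_t byteIndex = bitPointer >> 3;           /* byte holding the bit */
        unsigned bitIndex = (unsigned)(bitPointer & 0x7); /* bit within byte  */
        bitField[byteIndex] |= (unsigned char)(1u << bitIndex);
        bitPointer += incr;
    }
}

/* Same work, split into nChunks consecutive index ranges (serial here).
   Assumption: if the chunks ran concurrently, two neighbouring chunks could
   touch the same byte at their boundary, so a parallel version would need
   to deal with that; this sketch does not. */
static void set_strided_bits_chunked(unsigned char *bitField, size_t nbrBytes,
                                     size_t start, size_t incr, size_t len,
                                     size_t nChunks)
{
    for (size_t c = 0; c < nChunks; c++) {
        size_t first = c * len / nChunks;         /* first index of chunk */
        size_t last  = (c + 1) * len / nChunks;   /* one past the last    */
        set_strided_bits(bitField, nbrBytes,
                         start + first * incr, incr, last - first);
    }
}

/* Count the set bits, clearing the lowest set bit of each byte in turn. */
static size_t count_set_bits(const unsigned char *bitField, size_t nbrBytes)
{
    size_t count = 0;
    for (size_t b = 0; b < nbrBytes; b++) {
        unsigned char v = bitField[b];
        while (v) {
            v &= (unsigned char)(v - 1);
            count++;
        }
    }
    return count;
}
```

Both variants should produce the same bit pattern for the same arguments, which makes the chunked one easy to check against the simple one on a small buffer before trying it on the real data.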

I looked at Accelerate, but it does not seem to fit.

I am also looking at OpenCL, but have not yet understood whether it would 
help with my problem.

Kind regards,

Gerriet.

