> In the meantime, however, I found one (surprising) cause of the performance
> issue. After making the versions *more* equivalent, the issue became
> apparent. I restructured the second version (using the C++ iterators) and
> will discuss this in more detail. The culprit is in the consumer part, as
> follows:
> 
> New restructured code:
> 
> [...]
> 
> The difference compared to the former code provided in the previous mail is 
> now
> 
> 1) The C++ instances, that is the iterators, are defined locally within the 
> block.
> 
> 2) The "Future" (that is, the result of the operation) is conditionally
> compiled in or out, in order to test its impact.
>     Here, the __block modifier is used for the "Future" variables "sum" and 
> "total".  
>     When using pointers within the block accessing the outside variables, the 
> performance does not differ, but using __block may be more correct.


Ah - now then! I will take a very strong guess as to what is happening there
(I've done it myself, and seen it done by plenty of others! [*]). In the case
where you do NOT define USE_FUTURE, your consumer thread as written in your
email does not make any use of the variables "sum_" and "total_". Hence the
compiler is entirely justified in optimizing those variables away! It will
still have to check the iterator against eof, and may have to dereference
the iterator[**], but it does not need to update the "sum_" or "total_"
variables.
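
To make that concrete, here is a minimal sketch of the pattern I mean. It is
not your code: "consume" and "Result" are hypothetical names of my own, and a
plain std::vector stands in for your iterators and blocks. Only USE_FUTURE,
"sum_" and "total_" come from your mail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#define USE_FUTURE  // remove this line to get the variant that can be optimized away

// Hypothetical stand-in for the "Future": where the consumer publishes results.
struct Result {
    long sum = 0;
    std::size_t total = 0;
};

// Hypothetical consumer loop over a plain vector (an assumption; the real
// consumer uses C++ iterators inside a block).
void consume(const std::vector<int>& data, Result& out) {
    long sum_ = 0;
    std::size_t total_ = 0;
    for (std::vector<int>::const_iterator it = data.begin(); it != data.end(); ++it) {
        sum_ += *it;
        ++total_;
    }
#ifdef USE_FUTURE
    // The results escape the function, so the compiler has to compute them.
    out.sum = sum_;
    out.total = total_;
    assert(out.total == data.size());
#else
    // sum_ and total_ are never used after the loop: an optimizing compiler
    // may drop the additions (and, for a side-effect-free container like
    // this one, the loop itself).
    (void)out;
#endif
}
```

Timing the two variants against each other then compares "do the work" with
"possibly do nothing", which is why the numbers look so different.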

It may well be that there is still a deeper performance issue with your 
original code, and I'm happy to have another look at that when I have a chance. 
I suggest you deal with this issue first, though, as it appears to be 
introducing misleading discrepancies in the execution times you're using for 
comparison.

As I say, it's quite a common issue when you start stripping down code with the
aim of doing minimal performance comparisons. The easiest solution is either to
printf the results at the end (which forces the compiler to actually evaluate
them!), or to do the sort of thing you're doing when USE_FUTURE is defined -
writing to a shadow variable at the end. If you declare your shadow variable as
"volatile" then the compiler is forced to write the result and is not permitted
to optimize everything out.
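
A rough sketch of both options, again with made-up names ("g_sink",
"sum_all" and "run_benchmark" are mine, not from your code). Note that the
volatile store only guarantees that the final write happens; the compiler
is still free to simplify how the value gets computed.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Shadow variable (hypothetical name). Because it is volatile, the store
// to it below cannot be optimized out.
volatile long g_sink = 0;

// The work being timed: a plain summation standing in for the consumer loop.
long sum_all(const std::vector<int>& data) {
    long sum = 0;
    for (int v : data) {
        sum += v;
    }
    return sum;
}

void run_benchmark(const std::vector<int>& data) {
    long sum = sum_all(data);

    // Option 1: write the result to the volatile shadow variable.
    g_sink = sum;
    assert(g_sink == sum);  // reads the value back from the volatile

    // Option 2: print the result, which also counts as observable use.
    std::printf("sum = %ld\n", sum);
}
```

Either option on its own is enough; I'd put the timing calls around
sum_all() only, so the printf cost stays out of the measurement.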

Hope that helps, even if it may not deal with your original problem yet.
Apologies that my first round of guesses was wrong - I'm pretty sure about
this one though :)

Jonny


[*] Completely unrelated to this thread, but see this rather extreme example
where the claimed performance had to be reduced by a factor of twelve due to
this problem!
http://www.ibm.com/developerworks/forums/thread.jspa?threadID=226415
[**] I ~think~ ... because this involves a memory access, which is strictly
speaking a side effect in itself.

_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
