On 08/19/2011 02:15 PM, David W Noon wrote:
> My experience with OpenMP is that it is difficult to write a loop body
> large enough that context switching does not overwhelm the benefits of
> parallelism.

Hmmm.

If you multiply two 100*100 matrices, you could spawn 10000 threads (one per result cell), and this would result in a huge switching overhead.

But if you have 10 cores and you aggregate the 10000 tasks into 10 groups of 1000 calculations each, spawn 10 threads, and have each thread loop over its own 1000 cells, I gather that (in a perfect world) no task-switching overhead would be necessary at all, except at the beginning and the end of the complete calculation.
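To make the grouping concrete, here is a minimal sketch in Python (chosen only because it runs as-is; it is not Prism or Free Pascal code). The names `split_range` and `multiply_chunked` are made up for this illustration. Note that in standard CPython, threads do not run pure-Python code in parallel, so this shows only how the cells are grouped into one task per worker, not an actual speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(total, groups):
    """Split range(total) into `groups` contiguous ranges of near-equal size."""
    base, extra = divmod(total, groups)
    ranges = []
    start = 0
    for g in range(groups):
        # The first `extra` groups take one additional item each.
        size = base + (1 if g < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def multiply_chunked(a, b, workers):
    """Multiply a (m x k) by b (k x n), submitting one task per worker."""
    m, k, n = len(a), len(b), len(b[0])
    result = [[0] * n for _ in range(m)]

    def calc_cells(cells):
        # Each task loops over its own group of cells; no cell is shared.
        for ij in cells:
            i, j = divmod(ij, n)
            result[i][j] = sum(a[i][x] * b[x][j] for x in range(k))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all tasks and re-raises any worker exception.
        list(pool.map(calc_cells, split_range(m * n, workers)))
    return result
```

With `split_range(10000, 10)` every group holds exactly 1000 cells, which is the 10 x 1000 split described above; threads are created once and joined once, rather than once per cell.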

If in Prism you do something like (pseudo-code draft):
------------------------------------------------------------------------
m := 100;
n := 100;

// one iteration per result cell
for parallel ij := 0 to m*n-1 do begin
  i := ij mod m;   // 0 .. m-1
  j := ij div m;   // 0 .. n-1
  calccell(i, j);
end;
------------------------------------------------------------------------

As I understand it, Prism (or rather .NET) would automatically create 10 threads on a 10-core machine, each doing 1000 cells.
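The flat loop only works if the `mod`/`div` mapping visits every cell exactly once. A small Python check of that mapping, as a sketch: `cell_of` is a hypothetical helper that mirrors the two assignments in the pseudo-code, and it says nothing about how .NET actually schedules the iterations.

```python
def cell_of(ij, m):
    """Map a flat index to (i, j) as in the pseudo-code: i = ij mod m, j = ij div m."""
    return ij % m, ij // m


def all_cells(m, n):
    """List the (i, j) pair for every flat index 0 .. m*n-1."""
    return [cell_of(ij, m) for ij in range(m * n)]


if __name__ == "__main__":
    cells = all_cells(100, 100)
    # Every pair is distinct, so each of the m*n cells is visited exactly once.
    print(len(cells), len(set(cells)))
```

Because the iterations touch disjoint cells, they are independent of each other, which is what lets a runtime hand out contiguous blocks of `ij` values to separate threads.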

-Michael
_______________________________________________
fpc-devel maillist  -  [email protected]
http://lists.freepascal.org/mailman/listinfo/fpc-devel
