Jonas Maebe wrote:

On 18 Sep 2009, at 21:30, Dariusz Mazur wrote:


On 18 Sep 2009, at 16:24, Dariusz Mazur wrote:

I use my own lock-free FIFO (http://www.emadar.com/fpc/lockfree.htm) to distribute tasks between threads.
It is much faster and scales well on multicore machines.

Note that it won't work as is on non-x86 machines, because it's missing memory barriers (and I think that you may actually need memory barriers on x86 too). Atomic operations are not memory barriers by themselves, and the fact that you perform an atomic operation does not mean that all CPUs will immediately see the new value afterwards.
I don't know other machines. On x86, according to Intel, an atomic operation takes care of all the necessary cache synchronisation.
I have stress-tested it heavily on several computers, and it works well.

At least on PowerPC it will fail (and I guess on SPARC as well).
I am not saying that it will work on anything other than x86. I have only tested on Intel and AMD.
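
To make the memory barrier point concrete, here is a minimal sketch of publishing a value through a flag with explicit barriers. It assumes the ReadBarrier and WriteBarrier procedures from the FPC system unit are available in your compiler version; Payload, Ready, Publish and TryConsume are hypothetical names for illustration only, and this is not the code from the linked FIFO. The main program runs on a single thread and only shows the call order.

```pascal
program BarrierSketch;
{$mode objfpc}

var
  Payload: LongInt = 0;  // hypothetical data being handed over
  Ready: LongInt = 0;    // hypothetical flag: 1 = Payload is valid

procedure Publish(Value: LongInt);
begin
  Payload := Value;
  // Order the payload store before the flag store on weakly ordered CPUs.
  WriteBarrier;
  InterlockedExchange(Ready, 1);
end;

function TryConsume(out Value: LongInt): Boolean;
begin
  Value := 0;
  // Atomically change Ready from 1 to 0; the old value is returned.
  Result := InterlockedCompareExchange(Ready, 0, 1) = 1;
  if Result then
  begin
    // Order the flag load before the payload load.
    ReadBarrier;
    Value := Payload;
  end;
end;

var
  V: LongInt;
begin
  Publish(42);
  if TryConsume(V) then
    WriteLn(V);
end.
```

Whether the barriers are strictly required on x86 is the open question in this thread; the sketch only shows where they would go.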

The algorithm is very simple and needs only a 32-bit CAS, so it can be implemented on most platforms.
One thing is still needed: a multi-platform thread switch.

I use Sleep(0), but I think it is not the best solution.

At least on Mac OS X that will not do anything. There's a procedure called "ThreadSwitch" in the interface of the system unit though. At least for Unix platforms it will work if cthreads is used.
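
As a rough illustration of combining a 32-bit CAS with ThreadSwitch, here is a minimal sketch of a retry loop that yields when the CAS fails. It is a simple spin flag, not the lock-free FIFO discussed here; LockWord, Acquire and Release are hypothetical names, and the sketch assumes cthreads is pulled in on Unix so that ThreadSwitch has a thread manager behind it.

```pascal
program SpinYieldSketch;
{$mode objfpc}
{$ifdef unix}
uses
  cthreads;
{$endif}

var
  LockWord: LongInt = 0;  // hypothetical flag: 0 = free, 1 = taken

procedure Acquire;
begin
  // Try to change 0 -> 1. InterlockedCompareExchange returns the old value,
  // so a non-zero result means another thread holds the flag.
  while InterlockedCompareExchange(LockWord, 1, 0) <> 0 do
    ThreadSwitch;  // give up the time slice instead of calling Sleep(0)
end;

procedure Release;
begin
  InterlockedExchange(LockWord, 0);
end;

begin
  Acquire;
  WriteLn('acquired');
  Release;
end.
```

Memory barriers are left out of this sketch on purpose; see the barrier discussion earlier in the thread.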

I saw it, but have not tested it.
Seems enough to me.

But is this the optimal solution?

procedure SysThreadSwitch;
begin
      Sleep(0);
end;

   WinThreadManager.ThreadSwitch           := @SysThreadSwitch;

procedure ThreadSwitch;
begin
 CurrentTM.ThreadSwitch;
end;

That gives two unnecessary function calls. Can the compiler optimize them away? Or would it be better to make ThreadSwitch and SysThreadSwitch inline?



--
 Darek




_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
