Sorry, long non-Cocoa post, but maybe there's some useful info here for someone.

On 7 May 2008, at 18:33, Army Research Lab wrote:
Pay particular attention to the section titled "HDL and programming languages".
Chip designers have had to contend with these problems for years, and
developed languages for expressing parallelism with implicit threading
already (everything in an HDL is parallel unless you carefully force it to
be sequential).  We should be using ideas from those languages.

As somebody whose day job is writing HDL, I'd like to repeat this for emphasis, but it's not the languages that solve the race conditions; it's the architectures employed by the engineers. I think a lot of the problems software engineers have with threading come from architectures that are bad when viewed from a parallel-execution point of view.

Locks and semaphores are the workaround for this, and (good) hardware engineers (almost) never use them.

For example: pipelining. If faced with a set of tasks that need to be performed sequentially on some data blocks, a software engineer might decompose the problem like this (MIGHT, I said MIGHT):

(PA means process A, D1 means data block 1)

Thread 1: PA - D1 | PB - D1 | PC - D1 | PD - D1
Thread 2: PA - D2 | PB - D2 | PC - D2 | PD - D2
Thread 3: PA - D3 | PB - D3 | PC - D3 | PD - D3
Thread 4: PA - D4 | PB - D4 | PC - D4 | PD - D4

The thing to note is how each thread is running the same code (which must therefore be re-entrant) on different data.
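In software terms, that data-parallel decomposition might look like the minimal Python sketch below: one thread per data block, all threads running the same shared stage code. The stage functions pa..pd are made-up placeholders, not anything from the original discussion.

```python
import threading

# Illustrative stages (assumptions for the sketch, not from the post).
def pa(x): return x + 1
def pb(x): return x * 2
def pc(x): return x - 3
def pd(x): return x * x

def process(block):
    # PA..PD run sequentially on one block. This code is shared by
    # every thread, so it must be re-entrant: no mutable state that
    # isn't either local or per-block.
    for stage in (pa, pb, pc, pd):
        block = stage(block)
    return block

data = [1, 2, 3, 4]
results = [None] * len(data)

def worker(i, block):
    results[i] = process(block)  # each thread writes only its own slot

threads = [threading.Thread(target=worker, args=(i, d))
           for i, d in enumerate(data)]
for t in threads: t.start()
for t in threads: t.join()
print(results)  # -> [1, 9, 25, 49]
```

Each thread touches a disjoint slot of `results`, so no locking is needed here either, but the shared `process` code carries the re-entrancy burden the post describes.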

A Hardware engineer would probably do this:

Thread 1: PA - D1 | PA - D2 | PA - D3 | PA - D4
Thread 2:           PB - D1 | PB - D2 | PB - D3 | PB - D4
Thread 3:                     PC - D1 | PC - D2 | PC - D3 | PC - D4
Thread 4:                               PD - D1 | PD - D2 | PD - D3 | PD - D4

Note how the data is passed from thread to thread so only one thread owns the data at any time (no locks necessary), and how no process runs in more than one thread at a time, so the code doesn't have to worry about being re-entrant.

Granted, there's a start-up/shut-down cost where full parallelism isn't achieved (which is overwhelming in this example, but give it more data blocks and it becomes negligible), and this doesn't work for all problems, but it's a useful pattern for data processing. The other thing is to make sure that your stages are of similar complexity, as the slowest stage will define the performance of the system.

Passing the ownership of data from thread to thread would be done with FIFOs, which can also be written without locks, with some care (e.g. http://msmvps.com/blogs/vandooren/archive/2007/01/05/creating-a-thread-safe-producer-consumer-queue-in-c-without-using-locks.aspx , but read the comments, esp. w.r.t. out-of-order execution).
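As a rough illustration of the structure (not the C++ code from that link), here is a single-producer/single-consumer ring buffer in Python. It needs no lock only because each index has exactly one writer: the producer alone writes `tail`, the consumer alone writes `head`. Note that CPython's GIL hides the memory-ordering hazards the article's comments discuss; on real hardware you'd need release/acquire barriers so the slot write can't be reordered past the index update, so treat this as a sketch of the shape, not a portable lock-free queue.

```python
class SPSCQueue:
    """Single-producer, single-consumer FIFO without locks (sketch)."""

    def __init__(self, capacity):
        self.buf = [None] * (capacity + 1)  # one slot kept empty to
        self.head = 0   # written only by the consumer   # tell full
        self.tail = 0   # written only by the producer   # from empty

    def put(self, item):                    # producer side only
        nxt = (self.tail + 1) % len(self.buf)
        if nxt == self.head:
            return False                    # full; caller must retry
        self.buf[self.tail] = item          # write the slot first...
        self.tail = nxt                     # ...then publish the index
        return True

    def get(self):                          # consumer side only
        if self.head == self.tail:
            return None                     # empty
        item = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return item
```

Usage is the hand-off from the pipeline diagrams: the upstream stage calls `put`, the downstream stage calls `get`, and a block is owned by exactly one side at any moment.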

Yes, there can be issues with something like this in software (passing data between NUMA processors and non-shared caches), but believe me... it makes code far, far, far easier to read, write and DEBUG (unit tests for each stage).

_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
