05/01/04 01:22:32, Sam Vilain <[EMAIL PROTECTED]> wrote:

[STUFF] :)

In another post you mention Intel hyperthreading.
Essentially, duplicate sets of registers within a single CPU.

Do these need to apply a lock on every machine-level entity that
they access?
 No. 

Why not?

Because they can only modify an entity if it is loaded into a register
and the logic behind hyperthreading won't allow both register sets 
to load the same entity concurrently. 

( I know this is a gross simplification of the interactions
between the on-board logic and L1/L2 caching!)

--- Not an advert or glorification of Intel. Just an example ---------

Hyper-Threading Technology provides thread-level-parallelism (TLP) 
on each processor resulting in increased utilization of processor 
execution resources. As a result, resource utilization yields higher 
processing throughput. Hyper-Threading Technology is a form of 
simultaneous multi-threading technology (SMT) where multiple 
threads of software applications can be run simultaneously on one
 processor. 

This is achieved by duplicating the *architectural state* on each 
processor, while *sharing one set of processor execution resources*.
------------------------------------------------------------------------------

The last paragraph is the salient one as far as I am concerned.

The basic premise of my original proposal was that multi-threaded,
machine level applications don't have to interlock on machine level 
entities, because each operation they perform is atomic. 

Whilst the state of higher-level objects, of which the machine-level
objects are a part, may be corrupted by two threads modifying
things concurrently, the state of the threads themselves
(register sets + stack) cannot be corrupted.

This is because they have their own internally consistent state,
that only changes atomically, and that is completely separated,
each from the other. They only share common data (code is data
to the CPU, just as bytecode is data to a VM).

So, if you are going to emulate a (hyper)threaded CPU in a
register-based virtual machine interpreter, and allow for
concurrent threads of execution within that VMI, then one way
of ensuring that the internal state of the VMI is never
corrupted would be to have each thread have its own copy of
the *architectural state* of the VM, whilst sharing
*one set of processor execution resources*.
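
As a rough illustration of that arrangement, here is a minimal Python
sketch. It is not Parrot code; `ArchState`, `step` and `run_interleaved`
are names invented for this example, and the interleaving is emulated
on one OS thread. Each thread owns its registers and program counter,
while a single `step` function plays the part of the shared execution
resources.

```python
from dataclasses import dataclass, field

@dataclass
class ArchState:
    """Per-thread architectural state: registers and a program counter."""
    regs: list = field(default_factory=lambda: [0] * 4)
    pc: int = 0

def step(state, code):
    """The single shared 'execution resource': runs one opcode on one state."""
    op, dst, val = code[state.pc]
    if op == "set":
        state.regs[dst] = val
    elif op == "add":
        state.regs[dst] += val
    state.pc += 1

def run_interleaved(states, code):
    """Interleave the threads one opcode at a time over the shared code."""
    while any(s.pc < len(code) for s in states):
        for s in states:
            if s.pc < len(code):
                step(s, code)

code = [("set", 0, 10), ("add", 0, 5)]   # shared bytecode (data to the VM)
a, b = ArchState(), ArchState()
b.regs[1] = 99                           # b's private register contents
run_interleaved([a, b], code)
```

Because each opcode completes before the next one starts, and each
thread touches only its own `ArchState`, neither thread can disturb
the other's registers however the opcodes are interleaved.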

For this to work, you would need to achieve the same opcode
atomicity at the VMI level. Interlocking the threads so that
one shared thread cannot start an opcode until another shared
thread has completed its opcode gives this atomicity. The
penalty is that if the interlocking is done for every opcode,
then shared threads end up with very long virtual timeslices.
To prevent that being the case (most of the time), the
interlocking should only come into effect *if* concurrent
access to a VM-level entity is imminent.
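
The naive form of that interlock, taken on every opcode, might look
like the following Python sketch. The names `opcode_lock`, `op_inc`
and `run_thread` are hypothetical, and a plain `threading.Lock`
stands in for whatever mutex the VMI would really use.

```python
import threading

opcode_lock = threading.Lock()   # the single, shared interlock
shared = {"counter": 0}          # a VM-level entity visible to all threads

def op_inc(entity):
    """A multi-step opcode: read the entity, then write it back."""
    value = entity["counter"]
    entity["counter"] = value + 1

def run_thread(n):
    for _ in range(n):
        # No other shared thread may start an opcode until this one completes.
        with opcode_lock:
            op_inc(shared)

threads = [threading.Thread(target=run_thread, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every opcode pays for the lock here, contended or not, which is the
cost the conditional scheme is meant to avoid.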

As the VMI cannot access (modify) the state of a VM-level
entity (PMC) until it has loaded it into a VM register, the
interlocking need only come into effect *if* the entity
whose reference is being loaded into a PMC register is
currently in use by another thread.

The in-use state of a PMC can be flagged by a single bit
in its header. A shared thread can detect this when it loads
the reference to the PMC into a PMC register and, if the bit
is set, that shared thread then waits on the single,
shared mutex before proceeding.

It is only when atomised VM opcodes and lightweight in-use
detection come together that the need for a mutex per entity
can be avoided.

If the mutex used is capable of handling SMP, NUMA,
clusters etc., then the mechanism will work.

If the lightweight bit-test&-set opcode isn't available,
then a heavyweight equivalent could be used, though the
advantages would be reduced.


>Sam Vilain, [EMAIL PROTECTED]

I hope that clarifies my thinking and how I arrived at it.

I accept that it may not be possible on all platforms, and
it may be too expensive on some others. It may even be 
undesirable in the context of Parrot, but I have seen no
argument that goes to invalidate the underlying premise.

Regards, Nigel


