I've come to agree with Dan: As the threading requirements and the architecture stand, parrot requires frequent and automatic locking to prevent crashes. This is completely apart from user synchronization.
"As the architecture stands?" What's wrong with it? I think the most problematic items are:
1. parrot's core operations are heavy and multi-step, not lightweight and atomic.
--> This makes it harder for parrot to provide a crash-proof environment.
2. PMCs are implemented in C, not PIR.
--> Again, makes parrot's job of providing a crash-proof environment much harder. If a small set of safe operations can be guaranteed safe, then the crash-proofing bubbles upward.
3. New code tends to appear in parrot's core rather than accumulating in a standard library.
--> This bloats the core, increasing our exposure to bugs at that level.
4. Memory in parrot is not type-stable.
--> unions with mutable discriminators are evil, because checking the discriminator and accessing the data field could be preempted by a change of discriminator and value. Thus, unions containing pointers require locking for even read access, lest seg faults or unsafe memory accesses occur. Best example: morph. morph must die.
But parrot's already much too far along for the above to change. (ex.: morph must die.)
The JVM and CLR have successful threading implementations because their core data types are either atomic or amenable to threading. (I've been over this before, but I'm playing devil's advocate today.)
--> Many of Perl's core string types, for instance, are not threadsafe—
and they never will be. (note: I said Perl, not parrot.) Even if implemented on the JVM, Perl's string types would still require locking. That Perl doesn't use them yet doesn't mean parrot can't also have data structures that are amenable to locking. Immutable strings wouldn't require locking on parrot any more than on the JVM—so long as morph and transcode could be prevented. (Three cheers for type-stable memory.) If parrot can prove that a P-reg will point to a PMC of such-and-such type, and can know that such-and-such operation requires no locking on that type, it can avoid locking the PMC.
That, and neither environment (any longer) makes any misguided attempt to provide user-level consistency when it hasn't been requested.
--> That means they simply don't lock except when the user tells them to. No reader-writer locks to update a number.
Like Dan mentioned, there's no "JVM magic," but rather there is a lot of very careful design. The core is crash-proofed, and is small enough that the crash-proofing is reasonable and provable. Atop that is built code which inherits that crash-proofing. Thread-safety is a very high-level guarantee, only rarely necessary.
Dan Sugalski wrote:
=item All shared PMCs must have a threadsafe vtable
The first thing that any vtable function of a shared PMC must do is to aquire the mutex of the PMCs in its parameter list, in ascending address order. When the mutexes are released they are not required to be released in any order.
Wait a sec. $2->vtable->add(interpreter, $1, $2, $3). That's one dynamic dispatch. I see 2 variables that could be shared. I think that's fatal, actually.
The algorithm I'd suggest instead is this: Newborn objects couldn't have been shared, and as such can safely be accessed without locks. This is a lot given how Perl treats values, though certainly not all. All objects from foreign sources, which have been passed to another routine, or stored into a sharable container must be presumed to require locks. It's not as aggressive, true, but I think the overall cost is lower.
To back up Dan: Regarding Leo's timings, everyone that's freaking out should remember that he was testing a very fast operation, the worst case scenario for the locking overhead. *Of course* the overhead will appear high. Most of parrot's operations are much heavier, and the locking overhead will be less apparent when those are executing. That said, a 400% penalty is too high a price to pay for what, after all, isn't even a useful level of threadsafety from a user's standpoint. But, again, without respecification and redesign, parrot requires the locking. The trick is to lock less.
One way I can see to do that is to move locking upward, so that several operations can be carried out under the auspices of one lock. How would I propose to do this?
• Add some lock opcodes to PBC. The pluralized version allows parrot to acquire locks in ascending address order (hardcoded bubble sorts), according to Dan's very important deadlock-avoidance algorithm.
- op lock(in PMC)
- op lock2(in PMC, in PMC)
- ...
- op lock5(in PMC, in PMC, in PMC, in PMC, in PMC)
- ...
• Add unlock opcode(s), too. Pluralized? Doesn't matter.
• Force all locks to be released before any of:
- locking more PMCs (deadlock avoidance!)
- invoke*
- an upwards branch
- reassigning a PMC register (changing reg[n], not reg[n]->something)
- any blocking operation
- anything that checks for events
• Bytecode validation can ensure that PMC locks are held before operating on a PMC that is:
- not a temporary
- not known to be a threadsafe PMC (I repeat: morph is evil)
Pain in the ass? Maybe not. Here, I think IMCC can pull most of the weight: lock* shouldn't even be available through IMCC. Since it's just a matter of chugging through the rules, IMCC can apply locking automatically. (Hell, to avoid changing PBC, parrot could implement them automatically at runtime! Wouldn't want to be there, but the above set of rules is darned easy to apply.)
However, I would again repeat that the value provided to user programs by atomic PMC operations is negligible (since a user's data structures will typically be higher-level, and thus require user-level locking)—and is certainly not offset by a 400% performance penalty. Locking PMCs is the easy way out. If type-stable memory can be provided (morph must die), it may be possible to look away from automatic PMC locks in more cases. For instance, an Int PMC would require no locks whatsoever—so long as its type cannot change. Nor would an immutable string, or a fixed-size array. On platforms with 8-word atomic writes, a Number PMC could get by without locks, too. morph must die.
Dan Sugalski wrote:
Okay, we're going to mandate that we have read/write locks, each interpreter pool has one, and mutating vtable entries must get a read lock on the pool read/write lock. Pricey (ick) but isolated to mutators. The DOD gets a write lock on it, which'll block all read/write access so no mutators can be in process while the pool DOD runs.
Adding a reader-writer lock on every pointer mutation will be completely and utterly unfeasible from a performance standpoint. There will be massive reader contention even on uniprocessors, and the bus pollution on multiprocessors will be disastrous. Reader-writer locks are more costly than mutexes.
Can I suggest an algorithm to completely eliminate that reader-writer lock, replacing it with a form of coordination that's cheaper in aggregate, and free in the common case? It's not simple, but I'd rather one elephant than a billion mice. Might help with infant mortality, too.
—
Gordon Henriksen [EMAIL PROTECTED]
P.S. - morph must die.