At 04:21 PM 9/8/00 -0400, Chaim Frenkel wrote:
> >>>>> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>DS> The problem with using database locking and transactions as your
>DS> model is that they're *expensive*. Amazingly so. The expense is
>DS> certainly worth it for what you get, and in many cases the expense
>DS> is hidden (at least to some extent) by the cost you pay in disk
>DS> I/O, but it's definitely there.
>
>I lost you. How is the model wrong? Perl has a resource, Databases
>have a resource. Perl does a V or P operation, so does a Database.
I didn't say it was wrong. I said it was expensive. The model's just fine,
though there's some handwaving and punting in all the databases that handle
this. (Mainly in deadlock handling)
>DS> Heavyweight locking schemes are fine for relatively infrequent or
>DS> expensive operations (your average DLM for cluster-wide file
>DS> access is an example) but we're not dealing with rare or heavy
>DS> operations.
>
>Even databases have to handle this problem. The granularity of the
>locking. Row vs. page vs. Table (might be other schemes I don't know enough)
Right, but databases are all dealing with mainly disk access. A 1ms lock
operation's no big deal when it takes 100ms to fetch the data being locked.
A 1ms lock operation *is* a big deal when it takes 100ns to fetch the data
being locked...
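To put numbers on that comparison, here is a minimal C sketch of the arithmetic. The 1ms, 100ms, and 100ns figures are the illustrative ones from the paragraph above, not measurements, and the function names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Fraction of one locked access that is spent taking the lock.
 * Both arguments are in nanoseconds. */
double lock_overhead_fraction(double lock_ns, double fetch_ns)
{
    assert(lock_ns >= 0.0 && fetch_ns > 0.0);
    return lock_ns / (lock_ns + fetch_ns);
}

/* Print the two cases from the text: a disk-bound fetch and an
 * in-memory fetch, each guarded by the same 1ms lock operation. */
void print_comparison(void)
{
    const double lock_ns   = 1e6;    /* 1ms lock operation     */
    const double disk_ns   = 1e8;    /* 100ms disk fetch       */
    const double memory_ns = 100.0;  /* 100ns in-memory fetch  */

    printf("disk-bound:   lock is %.2f%% of each access\n",
           100.0 * lock_overhead_fraction(lock_ns, disk_ns));
    printf("memory-bound: lock is %.2f%% of each access\n",
           100.0 * lock_overhead_fraction(lock_ns, memory_ns));
}
```

With these figures the lock is about 1% of a disk-bound access, while for an in-memory access the lock costs 10,000 times as much as the fetch it protects.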
>DS> We're dealing with very lightweight, frequent
>DS> operations. That means we need a really cheap locking scheme for
>DS> this sort of thing, or we're going to be spending most of our time
>DS> in the lock manager...
>
>The issue is correctness. Lightweight Heavyweight has no meaning to me.
>How do Lightweight and Heavyweight map to correctness? And what
>is a lightweight and what is a heavyweight?
Correctness is what we define it as. I'm more worried about expense.
We've really got three levels of cost here. (For all these I'm assuming
non-distributed--when you yank in multiple machines things get funky fast,
and that doesn't map to what perl's doing anyway)
1) At the top is the Oracle/DB level. You get locking, thread consistency
for data being read while it's updated, deadlock detection, and rollbacks
on failure.
2) In the middle level is VMS' lock manager. You get locking and deadlock
detection, along with a few different flavors of locks (exclusive, read,
write, and a few others)
3) Down at the bottom is the posix thread lock. You get locking here, and
nothing else. Heck, you don't even get recursive locks unless you pay extra.
Each of these three levels has its own costs and guarantees. Levels 1 &
2 are really cool and do all sorts of nifty things for you. Unfortunately
they cost a *lot*. Great gobs of time and complexity. Core-level locking
(the stuff we use to protect ourselves) can't use either--they're too
expensive. I'm not sure I want to try and provide them to the user either,
because of the complexity of their implementation.
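As a rough illustration of level 3, the sketch below shows what "paying extra" for a recursive lock looks like with POSIX threads: the mutex has to be created with an explicit attribute. The calls are standard POSIX; the helper names init_recursive_mutex and lock_twice are invented for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

/* Initialise *m as a recursive mutex.
 * Returns 0 on success, otherwise a pthread error number. */
int init_recursive_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* This is the opt-in: a default mutex is not recursive. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Take the same mutex twice from one thread, then release it.
 * Only safe when m is recursive. Returns the result of the
 * second lock attempt (0 on success). */
int lock_twice(pthread_mutex_t *m)
{
    int rc;

    assert(m != NULL);
    rc = pthread_mutex_lock(m);
    if (rc != 0)
        return rc;

    rc = pthread_mutex_lock(m);     /* nested acquisition */
    if (rc == 0)
        pthread_mutex_unlock(m);    /* undo the nested lock */
    pthread_mutex_unlock(m);        /* undo the outer lock  */
    return rc;
}
```

Relocking a default mutex from the owning thread is not something to rely on (it may simply hang), which is why the recursive type has to be requested up front. Note that nothing here gives you deadlock detection between threads or any kind of rollback.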
>One thing to consider, what to do about Deadlocks and the notification
>and recovery method. Without a rollback mechanism, each and every
>programmer will have to roll their own. So we either provide it or we
>have to make it easy for them to recover from a major blow. (Unless
>you are going to simply let the threads sit in deadlock until a human
>or watchdog timer kills the entire process)
Detecting deadlocks is expensive and it means rolling our own locking
protocols on most systems. You can't do it at all easily with PThreads
locks, unfortunately. Just detecting a lock that blocks doesn't cut it,
since that may well be legit, and doing a scan for circular locking issues
every time a lock blocks is expensive.
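A scan for circular locking can be sketched roughly as follows. This is a toy model under stated assumptions, not a proposal for perl's internals: every lock has at most one owner, every thread waits on at most one lock, and the owner and waiting_for tables are hypothetical bookkeeping that the runtime would have to maintain itself, since the pthreads API does not expose a mutex's owner.

```c
#include <assert.h>

#define MAX_THREADS 8
#define MAX_LOCKS   8
#define NONE        (-1)

/* Hypothetical bookkeeping kept alongside the real locks. */
struct wait_graph {
    int owner[MAX_LOCKS];          /* thread holding each lock, or NONE    */
    int waiting_for[MAX_THREADS];  /* lock each thread is blocked on, NONE */
};

void wait_graph_init(struct wait_graph *g)
{
    int i;
    for (i = 0; i < MAX_LOCKS; i++)
        g->owner[i] = NONE;
    for (i = 0; i < MAX_THREADS; i++)
        g->waiting_for[i] = NONE;
}

/* Would `thread` blocking on `lock` close a cycle?
 * Walk: lock -> its owner -> the lock that owner waits on -> ...
 * Returns 1 if the walk comes back to `thread`, 0 otherwise. */
int would_deadlock(const struct wait_graph *g, int thread, int lock)
{
    int holder, steps;

    assert(thread >= 0 && thread < MAX_THREADS);
    assert(lock >= 0 && lock < MAX_LOCKS);

    holder = g->owner[lock];
    /* Bound the walk so a pre-existing cycle cannot loop forever. */
    for (steps = 0; steps < MAX_THREADS && holder != NONE; steps++) {
        int next;

        if (holder == thread)
            return 1;               /* chain leads back to us */
        next = g->waiting_for[holder];
        if (next == NONE)
            return 0;               /* holder is running, not blocked */
        holder = g->owner[next];
    }
    return 0;
}
```

Even in this simplified form the walk has to run every time a lock would block, and the tables themselves need protecting against concurrent updates, which adds a second layer of locking on top of the one being checked.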
Rollbacks are also expensive, and they can generate unbounded amounts of
temporary data, so they carry peril as well as cost.
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk