Hey,

I would like to register an MLDBM::Sync package, which serves as a
wrapper around MLDBM to flock()-serialize reads and writes to MLDBM
databases so that they do not get corrupted under high-load
multiprocess scenarios.

Attached are some emails from me and others discussing the module
on the mod_perl list.

Thanks,

Joshua

Date: Fri, 17 Nov 2000 02:38:49 -0800
From: Joshua Chamas <[EMAIL PROTECTED]>
Organization: NodeWorks <http://nodeworks.com>
To: Mod Perl <[EMAIL PROTECTED]>
Subject: New Module Idea: MLDBM::Sync

Hey,

I'm working on a new module to be used for mod_perl style
caching.  I'm calling it MLDBM::Sync because it's a subclass
of MLDBM that makes sure concurrent access is serialized with
flock() and i/o flushing between reads and writes.  Below is
the code for the module.  I believe it could also be used as a
safe backing store for Memoize in a multi-process environment.

It could be used like:

  tie %mldbm, 'MLDBM::Sync', '/tmp/mldbm_dbm', O_CREAT|O_RDWR, 0666;
  $mldbm{rand()} = [ rand() ];
  %mldbm = ();

The history is that I hunted around for on-disk caching in which
I can stuff db query results temporarily, and the one I liked best
was File::Cache, which is really cool BTW.  I would use it, but
MLDBM::Sync using the default SDBM_File seems to be 2 to 3 times
faster, getting about 1000 writes/sec on my dual PIII 400.

MLDBM::Sync using MLDBM in DB_File mode is considerably slower
than File::Cache, by 5-10 times, so which one you might use really
depends on the data you want to store.  The 1024-byte limit on
SDBM_File often makes it the wrong choice.

I also thought about calling it MLDBM::Lock, MLDBM::Serialize,
MLDBM::Multi ... I like MLDBM::Sync though.

For mod_perl caching usage, I imagine tying to it in each child,
and clearing when necessary, perhaps even at parent httpd
initialization... no auto-expiration here, use File::Cache or
IPC::Cache for that!

Any thoughts?

--Joshua

_________________________________________________________________
Joshua Chamas                           Chamas Enterprises Inc.
NodeWorks >> free web link monitoring   Huntington Beach, CA  USA
http://www.nodeworks.com                1-714-625-4051

package MLDBM::Sync;
use MLDBM;
use Fcntl qw(:flock);
use strict;
use vars qw($AUTOLOAD);

sub TIEHASH {
    my($class, $file, @args) = @_;

    # a separate lock file is flock()ed to serialize access to the dbm
    my $lock_file = "$file.lock";
    open(my $fh, '>>', $lock_file) || die("can't open file $lock_file: $!");

    bless {
        'args'      => [ $file, @args ],
        'lock'      => $fh,
        'lock_file' => $lock_file,
        'keys'      => [],
    }, $class;
}

sub DESTROY {
    my $self = shift;
    if ($self->{lock}) {
        close($self->{lock});
    }
}

# any other tied hash method (DELETE, CLEAR, EXISTS, ...) is treated
# as a write and runs under an exclusive lock
sub AUTOLOAD {
    my $self = shift;
    $AUTOLOAD =~ /::([^:]+)$/;
    my $func = $1;
    $self->exlock;
    my $rv = $self->{dbm}->$func(@_);
    $self->unlock;
    $rv;
}

sub STORE {
    my $self = shift;
    $self->exlock;
    my $rv = $self->{dbm}->STORE(@_);
    $self->unlock;
    $rv;
}

sub FETCH {
    my $self = shift;
    $self->shlock;
    my $rv = $self->{dbm}->FETCH(@_);
    $self->unlock;
    $rv;
}

# snapshot all keys under a shared lock, then iterate over the copy
sub FIRSTKEY {
    my $self = shift;
    $self->shlock;
    $self->{keys} = [ keys %{$self->{dbm_hash}} ];
    $self->unlock;
    $self->NEXTKEY;
}

sub NEXTKEY {
    shift(@{shift->{keys}});
}

# tie to the real MLDBM file; called only while the lock is held
sub mldbm_tie {
    my $self = shift;
    my $args = $self->{args};
    my %dbm_hash;
    my $dbm = tie(%dbm_hash, 'MLDBM', @$args)
        || die("can't tie to MLDBM with args: ".join(',', @$args)."; error: $!");
    $self->{dbm_hash} = \%dbm_hash;
    $self->{dbm} = $dbm;
}

sub exlock {
    my $self = shift;
    flock($self->{lock}, LOCK_EX) || die("can't write lock $self->{lock_file}: $!");
    $self->mldbm_tie;
}

sub shlock {
    my $self = shift;
    flock($self->{lock}, LOCK_SH) || die("can't share lock $self->{lock_file}: $!");
    $self->mldbm_tie;
}

# untie before unlocking, so the dbm is closed while the lock is still held
sub unlock {
    my $self = shift;
    undef $self->{dbm};
    untie %{$self->{dbm_hash}};
    flock($self->{lock}, LOCK_UN) || die("can't unlock $self->{lock_file}: $!");
}

1;

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Date: Tue, 21 Nov 2000 15:00:01 -0800 (PST)
From: Perrin Harkins <[EMAIL PROTECTED]>
To: Joshua Chamas <[EMAIL PROTECTED]>
cc: Mod Perl <[EMAIL PROTECTED]>
Subject: Re: New Module Idea: MLDBM::Sync

On Fri, 17 Nov 2000, Joshua Chamas wrote:
> I'm working on a new module to be used for mod_perl style
> caching.  I'm calling it MLDBM::Sync because it's a subclass
> of MLDBM that makes sure concurrent access is serialized with
> flock() and i/o flushing between reads and writes.

I looked through the code and couldn't see how you are doing i/o
flushing.  This is more of an issue with Berkeley DB than SDBM I think,
since Berkeley DB will cache things in memory.  Can you point me to it?

Also, I'm confused about the usage.  Do you open the dbm file and keep
it open, or do you tie/untie on every request?

> Any thoughts?

You might want to look at the Mason caching API.  It would be nice to
make an interface like that available on top of a module like this.

- Perrin

Date: Wed, 22 Nov 2000 09:59:56 -0800
From: Paul Lindner <[EMAIL PROTECTED]>
To: Tim Bunce <[EMAIL PROTECTED]>
Cc: Perrin Harkins <[EMAIL PROTECTED]>, Joshua Chamas <[EMAIL PROTECTED]>,
    Mod Perl <[EMAIL PROTECTED]>
Subject: Re: New Module Idea: MLDBM::Sync

On Wed, Nov 22, 2000 at 10:58:43AM +0000, Tim Bunce wrote:
> On Tue, Nov 21, 2000 at 03:00:01PM -0800, Perrin Harkins wrote:
> > On Fri, 17 Nov 2000, Joshua Chamas wrote:
> > > I'm working on a new module to be used for mod_perl style
> > > caching.  I'm calling it MLDBM::Sync because its a subclass
> > > of MLDBM that makes sure concurrent access is serialized with
> > > flock() and i/o flushing between reads and writes.
> >
> > I looked through the code and couldn't see how you are doing i/o
> > flushing.  This is more of an issue with Berkeley DB than SDBM I think,
> > since Berkeley DB will cache things in memory.  Can you point to me it?
>
> I'm puzzled why people wouldn't just use version 3 of Berkeley DB (via
> DB_File.pm or BerkeleyDB.pm) which supports multiple readers and
> writers through a shared memory cache. No open/close/flush required
> per-write and very very much faster.
>
> Is there a reason I'm missing?

Might MLDBM::Sync work over an NFS mounted partition?  That's one
reason I've not used the BerkeleyDB stuff yet..

--
Paul Lindner
[EMAIL PROTECTED]
Red Hat Inc.

Date: Wed, 22 Nov 2000 17:54:09 -0800
From: Joshua Chamas <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
CC: Perrin Harkins <[EMAIL PROTECTED]>, Mod Perl <[EMAIL PROTECTED]>
Subject: Use Samba, not NFS [Re: New Module Idea: MLDBM::Sync]

Paul Lindner wrote:
>
> Might MLDBM::Sync work over an NFS mounted partition?  That's one
> reason I've not used the BerkeleyDB stuff yet..
>

Paul,

For the first time, I benchmarked concurrent Linux client write access
over a Samba network share, and it worked, with 0 data loss.  This is
as opposed to an NFS share accessed from Linux, which would see data
loss due to lack of serialization of write requests.

With MLDBM::Sync, I benchmarked 8 Linux clients writing to a Samba
mount pointed at a WinNT PIII 450 over a 10 Mb/s network.  For 8000
writes, I got:

  SDBM_File:  105 writes/sec
  DB_File:     99 writes/sec  [ better than to local disk ]

It seems the network was the bottleneck in this test, as neither
client nor server CPU/disk was maxed out.  The WinNT server was
running at 20-25% CPU utilization during the test.

As Apache::ASP $Session uses a method similar to MLDBM::Sync to
flush i/o, you could then point StateDir to a Samba/CIFS share to
cluster an ASP application well, with 0 data loss.  My understanding
is that you have a NetApp cluster which can export CIFS?

I'd obviously benchmark this heavily to see if there are any NetApp
cluster locking issues, but I am guessing that you could likely get
200+ ASP requests per second on a 100 Mb/s network, which will likely
far exceed your base application performance.

-- Joshua

Date: Tue, 21 Nov 2000 17:39:29 -0800 (PST)
From: Perrin Harkins <[EMAIL PROTECTED]>
To: Joshua Chamas <[EMAIL PROTECTED]>
cc: Mod Perl <[EMAIL PROTECTED]>
Subject: Re: New Module Idea: MLDBM::Sync

On Tue, 21 Nov 2000, Joshua Chamas wrote:
> On my box, some rough numbers in writes per sec, with doing a
> tie/untie for each write, are:
>
>                 sync writes/sec with tie/untie
>
>   SDBM_File     1000
>   DB_File       30
>   GDBM_File     40
>
> Note that on a RAM disk in Linux, DB_File goes to 500 writes per sec,
> but setting up a RAM disk is a pain, so I'd probably use File::Cache
> which gets about 300 writes per sec on the file system.

Useful numbers.  It looks as if File::Cache is the best approach if you
need anything beyond the SDBM size limit.  Maybe some fine-tuning of that
module could bring it more in line with SDBM performance.

If you have the RAM to spare - and I guess you do, if you're considering
things like RAM disks - you could try IPC::MM too.  I think it will be
faster than the other IPC modules because it's a Perl API to a shared
hash written in C.

- Perrin
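
The pattern that runs through this thread (take a flock() on a separate
lock file, tie the dbm, do one operation, untie, then unlock) can be
sketched in a few lines.  This is only a rough illustration under stated
assumptions, not the MLDBM::Sync code itself: it ties the core SDBM_File
directly instead of going through MLDBM, and with_locked_dbm is a
hypothetical helper name made up for this example.  Several child
processes increment one counter; without the exclusive lock around each
read-modify-write, increments could be lost.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_CREAT O_RDWR);
use SDBM_File;
use File::Temp qw(tempdir);
use POSIX ();

# Hypothetical helper (not part of MLDBM::Sync): run $code against the
# dbm hash while holding a flock() on "$file.lock".  The hash is tied
# only while the lock is held, and untied before the lock is released.
sub with_locked_dbm {
    my ($file, $mode, $code) = @_;

    open(my $lock, '+>>', "$file.lock") or die "can't open $file.lock: $!";
    flock($lock, $mode) or die "can't lock $file.lock: $!";

    my %dbm;
    tie(%dbm, 'SDBM_File', $file, O_CREAT|O_RDWR, 0666)
        or die "can't tie $file: $!";
    my $rv  = eval { $code->(\%dbm) };
    my $err = $@;
    untie %dbm;                          # close the dbm before unlocking

    flock($lock, LOCK_UN) or die "can't unlock $file.lock: $!";
    close($lock);
    die $err if $err;
    return $rv;
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/counter";

# Four child processes each increment the same key 50 times.
for my $child (1 .. 4) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    next if $pid;                        # parent keeps forking
    for (1 .. 50) {
        with_locked_dbm($file, LOCK_EX, sub { $_[0]{hits}++ });
    }
    POSIX::_exit(0);                     # skip END blocks in the child
}
wait() for 1 .. 4;

# Read the result back under a shared lock.
my $hits = with_locked_dbm($file, LOCK_SH, sub { $_[0]{hits} });
print "hits: $hits\n";
```

If the locking does its job, the final count should equal the total
number of increments (4 x 50).  Tying and untying on every operation is
what makes this slow with DB_File, as the numbers quoted above suggest.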