On Fri, Mar 13, 2009 at 06:46, Jerry Rocteur <p...@rocteur.cc> wrote:
>> On Thu, Mar 12, 2009 at 09:20, Jerry Rocteur <mac...@rocteur.cc> wrote:
>>> Hi,
>>>
>>> I'm trying to tie this kind of hash into SDBM
>>>
>>> $hash_of_baseline{$hdr_user_name} = { user_name => $hdr_user_name,
>>>                                passwd  => $hdr_user_passwd,
>>> ...
>>> ...
>>>                                groups  => [ @info_group_names ] };
>>>
>>> I can store it but when I read it back I get.
>>>
>>> Can't use string ("HASH(0x1db393b0)") as a HASH ref while "strict refs" in use at
>>>
>>> I think I understand from what I've googled that I can't use SDBM for this 
>>> kind of structure.
>>>
>>> I searched around and found MLDBM.
>>>
>>> I'm looking for advice, is this the best solution or am I on the wrong 
>>> track ?
>> snip
>>
>> DBM files can only store simple values like strings and numbers.  You
>> are trying to store a complex data structure in one.  The general
>> method of solving this problem is to use the Storable module* to
>> serialize the data structure as a string before storing it in the DBM,
>> or to use a module like DBM::Deep** that automates this for you.
>> [...]
>> [...]
>>
>> print Dumper \%deep, \%new;
>>
>> * http://perldoc.perl.org/Storable.html
>> ** http://search.cpan.org/dist/DBM-Deep/lib/DBM/Deep.pod
>
> Thanks very much Chas. I really appreciate this. I am glad I asked the 
> question to the list.
>
> I will look into the easiest solution, I don't like having to integrate a new 
> module into the system build unless
> absolutely necessary so I will study it carefully.
>
> My idea of storing into DBM was to speed up initial access to the data!
>
> Have a GREAT weekend,
>
> Jerry

Happily, Storable is part of core Perl, so you don't need to install
anything to use it.
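
For what it's worth, here is a minimal sketch of the Storable-plus-DBM
approach using only core modules.  The record contents and the file
name "baseline" are made up for illustration, and the DBM lives in a
temporary directory only so the example cleans up after itself.  One
caveat: SDBM limits the combined size of a key and its value to 1008
bytes, so this only suits small records.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Fcntl;                      # For O_RDWR, O_CREAT, etc.
use SDBM_File;
use Storable   qw(freeze thaw);
use File::Temp qw(tempdir);

# temporary directory so the sketch leaves nothing behind
my $dir = tempdir(CLEANUP => 1);

tie my %dbm, 'SDBM_File', "$dir/baseline", O_RDWR|O_CREAT, 0666
        or die "could not tie SDBM file: $!\n";

# hypothetical record, shaped like the one in the original question
my %record = (
        user_name => 'jerry',
        passwd    => 'x',
        groups    => [ 'staff', 'wheel' ],
);

# freeze turns the nested structure into a plain string the DBM can hold
$dbm{ $record{user_name} } = freeze(\%record);

# thaw rebuilds an independent copy of the structure from that string
my $copy = thaw($dbm{jerry});
print "groups: @{ $copy->{groups} }\n";

untie %dbm;
```

Note that changing $copy does not change what is in the DBM; you have
to freeze and store the record again to save any changes.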

As for speeding up access, well, DBMs are good for data sets that
don't fit into memory, but I wouldn't go to one to speed up my code*.
If you are interested in dumping a complex data structure at the end
of a program and rebuilding it quickly on the next run you are
probably better served by using YAML::Syck** or Storable to serialize
the data structure, saving the serialized data to a file, and then
reconstituting the data structure at the start of the program.  Of
course, the real answer depends on the amount of data in the data
structure and the number of questions you intend to answer per run
of the program.  A huge data structure combined with a small number of
questions lends itself to a DBM because the cost of accessing the DBM a
few times is lower than the cost of loading the whole data structure;
however, it really does need to be a huge data structure (the
performance difference between hashes and DBMs is that large).
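
The dump-and-reload approach described above might look something
like the sketch below.  The data and the file name "baseline.sto" are
invented for the example, and a real program would use a fixed path
rather than a temporary directory; treat it as an outline rather than
a finished implementation.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Storable   qw(store retrieve);
use File::Temp qw(tempdir);

# temporary location for the sketch; a real program would use a fixed path
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/baseline.sto";

# hypothetical data, shaped like the hash in the original question
my %hash_of_baseline = (
        jerry => { user_name => 'jerry', groups => [ 'staff', 'wheel' ] },
        chas  => { user_name => 'chas',  groups => [ 'staff' ] },
);

# end of one run: write the whole structure to disk in one call
store(\%hash_of_baseline, $file)
        or die "could not store to $file\n";

# start of the next run: read the whole structure back into memory
my $restored = retrieve($file);

for my $user (sort keys %$restored) {
        print "$user: @{ $restored->{$user}{groups} }\n";
}
```

After the retrieve every lookup is an ordinary in-memory hash access,
which is where the speed comes from.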

Of course, there are other solutions to this problem.  If you need
speed but starting up takes a long time, you can always start up once
and keep running as a server (daemon).  You can then ask the daemon
questions with a client script.
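
A rough sketch of the server idea, assuming a simple protocol of one
number per line answered with "yes" or "no" (the protocol, the ask
function, and the data are all invented for the example).  To keep it
self-contained the server is forked from the same script and listens
on a loopback port chosen by the OS; in practice the daemon and the
client would be separate programs agreeing on a known port or socket.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use IO::Socket::INET;

# stand-in for the slow start-up step: build the data once
my %is_square = map { $_ * $_ => 1 } 1 .. 1_000;

# LocalPort => 0 asks the OS to pick a free port
my $listener = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1',
        LocalPort => 0,
        Listen    => 5,
        Proto     => 'tcp',
) or die "could not listen: $!\n";
my $port = $listener->sockport;

my $pid = fork;
die "could not fork: $!\n" unless defined $pid;

if ($pid == 0) {
        # server: answer one question per line until the client hangs up
        while (my $conn = $listener->accept) {
                while (defined(my $n = <$conn>)) {
                        chomp $n;
                        print {$conn} ($is_square{$n} ? "yes" : "no"), "\n";
                }
                close $conn;
        }
        exit 0;
}

# client: connect to the already-running server and ask questions
close $listener;
my $sock = IO::Socket::INET->new(
        PeerAddr => '127.0.0.1',
        PeerPort => $port,
        Proto    => 'tcp',
) or die "could not connect: $!\n";

sub ask {
        my ($n) = @_;
        print {$sock} "$n\n";
        my $answer = <$sock>;
        die "server went away\n" unless defined $answer;
        chomp $answer;
        return $answer;
}

my @answers = map { ask($_) } 25, 26;
print "25: $answers[0], 26: $answers[1]\n";

# shut the demonstration server down
close $sock;
kill 'TERM', $pid;
waitpid $pid, 0;
```

The client never pays the start-up cost; it only pays for a connection
and a round trip per question.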

* as shown by the following benchmark (results first, then the script):

all_true
        hash: there were 100 perfect squares
        dbm: there were 100 perfect squares
all_false
        hash: there were 0 perfect squares
        dbm: there were 0 perfect squares
mix
        hash: there were 50 perfect squares
        dbm: there were 50 perfect squares

all_true
        Rate   dbm  hash
dbm    815/s    --  -98%
hash 42708/s 5143%    --

all_false
        Rate   dbm  hash
dbm    835/s    --  -98%
hash 51691/s 6092%    --

mix
        Rate   dbm  hash
dbm    816/s    --  -98%
hash 44246/s 5325%    --

#!/usr/bin/perl

use strict;
use warnings;

use Benchmark;
use Fcntl;   # For O_RDWR, O_CREAT, etc.
use SDBM_File;

tie my %dbm, 'SDBM_File', 'database.dbm', O_RDWR|O_CREAT, 0666
        or die "could not tie SDBM database.dbm: $!\n";

#hash set of the first 1,000 perfect squares
my %hash = map { $_ * $_ => 1 } 1 .. 1_000;
%dbm     = map { $_ * $_ => 1 } 1 .. 1_000;

#list of possible perfect squares (filled in per dataset below)
my @a;

my %subs = (
        hash => sub {
                my $count = 0;
                for my $candidate (@a) {
                        $count++ if $hash{$candidate};
                }
                return $count;
        },
        dbm  => sub {
                my $count = 0;
                for my $candidate (@a) {
                        $count++ if $dbm{$candidate};
                }
                return $count;
        },
);

my %datasets = (
        mix       => [ map { rand > .5 ? $_ * $_ : $_ * $_ - 1 } 1 .. 100 ],
        all_true  => [ map { $_ * $_                           } 1 .. 100 ],
        all_false => [ map { $_ * $_ - 1                       } 1 .. 100 ],
);

#prove that the code works the same for various datasets
for my $dataset (keys %datasets) {
        @a = @{$datasets{$dataset}};
        print "$dataset\n";
        for my $k (keys %subs) {
                print "\t$k: there were ", $subs{$k}->(), " perfect squares\n";
        }
}
print "\n";

#show how the code performs for various datasets
for my $dataset (keys %datasets) {
        @a = @{$datasets{$dataset}};
        print "$dataset\n";
        Benchmark::cmpthese(-1, \%subs);
        print "\n";
}

** http://search.cpan.org/dist/YAML-Syck/lib/YAML/Syck.pm

-- 
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
