Tim Bunce writes:
> On Fri, Dec 17, 1999 at 02:56:49PM +0000, Andrew Ford wrote:
> > Tim Bunce writes:
> > > On Fri, Dec 17, 1999 at 12:55:12PM +0000, Andrew Ford wrote:
> > > > I have an urgent need for a module to tie an array of integers to an
> > > > mmap'ed file (a sparse array of several hundred million integers). I
> > > > have looked at Mmap.pm by Malcolm Beattie and seen the idea for
> > > > Array::Virtual registered by Larry Wall and have started implementing
> > > > the module I need. But should it be called Array::Virtual (taking
> > > > Larry's slot) or Mmap::Array?
> > > >
> > > > The interface I have in mind is:
> > > >
> > > > use Array::Virtual; # or Mmap::Array
> > > >
> > > > my @array;
> > > > open(FH, "...");
> > > > my $nel = 42;
> > > > my $prot = "rw"; # or "ro", or "wo" or should it be PROT_READ?
> > > > my $shared = 1; # or should it be MAP_SHARED?
> > > > my $offset = 0; # this would be the default anyway
> > > > my $type = "int4"; # some set of (string) literals for 1, 2, 4, 8
> > > > # byte integers in native or network byte
> > > > # order, plus floating point (default probably int4)
> > > >
> > > > tie @array, 'Array::Virtual', $nel, $prot, $shared, FH, $offset, $type;
> > > >
> > > > $array[0] = 42;
> > > > #etc
> > > >
> > > > undef @array;
> > > >
> > > > Any thoughts?
> > >
> > > The word 'Virtual' doesn't carry much meaning here. Maybe:
> > >
> > > Tie::MmapArray
> > >
> > > I'd switch to using a hash of named parameters to the tie.
> > >
> > > I'd also use pack() letters to describe the element type (which would
> > > neatly expand to a string of letters for arrays of structures).
> > >
> >
> > Thanks for the prompt feedback. I agree about the name and using a
> > hash for parameters, so the call will now look like:
> >
> > use Tie::MmapArray;
> > use Fcntl;
> >
> > tie @array, 'Tie::MmapArray', { fh     => $fh,
> >                                 eltype => "l",
> >                                 nels   => 42,
> >                                 mode   => O_RDWR, # or "rw"
> >                                 shared => 1,
> >                                 offset => 0 };
> >
> > This raises a couple of issues:
> >
> > 1. If the fh parameter is not specified then this becomes an anonymous
> > mmap call, which is probably not a sensible default. Should I have
> > the filehandle as a separate parameter (e.g. tie @array, FH, $href)
> > and require an explicit undef for anonymous mmap'ing (or require an
> > explicit undef for the value of the "fh" element of the hash)?
>
> Umm, the former seems better, I think. But I imagine many people would
> want to just pass in a file name and let the module look after opening it.
>
Ah! I didn't think of something that simple. As Mmap.pm wants a
filehandle I thought I had to present the same interface, but I
suppose it ain't necessarily so.

OK, so now we have:

    use Tie::MmapArray;

    tie @array, 'Tie::MmapArray', $filename, { eltype => "l",
                                               nels   => 42,
                                               mode   => "rw", # or "ro" or "wo"
                                               shared => 1,
                                               offset => 0 };

and if $filename is actually a filehandle we just do The Right Thing.
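
A minimal sketch of how the filename-or-filehandle check might work.
The helper name open_target and the %OPEN_FLAGS table are made up for
illustration and are not part of any released module; the sketch also
assumes a perl recent enough to ship Scalar::Util.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY O_WRONLY O_RDWR);
use Scalar::Util qw(openhandle);

# Hypothetical mapping of the "ro"/"wo"/"rw" mode strings onto open(2) flags.
my %OPEN_FLAGS = (ro => O_RDONLY, wo => O_WRONLY, rw => O_RDWR);

# Hypothetical helper: return a filehandle for either kind of argument.
sub open_target {
    my ($target, $mode) = @_;

    # An already-open filehandle is used as-is.
    return $target if defined openhandle($target);

    # Anything else is treated as a filename and opened in the given mode.
    die "unknown mode '$mode'\n" unless exists $OPEN_FLAGS{$mode};
    sysopen(my $fh, $target, $OPEN_FLAGS{$mode})
        or die "cannot open '$target': $!\n";
    return $fh;
}
```

The caller then passes whatever it has, a path or a handle, and gets a
handle back either way.
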
> > 2. The mode could be a numeric parameter with values:
> > O_RDONLY, O_WRONLY or O_RDWR (from Fcntl). I know that mmap uses
> > PROT_READ, PROT_WRITE, but not many people are that familiar with
> > mmap compared to open(2).
>
> True. I was confusing mode (mmap prot arg) with separate mmap flags arg.
>
> > I could allow both the Fcntl constants
> > and the strings "ro", "wo" and "rw".
>
> Given the above I'd say just go with the strings. Keep it simple.
>
> > 3. using the pack letters does not allow all (sensible) possibilities
> > (signed/unsigned, network/native, 1/2/4/8 integers or float/double).
> > specifically it looks like unsigned network order [is not covered]
>
> Then submit a perl patch that adds them, if it is really needed.
> See Porting/patching.pod in the perl source directory.
> [CC'd to perl5-porters for comment].
My client actually uses 3 byte integers to save diskspace in some
circumstances ;-)
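
pack() has no letter for a 3-byte integer, so this case needs to be
handled by hand. One possible sketch, assuming unsigned values stored
in network (big-endian) order (the byte order is an assumption, and
the function names are invented for illustration):

```perl
use strict;
use warnings;

# Encode an unsigned integer below 2**24 as 3 big-endian bytes:
# pack it as a 4-byte "N" and drop the leading (zero) byte.
sub pack_u24_be {
    my ($n) = @_;
    die "value out of range\n" if $n < 0 || $n > 0xFFFFFF;
    return substr(pack("N", $n), 1);
}

# Decode 3 big-endian bytes: restore the leading zero byte, then unpack "N".
sub unpack_u24_be {
    my ($bytes) = @_;
    return unpack("N", "\0" . $bytes);
}
```
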
>
> > and 8 byte integers are not covered.
>
> From the 5.005_03 manual:
>
> q A signed quad (64-bit) value.
> Q An unsigned quad value.
> (Available only if your system supports 64-bit integer values
> _and_ if Perl has been compiled to support those.
> Causes a fatal error otherwise.)
>
Johan's Perl5 Pocket Reference doesn't have q or Q so I missed that.
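
Since q and Q cause a fatal error where they are unsupported, the
check can be done at run time by trapping that error. A small sketch
(the variable names are illustrative, and falling back to "l" is only
one possible policy):

```perl
use strict;
use warnings;

# 1 if this perl can pack 64-bit quads, 0 if pack("q") dies.
my $has_quad = eval { my $probe = pack("q", 1); 1 } ? 1 : 0;

# Pick the widest integer letter that is safe on this build.
my $widest = $has_quad ? "q" : "l";
my $width  = length pack($widest, 0);   # element size in bytes
```
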
> > I could allow the letters to be
> > qualified, e.g. i8 would be an eight-byte integer, and c22 would be
> > an array of 22-character, fixed-length strings.
>
> I'd suggest you just follow the well defined pack() syntax. If you need
> something outside that then (try to) either patch perl's pack to add
> it, or agree with perl5-porters what a safe 'escape' syntax would be so
> you'll be safe from future additions to pack().

Actually if the pack letters cover everything I'm interested in then
something like i8 could create a two dimensional array. One could
even extend the metaphor such that something like

    tie @array, 'Tie::MmapArray', $file, { eltype => "ia24i", ... };

would tie @array to a file with the record structure given, so that
$array[$n] returns a reference to a three-element array, the elements
of which are an integer, a 24-character string and an integer
respectively, and hey presto we've got access to the fields of a
record-oriented file. (Mmm, I wonder what eltype => "perl" should do.)

I might leave this level of complexity out of the first version of the
module though ;-)
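
To make the record idea concrete, here is a rough sketch using plain
pack/unpack on an in-memory string standing in for the mmap'ed file.
The fetch_record helper and the sample data are invented for
illustration; nothing here is the module's actual implementation.

```perl
use strict;
use warnings;

my $template = "ia24i";

# Size of one record in bytes, measured by packing a dummy record.
my $recsize = length pack($template, 0, "", 0);

# Stand-in for the mapped file: two packed records back to back.
my $buffer = join "", map { pack($template, @$_) }
    ([1, "alpha", 10], [2, "beta", 20]);

# Hypothetical FETCH: return record $n as a reference to its fields.
sub fetch_record {
    my ($buf_ref, $n) = @_;
    my $bytes = substr($$buf_ref, $n * $recsize, $recsize);
    return [ unpack($template, $bytes) ];
}

my $rec = fetch_record(\$buffer, 1);
```

Note that "a24" hands the string back with its trailing NUL padding
intact; "A24" would strip trailing spaces and NULs instead.
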
Andrew
--
Andrew Ford, Director Ford & Mason Ltd +44 1531 829900 (tel)
[EMAIL PROTECTED] South Wing, Compton House +44 1531 829901 (fax)
http://www.ford-mason.co.uk Compton Green, Redmarley +44 385 258278 (mobile)
http://www.refcards.com Gloucester, GL19 3JB, UK