Andrew Ford writes:
> Tim Bunce writes:
> > On Fri, Dec 17, 1999 at 02:56:49PM +0000, Andrew Ford wrote:
> > > Tim Bunce writes:
> > > > On Fri, Dec 17, 1999 at 12:55:12PM +0000, Andrew Ford wrote:
> > > > > I have an urgent need for a module to tie an array of integers to an
> > > > > mmap'ed file (a sparse array of several hundred million integers). I
> > > > > have looked at Mmap.pm by Malcolm Beattie and seen the idea for
> > > > > Array::Virtual registered by Larry Wall and have started implementing
> > > > > the module I need. But should it be called Array::Virtual (taking
> > > > > Larry's slot) or Mmap::Array?
> > > > >
> > > > > The interface I have in mind is:
> > > > >
> > > > > use Array::Virtual; # or Mmap::Array
> > > > >
> > > > > my @array;
> > > > > open(FH, "...");
> > > > > my $nel = 42;
> > > > > my $prot = "rw"; # or "ro", or "wo" or should it be PROT_READ?
> > > > > my $shared = 1; # or should it be MAP_SHARED?
> > > > > my $offset = 0; # this would be the default anyway
> > > > > my $type = "int4"; # some set of (string) literals for 1, 2, 4, 8
> > > > > # byte integers in native or network byte
> > > > > # order, plus floating point (default probably int4)
> > > > >
> > > > > tie @array, $nel, $prot, $shared, FH, $offset, $type;
> > > > >
> > > > > $array[0] = 42;
> > > > > #etc
> > > > >
> > > > > undef @array;
> > > > >
> > > > > Any thoughts?
> > > >
> > > > The word 'Virtual' doesn't carry much meaning here. Maybe:
> > > >
> > > > Tie::MmapArray
> > > >
> > > > I'd switch to using a hash of named parameters to the tie.
> > > >
> > > > I'd also use pack() letters to describe the element type (which would
> > > > neatly expand to a string of letters for arrays of structures).
> > > >
> > >
> > > Thanks for the prompt feedback. I agree about the name and using a
> > > hash for parameters, so the call will now look like:
> > >
> > > use Tie::MmapArray;
> > > use Fcntl;
> > >
> > > tie @array, { fh => $fh,
> > > eltype => "l",
> > > nels => 42,
> > > mode => O_RDWR, # or "rw"
> > > shared => 1,
> > > offset => 0 };
> > >
> > > This raises a couple of issues:
> > >
> > > 1. If the fh parameter is not specified then this becomes an anonymous
> > > mmap call, which is probably not a sensible default. Should I have
> > > the filehandle as a separate parameter (e.g. tie @array, FH, $href)
> > > and require an explicit undef for anonymous mmap'ing (or require an
> > > explicit undef for the value of the "fh" element of the hash)?
> >
> > Umm, the former seems better, I think. But I imagine many people would
> > want to just pass in a file name and let the module look after opening it.
> >
>
> Ah! I didn't think of something that simple. As Mmap.pm wants a
> filehandle I thought I had to present the same interface, but I
> suppose it ain't necessarily so.
>
> OK, so now we have:
>
> use Tie::MmapArray;
>
> tie @array, $filename, { eltype => "l",
> nels => 42,
> mode => "rw", # or "ro" or "wo"
> shared => 1,
> offset => 0 };
>
> and if $filename is actually a filehandle we just do The Right Thing.
>
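A minimal sketch of how "The Right Thing" might be done for a first
argument that can be either a filename or a filehandle. The helper name
handle_for is hypothetical and not part of the module; it assumes
Scalar::Util is available and that read/write access is wanted.

```perl
use strict;
use warnings;
use Scalar::Util qw(openhandle);

# Hypothetical helper: return an open handle for either a filename
# or an already-open filehandle.
sub handle_for {
    my ($file) = @_;

    # openhandle() returns its argument if it is an open filehandle,
    # and undef otherwise (including for plain strings such as filenames).
    my $fh = openhandle($file);
    return $fh if defined $fh;

    # Not a handle, so treat it as a filename and open it read/write.
    open($fh, '+<', $file) or die "cannot open $file: $!";
    return $fh;
}
```

An explicit undef would still need separate handling if anonymous
mappings are to be supported.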
>
> > > 2. The mode could be a numeric parameter with values:
> > > O_RDONLY, O_WRONLY or O_RDWR (from Fcntl). I know that mmap uses
> > > PROT_READ, PROT_WRITE, but not many people are that familiar with
> > > mmap compared to open(2).
> >
> > True. I was confusing mode (mmap prot arg) with the separate mmap flags arg.
> >
> > > I could allow both the Fcntl constants
> > > and the strings "ro", "wo" and "rw".
> >
> > Given the above I'd say just go with the strings. Keep it simple.
> >
> > > 3. Using the pack letters does not allow all (sensible) possibilities
> > > (signed/unsigned, network/native, 1/2/4/8 integers or float/double).
> > > Specifically, it looks like unsigned network order [is not covered]
> >
> > Then submit a perl patch that adds them, if it is really needed.
> > See Porting/patching.pod in the perl source directory.
> > [CC'd to perl5-porters for comment].
>
> My client actually uses 3 byte integers to save diskspace in some
> circumstances ;-)
>
> >
> > > and 8 byte integers are not covered.
> >
> > From the 5.005_03 manual:
> >
> > q A signed quad (64-bit) value.
> > Q An unsigned quad value.
> > (Available only if your system supports 64-bit integer values
> > _and_ if Perl has been compiled to support those.
> > Causes a fatal error otherwise.)
> >
>
> Johan's Perl5 Pocket Reference doesn't have q or Q so I missed that.
>
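Since "q" and "Q" are fatal on a perl built without 64-bit support, one
way to find out what a given perl accepts is simply to try the template.
A rough sketch, with element_size as a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: size in bytes of one element described by a
# pack() template, or undef if this perl rejects the template
# (for example "q" on a perl without 64-bit integer support).
sub element_size {
    my ($template) = @_;

    # Supply more dummy values than any short template needs;
    # pack() ignores the surplus. An invalid type makes pack() die,
    # which eval traps.
    my $packed = eval { pack($template, (0) x 16) };
    return defined $packed ? length $packed : undef;
}

for my $t (qw(c s l q n N d)) {
    my $size = element_size($t);
    printf "%-2s %s\n", $t, defined $size ? "$size bytes" : "not supported";
}
```

The same call gives the record length for a multi-letter template,
which is what the tie would need in order to turn an index into a byte
offset.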
>
> > > I could allow the letters to be
> > > qualified, e.g. i8 would be an eight-byte integer, and c22 would be
> > > an array of 22-character, fixed-length strings.
> >
> > I'd suggest you just follow the well defined pack() syntax. If you need
> > something outside that then (try to) either patch perl's pack to add
> > it, or agree with perl5-porters what a safe 'escape' syntax would be so
> > you'll be safe from future additions to pack().
>
> Actually if the pack letters cover everything I'm interested in then
> something like i8 could create a two dimensional array. One could
> even extend the metaphor such that something like
>
> tie @array, $file, { eltype => "ia24i", ... };
That should, of course, be:
tie @array, 'Tie::MmapArray', $file, { eltype => "ia24i", ... };
>
> would tie @array to a file with the record structure given, so that
> $array[$n] returns a reference to a three-element array, the elements
> of which are an integer, a 24-character string and an integer
> respectively, and hey presto we've got access to the fields of a
> record-oriented file. (Mmm, I wonder what eltype => "perl" should do.)
>
> I might leave this level of complexity out of the first version of the
> module though ;-)
>
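To make the record idea concrete, here is a rough sketch using plain
pack/unpack on an in-memory string standing in for the mapped file. It
is not the module's code: fetch_record and the sample data are invented
for illustration, and a real tie would read from the mmap'ed region.

```perl
use strict;
use warnings;

my $template = "ia24i";   # integer, 24-byte string, integer

# Length of one record: pack the template with dummy values and measure.
my $recsize = length pack($template, 0, "", 0);

# Three sample records, standing in for the contents of the file.
my $data = join "", map { pack($template, $_, "name$_", $_ * 10) } 0 .. 2;

# Hypothetical accessor: return record $n as a reference to an array
# of its fields, as $array[$n] would under the scheme described above.
sub fetch_record {
    my ($buf, $tmpl, $size, $n) = @_;
    return [ unpack($tmpl, substr($buf, $n * $size, $size)) ];
}

my $rec = fetch_record($data, $template, $recsize, 1);

# Note: "a24" keeps the trailing NUL padding on unpack; "A24" or "Z24"
# would strip it.
(my $name = $rec->[1]) =~ s/\0+\z//;
printf "%d %s %d\n", $rec->[0], $name, $rec->[2];
```

Because pack() inserts no alignment padding, the record length here is
just the sum of the field sizes, which may differ from the layout of an
equivalent C struct.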
The initial version is now on http://www.ford-mason.co.uk/resources/
I'll upload to CPAN once I've had a chance to test it some more, flesh
out some of the missing functionality and integrate any changes in
response to feedback.
Andrew
--
Andrew Ford, Director Ford & Mason Ltd +44 1531 829900 (tel)
[EMAIL PROTECTED] South Wing, Compton House +44 1531 829901 (fax)
http://www.ford-mason.co.uk Compton Green, Redmarley +44 385 258278 (mobile)
http://www.refcards.com Gloucester, GL19 3JB, UK