Tim Bunce writes:
 > On Fri, Dec 17, 1999 at 02:56:49PM +0000, Andrew Ford wrote:
 > > Tim Bunce writes:
 > >  > On Fri, Dec 17, 1999 at 12:55:12PM +0000, Andrew Ford wrote:
 > >  > > I have an urgent need for a module to tie an array of integers to an
 > >  > > mmap'ed file (a sparse array of several hundred million integers).  I
 > >  > > have looked at Mmap.pm by Malcolm Beattie and seen the idea for
 > >  > > Array::Virtual registered by Larry Wall and have started implementing
 > >  > > the module I need.  But should it be called Array::Virtual (taking
 > >  > > Larry's slot) or Mmap::Array?
 > >  > > 
 > >  > > The interface I have in mind is:
 > >  > > 
 > >  > >     use Array::Virtual;               # or Mmap::Array
 > >  > > 
 > >  > >     my @array;
 > >  > >     open(FH, "...");
 > >  > >     my $nel  = 42;
 > >  > >     my $prot = "rw";  # or "ro", or "wo" or should it be PROT_READ?
 > >  > >     my $shared = 1;   # or should it be MAP_SHARED?
 > >  > >     my $offset = 0;     # this would be the default anyway
 > >  > >     my $type   = "int4";      # some set of (string) literals for 1, 2, 4, 8
 > >  > >                       # byte integers in native or network byte
 > >  > >                       # order, plus floating point (default probably int4)
 > >  > > 
 > >  > >     tie @array, 'Array::Virtual', $nel, $prot, $shared, FH, $offset, $type;
 > >  > > 
 > >  > >     $array[0] = 42; 
 > >  > >     #etc
 > >  > > 
 > >  > >     undef @array;
 > >  > > 
 > >  > > Any thoughts?
 > >  > 
 > >  > The word 'Virtual' doesn't carry much meaning here. Maybe:
 > >  > 
 > >  >         Tie::MmapArray
 > >  > 
 > >  > I'd switch to using a hash of named parameters to the tie.
 > >  > 
 > >  > I'd also use pack() letters to describe the element type (which would
 > >  > neatly expand to a string of letters for arrays of structures).
 > >  > 
 > > 
 > > Thanks for the prompt feedback.  I agree about the name and using a
 > > hash for parameters, so the call will now look like:
 > > 
 > >    use Tie::MmapArray;
 > >    use Fcntl;
 > > 
 > >    tie @array, 'Tie::MmapArray', { fh     => $fh,
 > >                                    eltype => "l",
 > >                                    nels   => 42,
 > >                                    mode   => O_RDWR,  # or "rw"
 > >                                    shared => 1,
 > >                                    offset => 0 };
 > > 
 > > This raises a couple of issues:
 > > 
 > > 1. If the fh parameter is not specified then this becomes an anonymous
 > >    mmap call, which is probably not a sensible default.  Should I have 
 > >    the filehandle as a separate parameter (e.g. tie @array, FH, $href)
 > >    and require an explicit undef for anonymous mmap'ing (or require an 
 > >    explicit undef for the value of the "fh" element of the hash)?
 > 
 > Umm, the former seems better, I think. But I imagine many people would
 > want to just pass in a file name and let the module look after opening it.
 > 

Ah!  I didn't think of something that simple.  As Mmap.pm wants a
filehandle I thought I had to present the same interface, but I
suppose it ain't necessarily so.

OK, so now we have:

    use Tie::MmapArray;
 
    tie @array, 'Tie::MmapArray', $filename, { eltype => "l",
                                               nels   => 42,
                                               mode   => "rw",    # or "ro" or "wo"
                                               shared => 1,
                                               offset => 0 };

and if $filename is actually a filehandle we just do The Right Thing.
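
To make the calling convention concrete, here is a rough sketch of a tie class that takes a file name or an open filehandle plus a hash of named parameters. It is only an illustration: the package name Sketch::PackedArray is made up, the file contents are read into an ordinary string as a stand-in for the mmap'ed region (so stores never reach the file), and the defaults shown are assumptions rather than anything agreed in this thread.

```perl
package Sketch::PackedArray;    # hypothetical name, not the real module
use strict;
use warnings;
use Scalar::Util qw(openhandle);

# open() modes for the "ro"/"rw"/"wo" strings.  "wo" is opened read-write
# because mmap(2) needs a descriptor that is open for reading.
my %OPEN_MODE = (ro => '<', rw => '+<', wo => '+<');

sub TIEARRAY {
    my ($class, $file, $opts) = @_;
    # Assumed defaults; caller-supplied options override them.
    my %o = (eltype => 'l', mode => 'rw', offset => 0, %{ $opts || {} });
    my $open_mode = $OPEN_MODE{ $o{mode} } or die "unknown mode '$o{mode}'";

    # Accept either an open filehandle or a file name.
    my $fh = openhandle($file);
    unless ($fh) {
        open($fh, $open_mode, $file) or die "cannot open $file: $!";
    }
    binmode($fh);

    my $elsize = length pack($o{eltype}, 0);
    my $nels   = defined $o{nels}
               ? $o{nels}
               : int(((-s $fh) - $o{offset}) / $elsize);

    # Stand-in for the mmap'ed region: copy the bytes into a string.
    seek($fh, $o{offset}, 0) or die "seek failed: $!";
    my $buf = '';
    defined read($fh, $buf, $nels * $elsize) or die "read failed: $!";
    $buf .= "\0" x ($nels * $elsize - length $buf);    # pad a short file

    return bless { %o, buf => $buf, elsize => $elsize, nels => $nels }, $class;
}

sub FETCHSIZE { $_[0]{nels} }

sub FETCH {
    my ($self, $i) = @_;
    return undef if $i >= $self->{nels};
    my $bytes = substr($self->{buf}, $i * $self->{elsize}, $self->{elsize});
    return unpack($self->{eltype}, $bytes);
}

sub STORE {
    my ($self, $i, $value) = @_;
    die "array is read-only\n"     if $self->{mode} eq 'ro';
    die "index $i out of range\n"  if $i >= $self->{nels};
    substr($self->{buf}, $i * $self->{elsize}, $self->{elsize})
        = pack($self->{eltype}, $value);
}

package main;
```

With that in place, both tie @a, 'Sketch::PackedArray', $filename, { eltype => "l" } and the same call with an open filehandle in place of $filename go through the same code path; only the openhandle() check differs.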


 > > 2. The mode could be a numeric parameter with values:
 > >    O_RDONLY, O_WRONLY or O_RDWR (from Fcntl).  I know that mmap uses
 > >    PROT_READ, PROT_WRITE, but not many people are that familiar with
 > >    mmap compared to open(2).
 > 
 > True. I was confusing mode (mmap prot arg) with separate mmap flags arg.
 > 
 > >    I could allow both the Fcntl constants 
 > >    and the strings "ro", "wo" and "rw".
 > 
 > Given the above I'd say just go with the strings. Keep it simple.
 > 
 > > 3. using the pack letters does not allow all (sensible) possibilities
 > >    (signed/unsigned, network/native, 1/2/4/8 integers or float/double).
 > >    specifically it looks like unsigned network order [is not covered]
 > 
 > Then submit a perl patch that adds them, if it is really needed.
 > See Porting/patching.pod in the perl source directory.
 > [CC'd to perl5-porters for comment].

My client actually uses 3 byte integers to save diskspace in some
circumstances ;-)

 > 
 > >    and 8 byte integers are not covered.
 > 
 > From the 5.005_03 manual:
 > 
 >     q   A signed quad (64-bit) value.
 >     Q   An unsigned quad value.
 >       (Available only if your system supports 64-bit integer values
 >        _and_ if Perl has been compiled to support those.
 >            Causes a fatal error otherwise.)
 > 

Johan's Perl5 Pocket Reference doesn't have q or Q so I missed that.
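
Since q and Q are a fatal error on a perl built without 64-bit support, a module could probe a pack letter before accepting it as an element type. The following is a minimal sketch of that idea; the helper name packed_size is made up for illustration.

```perl
use strict;
use warnings;

# Return the packed size in bytes of one element of the given pack letter,
# or undef if this perl rejects the letter (as it does for "q"/"Q" when
# built without 64-bit integer support).
sub packed_size {
    my ($letter) = @_;
    my $size = eval { length pack($letter, 0) };
    return $size;
}

for my $letter (qw(c s l q n N)) {
    my $size = packed_size($letter);
    printf "%-2s %s\n", $letter,
        defined $size ? "$size bytes" : "not supported by this perl";
}
```

The same size is what a tied array would need as its element stride.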


 > >    I could allow the letters to be
 > >    qualified, e.g. i8 would be an eight-byte integer, and c22 would be 
 > >    an array of 22-character, fixed-length strings.
 > 
 > I'd suggest you just follow the well defined pack() syntax. If you need
 > something outside that then (try to) either patch perl's pack to add
 > it, or agree with perl5-porters what a safe 'escape' syntax would be so
 > you'll be safe from future additions to pack().

Actually if the pack letters cover everything I'm interested in then
something like i8 could create a two dimensional array.  One could
even extend the metaphor such that something like

    tie @array, 'Tie::MmapArray', $file, { eltype => "ia24i", ... };

would tie @array to a file with the record structure given, so that
$array[$n] returns a reference to a three-element array, the elements
of which are an integer, a 24-character string and an integer
respectively, and hey presto we've got access to the fields of a
record-oriented file.  (Mmm, I wonder what eltype => "perl" should do.)
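
As a rough sketch of how the record case could work underneath, the following uses pack and unpack on fixed-length slices of a buffer. An ordinary string stands in for the mmap'ed region, and fetch_record and store_record are made-up helper names, not part of any existing module.

```perl
use strict;
use warnings;

my $TEMPLATE = "ia24i";                            # int, 24-char string, int
my $RECLEN   = length pack($TEMPLATE, 0, "", 0);   # bytes per record

# Return record $n as a reference to a three-element array.
sub fetch_record {
    my ($buf_ref, $n) = @_;
    return [ unpack($TEMPLATE, substr($$buf_ref, $n * $RECLEN, $RECLEN)) ];
}

# Overwrite record $n in place with the fields in @$rec.
sub store_record {
    my ($buf_ref, $n, $rec) = @_;
    substr($$buf_ref, $n * $RECLEN, $RECLEN) = pack($TEMPLATE, @$rec);
}

# Stand-in for the mapped file: room for three zero-filled records.
my $buf = "\0" x (3 * $RECLEN);

store_record(\$buf, 1, [ 7, "hello", -3 ]);
my $rec = fetch_record(\$buf, 1);

# "a24" keeps the NUL padding on unpack, so strip it for display.
(my $name = $rec->[1]) =~ s/\0+\z//;
print "$rec->[0] $name $rec->[2]\n";
```

Note that the "a" letter hands back the string with its NUL padding intact; "A" would strip trailing spaces and NULs on unpack, which may be the friendlier default for string fields.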

I might leave this level of complexity out of the first version of the
module though ;-)

Andrew
-- 
Andrew Ford,  Director       Ford & Mason Ltd           +44 1531 829900 (tel)
[EMAIL PROTECTED]      South Wing, Compton House  +44 1531 829901 (fax)
http://www.ford-mason.co.uk  Compton Green, Redmarley   +44 385 258278 (mobile)
http://www.refcards.com      Gloucester, GL19 3JB, UK   
