Re: [Feature Request] Some ideas on 'mmap'

Nala Ginrut Tue, 30 Apr 2013 06:56:09 -0700

On Tue, 2013-04-30 at 18:12 +0800, Daniel Hartwig wrote:
> On 30 April 2013 17:27, Nala Ginrut <nalagin...@gmail.com> wrote:
> > hi guys!
> > A discussion about adding 'mmap' came up on IRC, so I'd like to share some
> > ideas on this topic:
> >
> > 1. Complex one or simple one?
> > The simple one is just a simple wrapper taking advantage of (system
> > foreign).
> > The complex one is more like Python's mmap module: it contains a special
> > type, <mmap>, and a bunch of functions to handle it.
> >
> > 2. If we pick the simple one, just ignore the rest of this mail.
> > But if the complex one is chosen, we need (ice-9 mmap).
> >
> > 3. I'd choose a Python-like interface for mmap:
> > (mmap port #:optional length prot flags offset)
> > If port is #f, it works as an anonymous mapping.
> >
> > 4. mmap returns a new type <mmap>, and it may work like 'array':
> > (define m (mmap port))
> > (mmap-ref m 10 20)
> > (mmap-set! m 0 0 1024)
> > The interfaces may be:
> > (mmap-ref <mmap> from to) ==> return u8 bytevector
> > (mmap-set! <mmap> byte from to) ==> return unspecified
> >
> 
> You may as well just have mmap return a bytevector or pointer to the
> start, and forget about these redundant procedures.  mmaped data is a
> bytevector, so no need to reimplement that interface with different
> names.
>


<mmap> is just a structure like this (maybe more?):
(define-record-type <mmap>
  ...
  (pointer mmap-pointer)
  (flag mmap-flag)
  (size mmap-size))
So it's actually a pointer, and because the record carries 'size', we can
release it without having to remember the size we requested.

If I use a bytevector instead, it means I have to read all the content
from the file first. I don't think that's the same as mmap in POSIX.
mmap is used for I/O on very large data; if we decide to read it all up
front, we lose the benefit.
mmap does lazy disk I/O for the file automatically.
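
To make the lazy I/O point concrete, here is a minimal sketch using
Python's mmap module (the module the proposed interface is modeled on),
not the proposed Guile API. The helper names read_range and make_sample
are hypothetical and exist only for this illustration. Only the pages
touched by the slice need to be brought in from disk; the file is never
read in full by the program.

```python
import mmap
import os
import tempfile


def make_sample(size=1 << 20):
    """Create a hypothetical sample file of `size` bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        # Repeating 0..255 pattern so any byte's value is its offset mod 256.
        f.write(bytes(range(256)) * (size // 256))
    return path


def read_range(path, start, end):
    """Return bytes [start, end) of the file through a read-only mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; no explicit read() of the content.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Slicing copies only the requested range out of the mapping.
            return m[start:end]
```

Usage would look like (read_range "big.dat" 10 20) in spirit, which is
what (mmap-ref m 10 20) is meant to provide.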

> > 5. use munmap to release it
> > (munmap <mmap>)
> >
> 
> > 6. other helper functions also available:
> 
> If you want a port, use a port.  Likewise for strings, bytevectors.
> 

For instance, in a multi-threaded program, if we use a port and need to
move the cursor, we have to save and restore the cursor for the other
threads. But if we use mmap, we don't have to do that: each thread keeps
its own pointer/index.
And why not read it all into a bytevector? Yes, that helps, but as I
explained, it doesn't work for a very big file.
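
A rough sketch of the per-thread index idea, again in Python's mmap
module rather than the proposed Guile API. The function names sum_chunk
and parallel_sum are hypothetical. Each worker indexes the shared
mapping with its own local variable, so there is no shared cursor to
save or restore.

```python
import mmap
import threading


def sum_chunk(m, start, end, out, slot):
    """Sum the bytes m[start:end] using a thread-local index."""
    total = 0
    for i in range(start, end):
        total += m[i]  # indexing a mapping yields an int and moves no cursor
    out[slot] = total


def parallel_sum(m, nthreads=4):
    """Split the mapping into nthreads chunks and sum them concurrently."""
    size = len(m)
    step = (size + nthreads - 1) // nthreads
    out = [0] * nthreads
    threads = []
    for t in range(nthreads):
        start = min(t * step, size)
        end = min(start + step, size)
        th = threading.Thread(target=sum_chunk, args=(m, start, end, out, t))
        threads.append(th)
        th.start()
    for th in threads:
        th.join()
    return sum(out)
```

This only covers concurrent reads; concurrent writes to the same region
would still need their own synchronization.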

> > (mmap-find <mmap> string #:optional start end)
> > (mmap-flush <mmap> #:optional offset size)
> > (mmap-move <mmap> dest src count)
> > (mmap-read <mmap> num)
> > (mmap-readbyte <mmap>)
> > (mmap-readline <mmap>)
> > (mmap-resize <mmap> newsize)
> > (mmap-rfind <mmap> string #:optional start end)
> > (mmap-seek <mmap> pos #:optional whence)
> > (mmap-size <mmap>)
> > (mmap-tell <mmap>) ; Returns the current position of the file pointer.
> > (mmap-write <mmap> str/bv)
> > (mmap-writebyte <mmap> byte)
> >
> > Comments?
> > Thanks!
> >
> >
> >
