On Tue, 2013-04-30 at 18:12 +0800, Daniel Hartwig wrote:
> On 30 April 2013 17:27, Nala Ginrut <nalagin...@gmail.com> wrote:
> > hi guys!
> > A discussion on IRC about adding 'mmap' raised, and I will share some
> > ideas on this topic:
> >
> > 1. Complex one or simple one?
> > The simple one is just a simple wrapper taking advantage of (system
> > foreign).
> > The complex one is more alike Python's mmap module. Contains a special
> > type of <mmap>, and a bunch of functions to handle it.
> >
> > 2. If simple one, just leave the rest of the mail alone.
> > But if complex one was chosen, we need (ice-9 mmap).
> >
> > 3. I'll choose Python alike interface for mmap:
> > (mmap port #:optional length prog flags offset)
> > And if port is #f, it works as anonymous-mapping.
> >
> > 4. mmap returns a new type <mmap>, and it may work like 'array':
> > (define m (mmap port))
> > (mmap-ref m 10 20)
> > (mmap-set! m 0 0 1024)
> > The interfaces maybe:
> > (mmap-ref <mmap> from to) ==> return u8 bytevector
> > (mmap-set! <mmap> byte from to) ==> return unspecified
> >
>
> You may as well just have mmap return a bytevector or pointer to the
> start, and forget about these redundant procedures. mmaped data is a
> bytevector, so no need to reimplement that interface with different
> names.
>

<mmap> is just a structure like this (maybe more?):

(define-record-type <mmap>
  ...
  (pointer mmap-pointer)
  (flag mmap-flag)
  (size mmap-size))

So it's actually a pointer, and with 'size' stored alongside it, we can
release it without having to remember the size we requested.

If I use a bytevector instead, it means I have to read all the content
from the file first. I don't think that is the same as mmap in POSIX.
mmap is used for I/O on very large data; if we decide to read it all, we
lose the game. mmap does lazy disk I/O automatically for the file.

> > 5. use munmap to release it
> > (munmap <mmap>)
> >
>
> > 6. other helper functions also available:
>
> If you want a port, use a port. Likewise for strings, bytevectors.
>

For instance, in a multi-threaded program, if we use a port and need to
move the cursor, we have to save and restore the cursor for the other
threads. But if we use mmap, we don't have to do that: each thread keeps
its own pointer/index.

And why not read it all into a bytevector? Yes, that helps, but as I
explained, the problem is very big files.

> > (mmap-find <mmap> string #:optional start end)
> > (mmap-flush <mmap> #:optional offset size)
> > (mmap-move <mmap> dest src count)
> > (mmap-read <mmap> num)
> > (mmap-readbyte <mmap>)
> > (mmap-readline <mmap>)
> > (mmap-resize <mmap> newsize)
> > (mmap-rfind <mmap> string #:optional start end)
> > (mmap-seek <mmap> pos #:optional whence)
> > (mmap-size <mmap>)
> > (mmap-tell <mmap>) ; Returns the current position of the file pointer.
> > (mmap-write <mmap> str/bv)
> > (mmap-writebyte <mmap> byte)
> >
> > Comments?
> > Thanks!
> >
>
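
To make the <mmap> record and the mmap-ref / mmap-set! pair a bit more concrete, here is a rough sketch of how they might look on top of (system foreign). It is only an illustration under some assumptions: the constructor make-mmap, the predicate mmap?, and the helper mmap-view are hypothetical names, the range is treated as half-open [from, to), and an ordinary in-memory bytevector stands in for a real mapping so the snippet runs without calling mmap(2).

```scheme
;; Sketch only. make-mmap, mmap? and mmap-view are hypothetical names;
;; the half-open range [from, to) is an assumption, not a decided API.
(use-modules (srfi srfi-9)
             (system foreign)
             (rnrs bytevectors))

(define-record-type <mmap>
  (make-mmap pointer flag size)
  mmap?
  (pointer mmap-pointer)
  (flag mmap-flag)
  (size mmap-size))

;; Wrap the region behind the pointer as a bytevector view.
(define (mmap-view m)
  (pointer->bytevector (mmap-pointer m) (mmap-size m)))

;; (mmap-ref <mmap> from to) ==> fresh u8 bytevector with bytes [from, to)
(define (mmap-ref m from to)
  (let ((out (make-bytevector (- to from))))
    (bytevector-copy! (mmap-view m) from out 0 (- to from))
    out))

;; (mmap-set! <mmap> byte from to) ==> fills [from, to) with byte
(define (mmap-set! m byte from to)
  (let ((view (mmap-view m)))
    (do ((i from (+ i 1)))
        ((>= i to))
      (bytevector-u8-set! view i byte))))

;; Stand-in for a real mapping: 1024 zero bytes owned by Guile.
;; The flag value 0 is a placeholder.
(define backing (make-bytevector 1024 0))
(define m (make-mmap (bytevector->pointer backing)
                     0
                     (bytevector-length backing)))

(mmap-set! m 255 0 4)
(mmap-ref m 0 8)   ; => #vu8(255 255 255 255 0 0 0 0)
```

Because the record carries its own size, a munmap wrapper would only need the record itself, and each thread can index into the region without sharing a cursor.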