Given that ISV apps can only be changed by the ISV, who may or may not be willing to use such a new interface, having a "no cache" property for the file - or, given that filesystems are now really cheap with ZFS, for the whole filesystem - would be important as well, like the forcedirectio mount option for UFS.
No caching at the filesystem level is always appropriate when the application itself maintains a buffer of application data and does its own application-specific buffer management, as DBMSes or large matrix solvers do. Double-caching these typically huge amounts of data
in the filesystem is always a waste of RAM.
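
For comparison, this is roughly what the per-file version of that knob already looks like on UFS today via directio(3C) - a sketch only, with minimal error handling; on a filesystem that does not support the advice (ZFS, currently), the call simply fails:

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/fcntl.h>
    #include <fcntl.h>
    #include <stdio.h>

    /*
     * Ask the filesystem not to cache data for this file: the per-file
     * equivalent of mounting UFS with forcedirectio.
     */
    int
    open_uncached(const char *path)
    {
        int fd = open(path, O_RDWR);

        if (fd >= 0 && directio(fd, DIRECTIO_ON) != 0)
            perror("directio (not supported on this filesystem?)");
        return (fd);
    }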

- Franz

Nicolas Williams wrote:

On Fri, May 12, 2006 at 05:23:53PM +0200, Roch Bourbonnais - Performance 
Engineering wrote:
For reads it is an interesting concept, since

        reading into the cache,
        then copying into user space,
        then keeping the data around but never using it

is not optimal. So there are two issues: the cost of the copy, and the memory.

Now, could we detect the pattern that makes holding on to the
cached block suboptimal, and do a quick freebehind after the copyout?
Something like random access + very large file + poor cache-hit ratio?
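
Purely as an illustration of what such a heuristic could look like (the names and thresholds below are invented for this sketch, not anything in the current ARC):

    #include <stdint.h>

    /* Invented thresholds -- a sketch of the policy, not ZFS code. */
    #define BIGFILE_BYTES   (1ULL << 30)    /* "very large": > 1 GB */
    #define POOR_HIT_PCT    10              /* "poor": < 10% cache hits */

    struct file_stats {
        uint64_t size;          /* current file size */
        uint64_t seq_reads;     /* reads adjacent to the previous one */
        uint64_t rand_reads;    /* everything else */
        uint64_t cache_hits;
        uint64_t cache_misses;
    };

    /* Return nonzero if a block should be freed right after the copyout. */
    static int
    should_freebehind(const struct file_stats *fs)
    {
        uint64_t reads = fs->seq_reads + fs->rand_reads;
        uint64_t lookups = fs->cache_hits + fs->cache_misses;

        if (fs->size < BIGFILE_BYTES || reads == 0 || lookups == 0)
            return (0);
        if (fs->rand_reads * 2 < reads)     /* mostly sequential access */
            return (0);
        return (fs->cache_hits * 100 < lookups * POOR_HIT_PCT);
    }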

An interface to request no caching on a per-file basis would be good
(madvise(2) should do for mmap'ed files, an fcntl(2) or open(2) flag
would be better).
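
For the mmap case something close to that is already expressible; a minimal sketch (error handling omitted) of consuming a file once and then advising the kernel to drop the pages:

    #include <sys/types.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Map a file, let the caller consume it once, then drop the pages. */
    void
    consume_once(const char *path, void (*use)(const void *, size_t))
    {
        int fd = open(path, O_RDONLY);
        struct stat st;

        fstat(fd, &st);
        void *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);

        use(p, st.st_size);                     /* application reads the data */
        madvise(p, st.st_size, MADV_DONTNEED);  /* advice: no need to keep it */

        munmap(p, st.st_size);
        close(fd);
    }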

Now, about avoiding the copy: that would mean DMA straight
into user space? But if the checksum does not validate the
data, what do we do?

Who cares?  You DMA into user space, check the checksum, and if there's a
problem return an error; so there's [corrupted] data in the user-space
buffer... but the app knows it, so what's the problem (see below)?
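
That is, something with roughly the following shape, where pread(2) stands in for the direct DMA and the checksum helper is an illustrative stand-in for whatever is in the block pointer (fletcher, SHA-256) - not ZFS code:

    #include <errno.h>
    #include <stdint.h>
    #include <unistd.h>

    /* Illustrative stand-in for the real block checksum. */
    static uint64_t
    toy_checksum(const uint8_t *p, size_t len)
    {
        uint64_t a = 0, b = 0;

        while (len--) {
            a += *p++;
            b += a;
        }
        return (b);
    }

    /*
     * Read straight into the caller's buffer and verify afterwards.  On a
     * mismatch the buffer really does hold bad data, but the caller is told
     * so via EIO and must not use it.
     */
    static ssize_t
    read_verified(int fd, void *buf, size_t len, off_t off, uint64_t expected)
    {
        ssize_t n = pread(fd, buf, len, off);

        if (n > 0 && toy_checksum(buf, (size_t)n) != expected) {
            errno = EIO;
            return (-1);
        }
        return (n);
    }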

If storage is not raid-protected and we
have to return EIO, I don't think we can do this _and_
also corrupt the user buffer; not sure what POSIX says about
this situation.

If POSIX compliance is an issue just add new interfaces (possibly as
simple as an open(2) flag).
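
From the application side that could be as little as this; O_NOCACHE is a purely hypothetical flag name for such an interface, not something that exists today:

    #include <fcntl.h>

    /* O_NOCACHE is hypothetical -- a stand-in for the proposed open(2) flag. */
    #ifndef O_NOCACHE
    #define O_NOCACHE 0     /* no-op fallback where it doesn't exist */
    #endif

    int
    open_for_dbms(const char *path)
    {
        /* Ask for uncached reads straight into the app's own buffer pool. */
        return (open(path, O_RDWR | O_NOCACHE));
    }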

Now, latency-wise, the cost of the copy is small compared to the
I/O, right? So it now turns into an issue of saving some
CPU cycles.

Can you build a system where the cost of the copy adds significantly to
the latency numbers?  (Think RAM disks.)
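
Rough numbers for context: copying a 128 KB record at a few GB/s of memory bandwidth is on the order of 30-60 us, which is noise next to a ~5 ms disk read but comparable to the service time of a RAM disk or a cache-resident read. A quick way to measure the copy cost on a given box (assumes clock_gettime(CLOCK_MONOTONIC) is available; gethrtime(3C) would do on Solaris):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    #define BLK     (128 * 1024)    /* one large record */
    #define ITERS   10000

    int
    main(void)
    {
        char *src = malloc(BLK), *dst = malloc(BLK);
        struct timespec t0, t1;

        if (src == NULL || dst == NULL)
            return (1);
        memset(src, 0xab, BLK);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < ITERS; i++)
            memcpy(dst, src, BLK);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 +
            (t1.tv_nsec - t0.tv_nsec);
        printf("avg %d KB copy: %.1f us\n", BLK / 1024, ns / ITERS / 1000.0);

        free(src);
        free(dst);
        return (0);
    }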

Nico
