Re: [zfs-discuss] Re: ZFS and databases

2006-06-14 Thread Roch
For Output ops, ZFS could set up a 10MB I/O transfer to disk starting at sector X, or chunk that up in 128K while still assigning the same range of disk blocks for the operations. Yes there will be more control information going around, a little more CPU consumed, but the disk w
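To make the chunking idea concrete, here is a minimal sketch of splitting one large transfer into 128K pieces that together cover exactly the same contiguous range. It is not ZFS code: the names `io_chunk`, `chunk_count` and `chunk_at` are hypothetical, and it works in byte offsets rather than sectors for simplicity.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_BYTES (128u * 1024u)   /* 128K, the chunk size named in the post */

/* One piece of a larger transfer (hypothetical type, for illustration only). */
struct io_chunk {
    uint64_t offset;   /* byte offset of this piece on disk */
    uint64_t length;   /* bytes in this piece */
};

/* Number of chunks needed to cover total_bytes; the last one may be short. */
uint64_t chunk_count(uint64_t total_bytes)
{
    return (total_bytes + CHUNK_BYTES - 1) / CHUNK_BYTES;
}

/*
 * Describe chunk number idx of a transfer that starts at byte offset
 * start and spans total_bytes. Consecutive chunks are back to back, so
 * together they cover the same disk range as the single large transfer.
 */
struct io_chunk chunk_at(uint64_t start, uint64_t total_bytes, uint64_t idx)
{
    struct io_chunk c;
    uint64_t done;

    assert(idx < chunk_count(total_bytes));
    done = idx * CHUNK_BYTES;
    c.offset = start + done;
    c.length = (total_bytes - done < CHUNK_BYTES) ? total_bytes - done
                                                  : CHUNK_BYTES;
    return c;
}
```

A 10MB transfer comes out as 80 such chunks, and the last chunk ends where the single large transfer would have ended. The extra "control information" the post mentions corresponds to issuing 80 descriptors instead of one.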

Re: [zfs-discuss] Re: ZFS and databases

2006-06-14 Thread Richard Elling
billtodd wrote: I do want to comment on the observation that "enough concurrent 128K I/O can saturate a disk" - the apparent implication being that one could therefore do no better with larger accesses, an incorrect conclusion. Current disks can stream out 128 KB in 1.5 - 3 ms., while taking 5
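The argument about larger accesses can be sketched as simple arithmetic: each access pays a positioning cost before it streams data, so the share of time spent transferring grows with the transfer size. The helper names below are hypothetical, and the positioning time is left as a parameter because the figure in the quoted message is cut off.

```c
#include <assert.h>

/*
 * Fraction of one I/O's service time spent actually moving data, given
 * the transfer time and the positioning (seek plus rotation) time, both
 * in milliseconds.
 */
double transfer_fraction(double transfer_ms, double positioning_ms)
{
    assert(transfer_ms > 0.0 && positioning_ms >= 0.0);
    return transfer_ms / (transfer_ms + positioning_ms);
}

/*
 * Effective throughput in KB/s when every access of io_kb kilobytes
 * costs one positioning delay plus its transfer time.
 */
double effective_kb_per_s(double io_kb, double transfer_ms,
                          double positioning_ms)
{
    assert(io_kb > 0.0);
    assert(transfer_ms > 0.0 && positioning_ms >= 0.0);
    return io_kb / ((transfer_ms + positioning_ms) / 1000.0);
}
```

With the positioning time held fixed, doubling both `io_kb` and `transfer_ms` always raises `transfer_fraction`, which is the sense in which a disk kept busy with 128K accesses can still deliver less than its streaming rate.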

[zfs-discuss] Re: ZFS and databases

2006-06-13 Thread can you guess?
Sorry for resurrecting this interesting discussion so late: I'm skimming backwards through the forum. One comment about segregating database logs is that people who take their data seriously often want a 'belt plus suspenders' approach to recovery. Conventional RAID, even supplemented with ZF

Re: [zfs-discuss] Re: ZFS and databases

2006-05-15 Thread Nicolas Williams
On Mon, May 15, 2006 at 11:17:17AM -0700, Bart Smaalders wrote: > Perhaps an fadvise call is in order? We already have directio(3C). (That was a surprise for me also.)

Re: [zfs-discuss] Re: ZFS and databases

2006-05-15 Thread Bart Smaalders
Nicolas Williams wrote: On Mon, May 15, 2006 at 07:16:38PM +0200, Franz Haberhauer wrote: Nicolas Williams wrote: Yes, but remember, DB vendors have adopted new features before -- they want to have the fastest DB. Same with open source web servers. So I'm a bit optimistic. Yes, but they usu

Re: [zfs-discuss] Re: ZFS and databases

2006-05-15 Thread Nicolas Williams
On Mon, May 15, 2006 at 07:16:38PM +0200, Franz Haberhauer wrote: > Nicolas Williams wrote: > >Yes, but remember, DB vendors have adopted new features before -- they > >want to have the fastest DB. Same with open source web servers. So I'm > >a bit optimistic. > > > > > Yes, but they usually ado

Re: [zfs-discuss] Re: ZFS and databases

2006-05-15 Thread Darren J Moffat
Franz Haberhauer wrote: This would work technically, but whether ISVs are willing to support such usage is a different topic (there may be startup scripts involved making it a little tricky to pass a library path to the app). Yet another reason to start the applications from SMF. -- Darren J

Re: [zfs-discuss] Re: ZFS and databases

2006-05-15 Thread Franz Haberhauer
Nicolas Williams wrote: On Sat, May 13, 2006 at 08:23:55AM +0200, Franz Haberhauer wrote: Given that ISV apps can be only changed by the ISV who may or may not be willing to use such a new interface, having a "no cache" property for the file - or given that filesystems are now really cheap

Re: [zfs-discuss] Re: ZFS and databases

2006-05-13 Thread Nicolas Williams
On Sat, May 13, 2006 at 08:23:55AM +0200, Franz Haberhauer wrote: > Given that ISV apps can be only changed by the ISV who may or may not be > willing to > use such a new interface, having a "no cache" property for the file - or > given that filesystems > are now really cheap with ZFS - for the

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Franz Haberhauer
Given that ISV apps can be only changed by the ISV who may or may not be willing to use such a new interface, having a "no cache" property for the file - or given that filesystems are now really cheap with ZFS - for the filesystem would be important as well, like the forcedirectio mount option

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Matthew Ahrens
On Fri, May 12, 2006 at 12:36:53PM -0500, Anton Rang wrote: > >We might want an interface for the app to know what the natural block > >size of the file is, so it can read at proper file offsets. > > Seems that stat(2) could be used for this ... > > long st_blksize; /* Preferred I/O blo

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Anton Rang
On May 12, 2006, at 11:59 AM, Richard Elling wrote: CPU cycles and memory bandwidth (which both can be in short supply on a database server). We can throw hardware at that :-) Imagine a machine with lots of extra CPU cycles [ ... ] Yes, I've heard this story before, and I won't believe it t

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Anton Rang
We might want an interface for the app to know what the natural block size of the file is, so it can read at proper file offsets. Seems that stat(2) could be used for this ... long st_blksize; /* Preferred I/O block size */ This isn't particularly useful for databases if they already
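For reference, the `stat(2)` approach mentioned here might look like the following sketch. `preferred_io_size` and `align_down` are hypothetical helper names, and nothing in the thread says what value ZFS would report in `st_blksize` for a given file.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Return the preferred I/O block size that stat(2) reports for path,
 * or -1 if the stat call fails.
 */
long preferred_io_size(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    if (stat(path, &sb) != 0)
        return -1;
    return (long)sb.st_blksize;
}

/* Round a file offset down to a multiple of blksize. */
long long align_down(long long offset, long blksize)
{
    assert(blksize > 0 && offset >= 0);
    return offset - (offset % blksize);
}
```

An application could call `preferred_io_size()` once per file and then use `align_down()` so that its reads start on block boundaries.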

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Nicolas Williams
On Fri, May 12, 2006 at 09:59:56AM -0700, Richard Elling wrote: > On Fri, 2006-05-12 at 10:42 -0500, Anton Rang wrote: > > > Now latency wise, the cost of copy is small compared to the > > > I/O; right ? So it now turns into an issue of saving some > > > CPU cycles. > > > > CPU cycles and memo

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Richard Elling
On Fri, 2006-05-12 at 10:42 -0500, Anton Rang wrote: > > Now latency wise, the cost of copy is small compared to the > > I/O; right ? So it now turns into an issue of saving some > > CPU cycles. > > CPU cycles and memory bandwidth (which both can be in short > supply on a database server). We

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Nicolas Williams
On Fri, May 12, 2006 at 06:33:00PM +0200, Roch Bourbonnais - Performance Engineering wrote: > Directio is non-posix anyway and given that people have been > trained to inform the system that the cache won't be useful, > that it's a hard problem to detect automatically, let's > avoid the copy and sa

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Nicolas Williams writes: > On Fri, May 12, 2006 at 05:23:53PM +0200, Roch Bourbonnais - Performance > Engineering wrote: > > For read it is an interesting concept. Since > > > >Reading into cache > >Then copy into user space > >then keep data around but never use it > > > >

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Nicolas Williams
On Fri, May 12, 2006 at 05:23:53PM +0200, Roch Bourbonnais - Performance Engineering wrote: > For read it is an interesting concept. Since > > Reading into cache > Then copy into user space > then keep data around but never use it > > is not optimal. > So 2 issues, there is th

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Anton Rang
Now could we detect the pattern that makes holding on to the cached block suboptimal and do a quick freebehind after the copyout? Something like random access + very large file + poor cache hit ratio? We might detect it ... or we could let the application give us the hint, via the directio ioct
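The detection idea (random access + very large file + poor cache hit ratio) could be sketched roughly as below. Everything here is an assumption made for illustration: the `file_access_stats` counters, the thresholds, and the function name are hypothetical and do not come from ZFS or from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-file counters; not an actual ZFS structure. */
struct file_access_stats {
    uint64_t file_bytes;        /* current file size */
    uint64_t reads;             /* total reads observed */
    uint64_t sequential_reads;  /* reads starting where the previous one ended */
    uint64_t cache_hits;        /* reads satisfied from cache */
};

/* Illustrative thresholds, chosen arbitrarily for this sketch. */
#define LARGE_FILE_BYTES (1ULL << 30)  /* treat 1 GiB and up as "very large" */
#define MIN_SAMPLE_READS 1000u         /* do not decide on too few samples */
#define MAX_SEQ_PERCENT  10u           /* at most 10% sequential: "random" */
#define MAX_HIT_PERCENT  5u            /* at most 5% hits: "poor" hit ratio */

/* True when all three conditions from the post hold for this file. */
bool should_free_behind(const struct file_access_stats *s)
{
    bool large, random_access, poor_hits;

    assert(s != NULL);
    assert(s->sequential_reads <= s->reads && s->cache_hits <= s->reads);

    if (s->reads < MIN_SAMPLE_READS)
        return false;

    large = s->file_bytes >= LARGE_FILE_BYTES;
    random_access = s->sequential_reads * 100 <= s->reads * MAX_SEQ_PERCENT;
    poor_hits = s->cache_hits * 100 <= s->reads * MAX_HIT_PERCENT;
    return large && random_access && poor_hits;
}
```

The alternative raised in the same message, letting the application give the hint itself, avoids having to pick thresholds like these at all.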

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Anton B. Rang writes: > >Were the benefits coming from extra concurrency (no > >single writer lock) or avoiding the extra copy to page cache or > >from too much readahead that is not used before pages need to > >be recycled. > > With QFS, a major benefit we see for databases and direct I/

[zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Anton B. Rang
>Were the benefits coming from extra concurrency (no >single writer lock) or avoiding the extra copy to page cache or >from too much readahead that is not used before pages need to >be recycled. With QFS, a major benefit we see for databases and direct I/O is an effective doubling of the memor