Re: [zfs-discuss] ZFS and databases

2006-05-22 Thread Roch Bourbonnais - Performance Engineering
Cool, I'll try the tool and for good measure the data I posted was sequential access (from a logical point of view). As for the physical layout, I don't know, it's quite possible that ZFS has laid out all blocks sequentially on the physical side; so certainly this is not a good way

Re: [zfs-discuss] ZFS and databases

2006-05-22 Thread Roch Bourbonnais - Performance Engineering
Gregory Shaw writes: > Rich, correct me if I'm wrong, but here's the scenario I was thinking > of: > > - A large file is created. > - Over time, the file grows and shrinks. > > The anticipated layout on disk due to this is that extents are > allocated as the file changes. The extent

Re: [zfs-discuss] ZFS and databases

2006-05-15 Thread Franz Haberhauer
The problem I see with "sequential access jump all over the place" is that this increases the utilization of the disks - over the years disks have become even faster for sequential access, whereas random access (as they have to move the actuator) has not improved at the same pace - this is what

Re: [zfs-discuss] ZFS and databases

2006-05-15 Thread Gregory Shaw
Rich, correct me if I'm wrong, but here's the scenario I was thinking of: - A large file is created. - Over time, the file grows and shrinks. The anticipated layout on disk due to this is that extents are allocated as the file changes. The extents may or may not be on multiple spindles.

Re: [zfs-discuss] ZFS and databases

2006-05-15 Thread Roch Bourbonnais - Performance Engineering
Gregory Shaw writes: > I really like the below idea: > - the ability to defragment a file 'live'. > > I can see instances where that could be very useful. For instance, > if you have multiple LUNs (or spindles, whatever) using ZFS, you > could re-optimize large files to spre

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Gregory Shaw
I really like the below idea: - the ability to defragment a file 'live'. I can see instances where that could be very useful. For instance, if you have multiple LUNs (or spindles, whatever) using ZFS, you could re-optimize large files to spread the chunks across as many spindles a

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Gregory Shaw
To: Mike Gerdts <[EMAIL PROTECTED]> Cc: ZFS filesystem discussion list, [EMAIL PROTECTED] Subject: Re: [zfs-discuss] ZFS and databases Date: Thu, 11 May 2006 13:15:48 -0600 Regarding directio and quickio, is there a way with ZFS to skip the system buffer cache?

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Franz Haberhauer writes: > > 'ZFS optimizes random writes versus potential sequential reads.' > > This remark focused on the allocation policy during writes, > not the readahead that occurs during reads. > Data that are rewritten randomly but in place in a sequential, > contiguous file (l

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Peter Rival writes: > Roch Bourbonnais - Performance Engineering wrote: > > Tao Chen writes: > > > On 5/12/06, Roch Bourbonnais - Performance Engineering > > > <[EMAIL PROTECTED]> wrote: > > > > > > > > From: Gregory Shaw <[EMAIL PROTECTED]> > > > > Regarding directio and quickio,

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Franz Haberhauer
>'ZFS optimizes random writes versus potential sequential reads.' This remark focused on the allocation policy during writes, not the readahead that occurs during reads. Data that are rewritten randomly but in place in a sequential, contiguous file (like a preallocated UFS file) are not optimi

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
You could start with the ARC paper, Megiddo/Modha FAST'03 conference. ZFS uses a variation of that. It's an interesting read. -r Franz Haberhauer writes: > Gregory Shaw wrote On 05/11/06 21:15,: > > Regarding directio and quickio, is there a way with ZFS to skip the > > system buffer cache?

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Peter Rival
Roch Bourbonnais - Performance Engineering wrote: Tao Chen writes: > On 5/12/06, Roch Bourbonnais - Performance Engineering > <[EMAIL PROTECTED]> wrote: > > > > From: Gregory Shaw <[EMAIL PROTECTED]> > > Regarding directio and quickio, is there a way with ZFS to skip the > > system bu

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
'ZFS optimizes random writes versus potential sequential reads.' Now I don't think the current readahead code is where we want it to be yet but, in the same way that enough concurrent 128K I/O can saturate a disk (I sure hope that Milkowski's data will confirm this, ot

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Franz Haberhauer
Gregory Shaw wrote On 05/11/06 21:15,: Regarding directio and quickio, is there a way with ZFS to skip the system buffer cache? I've seen big benefits for using directio when the data files have been segregated from the log files. Having the system compete with the DB for read-ahead results

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Franz Haberhauer
Roch Bourbonnais - Performance Engineering wrote On 05/12/06 09:30,: Tao Chen writes: > On 5/11/06, Peter Rival <[EMAIL PROTECTED]> wrote: > > Richard Elling wrote: > > > Oracle will zero-fill the tablespace with 128kByte iops -- it is not > > > sparse. I've got a scar. Has this changed i

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Tao Chen writes: > On 5/12/06, Roch Bourbonnais - Performance Engineering > <[EMAIL PROTECTED]> wrote: > > > > From: Gregory Shaw <[EMAIL PROTECTED]> > > Regarding directio and quickio, is there a way with ZFS to skip the > > system buffer cache? I've seen big benefits for using direc

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Tao Chen
On 5/12/06, Roch Bourbonnais - Performance Engineering <[EMAIL PROTECTED]> wrote: From: Gregory Shaw <[EMAIL PROTECTED]> Regarding directio and quickio, is there a way with ZFS to skip the system buffer cache? I've seen big benefits for using directio when the data files have been segre

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Tao Chen writes: > On 5/11/06, Peter Rival <[EMAIL PROTECTED]> wrote: > > Richard Elling wrote: > > > Oracle will zero-fill the tablespace with 128kByte iops -- it is not > > > sparse. I've got a scar. Has this changed in the past few years? > > > > Multiple parallel tablespace creates is

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Jeff Bonwick writes: > > Are you saying that copy-on-write doesn't apply for mmap changes, but > > only file re-writes? I don't think that gels with anything else I > > know about ZFS. > > No, you're correct -- everything is copy-on-write. > Maybe the confusion comes from: mma

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Roch Bourbonnais - Performance Engineering
From: Gregory Shaw <[EMAIL PROTECTED]> Sender: [EMAIL PROTECTED] To: Mike Gerdts <[EMAIL PROTECTED]> Cc: ZFS filesystem discussion list , [EMAIL PROTECTED] Subject: Re: [zfs-discuss] ZFS and databases Date: Thu, 11 May 2006 13:15:48 -0600 Regarding directio and quick

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Jeff Bonwick
> Are you saying that copy-on-write doesn't apply for mmap changes, but > only file re-writes? I don't think that gels with anything else I > know about ZFS. No, you're correct -- everything is copy-on-write. Jeff

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Boyd Adamson
On 12/05/2006, at 3:59 AM, Richard Elling wrote: On Thu, 2006-05-11 at 10:27 -0700, Richard Elling wrote: On Thu, 2006-05-11 at 10:31 -0600, Gregory Shaw wrote: A couple of points/additions with regard to oracle in particular: When talking about large database installations, copy-on-wr

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Torrey McMahon
This thread is useless without data. This thread is useless without data. This thread is useless without data. This thread is useless without data. This thread is useless without data. :-P

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Gregory Shaw
Regarding directio and quickio, is there a way with ZFS to skip the system buffer cache? I've seen big benefits for using directio when the data files have been segregated from the log files. Having the system compete with the DB for read-ahead results in double work. On May 10, 2006, at

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Tao Chen
On 5/11/06, Peter Rival <[EMAIL PROTECTED]> wrote: Richard Elling wrote: > Oracle will zero-fill the tablespace with 128kByte iops -- it is not > sparse. I've got a scar. Has this changed in the past few years? Multiple parallel tablespace creates is usually a big pain point for filesystem /

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Peter Rival
Richard Elling wrote: On Thu, 2006-05-11 at 10:27 -0700, Richard Elling wrote: On Thu, 2006-05-11 at 10:31 -0600, Gregory Shaw wrote: A couple of points/additions with regard to oracle in particular: When talking about large database installations, copy-on-write may or may not apply. The

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Richard Elling
On Thu, 2006-05-11 at 10:27 -0700, Richard Elling wrote: > On Thu, 2006-05-11 at 10:31 -0600, Gregory Shaw wrote: > > A couple of points/additions with regard to oracle in particular: > > > > When talking about large database installations, copy-on-write may > > or may not apply. The files

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Peter Rival
Richard Elling wrote: On Thu, 2006-05-11 at 10:31 -0600, Gregory Shaw wrote: A couple of points/additions with regard to oracle in particular: When talking about large database installations, copy-on-write may or may not apply. The files are never completely rewritten, only changed inter

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Richard Elling
On Thu, 2006-05-11 at 10:31 -0600, Gregory Shaw wrote: > A couple of points/additions with regard to oracle in particular: > > When talking about large database installations, copy-on-write may > or may not apply. The files are never completely rewritten, only > changed internally via

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Gregory Shaw
A couple of points/additions with regard to oracle in particular: When talking about large database installations, copy-on-write may or may not apply. The files are never completely rewritten, only changed internally via mmap(). When you lay down your database, you will generally alloca

RE: [zfs-discuss] ZFS and databases

2006-05-11 Thread Gehr, Chuck R
Cc: [EMAIL PROTECTED]; Boyd Adamson; ZFS filesystem discussion list Subject: RE: [zfs-discuss] ZFS and databases Gehr, Chuck R writes: > One word of caution about random writes. From my experience, they are > not nearly as fast as sequential writes (like 10 to 20 times slower) > un

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Roch Bourbonnais - Performance Engineering
- Description of why I don't need directio, quickio, or ODM. The two main benefits that came out of using directio were reducing memory consumption by avoiding the page cache and bypassing the UFS single-writer behavior. ZFS does not have the single-writer lock. As for memory, the UFS code

RE: [zfs-discuss] ZFS and databases

2006-05-11 Thread Roch Bourbonnais - Performance Engineering
Gehr, Chuck R writes: > One word of caution about random writes. From my experience, they are > not nearly as fast as sequential writes (like 10 to 20 times slower) > unless they are carefully aligned on the same boundary as the file > system record size. Otherwise, there is a heavy read pena

Re: [zfs-discuss] ZFS and databases

2006-05-10 Thread Richard Elling
On Wed, 2006-05-10 at 20:42 -0500, Mike Gerdts wrote: > On 5/10/06, Boyd Adamson <[EMAIL PROTECTED]> wrote: > > What we need is some clear blueprints/best practices docs on this, I > > think. > > In due time... it was only recently that some of the performance enhancements were put back. Note: id

Re: [zfs-discuss] ZFS and databases

2006-05-10 Thread Mike Gerdts
On 5/10/06, Boyd Adamson <[EMAIL PROTECTED]> wrote: What we need is some clear blueprints/best practices docs on this, I think. Most definitely. Key things that people I work with (including me...) would like to see are... - Some success stories of people running large databases (working se

Re: [zfs-discuss] ZFS and databases

2006-05-10 Thread Boyd Adamson
On 11/05/2006, at 9:17 AM, James C. McPherson wrote: - Redundancy is performed at the filesystem level, probably on all disks in the pool. more at the pool level iirc, but yes, over all the disks where you have them mirrored or raid/raidZ-ed Yes, of course. I meant at the filesystem level

RE: [zfs-discuss] ZFS and databases

2006-05-10 Thread Gehr, Chuck R
To: Boyd Adamson Cc: ZFS filesystem discussion list Subject: Re: [zfs-discuss] ZFS and databases Hi Boyd, Boyd Adamson wrote: > One question that has come up a number of times when I've been > speaking with people (read: evangelizing :) ) about ZFS is about > database storage.

Re: [zfs-discuss] ZFS and databases

2006-05-10 Thread James C. McPherson
Hi Boyd, Boyd Adamson wrote: One question that has come up a number of times when I've been speaking with people (read: evangelizing :) ) about ZFS is about database storage. In conventional use storage has separated redo logs from table space, on a spindle basis. I'm not a database expert bu