I'm puzzled by 2 things.
Naively I'd think a write_cache should not help a throughput
test, since the cache should fill up, after which you should still be
throttled by the physical drain rate. You clearly show that
it helps; does anyone know why/how a cache helps throughput?
And the second thing...q
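(If you want to reproduce the write-cache effect yourself, here's a rough
sketch of the experiment; the format -e cache menu applies to SCSI/SAS
drives and the device name is just the one from Robert's tests, so adjust
to your setup. Careful: writing to the raw slice destroys its contents.)

# format -e -d c5t50E0119495A0d0
    (in the menu: cache -> write_cache -> disable, then rerun with enable)
# ptime dd if=/dev/zero of=/dev/rdsk/c5t50E0119495A0d0s0 bs=128k count=8192
    (same sequential write, timed once with the cache off and once on)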
Tao Chen writes:
> Hello Robert,
>
> On 6/1/06, Robert Milkowski <[EMAIL PROTECTED]> wrote:
> > Hello Anton,
> >
> > Thursday, June 1, 2006, 5:27:24 PM, you wrote:
> >
> > ABR> What about small random writes? Won't those also require reading
> > ABR> from all disks in RAID-Z to read the
You propose ((2-way mirrored) x RAID-Z (3+1)). That gives
you 3 data disks' worth of capacity, and you'd have to lose
both disks in each of 2 mirrors (4 total) to lose data.
For the random read load you describe, I would expect the
per-device cache to work nicely; that is, file blocks stored
at some given
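Spelling out the capacity/redundancy arithmetic behind that (just
restating the proposal, assuming 4 RAID-Z members that are each a
2-way mirror):

    8 disks total   = 4 RAID-Z members x 2-way mirror
    usable space    = (4 - 1 parity) members = 3 disks' worth
    data loss needs = 2 members gone = both halves of 2 mirrors = 4 disks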
Robert Milkowski writes:
>
>
>
> btw: just a quick thought - why not write one block to only 2 disks
> (+ checksum on one disk) instead of spreading one fs block to N-1
> disks? That way zfs could read many fs blocks at the same time in case
> of larger raid-z pools?
That's what y
> I think ZFS should do fine in streaming mode also, though there are
> currently some shortcomings, such as the mentioned 128K I/O size.
It may eventually. The lack of direct I/O may also be an issue, since
some of our systems don't have enough main memory bandwidth to support
data be
Anton wrote:
(For what it's worth, the current 128K-per-I/O policy of ZFS really
hurts its performance for large writes. I imagine this would not be
too difficult to fix if we allowed multiple 128K blocks to be
allocated as a group.)
I'm not taking a stance on this, but if I keep a co
Hi Grant, this may provide some guidance for your setup;
it's somewhat theoretical (take it for what it's worth) but
it spells out some of the tradeoffs in the RAID-Z vs Mirror
battle:
http://blogs.sun.com/roller/page/roch?entry=when_to_and_not_to
As for serving NFS, the user e
Chris Csanady writes:
> On 5/26/06, Bart Smaalders <[EMAIL PROTECTED]> wrote:
> >
> > There are two failure modes associated with disk write caches:
>
> Failure modes aside, is there any benefit to a write cache when command
> queueing is available? It seems that the primary advantage is i
Scott Dickson writes:
> How does (or does) ZFS maintain sequentiality of the blocks of a file?
> If I mkfile on a clean UFS, I likely will get contiguous blocks for my
> file, right? A customer I talked to recently has a desire to access
you would get up to maxcontig worth of sequential b
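(To see or raise maxcontig on an existing UFS you can do something like
the following; the device is a placeholder, and the value is in
filesystem blocks, so 128 x 8K = 1 MB of clustering.)

# mkfs -F ufs -m /dev/rdsk/c0t0d0s6
    (prints the parameters the filesystem was built with, maxcontig included)
# tunefs -a 128 /dev/rdsk/c0t0d0s6
    (raises maxcontig to 128 blocks)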
Cool, I'll try the tool, and for good measure the data I
posted was sequential access (from a logical point of view).
As for the physical layout, I don't know; it's quite
possible that ZFS has laid out all blocks sequentially on
the physical side, so certainly this is not a good way
Gregory Shaw writes:
> Rich, correct me if I'm wrong, but here's the scenario I was thinking
> of:
>
> - A large file is created.
> - Over time, the file grows and shrinks.
>
> The anticipated layout on disk due to this is that extents are
> allocated as the file changes. The extent
Robert Says:
Just to be sure - you did reconfigure the system to actually allow larger
IO sizes?
Sure enough, I messed up (I had no tuning to get the above data), so
1 MB was my max transfer size. Using 8 MB I now see:
Bytes Sent    Elapse of phys IO    Size
8 MB; 357
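(For the record, the tuning that allows 8 MB physical I/Os is along
these lines; which driver variable matters depends on your HBA/driver,
so treat it as a sketch. Takes a reboot.)

* /etc/system fragment
set maxphys=0x800000
set ssd:ssd_max_xfer_size=0x800000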
Robert Milkowski writes:
> Hello Roch,
>
> Monday, May 15, 2006, 3:23:14 PM, you wrote:
>
> RBPE> The question put forth is whether the ZFS 128K blocksize is sufficient
> RBPE> to saturate a regular disk. There is great body of evidence that shows
> RBPE> that the bigger the write sizes a
Anton B. Rang writes:
> One issue is what we mean by "saturation." It's easy to
> bring a disk to 100% busy. We need to keep this discussion
> in the context of a workload. Generally when people care
> about streaming throughput of a disk, it's because they are
> reading or writing a single large file
Gregory Shaw writes:
> I really like the below idea:
> - the ability to defragment a file 'live'.
>
> I can see instances where that could be very useful. For instance,
> if you have multiple LUNs (or spindles, whatever) using ZFS, you
> could re-optimize large files to spre
The question put forth is whether the ZFS 128K blocksize is sufficient
to saturate a regular disk. There is a great body of evidence showing
that bigger write sizes and a matching large FS cluster size lead
to more throughput. The counterpoint is that ZFS schedules its I/O
like nothing
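A back-of-the-envelope sketch, with made-up but plausible numbers for a
current drive:

    media rate ~60 MB/s  ->  a 128K transfer spends ~2 ms on the platters
    one 128K I/O at a time, with ~3 ms of command/seek overhead between
    them                 ->  128K / 5 ms  ~  25 MB/s
    several 128K I/Os queued to contiguous blocks -> no idle gap between
    transfers, so throughput approaches the media rate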
Nicolas Williams writes:
> On Fri, May 12, 2006 at 05:23:53PM +0200, Roch Bourbonnais - Performance
> Engineering wrote:
> > For read it is an interesting concept. Since
> >
> >Reading into cache
> >Then copy into user space
> >th
Robert Milkowski writes:
> Hello Roch,
>
> Friday, May 12, 2006, 2:28:59 PM, you wrote:
>
> RBPE> Hi Robert,
>
> RBPE> Could you try 35 concurrent dd each issuing 128K I/O ?
> RBPE> That would be closer to how ZFS would behave.
>
> You mean to UFS?
>
> ok, I did try and I get about
Anton B. Rang writes:
> >Were the benefits coming from extra concurrency (no
> >single writer lock) or avoiding the extra copy to page cache or
> >from too much readahead that is not used before pages need to
> >be recycled.
>
> With QFS, a major benefit we see for databases and direct I/
Franz Haberhauer writes:
> > 'ZFS optimizes random writes versus potential sequential reads.'
>
> This remark focused on the allocation policy during writes,
> not the readahead that occurs during reads.
> Data that are rewritten randomly but in place in a sequential,
> contiguous file (l
Peter Rival writes:
> Roch Bourbonnais - Performance Engineering wrote:
> > Tao Chen writes:
> > > On 5/12/06, Roch Bourbonnais - Performance Engineering
> > > <[EMAIL PROTECTED]> wrote:
> > > >
> > > > From: Gregory Shaw <[
You could start with the ARC paper, Megiddo/Modha FAST'03
conference. ZFS uses a variation of that. It's an interesting
read.
-r
Franz Haberhauer writes:
> Gregory Shaw wrote On 05/11/06 21:15,:
> > Regarding directio and quickio, is there a way with ZFS to skip the
> > system buffer cache?
'ZFS optimizes random writes versus potential sequential reads.'
Now I don't think the current readahead code is where we
want it to be yet, but in the same way that enough
concurrent 128K I/Os can saturate a disk (I sure hope that
Milkowski's data will confirm this, ot
Hi Robert,
Could you try 35 concurrent dd each issuing 128K I/O ?
That would be closer to how ZFS would behave.
-r
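(Something along these lines is what I had in mind; /test is a
placeholder mount point and the file sizes are arbitrary.)

i=1
while [ $i -le 35 ]; do
        dd if=/dev/zero of=/test/f$i bs=128k count=4096 &
        i=`expr $i + 1`
done
wait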
Robert Milkowski writes:
> Well I have just tested UFS on the same disk.
>
> bash-3.00# newfs -v /dev/rdsk/c5t50E0119495A0d0s0
> newfs: construct a new file system /dev/rd
Tao Chen writes:
> On 5/12/06, Roch Bourbonnais - Performance Engineering
> <[EMAIL PROTECTED]> wrote:
> >
> > From: Gregory Shaw <[EMAIL PROTECTED]>
> > Regarding directio and quickio, is there a way with ZFS to skip the
> > system buffe
Tao Chen writes:
> On 5/11/06, Peter Rival <[EMAIL PROTECTED]> wrote:
> > Richard Elling wrote:
> > > Oracle will zero-fill the tablespace with 128kByte iops -- it is not
> > > sparse. I've got a scar. Has this changed in the past few years?
> >
> > Multiple parallel tablespace creates is
Jeff Bonwick writes:
> > Are you saying that copy-on-write doesn't apply for mmap changes, but
> > only file re-writes? I don't think that gels with anything else I
> > know about ZFS.
>
> No, you're correct -- everything is copy-on-write.
>
Maybe the confusion comes from:
mma
From: Gregory Shaw <[EMAIL PROTECTED]>
Sender: [EMAIL PROTECTED]
To: Mike Gerdts <[EMAIL PROTECTED]>
Cc: ZFS filesystem discussion list ,
[EMAIL PROTECTED]
Subject: Re: [zfs-discuss] ZFS and databases
Date: Thu, 11 May 2006 13:15:48 -0600
Regarding directio and quickio, is there
the required memory pressure on ZFS.
Sounds like another bug we'd need to track;
-r
Daniel Rock writes:
> Roch Bourbonnais - Performance Engineering schrieb:
> > A already noted, this needs not be different from other FS
> > but is still an interesting question. I
Certainly something we'll have to tackle. How about a zpool
memstat (or zpool -m iostat) variation that would report at
least freemem and the amount of evictable cached data?
Would that work for you?
-r
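(In the meantime, this is roughly how I eyeball it; it assumes mdb's
::memstat dcmd and the ZFS arcstats kstat are available on your bits.)

# echo ::memstat | mdb -k
    (kernel / anon / cachelist / free page breakdown)
# kstat -p zfs:0:arcstats:size zfs:0:arcstats:c
    (current ARC size and its target, in bytes)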
Philip Beevers writes:
> Roch Bourbonnais - Performance Engineeri
user     1.033
sys     11.405
-r
>
> On 5/11/06, Roch Bourbonnais - Performance Engineering
> <[EMAIL PROTECTED]> wrote:
> >
> >
> > # ptime tar xf linux-2.2.22.tar
> > ptime tar xf linux-2.2.22.tar
> >
> > real 50.292
> >
- Description of why I don't need directio, quickio, or ODM.
The 2 main benefits that came out of using directio were
reducing memory consumption by avoiding the page cache AND
bypassing the UFS single-writer behavior.
ZFS does not have the single-writer lock.
As for memory, the UFS code
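(For reference, the UFS directio being compared against is what you get
with something like the following; the mount point and device are
placeholders, and directio(3C) with DIRECTIO_ON is the per-file
programmatic equivalent.)

# mount -F ufs -o forcedirectio /dev/dsk/c0t0d0s6 /db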
# ptime tar xf linux-2.2.22.tar
ptime tar xf linux-2.2.22.tar

real       50.292
user        1.019
sys        11.417
# ptime tar xf linux-2.2.22.tar
ptime tar xf linux-2.2.22.tar

real       56.833
user        1.056
sys        11.581
#
avg time waiting for async writes is
Gehr, Chuck R writes:
> One word of caution about random writes. From my experience, they are
> not nearly as fast as sequential writes (like 10 to 20 times slower)
> unless they are carefully aligned on the same boundary as the file
> system record size. Otherwise, there is a heavy read pena
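(The usual way to get that alignment with ZFS is to match the dataset
recordsize to the application's write size before the files are created;
tank/db and 8K are just an example for a database doing 8K writes.)

# zfs set recordsize=8k tank/db
# zfs get recordsize tank/db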