On Jan 4, 2007, at 10:26 AM, Roch - PAE wrote:
All filesystems will incur a read-modify-write when an
application updates a portion of a block.
For most Solaris file systems it is the page size, rather than
the block size, that affects read-modify-write; hence 8K (SPARC)
or 4K (x86).
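As an illustration of the point (mine, not from the thread), here is a minimal C sketch of the arithmetic: any update that does not cover whole pages forces the filesystem to read the containing page(s) back in, merge the new bytes, and rewrite them. The 8K page size is the SPARC assumption from above; the offsets are hypothetical.

    /* Hedged sketch: read-modify-write cost of a partial update.
     * The 8192-byte page size matches SPARC; x86 would use 4096.
     * Offsets and lengths are hypothetical. */
    #include <stdio.h>

    int main(void)
    {
        const long pagesize  = 8192;    /* assumed: SPARC page size      */
        const long write_off = 12345;   /* hypothetical update offset    */
        const long write_len = 100;     /* hypothetical update length    */

        long first_page = write_off / pagesize;
        long last_page  = (write_off + write_len - 1) / pagesize;

        /* Bytes the filesystem must read back before it can merge the
         * new data and rewrite, since no page is covered completely.   */
        long rmw_bytes = (last_page - first_page + 1) * pagesize;

        printf("a %ld-byte update forces %ld bytes of read-modify-write\n",
               write_len, rmw_bytes);
        return 0;
    }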
On Jan 4, 2007, at 3:25 AM, [EMAIL PROTECTED] wrote:
Is there some reason why a small read on a raidz2 is not
statistically very likely to require I/O on only one device?
Assuming a non-degraded pool, of course.
ZFS stores its checksums for RAIDZ/RAIDZ2 in such a way that all
disks must be read to verify a block …
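A rough worked example (my numbers, not from the post): the checksum in the block pointer covers the whole logical block, and RAIDZ/RAIDZ2 stripes that block across the data columns, so even a tiny read has to reassemble the full block from every data disk. The record size and vdev width below are assumptions.

    /* Hedged sketch: why a small read from a RAIDZ2 vdev touches every
     * data disk.  recordsize and disk count are illustrative only.     */
    #include <stdio.h>

    int main(void)
    {
        const int recordsize = 128 * 1024;  /* assumed ZFS recordsize      */
        const int disks      = 7;           /* assumed 7-disk RAIDZ2 vdev  */
        const int data_disks = disks - 2;   /* two columns hold parity     */

        /* The 128K block is split across the data columns, so each disk
         * holds roughly this many bytes of it.                           */
        int per_disk = recordsize / data_disks;

        printf("a 4K application read still pulls ~%d bytes from each of "
               "%d data disks so the block checksum can be verified\n",
               per_disk, data_disks);
        return 0;
    }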
On Dec 19, 2006, at 7:14 AM, Mike Seda wrote:
Anton B. Rang wrote:
I have a Sun SE 3511 array with 5 x 500 GB SATA-I disks in a RAID 5.
This 2 TB logical drive is partitioned into 10 x 200 GB slices. I
gave 4 of these slices to a Solaris 10 U2 machine and added each of
them to a concat (non- …
On Oct 17, 2006, at 12:43 PM, Matthew Ahrens wrote:
Jeremy Teo wrote:
Heya Anton,
On 10/17/06, Anton B. Rang <[EMAIL PROTECTED]> wrote:
No, the reason to try to match recordsize to the write size is so
that a small write does not turn into a large read + a large write.
In configurations where …
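As a hedged sketch of what matching looks like from the application side: if the writer only ever issues full, record-aligned writes, ZFS can allocate and write a fresh block without reading the old one first. The 8K record size and the path are assumptions, not anything from the thread.

    /* Hedged sketch: full, record-aligned writes avoid the large-read +
     * large-write penalty described above.  Path and sizes are made up. */
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const size_t recsz = 8192;          /* assumed recordsize=8K       */
        char *buf = malloc(recsz);
        if (buf == NULL)
            return 1;
        memset(buf, 'x', recsz);

        int fd = open("/tank/db/datafile", O_WRONLY | O_CREAT, 0644);
        if (fd < 0)
            return 1;

        /* Record-aligned offset, full-record length: no prior read of
         * the old block contents is required to construct the new one. */
        off_t recno = 42;                   /* hypothetical record number */
        pwrite(fd, buf, recsz, recno * (off_t)recsz);

        close(fd);
        free(buf);
        return 0;
    }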
On Sep 9, 2006, at 1:32 AM, Frank Cusack wrote:
On September 7, 2006 12:25:47 PM -0700 "Anton B. Rang"
<[EMAIL PROTECTED]> wrote:
The bigger problem with system utilization for software RAID is the
cache, not the CPU cycles proper. Simply preparing to write 1 MB of
data will flush half of a …
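The cache effect here (my gloss, hedged, since the snippet is cut off) comes from software RAID having to pass every byte of the write through the CPU to compute parity, which evicts whatever was previously cached. A minimal sketch of that pass, with an assumed 4+1 stripe:

    /* Hedged sketch: generating RAID-5-style parity for a 1 MB write drags
     * all 1 MB of source data through the CPU caches.  Geometry is assumed. */
    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>

    static void gen_parity(unsigned char **data, int columns,
                           unsigned char *parity, size_t len)
    {
        memset(parity, 0, len);
        for (int c = 0; c < columns; c++)
            for (size_t i = 0; i < len; i++)
                parity[i] ^= data[c][i];    /* touches every source byte */
    }

    int main(void)
    {
        enum { COLS = 4, LEN = 256 * 1024 };   /* 4 x 256K = 1 MB of data */
        unsigned char *data[COLS];
        unsigned char *parity = malloc(LEN);
        if (parity == NULL)
            return 1;
        for (int c = 0; c < COLS; c++)
            if ((data[c] = calloc(1, LEN)) == NULL)
                return 1;

        /* Reading COLS * LEN = 1 MB here displaces roughly that much of
         * whatever the CPU caches previously held.                       */
        gen_parity(data, COLS, parity, LEN);
        return 0;
    }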
On Aug 11, 2006, at 12:38 PM, Jonathan Adams wrote:
The problem is that you don't know the actual *contents* of the
parent block until *all* of its children have been written to their
final locations. (This is because the block pointer's value depends
on the final location.)
But I know whe…
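A simplified, hedged illustration of that dependency (not the real ZFS blkptr_t layout; the field names are invented): the parent's bytes embed each child's final address and checksum, so the parent, and therefore its own checksum, cannot be finalized until every child has been placed.

    /* Hedged sketch: a parent block's contents depend on its children's
     * final locations and checksums.  NOT the real blkptr_t definition. */
    #include <stdint.h>

    struct toy_blkptr {
        uint64_t child_offset;    /* final on-disk location of the child  */
        uint64_t child_size;
        uint64_t child_checksum;  /* checksum of the child's final bytes  */
    };

    struct toy_indirect_block {
        struct toy_blkptr children[128];   /* the parent's contents       */
    };
    /* The parent's own checksum is computed over children[], so it can
     * only be written once every child_offset/child_checksum is known.  */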
On Aug 9, 2006, at 8:18 AM, Roch wrote:
So while I'm feeling optimistic :-) we really ought to be able to do
this in two I/O operations. If we have, say, 500K of data to write
(including all of the metadata), we should be able to allocate a
contiguous 500K block on disk and …
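A hedged sketch of what such a two-write commit might look like in outline (my reading of the suggestion, with invented names, and fsync() standing in for whatever ordering barrier a real implementation would use): one contiguous write for the batched data and metadata, then one small write to publish it.

    /* Hedged sketch: a two-write commit -- one contiguous write for the
     * batched blocks, one to publish the new root.  Names are invented. */
    #include <unistd.h>

    static int commit(int fd,
                      const void *batch, size_t batch_len, off_t batch_off,
                      const void *root,  size_t root_len,  off_t root_off)
    {
        /* 1. Write the ~500K of data + metadata contiguously.           */
        if (pwrite(fd, batch, batch_len, batch_off) != (ssize_t)batch_len)
            return -1;
        if (fsync(fd) != 0)              /* ordering: batch before root  */
            return -1;

        /* 2. Publish it by rewriting the small root/uberblock record.   */
        if (pwrite(fd, root, root_len, root_off) != (ssize_t)root_len)
            return -1;
        return fsync(fd);
    }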
On May 31, 2006, at 10:21 AM, Bill Sommerfeld wrote:
Hunh. Gigabit ethernet devices typically implement some form of
interrupt blanking or coalescing so that the host CPU can batch I/O
completion handling. That doesn't exist in FC controllers?
Not in quite the same way, AFAIK. Usually there …
On May 31, 2006, at 8:56 AM, Roch Bourbonnais - Performance
Engineering wrote:
I'm not taking a stance on this, but if I keep a controller full of
128K I/Os, and assuming they are targeting contiguous physical
blocks, how different is that from issuing a very large I/O?
There are differences …
… a lot of disks on FC probably isn't too bad, though on parallel
SCSI the negotiation overhead and lack of fairness was awful, but I
haven't tested this.)
On Tue, 2006-05-30 at 11:43 -0500, Anton Rang wrote:
Sure, the block size may be 128KB, but ZFS can bundle more than one
per-file/transaction …
On May 30, 2006, at 12:23 PM, Nicolas Williams wrote:
Another way is to have lots of pre-allocated next überblock
locations, so that seek-to-one-überblock times are always small.
Each überblock can point to its predecessor and its copies and list
the pre-allocated possible locations of its successor …
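A hedged struct sketch of the proposal (illustrative only; this is not how ZFS actually lays out its überblock ring): each überblock names its predecessor and carries a short list of pre-reserved candidate slots where its successor may be written, so the follow-up seek stays short.

    /* Hedged sketch of the idea above: an uberblock that records its
     * predecessor and pre-reserves candidate slots for its successor.
     * Purely illustrative; not the actual ZFS uberblock format.         */
    #include <stdint.h>

    #define NEXT_SLOTS 4    /* assumed number of pre-allocated locations */

    struct toy_uberblock {
        uint64_t txg;                          /* transaction group       */
        uint64_t prev_offset;                  /* where the predecessor   */
                                               /* (and its copies) live   */
        uint64_t next_candidates[NEXT_SLOTS];  /* pre-allocated places    */
                                               /* the successor may land  */
        uint64_t checksum;
    };
    /* On the next commit, the successor is written into whichever
     * candidate slot sits closest to the data just written.             */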
On May 30, 2006, at 11:25 AM, Nicolas Williams wrote:
On Tue, May 30, 2006 at 08:13:56AM -0700, Anton B. Rang wrote:
Well, I don't know about his particular case, but many QFS clients
have found the separation of data and metadata to be invaluable. The
primary reason is that it avoids disk seeks …
On May 30, 2006, at 10:36 AM, [EMAIL PROTECTED] wrote:
That does not answer the question I asked; since ZFS is a
copy-on-write filesystem, there's no fixed inode location and
streaming writes should always be possible.
The überblock still must be updated, however. This may not be an
issue …
OK, so let's consider your 2MB read. You have the option of putting
it in one contiguous place on the disk or splitting it into 16 x
128K chunks, somewhat spread all over.
Now you issue a read to that 2MB of data.
As you noted, you either have to wait for the head to find the 2MB
block and stream it …
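To put rough numbers on that trade-off, a back-of-the-envelope comparison with assumed (not measured) disk characteristics of ~8 ms per random seek and ~60 MB/s streaming transfer:

    /* Hedged back-of-the-envelope: one contiguous 2MB read vs 16 x 128K
     * chunks scattered across the disk.  Seek time and transfer rate are
     * assumptions, not measurements.                                     */
    #include <stdio.h>

    int main(void)
    {
        const double seek_ms  = 8.0;    /* assumed average seek + rotation */
        const double mb_per_s = 60.0;   /* assumed streaming bandwidth     */
        const double total_mb = 2.0;
        const int    chunks   = 16;

        double xfer_ms   = total_mb / mb_per_s * 1000.0;
        double contig    = seek_ms + xfer_ms;            /* 1 seek, stream  */
        double scattered = chunks * seek_ms + xfer_ms;   /* seek per chunk  */

        printf("contiguous: ~%.0f ms   scattered: ~%.0f ms\n",
               contig, scattered);
        return 0;
    }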
On May 12, 2006, at 11:59 AM, Richard Elling wrote:
CPU cycles and memory bandwidth (which both can be in short supply
on a database server).
We can throw hardware at that :-) Imagine a machine with lots of
extra CPU cycles [ ... ]
Yes, I've heard this story before, and I won't believe it …
We might want an interface for the app to know what the natural
block size of the file is, so it can read at proper file offsets.
Seems that stat(2) could be used for this ...
long st_blksize; /* Preferred I/O block size */
This isn't particularly useful for databases if they already …
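For reference, a short, hedged example of the interface mentioned: stat(2) already reports st_blksize, and an application could round its I/O to it. Whether ZFS surfaces the dataset recordsize there is exactly the open question; the path below is hypothetical.

    /* Hedged sketch: query the preferred I/O size via stat(2) and use it
     * to align reads.  The path is hypothetical.                        */
    #include <stdio.h>
    #include <sys/stat.h>

    int main(void)
    {
        struct stat sb;
        if (stat("/tank/db/datafile", &sb) != 0) {
            perror("stat");
            return 1;
        }
        long blksz = (long)sb.st_blksize;   /* preferred I/O block size  */
        printf("issue reads in multiples of %ld bytes, at offsets that\n"
               "are multiples of %ld, to stay on natural block boundaries\n",
               blksz, blksz);
        return 0;
    }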
Now, could we detect the pattern that makes holding on to the cached
block suboptimal, and do a quick freebehind after the copyout?
Something like random access + very large file + poor cache hit
ratio?
We might detect it ... or we could let the application give us the
hint, via the directio ioctl.
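On Solaris the application-side hint being referred to is the directio(3C) advisory call (historically honored by UFS; whether ZFS should act on it is what is being discussed). A minimal usage sketch with a hypothetical path:

    /* Hedged sketch: the application advises the filesystem not to cache
     * its data via directio(3C).  UFS honors this advice; whether ZFS
     * would is the question in the thread.  The path is hypothetical.   */
    #include <sys/types.h>
    #include <sys/fcntl.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tank/db/datafile", O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        if (directio(fd, DIRECTIO_ON) != 0)   /* hint: don't cache me    */
            perror("directio");

        /* ... random reads over a very large file would follow here ... */

        close(fd);
        return 0;
    }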