> For example, if I gave you the right to snapshot ~darrenm, I might
> want to only allow you 10 snapshots. Is that a worthwhile restriction
> or is it better to just let quotas take care of that?
>
> At issue here is the potential for (again :) zfs to spam df output
> through potentially acciden
> You are almost certainly running in to this known bug:
>
> 630 reads from mirror are not spread evenly
Right. FYI, we fixed this in build 38.
Jeff
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailma
> Uhm... that's the point where you are IMO slightly wrong. The exact
> requirement is that inodes and data need to be seperated.
I find that difficult to believe.
What you need is performance. Based on your experiences with
completely different, static-metadata architectures, you've
concluded (
> > http://blogs.sun.com/roller/page/roch?entry=when_to_and_not_to
>
> thanks, that is very useful information. it pretty much rules out raid-z
> for this workload with any reasonable configuration I can dream up
> with only 12 disks available. it looks like mirroring is going to
> provide hig
> That helps a lot - thank you.
> I wish I knew it before... Information Roch put on his blog should be
> explained both in MAN pages and ZFS Admin Guide - as this is something
> one would not expect.
>
> It actually means raid-z is useless in many enviroments compare to
> traditional raid-5.
Wel
> btw: I'm really suprised how SATA disks are unreliable. I put dozen
> TBs of data on ZFS last time and just after few days I got few hundreds
> checksum error (there raid-z was used). And these disks are 500GB in
> 3511 array. Well that would explain some fsck's, etc. we saw before.
I suspect yo
> RL> why is sum of disks bandwidth from `zpool iostat -v 1`
> RL> less than the pool total while watching `du /zfs`
> RL> on opensol-20060605 bits?
>
> Due to raid-z implementation. See last discussion on raid-z
> performance, etc.
It's an artifact of the way raidz and the vdev read cache intera
> I assume ZFS only writes something when there is actually data?
Right.
Jeff
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> a test against the same iscsi targets using linux and XFS and the
> NFS server implementation there gave me 1.25MB/sec writes. I was about
> to throw in the towel and deem ZFS/NFS has unusable until B41 came
> along and at least gave me 1.25MB/sec.
That's still super slow -- is this over a 10Mb
> Which is better -
> zfs raidz on hardware mirrors, or zfs mirror on hardware raid-5?
The latter. With a mirror of RAID-5 arrays, you get:
(1) Self-healing data.
(2) Tolerance of whole-array failure.
(3) Tolerance of *at least* three disk failures.
(4) More IOPs than raidz of hardware mirror
> So the question becomes, what are the tradeoffs between running a two-way
> mirror vs. running RAID-Z on two disks?
A two-way mirror would be better -- no parity generation, and you have
the ability to attach/detach for more or less replication. (We could
optimize the RAID-Z code for the two-di
> Having this feature seems like a no-brainer to me. Who cares if SVM/
> UFS/whatever didn't have it. ZFS is different from those. This is
> another area where ZFS could thumb its nose at those relative
> dinosaurs, feature-wise, and I argue that this is an important
> feature to have.
Yep,
> Since transactions in ZFS are committed until the ueberblock is written,
> this boils down to:
>
> "How is the ueberblock committed atomically in a RAID-Z configuration?"
RAID-Z isn't even necessary to have this issue; all you need is a disk
that doesn't actually guarantee atomicity of sing
> Thanks for providing this last bit of my mental ZFS picture.
>
> Does ZFS keep statistics on how many ueberblocks are bad when
> it imports a pool?
No. We could, of course, but I'm not sure how it would be useful.
> Or is it the case that when fewer than 128
> ueberblocks have ever been commi
> Maybe this is a dumb question, but I've never written a
> filesystem is there a fundamental reason why you cannot have
> some files mirrored, with others as raidz, and others with no
> resilience? This would allow a pool to initially exist on one
> disk, then gracefully change between different r
> Can anyone tell me why pool created with zpool are also zfs file systems
> (and mounted) which can be used for storing files? It would have been
> more transparent if pool would not allow the storage of files.
Grab a cup of coffee and get comfortable. Ready?
Oh, what a can of worms this was.
> I have a 10 disk raidz pool running Solaris 10 U2, and after a reboot
> the whole pool became unavailable after apparently loosing a diskdrive.
> [...]
> NAMESTATE READ WRITE CKSUM
> dataUNAVAIL 0 0 0 insufficient replicas
> c1t0d0ON
> >PERMISSION GRANTING
> >
> > zfs allow [-l] [-d] <"everyone"|user|group> [,...] \
> >...
> > zfs unallow [-r] [-l] [-d]
> >
>
> If we're going to use English words, it should be "allow" and "disallow".
The problem with 'disallow' is that it implies precluding a behavior
that would no
> Of course the re-writing must be 100% safe, but that can be done with COW
> quite easily.
Almost. The hard part is snapshots. If you modify a data block,
you must also modify every L1 indirect block in every snapshot
that points to it, and every L2 above each L1, all the way up
to the uberbloc
> For 6 disks, 3x2-way RAID-1+0 offers better resiliency than RAID-Z
> or RAID-Z2.
Maybe I'm missing something, but it ought to be the other way around.
With 6 disks, RAID-Z2 can tolerate any two disk failures, whereas
for 3x2-way mirroring, of the (6 choose 2) = 6*5/2 = 15 possible
two-disk failu
> However, we do have the advantage of always knowing when something
> is corrupted, and knowing what that particular block should have been.
We also have ditto blocks for all metadata, so that even if any block
of ZFS metadata is destroyed, we always have another copy.
Bill Moore describes ditto
> For a synchronous write to a pool with mirrored disks, does the write
> unblock after just one of the disks' write caches is flushed,
> or only after all of the disks' caches are flushed?
The latter. We don't consider a write to be committed until
the data is on stable storage at full replicati
> I've a non-mirrored zfs file systems which shows the status below. I saw
> the thread in the archives about working this out but it looks like ZFS
> messages have changed. How do I find out what file(s) this is?
> [...]
> errors: The following persistent errors have been detected:
>
>
> > bonus questions: any idea when hot spares will make it to S10?
>
> good question :)
It'll be in U3, and probably available as patches for U2 as well.
The reason for U2 patches is Thumper (x4500), because we want ZFS
on Thumper to have hot spares and double-parity RAID-Z from day one.
Jeff
_
> I have a SAS array with a zfs pool on it. zfs automatically searches for
> and mounts the zfs pool I've created there. I want to attach another
> host to this array, but it doesn't have any provision for zones or the
> like. (Like you would find in an FC array or in the switch infrastructure.)
> ZFS will try to enable write cache if whole disks is given.
>
> Additionally keep in mind that outer region of a disk is much faster.
And it's portable. If you use whole disks, you can export the
pool from one machine and import it on another. There's no way
to export just one slice and leave
> is zfs any less efficient with just using a portion of a
> disk versus the entire disk?
As others mentioned, if we're given a whole disk (i.e. no slice
is specified) then we can safely enable the write cache.
One other effect -- probably not huge -- is that the block placement
algorithm is mos
> With all of the talk about performance problems due to
> ZFS doing a sync to force the drives to commit to data
> being on disk, how much of a benefit is this - especially
> for NFS?
It depends. For some drives it's literally 10x.
> Also, if I was lucky enough to have a working prestoserv
> ca
> - When the filesystems have compress=ON I see the following: reads from
compressed filesystems come in waves; zpool will report for long durations (60+
seconds) no read activity while the write activity is consistently reported at
20MB/S (no variation in the write rate throughtout the test).
> And it started replacement/resilvering... after few minutes system became
unavailbale. Reboot only gives me a few minutes, then resilvering make system
unresponsible.
>
> Is there any workaroud or patch for this problem???
Argh, sorry -- the problem is that we don't do aggressive enough
scrub
> It looks like now the scrub has completed. Should I now clear these warnings?
Yep. You survived the Unfortunate Event unscathed. You're golden.
Jeff
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listi
> Nice, this is definitely pointing the finger more definitively. Next
> time could you try:
>
> dtrace -n '[EMAIL PROTECTED](20)] = count()}' -c 'sleep 5'
>
> (just send the last 10 or so stack traces)
>
> In the mean time I'll talk with our SPA experts and see if I can figure
> out how to f
> Where can I find information on the file allocation methodology used by ZFS?
You've inspired me to blog again:
http://blogs.sun.com/bonwick/entry/zfs_block_allocation
I'll describe the way we manage free space in the next post.
Jeff
___
zfs
> What I meant is that events that "cause a permanent change..." should
> not be deleted from the circular log if there are "old" (older?)
> "operationally interesting" events that could be deleted instead.
>
> I.e., if the log can keep only so much info then I'd rather have the
> history of a poo
> I just got an Ultra 20 with the default 80GB internal disk. Right now,
> I'm using around 30GB for zfs. I will be getting a new 250GB drive.
>
> Question: If I create a 30GB slice on the 250GB drive, will that be okay
> to use as mirror (or raidz) of the current 30GB that I now have on the
> Why not use a terse XML format?
I suppose we could, but I'm not convinced that XML is stable enough
to be part of a 30-year on-disk format. 15 years ago PostScript
was going to be stable forever, but today many PostScript readers
barf on Adobe-PS-1.0 files -- which were supposed to be the most
> Why not use the Solaris audit facility?
Several reasons:
(1) We want the history to follow the data, not the host. If you
export the pool from one host and import it on another, we want
the command history to move with the pool. That won't happen
if the history file is somewhere i
> bash-3.00# dd if=/dev/urandom of=/dev/dsk/c1t10d0 bs=1024 count=20480
A couple of things:
(1) When you write to /dev/dsk, rather than /dev/rdsk, the results
are cached in memory. So the on-disk state may have been unaltered.
(2) When you write to /dev/rdsk/c-t-d, without specifying a slic
101 - 138 of 138 matches
Mail list logo