Re: [zfs-discuss] RFE filesystem ownership

2006-05-24 Thread Jeff Bonwick
> For example, if I gave you the right to snapshot ~darrenm, I might > want to only allow you 10 snapshots. Is that a worthwhile restriction > or is it better to just let quotas take care of that? > > At issue here is the potential for (again :) zfs to spam df output > through potentially acciden

Re: [zfs-discuss] ZFS mirror and read policy; kstat I/O values for zfs

2006-05-26 Thread Jeff Bonwick
> You are almost certainly running into this known bug: > > 630 reads from mirror are not spread evenly Right. FYI, we fixed this in build 38. Jeff

Re: [zfs-discuss] Re: [osol-discuss] Re: I wish Sun would open-source "QFS"... /was:Re: Re: Distributed File System for Solaris

2006-06-01 Thread Jeff Bonwick
> Uhm... that's the point where you are IMO slightly wrong. The exact > requirement is that inodes and data need to be separated. I find that difficult to believe. What you need is performance. Based on your experiences with completely different, static-metadata architectures, you've concluded (

Re: [zfs-discuss] 3510 configuration for ZFS

2006-06-01 Thread Jeff Bonwick
> > http://blogs.sun.com/roller/page/roch?entry=when_to_and_not_to > > thanks, that is very useful information. it pretty much rules out raid-z > for this workload with any reasonable configuration I can dream up > with only 12 disks available. it looks like mirroring is going to > provide hig

Re: Re[2]: [zfs-discuss] Re: Big IOs overhead due to ZFS?

2006-06-01 Thread Jeff Bonwick
> That helps a lot - thank you. > I wish I'd known it before... The information Roch put on his blog should be > explained in both the man pages and the ZFS Admin Guide, as this is something > one would not expect. > > It actually means raid-z is useless in many environments compared to > traditional raid-5. Wel

Re: [zfs-discuss] zpool status and CKSUM errors

2006-06-09 Thread Jeff Bonwick
> btw: I'm really surprised at how unreliable SATA disks are. I put a dozen > TBs of data on ZFS recently, and after just a few days I got a few hundred > checksum errors (raid-z was used there). And these disks are 500GB in a > 3511 array. Well that would explain some fsck's, etc. we saw before. I suspect yo

Re: [zfs-discuss] opensol-20060605 # zpool iostat -v 1

2006-06-09 Thread Jeff Bonwick
> RL> why is the sum of disk bandwidth from `zpool iostat -v 1` > RL> less than the pool total while watching `du /zfs` > RL> on opensol-20060605 bits? > > Due to the raid-z implementation. See the last discussion on raid-z > performance, etc. It's an artifact of the way raidz and the vdev read cache intera

Re: [zfs-discuss] ZFS questions

2006-06-20 Thread Jeff Bonwick
> I assume ZFS only writes something when there is actually data? Right. Jeff
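
A minimal sketch of this behavior, assuming a dataset mounted at /tank/demo (the names here are made up): a sparse file consumes space only for the blocks actually written.

    # Create a 1 GB file without writing its blocks (-n), then compare the
    # logical size against the space ZFS actually allocated.
    mkfile -n 1g /tank/demo/sparse
    ls -l /tank/demo/sparse    # shows the full 1 GB logical size
    du -h /tank/demo/sparse    # shows only the handful of blocks written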

Re: [zfs-discuss] 15 minute fdsync problem and ZFS: Solved

2006-06-22 Thread Jeff Bonwick
> a test against the same iscsi targets using linux and XFS and the > NFS server implementation there gave me 1.25MB/sec writes. I was about > to throw in the towel and deem ZFS/NFS as unusable until B41 came > along and at least gave me 1.25MB/sec. That's still super slow -- is this over a 10Mb

Re: Re[2]: [zfs-discuss] Re: ZFS and Storage

2006-06-28 Thread Jeff Bonwick
> Which is better - > zfs raidz on hardware mirrors, or zfs mirror on hardware raid-5? The latter. With a mirror of RAID-5 arrays, you get: (1) Self-healing data. (2) Tolerance of whole-array failure. (3) Tolerance of *at least* three disk failures. (4) More IOPs than raidz of hardware mirror
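
A minimal sketch of the recommended layout, assuming each hardware RAID-5 array is presented as a single LUN (the device names here are made up):

    # ZFS mirror across two hardware RAID-5 LUNs: ZFS handles checksums and
    # self-healing, the arrays handle per-disk parity underneath.
    zpool create tank mirror c2t0d0 c3t0d0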

Re: [zfs-discuss] RAID-Z on two disks vs. 2-way mirror

2006-07-07 Thread Jeff Bonwick
> So the question becomes, what are the tradeoffs between running a two-way > mirror vs. running RAID-Z on two disks? A two-way mirror would be better -- no parity generation, and you have the ability to attach/detach for more or less replication. (We could optimize the RAID-Z code for the two-di
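
The attach/detach flexibility mentioned above looks roughly like this (device names are made up):

    # Start with a single disk, grow it into a two-way mirror, and detach
    # again later if the extra replication is no longer wanted.
    zpool create tank c1t0d0
    zpool attach tank c1t0d0 c1t1d0    # kicks off a resilver to the new half
    zpool detach tank c1t1d0           # back to a single, unreplicated disk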

Re: [zfs-discuss] ZFS needs a viable backup mechanism

2006-07-08 Thread Jeff Bonwick
> Having this feature seems like a no-brainer to me. Who cares if SVM/ > UFS/whatever didn't have it. ZFS is different from those. This is > another area where ZFS could thumb its nose at those relative > dinosaurs, feature-wise, and I argue that this is an important > feature to have. Yep,

Re: [zfs-discuss] Re: Transactional RAID-Z?

2006-07-12 Thread Jeff Bonwick
> Since transactions in ZFS aren't committed until the ueberblock is written, > this boils down to: > > "How is the ueberblock committed atomically in a RAID-Z configuration?" RAID-Z isn't even necessary to have this issue; all you need is a disk that doesn't actually guarantee atomicity of sing

Re: [zfs-discuss] Re: Transactional RAID-Z?

2006-07-12 Thread Jeff Bonwick
> Thanks for providing this last bit of my mental ZFS picture. > > Does ZFS keep statistics on how many ueberblocks are bad when > it imports a pool? No. We could, of course, but I'm not sure how it would be useful. > Or is it the case that when fewer than 128 > ueberblocks have ever been commi

RE: [zfs-discuss] Expanding raidz2

2006-07-13 Thread Jeff Bonwick
> Maybe this is a dumb question, but I've never written a > filesystem: is there a fundamental reason why you cannot have > some files mirrored, with others as raidz, and others with no > resilience? This would allow a pool to initially exist on one > disk, then gracefully change between different r
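
Per-file redundancy never materialized, but a later per-dataset 'copies' property comes close to the use case described; a minimal sketch (dataset name made up):

    # Store two ditto-block copies of all user data written from this point
    # on; blocks written earlier keep their original single copy.
    zfs create tank/precious
    zfs set copies=2 tank/precious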

Re: [zfs-discuss] pools are zfs file systems?

2006-07-15 Thread Jeff Bonwick
> Can anyone tell me why pool created with zpool are also zfs file systems > (and mounted) which can be used for storing files? It would have been > more transparent if pool would not allow the storage of files. Grab a cup of coffee and get comfortable. Ready? Oh, what a can of worms this was.

Re: [zfs-discuss] zpool unavailable after reboot

2006-07-17 Thread Jeff Bonwick
> I have a 10 disk raidz pool running Solaris 10 U2, and after a reboot > the whole pool became unavailable after apparently losing a disk drive. > [...] > NAME STATE READ WRITE CKSUM > data UNAVAIL 0 0 0 insufficient replicas > c1t0d0 ON

Re: [zfs-discuss] Proposal: delegated administration

2006-07-18 Thread Jeff Bonwick
> >PERMISSION GRANTING > > > > zfs allow [-l] [-d] <"everyone"|user|group> [,...] \ > >... > > zfs unallow [-r] [-l] [-d] > > > > If we're going to use English words, it should be "allow" and "disallow". The problem with 'disallow' is that it implies precluding a behavior that would no
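
For reference, a minimal sketch of how the delegation syntax ended up being used (user and dataset names are made up; the shipped verbs are 'allow' and 'unallow'):

    zfs allow darrenm snapshot,mount tank/home/darrenm   # grant permissions
    zfs allow tank/home/darrenm                          # list delegations
    zfs unallow darrenm snapshot tank/home/darrenm       # revoke one permission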

Re: [zfs-discuss] Enabling compression/encryption on a populated filesystem

2006-07-18 Thread Jeff Bonwick
> Of course the re-writing must be 100% safe, but that can be done with COW > quite easily. Almost. The hard part is snapshots. If you modify a data block, you must also modify every L1 indirect block in every snapshot that points to it, and every L2 above each L1, all the way up to the uberbloc
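
A consequence worth spelling out: enabling compression only affects newly written blocks, so existing data stays uncompressed until something rewrites it. A minimal sketch (dataset name made up):

    zfs set compression=on tank/data
    zfs get compressratio tank/data   # drifts upward only as new data arrives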

Re: [zfs-discuss] Big JBOD: what would you do?

2006-07-18 Thread Jeff Bonwick
> For 6 disks, 3x2-way RAID-1+0 offers better resiliency than RAID-Z > or RAID-Z2. Maybe I'm missing something, but it ought to be the other way around. With 6 disks, RAID-Z2 can tolerate any two disk failures, whereas for 3x2-way mirroring, of the (6 choose 2) = 6*5/2 = 15 possible two-disk failu
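
A minimal sketch of the two layouts being compared (disk names made up). RAID-Z2 survives all 15 possible two-disk failures; 3x2-way mirroring loses the pool whenever both halves of the same mirror die, i.e. 3 of the 15 pairs:

    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0
    # versus:
    zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 \
        mirror c1t4d0 c1t5d0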

Re: [zfs-discuss] Can't remove corrupt file

2006-07-20 Thread Jeff Bonwick
> However, we do have the advantage of always knowing when something > is corrupted, and knowing what that particular block should have been. We also have ditto blocks for all metadata, so that even if any block of ZFS metadata is destroyed, we always have another copy. Bill Moore describes ditto

Re: [zfs-discuss] Flushing synchronous writes to mirrors

2006-07-26 Thread Jeff Bonwick
> For a synchronous write to a pool with mirrored disks, does the write > unblock after just one of the disks' write caches is flushed, > or only after all of the disks' caches are flushed? The latter. We don't consider a write to be committed until the data is on stable storage at full replicati

Re: [zfs-discuss] persistent errors - which file?

2006-07-27 Thread Jeff Bonwick
> I have a non-mirrored zfs file system which shows the status below. I saw > the thread in the archives about working this out, but it looks like ZFS > messages have changed. How do I find out what file(s) this is? > [...] > errors: The following persistent errors have been detected: > >
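
On later bits the answer is 'zpool status -v', which maps persistent errors back to path names where it can; a minimal sketch (pool name made up):

    zpool status -v tank    # lists affected files, or raw object IDs when
                            # the path can no longer be resolved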

Re: [zfs-discuss] sharing a storage array

2006-07-27 Thread Jeff Bonwick
> > bonus questions: any idea when hot spares will make it to S10? > > good question :) It'll be in U3, and probably available as patches for U2 as well. The reason for U2 patches is Thumper (x4500), because we want ZFS on Thumper to have hot spares and double-parity RAID-Z from day one. Jeff
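
A minimal sketch of the two U3-era features together (device names made up):

    # Double-parity RAID-Z with one hot spare; the spare is pulled in
    # automatically when a disk in the raidz2 group fails.
    zpool create tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 spare c0t4d0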

Re: [zfs-discuss] sharing a storage array

2006-07-27 Thread Jeff Bonwick
> I have a SAS array with a zfs pool on it. zfs automatically searches for > and mounts the zfs pool I've created there. I want to attach another > host to this array, but it doesn't have any provision for zones or the > like. (Like you would find in an FC array or in the switch infrastructure.)

Re: [zfs-discuss] ZFS performance using slices vs. entire disk?

2006-08-02 Thread Jeff Bonwick
> ZFS will try to enable the write cache if a whole disk is given. > > Additionally, keep in mind that the outer region of a disk is much faster. And it's portable. If you use whole disks, you can export the pool from one machine and import it on another. There's no way to export just one slice and leave
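
A minimal sketch of the portability point (pool name made up):

    zpool export tank    # on the old host; quiesces and releases the pool
    zpool import tank    # on the new host; ZFS locates the member disks itself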

Re: [zfs-discuss] ZFS performance using slices vs. entire disk?

2006-08-02 Thread Jeff Bonwick
> is zfs any less efficient with just using a portion of a > disk versus the entire disk? As others mentioned, if we're given a whole disk (i.e. no slice is specified) then we can safely enable the write cache. One other effect -- probably not huge -- is that the block placement algorithm is mos

Re: [zfs-discuss] ZFS performance using slices vs. entire disk?

2006-08-03 Thread Jeff Bonwick
> With all of the talk about performance problems due to > ZFS doing a sync to force the drives to commit to data > being on disk, how much of a benefit is this - especially > for NFS? It depends. For some drives it's literally 10x. > Also, if I was lucky enough to have a working prestoserv > ca

Re: [zfs-discuss] ZFS write performance problem with compression set to ON

2006-08-16 Thread Jeff Bonwick
> - When the filesystems have compress=ON I see the following: reads from compressed filesystems come in waves; zpool will report for long durations (60+ seconds) no read activity while the write activity is consistently reported at 20MB/S (no variation in the write rate throughout the test).

Re: [zfs-discuss] Re: system unresponsive after issuing a zpool attach

2006-08-16 Thread Jeff Bonwick
> And it started replacement/resilvering... after a few minutes the system became unavailable. A reboot only gives me a few minutes, then resilvering makes the system unresponsive. > > Is there any workaround or patch for this problem??? Argh, sorry -- the problem is that we don't do aggressive enough scrub

Re: [zfs-discuss] Re: Corrupted LUN in RAIDZ group -- How to repair?

2006-09-10 Thread Jeff Bonwick
> It looks like the scrub has now completed. Should I clear these warnings? Yep. You survived the Unfortunate Event unscathed. You're golden. Jeff
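
The cleanup after a clean scrub is a one-liner; a minimal sketch (pool name made up):

    zpool scrub tank     # re-verify every block against its checksum
    zpool clear tank     # once the scrub is clean, zero the error counters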

Re: [zfs-discuss] Re: Re: Snapshots impact on performance

2006-10-29 Thread Jeff Bonwick
> Nice, this is pointing the finger more definitively. Next > time could you try: > > dtrace -n '[EMAIL PROTECTED](20)] = count()}' -c 'sleep 5' > > (just send the last 10 or so stack traces) > > In the meantime I'll talk with our SPA experts and see if I can figure > out how to f
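
The archive's address scrubber mangled the one-liner above; a plausible reconstruction, since aggregating on stack(20) is a standard DTrace idiom for sampling kernel stacks (the probe shown here is a guess, not the original):

    dtrace -n 'profile-997 { @[stack(20)] = count(); }' -c 'sleep 5'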

Re: [zfs-discuss] File Space Allocation

2006-11-04 Thread Jeff Bonwick
> Where can I find information on the file allocation methodology used by ZFS? You've inspired me to blog again: http://blogs.sun.com/bonwick/entry/zfs_block_allocation I'll describe the way we manage free space in the next post. Jeff

Re: [zfs-discuss] 'zpool history' proposal

2006-05-04 Thread Jeff Bonwick
> What I meant is that events that "cause a permanent change..." should > not be deleted from the circular log if there are "old" (older?) > "operationally interesting" events that could be deleted instead. > > I.e., if the log can keep only so much info then I'd rather have the > history of a poo

Re: [zfs-discuss] zfs mirror/raidz: can they use different types of disks

2006-05-04 Thread Jeff Bonwick
> I just got an Ultra 20 with the default 80GB internal disk. Right now, > I'm using around 30GB for zfs. I will be getting a new 250GB drive. > > Question: If I create a 30GB slice on the 250GB drive, will that be okay > to use as a mirror (or raidz) of the current 30GB that I now have on the
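
A minimal sketch of the setup being asked about (device names made up; the new side just has to be at least as large as the old):

    # Attach a 30GB slice of the 250GB drive as the second half of a mirror.
    zpool attach tank c1t0d0s7 c2t0d0s0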

Re: [zfs-discuss] 'zpool history' proposal

2006-05-04 Thread Jeff Bonwick
> Why not use a terse XML format? I suppose we could, but I'm not convinced that XML is stable enough to be part of a 30-year on-disk format. 15 years ago PostScript was going to be stable forever, but today many PostScript readers barf on Adobe-PS-1.0 files -- which were supposed to be the most

[zfs-discuss] Re: PSARC 2006/288 zpool history

2006-05-04 Thread Jeff Bonwick
> Why not use the Solaris audit facility? Several reasons: (1) We want the history to follow the data, not the host. If you export the pool from one host and import it on another, we want the command history to move with the pool. That won't happen if the history file is somewhere i
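
A minimal sketch of point (1), with made-up names:

    zpool history tank    # on host A: full command history
    zpool export tank     # move the pool...
    zpool import tank     # ...then import on host B
    zpool history tank    # same log, because it lives in the pool itself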

Re: [zfs-discuss] Trying to replicate ZFS self-heal demo and not seeing fixed error

2006-05-09 Thread Jeff Bonwick
> bash-3.00# dd if=/dev/urandom of=/dev/dsk/c1t10d0 bs=1024 count=20480 A couple of things: (1) When you write to /dev/dsk, rather than /dev/rdsk, the results are cached in memory. So the on-disk state may have been unaltered. (2) When you write to /dev/rdsk/c-t-d, without specifying a slic
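
A corrected version of the corruption test per the two points above; a minimal sketch with made-up names (the raw device so the write bypasses the cache, and a slice ZFS actually uses):

    dd if=/dev/urandom of=/dev/rdsk/c1t10d0s0 bs=1024 count=20480
    zpool scrub tank        # force ZFS to notice and repair the damage
    zpool status -v tank    # CKSUM errors appear; the data self-heals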
