[zfs-discuss] 'primarycache' and 'secondarycache'
My understanding of the read cache is that the L2ARC has a feed thread that reads data into the cache from the ARC. Hence my question: if primarycache is set to 'metadata', will the L2ARC still get to cache user data? Similarly, what happens if primarycache is set to 'none'?

Thanks,
--Jackie
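For context, a minimal sketch of how these per-dataset properties are set; the pool and filesystem names are hypothetical, and whether the L2ARC can hold what the ARC never caches is exactly the question above, not something this sketch answers:

    # primarycache controls what the ARC may cache for this dataset,
    # secondarycache controls what the L2ARC may cache.
    zfs set primarycache=metadata tank/data
    zfs set secondarycache=all tank/data
    zfs get primarycache,secondarycache tank/data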
Re: [zfs-discuss] Mac OS X clients with ZFS server
On 15 Sept. 2010, at 22:04, Mike Mackovitch wrote:

> On Wed, Sep 15, 2010 at 12:08:20PM -0700, Nabil wrote:
>> Any resolution to this issue? I'm experiencing the same annoying
>> lockd thing with Mac OS X 10.6 clients. I am at pool ver 14, fs ver
>> 3. Would somehow going back to the earlier 8/2 setup make things
>> better?
>
> As noted in the earlier thread, the "annoying lockd thing" is not a
> ZFS issue, but rather a networking issue.
>
> FWIW, I never saw a resolution. But the suggestions for how to debug
> situations like this still stand.

And for reference, I have a number of 10.6 clients using NFS for sharing Fusion virtual machines, iTunes libraries, iPhoto libraries, etc. without any issues.

Cheers,
Erik
[zfs-discuss] Replacing a disk never completes
I have an X4540 running b134 where I'm replacing 500GB disks with 2TB disks (Seagate Constellation), and the pool seems sick now.

The pool has four raidz2 vdevs (8+2). The first set of 10 disks was replaced a few months ago. I replaced two disks in the second set (c2t0d0, c3t0d0) a couple of weeks ago, but have been unable to get the third disk (c4t0d0) to finish replacing. I have tried the resilver for c4t0d0 four times now, and the pool also comes up with checksum errors and a permanent error (:<0x0>).

The first resilver was from 'zpool replace', which came up with checksum errors. I cleared the errors, which triggered the second resilver (same result). I then did a 'zpool scrub', which started the third resilver and also identified three permanent errors (the two additional ones were in files in snapshots, which I then destroyed). I then did a 'zpool clear' and then another scrub, which started the fourth resilver attempt. This last attempt identified another file with errors in a snapshot that I have now destroyed.

Any ideas on how to get this disk replacement finished without rebuilding the pool and restoring from backup? The pool is working, but is reporting as degraded and with checksum errors. Here is what the pool currently looks like:

# zpool status -v pool2
  pool: pool2
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 33h9m with 4 errors on Thu Sep 16 00:28:14
config:

        NAME             STATE     READ WRITE CKSUM
        pool2            DEGRADED     0     0     8
          raidz2-0       ONLINE       0     0     0
            c0t4d0       ONLINE       0     0     0
            c1t4d0       ONLINE       0     0     0
            c2t4d0       ONLINE       0     0     0
            c3t4d0       ONLINE       0     0     0
            c4t4d0       ONLINE       0     0     0
            c5t4d0       ONLINE       0     0     0
            c2t5d0       ONLINE       0     0     0
            c3t5d0       ONLINE       0     0     0
            c4t5d0       ONLINE       0     0     0
            c5t5d0       ONLINE       0     0     0
          raidz2-1       DEGRADED     0     0    14
            c0t5d0       ONLINE       0     0     0
            c1t5d0       ONLINE       0     0     0
            c2t1d0       ONLINE       0     0     0
            c3t1d0       ONLINE       0     0     0
            c4t1d0       ONLINE       0     0     0
            c5t1d0       ONLINE       0     0     0
            c2t0d0       ONLINE       0     0     0
            c3t0d0       ONLINE       0     0     0
            replacing-8  DEGRADED     0     0     0
              c4t0d0s0/o OFFLINE      0     0     0
              c4t0d0     ONLINE       0     0     0  268G resilvered
            c5t0d0       ONLINE       0     0     0
          raidz2-2       ONLINE       0     0     0
            c0t6d0       ONLINE       0     0     0
            c1t6d0       ONLINE       0     0     0
            c2t6d0       ONLINE       0     0     0
            c3t6d0       ONLINE       0     0     0
            c4t6d0       ONLINE       0     0     0
            c5t6d0       ONLINE       0     0     0
            c2t7d0       ONLINE       0     0     0
            c3t7d0       ONLINE       0     0     0
            c4t7d0       ONLINE       0     0     0
            c5t7d0       ONLINE       0     0     0
          raidz2-3       ONLINE       0     0     0
            c0t7d0       ONLINE       0     0     0
            c1t7d0       ONLINE       0     0     0
            c2t3d0       ONLINE       0     0     0
            c3t3d0       ONLINE       0     0     0
            c4t3d0       ONLINE       0     0     0
            c5t3d0       ONLINE       0     0     0
            c2t2d0       ONLINE       0     0     0
            c3t2d0       ONLINE       0     0     0
            c4t2d0       ONLINE       0     0     0
            c5t2d0       ONLINE       0     0     0
        logs
          mirror-4       ONLINE       0     0     0
            c0t1d0s0     ONLINE       0     0     0
            c1t3d0s0     ONLINE       0     0     0
        cache
          c0t3d0s7       ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        :<0x0>
        <0x167a2>:<0x552ed>

(This second file was in a snapshot I destroyed after the resilver completed.)

# zpool list pool2
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH    ALTROOT
pool2  31.8T  13.8T  17.9T    43%  1.65x  DEGRADED  -

The slog is a mirror of two SLC SSDs and the L2ARC is an MLC SSD.

thanks,
Ben
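For reference, a hedged sketch of the command sequence involved; whether a manual detach of the old half of 'replacing-8' is appropriate here is my assumption, not something established in this thread:

    # Start the replacement (already done above) and watch the resilver.
    zpool replace pool2 c4t0d0
    zpool status -v pool2

    # After a clean resilver, clear the error counters and verify with a scrub.
    zpool clear pool2
    zpool scrub pool2

    # If the old device lingers under 'replacing-8', it can be detached explicitly.
    zpool detach pool2 c4t0d0s0/o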
Re: [zfs-discuss] dedicated ZIL/L2ARC
We downloaded zilstat from http://www.richardelling.com/Home/scripts-and-programs-1 but we never could get the script to run. We are not really sure how to debug. :(

./zilstat.ksh
dtrace: invalid probe specifier #pragma D option quiet
 inline int OPT_time = 0;
 inline int OPT_txg = 0;
 inline int OPT_pool = 0;
 inline int OPT_mega = 0;
 inline int INTERVAL = 1;
 inline int LINES = -1;
 inline int COUNTER = -1;
 inline int FILTER = 0;
 inline string POOL = "";
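A hedged debugging sketch, assuming the failure is a privilege or probe-availability problem rather than a bug in the script itself (that assumption is mine, not something the post establishes):

    # Run the script with full privileges; dtrace needs kernel-level privileges.
    pfexec ./zilstat.ksh

    # Check that dtrace itself works and can see a ZIL function the script
    # is likely to instrument (zil_commit is a real kernel function).
    pfexec dtrace -ln 'fbt::zil_commit:entry'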
Re: [zfs-discuss] Mac OS X clients with ZFS server
On Thu, 16 Sep 2010, erik.ableson wrote:

> And for reference, I have a number of 10.6 clients using NFS for
> sharing Fusion virtual machines, iTunes library, iPhoto libraries etc.
> without any issues.

Excellent; what OS is your NFS server running?

--
Rich Teer, Publisher
Vinylphile Magazine
www.vinylphilemag.com
Re: [zfs-discuss] dedicated ZIL/L2ARC
We have the following setup configured. The drives are running on a couple of PAC PS-5404s. Since these units do not support JBOD, we have configured each individual drive as a RAID0 and shared out all 48 RAID0s per box. This is connected to the Solaris box through a dual-port 4G Emulex fibre channel card with MPIO enabled (round-robin), and is configured as 18 raidz2 vdevs in one big pool.

We currently have 2 zvols created, each around 40TB sparse (30T in use). These are in turn shared out using a fibre channel QLogic QLA2462 in target mode, using both ports. One zvol is connected to one Windows server and the other zvol to another Windows server, with both Windows servers having a QLogic 2462 fibre channel adapter, using both ports and MPIO enabled. The Windows servers are running Windows 2008 R2. The zvols are formatted NTFS and used as a staging area and D2D2T system for both Commvault and Microsoft Data Protection Manager backup solutions. The SAN system sees mostly writes since it is used for backups.

We are using Cisco 9124 fibre channel switches, and we have recently upgraded to Cisco 10G Nexus switches on our Ethernet side. Fibre channel support on the Nexus will have to wait a few years due to the cost.

We are just trying to fine-tune our SAN for the best performance possible, and we don't really have any firm expectations right now. We are always looking to improve something. :)
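For reference, a minimal sketch of how a sparse zvol like those described is created; the pool and volume names are hypothetical, since the post does not give them:

    # -s creates a sparse (thin-provisioned) volume, so 40T is not reserved up front.
    zfs create -s -V 40T tank/backup-lun0
    zfs get volsize,refreservation,usedbydataset tank/backup-lun0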
Re: [zfs-discuss] resilver = defrag?
On Wed, September 15, 2010 16:18, Edward Ned Harvey wrote:

> For example, if you start with an empty drive, and you write a large
> amount of data to it, you will have no fragmentation. (At least, no
> significant fragmentation; you may get a little bit based on random
> factors.) As life goes on, as long as you keep plenty of empty space
> on the drive, there's never any reason for anything to become
> significantly fragmented.

Sure, if only a single thread is ever writing to the disk store at a time. This situation doesn't exist with any kind of enterprise disk appliance, though; there are always multiple users doing stuff.

--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
[zfs-discuss] recordsize
What are the ramifications of changing the recordsize of a ZFS filesystem that already has data on it?

I want to tune down the recordsize to speed up very small reads, to a size that is more in line with the read size. Can I do this on a filesystem that already has data on it, and how does it affect that data? The zpool consists of 8 SAN LUNs.

Thanks,
Mike
Re: [zfs-discuss] Mac OS X clients with ZFS server
On Thu, 16 Sep 2010, Erik Ableson wrote:

> OpenSolaris snv129

Hmm, SXCE snv_130 here. Did you have to do any server-side tuning (e.g., allowing remote connections), or did it just work out of the box? I know that Sendmail needs some gentle persuasion to accept remote connections out of the box; perhaps lockd is the same?

--
Rich Teer, Publisher
Vinylphile Magazine
www.vinylphilemag.com
Re: [zfs-discuss] recordsize
On Thu, Sep 16, 2010 at 8:21 AM, Mike DeMarco wrote:

> What are the ramifications of changing the recordsize of a ZFS filesystem
> that already has data on it?
>
> I want to tune down the recordsize to speed up very small reads, to a size
> that is more in line with the read size. Can I do this on a filesystem that
> already has data on it, and how does it affect that data? The zpool consists
> of 8 SAN LUNs.

Changing any of the ZFS properties only affects data written after the change is made. Thus, reducing the recordsize for a filesystem will only affect newly written data. Any existing data is not affected until it is rewritten or copied.

--
Freddie Cash
fjwc...@gmail.com
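A short sketch of the behaviour described; the dataset name and record size are hypothetical:

    # The new recordsize applies only to blocks written from now on.
    zfs set recordsize=8k tank/db
    zfs get recordsize tank/db

    # Existing files keep their old block size until they are rewritten,
    # e.g. by copying them in place.
    cp /tank/db/table.dat /tank/db/table.dat.new
    mv /tank/db/table.dat.new /tank/db/table.dat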
Re: [zfs-discuss] Mac OS X clients with ZFS server
On Thu, Sep 16, 2010 at 08:15:53AM -0700, Rich Teer wrote:

> On Thu, 16 Sep 2010, Erik Ableson wrote:
>
>> OpenSolaris snv129
>
> Hmm, SXCE snv_130 here. Did you have to do any server-side tuning
> (e.g., allowing remote connections), or did it just work out of the
> box? I know that Sendmail needs some gentle persuasion to accept
> remote connections out of the box; perhaps lockd is the same?

So, you've been having this problem since April. Did you ever try getting packet traces to see where the problem is? As I previously stated, if you want, you can forward the traces to me to look at. Let me know if you need directions on how to capture them.

--macko
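For anyone wanting to capture such a trace, a hedged sketch using Solaris snoop; the interface name and client address are hypothetical:

    # Capture NFS/lockd traffic between the server and one client to a file
    # that can be inspected later with snoop -i or Wireshark.
    snoop -d e1000g0 -o /tmp/nfs-lockd.cap host 192.168.1.50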
Re: [zfs-discuss] resilver = defrag?
> "dd" == David Dyer-Bennet writes:

    dd> Sure, if only a single thread is ever writing to the disk
    dd> store at a time.

video warehousing is a reasonable use case that will have small numbers of sequential readers and writers to large files. virtual tape library is another obviously similar one. basically, things which used to be stored on tape. which are not uncommon.

AIUI ZFS does not have a fragmentation problem for these cases unless you fill past 96%, though I've been trying to keep my pool below 80% because .

    dd> This situation doesn't exist with any kind of enterprise disk
    dd> appliance, though; there are always multiple users doing
    dd> stuff.

the point's relevant, but I'm starting to tune out every time I hear the word ``enterprise.'' seems it often decodes to:

(1) ``fat sacks and no clue,'' or

(2) ``i can't hear you i can't hear you i have one big hammer in my toolchest and one quick answer to all questions, and everything's perfect! perfect, I say. unless you're offering an even bigger hammer I can swap for this one, I don't want to hear it,'' or

(3) ``However of course I agree that hammers come in different colors, and a wise and experienced craftsman will always choose the color of his hammer based on the color of the nail he's hitting, because the interface between hammers and nails doesn't work well otherwise. We all know here how to match hammer and nail colors, but I don't want to discuss that at all because it's a private decision to make between you and your salesdroid.

``However, in this forum here we talk about GREEN NAILS ONLY. If you are hitting green nails with red hammers and finding they go into the wood anyway then you are being very unprofessional because that nail might have been a bank transaction. --posted from opensolaris.org''
Re: [zfs-discuss] resilver = defrag?
David Dyer-Bennet wrote:

> Sure, if only a single thread is ever writing to the
> disk store at a time.
>
> This situation doesn't exist with any kind of
> enterprise disk appliance, though; there are always
> multiple users doing stuff.

Ok, I'll bite. Your assertion seems to be that "any kind of enterprise disk appliance" will always have enough simultaneous I/O requests queued that any sequential read from any application will be sufficiently broken up by requests from other applications, effectively rendering all read requests as random. If I follow your logic, since all requests are essentially random anyway, then where they fall on the disk is irrelevant.

I might challenge a couple of those assumptions. First, if the data is not fragmented, then ZFS can coalesce multiple contiguous read requests into a single large read request, increasing total throughput regardless of competing I/O requests (which might also benefit from the same effect). Second, I am unaware of an enterprise requirement that disk I/O run at 100% busy, any more than I am aware of the same requirement for full network link, CPU, or PCI bus utilization.

What appears to be missing from this discussion is any shred of scientific evidence that fragmentation is good or bad, and by how much. We also lack any detail on how much fragmentation actually takes place. Let's see if some people in the community can get some real numbers behind this stuff in real-world situations.

Cheers,
Marty
Re: [zfs-discuss] Compression block sizes
On Wed, 15 Sep 2010, Brandon High wrote:

> When using compression, are the on-disk record sizes determined before
> or after compression is applied? In other words, if record size is set
> to 128k, is that the amount of data fed into the compression engine, or
> is the output size trimmed to fit? I think it's the former, but I'm not
> certain.

We have been told before that the blocksize is applied to the uncompressed data, and that when compression is applied, short blocks may be written to disk. This does not mean that the short blocks don't start at a particular alignment. When using raidz, the zfs blocks are already broken up into smaller chunks, using a smaller alignment than the zfs record size. For zfs send, the data is uncompressed to full records prior to sending.

Bob

--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
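A small sketch of how one might look at this on a test dataset; the dataset name and object number are hypothetical, and using zdb for this is my suggestion rather than anything prescribed in the thread:

    # 128k logical records are fed to the compressor; physical blocks on disk
    # may come out shorter.
    zfs create -o recordsize=128k -o compression=gzip tank/comptest
    zfs get recordsize,compression,compressratio tank/comptest

    # zdb can report logical vs. physical block sizes for a given object.
    zdb -dddd tank/comptest 8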
[zfs-discuss] Best practice for Sol10U9 ZIL -- mirrored or not?
Best practice in Solaris 10 U8 and older was to use a mirrored ZIL.

With the ability to remove slog devices in Solaris 10 U9, we're thinking we may get more bang for our buck by using two independent slog devices for improved IOPS performance instead of needing the redundancy so much.

Any thoughts on this?

If we lost our slog devices and had to reboot, would the system come up (e.g., could we "remove" the failed slog devices from the zpool so the zpool would come online)?

Thanks,
Ray
Re: [zfs-discuss] Best practice for Sol10U9 ZIL -- mirrored or not?
+--
| On 2010-09-16 18:08:46, Ray Van Dolson wrote:
|
| Best practice in Solaris 10 U8 and older was to use a mirrored ZIL.
|
| With the ability to remove slog devices in Solaris 10 U9, we're
| thinking we may get more bang for our buck to use two slog devices for
| improved IOPS performance instead of needing the redundancy so much.
|
| Any thoughts on this?
|
| If we lost our slog devices and had to reboot, would the system come up
| (eg could we "remove" failed slog devices from the zpool so the zpool
| would come online..)

The ability to remove the slogs isn't really the win here, it's import -F. The problem is: if the ZIL dies, you will lose whatever writes were in flight.

I've just deployed some SSD ZIL (on U9), and decided to mirror them. Cut the two SSDs into 1GB and 31GB partitions, mirrored the two 1GB slices as the slog, and have the two 31GB slices as L2ARC. So far extremely happy with it. Running a scrub during production hours, before, was unheard of. (And, well, "production" for mail storage is basically all hours, so.)

As for running non-mirrored slogs... dunno. Our customers would be pretty pissed if we lost any mail, so I doubt I will do so. My SSDs were only $90 each, though, so cost is hardly a factor for us.

Cheers.

--
bdha
cyberpunk is dead. long live cyberpunk.
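A hedged sketch of the layout described; the device names, slice numbers, and pool name are hypothetical, since the post does not give them:

    # Two SSDs, each split into a small slog slice (s0) and a large cache slice (s1).
    zpool add tank log mirror c5t0d0s0 c5t1d0s0
    zpool add tank cache c5t0d0s1 c5t1d0s1
    zpool status tank

    # With pool version 19 (Solaris 10 U9), a log vdev can be removed again;
    # the vdev name 'mirror-1' is an assumption about how zpool status labels it.
    zpool remove tank mirror-1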