Re: [zfs-discuss] reboot when copying large amounts of data
Hi folks,

I was trying to load a large file in /tmp so that a process that parses it wouldn't be limited by a disk-throughput bottleneck. My rig here has only 12GB of RAM, and the file I copied is about 12GB as well. Before the copy finished, my system restarted.

I'm pretty up to date; the system is running b129. Googling this, I stumbled upon this thread dating back to March:
http://mail.opensolaris.org/pipermail/zfs-discuss/2009-March/027264.html

Since this seems to be a known issue, and a pretty serious one in my book, is there a fix pending, or has it not been investigated?

Thanks in advance, and happy holidays to all.

-=arnaud=-
Re: [zfs-discuss] How do I determine dedupe effectiveness?
> Wait...whoah, hold on. If snapshots reside within the confines of the
> pool, are you saying that dedup will also count what's contained inside
> the snapshots? I'm not sure why, but that thought is vaguely disturbing
> on some level.
>
> Then again (not sure how gurus feel on this point) but I have this
> probably naive and foolish belief that snapshots (mostly) oughtta reside
> on a separate physical box/disk_array... "someplace else" anyway. I say
> "mostly" because I s'pose keeping 15 minute snapshots on board is
> perfectly OK - and in fact handy. Hourly...ummm, maybe the same - but
> Daily/Monthly should reside "elsewhere".

IMHO, snapshots are not a replacement for backups. Backups should definitely reside outside the system, so that if you lose your entire array, SAN, controller, etc., you can recover somewhere else. Snapshots, on the other hand, give you the ability to quickly recover to a point in time when something not-so-catastrophic happens - like a user deletes a file, an O/S update fails and hoses your system, etc. - without going to a backup system. Snapshots are nice, but they're no replacement for backups.
Re: [zfs-discuss] How do I determine dedupe effectiveness?
On Sun, Dec 20, 2009 at 16:23, Nick wrote:
> IMHO, snapshots are not a replacement for backups. Backups should
> definitely reside outside the system, so that if you lose your entire array,
> SAN, controller, etc., you can recover somewhere else. Snapshots, on the
> other hand, give you the ability to quickly recover to a point in time when
> something not-so-catastrophic happens - like a user deletes a file, an O/S
> update fails and hoses your system, etc. - without going to a backup system.
> Snapshots are nice, but they're no replacement for backups.

I agree, and said so, in response to:

> You seem to be confusing "snapshots" with "backup".

To which I replied: No, I wasn't confusing them at all. Backups are backups. Snapshots, however, do have some limited value as backups. They're no substitute, but they augment a planned backup schedule rather nicely in many situations.

Please note that I said that snapshots AUGMENT a well-planned backup schedule, and in no way are they - nor should they be - considered a replacement. Your quoted scenario is the perfect illustration: a user-deleted file, a rollback for that update that "didn't quite work out as you hoped", and so forth. Agreed, no argument.

The (one and only) point that I was making was that - like backups - snapshots should be kept "elsewhere", whether by using zfs-send or zipping up the whole shebang and ssh'ing it someplace... "elsewhere" meaning beyond the pool. Rolling 15-minute and hourly snapshots... no, they stay local, but daily/weekly/monthly snapshots get stashed "offsite" (off-box). Apart from anything else, it's one heck of a space-saver - in the long run.
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
Ok, dump uploaded!

    Thanks for your upload
    Your file has been stored as "/cores/redshirt-vmdump.0" on the Supportfiles service.
    Size of the file (in bytes): 1743978496
    The file has a cksum of: 2878443682

It's about 1.7 GB compressed!
[zfs-discuss] raidz data loss stories?
The zfs best practices page (and all the experts in general) talk about MTTDL, and how raidz2 is better than raidz, and so on. Has anyone here ever actually experienced data loss in a raidz that has a hot spare? Of course, I mean from disk failure, not from bugs or admin error, etc.

-frank
[zfs-discuss] On collecting data from "hangs"
There seems to be a rash of posts lately where people are resetting or rebooting without getting any data, so I thought I'd post a quick overview on collecting crash dumps. If you think you've got a hang problem with ZFS and you want to gather data for someone to look at, here are a few steps you should take. If you already know all about gathering crash dumps on Solaris, feel free to delete now.

1) Make sure crash dumps are enabled.

Enable saving of crash dumps by executing, as root or with pfexec, 'dumpadm -y'. The most reasonable trade-off of information vs. size in the crash dump is probably 'dumpadm -c curproc'. If you're running OpenSolaris you likely already have a dedicated zvol as a dump device. If you're running SXCE you may need to dedicate a disk or slice for dump purposes. Change or dedicate the dump device with 'dumpadm -d <dump-device>'. See dumpadm(1M) for more info.

2) There are at least a couple of ways to capture crash dumps.

As root or with pfexec, run 'savecore -L'. This is a point-in-time capture of what's happening on the live system. The system continues to run during the capture, so the results can be slightly inconsistent, but the machine won't reboot. Good if you think whatever is hung is still making progress.

If you really don't mind rebooting, then 'reboot -nd' will most often get a dump without the dump also hanging up and forcing you to hard reset anyway.

---

Number 1 is best done now, before you have a hang. It won't hurt anything to have crash dumps enabled - and if you ever get a panic you'll have the data needed for someone to analyze the issue. If the crash dump saving works, the default location for the dumps to be stored is the directory /var/crash/.

-tim
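For anyone who wants it in one place, here is a minimal command sketch of the steps above. Nothing here is exotic, but the dump-device path and crash directory differ per system, so treat those as placeholders:

    # 1) Enable crash dumps ahead of time (as root or via pfexec)
    pfexec dumpadm -y                 # save a crash dump automatically on panic
    pfexec dumpadm -c curproc         # kernel pages plus current process: good info/size trade-off
    pfexec dumpadm                    # with no options, show the current dump device and savecore dir

    # 2a) Live capture while the box is still (sort of) running
    pfexec savecore -L

    # 2b) Or take the dump on the way down, if a reboot is acceptable
    pfexec reboot -nd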
[zfs-discuss] iSCSI with Deduplication, is there any point?
I've been using OpenSolaris for my home file server for a little over a year now. For most of that time I have used smb to share files out to my other systems. I also have a Windows Server 2003 DC, and all my client systems are joined to the domain. Most of that time was a permissions nightmare getting smb to jive with the domain. The last issue I ran into was the final straw. I backed up all my data and executed a zpool destroy. And it felt kinda good...

I then turned to iSCSI to access my files. I've put the OpenSolaris box on a crossover cable hooked directly to my Windows server with a couple of gig cards. I've set up iSCSI, and after resolving some issues with poor performance due to sync writes and ZIL issues, I'm pretty happy with the setup. I've got a 2.4T volume in my raidz pool that my Windows box uses for one big NTFS volume. Windows deals with sharing the data to all the Windows clients and everyone is happy. No ACL issues, no user maps, it just works.

I'm sure a ton of nix hardcores either stopped reading already or are puking in their hat. Sorry to disturb, I don't like NTFS any more than you. But I picked the best option I could find and it works better for me. If anyone is able to educate me on a better way, or a way of making smb more reliable in a Windows domain, or any other alternative that allows me to use Windows domain permissions, I'm all ears.

I have already run into one little snag that I don't see any way of overcoming with my chosen method. I've upgraded to snv_129 with high hopes for getting the most out of deduplication. But using iSCSI volumes I'm not sure how I can gain any benefit from it. The volumes are a set size; Windows sees those volumes as that size despite any sort of block-level deduplication or compression taking place on the other side of the iSCSI connection. I can't create volumes that add up to more than the original pool size from what I can tell. I can see the pool is saving space, but it doesn't appear to become available to zfs volumes. Dedup being pretty new, I haven't found much on the subject online.

So my question is... Using zfs solely for hosting iSCSI targets, is there any way to use the space gained by deduplication?
Re: [zfs-discuss] iSCSI with Deduplication, is there any point?
> I have already run into one little snag that I don't see any way of
> overcoming with my chosen method. I've upgraded to snv_129 with high hopes
> for getting the most out of deduplication. But using iSCSI volumes I'm not
> sure how I can gain any benefit from it. The volumes are a set size, Windows
> sees those volumes as that size despite any sort of block level deduplication
> or compression taking place on the other side of the iSCSI connection. I
> can't create volumes that add up to more than the original pool size from
> what I can tell. I can see the pool is saving space but it doesn't appear to
> become available to zfs volumes. Dedup being pretty new I haven't found much
> on the subject online.

Create sparse volumes. Use -s when you create a volume, or change the reservation on your existing volumes. Search for "sparse" in the zfs man page.

And don't run out of space. :-)
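A quick sketch of both routes; the pool and volume names (tank/iscsi1) are just examples, and depending on how the zvol was created the relevant property may be reservation rather than refreservation:

    # Create a new sparse (thin-provisioned) 2T zvol: no space is reserved up front
    pfexec zfs create -s -V 2T tank/iscsi1

    # Or make an existing zvol sparse by dropping its reservation
    pfexec zfs set refreservation=none tank/iscsi1

    # Watch actual pool usage so the thin volumes don't outrun the real space
    zpool list tank
    zfs get used,refreservation,volsize tank/iscsi1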
Re: [zfs-discuss] reboot when copying large amounts of data
arnaud wrote:
> Hi folks,
> I was trying to load a large file in /tmp so that a process that parses it
> wouldn't be limited by a disk throughput bottleneck. My rig here has only
> 12GB of RAM and the file I copied is about 12GB as well. Before the copy
> finished, my system restarted. I'm pretty up to date, the system has b129
> running.

12GB of ECC RAM?

This kind of spontaneous reboot with crash dumps enabled (do you have them enabled?) tends to be hardware related. The most common bit of bad hardware is RAM...

--
Ian.
Re: [zfs-discuss] iSCSI with Deduplication, is there any point?
Cool, thx - sounds like exactly what I'm looking for.

I did a bit of reading on the subject, and to my understanding I should create a volume of a size as large as I could possibly need. So, siding on the optimistic, "zfs create -s -V 4000G tank/iscsi1". Then in Windows, initialize and quick format it, and Windows will think it is 4000G. Obviously I would do a quick format, not a full one, or it would write 4000G worth of zeros or die trying. Although with dedup I would presume it should be able to do that. Is that a good procedure, or is there a better way?

Anyway, my next question is: what happens when it fills up? Also, what happens when deleted files on the NTFS partition add up to consume all the available space? I mean, if I write a file to the NTFS volume it will write all that data to the ZFS filesystem. Then I delete that file, and all that happens is it gets marked as deleted; the data doesn't actually get zeroed out, so as far as ZFS is concerned the blocks still contain data and need to be stored. As with most NTFS partitions, it will eventually use every bit of space it sees available, no matter how many active files are there. So using this type of thin provisioning, should I run scheduled cleans on the NTFS partition from Windows to zero out the deleted data? Also, are there any other issues I should be aware of?
Re: [zfs-discuss] iSCSI with Deduplication, is there any point?
On Sun, Dec 20, 2009 at 9:24 PM, Chris Scerbo wrote:
> Cool, thx - sounds like exactly what I'm looking for.
>
> I did a bit of reading on the subject, and to my understanding I should
> create a volume of a size as large as I could possibly need. So, siding on
> the optimistic, "zfs create -s -V 4000G tank/iscsi1". Then in Windows
> initialize and quick format it and Windows will think it is 4000G. Obviously
> I would do a quick format not a full or it would write 4000G worth of zeros
> or die trying. Although with Dedup I would presume it should be able to do
> that. Is that a good procedure or is there a better way?
>
> Anyway, my next question is what happens when it fills up? Also what happens
> when deleted files on the NTFS partition add up to consume all the available
> space.

Run "sdelete -c X:" where X: is your drive. That should take care of your deleted, but still occupied, blocks.

--
Regards,
Cyril
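On the ZFS side, a hedged sketch of how to check whether that zeroing actually buys anything back; the dataset names are illustrative, and with dedup or compression enabled the zeroed blocks should collapse to almost nothing:

    # Space the zvol references and how well it dedups/compresses,
    # checked before and after running sdelete -c on the Windows side
    zfs get used,referenced,compressratio tank/iscsi1
    zpool get dedupratio tank
    zpool list tank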
[zfs-discuss] ARC not using all available RAM?
I've got an opensolaris snv_118 machine that does nothing except serve up NFS and iSCSI.

The machine has 8G of ram, and I've got an 80G SSD as L2ARC. The ARC on this machine is currently sitting at around 2G, the kernel is using around 5G, and I've got about 1G free. I've pulled this from a combination of arc_summary.pl and 'echo "::memstat" | mdb -k'.

It's my understanding that the kernel will use a certain amount of ram for managing the L2ARC, and that how much is needed is dependent on the size of the L2ARC and the recordsize of the zfs filesystems.

I have some questions that I'm hoping the group can answer...

Given that I don't believe there is any other memory pressure on the system, why isn't the ARC using that last 1G of ram?

Is there some way to see how much ram is used for L2ARC management? Is that what the l2_hdr_size kstat measures? Is it possible to see it via 'echo "::kmastat" | mdb -k'?

Thanks everyone.

Tristan
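A few kstat queries that may answer these directly; l2_hdr_size does show up under arcstats on builds of this vintage, though exactly what it accounts for isn't well documented, so treat this as a sketch:

    # Current ARC size, target (c) and ceiling (c_max)
    kstat -p zfs:0:arcstats:size zfs:0:arcstats:c zfs:0:arcstats:c_max

    # Memory charged to L2ARC header bookkeeping
    kstat -p zfs:0:arcstats:l2_hdr_size

    # Overall kernel vs. free memory picture
    echo "::memstat" | pfexec mdb -k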
Re: [zfs-discuss] How do I determine dedupe effectiveness?
> The (one and only) point that I was making was that - like backups -
> snapshots should be kept "elsewhere" whether by using zfs-send, or zipping
> up the whole shebang and ssh'ing it someplace... "elsewhere" meaning beyond
> the pool. Rolling 15 minute and hourly snapshots... no, they stay local, but
> daily/weekly/monthly snapshots get stashed "offsite" (off-box). Apart from
> anything else, it's one heck of a space-saver - in the long run.

I guess that depends on what you're doing with them and how big a part they play in your operations. On my SAN, I don't roll my snapshots off-site, because I'm comfortable losing those snapshots and still being able to recover backup data, and I'd rather not duplicate storage infrastructure just to have snapshots around. I consider the snapshots a "nice-to-have" that saves me time periodically, but not critical to my infrastructure; therefore it doesn't make sense to spend the time/money to send them off-site. If something bad happens where I cannot recover snapshots, I'm probably going to be spending a lot of time recovering, and the snapshots probably aren't that useful to me.

However, if the snapshots are critical to your operations and your ability to service user requests, then, yes, putting them onto a secondary storage location is a good idea.

-Nick
Re: [zfs-discuss] ZFS receive -dFv creates an extra "e" subdirectory..
On Sat, Dec 19, 2009 at 3:56 AM, Steven Sim wrote:
> r...@sunlight:/root# zfs list -r myplace/Docs
> NAME          USED  AVAIL  REFER  MOUNTPOINT
> myplace/Docs 3.37G  1.05T  3.33G  /export/home/admin/Docs/e/Docs   <--- *** Here is the extra "e/Docs"...

I saw a similar behavior when doing a receive on b129. I don't remember if the mountpoint was set locally in the dataset or inherited, but re-inheriting it fixed the mountpoint.

-B

--
Brandon High : bh...@freaks.com
For sale: One moral compass, never used.
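For anyone hitting the same thing, a minimal sketch of that workaround; the dataset name is taken from the quoted listing, adjust to taste:

    # Clear the locally set mountpoint so the dataset inherits it from its parent again
    pfexec zfs inherit mountpoint myplace/Docs

    # Check whether the property is now inherited and where it mounts
    zfs get -r mountpoint myplace/Docs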
Re: [zfs-discuss] ARC not using all available RAM?
On Dec 20, 2009, at 12:25 PM, Tristan Ball wrote:

> I've got an opensolaris snv_118 machine that does nothing except serve up
> NFS and ISCSI.
>
> The machine has 8G of ram, and I've got an 80G SSD as L2ARC.
> The ARC on this machine is currently sitting at around 2G, the kernel is
> using around 5G, and I've got about 1G free.

Yes, the ARC max is set by default to 3/4 of memory or memory - 1GB, whichever is greater.
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/arc.c#3426

> I've pulled this from a combination of arc_summary.pl, and 'echo "::memstat" | mdb -k'

IMHO, it is easier to look at the c_max in kstat.
    kstat -n arcstats -s c_max

> It's my understanding that the kernel will use a certain amount of ram for
> managing the L2Arc, and that how much is needed is dependent on the size of
> the L2Arc and the recordsize of the zfs filesystems

Yes.

> I have some questions that I'm hoping the group can answer...
>
> Given that I don't believe there is any other memory pressure on the
> system, why isn't the ARC using that last 1G of ram?

Simon says, "don't do that" ? ;-)

> Is there some way to see how much ram is used for L2Arc management?
> Is that what the l2_hdr_size kstat measures?
>
> Is it possible to see it via 'echo "::kmastat" | mdb -k '?

I don't think so.

OK, so why are you interested in tracking this? Capacity planning? From what I can tell so far, DDT is a much more difficult beast to measure and has a more direct impact on performance :-(
 -- richard
Re: [zfs-discuss] ARC not using all available RAM?
On Sun, 20 Dec 2009, Richard Elling wrote:

>> Given that I don't believe there is any other memory pressure on the
>> system, why isn't the ARC using that last 1G of ram?
>
> Simon says, "don't do that" ? ;-)

Yes, primarily since if there is no more memory immediately available, performance when starting new processes would suck. You need to reserve some working space for processes and short term requirements.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Re: [zfs-discuss] ARC not using all available RAM?
Bob Friesenhahn wrote:
> On Sun, 20 Dec 2009, Richard Elling wrote:
>>> Given that I don't believe there is any other memory pressure on the
>>> system, why isn't the ARC using that last 1G of ram?
>>
>> Simon says, "don't do that" ? ;-)
>
> Yes, primarily since if there is no more memory immediately available,
> performance when starting new processes would suck. You need to reserve
> some working space for processes and short term requirements.

Why is that a given? There are several systems that steal from cache under memory pressure. Earlier versions of Solaris that I've dealt with a little managed with quite a bit less than 1G free. On this system, "lotsfree" is sitting at 127MB, which seems reasonable - and isn't it "lotsfree" and the related variables and page-reclaim logic that maintain that pool of free memory for new allocations?

Regards,
Tristan.
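If anyone wants to check those thresholds on their own box, a couple of ways to read them; a sketch, using the usual kstat names on this vintage of Solaris (values are in pages):

    # Paging thresholds and current free memory, in pages
    kstat -p unix:0:system_pages:lotsfree unix:0:system_pages:desfree unix:0:system_pages:freemem

    # Page size, for converting pages to bytes
    pagesize

    # The same lotsfree value straight from the running kernel
    echo "lotsfree/E" | pfexec mdb -k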
[zfs-discuss] FW: ARC not using all available RAM?
Oops, should have sent to the list...

Richard Elling wrote:
> On Dec 20, 2009, at 12:25 PM, Tristan Ball wrote:
>
>> I've got an opensolaris snv_118 machine that does nothing except
>> serve up NFS and ISCSI.
>>
>> The machine has 8G of ram, and I've got an 80G SSD as L2ARC.
>> The ARC on this machine is currently sitting at around 2G, the kernel
>> is using around 5G, and I've got about 1G free.
>
> Yes, the ARC max is set by default to 3/4 of memory or memory - 1GB,
> whichever is greater.
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/arc.c#3426

So I've read, but the ARC is using considerably less than 3/4 of memory, and 1G free is less than 1/4! On this box, c_max is about 7 gig (which is more than 3/4 anyway)?

>> I've pulled this from a combination of arc_summary.pl, and 'echo
>> "::memstat" | mdb -k'
>
> IMHO, it is easier to look at the c_max in kstat.
>     kstat -n arcstats -s c_max

You're probably right. I've been looking at those too - actually, I've just started graphing them in munin through some slightly modified munin plugins that someone wrote for BSD. :-)

>> It's my understanding that the kernel will use a certain amount of
>> ram for managing the L2Arc, and that how much is needed is dependent
>> on the size of the L2Arc and the recordsize of the zfs filesystems
>
> Yes.
>
>> I have some questions that I'm hoping the group can answer...
>>
>> Given that I don't believe there is any other memory pressure on the
>> system, why isn't the ARC using that last 1G of ram?
>
> Simon says, "don't do that" ? ;-)

Simon says lots of things. :-) It strikes me that 1G sitting free is quite a lot. I guess what I'm really asking is: given that 1G free doesn't appear to be the 1/4 of ram that the ARC will never touch, and that "c" is so much less than "c_max", why is "c" so small? :-)

>> Is there some way to see how much ram is used for L2Arc management?
>> Is that what the l2_hdr_size kstat measures?
>>
>> Is it possible to see it via 'echo "::kmastat" | mdb -k '?
>
> I don't think so.
>
> OK, so why are you interested in tracking this? Capacity planning?
> From what I can tell so far, DDT is a much more difficult beast to
> measure and has a more direct impact on performance :-(
> -- richard

Firstly, what's DDT? :-)

Secondly, it's because I'm replacing the system. The existing one was proof of concept, essentially built with decommissioned parts. I've got a new box with 32G of ram, and a little bit of money left in the budget. For that money, I could get an extra 80-200G of SSD for L2ARC, or an extra 12G of ram, or perhaps both would be a waste of money. Given the box will be awkward to touch once it's in, I'm going to err on the side of adding hardware now.

What I'm trying to find out is: is my ARC relatively small because...

1) ZFS has decided that that's all it needs (the workload is fairly random), and adding more won't gain me anything.
2) The system is using so much ram for tracking the L2ARC that the ARC is being shrunk (we've got an 8K record size).
3) There's some other memory pressure on the system that I'm not aware of that is periodically chewing up then freeing the ram.
4) There's some other memory management feature that's insisting on that 1G free.

Actually, because it'll be easier to add SSDs later than to add RAM later, I might just add the RAM now and be done with it. :-) It's not very scientific, but I don't think I've ever had a system where 2 or 3 years into its life I've not wished that I'd put more ram in to start with!

But I really am still interested in figuring out how much RAM is used for the L2ARC management in our system, because while our workload is fairly random, there are some moderately well defined hotter spots - so it might be that in 12 months a feasible upgrade to the system is to add 4 x 256G SSDs as L2ARC. It would take a while for the L2ARC to warm up, but once it was, most of those hotter areas would come from cache. However, it may be too much L2 for the system to track efficiently.

Regards,
Tristan/
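For a rough feel of what tracking a big L2ARC might cost in RAM, a back-of-the-envelope sketch; the ~200 bytes per cached buffer is an assumption I've seen quoted for builds of this era rather than a measured figure, so treat the result as an order-of-magnitude estimate only (ksh/bash arithmetic):

    HDR=200                                  # assumed bytes of ARC header per L2ARC'd buffer (approximation)
    RECORDSIZE=$((8 * 1024))                 # 8K recordsize, as used on this pool
    L2SIZE=$((4 * 256 * 1024 * 1024 * 1024)) # 4 x 256G SSDs of L2ARC

    BUFS=$((L2SIZE / RECORDSIZE))            # number of buffers the L2ARC could hold
    echo "buffers: $BUFS, approx header RAM: $((BUFS * HDR / 1024 / 1024)) MB"
    # ~134 million buffers * 200 bytes is roughly 25GB of RAM just for L2ARC tracking,
    # which is why a very large L2ARC with a small recordsize can squeeze the ARC itself.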