Re: [zfs-discuss] Disabling COMMIT at NFS level, or disabling ZIL on a per-filesystem basis
On Oct 23, 2008, at 05:40, Constantin Gonzalez wrote:
> Hi,
>
> Bob Friesenhahn wrote:
>> On Wed, 22 Oct 2008, Neil Perrin wrote:
>>> On 10/22/08 10:26, Constantin Gonzalez wrote:
>>>> 3. Disable ZIL [1]. This is of course evil, but one customer pointed
>>>> out to me that if a tar xvf were writing locally to a ZFS file system,
>>>> the writes wouldn't be synchronous either, so there's no point in
>>>> forcing NFS users to have a better availability experience at the
>>>> expense of performance.
>>
>> The conclusion reached here is quite seriously wrong and no Sun
>> employee should suggest it to a customer. If the system writing to a
>
> I'm not suggesting it to any customer. Actually, I argued quite a long
> time with the customer, trying to convince him that "slow but correct"
> is better.
>
> The conclusion above is a conscious decision by the customer. He says
> that he does not want NFS to turn any write into a synchronous write;
> he's happy if all writes are asynchronous, because in this case the NFS
> server is a backup-to-disk device, and if power fails he simply restarts
> the backup because he has the data in multiple copies anyway.
>

The case of a full backup (but not an incremental one), where an operator is monitoring that the server stays up for the full duration (or does the manual restart of the operation), seems like a singular case where this might make half sense. But as was stated, for performance, which is the goal here, it is better to use a bulk-type transfer of the data through some specific protocol (as opposed to NFS small-file manipulations). That way, a failure of the server has an immediate, obvious repercussion on the client, and things can be restarted without further coordination.

I understand also that with NFS directory delegation or exclusive mount points one could solve this NFS peculiarity (which is totally unrelated to ZFS, and not to be confused with the ZFS / SAN storage cache-flush condition).

If CIFS is not subject to the same penalty, I can only assume that the integrity of the client's view cannot be guaranteed after a server crash. Does anyone know this for sure?

-r

>> local filesystem reboots then the applications which were running are
>> also lost and will see the new filesystem state when they are
>> restarted. If an NFS server spontaneously reboots, the applications
>> on the many clients are still running and the client systems are using
>> cached data. This means that clients could do very bad things if the
>> filesystem state (as seen by NFS) is suddenly not consistent. One of
>> the joys of NFS is that the client continues unhindered once the
>> server returns.
>
> Yes, we're both aware of this. In this particular situation, the
> customer would restart his backup job (and thus the client application)
> in case the server dies.
>
> Thanks for pointing out the difference, this is indeed an important
> distinction.
>
> Cheers,
>    Constantin
>
> --
> Constantin Gonzalez                  Sun Microsystems GmbH, Germany
> Principal Field Technologist         http://blogs.sun.com/constantin
> Tel.: +49 89/4 60 08-25 91           http://google.com/search?q=constantin+gonzalez
>
> Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
> Amtsgericht Muenchen: HRB 161028
> Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
> Vorsitzender des Aufsichtsrates: Martin Haering
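For context on the subject line, a minimal sketch of the two knobs being discussed, with 'tank/backup' as a placeholder dataset name. In 2008-era builds the ZIL can only be disabled globally, via an /etc/system tunable; later builds add a per-dataset property instead:

   # /etc/system (global -- affects every pool and dataset; requires a reboot):
   set zfs:zil_disable = 1

   # Later builds expose a per-dataset 'sync' property, which is what the
   # subject line asks for:
   zfs set sync=disabled tank/backup
   zfs get sync tank/backup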
Re: [zfs-discuss] diagnosing read performance problem
Bob Friesenhahn wrote:
> Other people on this list who experienced the exact same problem
> ultimately determined that the problem was with the network card. I
> recall that Intel NICs were the recommended solution.
>
> Note that 100 Mbit is now considered to be a slow link and PCI is also
> considered to be slow.

Thanks for the reply. Yes, I understand that 100 Mbit and PCI are a bit outdated; unfortunately I'm still campaigning to have our switches upgraded to 1 Gbit or 10 Gbit. I will see if I can acquire an Intel NIC to test with; however, before the problems with the NICs started it was operating fine.

It seems, though, that there is an ongoing problem with NICs on this machine. The onboard ones haven't so much died (they still let me use them from the OS), but they just won't start up or accept that there is a cable plugged in. The PCI NIC does seem to be working, and transfers to/from the server seem OK except when video is being moved.

I will do some testing and see if I can come up with a more definite reason for the performance problems.

Thanks

Matt
Re: [zfs-discuss] sata on sparc
On Fri, Oct 24, 2008 at 01:03:32PM -0700, John-Paul Drawneek wrote:
> you have to buy an LSI SAS card.

Which work *great*, by the way.

> not cheap - around 100 GBP

Really? I picked mine up for US$68 including shipping and the SAS<->SATA breakout cable. Needless to say, I haven't looked at prices lately.

-brian

--
"Coding in C is like sending a 3 year old to do groceries. You gotta tell
them exactly what you want or you'll end up with a cupboard full of pop
tarts and pancake mix." -- IRC User (http://www.bash.org/?841435)
Re: [zfs-discuss] diagnosing read performance problem
On Sat, 25 Oct 2008, Matt Harrison wrote:
>
> The onboard ones haven't so much died (they still allow me to use them
> from the OS) but they just won't start up or accept there is a cable
> plugged in. The PCI nic does seem to be working and transfers to/from
> the server seem ok except when there's video being moved.

Hmmm, this may indicate that there is an ethernet cable problem. Use 'netstat -I interface' (where interface is the interface name shown by 'ifconfig -a') to see if the interface error count is increasing. If you are using a "smart" switch, use the switch administrative interface and see if the error count is increasing for the attached switch port. Unfortunately your host can only see errors for packets it receives, and it may be that errors are occurring for packets it sends.

If the ethernet cable is easy to replace, then it may be easiest to simply replace it and use a different switch port to see if the problem just goes away.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
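A minimal sketch of that check; 'rtls0' is just an example interface name, so substitute whatever 'ifconfig -a' reports:

   ifconfig -a                 # find the interface name
   netstat -i -I rtls0         # one-shot: look at the Ierrs/Oerrs/Collis columns
   netstat -i -I rtls0 5       # repeat every 5 seconds while copying a large file
   kstat -p | grep -i err      # raw per-driver error counters, where the driver exports them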
Re: [zfs-discuss] diagnosing read performance problem
On Sat, Oct 25, 2008 at 11:10:42AM -0500, Bob Friesenhahn wrote:
> Hmmm, this may indicate that there is an ethernet cable problem. Use
> 'netstat -I interface' (where interface is the interface name shown by
> 'ifconfig -a') to see if the interface error count is increasing. If you
> are using a "smart" switch, use the switch administrative interface and see
> if the error count is increasing for the attached switch port.
> Unfortunately your host can only see errors for packets it receives and it
> may be that errors are occurring for packets it sends.
>
> If the ethernet cable is easy to replace, then it may be easiest to simply
> replace it and use a different switch port to see if the problem just goes
> away.

OK, I've just tried two other cables. One doesn't even get a link light, so it's probably dead. The other one I had suspected was bad, and indeed the connection is terrible and the Oerr field in netstat does increase.

On the other hand, the Oerr field doesn't increase with the original cable, yet the video performance is still bad (although not as bad as with the second replacement cable).

I will make up some new cables, and also place an order for an Intel Pro/100, as they are supposed to be really reliable.

Thanks

Matt
Re: [zfs-discuss] Verify files' checksums
Richard Elling <[EMAIL PROTECTED]> wrote:
> Marcus Sundman wrote:
> > How can I verify the checksums for a specific file?
>
> ZFS doesn't checksum files.

AFAIK ZFS checksums all data, including the contents of files.

> So a file does not have a checksum to verify.

I wrote "checksums" (plural) for a "file" (singular).

- Marcus
Re: [zfs-discuss] Verify files' checksums
On Sat, Oct 25, 2008 at 6:49 PM, Marcus Sundman <[EMAIL PROTECTED]> wrote:
> Richard Elling <[EMAIL PROTECTED]> wrote:
> > Marcus Sundman wrote:
> > > How can I verify the checksums for a specific file?
> >
> > ZFS doesn't checksum files.
>
> AFAIK ZFS checksums all data, including the contents of files.
>
> > So a file does not have a checksum to verify.
>
> I wrote "checksums" (plural) for a "file" (singular).

Ah - then you DO mean the ZFS built-in data checksumming - my mistake.

ZFS checksums allocations (blocks), not files. The checksum for each block is stored in the parent of that block. These are not shown to you, but you can "scrub" the pool, which makes ZFS run through all the allocations, checking whether the checksums are valid.

This PDF document is quite old but explains it fairly well:
http://www.google.co.za/url?sa=t&source=web&ct=res&cd=1&url=http%3A%2F%2Fru.sun.com%2Ftechdays%2Fpresents%2FSolaris%2Fhow_zfs_works.pdf&ei=f3EDSbnjB5iQQbG2wIIC&usg=AFQjCNG8qtO3bFgmD11izooR7SVbiSOI2A&sig2=-EHfv5Puqz8dxkANISionQ

What is not expressly stated in the document is that the ZFS allocation structure stores the POSIX layer and file data in the leaf nodes of the tree.

Cheers,
  _hartz

--
Any sufficiently advanced technology is indistinguishable from magic.
   Arthur C. Clarke

My blog: http://initialprogramload.blogspot.com
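A minimal sketch of kicking off and checking such a scrub, with 'tank' as a placeholder pool name:

   zpool scrub tank        # walk every allocation in the pool and verify its checksum
   zpool status -v tank    # shows scrub progress, per-device error counts, and any
                           # files with permanent (unrepairable) errors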
Re: [zfs-discuss] Verify files' checksums
"Johan Hartzenberg" <[EMAIL PROTECTED]> wrote: > On Sat, Oct 25, 2008 at 6:49 PM, Marcus Sundman <[EMAIL PROTECTED]> > wrote: > > Richard Elling <[EMAIL PROTECTED]> wrote: > > > Marcus Sundman wrote: > > > > How can I verify the checksums for a specific file? > > > > > > ZFS doesn't checksum files. > > > > AFAIK ZFS checksums all data, including the contents of files. > > > > > So a file does not have a checksum to verify. > > > > I wrote "checksums" (plural) for a "file" (singular). > > > > AH - Then you DO mean the ZFS built-in data check-summing - my > mistake. ZFS checksums allocations (blocks), not files. The checksum > for each block is stored in the parent of that block. These are not > shown to you but you can "scrub" the pool, which will see zfs run > through all the allocations, checking whether the checksums are valid. I don't want to scrub several TiB of data just to verify a 2 MiB file. I want to verify just the data of that file. (Well, I don't mind also verifying whatever other data happens to be in the same blocks.) > This PDF document is quite old but explains it fairly well: I couldn't see anything there describing either how to verify the checksums of individual files or why that would be impossible. OK, since there seems to be some confusion about what I mean, maybe I should describe the actual problems I'm trying to solve: 1) When I notice an error in a file that I've copied from a ZFS disk I want to know whether that error is also in the original file on my ZFS disk or if it's only in the copy. 2) Before I destroy an old backup copy of a file I want to know that the other copy, which is on a ZFS disk, is still OK (at least at that very moment). Naturally I could calculate new checksums for all files in question and compare the checksums, but for reasons I won't go into now this is not as feasible as it might seem, and obviously less efficient. Up to now I've been storing md5sums for all files, but keeping the files and their md5sums synchronized is a burden I could do without. Cheers, Marcus ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Verify files' checksums
On Sat, Oct 25, 2008 at 1:57 PM, Marcus Sundman <[EMAIL PROTECTED]> wrote:
> I don't want to scrub several TiB of data just to verify a 2 MiB file. I
> want to verify just the data of that file. (Well, I don't mind also
> verifying whatever other data happens to be in the same blocks.)

Just read the file. If the checksum is valid, then it'll read without problems. If it's invalid, then it'll be rebuilt (if you have redundancy in your pool) or you'll get I/O errors (if you don't).

Scott
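A minimal sketch of this, with placeholder pool and file names:

   dd if=/tank/data/photo.jpg of=/dev/null bs=128k && echo "read OK"
   zpool status -v tank    # if the read failed, files with permanent errors are listed here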
Re: [zfs-discuss] Verify files' checksums
Marcus Sundman wrote:
>
> I couldn't see anything there describing either how to verify the
> checksums of individual files or why that would be impossible.

If you can read the file, the checksum is OK. If it were not, you would get an I/O error attempting to read it.

--
Ian.
Re: [zfs-discuss] Verify files' checksums
"Scott Laird" <[EMAIL PROTECTED]> wrote: > On Sat, Oct 25, 2008 at 1:57 PM, Marcus Sundman <[EMAIL PROTECTED]> > wrote: > > I don't want to scrub several TiB of data just to verify a 2 MiB > > file. I want to verify just the data of that file. (Well, I don't > > mind also verifying whatever other data happens to be in the same > > blocks.) > > Just read the file. If the checksum is valid, then it'll read without > problems. If it's invalid, then it'll be rebuilt (if you have > redundancy in your pool) or you'll get I/O errors (if you don't). So what you're trying to say is "cat the file to /dev/null and check for I/O errors", right? And how do I check for I/O errors? Should I run "zpool status -v" and see if the file in question is listed there? Cheers, Marcus ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Verify files' checksums
Ian Collins <[EMAIL PROTECTED]> wrote:
> Marcus Sundman wrote:
> > I couldn't see anything there describing either how to verify the
> > checksums of individual files or why that would be impossible.
>
> If you can read the file, the checksum is OK. If it were not, you
> would get an I/O error attempting to read it.

Are these I/O errors written to stdout or stderr or where?

Regards,
Marcus
Re: [zfs-discuss] Verify files' checksums
Marcus Sundman wrote:
> Ian Collins <[EMAIL PROTECTED]> wrote:
>
>> Marcus Sundman wrote:
>>
>>> I couldn't see anything there describing either how to verify the
>>> checksums of individual files or why that would be impossible.
>>
>> If you can read the file, the checksum is OK. If it were not, you
>> would get an I/O error attempting to read it.
>
> Are these I/O errors written to stdout or stderr or where?

Yes, stderr. You will not be able to open the file.

One of the great benefits of ZFS is that you don't have to manually verify checksums of files on disk. Unless you want to make sure they haven't been maliciously altered, that is.

--
Ian.
Re: [zfs-discuss] Verify files' checksums
Ian Collins <[EMAIL PROTECTED]> wrote:
> Marcus Sundman wrote:
> > Are these I/O errors written to stdout or stderr or where?
>
> Yes, stderr.

OK, good, thanks.

> You will not be able to open the file.

What?! Even if there are errors I want to still be able to read the file, to salvage what can be salvaged. E.g., if one byte in a picture file is wrong, then it's quite likely I can still use the picture. If ZFS denies access to the whole file, or even to the whole block containing the error, then the whole file is ruined. That's very bad. Are you sure there is no way to read the file anyway?

> One of the great benefits of ZFS is you don't have to manually verify
> checksums of files on disk. Unless you want to make sure they haven't
> been maliciously altered that is.

Malicious alteration is not the only source of unwanted changes to a disk.

Cheers,
Marcus
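One generic salvage approach (not ZFS-specific, and only a sketch with a placeholder path): dd can be told to keep going past read errors, substituting zeros for the unreadable blocks, so the intact parts of the file are still recovered:

   dd if=/tank/photos/broken.jpg of=/var/tmp/salvaged.jpg bs=128k conv=noerror,sync
   # conv=noerror : continue after a read error instead of aborting
   # conv=sync    : pad each failed block with zeros so file offsets stay aligned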
Re: [zfs-discuss] ZFS + OpenSolaris for home NAS?
On Thu, Oct 23, 2008 at 4:04 PM, Peter Bridge <[EMAIL PROTECTED]> wrote:
> I'm looking to buy some new hardware to build a home ZFS based NAS. I know
> ZFS can be quite CPU/mem hungry and I'd appreciate some opinions on the
> following combination:
>
> Intel Essential Series D945GCLF2
> Kingston ValueRAM DIMM 2GB PC2-5300U CL5 (DDR2-667) (KVR667D2N5/2G)
>
> Firstly, does it sound like a reasonable combination to run OpenSolaris?
>
> Will Solaris make use of both processors? / all cores?
>
> Is it going to be enough power to run ZFS?
>
> I read that ZFS prefers 64bit, but it's not clear to me if the above board
> will provide 64bit support.
>
> Also I already have 2 SATA II disks to throw in (using both onboard SATA II
> ports), but ideally I would like to add an OS-suitable PCI SATA card to add
> maybe another 4 disks. Any suggestions on a suitable card please?

-- quoting myself in another (possibly off-topic) post ---

I've tested OpenSolaris build 98, BeleniX 0.7.1 and os20080501 on the Intel D945GCLF2 dual-core 1.6 GHz Atom Mini-ITX board. Note the "2" at the end of the part number - this indicates the dual-core Atom CPU. All run fine, and this board supports a single 2 GB DIMM. It's a little slow if you're building a desktop box, but fine if you're just doing lightweight browsing, word processing etc. Note that the board chipset consumes more power than the Atom CPU; a typical system based on this board will consume around 55 watts. The other good news: this board costs about $80 (including the soldered-in CPU). Just add a 2 GB DIMM and an IDE drive and you're up and running!

-- end of quote - save time typing! --

This is a great board - but a step backwards in terms of total CPU horsepower, maximum memory size and expansion capability. It's 32-bit. Would I recommend it for ZFS - no. Is it future proof - no.

You have not described your requirements (low power? low cost?). But I'll contribute some pointers anyway! :)

See this article, entitled "G31 And E7200: The Real Low-Power Story" (October 10, 2008, Motherboards), at:
http://www.tomshardware.com/reviews/intel-e7200-g31,2039.html

The E7200 dual-core (2.53 GHz with 3 MB of cache) is a "sleeper" product IMHO. Low power (well below the published 65 W power envelope), plenty of grunt and priced to go. Couple this chip with 4 or 8 GB of RAM and you have a winner.

For example, consider the "mid tier" system here:
http://www.techreport.com/articles.x/15737/5
(the motherboard is $126) with an E7200 CPU and 2 memory kits from here:
http://www.amazon.com/s/ref=nb_ss_gw?url=search-alias%3Daps&field-keywords=KVR800D2K2%2F4GR&x=0&y=0

Also, take a look at:
http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=2010170147%201052108080%201052420643%201052315794%201052516065&name=5
- look at the pricing *after* rebates - and you're looking at brand-name memory (2 * 2 GB = 4 GB total) for $65 here:
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227298

With ZFS, the most important hardware component is RAM. Get as much RAM as your motherboard will support (within any budgetary constraints). My advice is the E7200 CPU and 8 GB of RAM, and you'll have a smile on your face every time you use this system.

If you want a small system that is pre-built, look at every possible permutation/combination of the Dell Vostro 200 box. Yes - I just put together a system based on this box and made a few "modifications" - like replacing the PSU with a Corsair VX450W, adding 4 * 1 GB of RAM and an ATI Radeon 4850 (BTW, Nvidia is much better supported under OpenSolaris). This system was built as a cost-effective gamer box - but it would make a great ZFS box for 2 to 4 SATA drives (with the upgrades listed above [minus the graphics card]). Email me offline if I can answer any further questions.

PS: It'll probably take you 2 or 3 hours to evaluate every possible combination of the Dell Vostro 200 box - but the price/performance is unbeatable and it's hard to put together a comparable system, from parts, for less money. Obviously Dell gets Intel processors for way less than you and I do.

Regards,

-- Al Hopper  Logical Approach Inc, Plano, TX  [EMAIL PROTECTED]
   Voice: 972.379.2133  Timezone: US CDT
   OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
   http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
Re: [zfs-discuss] diagnosing read performance problem
Hi Matt

What chipset is your PCI network card? (Obviously it's not Intel, but what is it?) Do you know which driver the card is using?

You say "the system was fine for a couple of weeks". At that point did you change any software - do any updates or upgrades? For instance, did you upgrade to a new build of OpenSolaris? If not, then I would guess it's some sort of hardware problem. Can you try different cables and a different switch? Anything in the path between client & server is suspect. A mismatch of Ethernet duplex settings can cause problems - are you sure this is OK?

To get an idea of how the network is running, try this: on the Solaris box, do an Ethernet capture with 'snoop' to a file.
http://docs.sun.com/app/docs/doc/819-2240/snoop-1m?a=view

 # snoop -d {device} -o {filename}

... then, while capturing, try to play your video file over the network. Control-C to stop the capture.

You can then use Ethereal or Wireshark to analyze the capture file. On the 'Analyze' menu, select 'Expert Info'. This will look through all the packets and report any warnings or errors it sees.

Regards
Nigel Smith
--
This message posted from opensolaris.org
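A sketch of checking the speed/duplex question from the Solaris side; exact command availability varies by build, and the output fields depend on the driver:

   dladm show-dev                              # older builds: per-device link state, speed and duplex
   kstat -p | egrep 'link_duplex|link_speed'   # raw driver kstats, where the driver exports them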
[zfs-discuss] recommendations on adding vdev to raidz zpool
I have a 7 x 150 GB drive (+1 spare) raidz pool that I need to expand. There are 6 open drive bays, so I bought six 300 GB drives and went to add them as a raidz vdev to the existing zpool, but I didn't realize the raidz vdevs needed to have the same number of drives. (Why is that?)

My plan now is to create a 5-disk + 1-spare raidz1 vdev with the new drives, clone all the data over to those, then wipe out the old pool and create another 5 + 1 raidz1 to add to the pool.

Is this the best way to do the upgrade, or am I overlooking something?

Thanks for the advice!
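A minimal sketch of the migration being described, using placeholder pool and device names and assuming a build recent enough to support 'zfs send -R':

   # new 5-disk raidz1 vdev plus a hot spare (device names are placeholders)
   zpool create newtank raidz1 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 spare c2t5d0

   # copy everything over from a recursive snapshot so the copy is consistent
   zfs snapshot -r oldtank@migrate
   zfs send -R oldtank@migrate | zfs recv -Fd newtank

   # after verifying the copy: destroy the old pool, then add its disks back
   # as a second raidz1 top-level vdev in the new pool
   zpool destroy oldtank
   zpool add newtank raidz1 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0
   zpool add newtank spare c1t5d0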