Re: [zfs-discuss] Disabling COMMIT at NFS level, or disabling ZIL on a per-filesystem basis

2008-10-25 Thread Roch Bourbonnais

On 23 Oct 2008, at 05:40, Constantin Gonzalez wrote:

> Hi,
>
> Bob Friesenhahn wrote:
>> On Wed, 22 Oct 2008, Neil Perrin wrote:
>>> On 10/22/08 10:26, Constantin Gonzalez wrote:
>>>> 3. Disable ZIL[1]. This is of course evil, but one customer pointed
>>>> out to me that if a tar xvf were writing locally to a ZFS file
>>>> system, the writes wouldn't be synchronous either, so there's no
>>>> point in forcing NFS users to have a better availability experience
>>>> at the expense of performance.
>>
>> The conclusion reached here is quite seriously wrong and no Sun
>> employee should suggest it to a customer.  If the system writing to a
>
> I'm not suggesting it to any customer. Actually, I argued quite a  
> long time
> with the customer, trying to convince him that "slow but correct" is  
> better.
>
> The conclusion above is a conscious decision by the customer. He  
> says that he
> does not want NFS to turn any write into a synchronous write, he's  
> happy if
> all writes are asynchronous, because in this case the NFS server is  
> a backup to
> disk device and if power fails he simply restarts the backup 'cause  
> he has the
> data in multiple copies anyway.
>

The case of a full backup (but not an incremental one), where an operator is
monitoring that the server stays up for the full duration (or manually
restarts the operation), seems like the one singular case where this might
make half sense.

But as was stated, for performance, which is the goal here, it is better to
use a bulk-type transfer of the data through some dedicated protocol (as
opposed to NFS small-file manipulation). What this buys you is that a failure
of the server has an immediate, obvious repercussion on the client, and
things can be restarted without further coordination.

I also understand that with NFS directory delegation or exclusive mount
points one could address this NFS peculiarity (which is totally unrelated to
ZFS, and not to be confused with the ZFS / SAN storage cache-flush condition).
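
For completeness: the only knob I know of today is the global zil_disable
tunable; there is no per-filesystem control, so it affects every dataset on
the host, not just the NFS-exported ones. A rough sketch of what such a setup
involves (with all the usual warnings attached):

In /etc/system (takes effect at the next boot):

     set zfs:zil_disable = 1

Or on a live system, affecting filesystems (re)mounted after the change:

     # echo zil_disable/W0t1 | mdb -kw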


If CIFS is not subject to the same penalty, I can only assume that the
integrity of the client's view cannot be guaranteed after a server
crash.
Does anyone know this for sure?
-r


>> local filesystem reboots then the applications which were running are
>> also lost and will see the new filesystem state when they are
>> restarted.  If an NFS server spontaneously reboots, the applications
>> on the many clients are still running and the client systems are  
>> using
>> cached data.  This means that clients could do very bad things if the
>> filesystem state (as seen by NFS) is suddenly not consistent.  One of
>> the joys of NFS is that the client continues unhindered once the
>> server returns.
>
> Yes, we're both aware of this. In this particular situation, the  
> customer
> would restart his backup job (and thus the client application) in  
> case the
> server dies.
>
> Thanks for pointing out the difference, this is indeed an important  
> distinction.
>
> Cheers,
>   Constantin
>
> -- 
> Constantin Gonzalez  Sun Microsystems  
> GmbH, Germany
> Principal Field Technologist
> http://blogs.sun.com/constantin
> Tel.: +49 89/4 60 08-25 91   
> http://google.com/search?q=constantin+gonzalez
>
> Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim- 
> Heimstetten
> Amtsgericht Muenchen: HRB 161028
> Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Dr. Roland  
> Boemer
> Vorsitzender des Aufsichtsrates: Martin Haering



Re: [zfs-discuss] diagnosing read performance problem

2008-10-25 Thread Matt Harrison
Bob Friesenhahn wrote:
> Other people on this list who experienced the exact same problem
> ultimately determined that the problem was with the network card.  I
> recall that Intel NICs were the recommended solution.
> 
> Note that 100MBit is now considered to be a slow link and PCI is also
> considered to be slow.

Thanks for the reply,

Yes, I understand that 100 Mbit and PCI are a bit outdated; unfortunately
I'm still campaigning to have our switches upgraded to gigabit or 10 gigabit.

I will see if I can acquire an Intel NIC to test with; however, before the
problem with the NICs started, it was operating fine. It does seem, though,
that there is an ongoing problem with NICs on this machine.

The onboard ones haven't so much died (they still allow me to use them
from the OS), but they just won't come up or accept that there is a cable
plugged in. The PCI NIC does seem to be working, and transfers to/from
the server seem OK except when video is being moved.

I will do some testing and see if I can come up with a more definite
cause for the performance problems.

Thanks

Matt


Re: [zfs-discuss] sata on sparc

2008-10-25 Thread Brian Hechinger
On Fri, Oct 24, 2008 at 01:03:32PM -0700, John-Paul Drawneek wrote:
> you have to buy an LSI SAS card.

Which work *great* by the way.

> not cheap - around 100 GBP

Really?  I picked mine up for $68US including shipping and the SAS<->SATA
breakout cable.  Needless to say I haven't looked at prices lately.

-brian
-- 
"Coding in C is like sending a 3 year old to do groceries. You gotta
tell them exactly what you want or you'll end up with a cupboard full of
pop tarts and pancake mix." -- IRC User (http://www.bash.org/?841435)


Re: [zfs-discuss] diagnosing read performance problem

2008-10-25 Thread Bob Friesenhahn
On Sat, 25 Oct 2008, Matt Harrison wrote:
>
> The onboard ones haven't so much died (they still allow me to use them
> from the OS) but they just won't start up or accept there is a cable
> plugged in. The PCI nic does seem to be working and transfers to/from
> the server seem ok except when there's video being moved.

Hmmm, this may indicate that there is an Ethernet cable problem.  Use 
'netstat -I interface' (where interface is the interface name shown by 
'ifconfig -a') to see if the interface error count is increasing.  If 
you are using a "smart" switch, use the switch administrative interface 
and see if the error count is increasing for the attached switch port. 
Unfortunately your host can only see errors for packets it receives, 
and it may be that errors are occurring for packets it sends.

If the ethernet cable is easy to replace, then it may be easiest to 
simply replace it and use a different switch port to see if the 
problem just goes away.
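
For example, assuming the interface turns out to be called e1000g0 
(substitute whatever name 'ifconfig -a' actually reports):

 # ifconfig -a
 # netstat -I e1000g0 5

The second command prints packet and error counters every 5 seconds; the 
error columns should stay at (or very close to) zero while you copy a 
video file across.  A steadily climbing count points at the cable, the 
switch port, or the NIC itself.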

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/



Re: [zfs-discuss] diagnosing read performance problem

2008-10-25 Thread Matt Harrison
On Sat, Oct 25, 2008 at 11:10:42AM -0500, Bob Friesenhahn wrote:
> Hmmm, this may indicate that there is an Ethernet cable problem.  Use 
> 'netstat -I interface' (where interface is the interface name shown by 
> 'ifconfig -a') to see if the interface error count is increasing.  If you 
> are using a "smart" switch, use the switch administrative interface and see 
> if the error count is increasing for the attached switch port. 
> Unfortunately your host can only see errors for packets it receives, and it 
> may be that errors are occurring for packets it sends.
>
> If the ethernet cable is easy to replace, then it may be easiest to simply 
> replace it and use a different switch port to see if the problem just goes 
> away.

OK, I've just tried two other cables; one doesn't even get a link light, so
it's probably dead. The other one I had suspected was bad, and indeed the
connection is terrible and the Oerr field in netstat does increase.

On the other hand, the Oerr field doesn't increase with the original cable;
however, the video performance is still bad (although not as bad as with the
second replacement cable).

I will make up some new cables, and also place an order for an Intel
Pro/100, as they are supposed to be really reliable.

Thanks

Matt




Re: [zfs-discuss] Verify files' checksums

2008-10-25 Thread Marcus Sundman
Richard Elling <[EMAIL PROTECTED]> wrote:
> Marcus Sundman wrote:
> > How can I verify the checksums for a specific file?
> 
> ZFS doesn't checksum files.

AFAIK ZFS checksums all data, including the contents of files.

> So a file does not have a checksum to verify.

I wrote "checksums" (plural) for a "file" (singular).


- Marcus


Re: [zfs-discuss] Verify files' checksums

2008-10-25 Thread Johan Hartzenberg
On Sat, Oct 25, 2008 at 6:49 PM, Marcus Sundman <[EMAIL PROTECTED]> wrote:

> Richard Elling <[EMAIL PROTECTED]> wrote:
> > Marcus Sundman wrote:
> > > How can I verify the checksums for a specific file?
> >
> > ZFS doesn't checksum files.
>
> AFAIK ZFS checksums all data, including the contents of files.
>
> > So a file does not have a checksum to verify.
>
> I wrote "checksums" (plural) for a "file" (singular).
>

AH - then you DO mean the ZFS built-in data checksumming - my mistake.  ZFS
checksums allocations (blocks), not files. The checksum for each block is
stored in the parent of that block.  These are not shown to you, but you can
"scrub" the pool, which will have ZFS run through all the allocations,
checking whether the checksums are valid.
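
For example ('tank' is just a placeholder pool name):

 # zpool scrub tank
 # zpool status -v tank

The status output shows the scrub progress and, once it has completed, 
lists any files whose blocks failed their checksums and could not be 
repaired from redundancy.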

This PDF document is quite old but explains it fairly well:
http://www.google.co.za/url?sa=t&source=web&ct=res&cd=1&url=http%3A%2F%2Fru.sun.com%2Ftechdays%2Fpresents%2FSolaris%2Fhow_zfs_works.pdf&ei=f3EDSbnjB5iQQbG2wIIC&usg=AFQjCNG8qtO3bFgmD11izooR7SVbiSOI2A&sig2=-EHfv5Puqz8dxkANISionQ

What is not expressly stated there is that the ZFS allocation structure
stores the POSIX layer and file data in the leaf nodes of the tree.

Cheers,
  _hartz

-- 
Any sufficiently advanced technology is indistinguishable from magic.
   Arthur C. Clarke

My blog: http://initialprogramload.blogspot.com


Re: [zfs-discuss] Verify files' checksums

2008-10-25 Thread Marcus Sundman
"Johan Hartzenberg" <[EMAIL PROTECTED]> wrote:
> On Sat, Oct 25, 2008 at 6:49 PM, Marcus Sundman <[EMAIL PROTECTED]>
> wrote:
> > Richard Elling <[EMAIL PROTECTED]> wrote:
> > > Marcus Sundman wrote:
> > > > How can I verify the checksums for a specific file?
> > >
> > > ZFS doesn't checksum files.
> >
> > AFAIK ZFS checksums all data, including the contents of files.
> >
> > > So a file does not have a checksum to verify.
> >
> > I wrote "checksums" (plural) for a "file" (singular).
> >
> 
> AH - then you DO mean the ZFS built-in data checksumming - my
> mistake.  ZFS checksums allocations (blocks), not files. The checksum
> for each block is stored in the parent of that block.  These are not
> shown to you, but you can "scrub" the pool, which will have ZFS run
> through all the allocations, checking whether the checksums are valid.

I don't want to scrub several TiB of data just to verify a 2 MiB file. I
want to verify just the data of that file. (Well, I don't mind also
verifying whatever other data happens to be in the same blocks.)

> This PDF document is quite old but explains it fairly well:

I couldn't see anything there describing either how to verify the
checksums of individual files or why that would be impossible.

OK, since there seems to be some confusion about what I mean, maybe I
should describe the actual problems I'm trying to solve:

1) When I notice an error in a file that I've copied from a ZFS disk I
want to know whether that error is also in the original file on my ZFS
disk or if it's only in the copy.

2) Before I destroy an old backup copy of a file I want to know that the
other copy, which is on a ZFS disk, is still OK (at least at that very
moment).

Naturally I could calculate new checksums for all files in question and
compare the checksums, but for reasons I won't go into now this is not
as feasible as it might seem, and obviously less efficient.

Up to now I've been storing md5sums for all files, but keeping the
files and their md5sums synchronized is a burden I could do without.


Cheers,

Marcus


Re: [zfs-discuss] Verify files' checksums

2008-10-25 Thread Scott Laird
On Sat, Oct 25, 2008 at 1:57 PM, Marcus Sundman <[EMAIL PROTECTED]> wrote:
> I don't want to scrub several TiB of data just to verify a 2 MiB file. I
> want to verify just the data of that file. (Well, I don't mind also
> verifying whatever other data happens to be in the same blocks.)

Just read the file.  If the checksum is valid, then it'll read without
problems.  If it's invalid, then it'll be rebuilt (if you have
redundancy in your pool) or you'll get I/O errors (if you don't).
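
Something along these lines, say (the path and pool name are just 
examples):

 $ dd if=/tank/media/file.img of=/dev/null bs=128k ; echo $?
 $ zpool status -v tank

If dd finishes with exit status 0, every block of the file read back 
with a valid checksum (or, with redundancy, was silently repaired).  An 
I/O error from dd, or the file turning up in 'zpool status -v', means a 
checksum failure that couldn't be repaired.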


Scott


Re: [zfs-discuss] Verify files' checksums

2008-10-25 Thread Ian Collins
Marcus Sundman wrote:
>
> I couldn't see anything there describing either how to verify the
> checksums of individual files or why that would be impossible.
If you can read the file, the checksum is OK. If it were not, you would
get an I/O error attempting to read it.

-- 
Ian.



Re: [zfs-discuss] Verify files' checksums

2008-10-25 Thread Marcus Sundman
"Scott Laird" <[EMAIL PROTECTED]> wrote:
> On Sat, Oct 25, 2008 at 1:57 PM, Marcus Sundman <[EMAIL PROTECTED]>
> wrote:
> > I don't want to scrub several TiB of data just to verify a 2 MiB
> > file. I want to verify just the data of that file. (Well, I don't
> > mind also verifying whatever other data happens to be in the same
> > blocks.)
> 
> Just read the file.  If the checksum is valid, then it'll read without
> problems.  If it's invalid, then it'll be rebuilt (if you have
> redundancy in your pool) or you'll get I/O errors (if you don't).

So what you're trying to say is "cat the file to /dev/null and check
for I/O errors", right? And how do I check for I/O errors? Should I run
"zpool status -v" and see if the file in question is listed there?

Cheers,
Marcus


Re: [zfs-discuss] Verify files' checksums

2008-10-25 Thread Marcus Sundman
Ian Collins <[EMAIL PROTECTED]> wrote:
> Marcus Sundman wrote:
> > I couldn't see anything there describing either how to verify the
> > checksums of individual files or why that would be impossible.
> 
> If you can read the file, the checksum is OK. If it were not, you
> would get an I/O error attempting to read it.

Are these I/O errors written to stdout or stderr or where?

Regards,
Marcus


Re: [zfs-discuss] Verify files' checksums

2008-10-25 Thread Ian Collins
Marcus Sundman wrote:
> Ian Collins <[EMAIL PROTECTED]> wrote:
>   
>> Marcus Sundman wrote:
>> 
>>> I couldn't see anything there describing either how to verify the
>>> checksums of individual files or why that would be impossible.
>>>   
>> If you can read the file, the checksum is OK. If it were not, you
>> would get an I/O error attempting to read it.
>> 
>
> Are these I/O errors written to stdout or stderr or where?
>
>   
Yes, stderr. You will not be able to open the file.

One of the great benefits of ZFS is that you don't have to manually verify
checksums of files on disk. Unless you want to make sure they haven't
been maliciously altered, that is.

-- 
Ian.



Re: [zfs-discuss] Verify files' checksums

2008-10-25 Thread Marcus Sundman
Ian Collins <[EMAIL PROTECTED]> wrote:
> Marcus Sundman wrote:
> > Are these I/O errors written to stdout or stderr or where?
> 
> Yes, stderr.

OK, good, thanks.

> You will not be able top open the file.

What?! Even if there are errors I want to still be able to read the
file to salvage what can be salvaged. E.g., if one byte in a picture
file is wrong then it's quite likely I can still use the picture. If
ZFS denies access to the whole file, or even to the whole block with
the error, then the whole file is ruined. That's very bad. Are you sure
there is no way to read the file anyway?

> One of the great benefits of ZFS is you don't have to manually verify
> checksums of files on disk. Unless you want to make sure they haven't
> been maliciously altered that is.

Malicious alteration is not the only way for unwanted changes to a disk.


Cheers,

Marcus


Re: [zfs-discuss] ZFS + OpenSolaris for home NAS?

2008-10-25 Thread Al Hopper
On Thu, Oct 23, 2008 at 4:04 PM, Peter Bridge <[EMAIL PROTECTED]> wrote:
> I'm looking to buy some new hardware to build a home ZFS based NAS.  I know 
> ZFS can be quite CPU/mem hungry and I'd appreciate some opinions on the 
> following combination:
>
> Intel Essential Series D945GCLF2
> Kingston ValueRAM DIMM 2GB PC2-5300U CL5 (DDR2-667) (KVR667D2N5/2G)
>
> Firstly, does it sound like a reasonable combination to run OpenSolaris?
>
> Will Solaris make use of both processors? / all cores?
>
> Is it going to be enough power to run ZFS?
>
> I read that ZFS prefers 64bit, but it's not clear to me if the above board 
> will provide 64bit support.
>
> Also I already have 2 SATA II disks to throw in (using both onboard SATA II 
> ports), but ideally I would like to add a OS suitable PCI SATA card to add 
> maybe another 4 disks.  Any suggestions on a suitable card please?
>

-- quoting myself in another (possibly off-topic post) ---
I've tested OpenSolaris build 98, Belenix 0.7.1 and os20080501 on the
Intel D945GCLF2 Dual Core 1.6GHz Atom Mini-ITX Board.  Note the "2" at
the end of the part number - this indicates the dual-core Atom CPU.
All run fine and this board supports a single 2GB DIMM.  It's a little
slow if you're building a desktop box, but fine if you're just doing
lightweight browsing, word processing etc.  Note that the board
chipset consumes more power than the Atom CPU.  A typical system based
on this board will consume around 55 Watts.   The other good news -
this board costs about $80 (including the soldered-in CPU).  Just add
a 2GB DIMM and an IDE drive and you're up and running!
-- end of quote - save time typing!  --

This is a great board - but a step backwards in terms of total CPU
horsepower, max memory size and expansion capability.  It's 32-bit.
Would I recommend it for ZFS - no.  Is it future proof - no.

You have not described your requirements (low-power ??, low-cost ??).
But I'll contribute some pointers anyway!  :)

See this article entitled: G31 And E7200: The Real Low-Power Story
October 10, 2008 – 1:50 AM – Motherboards at:
http://www.tomshardware.com/reviews/intel-e7200-g31,2039.html

The E7200 dual-core (2.53GHz with 3MB of cache) is a "sleeper" product
IMHO.  Low power (well below the published 65W power envelope), plenty
of grunt, and priced to go.  Pair this chip in a system with 4 or 8GB
of RAM and you have a winner.  For example, consider the "mid tier"
system here: http://www.techreport.com/articles.x/15737/5 (the
motherboard is $126) with an e7200 CPU and 2 memory kits from here:
http://www.amazon.com/s/ref=nb_ss_gw?url=search-alias%3Daps&field-keywords=KVR800D2K2%2F4GR&x=0&y=0

Also, take a look at:
http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=2010170147%201052108080%201052420643%201052315794%201052516065&name=5

- look at the pricing *after* rebates and you're looking at brand-name
memory (2 * 2GB = 4GB total) for $65 here:
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227298

With ZFS - the most important hardware component is RAM.  Get as much
RAM as your motherboard will support (along with any budgetary
constraints).  My advice is the E7200 CPU, 8GB of RAM and you'll have
a smile on your face every time you use this system.

If you want a small system that is pre-built, look at every possible
permutation/combination of the Dell Vostro 200 box.  Yes - I just put
together a system based on this box and made a few "modifications"  -
like replacing the PSU with a Corsair VX450W, adding 4 * 1GB of RAM and
an ATI Radeon 4850 (BTW Nvidia is much better supported under
OpenSolaris).  This system was built as a cost effective gamer box -
but it would make a great ZFS box for 2 to 4 SATA drives (with the
upgrades listed above [minus the graphics card]).

Email me offline if I can answer any further questions.

PS: It'll probably take you 2 or 3 hours to evaluate every possible
combination of the Dell Vostro 200 box - but the price/performance is
unbeatable and it's hard to put together a comparable system, from
parts, for less money.  Obviously Dell gets Intel processors for way
less than you or I can.

Regards,

-- 
Al Hopper  Logical Approach Inc,Plano,TX [EMAIL PROTECTED]
   Voice: 972.379.2133 Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/


Re: [zfs-discuss] diagnosing read performance problem

2008-10-25 Thread Nigel Smith
Hi Matt
What chipset is your PCI network card?
(obviously, it not Intel, but what is it?)
Do you know which driver the card is using?

You say '..The system was fine for a couple of weeks..'.
At that point did you change any software - do any updates or upgrades?
For instance, did you upgrade to a new build of OpenSolaris?

If not, then I would guess it's some sort of hardware problem.
Can you try different cables and a different switch - anything
in the path between client & server is suspect.

A mismatch of Ethernet duplex settings can cause problems - are
you sure this is OK?
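
To check the negotiated speed and duplex on the Solaris side, try one of
these (which one works depends on your build and NIC driver):

 # dladm show-dev
 # kstat -p | grep link_duplex

and compare against what the switch port reports.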

To get an idea of how the network is running try this:

On the Solaris box, do an Ethernet capture with 'snoop' to a file.
http://docs.sun.com/app/docs/doc/819-2240/snoop-1m?a=view

 # snoop -d {device} -o {filename}

.. then while capturing, try to play your video file through the network.
Control-C to stop the capture.
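
For example, if the PCI card came up as rge0 (use whatever name
'ifconfig -a' shows for it):

 # snoop -d rge0 -o /var/tmp/video-test.snoop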

You can then use Ethereal or Wireshark to analyze the capture file.
On the 'Analyze' menu, select 'Expert Info'.
This will look through all the packets and will report
any warnings or errors it sees.

Regards
Nigel Smith


[zfs-discuss] recommendations on adding vdev to raidz zpool

2008-10-25 Thread Peter Baumgartner
I have a 7x150GB drive (+1 spare) raidz pool that I need to expand.
There are 6 open drive bays, so I bought 6 300GB drives and went to
add them as a raidz vdev to the existing zpool, but I didn't realize
the raidz vdevs needed to have the same number of drives. (Why is
that?)

My plan now is to create a 5 + 1-spare raidz1 vdev with the new
drives, copy all the data over to it, then wipe out the old pool
and create another 5 + 1 raidz1 to add to the new pool. Is this the best
way to do the upgrade, or am I overlooking something?
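
In zpool terms, I imagine the steps look roughly like the sketch below
(device names are made up, and I'd want to double-check the send/receive
flags on my build before relying on them):

 # new pool from the six 300GB drives: 5-disk raidz1 plus a hot spare
 zpool create bigpool raidz1 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 spare c2t5d0

 # copy everything over (assumes a build with 'zfs send -R'; rsync would
 # also work)
 zfs snapshot -r oldpool@migrate
 zfs send -R oldpool@migrate | zfs receive -Fd bigpool

 # retire the old pool and re-add its disks as a second raidz1 vdev
 zpool destroy oldpool
 zpool add bigpool raidz1 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0
 zpool add bigpool spare c1t5d0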

Thanks for the advice!