> Fsck can only repair known faults; known
> discrepancies in the metadata.
> Since ZFS doesn't have such known discrepancies,
> there's nothing to repair.
I'm rather tired of hearing this mantra.
If ZFS detects an error in part of its data structures, then there is clearly
something to repair.
In other words:
Don't feed the troll.
Greets
Jan Dreyer
zfs-discuss-boun...@opensolaris.org <> wrote:
> Good. It looks like this thread can finally die. I received the
> following in response to my message below:
>
> This is an automatically generated Delivery Status Notification
Toby Thain wrote:
On 10-Feb-09, at 10:36 PM, Frank Cusack wrote:
On February 10, 2009 4:41:35 PM -0800 Jeff Bonwick
wrote:
Not if the disk drive just *ignores* barrier and flush-cache commands
and returns success. Some consumer drives really do exactly that.
ouch.
If it were possible to d
Good. It looks like this thread can finally die. I received the
following in response to my message below:
This is an automatically generated Delivery Status Notification
Delivery to the following recipient failed permanently:
cont...@desystems.cc
Technical details of permanent failure:
On Tue, Feb 10, 2009 at 4:14 PM, D. Eckert wrote:
> I think you are not reading carefully enough, and I
> can trace from your reply a typically American
> arrogant behavior.
>
> WE, THE PROUDEST AND infallibles on earth DID NEVER MAKE
> a mistake. It is just the stupid user who did not read the
>
We have seen some unfortunate miscommunication and misinterpretation here.
This extends into differences of culture. One of the more vocal people here is
surely not 'Anti-xyz'; rather, I sense his intense desire to further
progress by pointing his finger at some potential wounds.
May I repeat
On 10-Feb-09, at 10:36 PM, Frank Cusack wrote:
On February 10, 2009 4:41:35 PM -0800 Jeff Bonwick
wrote:
Not if the disk drive just *ignores* barrier and flush-cache commands
and returns success. Some consumer drives really do exactly that.
ouch.
If it were possible to detect such disks
On February 10, 2009 4:41:35 PM -0800 Jeff Bonwick
wrote:
Not if the disk drive just *ignores* barrier and flush-cache commands
and returns success. Some consumer drives really do exactly that.
ouch.
If it were possible to detect such disks, I'd add code to ZFS that
would simply refuse to u
> "jb" == Jeff Bonwick writes:
> "tt" == Toby Thain writes:
jb> Not if the disk drive just *ignores* barrier and flush-cache
jb> commands and returns success. Some consumer drives really do
jb> exactly that. That's the issue that people are asking ZFS to
jb> work around
Could this be relevant? Notice the sd_cache_control mismatch message. Thank
you, everybody, for any ideas or help. I really appreciate it.
Feb 06 2009 23:14:07.704531935 ereport.io.scsi.cmd.disk.dev.uderr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.uderr
ena = 0x2487a4cf2e00c0
On 10-Feb-09, at 7:41 PM, Jeff Bonwick wrote:
Well, if you want a write barrier, you can issue a flush-cache and
wait for a reply before releasing writes behind the barrier. You will
get what you want by doing this for certain.
Not if the disk drive just *ignores* barrier and flush-cach
> Well, if you want a write barrier, you can issue a flush-cache and
> wait for a reply before releasing writes behind the barrier. You will
> get what you want by doing this for certain.
Not if the disk drive just *ignores* barrier and flush-cache commands
and returns success. Some consumer d
On Tue, 10 Feb 2009, Tim wrote:
You apparently have not used apple's disk. It's nothing remotely resembling
"enterprise-type" disk.
That is not true of Apple's only server system "Xserve". It uses SAS
disks similar to the ones in the enterprise offerings of Sun, IBM,
etc, and at a similar
> Well, if you want a write barrier, you can issue a flush-cache and
> wait for a reply before releasing writes behind the barrier. You will
> get what you want by doing this for certain. so a flush-cache is more
> forceful than a barrier, as long as you wait for the reply.
Yes, this is anothe
Hi again everyone,
OK... I'm even more confused at what is happening here when I try to rejoin the
split zfs send file...
When I cat the split files and pipe through cksum, I get the same cksum as the
original (unsplit) zfs send snapshot:
# cat mypictures.zfssnap.split.a[a-d] | cksum
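For context, a minimal sketch of the round trip being tested here, assuming a
hypothetical dataset name (only mypictures.zfssnap comes from the post; the
pool and dataset names are illustrative):

  zfs send tank/pictures@backup > mypictures.zfssnap            # save the send stream to a file
  split -b 1024m mypictures.zfssnap mypictures.zfssnap.split.   # pieces .aa, .ab, .ac, ...
  cksum mypictures.zfssnap                                      # checksum of the unsplit stream
  cat mypictures.zfssnap.split.a? | cksum                       # should print the identical checksum
  cat mypictures.zfssnap.split.a? | zfs receive tank/pictures_restored   # rejoin and restore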
> "ps" == Peter Schuller writes:
ps> This is something I'm interested in, since my preception so
ps> far has been that there is only one. Some driver writer has
ps> the opinion that "flush cache" means to flush the cache, while
ps> the file system writer uses "flush cache" to
On February 10, 2009 1:14:57 PM -0800 "D. Eckert"
wrote:
I hope I've made myself very clear.
Very. Rarely has the adage "what one says reveals more about the
speaker than the subject" been more evident.
And the more postings we have to read in a tone like yours, the more we are
thinking to
DE:
I think that a big part of the reason you're getting the responses you
do is not arrogance from Sun or us kool-aid drinkers, but your own tone
and attitude. You didn't ask for help in your initial message at all.
The entire post was a diatribe against Sun and ZFS which was based on
your exper
Roman V. Shaposhnik wrote:
On Wed, 2009-02-11 at 09:49 +1300, Ian Collins wrote:
These posts do sound like someone who is blaming their parents after
breaking a new toy before reading the instructions.
It looks like there's a serious denial of the fact that "bad things
do happen to eve
On Tue, 10 Feb 2009 13:14:57 PST
"D. Eckert" wrote:
> Hello? Did you already recognize the sound of the shot?
> I learned my lesson well, and in future this won't happen
> again, because we will no longer use ZFS, but we have a legal
> interest in getting back the data we stored in trust on a non
Peter Schuller wrote:
> It would actually be nice in general I think, not just for ZFS, to
> have some standard "run this tool" that will give you a check list of
> successes/failures that specifically target storage
> correctness. Though correctness cannot be proven, you can at least
> test for co
Mario Goebbels wrote:
The good news is that ZFS is getting popular enough on consumer-grade
hardware. The bad news is that said hardware has a different set of
failure modes, so it takes a bit of work to become resilient to them.
This is pretty high on my short list.
One thing I'd like to
> "de" == D Eckert writes:
de> from your reply a typically American arrogant behavior.
de> WE, THE PROUDEST AND infallibles on earth DID NEVER MAKE a
de> mistake.
Maybe I should speak up since I defended you at the start. To my
view:
REASONABLE:
* expect that ZFS lose al
On Wed, 2009-02-11 at 09:49 +1300, Ian Collins wrote:
> These posts do sound like someone who is blaming their parents after
> breaking a new toy before reading the instructions.
It looks like there's a serious denial of the fact that "bad things
do happen to even the best of people" on this thre
> tcook,
> ...
> Regarding the GUI, I don't know how to disable it.
> There are no virtual consoles, and unlike older
> versions of SunOS and Solaris, it comes up in XDM
> and there is no [apparent] way to get a shell
> without running gnome. I am sure that there is, but
> again, I come from the
Leonid,
You could use the fmdump -eV command to look for problems with these
disks. This command might generate a lot of output, but it should be
clear if the root cause is a problem accessing these devices.
I would also check /var/adm/messages for any driver-related messages.
Cindy
Leonid Roo
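A hedged sketch of the checks being suggested above (arguments and patterns
are illustrative; adjust to the devices in question):

  fmdump -e                                            # one-line summary of logged FMA error events
  fmdump -eV                                           # full detail; ereports like the uderr quoted earlier come from here
  egrep -i 'sd_cache_control|scsi' /var/adm/messages   # driver warnings around the same timestamps
  iostat -En                                           # per-device soft/hard/transport error counters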
if you are interested in my IP Address: no problem:
83.236.164.80
it just exactly confirms my assumption that it's best and easier for someone -
if he's in the right position - to stick a big plaster over someone's mouth to
avoid hearing a legitimate critique instead of discussing the problem to f
> ps> A test I did was to write a minimalistic program that simply
> ps> appended one block (8k in this case), fsync():ing in between,
> ps> timing each fsync().
>
> were you the one that suggested writing backwards to make the
> difference bigger? I guess you found that trick unneces
I'll make a meta comment on the thread itself, not on the ZFS issue.
There is more bashing, and there are broader accusations, than would normally
happen in a "professional usage" situation. Maybe a board admin can run a
script on the IP addresses logged and find a more subtle meaning... I don't know, I'm
> ps> This is a recommendation I would give even when you purchase
> ps> non-cheap battery backed hardware RAID controllers (I won't
> ps> mention any names or details to avoid bashing as I'm sure it's
> ps> not specific to the particular vendor I had problems with most
> ps> re
I think you are not reading carefully enough, and I
can trace from your reply a typically American
arrogant behavior.
WE, THE PROUDEST AND infallibles on earth DID NEVER MAKE
a mistake. It is just the stupid user who did not read the
fucking manual carefully enough.
Hello? Did you already
On Tue, Feb 10, 2009 at 12:31:05PM -0800, D. Eckert wrote:
> (...)
> You don't move a pool with 'zfs umount', that only unmounts a single zfs
> filesystem within a pool, but the pool is still active.. 'zpool export'
> releases the pool from the OS, then 'zpool import' on the other machine.
> (...)
DE - could you please post the output of your 'zpool umount usbhdd1'
command? I believe the output will prove useful to the point being
discussed below.
Charles
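For readers following along, a minimal sketch of the move procedure described
in the quoted text, with an invented pool name:

  zpool export usbpool     # unmount all datasets and release the pool from this host
  # (physically move the USB drive to the other machine)
  zpool import             # list pools that are available for import
  zpool import usbpool     # import it; add -f only if the pool was never cleanly exported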
D. Eckert wrote:
> (...)
> You don't move a pool with 'zfs umount', that only unmounts a single zfs
> filesystem within a pool, but the
> The good news is that ZFS is getting popular enough on consumer-grade
> hardware. The bad news is that said hardware has a different set of
> failure modes, so it takes a bit of work to become resilient to them.
> This is pretty high on my short list.
One thing I'd like to see is an _easy_ opti
Dear All,
Is there any way to figure out which piece is at fault? Sun SAS RAID
(Adaptec/Intel) controller is reporting that drives are good, but ZFS is not
happy about checksum errors. Is there any way to figure out which component
introduced the error?
Leonid
On 10-Feb-09, at 1:05 PM, Peter Schuller wrote:
YES! I recently discovered that VirtualBox apparently defaults to
ignoring flushes, which would, if true, introduce a failure mode
generally absent from real hardware (and eventually resulting in
consistency problems quite unexpected to the user w
D. Eckert wrote:
(...)
You don't move a pool with 'zfs umount', that only unmounts a single zfs
filesystem within a pool, but the pool is still active.. 'zpool export'
releases the pool from the OS, then 'zpool import' on the other machine.
(...)
with all respect: I have never read such an illogical
D. Eckert wrote:
(...)
Possibly so. But if you had that ufs/reiserfs on a LVM or on a RAID0
spanning removable drives, you probably wouldn't have been so lucky.
(...)
we are not talking about a RAID 5 array or an LVM. We are talking about a
single FS setup as a zpool over the entire available d
(...)
Possibly so. But if you had that ufs/reiserfs on a LVM or on a RAID0
spanning removable drives, you probably wouldn't have been so lucky.
(...)
we are not talking about a RAID 5 array or an LVM. We are talking about a
single FS setup as a zpool over the entire available disk space on an ext
(...)
You don't move a pool with 'zfs umount', that only unmounts a single zfs
filesystem within a pool, but the pool is still active.. 'zpool export'
releases the pool from the OS, then 'zpool import' on the other machine.
(...)
with all respect: I have never read such an illogical, ridiculous .
I have
> "rs" == Roman Shaposhnik writes:
rs>1. as a forensics tool that would let you retrieve as much
rs> information as possible from a physically ill device
a nit, but I've never found fsck alone useful for this. Maybe for ``a
filesystem trashed by bad RAM/CPU/bugs'' it is usefu
On Tue, Feb 10, 2009 at 1:27 PM, Chris Ridd wrote:
> On 10 Feb 2009, at 18:35, Bryant Eadon wrote:
>
> Given that ZFS is planned to be used in Snow Leopard, is it worth setting
>> something up for consumer grade appliance vendors to 'certify' against?
>> ("Ok, you play nice with ZFS by doing th
On Tue, Feb 10, 2009 at 12:46 PM, Miles Nordin wrote:
>
> It's likely other filesystems are affected by ``the problem'' as I
> define it, just much less so. If that's the case, it'd be much better
> IMHO to fix the real problem once and for all, and find it so that it
> stays fixed, than to make
Hi,
I've followed this thread a bit, and I think there are some correct
points on every side of the discussion, but here I see a misconception (at
least I think it is):
D. Eckert schrieb:
> (..)
> Dave made a mistake pulling out the drives with out exporting them first.
> For sure also UFS/XFS/EXT4/
On 2/10/2009 2:54 PM, D. Eckert wrote:
I disagree, see posting above.
ZFS just accepts it 2 or 3 times. After that, your data passes away to
nirvana for no reason.
And it should be legal to have an external USB drive with ZFS. With all
respect, why should a user always care for redunda
On Feb 9, 2009, at 7:06 PM, Jeff Bonwick wrote:
>>There is no substitute for cord-yank tests - many and often. The
>>weird part is, the ZFS design team simulated millions of them.
>>So the full explanation remains to be uncovered?
>
>We simulated power failure; we did not simulate disks that simp
(...)
If anyone asks questions, they get no actual information, but a huge
amount of blame heaped on the sysadmin. Your post is a great example
of the typical way this problem is handled because it does both: deny
information and blame the sysadmin. Though I'm really picking on you
way too much her
On 2/10/2009 2:50 PM, D. Eckert wrote:
(..)
Dave made a mistake pulling out the drives without exporting them first.
For sure, UFS/XFS/EXT4/.. don't like that kind of operation either, but only
with ZFS do you risk losing ALL your data.
that's the point!
(...)
I did that many times after perform
I disagree, see posting above.
ZFS just accepts it 2 or 3 times. After that, your data passes away to
nirvana for no reason.
And it should be legal to have an external USB drive with ZFS. With all
respect, why should a user always care for redundancy, e.g. set up a mirror on
a single HD
(..)
Dave made a mistake pulling out the drives without exporting them first.
For sure, UFS/XFS/EXT4/.. don't like that kind of operation either, but only
with ZFS do you risk losing ALL your data.
that's the point!
(...)
I did that many times after performing the umount cmd with ufs/reiserfs
fil
On 10 Feb 2009, at 18:35, Bryant Eadon wrote:
Given that ZFS is planned to be used in Snow Leopard, is it worth
setting something up for consumer grade appliance vendors to
'certify' against? ("Ok, you play nice with ZFS by doing the right
things", etc.. ) Maybe you can give them a 'Gold
> "ps" == Peter Schuller writes:
ps> A test I did was to write a minimalistic program that simply
ps> appended one block (8k in this case), fsync():ing in between,
ps> timing each fsync().
were you the one that suggested writing backwards to make the
difference bigger? I guess y
> "ps" == Peter Schuller writes:
ps> This is a recommendation I would give even when you purchase
ps> non-cheap battery backed hardware RAID controllers (I won't
ps> mention any names or details to avoid bashing as I'm sure it's
ps> not specific to the particular vendor I had
> I use 3 external devices in 2 models of external enclosures (eSATA and USB,
> consumer grade) -- how can I test this write barrier issue on these 2? Is it
> worthwhile adding to a wiki (table) somewhere what has or has not been
> tested?
It depends on circumstances. If write barriers
> "g" == Gino writes:
g> we lost many zpools with multimillion$ EMC, Netapp and
g> HDS arrays just simulating fc switches power fails.
g> The problem is that ZFS can't properly recover itself.
I don't like what you call ``the problem''---I think it assumes too
much. You m
All,
I've been following the thread titled 'ZFS: unreliable for professional use'
and I've learned a few things. Put simply, external devices don't behave like
internal ones.
From JB:
>The good news is that ZFS is getting popular enough on consumer-grade
>hardware. The bad news is that s
> "jb" == Jeff Bonwick writes:
jb> We simulated power failure; we did not simulate disks that
jb> simply blow off write ordering. Any disk that you'd ever
jb> deploy in an enterprise or storage appliance context gets this
jb> right.
Did you simulate power failure of iSCSI/FC
On 10-Feb-09, at 1:03 PM, Charles Binford wrote:
Jeff, what do you mean by "disks that simply blow off write ordering"?
My experience is that most enterprise disks are some flavor of SCSI, and
host SCSI drivers almost ALWAYS use simple queue tags, implying the
target is free to re-order th
> on a UFS or reiserfs such errors could be corrected.
In general, UFS has zero capability to actually fix real corruption in
any reliable way.
What you normally do with fsck is repairing *expected* inconsistencies
that the file system was *designed* to produce in the event of e.g. a
sudden rebo
I have updated zilstat to add observability of the size of the sync
write. Details are at:
http://www.goldensrule.com/zilstat-intro
Enjoy!
-- richard
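A possible invocation, assuming the usual interval/count arguments (this is an
assumption; check the script's own usage output, as the options may differ):

  ./zilstat 10 6    # sample ZIL/sync-write activity every 10 seconds, six times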
> And again: Why should a 2-week-old Seagate HDD suddenly be damaged, if there
> was no shock, hit or any other event like that?
I have no information about your particular situation, but you have to
remember that ZFS uncovers problems that otherwise go unnoticed. Just
personally on my private ha
> YES! I recently discovered that VirtualBox apparently defaults to
> ignoring flushes, which would, if true, introduce a failure mode
> generally absent from real hardware (and eventually resulting in
> consistency problems quite unexpected to the user who carefully
> configured her journa
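If anyone wants to chase this, the knob is reportedly an extradata key along
these lines (the exact key name, VM name and device path here are assumptions;
verify them against your VirtualBox release's manual before relying on it):

  VBoxManage setextradata "MyVM" \
      "VBoxInternal/Devices/piix3ide/0/LUN#0/Config/IgnoreFlush" 0   # 0 = honor guest flush requests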
Jeff, what do you mean by "disks that simply blow off write ordering"?
My experience is that most enterprise disks are some flavor of SCSI, and
host SCSI drivers almost ALWAYS use simple queue tags, implying the
target is free to re-order the commands for performance. Are you talking
about something
> > However, I just want to state a warning that ZFS is far from being what it
> > promises, and so far, from my sum of experience, I can't recommend at all
> > using ZFS on a professional system.
>
>
> Or, perhaps, you've given ZFS disks which are so broken that they are
> really
Sanjeev wrote:
Hi,
There was a requirement to measure all the OSYNC writes.
Attached is a simple DTrace script which does this using the
fsinfo provider and fbt::fop_write.
I was wondering if this is accurate enough or if I missed any other cases.
I am sure this can be improved in many ways.
I
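For readers without the attachment, here is a rough one-liner in the same
spirit (not the original script; the FSYNC/FDSYNC flag values are assumed from
<sys/file.h> and should be verified on your release):

  dtrace -qn '
    fbt::fop_write:entry
    /arg2 & (0x10 | 0x40)/                   /* ioflag has FSYNC or FDSYNC set */
    { @writes[execname] = count(); }         /* count synchronous writes per program */
    tick-10s { printa(@writes); exit(0); }'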
>
> The good news is that ZFS is getting popular enough on consumer-grade
> hardware. The bad news is that said hardware has a different set of
> failure modes, so it takes a bit of work to become resilient to them.
> This is pretty high on my short list.
So does this basically mean zfs rolls-ba
Hi,
There was a requirement to measure all the OSYNC writes.
Attached is a simple DTrace script which does this using the
fsinfo provider and fbt::fop_write.
I was wondering if this is accurate enough or if I missed any other cases.
I am sure this can be improved in many ways.
Thanks and regards,
S
Need help on removing a faulted spare. We tried the following with no
success. There is no resilvering active, as shown below:
# zpool clear sybdump_pool c7t0d0 spare device
cannot clear errors for c7t0d0: device is reserved as a hot spare
# zpool remove sybdump_pool c7t0d0
# zpool status -
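As a hedged sketch, the sequence usually suggested for a spare that is still
attached to a vdev (same pool and device names as above; behaviour can vary by
release):

  zpool status -v sybdump_pool        # confirm no resilver is running and see where the spare sits
  zpool detach sybdump_pool c7t0d0    # if the spare was pulled into a vdev, detach it from there first
  zpool remove sybdump_pool c7t0d0    # then remove it from the pool's spare list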
On Mon, Feb 09, 2009 at 11:56:54AM -0800, Gordon Johnson wrote:
> I hope this thread catches someone's attention. I've reviewed the root pool
> recovery guide as posted. It presupposes a certain level of network support,
> for backup and restore, that many opensolaris users may not have.
I di
> What filesystem likes it when disks are pulled out from a LIVE
> filesystem? Try that on UFS and you're f** up too.
Pulling a disk from a live filesystem is the same as pulling the power
from the computer. All modern filesystems can handle that just fine.
UFS with logging on does not even need fsc
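To make the point concrete, a hedged sketch of what recovery after an unclean
pull typically looks like on the ZFS side (the pool name is invented):

  zpool import -f usbpool     # re-import a pool that was pulled without 'zpool export'
  zpool status -v usbpool     # review any READ/WRITE/CKSUM errors the pull left behind
  zpool scrub usbpool         # re-verify every block against its checksums
  zpool clear usbpool         # reset the error counters once the scrub comes back clean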
> On Mon, 09 Feb 2009 01:46:01 PST
> "D. Eckert" wrote:
>
> > after working for 1 month with ZFS on 2 external USB drives I have
> > experienced that the all-new ZFS filesystem is the most unreliable
> > FS I have ever seen.
> >
> > Since working with ZFS, I have lost data from:
> >
>
On Mon, 09 Feb 2009 01:46:01 PST
"D. Eckert" wrote:
> after working for 1 month with ZFS on 2 external USB drives I have
> experienced that the all-new ZFS filesystem is the most unreliable
> FS I have ever seen.
>
> Since working with ZFS, I have lost data from:
>
> 1 80 GB external Driv
Hi,
There was a requirement to measure all the OSYNC writes.
Attached is a simple DTrace script which does this using the
fsinfo provider and fbt::fop_write.
I was wondering if this is accurate enough or if I missed any other cases.
I am sure this can be improved in many ways.
Thanks and regards,
> > There is no substitute for cord-yank tests - many and often. The
> > weird part is, the ZFS design team simulated millions of them.
> > So the full explanation remains to be uncovered?
>
> We simulated power failure; we did not simulate disks that simply
> blow off write ordering. Any