On 19.01.09 12:09, Tom Bird wrote:
Toby Thain wrote:
On 18-Jan-09, at 6:12 PM, Nathan Kroenert wrote:
Hey, Tom -
Correct me if I'm wrong here, but it seems you are not allowing ZFS any
sort of redundancy to manage.
Every other file system out there runs fine on a single LUN; when things
go wrong you have an fsck utility that patches it up and the world keeps
on turning.
I can't find anywhere that will sell me a 48-drive SATA JBOD with all
the drives presented on a single SAS channel, so running on a single
giant LUN is a real-world scenario that ZFS should be able to cope with,
as this is how the hardware I am stuck with is arranged.
Which is particularly catastrophic when one's 'content' is organized as
a monolithic file, as it is here - unless, of course, you have some way
of scavenging that file based on internal structure.
No, it's not a monolithic file; the point I was making there is that no
files are showing up.
r...@cs4:~# find /content
/content
r...@cs4:~# (yes that really is it)
This issue (and the previous one reported by Tom) has received some
publicity recently, see here:
http://www.uknof.org.uk/uknof13/Bird-Redux.pdf
So I feel I need to provide a little more information about the
outcome (sorry that it is delayed and not as complete as the previous one).
First, it looked like this:
r...@cs4:~# zpool list
NAME      SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
content  62.5T  59.9T  2.63T  95%  ONLINE  -
r...@cs4:~# zpool status -v
  pool: content
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        content     ONLINE       0     0    32
          c2t8d0    ONLINE       0     0    32

errors: Permanent errors have been detected in the following files:

        content:<0x0>
        content:<0x2c898>
The first permanent error means that the root block of the filesystem
named 'content' was corrupted (all copies), so it was not possible to
open the filesystem and access any of its contents.
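As a rough illustration of how to read the entries in the "errors:" list:
each one has the form dataset:<0xN>, that is, a dataset name followed by a
hexadecimal object number, which zpool status prints when it cannot
resolve the object to a file path. The Python sketch below splits such
entries into their two parts. The helper name parse_error_entry and the
regular expression are illustrative assumptions, not part of any ZFS tool.

```python
import re

# Matches entries such as "content:<0x2c898>":
# a dataset name, a colon, then a hex object number in angle brackets.
ENTRY_RE = re.compile(r"^\s*(?P<dataset>[^:\s]+):<0x(?P<obj>[0-9a-fA-F]+)>\s*$")


def parse_error_entry(line):
    """Return (dataset, object_number) for one entry, or None if it does not match."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    return m.group("dataset"), int(m.group("obj"), 16)


# The two entries reported for this pool.
entries = [parse_error_entry(s) for s in ("content:<0x0>", "content:<0x2c898>")]
print(entries)
```

Lines that do not have this shape (for example, ordinary file paths) yield
None, so the helper can be applied to every line of the list safely.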
Fortunately, there had not been much activity on the pool, so we
decided to try previous states of the pool. I do not remember the exact
txg number we tried, but it was something like a hundred txgs back. We
checked it with zdb and found that this state was more or less good:
at least the filesystem 'content' could be opened and its contents
accessed, so we decided to reactivate that previous state. The pool
imported fine and the contents of 'content' were there. A subsequent
scrub did find some errors, but I do not remember exactly how many. Tom
may have the exact number.
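To illustrate the idea of falling back to an earlier pool state: ZFS
normally activates the valid uberblock with the highest txg, so going back
means choosing an older one instead. The Python sketch below is only a toy
model of that selection. The Uberblock tuple, the pick_uberblock helper and
all txg numbers are made up for illustration; they are not the actual
procedure or values used on this pool.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for an on-disk uberblock.
Uberblock = namedtuple("Uberblock", ["txg", "valid"])


def pick_uberblock(uberblocks, max_txg=None):
    """Return the valid uberblock with the highest txg not above max_txg.

    With max_txg=None this models the normal choice (newest valid state);
    with a limit it models falling back to an earlier state.
    """
    candidates = [
        ub for ub in uberblocks
        if ub.valid and (max_txg is None or ub.txg <= max_txg)
    ]
    return max(candidates, key=lambda ub: ub.txg, default=None)


# Made-up history of 128 consecutive transaction groups.
ring = [Uberblock(txg=t, valid=True) for t in range(1000, 1128)]

latest = pick_uberblock(ring)
older = pick_uberblock(ring, max_txg=latest.txg - 100)
print(latest.txg, older.txg)
```

Whether an older state is actually usable still has to be checked
separately, as was done here with zdb, because blocks it references may
have been overwritten since.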
Victor
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss