Do you have a backup copy of your zpool.cache file?
If you have that file, ZFS will happily mount a pool on boot without its slog
device - it'll just flag the slog as faulted and you can do your normal
replace. I used that for a long while on a test server with a ramdisk slog -
and I never nee
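For anyone wanting to set up something similar, a rough sketch - the cache
file path is the stock one, but the pool and device names below are made up:
# cp /etc/zfs/zpool.cache /root/zpool.cache.backup
After a reboot without the slog, zpool status should show the log device as
FAULTED while the rest of the pool carries on, and you replace it with
whatever you have to hand:
# zpool status tank
# zpool replace tank /dev/ramdisk/slog0 c2t1d0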
Ray, if you don't mind me asking, what was the original problem you had on your
system that makes you think the checksum type is the problem?
Ray, if you use -o it sets properties for the pool. If you use -O (capital),
it sets the filesystem properties for the default filesystem created with the
pool.
zpool create -O accepts any valid ZFS filesystem property.
But I agree, it's not very clearly documented.
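A made-up example showing both flags together:
# zpool create -o autoreplace=on -O compression=on tank mirror c1t0d0 c1t1d0
Here autoreplace is a pool property, while compression is a filesystem
property applied to the pool's root dataset.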
Interesting answer, thanks :)
I'd like to dig a little deeper if you don't mind, just to further my own
understanding (which is usually rudimentary compared to a lot of the guys on
here). My belief is that ZFS stores two copies of the metadata for any block,
so corrupt metadata really shouldn'
Good news, great to hear you got your data back.
Victor is a legend, I for one am very glad he's been around to fix these kinds
of problems. I imagine he's looking forward to a well earned rest once the
automatic recovery tools become available :)
> It is currently difficult to remove slog devices so it is safer to add
> them if you determine they will help rather than reduce performance.
>
> Bob
Fixed in build 125, Bob:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6574286
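From that build onwards removing a slog should just be a matter of (device
name made up):
# zpool remove tank c2t5d0
or, for a mirrored log, removing it by the mirror name that zpool status
shows.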
Hi Tim, that doesn't help in this case - it's a complete lockup apparently
caused by driver issues.
However, the good news for Insom is that the bug is closed because the problem
now appears fixed. I tested it and found that it's no longer occurring in
OpenSolaris 2008.11 or 2009.06.
If you mo
What are you running there? snv or OpenSolaris?
Could you try an OpenSolaris 2009.06 live disc and boot directly from that?
Once I was running that build every single hot plug I tried worked flawlessly.
I tried for several hours to replicate the problems that caused me to log that
bug report
Actually, I think this is a case of crossed wires. This issue was reported a
while back on a news site for the X25-M G2. Somebody pointed out that these
devices have 8GB of cache, which is exactly the dataset size they use for the
iops figures.
The X25-E datasheet however states that while wr
Thanks for the update Adam, that's good to hear. Do you have a bug ID number
for this, or happen to know which build it's fixed in?
Why are you sending from s1? If you've already sent that, the logical thing to
do is send from s3 the next time.
If you really do need to send from the start every time, you can do that with
the -F option on zfs receive, to force it to overwrite newer changes, but you
are going to be sending f
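As a sketch, with made-up pool and dataset names, the incremental from s1 to
s3 would be:
# zfs send -i tank/data@s1 tank/data@s3 | zfs receive -F backup/data
The -i stream only carries the changes between the two snapshots, and -F on
the receive rolls the destination back to its last snapshot if anything has
changed on it since.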
Double WOHOO! Thanks Victor!
> "I can't boot into an older version because the last
> version I had was b118 which doesn't have zfs version
> 19 support. I've been looking to see if there's a
> way to downgrade via IPS but that's turned up a lot
> of nothing."
>
> If someone can tell me which files are needed for the
> drive
> Isn't dedupe in some ways the antithesis of setting
> copies > 1? We go to a lot of trouble to create redundancy (n-way
> mirroring, raidz-n, copies=n, etc) to make things as robust as
> possible and then we reduce redundancy with dedupe and compression
But are we reducing redundancy? I don't
There was an announcement made in November about auto snapshots being made
obsolete in build 128, I assume major changes are afoot:
http://www.opensolaris.org/jive/thread.jspa?messageID=437516&tstart=0#437516
Thanks for the update. It's no help to you of course, but I'm watching with
interest, and your progress updates are very much appreciated.
> I initialized a new whole-disk pool on an external
> USB drive, and then did zfs send from my big data pool and zfs recv onto the
> new external pool.
> Sometimes this fails, but this time it completed.
That's the key bit for me - zfs send/receive should not just fail at random.
It sounds li
I don't know if there's any easy way to delete them, but I can offer a
suggestion for the future.
Tim Foster has a ZFS automatic snapshot script that would be worth taking a
look at:
http://blogs.sun.com/timf/entry/zfs_automatic_snapshots_0_10
That script lets you fairly easily set up snapshots
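If you do want to clear the old ones out by hand, something along these
lines should do it - dataset name made up, and sanity check the list before
piping it into destroy:
# zfs list -H -t snapshot -o name | grep '^tank/home@' | xargs -n 1 zfs destroy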
however.
If anybody from the UK has a spare card I could buy, I'd love to hear from you.
Failing that we'll go live without one for now and hope Sun release a PCI-e
card we can buy when they launch the flash stuff towards the end of the year.
Ross
Err... I think you guys are talking at cross purposes here. Either I'm missing
the point completely, or are you not aware that the Thumper2 includes a flash
storage slot so the OS doesn't use any of the 48 disks any more?
No, it's not mirrored, but Richard was trying to point out that it's likely
Just came across this myself; the commands to enable just the web admin
interface are:
# svccfg
svc:> select system/webconsole
svc:/system/webconsole> setprop options/tcp_listen=true
svc:/system/webconsole> quit
# svcadm restart system/webconsole
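To double check it took effect (6789 being the web console's usual port):
# svcprop -p options/tcp_listen system/webconsole
# netstat -an | grep 6789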
"Caution - You must follow these steps before removing a disk from service.
Failure to follow the procedure can corrupt your data or render your file
system inoperable."
Ross
I'm a bit concerned about how they
handle hardware failures.
thanks,
Ross
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Great news, thanks for the update :)
Yeah, I thought of the storage forum today and found somebody else with the
problem, and since my post a couple of people have reported similar issues on
Thumpers.
I guess the storage thread is the best place for this now:
http://www.opensolaris.org/jive/thread.jspa?threadID=42507&tstart=0
T
Have any of you guys reported this to Sun? A quick search of the bug database
doesn't bring up anything that appears related to sata drives and hanging or
hot swapping.
> Ross,
>
> The X4500 uses 6x Marvell 88SX SATA controllers for
> its internal disks. They are not Supermicro
> controllers. The new X4540 uses an LSI chipset
> instead of the Marvell chipset.
>
> --Brett
Yup, and the Supermicro card uses the "Marvell Hercules-2
Heh, yup, memory errors are among the worst to diagnose. Glad you got to the
bottom of it, and it's good to see ZFS again catching faults that otherwise
seem to be missed.
Ok, after doing a lot more testing of this I've found it's not the Supermicro
controller causing problems. It's purely ZFS, and it causes some major
problems! I've even found one scenario that appears to cause huge data loss
without any warning from ZFS - up to 30,000 files and 100MB of data m
Well yeah, this is obviously not a valid setup for my data, but if you read my
first e-mail, the whole point of this test was that I had seen Solaris hang
when a drive was removed from a fully redundant array (five sets of three way
mirrors), and wanted to see what was going on.
So I started wi
Are you running the Solaris CIFS Server by any chance?
Just checking, are you planning to have the receiving ZFS system read only?
I'm not sure how ZFS receive works on a system if changes have been made, but I
would expect they would be overwritten.
ly safe way to work.
The question is whether I can make a server I can be confident in. I'm now
planning a very basic OpenSolaris server just using ZFS as a NFS server, is
there anybody out there who can re-assure me that such a server can work well
and handle real life drive failures?
th
>Going back to your USB remove test, if you protect that disk
>at the ZFS level, such as a mirror, then when the disk is removed
>then it will be detected as removed and zfs status will show its
>state as "removed" and the pool as "degraded" but it will continue
>to function, as expected.
>-- richa
I'd expect that to work personally, although I'd just drop one of your boot
mirrors in myself. That leaves the second drive untouched for your other
server. It also means that if it works you could just wipe the old snv_70b and
re-establish the boot mirrors on each server with them.
Wipe the snv_70b disks I meant.
e on the system.
Ross
PS. In your first post you said you had no time to copy the filesystem, so why
are you trying to use send/receive? Both rsync and send/receive will take a
long time to complete.
to move your data drives to the new chassis and hope it's not a bad drive
causing the fault.
Either way, let me know how you get on.
Ross
Ok, I haven't actually tested this with Samba, but I've done a lot of looking
into this and spoken with Ed in the past.
First of all you need the Shadow Copy Client from Microsoft installed if you're
running Windows 2000 or XP. Vista has it included already.
Then you need to map a network dri
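Whichever route you take, it's worth remembering that every ZFS filesystem
already exposes its snapshots under a hidden .zfs/snapshot directory, and
you can make that visible so clients can browse it over the share (dataset
name made up):
# zfs set snapdir=visible tank/data
# ls /tank/data/.zfs/snapshot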
What does zpool status say?
m,
and then pulling the drive to see if that does cure the warnings you're getting
in the logs.
Whatever you do, please keep me posted. Your post has already made me realise
it would be a good idea to have a script watching log file sizes to catch
problems lik
Tell you what, go away, do some work on your own reading around the topics
you're interested in, and when you come back try asking some more intelligent
questions instead of expecting us to do everything for you.
You'll find you get much more helpful replies that way. Right now you come
across
Did anybody ever get this card working? SuperMicro only have Windows and Linux
drivers listed on their site. Do Sun's generic drivers work with this card?
Did anybody ever find out if this option worked? Does setting this hidden
option mean that drives are always listed and named according to the order they
are physically connected?
Ross
> Hi Kent:
>
> So, in using the lsiutil utility, what do I find,
> but the following optio
So, doesn't that make this a potential issue when people do start using
separate media for the ZIL? If the ZIL hardware fails for any reason do you
lose access to your entire pool? What if you're using battery backed nvram
but power is off long enough to wipe the device? What if a serv
r a pool if
we know that X is corrupted", where X refers to any of the core pieces ZFS
needs on its disk(s).
Ross
> In the most recent code base (both OpenSolaris/Nevada and S10Ux with patches)
> all the known marvell88sx problems have long ago been dealt with.
I'd dispute that. My testing appears to show major hot plug problems with the
marvell driver in snv_94.
Hmm... it appears that my e-mail to the zfs list covering the problems has
disappeared. I will send it again and cross my fingers.
The basic problem I found was that with the Supermicro AOC-SAT2-MV8 card (using
the marvell chipset), drive removals are not detected consistently by Solaris.
The
Yup, and plenty of other interested spectators ;-)
Ok, I've now reported most of the problems I found, but have additional
information to add to bugs 6667199 and 667208. Can anybody tell me how I go
about reporting that to Sun?
thanks,
Ross
omputers won't behave in odd ways.
I do appreciate my diagnosis may be wrong as I have very limited knowledge of
Solaris' internals, but that is my best guess right now.
Ross
Huh? Now I'm confused, I thought b95 was just the latest build of OpenSolaris,
I didn't realise that OpenSolaris 2008.05 was different, I thought it was just
an older, more stable build that was updated less often.
Is there anything else I'm missing out on by using snv_94 instead of
OpenSolari
couple of dozen
removals, pulling individual drives, or as many as half a dozen at once. I've
even gone as far as to immediately pull a drive I only just connected. Windows
has no problems at all.
Unfortunately for me, Windows doesn't support ZFS... right now it's looking a
I wouldn't know about using newer ZFS with older builds, but I can tell you
that b94 looks rock solid to me. I've been running it for a few weeks on a
live server and haven't had any crashing or instability problems at all.
Ordinarily, if you're having problems, the first thing I would try woul
lol, I got bored after 13 pages and a whole day of going back through my notes
to pick out the relevant information.
Besides, I did mention that I was using cfgadm to see what was connected :-p.
If you're really interested, most of my troubleshooting notes have been posted
to the forum, but un
are comparing these with 5400rpm drives, and planning them for
the laptop market, I can't see them being too expensive. Certainly worth the
money for ZFS.
Personally I'd like to see something that mounts on a PCIe card, but if I need
to I'll happily start bolting 2.5" SS
Without more specifics it's hard to suggest more. You say it's a P45
motherboard, which one? Is it on the Solaris HCL?
What exact problems are you having when you login with a clean install of b95?
You say everything becomes completely unstable, but what exactly do you mean by
everything? S
Yes, I read that thread.
The problem is nothing you're describing sounds like a problem with ZFS. You
say internet dies suddenly, but what do you mean? Is firefox crashing, are
pages just not loading any more?
Also, you're still saying Wine dies on startup. That's not a standard part of
S
to read.
While hunting for that, I did find this chap who had a couple of million files
in a directory, it might be worth you touching base with him to see how he's
getting on:
http://www.opensolaris.org/jive/thread.jspa?messageID=200632#200632
Ross
es of 7 disks each, with
one drive left over as a hot spare.
Ross
mory I think it's something around a 10% chance of an error for every
5TB you're rebuilding.
Of course, that could well be just a single bit error, but I've seen a fair few
raid-5 arrays die and experienced a couple of near misses, so I'm very paranoid
now when i
You're seeing exactly the same behaviour I found on my server, using a
Supermicro AOC-SAT2-MV8 SATA controller. It's detailed on the forums under the
topics "Supermicro AOC-SAT2-MV8 hang when drive removed", but unfortunately
that topic split into 3 or 4 pieces so it's a pain to find.
I also r
PS. Does your system definitely support SATA hot swap? Could you for example
test it under windows to see if it runs fine there?
I suspect this is a Solaris driver problem, but it would be good to have
confirmation that the hardware handles this fine.
Can anybody help me get this pool online. During my testing I've been removing
and re-attaching disks regularly, and it appears that I've attached a disk that
used to be part of the pool, but that doesn't contain up to date data.
Since I've used the same pool name a number of times, it's possib
I've just gotten a pool back online after the server booted with it
unavailable, but found that NFS shares were not automatically restarted when
the pool came online.
Although the pool was online and sharenfs was set, sharemgr was showing no
pools shared, and the NFS server service was disabled
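For anyone hitting the same thing, the two commands I'd expect to put it
right are:
# svcadm enable -r svc:/network/nfs/server:default
# zfs share -a
The -r brings up the NFS server along with its dependencies, and zfs share
-a re-shares every dataset that has sharenfs set.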
Yes, you should be able to use it on another computer. All the zfs information
is stored on disk. The one thing you need to be aware of is the version of ZFS
your pool is using. Systems can read versions older than the one they support
fine, but they won't be able to mount newer ones. You ca
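The quick way to check before moving the disks (pool name made up):
# zpool get version tank
# zpool upgrade -v
The second command lists every pool version the host understands, so as long
as the other machine's list includes your pool's version it should import
there without trouble.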
I might have found an even bigger problem.
It appears that the pool wasn't even mounted correctly. My 'rc-pool' must have
crashed in the past, leaving the mountpoint /rc-pool behind.
On reboot the pool hadn't mounted because the /rc-pool folder was already
there, and it appears that is why the
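If anyone else hits this: by default ZFS won't mount a filesystem over a
directory that isn't empty. Once the stale directory is moved out of the way
a plain remount should work:
# zfs mount -a
or you can mount over the top of the stale directory instead:
# zfs mount -O rc-pool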
t to create the new filesystem for the
user, and add it to the list of filesystems your service is backing up?
thanks,
Ross
nce they *definitely* are
designed to handle failures.
Ross
tely cope with what you're wanting to do. I think the sales
literature is a little misleading in that sense.
Ross
ility as it
does to data integrity. A bad checksum makes ZFS read the data from elsewhere;
why shouldn't a timeout do the same thing?
Ross
?id=3132
Ross
a good pool response time for any kind of failure, and it looks
to me like it would work well for any kind of storage device.
I'm looking forward to seeing what holes can be knocked in these ideas with the
next set of replies :)
Ross
ucks_blog/2008/08/your-storage-mi.html
Ross
ad some problems on a home built box after
pulling disks, I suspect a proper raid array will cope fine but haven't been
able to get that tested yet.
thanks,
Ross
I'm pretty sure you just need the zpool replace command:
# zpool replace
Run that for the disk you want to replace and let it resilver. Once it's done,
you can unconfigure the old disk with cfgadm and remove it.
If you have multiple mirror vdevs, you'll need to run the command a few times.
Gaah, my command got nerfed by the forum, sorry, should have previewed. What
you want is:
# zpool replace poolname olddisk newdisk
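For the cfgadm step afterwards it's something like this - the attachment
point below is made up, use whatever cfgadm -al lists for the old disk:
# cfgadm -al
# cfgadm -c unconfigure sata1/3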
e
two types of quota had been implemented.
I would have thought those quotas, plus the auto-delete ability of Tim's
snapshot tools should fulfil most needs.
Ross
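Assuming the two types being referred to are quota and the newer refquota
(which doesn't count space used by snapshots or descendants), setting them
looks like this, with made-up names and sizes:
# zfs set quota=100G tank/home/fred
# zfs set refquota=90G tank/home/fred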
Hey folks,
Well, there haven't been any more comments knocking holes in this idea, so I'm
wondering now if I should log this as an RFE?
Is this something others would find useful?
Ross
n use, but it was an interesting test, just a
256MB slog gave us an immediate 12x performance boost over NFS.
Ross
d knows it's sending to a file, it could check the
integrity of the file once the operation completes, which would probably have
helped these guys.
It might also be useful to output a text file containing the checksum so the
integrity of the file can be verified at a later date.
Ross
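In the meantime you can get most of the way there by hand - a sketch with
made-up paths:
# zfs send tank/data@backup | tee /backup/data.zfs | digest -a sha256 > /backup/data.zfs.sha256
and later, before restoring:
# digest -a sha256 /backup/data.zfs
then compare that against the saved digest.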
This might be a dumb question, but isn't that going to be a big problem on a
busy pool?
I'm planning to use ZFS as a sync NFS fileserver and I'm expecting a fair bit
of traffic. Does that bug mean ZFS is going to have problems resilvering
failed disks on my system?
There is an unsupported driver available though, I posted it to the forums a
few months ago:
http://opensolaris.org/jive/thread.jspa?messageID=260281
The big problem appears to be getting your hands on these cards. Although I
have the drivers now my first supplier let me down, and while the sec
I was looking into something like that last year, I was mirroring two iSCSI
drives using ZFS. The only real problem I found was that ZFS hangs the pool
for 3 minutes if an iSCSI device gets disconnected; unfortunately ZFS waits
for the iSCSI timeout before it realises something has happened.
Great to hear the driver works, I'll cross my fingers that my cards eventually
turn up.
I'm pretty sure I read something about work being done on ZFS to smooth out
those lumpy writes, I'm not 100% sure it's the same problem but it did sound
very similar. I can't remember the RFE / bug number o
What version of Solaris are you running there? For a long while the default
response on encountering unrecoverable errors was to panic, but I believe that
has been improved in newer builds.
Also, part of your problem may be down to running with just a single disk.
With just one disk, ZFS stil
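Two settings worth a look on a single-disk pool, assuming your build
supports them (pool name made up):
# zfs set copies=2 tank
# zpool set failmode=continue tank
copies=2 keeps a second copy of each newly written block on the same disk so
isolated corruption can be repaired, and failmode controls whether the pool
waits, carries on or panics when it hits an unrecoverable failure.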
It's a capital "i", not an "L". For sending all intermediate snapshots to a
destination.
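For example, with made-up names:
# zfs send -I tank/data@monday tank/data@friday | zfs receive -d backup
That single stream carries every snapshot between monday and friday rather
than just the final delta.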
I agree, it looks like it would be perfect, but unfortunately without Solaris
drivers it's pretty much a non starter.
That hasn't stopped me pestering Fusion-IO wherever I can though to see if they
are willing to develop Solaris drivers, almost everywhere I've seen these
reviewed there have bee
Regarding Solaris, or just a generic cold call? Either way it's interesting to
hear that they're making calls. The impression I've had from all the press is
that they've been struggling to meet demand.
ht a year's worth of data would be
enough, something like:
15 mins: keep 8
hourly: keep 48
daily: keep 31
weekly: keep 10
monthly: keep 12
Ross
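If memory serves, the keep counts live as SMF properties on the
auto-snapshot service instances, so the schedule above would be set with
something like the following - treat the instance and property names as from
memory and check svccfg listprop on your system first:
# svccfg -s auto-snapshot:frequent setprop zfs/keep=8
# svccfg -s auto-snapshot:hourly setprop zfs/keep=48
# svcadm refresh auto-snapshot:frequent auto-snapshot:hourly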
will even be ready for inclusion in ON, so while it's in the works it's likely
to be some time before we see it.
Ross
already? Can anybody confirm that it will behave as I am hoping in that the
script will take the next snapshot, but the send / receive will fail and
generate an e-mail alert?
thanks,
Ross
the error:
# zpool create test /dev/rdsk/c3d1p0
cannot use '/dev/rdsk/c3d1p0': must be a block device or regular file
Unfortunately I'm once again at the limits of my Solaris knowledge, so any help
you folks can give will be very much appreciated.
thanks,
Ross
Oh, ok. So /dev/rdsk is never going to work then. Mind if I pick your brain a
little more while I try to understand this properly?
The man pages for the nvram card state that /dev/rdsk will normally be the
preferred way to access these devices, since /dev/dsk is cached by the kernel,
whi
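So the fix should simply be to point zpool at the block device node instead,
or let it resolve the name itself:
# zpool create test /dev/dsk/c3d1p0
or just
# zpool create test c3d1p0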
Wohoo! Best news I've heard all week. Thanks for posting David :)
Just a thought, will we be able to split the ioDrive into slices and use it
simultaneously as a ZIL and slog device? 5GB of write cache and 75GB of read
cache sounds to me like a nice way to use the 80GB model.
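In theory it's just the usual commands pointed at two slices - the device
names here are made up, and I've no idea yet how the ioDrive will actually
enumerate:
# zpool add tank log c5t0d0s0
# zpool add tank cache c5t0d0s1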
Very interesting idea, thanks for sharing it.
Infiniband would definitely be worth looking at for performance, although I
think you'd need iSER to get the benefits and that might still be a little new:
http://www.opensolaris.org/os/project/iser/Release-notes/.
It's also worth bearing in mind
D'oh, meant ZIL / slog and L2ARC device. Must have posted that before my early
morning cuppa!
> Or would they? A box dedicated to being a RAM based
> slog is going to be
> faster than any SSD would be. Especially if you make
> the expensive jump
> to 8Gb FC.
Not necessarily. While this has some advantages in terms of price &
performance, at ~$2400 the 80GB ioDrive would give it a run f
As far as I can tell, it all comes down to whether ZFS detects the failure
properly, and what commands you use as it's recovering.
Running "zpool status" is a complete no no if your array is degraded in any
way. This is capable of locking up zfs even when it would otherwise have
recovered itse