Re: Has anyone used hast in production yet - opinions?
On Mon, Oct 04, 2010 at 09:37:40AM +0100, Pete French wrote:
> I've been testing out hast internally here, and as a replacement for the
> gmirror+ggate stack which we are currently using it seems excellent. I am
> actually about to take the plunge and deploy this in production for our
> main database server, as it seems to operate fine in testing, but before I
> do, has anyone else done this, and are there any interesting things I need
> to know?

Please see the freebsd-fs mailing list, which has quite a large number
of problem reports/issues being posted to it on a regular basis (and
patches are often provided).

http://lists.freebsd.org/pipermail/freebsd-fs/

--
| Jeremy Chadwick                                  j...@parodius.com |
| Parodius Networking                      http://www.parodius.com/ |
| UNIX Systems Administrator                 Mountain View, CA, USA |
| Making life hard for others since 1977.             PGP: 4BD6C0CB |
Has anyone used hast in production yet - opinions?
I've been testing out hast internally here, and as a replacement for the
gmirror+ggate stack which we are currently using it seems excellent. I am
actually about to take the plunge and deploy this in production for our
main database server, as it seems to operate fine in testing, but before I
do, has anyone else done this, and are there any interesting things I need
to know?

cheers,

-pete.
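For anyone who hasn't tried it yet, the moving parts are small; a minimal
two-node setup looks roughly like the sketch below. The host names, the
172.16.0.x addresses and the /dev/ada1 provider are made up - see
hast.conf(5) and hastctl(8) for the authoritative details.

  # /etc/hast.conf, identical on both nodes
  resource shared {
          on hasta {
                  local /dev/ada1
                  remote 172.16.0.2
          }
          on hastb {
                  local /dev/ada1
                  remote 172.16.0.1
          }
  }

  # on both nodes: initialise the metadata and start hastd
  hastctl create shared
  /etc/rc.d/hastd onestart

  # pick roles; the replicated provider then appears as /dev/hast/shared
  # on the primary only
  hastctl role primary shared      # on the master
  hastctl role secondary shared    # on the slave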
Re: zfs send/receive: is this slow?
Try using zfs receive with the -v flag (gives you some stats at the end):
# zfs send storage/bac...@transfer | zfs receive -v storage/compressed/bacula

And use the following sysctl (you may set that in /boot/loader.conf, too):
# sysctl vfs.zfs.txg.write_limit_override=805306368

I have good results with the 768MB write limit on systems with at least
8GB RAM. With 4GB RAM, you might want to try to set the TXG write limit
to a lower threshold (e.g. 256MB):
# sysctl vfs.zfs.txg.write_limit_override=268435456

You can experiment with that setting to get the best results on your
system. A value of 0 means using the calculated default (which is very high).

During the operation you can observe what your disks actually do:
a) via ZFS pool I/O statistics:
# zpool iostat -v 1
b) via GEOM:
# gstat -a

mm

On 4. 10. 2010 4:06, Artem Belevich wrote:
> On Sun, Oct 3, 2010 at 6:11 PM, Dan Langille wrote:
>> I'm rerunning my test after I had a drive go offline[1]. But I'm not
>> getting anything like the previous test:
>>
>> time zfs send storage/bac...@transfer | mbuffer | zfs receive
>> storage/compressed/bacula-buffer
>>
>> $ zpool iostat 10 10
>>                capacity     operations    bandwidth
>> pool         used  avail   read  write   read  write
>> ----------  -----  -----  -----  -----  -----  -----
>> storage     6.83T  5.86T      8     31  1.00M  2.11M
>> storage     6.83T  5.86T    207    481  25.7M  17.8M
>
> It may be worth checking individual disk activity using gstat -f 'da.$'
>
> Some time back I had one drive that was noticeably slower than the
> rest of the drives in RAID-Z2 vdev and was holding everything back.
> SMART looked OK, there were no obvious errors and yet performance was
> much worse than what I'd expect. gstat clearly showed that one drive
> was almost constantly busy with much lower number of reads and writes
> per second than its peers.
>
> Perhaps previously fast transfer rates were due to caching effects.
> I.e. if all metadata already made it into ARC, subsequent "zfs send"
> commands would avoid a lot of random seeks and would show much better
> throughput.
>
> --Artem
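Just to make the knobs concrete: the tunable can go into /boot/loader.conf
to survive reboots, and a quick way to compare a few values is a loop like
the sketch below. The pool/dataset names are placeholders rather than the
ones in this thread, and the three values simply cover the 256MB-768MB
range suggested above.

  # /boot/loader.conf - make the setting persistent (256MB example)
  vfs.zfs.txg.write_limit_override=268435456

  # or compare a few values at runtime using a throwaway receive target
  for limit in 268435456 536870912 805306368; do
          sysctl vfs.zfs.txg.write_limit_override=${limit}
          time sh -c 'zfs send pool/dataset@snap | zfs receive -v pool/testcopy'
          zfs destroy -r pool/testcopy
  done

  # back to the calculated default afterwards
  sysctl vfs.zfs.txg.write_limit_override=0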
Re: out of HDD space - zfs degraded
Alexander Leidinger wrote:
> On Sat, 02 Oct 2010 22:25:18 -0400 Steve Polyack wrote:
>
>> I think it's worth it to think about TLER (or the absence of it) here -
>> http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery . Your
>> consumer / SATA Hitachi drives likely do not put a limit on the time
>> the drive may block on a command while handling internal errors. If
>> we consider that gpt/disk06-live encountered some kind of error and
>> had to relocate a significant number of blocks or perform some other
>> error recovery, then it very well may have timed out long enough for
>> siis(4) to drop the device. I have no idea what the timeouts are set
>> to in the siis(4) driver, nor does anything in your SMART report
>> stick out to me (though I'm certainly no expert with SMART data, and
>> my understanding is that many drive manufacturers report the various
>> parameters in different ways).

Timeouts for commands are usually defined by the ada(4) peripheral driver
and the ATA transport layer of CAM. Most of the timeouts are set to 30
seconds. The only time value defined by siis(4) is the hard reset time -
15 seconds now.

Since the drive didn't reappear after `camcontrol reset/rescan ...` was
done a significant period of time later, but instead required a power
cycle, I doubt that any timeout value could have helped it.

It may also theoretically be possible that it was the controller firmware
that got stuck, not the drive. It would be interesting to power cycle the
specific drive if the problem repeats.

> IIRC mav@ (CCed) made a commit regarding this to -current in the not so
> distant past. I do not know about the MFC status of this, or if it may
> have helped or not in this situation.

My last commit to siis(4) two weeks ago (merged recently) fixed a specific
bug in timeout handling that led to a system crash. I don't see similar
symptoms here. If there were any messages before "Oct 2 00:50:53 kraken
kernel: (ada0:siisch0:0:0:0): lost device", they could give some hints
about the original problem. Messages after it could be a consequence.
Enabling verbose kernel messages could give some more information about
what happened there.

--
Alexander Motin
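Should it happen again, a rough sequence for poking at a vanished disk
before reaching for the power button might look like this (the bus number
is only an example - camcontrol devlist -v shows the real one):

  # camcontrol devlist -v          (see which scbus the drive hangs off)
  # camcontrol reset 2             (reset that bus)
  # camcontrol rescan 2
  # camcontrol rescan all          (or rescan everything)

Verbose kernel messages can be had by booting with -v from the loader, or
by setting boot_verbose="YES" in /boot/loader.conf.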
Re: Panic: attempted pmap_enter on 2MB page
Alan Cox writes:
> I'm afraid that I can't offer much insight without a stack trace. At
> initialization time, we map the kernel with 2MB pages. I suspect that
> something within the kernel is later trying to change one of those
> mappings. If I had to guess, it's related to the mfs root.

Here is the stack trace. The machine is sitting here in KDB if you need
me to extract any information from it.

db> bt
Tracing pid 0 tid 0 td 0x80c67140
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x17b
pmap_enter() at pmap_enter+0x641
kmem_malloc() at kmem_malloc+0x1b5
uma_large_malloc() at uma_large_malloc+0x4a
malloc() at malloc+0xd7
acpi_alloc_wakeup_handler() at acpi_alloc_wakeup_handler+0x82
mi_startup() at mi_startup+0x59
btext() at btext+0x2c
db>

--
Dave Hayes - Consultant - Altadena CA, USA - d...@jetcafe.org
>>> The opinions expressed above are entirely my own <<<

Only one who is seeking certainty can be uncertain.
Re: out of HDD space - zfs degraded
On Mon, Oct 04, 2010 at 05:03:47PM +0300, Alexander Motin wrote:
> Alexander Leidinger wrote:
> > On Sat, 02 Oct 2010 22:25:18 -0400 Steve Polyack wrote:
> >
> >> I think it's worth it to think about TLER (or the absence of it) here -
> >> http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery . [...]
>
> Timeouts for commands are usually defined by the ada(4) peripheral driver
> and the ATA transport layer of CAM. Most of the timeouts are set to 30
> seconds. The only time value defined by siis(4) is the hard reset time -
> 15 seconds now.
>
> Since the drive didn't reappear after `camcontrol reset/rescan ...` was
> done a significant period of time later, but instead required a power
> cycle, I doubt that any timeout value could have helped it.
>
> It may also theoretically be possible that it was the controller firmware
> that got stuck, not the drive. It would be interesting to power cycle the
> specific drive if the problem repeats.

FWIW, I agree with mav@.

I also wanted to talk a bit about TLER, since it wouldn't have helped in
this situation.

TLER wouldn't have helped because the drive (either mechanically or the
firmware) went catatonic and required a power-cycle. TLER would have
caused siis(4) to witness a proper ATA error code for the read/write it
submitted to the drive, and the timeout would (ideally) be shorter than
that of the siis(4) or ada(4) layers. So that ATA command would have
failed and the OS, almost certainly, would have continued to submit more
requests, resulting in an error response, etc...

Depending on how the drivers were written, this situation could cause the
storage driver to get stuck in an infinite loop trying to read or write
to a device that's catatonic. I've seen this happen on Solaris with,
again, those Fujitsu disks (the drives return an error status in response
to the CDB, the OS/driver says "that's nice" and continues to submit
commands to the drive because it's still responding).

What I'm trying to say: what ada(4) and/or siis(4) did was the Correct
Thing(tm) in my opinion. This is one of the reasons I don't blindly
enable TLER on the WDC Black disks that I have. Someone would need to
quite honestly test the behaviour of TLER with FreeBSD (testing both the
disk-catatonic state as well as transient error + recovery) to see how
things behave.

--
| Jeremy Chadwick                                  j...@parodius.com |
| Parodius Networking                      http://www.parodius.com/ |
| UNIX Systems Administrator                 Mountain View, CA, USA |
| Making life hard for others since 1977.             PGP: 4BD6C0CB |
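If someone does take up that testing, note that newer smartmontools can
query and, on drives that support SCT ERC, set the recovery timers from
the OS side rather than via a vendor DOS utility. This is only a sketch
(ada0 is an example device, the smartmontools version on a given box may
predate the option, and whether a particular WDC Black accepts it is
exactly the open question):

  # smartctl -l scterc /dev/ada0          (show current read/write ERC timers)
  # smartctl -l scterc,70,70 /dev/ada0    (limit recovery to 7.0s, TLER-like)
  # smartctl -l scterc,0,0 /dev/ada0      (disable the limits again)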
Re: out of HDD space - zfs degraded
Quoting Dan Langille (from Sun, 03 Oct 2010 08:08:19 -0400):

> Overnight, the following appeared in /var/log/messages:
>
> Oct 2 21:56:46 kraken root: ZFS: checksum mismatch, zpool=storage
> path=/dev/gpt/disk06-live offset=123103157760 size=1024
> Oct 2 21:56:47 kraken root: ZFS: checksum mismatch, zpool=storage
> path=/dev/gpt/disk06-live offset=123103159808 size=1024
> [...]
>
> Given the outage from yesterday when ada0 was offline for several hours,
> I'm guessing that checksum mismatches on that drive are expected. Yes,
> /dev/gpt/disk06-live == ada0.

If you have the possibility to run a scrub of the pool, there should be
no additional checksum errors occurring *after* the scrub is *finished*.
If checksum errors still appear on this disk after the scrub is finished,
you should have a look at the hardware (cable/disk) and take appropriate
replacement actions.

Bye,
Alexander.

--
If your mother knew what you're doing, she'd probably hang her head and cry.

http://www.Leidinger.net  Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org       netchild @ FreeBSD.org : PGP ID = 72077137
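For completeness, kicking that off and watching it is just (pool name as
used elsewhere in this thread):

  # zpool scrub storage
  # zpool status -v storage    (scrub progress plus per-device error counters)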
Re: Has anyone used hast in production yet - opinions?
> Please see the freebsd-fs mailing list, which has quite a large number
> of problem reports/issues being posted to it on a regular basis (and
> patches are often provided).

Thanks, have signed up - I was signed up to 'geom' but not that one. A
large number of problem reports is not quite what I was hoping for, but
good to know, and maybe I shall hold off for a while :-)

cheers,

-pete.
Re: is there a bug in AWK on 6.x and 7.x (fixed in 8.x)?
Is this because the behavior of "FS=" was changed to match the behavior
of awk -F?

On 2010-10-02 09:58:27PM +0200, Miroslav Lachman wrote:
> I think there is a bug in AWK in the base of FreeBSD 6.x and 7.x (tested
> on 6.4 i386 and 7.3 i386).
>
> I have this simple test case, where I want 2 columns from a GeoIP CSV file:
>
> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv
>
> It should produce output like this:
>
> # awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0"-"1.7.255.255"
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> (above is taken from FreeBSD 8.1 i386)
>
> On FreeBSD 6.4 and 7.3 it results in a broken first line:
>
> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> There are no errors in the CSV file; it doesn't matter if I delete the
> affected first line from the file.
>
> It is reproducible with a handmade file:
>
> # cat test.csv
> "1.9.0.0","1.9.255.255","17367040","17432575","MY","Malaysia"
> "1.10.10.0","1.10.10.255","17435136","17435391","AU","Australia"
> "1.11.0.0","1.11.255.255","17498112","17563647","KR","Korea, Republic of"
> "1.12.0.0","1.15.255.255","17563648","17825791","CN","China"
> "1.16.0.0","1.19.255.255","17825792","18087935","KR","Korea, Republic of"
> "1.21.0.0","1.21.255.255","18153472","18219007","JP","Japan"
>
> # awk 'FS="," { print $1"-"$2 }' test.csv
> "1.9.0.0","1.9.255.255","17367040","17432575","MY","Malaysia"-
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
> "1.16.0.0"-"1.19.255.255"
> "1.21.0.0"-"1.21.255.255"
>
> As it works in 8.1, can it be fixed in 7-STABLE?
> (I don't know if it was purposely fixed or if it is a coincidence of the
> newer version of AWK in 8.x.)
>
> Should I file a PR for it?
>
> Miroslav Lachman

--
===========================================================================
Peter C. Lai                 | Bard College at Simon's Rock
Systems Administrator        | 84 Alford Rd.
Information Technology Svcs. | Gt. Barrington, MA 01230 USA
peter AT simons-rock.edu     | (413) 528-7428
===========================================================================
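If I read the awk semantics right, assigning FS inside a rule is only
guaranteed to affect how later records are split; the first record may
already have been split using the default whitespace FS by the time the
assignment runs, which is exactly the broken first line shown above.
Whichever awk version is installed, these two forms behave the same
everywhere:

  awk -F, '{ print $1 "-" $2 }' GeoIPCountryWhois.csv
  awk 'BEGIN { FS = "," } { print $1 "-" $2 }' GeoIPCountryWhois.csv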
Re: zfs send/receive: is this slow?
On Mon, October 4, 2010 3:27 am, Martin Matuska wrote:
> Try using zfs receive with the -v flag (gives you some stats at the end):
> # zfs send storage/bac...@transfer | zfs receive -v
> storage/compressed/bacula
>
> And use the following sysctl (you may set that in /boot/loader.conf, too):
> # sysctl vfs.zfs.txg.write_limit_override=805306368
>
> I have good results with the 768MB write limit on systems with at least
> 8GB RAM. With 4GB RAM, you might want to try to set the TXG write limit
> to a lower threshold (e.g. 256MB):
> # sysctl vfs.zfs.txg.write_limit_override=268435456
>
> You can experiment with that setting to get the best results on your
> system. A value of 0 means using the calculated default (which is very high).

I will experiment with the above. In the meantime:

> During the operation you can observe what your disks actually do:
> a) via ZFS pool I/O statistics:
> # zpool iostat -v 1
> b) via GEOM:
> # gstat -a

The following output was produced while the original copy was underway.

$ sudo gstat -a -b -I 20s
dT: 20.002s w: 20.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    7    452    387  24801    9.5     64   2128    7.1    79.4 ada0
    7    452    387  24801    9.5     64   2128    7.2    79.4 ada0p1
    4    492    427  24655    6.7     64   2128    6.6    63.0 ada1
    4    494    428  24691    6.9     65   2127    6.6    66.9 ada2
    8    379    313  24798   13.5     65   2127    7.5    78.6 ada3
    5    372    306  24774   14.2     64   2127    7.5    77.6 ada4
   10    355    291  24741   15.9     63   2127    7.4    79.6 ada5
    4    380    316  24807   13.2     64   2128    7.7    77.0 ada6
    7    452    387  24801    9.5     64   2128    7.4    79.7 gpt/disk06-live
    4    492    427  24655    6.7     64   2128    6.7    63.1 ada1p1
    4    494    428  24691    6.9     65   2127    6.6    66.9 ada2p1
    8    379    313  24798   13.5     65   2127    7.6    78.6 ada3p1
    5    372    306  24774   14.2     64   2127    7.6    77.6 ada4p1
   10    355    291  24741   15.9     63   2127    7.5    79.6 ada5p1
    4    380    316  24807   13.2     64   2128    7.8    77.0 ada6p1
    4    492    427  24655    6.8     64   2128    6.9    63.4 gpt/disk01-live
    4    494    428  24691    6.9     65   2127    6.8    67.2 gpt/disk02-live
    8    379    313  24798   13.5     65   2127    7.7    78.8 gpt/disk03-live
    5    372    306  24774   14.2     64   2127    7.8    77.8 gpt/disk04-live
   10    355    291  24741   15.9     63   2127    7.7    79.8 gpt/disk05-live
    4    380    316  24807   13.2     64   2128    8.0    77.2 gpt/disk07-live

$ zpool iostat 10
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
storage     8.08T  4.60T    364    161  41.7M  7.94M
storage     8.08T  4.60T    926    133   112M  5.91M
storage     8.08T  4.60T    738    164  89.0M  9.75M
storage     8.08T  4.60T  1.18K    179   146M  8.10M
storage     8.08T  4.60T  1.09K    193   135M  9.94M
storage     8.08T  4.60T   1010    185   122M  8.68M
storage     8.08T  4.60T  1.06K    184   131M  9.65M
storage     8.08T  4.60T    867    178   105M  11.8M
storage     8.08T  4.60T  1.06K    198   131M  12.0M
storage     8.08T  4.60T  1.06K    185   131M  12.4M

Yesterday's write bandwidth was more like 80-90M. It's down, a lot.

I'll look closer this evening.

> mm
>
> On 4. 10. 2010 4:06, Artem Belevich wrote:
>> On Sun, Oct 3, 2010 at 6:11 PM, Dan Langille wrote:
>>> I'm rerunning my test after I had a drive go offline[1]. But I'm not
>>> getting anything like the previous test:
>>>
>>> time zfs send storage/bac...@transfer | mbuffer | zfs receive
>>> storage/compressed/bacula-buffer
>>>
>>> $ zpool iostat 10 10
>>>                capacity     operations    bandwidth
>>> pool         used  avail   read  write   read  write
>>> ----------  -----  -----  -----  -----  -----  -----
>>> storage     6.83T  5.86T      8     31  1.00M  2.11M
>>> storage     6.83T  5.86T    207    481  25.7M  17.8M
>>
>> It may be worth checking individual disk activity using gstat -f 'da.$'
>>
>> Some time back I had one drive that was noticeably slower than the
>> rest of the drives in RAID-Z2 vdev and was holding everything back.
>> SMART looked OK, there were no obvious errors and yet performance was
>> much worse than what I'd expect. gstat clearly showed that one drive
>> was almost constantly busy with much lower number of reads and writes
>> per second than its peers.
>>
>> Perhaps previously fast transfer rates were due to caching effects.
>> I.e. if all metadata already made it into ARC, subsequent "zfs send"
>> commands would avoid a lot of random seeks and would show much better
>> throughput.
>>
>> --Artem
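One small note on the gstat filter from Artem's mail: with the ada devices
in this box the pattern needs adjusting, something along these lines (the
exact regex is just an example):

  $ gstat -b -I 10s -f '^ada[0-9]+$'    (whole disks only, 10-second batches)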
Re: zfs send/receive: is this slow?
On Mon, Oct 04, 2010 at 01:31:07PM -0400, Dan Langille wrote:
[...]
Re: out of HDD space - zfs degraded
On 10/4/2010 7:19 AM, Alexander Leidinger wrote:
> Quoting Dan Langille (from Sun, 03 Oct 2010 08:08:19 -0400):
>
>> Overnight, the following appeared in /var/log/messages:
>>
>> Oct 2 21:56:46 kraken root: ZFS: checksum mismatch, zpool=storage
>> path=/dev/gpt/disk06-live offset=123103157760 size=1024
>> Oct 2 21:56:47 kraken root: ZFS: checksum mismatch, zpool=storage
>> path=/dev/gpt/disk06-live offset=123103159808 size=1024
>> [...]
>>
>> Given the outage from yesterday when ada0 was offline for several hours,
>> I'm guessing that checksum mismatches on that drive are expected. Yes,
>> /dev/gpt/disk06-live == ada0.
>
> If you have the possibility to run a scrub of the pool, there should be
> no additional checksum errors occurring *after* the scrub is *finished*.
> If checksum errors still appear on this disk after the scrub is finished,
> you should have a look at the hardware (cable/disk) and take appropriate
> replacement actions.

For the record, just finished a scrub.

$ zpool status
  pool: storage
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 13h48m with 0 errors on Mon Oct 4 16:54:15 2010
config:

        NAME                 STATE     READ WRITE CKSUM
        storage              ONLINE       0     0     0
          raidz2             ONLINE       0     0     0
            gpt/disk01-live  ONLINE       0     0     0
            gpt/disk02-live  ONLINE       0     0     0
            gpt/disk03-live  ONLINE       0     0     0
            gpt/disk04-live  ONLINE       0     0     0
            gpt/disk05-live  ONLINE       0     0     0
            gpt/disk06-live  ONLINE       0     0  141K  3.47G repaired
            gpt/disk07-live  ONLINE       0     0     0

errors: No known data errors

--
Dan Langille - http://langille.org/
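Given the clean scrub, the only step left per the 'action' text above is
to reset the counters so that any new errors stand out (pool and device
names as in the output above):

  # zpool clear storage gpt/disk06-live    (or simply: zpool clear storage)
  # zpool status storage                   (CKSUM column should now read 0)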
Re: zfs send/receive: is this slow?
On 10/4/2010 2:10 PM, Jeremy Chadwick wrote:
[...]
Please read all of the following items before responding in-line. Some
are just informational items for other people reading the thread. Than
Re: Panic: attempted pmap_enter on 2MB page
Dave Hayes wrote:
> Alan Cox writes:
>> I'm afraid that I can't offer much insight without a stack trace. At
>> initialization time, we map the kernel with 2MB pages. I suspect that
>> something within the kernel is later trying to change one of those
>> mappings. If I had to guess, it's related to the mfs root.
>
> Here is the stack trace. The machine is sitting here in KDB if you need
> me to extract any information from it.
>
> db> bt
> Tracing pid 0 tid 0 td 0x80c67140
> kdb_enter() at kdb_enter+0x3d
> panic() at panic+0x17b
> pmap_enter() at pmap_enter+0x641
> kmem_malloc() at kmem_malloc+0x1b5
> uma_large_malloc() at uma_large_malloc+0x4a
> malloc() at malloc+0xd7
> acpi_alloc_wakeup_handler() at acpi_alloc_wakeup_handler+0x82
> mi_startup() at mi_startup+0x59
> btext() at btext+0x2c
> db>

Thanks. There are two pieces of information that might be helpful: the
value of the global variable "kernel_vm_end" and the virtual address that
was passed to pmap_enter().

Is this problem reproducible? I don't recall if you mentioned that
earlier. Can you take a crash dump?

Alan
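In case it saves a round trip: from the ddb prompt the box is sitting at,
something along these lines should at least get kernel_vm_end (examine,
giant word, hex). The virtual address passed to pmap_enter() is harder to
dig out of the frame by hand, so the crash dump is probably the easier
route - 'call doadump' only helps if a dump device is already configured,
after which 'reset' reboots and the dump can be read with kgdb.

  db> x/gx kernel_vm_end
  db> show registers
  db> call doadump
  db> reset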
Re: zfs send/receive: is this slow?
On Mon, Oct 04, 2010 at 07:39:16PM -0400, Dan Langille wrote:
[...]