Re: Any success stories for HAST + ZFS?
On Sat, 26 Mar 2011 10:52:08 -0700 Freddie Cash wrote:

FC> hastd backtrace is here:
FC> http://www.sd73.bc.ca/downloads/crash/hast-backtrace.png

It is not a hastd crash, but a kernel crash triggered by the hastd process.

I am not sure I got the same crash as you, but apparently a race is possible
in g_gate on device creation.

I got the following crash starting many hast providers simultaneously:

fault virtual address = 0x0

#8  0xc0c11adc in calltrap () at /usr/src/sys/i386/i386/exception.s:168
#9  0xc086ac6b in g_gate_ioctl (dev=0xc6a24300, cmd=3374345472,
    addr=0xc9fec000 "\002", flags=3, td=0xc7ff0b80)
    at /usr/src/sys/geom/gate/g_gate.c:410
#10 0xc0853c5b in devfs_ioctl_f (fp=0xc9b9e310, com=3374345472,
    data=0xc9fec000, cred=0xc8c9c200, td=0xc7ff0b80)
    at /usr/src/sys/fs/devfs/devfs_vnops.c:678
#11 0xc09210cd in kern_ioctl (td=0xc7ff0b80, fd=3, com=3374345472,
    data=0xc9fec000 "\002") at file.h:262
#12 0xc0921254 in ioctl (td=0xc7ff0b80, uap=0xf5edbcec)
    at /usr/src/sys/kern/sys_generic.c:679
#13 0xc0916616 in syscallenter (td=0xc7ff0b80, sa=0xf5edbce4)
    at /usr/src/sys/kern/subr_trap.c:315
#14 0xc0c2b9ff in syscall (frame=0xf5edbd28)
    at /usr/src/sys/i386/i386/trap.c:1086
#15 0xc0c11b71 in Xint0x80_syscall ()
    at /usr/src/sys/i386/i386/exception.s:266

Or just creating many ggate devices simultaneously:

for i in `jot 100`; do ./ggiocreate $i& done

ggiocreate.c is attached.

In my case the kernel crashes in g_gate_create() when checking for name
collisions in strcmp():

        /* Check for name collision. */
        for (unit = 0; unit < g_gate_maxunits; unit++) {
                if (g_gate_units[unit] == NULL)
                        continue;
                if (strcmp(name, g_gate_units[unit]->sc_provider->name) != 0)
                        continue;
                mtx_unlock(&g_gate_units_lock);
                mtx_destroy(&sc->sc_queue_mtx);
                free(sc, M_GATE);
                return (EEXIST);
        }

I think the issue is the following. When preparing sc we take
g_gate_units_lock, check for name collision, fill the sc fields except
sc->sc_provider, and register sc in g_gate_units[unit]. sc_provider is
filled later, after g_gate_units_lock has been released. So the following
scenario is possible:

1) Thread A registers sc in g_gate_units[unit] with
g_gate_units[unit]->sc_provider still null and releases g_gate_units_lock.

2) Thread B traverses g_gate_units[] when checking for name collision and
crashes accessing g_gate_units[unit]->sc_provider->name.

The attached patch fixes the issue in my case.

--
Mikolaj Golub

ggiocreate.c
Description: Binary data

Index: sys/geom/gate/g_gate.c
===
--- sys/geom/gate/g_gate.c (revision 220050)
+++ sys/geom/gate/g_gate.c (working copy)
@@ -407,13 +407,14 @@ g_gate_create(struct g_gate_ctl_create *ggio)
         for (unit = 0; unit < g_gate_maxunits; unit++) {
                 if (g_gate_units[unit] == NULL)
                         continue;
-                if (strcmp(name, g_gate_units[unit]->sc_provider->name) != 0)
+                if (strcmp(name, g_gate_units[unit]->sc_name) != 0)
                         continue;
                 mtx_unlock(&g_gate_units_lock);
                 mtx_destroy(&sc->sc_queue_mtx);
                 free(sc, M_GATE);
                 return (EEXIST);
         }
+        sc->sc_name = name;
         g_gate_units[sc->sc_unit] = sc;
         g_gate_nunits++;
         mtx_unlock(&g_gate_units_lock);
@@ -432,6 +433,9 @@ g_gate_create(struct g_gate_ctl_create *ggio)
         sc->sc_provider = pp;
         g_error_provider(pp, 0);
         g_topology_unlock();
+        mtx_lock(&g_gate_units_lock);
+        sc->sc_name = sc->sc_provider->name;
+        mtx_unlock(&g_gate_units_lock);
 
         if (sc->sc_timeout > 0) {
                 callout_reset(&sc->sc_callout, sc->sc_timeout * hz,
Index: sys/geom/gate/g_gate.h
===
--- sys/geom/gate/g_gate.h (revision 220050)
+++ sys/geom/gate/g_gate.h (working copy)
@@ -76,6 +76,7 @@
  * 'P:' means 'Protected by'.
  */
 struct g_gate_softc {
+        char *sc_name; /* P: (read-only) */
         int sc_unit; /* P: (read-only) */
         int sc_ref; /* P: g_gate_list_mtx */
         struct g_provider *sc_provider; /* P: (read-only) */
@@ -96,7 +97,6 @@ struct g_gate_softc {
         LIST_ENTRY(g_gate_softc) sc_next; /* P: g_gate_list_mtx */
         char sc_info[G_GATE_INFOSIZE]; /* P: (read-only) */
 };
-#define sc_name sc_provider->geom->name
 
 #define G_GATE_DEBUG(lvl, ...) do { \
         if (g_gate_debug >= (lvl)) { \
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Any success stories for HAST + ZFS?
On Sun, 27 Mar 2011 15:16:15 +0300 Mikolaj Golub wrote to Freddie Cash:

MG> The attached patch fixes the issue in my case.

The patch is committed to current.

--
Mikolaj Golub
Re: Any success stories for HAST + ZFS?
On Mon, 28 Mar 2011 10:47:22 +0100 Pete French wrote:

>> It is not a hastd crash, but a kernel crash triggered by hastd process.
>>
>> I am not sure I got the same crash as you but apparently the race is
>> possible in g_gate on device creation.
>>
>> I got the following crash starting many hast providers simultaneously:

PF> This is very interesting to me - my successful ZFS+HAST only had
PF> a single drive, but in my new setup I am intending to use two
PF> HAST processes and then mirror across them under ZFS, so I am
PF> likely to hit this bug. Are the processes stable once launched?

Yes, you may hit it only on hast device creation. The workaround is to avoid
using 'hastctl role primary all' and to start the providers one by one
instead.

--
Mikolaj Golub
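The workaround above (one resource per hastctl call rather than 'all') can be
sketched as a small shell loop. This is only an illustration: the HASTCTL
override variable and the resource names in the usage line are assumptions
made for the example, and the thread does not say whether hastctl waits for
the ggate device to appear before returning, so treat it as reducing the
chance of simultaneous creation rather than as a guaranteed fix.

```shell
#!/bin/sh
# Sketch: switch HAST resources to primary one at a time instead of using
# 'hastctl role primary all'.
# HASTCTL is a hypothetical override so the loop can be dry-run with 'echo'.
: "${HASTCTL:=hastctl}"

start_providers_serially() {
    for res in "$@"; do
        # One resource per invocation, as suggested in the thread.
        $HASTCTL role primary "$res" || return 1
    done
}
```

Usage would be something like `start_providers_serially data0 data1`, where
data0 and data1 stand for whatever resources are defined in hast.conf.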
Re: way for failover zpool (no HAST needed)
On Tue, 29 Mar 2011 13:17:01 +0200 Denny Schierz wrote:

DS> hi,

DS> my two nodes are running fine with 8.2-stable and the LSI 9200-8e and
DS> now, I want to build a failover for the Zpool (and later ISCSI target)

DS> Both nodes are connected to the same disks (jbod) and now I need a way,
DS> to get the zpool(s) running on the node with the CARP public IP.

You don't need HAST, but you might want to try net-mgmt/hastmon :-)
I wrote it because I didn't much like failing over with CARP.

For hastmon you need at least 3 hosts: 2 cluster nodes (primary/secondary)
and a watchdog. The watchdog polls the states of the cluster nodes. The
secondary decides to fail over when:

1) There is no connection with the primary.

2) There are complaints from the watchdog.

The configuration is simple and would look like below (on all 3 hosts):

resource iscsi {
        exec /etc/iscsi.sh

        on hostA {
                remote hostB
                priority 0
        }
        on hostB {
                remote hostA
                priority 1
        }
        on hostW {
                remote hostA hostB
        }
}

The /etc/iscsi.sh script should support at least 3 arguments:

start -- switch node to primary (iscsi up, IP up, etc);
stop -- switch node to secondary;
status -- return current status (0 - UP, 1 - DOWN, 2 - UNKNOWN).

You can find more information in the README:

http://code.google.com/p/hastmon/wiki/README

--
Mikolaj Golub
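A minimal skeleton for such an exec script might look like the following. It
only illustrates the start/stop/status contract described above. STATE_FILE
and the function name resource_ctl are hypothetical stand-ins: a real script
would start or stop the iSCSI target and the service IP, and would derive the
status from the actual state of those services.

```shell
#!/bin/sh
# Skeleton for a hastmon 'exec' script such as /etc/iscsi.sh.
# STATE_FILE is a hypothetical placeholder for real checks (target running,
# service IP configured, ...); replace the bodies with real actions.
: "${STATE_FILE:=/var/run/iscsi-resource.state}"

resource_ctl() {
    case "$1" in
    start)      # switch node to primary: iscsi up, IP up, etc.
        echo up > "$STATE_FILE"
        ;;
    stop)       # switch node to secondary
        echo down > "$STATE_FILE"
        ;;
    status)     # 0 - UP, 1 - DOWN, 2 - UNKNOWN
        [ -r "$STATE_FILE" ] || return 2
        case "$(cat "$STATE_FILE")" in
        up)   return 0 ;;
        down) return 1 ;;
        *)    return 2 ;;
        esac
        ;;
    *)
        echo "usage: $0 start|stop|status" >&2
        return 64
        ;;
    esac
}

# In a standalone script, finish with:  resource_ctl "$@"
```

Returning 2 whenever the state cannot be determined keeps "don't know"
distinct from "down", which matches the three status codes listed above.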
Re: Any success stories for HAST + ZFS?
On Fri, 01 Apr 2011 11:40:11 +0100 Pete French wrote:

>> Yes, you may hit it only on hast devices creation. The workaround is to
>> avoid using 'hastctl role primary all', start providers one by one instead.

PF> Interesting to note that I just hit a lockup in hast (the discs froze
PF> up - could not run hastctl or zpool import, and could not kill
PF> them). I have two hast devices instead of one, but I am starting them
PF> individually instead of using 'all'. The code includes all the latest
PF> patches which have gone into STABLE over the last few days, none of which
PF> look particularly controversial!

PF> I haven't tried your patch yet, nor been able to reproduce the lockup, but
PF> thought you might be interested to know that I also had problems with
PF> multiple providers.

This looks like a different problem. If you see this again, please provide
the output of 'procstat -kka'.

--
Mikolaj Golub
Re: geli(4) memory leak
On Fri, 1 Apr 2011 19:43:54 +0200 Victor Balada Diaz wrote:

VBD> On Sat, Mar 26, 2011 at 01:33:48AM +0100, Victor Balada Diaz wrote:
>> Hello,
>>
>> I'm trying to setup a new geli disk and i'm seeing what looks like a
>> memory leak. After initializing the device i've tried to do the dd
>> command from /dev/random like this one:
>>
>> dd if=/dev/random of=/dev/da0p1.eli bs=1m
>>

VBD> Hello again,

VBD> I've found the cause of the memory leak and I attach a patch to fix it.
VBD> I hope the patch is good enough to get committed or at least helps
VBD> someone make a better patch and commit it. Patched file is
VBD> src/sys/geom/eli/g_eli.c

VBD> The problem happens when you're using data integrity verification and
VBD> you need to write more than MAXPHYS. If you look at
VBD> g_eli_integrity.c:314 you'll see that geli creates a second request to
VBD> write all that's needed.

VBD> Each of the requests gets the callback to g_eli_write_done once it is
VBD> done. The first request will get up to g_eli.c:209 and find that there
VBD> are still requests pending, so instead of calling g_io_deliver to notify
VBD> that the data is written, it just returns and waits until all requests
VBD> are done to say everything's OK. The problem is that once you return,
VBD> you're leaking this g_bio. You can see with vmstat -z how g_bio
VBD> increases and never releases memory.

VBD> I just destroy the current bio before returning and that prevents the
VBD> memory leak.

To me your patch looks correct. But the same issue exists for read :-).
Also, to avoid the leak I think we can just do g_destroy_bio() before the
"all sectors" check. See the attached patch (had some testing).

--
Mikolaj Golub

Index: sys/geom/eli/g_eli.c
===
--- sys/geom/eli/g_eli.c (revision 220168)
+++ sys/geom/eli/g_eli.c (working copy)
@@ -160,13 +160,13 @@ g_eli_read_done(struct bio *bp)
         pbp = bp->bio_parent;
         if (pbp->bio_error == 0)
                 pbp->bio_error = bp->bio_error;
+        g_destroy_bio(bp);
         /*
          * Do we have all sectors already?
          */
         pbp->bio_inbed++;
         if (pbp->bio_inbed < pbp->bio_children)
                 return;
-        g_destroy_bio(bp);
         sc = pbp->bio_to->geom->softc;
         if (pbp->bio_error != 0) {
                 G_ELI_LOGREQ(0, pbp, "%s() failed", __func__);
@@ -202,6 +202,7 @@ g_eli_write_done(struct bio *bp)
                 if (bp->bio_error != 0)
                         pbp->bio_error = bp->bio_error;
         }
+        g_destroy_bio(bp);
         /*
          * Do we have all sectors already?
          */
@@ -215,7 +216,6 @@ g_eli_write_done(struct bio *bp)
                     pbp->bio_error);
                 pbp->bio_completed = 0;
         }
-        g_destroy_bio(bp);
         /*
          * Write is finished, send it up.
          */
Re: geli(4) memory leak
On Sat, 2 Apr 2011 12:17:50 +0200 Pawel Jakub Dawidek wrote:

PJD> On Sat, Apr 02, 2011 at 12:04:09AM +0300, Mikolaj Golub wrote:
>> To me your patch looks correct. But the same issue exists for read :-).
>> Also, to avoid the leak I think we can just do g_destroy_bio() before
>> "all sectors" check. See the attached patch (had some testing).

PJD> The patch looks good. Please commit.

Committed, thanks.

--
Mikolaj Golub
Re: geli(4) memory leak
On Mon, 4 Apr 2011 01:51:24 +0200 Victor Balada Diaz wrote:

VBD> On Sun, Apr 03, 2011 at 08:43:45PM +0300, Mikolaj Golub wrote:
>>
>> On Sat, 2 Apr 2011 12:17:50 +0200 Pawel Jakub Dawidek wrote:
>>
>> PJD> On Sat, Apr 02, 2011 at 12:04:09AM +0300, Mikolaj Golub wrote:
>> >> To me your patch looks correct. But the same issue exists for read :-).
>> >> Also, to avoid the leak I think we can just do g_destroy_bio() before
>> >> "all sectors" check. See the attached patch (had some testing).
>>
>> PJD> The patch looks good. Please commit.
>>
>> Committed, thanks.

VBD> I've been out all the weekend, so I've been unable to answer before.
VBD> I'm glad it got committed and it's great you discovered and fixed the
VBD> same problem on the read path.

VBD> Are there any plans to MFC this?

Approximately after one week.

--
Mikolaj Golub
Re: Any success stories for HAST + ZFS?
On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:

FC> On Sat, Apr 2, 2011 at 1:44 AM, Pawel Jakub Dawidek wrote:
>>
>> I just committed a fix for a problem that might look like a deadlock.
>> With trociny@ patch and my last fix (to GEOM GATE and hastd) do you
>> still have any issues?

FC> Just to confirm, this is commit r220264, 220265, 220266 to -CURRENT?

Yes, r220264 and 220266. As stated in the commit log, the MFC is planned
after 1 week.

--
Mikolaj Golub
Re: Any success stories for HAST + ZFS?
On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:

FC> Once the deadlock patches above are MFC'd to -STABLE, I can do an
FC> upgrade cycle and test them.

Committed to STABLE.

--
Mikolaj Golub
Re: geli(4) memory leak
On Sun, 03 Apr 2011 20:43:45 +0300 Mikolaj Golub wrote to Pawel Jakub Dawidek:

MG> On Sat, 2 Apr 2011 12:17:50 +0200 Pawel Jakub Dawidek wrote:

PJD>> On Sat, Apr 02, 2011 at 12:04:09AM +0300, Mikolaj Golub wrote:
>>> To me your patch looks correct. But the same issue exists for read :-).
>>> Also, to avoid the leak I think we can just do g_destroy_bio() before
>>> "all sectors" check. See the attached patch (had some testing).

PJD>> The patch looks good. Please commit.

MG> Committed, thanks.

In STABLE too.

--
Mikolaj Golub
Re: Any success stories for HAST + ZFS?
On Mon, 11 Apr 2011 11:26:15 -0700 Freddie Cash wrote:

FC> On Sun, Apr 10, 2011 at 12:36 PM, Mikolaj Golub wrote:
>> On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:
>> FC> Once the deadlock patches above are MFC'd to -STABLE, I can do an
>> FC> upgrade cycle and test them.
>>
>> Committed to STABLE.

FC> Updated src tree to r220537. Recompiled world, kernel, etc.
FC> Installed world, kernel, etc. ZFSv28 patch was not affected.

FC> Everything is detected correctly, everything comes up correctly. See
FC> a new option (reload) in the RC script for hast.

FC> Can create/change role for 24 hast devices simultaneously.

FC> Can switch between master/slave modes.

FC> Have 5 rsyncs running in parallel without any issues, transferring
FC> 80-120 Mbps over the network (just under 100 Mbps seems to be the
FC> average right now).

FC> Switching roles while the rsyncs are running succeeds without
FC> deadlocking (obviously, rsync complains a whole bunch while the switch
FC> happens as the pool disappears out from underneath it, but it picks up
FC> again when the pool is back in place).

FC> Hitting the reset switch on the box while the rsyncs are running
FC> doesn't affect the hast devices or the pool, beyond losing the last 5
FC> seconds of writes.

FC> It's only been a couple of hours of testing and hammering, but so far
FC> things are much more stable/performant than before.

Cool! Thanks for reporting!

FC> Anything else I should test?

Nothing in particular, but any tests and reports are appreciated. E.g. among
the recent features Pawel has added are checksum and compression. You could
try different options and compare :-)

--
Mikolaj Golub
Re: buildworld FAIL.
On Sat, 23 Apr 2011 09:38:39 -0500 Matthew D. Fuller wrote:

MDF> On Sat, Apr 23, 2011 at 05:52:47AM -0700 I heard the voice of
MDF> Jeremy Chadwick, and lo! it spake thus:
>> On Sat, Apr 23, 2011 at 09:04:42AM +0200, Pawel Tyll wrote:
>> > So was NO_OPENSSL deprecated or something?
>>
>> I think he's implying that hast indirectly relies upon OpenSSL.

MDF> There's some conditionalization on MK_OPENSSL in the Makefile (and via
MDF> that, in the code), but it's incomplete. Whether that means it
MDF> _should_ be buildable without OpenSSL and is just insufficiently
MDF> tested, or whether it really just flat needs OpenSSL and the
MDF> conditionalization is vestigial, I don't know. pjd@ cc'd.

The attached patch should fix this.

--
Mikolaj Golub

Index: sbin/hastd/hast_proto.c
===
--- sbin/hastd/hast_proto.c (revision 221054)
+++ sbin/hastd/hast_proto.c (working copy)
@@ -69,7 +69,9 @@ struct hast_pipe_stage {
 
 static struct hast_pipe_stage pipeline[] = {
         { "compression", compression_send, compression_recv },
+#ifdef HAVE_CRYPTO
         { "checksum", checksum_send, checksum_recv }
+#endif
 };
 
 /*
Re: buildworld FAIL.
On Tue, 26 Apr 2011 18:25:09 +0200 Pawel Jakub Dawidek wrote:

PJD> On Tue, Apr 26, 2011 at 12:44:31PM +0300, Mikolaj Golub wrote:
>>
>> On Sat, 23 Apr 2011 09:38:39 -0500 Matthew D. Fuller wrote:
>>
>> MDF> On Sat, Apr 23, 2011 at 05:52:47AM -0700 I heard the voice of
>> MDF> Jeremy Chadwick, and lo! it spake thus:
>> >> On Sat, Apr 23, 2011 at 09:04:42AM +0200, Pawel Tyll wrote:
>> >> > So was NO_OPENSSL deprecated or something?
>> >>
>> >> I think he's implying that hast indirectly relies upon OpenSSL.
>>
>> MDF> There's some conditionalization on MK_OPENSSL in the Makefile (and via
>> MDF> that, in the code), but it's incomplete. Whether that means it
>> MDF> _should_ be buildable without OpenSSL and is just insufficiently
>> MDF> tested, or whether it really just flat needs OpenSSL and the
>> MDF> conditionalization is vestigial, I don't know. pjd@ cc'd.
>>
>> The attached patch should fix this.

PJD> The patch looks good. Please commit.

Thanks. Committed.

--
Mikolaj Golub
Re: way for failover zpool (no HAST needed)
On Wed, 27 Apr 2011 14:05:11 +0200 Denny Schierz wrote:

DS> hi,

DS> On Tuesday, 29.03.2011, at 23:36 +0300, Mikolaj Golub wrote:
>>
>> 2) There are complaints from watchdog.

DS> what happens, if the watchdog isn't available and one or both nodes are
DS> rebooting or something else?

Without receiving complaints the secondary won't switch to primary. This is
done intentionally, so that the node does not make the decision on its own.
But you can have several watchdogs if this really worries you.

DS> the other thing what could happen: the connection between the host and
DS> the SAS switch is dead.

DS> carp, ifstate and hastmon looking for the reachable IP, but not, if the
DS> local storage is available. So I have a closer look to devd and zfs and

hastmon isn't just checking whether the IP is reachable. The watchdog
connects to a cluster node and asks for its status. So the result depends on
how smart the script used to get the status is.

DS> shutdown in case of problems the carp interface / or whole machine, to
DS> force a switch.

DS> cu denny

--
Mikolaj Golub
Re: way for failover zpool (no HAST needed): hastmon
Oops, just noticed this mail :-) Denny sent me another message privately and
I hope I answered his questions, but I will answer this message too, in case
someone is interested.

On Thu, 28 Apr 2011 15:22:22 +0200 Denny Schierz wrote:

DS> hi,

DS> ok, here we go: I've installed hastmon on both FreeBSD nodes and one on
DS> Linux Debian as watchdog:

DS> Simple setup:
DS>
DS> # cat /etc.local/hastmon.conf

DS> resource sanip {
DS>         exec /usr/local/_rbg/bin/san-ip
DS>         friends iscsihead-m iscsihead-s nos

DS>         on iscsihead-m {
DS>                 remote tcp4://iscsihead-s
DS>                 priority 0
DS>         }
DS>         on iscsihead-s {
DS>                 remote tcp4://iscsihead-m
DS>                 priority 1
DS>         }
DS>         on linux {
DS>                 remote tcp4://iscsihead-m tcp4://iscsihead-s
DS>         }
DS> }

DS> It works only half.

DS> The simple script adds/remove an alias for the em0 and for status it
DS> does a ping -c 1 to the global ip. After tell every host, what is role
DS> is, I get on the primary "state unknown", in the secondary "state run"
DS> and watchdog for the Linux host.

It is difficult to tell what happened without additional information. It
might be that your '/usr/local/_rbg/bin/san-ip status' was returning the
unknown status. In this case, manually running

/usr/local/_rbg/bin/san-ip status; echo $?

might be helpful. And logs too :-).

DS> Than I rebooted the primary, the secondary take over and executed the
DS> script. After the primary was reachable again, he doesn't get the
DS> secondary role, but init/unknown.

DS> The same happens, in the opposite:

DS> from Linux:

DS> hastmonctl status
DS> sanip:
DS>   role: watchdog
DS>   exec: /usr/local/_rbg/bin/san-ip
DS>   remote:
DS>     tcp4://iscsihead-m (primary/run)
DS>     tcp4://iscsihead-s (init/unknown)
DS>   state: run
DS>   attempts: 0 from 5
DS>   complaints: 0 for last 60 sec (threshold 3)
DS>   heartbeat: 10 sec

DS> from iscsihead-s:

DS> hastmonctl status
DS> sanip:
DS>   role: init
DS>   exec: /usr/local/_rbg/bin/san-ip
DS>   remote:
DS>     tcp4://iscsihead-m
DS>   state: unknown
DS>   attempts: 0 from 5
DS>   complaints: 0 for last 60 sec (threshold 3)
DS>   heartbeat: 10 sec

DS> and last from iscsihead-m

DS> hastmonctl status
DS> sanip:
DS>   role: primary
DS>   exec: /usr/local/_rbg/bin/san-ip
DS>   remote:
DS>     tcp4://iscsihead-s (disconnected)
DS>   state: run
DS>   attempts: 0 from 5
DS>   complaints: 0 for last 60 sec (threshold 3)
DS>   heartbeat: 10 sec

DS> If I take a look into the logfile from the iscsihead-m:

DS> [sanip] (primary) Remote node acts as init for the resource and not as
DS> secondary.
DS> [sanip] (primary) Handshake header from tcp4://iscsihead-s has no
DS> 'token' field.

DS> Do I have missed something?

DS> cu denny

This is expected behavior. After start hastmon is in the init role. You need
to set the role you want manually or via a startup script. This is because
you might want different configurations depending on your requirements:

1) After start the role is set manually by the administrator (useful e.g. if
you prefer to investigate a crashed host before returning it to the cluster).

2) After start the node is switched to secondary automatically (by an rc
script). If all cluster nodes are configured to be secondary on startup and
all are started simultaneously, the watchdog will figure out that there is
no primary and will send complaints to all secondary nodes. The nodes will
try to switch to primary simultaneously and the node with the highest
priority will win.

3) The node that is configured with the highest priority is always set to
primary on startup. All others are set to secondary. With this
configuration, if the primary fails, the secondary switches to primary; then
when the initial primary comes back it becomes primary again automatically.

--
Mikolaj Golub
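The manual check suggested above (run the exec script with 'status' and look
at $?) can be wrapped in a small helper that prints the meaning of the exit
code. This is a sketch only: the function names status_name and check_status
are made up for the example, and the mapping assumes the script follows the
0 - UP, 1 - DOWN, 2 - UNKNOWN convention that hastmon expects.

```shell
#!/bin/sh
# Sketch: run an exec script's 'status' action by hand and decode the exit
# code (0 - UP, 1 - DOWN, 2 - UNKNOWN).

status_name() {
    case "$1" in
    0) echo UP ;;
    1) echo DOWN ;;
    2) echo UNKNOWN ;;
    *) echo "unexpected exit code $1" ;;
    esac
}

check_status() {
    # Arguments: the exec script (and any interpreter in front of it),
    # e.g. check_status /usr/local/_rbg/bin/san-ip
    rc=0
    "$@" status || rc=$?
    status_name "$rc"
}
```

An "unexpected exit code" line would suggest the script returns something
outside the three values hastmon understands, which is worth ruling out
before looking at the hastmon logs.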
Re: HAST instability
On Mon, 30 May 2011 17:43:04 +0300 Daniel Kalchev wrote:

DK> Some further investigation:

DK> The HAST nodes do not disconnect when checksum is enabled (either
DK> crc32 or sha256).

DK> One strange thing is that there is never established TCP connection
DK> between both nodes:

DK> tcp4  0      0  10.2.101.11.48939  10.2.101.12.8457  FIN_WAIT_2
DK> tcp4  0   1288  10.2.101.11.57008  10.2.101.12.8457  CLOSE_WAIT
DK> tcp4  0      0  10.2.101.11.46346  10.2.101.12.8457  FIN_WAIT_2
DK> tcp4  0  90648  10.2.101.11.13916  10.2.101.12.8457  CLOSE_WAIT
DK> tcp4  0      0  10.2.101.11.8457   *.*               LISTEN

It is normal. hastd uses each connection in one direction only, so it calls
shutdown to close the unused direction.

DK> When using sha256 one CPU core is 100% utilized by each hastd process,
DK> while 70-80MB/sec per HAST resource is being transferred (total of up
DK> to 140 MB/sec traffic for both);

DK> When using crc32 each CPU core is at 22% utilization;

DK> When using none as checksum, CPU usage is under 10%

I suppose that when checksum is enabled the bottleneck is the CPU, so the
traffic rate is lower and the problem is not triggered.

DK> Eventually after many hours, got corrupted communication:

DK> May 30 17:32:35 b1b hastd[9827]: [data0] (secondary) Hash mismatch.

The "Hash mismatch" message suggests that you actually were using checksum
then, weren't you?

DK> May 30 17:32:35 b1b hastd[9827]: [data0] (secondary) Unable to receive
DK> request data: No such file or directory.
DK> May 30 17:32:38 b1b hastd[9397]: [data0] (secondary) Worker process
DK> exited ungracefully (pid=9827, exitcode=75).

DK> and

DK> May 30 17:32:27 b1a hastd[1837]: [data0] (primary) Unable to receive
DK> reply header: Operation timed out.
DK> May 30 17:32:30 b1a hastd[1837]: [data0] (primary) Disconnected from
DK> 10.2.101.12.
DK> May 30 17:32:30 b1a hastd[1837]: [data0] (primary) Unable to send
DK> request (Broken pipe): WRITE(99128470016, 131072).

It looks a little different than in your first message.

Do you have the clocks in sync on both nodes? I would like to look at full
logs for some rather large period, with several cases, from both primary and
secondary (and be sure the time is synchronized).

Also, it might be worth checking that there is no network packet corruption
(some strange things in netstat -di, netstat -s, maybe copying large files
over the network and comparing checksums).

--
Mikolaj Golub
Re: HAST instability
On Mon, 30 May 2011 17:43:04 +0300 Daniel Kalchev wrote:

DK> tcp4  0      0  10.2.101.11.48939  10.2.101.12.8457  FIN_WAIT_2
DK> tcp4  0   1288  10.2.101.11.57008  10.2.101.12.8457  CLOSE_WAIT
DK> tcp4  0      0  10.2.101.11.46346  10.2.101.12.8457  FIN_WAIT_2
DK> tcp4  0  90648  10.2.101.11.13916  10.2.101.12.8457  CLOSE_WAIT
DK> tcp4  0      0  10.2.101.11.8457   *.*               LISTEN

Also, it might be useful to see if you normally have full receive buffers
like above or only when the issue is observed, running netstat in a loop,
something like below:

while sleep 5; do
    t=`date '+%F %H:%M:%S'`
    netstat -na | grep 8457 | while read l; do
        echo "$t $l"
    done
done > /tmp/netstat.log

--
Mikolaj Golub
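Once /tmp/netstat.log has been collected, the samples with data sitting in a
socket queue can be picked out with a short awk filter. This is a sketch
under the assumption that the log has exactly the layout produced by the
loop above (date, time, then the usual netstat columns: protocol, Recv-Q,
Send-Q, local address, foreign address, state); the function name
nonempty_queues is made up for the example.

```shell
#!/bin/sh
# Sketch: print only the logged samples where Recv-Q or Send-Q is non-zero.
# Fields: $1 date, $2 time, $3 proto, $4 Recv-Q, $5 Send-Q,
#         $6 local address, $7 foreign address, $8 state.
nonempty_queues() {
    awk '$4 > 0 || $5 > 0 { print }' "$@"
}
```

For example, `nonempty_queues /tmp/netstat.log` would show at which times the
queues were non-empty, which can then be compared with the timestamps of the
hastd errors in the logs.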
Re: HAST instability
On Tue, 31 May 2011 15:51:07 +0300 Daniel Kalchev wrote:

DK> On 30.05.11 21:42, Mikolaj Golub wrote:
>> DK> One strange thing is that there is never established TCP connection
>> DK> between both nodes:
>>
>> DK> tcp4  0      0  10.2.101.11.48939  10.2.101.12.8457  FIN_WAIT_2
>> DK> tcp4  0   1288  10.2.101.11.57008  10.2.101.12.8457  CLOSE_WAIT
>> DK> tcp4  0      0  10.2.101.11.46346  10.2.101.12.8457  FIN_WAIT_2
>> DK> tcp4  0  90648  10.2.101.11.13916  10.2.101.12.8457  CLOSE_WAIT
>> DK> tcp4  0      0  10.2.101.11.8457   *.*               LISTEN
>>
>> It is normal. hastd uses the connections only in one direction so it calls
>> shutdown to close unused directions.

DK> So the TCP connections are all too short-lived that I can never see a
DK> single one in ESTABLISHED state? 10Gbit Ethernet is indeed fast, so
DK> this might well be possible...

No, the connections are persistent; only one (unused) direction of
communication is closed. See shutdown(2) for further info.

>> I would like to look at full logs for some rather large period, with
>> several cases, from both primary and secondary (and be sure about
>> synchronized time).

DK> I have made sure clocks are synchronized and am currently running on
DK> freshly rebooted nodes (with two additional SATA drives at each node) --
DK> so far some interesting findings, like I get hash errors and
DK> disconnects much more frequent now. Will post when a bonnie++ run on
DK> the ZFS filesystem on top of the HAST resources finishes.

As I wrote privately, it would be nice to see both netstat and hast logs
(from both nodes) for the same rather long period, during which several
cases occurred.

It would be good to place them somewhere on the web so that other people
can access them too, as I will be offline for 7-10 days and will not be able
to help you until I am back.

DK> One additional note: while playing with this setup, I tried to
DK> simulate local disk going away in the hope HAST will switch to using
DK> the remote disk. Instead of asking someone at the site to pull out the
DK> drive, I just issued on the primary

DK> hastctl role init data0

DK> which resulted in kernel panic. Unfortunately, there was no sufficient
DK> dump space for 48GB. I will re-run this again with more drives for the
DK> crash dump. Anything you want me to look for in particular? (kernels
DK> have no KDB compiled in yet)

Well, removing a physical disk (the device /dev/gpt/data0 consumed by hastd
disappears) and switching a resource to the init role (the device
/dev/hast/data0 consumed by the FS disappears) are two different things. You
should not normally change the resource role (destroy the hast device)
before unmounting (exporting) the FS.

--
Mikolaj Golub
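The ordering described above (release the filesystem first, change the HAST
role second) can be sketched as a small shell helper. The pool name 'tank',
the function name release_resource and the ZPOOL/HASTCTL override variables
are assumptions made for the example; the two commands themselves are the
ordinary 'zpool export' and 'hastctl role init'.

```shell
#!/bin/sh
# Sketch: export the ZFS pool before switching its HAST resources to init,
# so that /dev/hast/<name> is not destroyed underneath a live filesystem.
# ZPOOL and HASTCTL are hypothetical overrides for dry runs (e.g. 'echo').
: "${ZPOOL:=zpool}"
: "${HASTCTL:=hastctl}"

release_resource() {
    pool=$1
    shift
    # Step 1: stop using the hast devices. Give up if the export fails.
    $ZPOOL export "$pool" || return 1
    # Step 2: only now leave the primary role for the listed resources.
    $HASTCTL role init "$@"
}
```

Usage would be something like `release_resource tank data0 data1`. Stopping
when the export fails is deliberate: if the pool is still in use, the role
change is exactly the step that should not happen.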
Re: hast syncronization speed issue
On Thu, 2 Jun 2011 11:47:26 +0300 Yurius Radomskyi wrote:

YR> Hi,

YR> I have a HAST device set up between two systems. I experience very low
YR> speed with dirty blocks synchronization after split-brain condition
YR> been recovered: it's 200KB/s average on 1Gbit link. On the other side,
YR> when I copy a big file to the zfs partition that is created on top of
YR> the hast device the synchronization speed between the hosts is 50MB/s
YR> (which is not too high for 1Gbit link, but acceptable.)

Could you please try the patch (the kernel needs rebuilding)?

http://people.freebsd.org/~trociny/uipc_socket.c.patch

The patch was committed to current (r222454) and is going to be MFCed after
some time.

--
Mikolaj Golub
Re: Unusable hastd in FreeBSD 8.2
Hi,

On Mon, 6 Jun 2011 16:46:55 +0200 Victor Balada Diaz wrote:

VBD> Hello,

VBD> Hastd in its current form is not usable on FreeBSD 8.2-RELEASE or in
VBD> 8-STABLE. You can see why in this thread:
VBD> http://lists.freebsd.org/pipermail/freebsd-fs/2011-February/010752.html

VBD> You can see the committed fix in:
VBD> http://svnweb.freebsd.org/base?view=revision&revision=219721

VBD> But it's never been MFCd. Is it possible to MFC it to 8-STABLE and maybe
VBD> do an errata notice for RELENG_8_2?

Actually, it was MFCed, in r220151.

Also, I don't think this is an issue that makes hastd unusable in FreeBSD
8.2 :-).

The issue is the following. Before switching the node to primary, the
failover (third-party) script checks whether the secondary process is still
alive (assuming that in this case the primary on the other node is still
alive too) and fails if it is -- some protection against split brain. But
before r219721 the secondary might not die automatically when the primary
host went down.

This can be worked around, e.g. by removing the check in the script :-), or
by setting net.inet.tcp.keepidle to some small value (e.g. 10 seconds), which
should make the secondary notice that the other end is dead after this
interval.

--
Mikolaj Golub
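As a rough illustration of the second workaround: on FreeBSD the
net.inet.tcp.keepidle sysctl is, as far as I know, expressed in
milliseconds, so "10 seconds" becomes 10000. Treat the unit as an assumption
and check tcp(4) on the release in use; the helper name keepidle_sysctl is
made up for the example.

```shell
#!/bin/sh
# Sketch: build the sysctl assignment for a keepalive idle time given in
# seconds. ASSUMPTION: net.inet.tcp.keepidle is in milliseconds.
keepidle_sysctl() {
    echo "net.inet.tcp.keepidle=$(( $1 * 1000 ))"
}
```

It could be applied with `sysctl "$(keepidle_sysctl 10)"`, and the printed
line put into /etc/sysctl.conf to make it persistent. The setting is
system-wide, so it affects every TCP connection that uses keepalives, not
only hastd.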
Re: HAST instability
On Fri, 03 Jun 2011 19:18:29 +0300 Daniel Kalchev wrote:

DK> Well, apparently my HAST joy was short. On a second run, I got stuck with

DK> Jun 3 19:08:16 b1a hastd[1900]: [data2] (primary) Unable to receive
DK> reply header: Operation timed out.

DK> on the primary. No messages on the secondary.

DK> On primary:

DK> # netstat -an | grep 8457
DK> tcp4       0      0 10.2.101.11.42659  10.2.101.12.8457   FIN_WAIT_2
DK> tcp4       0      0 10.2.101.11.62058  10.2.101.12.8457   CLOSE_WAIT
DK> tcp4       0      0 10.2.101.11.34646  10.2.101.12.8457   FIN_WAIT_2
DK> tcp4       0      0 10.2.101.11.11419  10.2.101.12.8457   CLOSE_WAIT
DK> tcp4       0      0 10.2.101.11.37773  10.2.101.12.8457   FIN_WAIT_2
DK> tcp4       0      0 10.2.101.11.21911  10.2.101.12.8457   FIN_WAIT_2
DK> tcp4       0      0 10.2.101.11.40169  10.2.101.12.8457   CLOSE_WAIT
DK> tcp4       0  97749 10.2.101.11.44360  10.2.101.12.8457   CLOSE_WAIT
DK> tcp4       0      0 10.2.101.11.8457   *.*                LISTEN

DK> on secondary

DK> # netstat -an | grep 8457
DK> tcp4       0      0 10.2.101.12.8457   10.2.101.11.42659  CLOSE_WAIT
DK> tcp4       0      0 10.2.101.12.8457   10.2.101.11.62058  FIN_WAIT_2
DK> tcp4       0      0 10.2.101.12.8457   10.2.101.11.34646  CLOSE_WAIT
DK> tcp4       0      0 10.2.101.12.8457   10.2.101.11.11419  FIN_WAIT_2
DK> tcp4       0      0 10.2.101.12.8457   10.2.101.11.37773  CLOSE_WAIT
DK> tcp4       0      0 10.2.101.12.8457   10.2.101.11.21911  CLOSE_WAIT
DK> tcp4       0      0 10.2.101.12.8457   10.2.101.11.40169  FIN_WAIT_2
DK> tcp4   66415      0 10.2.101.12.8457   10.2.101.11.44360  FIN_WAIT_2
DK> tcp4       0      0 10.2.101.12.8457   *.*                LISTEN

DK> on primary

DK> # hastctl status
DK> data0:
DK>   role: primary
DK>   provname: data0
DK>   localpath: /dev/gpt/data0
DK>   extentsize: 2097152 (2.0MB)
DK>   keepdirty: 64
DK>   remoteaddr: 10.2.101.12
DK>   sourceaddr: 10.2.101.11
DK>   replication: fullsync
DK>   status: complete
DK>   dirty: 0 (0B)
DK> data1:
DK>   role: primary
DK>   provname: data1
DK>   localpath: /dev/gpt/data1
DK>   extentsize: 2097152 (2.0MB)
DK>   keepdirty: 64
DK>   remoteaddr: 10.2.101.12
DK>   sourceaddr: 10.2.101.11
DK>   replication: fullsync
DK>   status: complete
DK>   dirty: 0 (0B)
DK> data2:
DK>   role: primary
DK>   provname: data2
DK>   localpath: /dev/gpt/data2
DK>   extentsize: 2097152 (2.0MB)
DK>   keepdirty: 64
DK>   remoteaddr: 10.2.101.12
DK>   sourceaddr: 10.2.101.11
DK>   replication: fullsync
DK>   status: complete
DK>   dirty: 6291456 (6.0MB)
DK> data3:
DK>   role: primary
DK>   provname: data3
DK>   localpath: /dev/gpt/data3
DK>   extentsize: 2097152 (2.0MB)
DK>   keepdirty: 64
DK>   remoteaddr: 10.2.101.12
DK>   sourceaddr: 10.2.101.11
DK>   replication: fullsync
DK>   status: complete
DK>   dirty: 0 (0B)

DK> Sits in this state for over 10 minutes.

DK> Unfortunately, no KDB in kernel. Any ideas what other to look for?

Could you please try this patch?

http://people.freebsd.org/~trociny/hastd.no_shutdown.patch

After patching you need to rebuild hastd and restart it (I expect doing
this only on the secondary is enough, but it is better to do it on both
nodes). No server restart is needed.

--
Mikolaj Golub
Re: HAST instability
On Fri, 10 Jun 2011 20:05:43 +0300 Mikolaj Golub wrote to Daniel Kalchev:

MG> Could you please try this patch?

MG> http://people.freebsd.org/~trociny/hastd.no_shutdown.patch

Sure you still have to have your kernel patched with uipc_socket.c.patch :-)

--
Mikolaj Golub
Re: HAST instability
On Tue, 14 Jun 2011 16:39:11 +0300 Daniel Kalchev wrote:

DK> On 10.06.11 20:07, Mikolaj Golub wrote:
>> On Fri, 10 Jun 2011 20:05:43 +0300 Mikolaj Golub wrote to Daniel Kalchev:
>>
>> MG> Could you please try this patch?
>>
>> MG> http://people.freebsd.org/~trociny/hastd.no_shutdown.patch
>>
>> Sure you still have to have your kernel patched with uipc_socket.c.patch :-)
>>

DK> It is now running for about a day with both patches applied, without
DK> disconnects.

DK> Also, now TCP/IP connections always stay in ESTABLISHED state. As I
DK> believe they should. Primary to secondary drain quickly on switching
DK> from init to primary etc. No troubles without checksums as
DK> well. Kernel is as of

Thanks!

It has turned out that automatic receive buffer sizing works only for
connections in ESTABLISHED state. With a small receive buffer the
connection might get stuck sending data only via TCP window probes -- one
byte every few seconds (see "Scenario to make recv(MSG_WAITALL) stuck" in
net@ for details).

hastd.no_shutdown.patch disables closing of unused directions, so the
connections remain in ESTABLISHED state and automatic receive buffer sizing
works again.

uipc_socket.c.patch has been committed to CURRENT and I am going to MFC it
soon.

DK> FreeBSD b1a 8.2-STABLE FreeBSD 8.2-STABLE #1: Mon Jun 13 11:32:38 EEST
DK> 2011 root@b1a:/usr/obj/usr/src/sys/GENERIC amd64

DK> Daniel

--
Mikolaj Golub
Re: HAST + ZFS: no action on drive failure
On Thu, 30 Jun 2011 20:02:19 -0700 Timothy Smith wrote:

TS> First posting here, hopefully I'm doing it right =)

TS> I also posted this to the FreeBSD forum, but I know some hast folks monitor
TS> this list regularly and not so much there, so...

TS> Basically, I'm testing failure scenarios with HAST/ZFS. I got two nodes,
TS> scripted up a bunch of checks and failover actions between the nodes.
TS> Looking good so far, though more complex than I expected. It would be cool
TS> to post it somewhere to get some pointers/critiques, but that's another
TS> thing.

TS> Anyway, now I'm just seeing what happens when a drive fails on primary node.
TS> Oddly/sadly, NOTHING!

TS> Hast just keeps on a ticking, and doesn't change the state of the failed
TS> drive, so the zpool has no clue the drive is offline. The
TS> /dev/hast/ remains. The hastd does log some errors to the system
TS> log like this, but nothing more.

TS> messages.0:Jun 30 18:39:59 nas1 hastd[11066]: [ada6] (primary) Unable to
TS> flush activemap to disk: Device not configured.
TS> messages.0:Jun 30 18:39:59 nas1 hastd[11066]: [ada6] (primary) Local request
TS> failed (Device not configured): WRITE(4736512, 512).

Although the request to the local drive failed, it succeeded on the remote
node, so no data was lost. The request was therefore considered successful
and no error was returned to ZFS.

TS> So, I guess the question is, "Do I have to script a cronjob to check for
TS> these kinds of errors and then change the hast resource to 'init' or
TS> something to handle this?" Or is there some kind of hastd config setting
TS> that I need to set? What's the SOP for this?

Currently the only way to know is monitoring logs. It is not difficult to
hook an event for these errors in the HAST code (like it is done for
connect/disconnect, syncstart/done etc) so one could script what to do when
an error occurs, but I am not sure it is a good idea -- the errors may be
generated at a high rate.

TS> As something related too, when the zpool in FreeBSD does finally notice that
TS> the drive is missing because I have manually changed the hast resource to
TS> INIT (so the /dev/hast/ is gone), my zpool (raidz2) hot spare doesn't
TS> engage, even with "autoreplace=on". The zpool status of the degraded pool
TS> seems to indicate that I should manually replace the failed drive. If that's
TS> the case, it's not really a "hot spare". Does this mean the "FMA Agent"
TS> referred to in the ZFS manual is not implemented in FreeBSD?

TS> thanks!

--
Mikolaj Golub
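A minimal sketch of the log-monitoring approach mentioned in the reply above, assuming the syslog line format shown in the quoted messages. The function name hast_local_failure() is hypothetical and not part of HAST; it only recognizes "Local request failed" lines from hastd and extracts the resource name from the brackets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of HAST): return 1 if `line` is a hastd
 * "Local request failed" message and copy the resource name found in
 * "hastd[pid]: [resource]" into `res`; return 0 otherwise.
 */
int
hast_local_failure(const char *line, char *res, size_t reslen)
{
	const char *p, *end;
	size_t len;

	assert(line != NULL && res != NULL && reslen > 0);
	res[0] = '\0';
	if ((p = strstr(line, "hastd[")) == NULL)
		return (0);
	if (strstr(p, "Local request failed") == NULL)
		return (0);
	/* The resource name follows the "hastd[pid]: " prefix in brackets. */
	if ((p = strstr(p, "]: [")) == NULL)
		return (0);
	p += 4;
	if ((end = strchr(p, ']')) == NULL)
		return (0);
	len = (size_t)(end - p);
	if (len >= reslen)
		len = reslen - 1;	/* truncate rather than overflow */
	memcpy(res, p, len);
	res[len] = '\0';
	return (1);
}
```

Since the errors may arrive at a high rate, a caller would probably want to act once per resource rather than once per matching line.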
Re: HAST + ZFS: no action on drive failure
On Sat, 2 Jul 2011 14:43:15 -0700 Timothy Smith wrote:

TS> Hello Mikolaj,

TS> So, just to be clear, if a local drive fails in my pool, but the
TS> corresponding remote drive remains available, then hastd will both write to
TS> and read from the remote drive? That's really very cool!

Yes.

TS> I looked more closely at the hastd(8) man page. There is some indication of
TS> what you say, but not so clear:

TS> "Read operations (BIO_READ) are handled locally unless I/O error occurs or local
TS> version of the data is not up-to-date yet (synchronization is in progress)."

This is about READ operations, and for WRITE we have just above:

    Every write, delete and flush operation (BIO_WRITE, BIO_DELETE,
    BIO_FLUSH) is send to local component and synchronously replicated to
    the remote (secondary) node if it is available.

There might be things that should be improved in the documentation but I
don't feel capable of doing this :-)

TS> Perhaps this can be modified a bit? Adding, "or the local disk is
TS> unavailable. In such a case, the I/O operation will be handled by the remote
TS> resource."

TS> It does make sense however, since HAST is based on the idea of raid. This
TS> feature increases the redundancy of the system greatly. My boss will be
TS> very impressed, as am I!

TS> I did notice however that when the pulled drive is reinserted, I need to
TS> change the associated hast resource to init, then back to primary to allow
TS> hastd to once again use it (perhaps the same if the secondary drive is
TS> failed?). Unless it will do this on its own after some time? I did not wait
TS> more than a few minutes. But this is easy enough to script or to monitor the
TS> log and present a notification to admin at such a time.

When you are reinserting the drive the resource should be in init state.
Remember, some data was updated on the secondary only, so the right
sequence of operations could be:

1) Failover (switch primary to init and secondary to primary).

2) Fix the disk issue.

3) If this is a new drive, recreate HAST metadata on it with the hastctl
   utility.

4) Switch the repaired resource to secondary and wait until the new primary
   connects to it and updates metadata. After this, synchronization is
   started.

5) You can switch back to the previous primary before the synchronization
   is complete -- it will continue in the right direction, but then you
   should expect performance degradation until the synchronization is
   complete, because the READ requests will go to the remote node. So it
   might be better to wait until the synchronization is complete before
   switching back.

--
Mikolaj Golub
Re: /usr/bin/script eating 100% cpu with portupgrade and xargs
On Sun, 18 Sep 2011 08:47:13 +0200 Ronald Klop wrote:

RK> On Sun, 18 Sep 2011 07:39:01 +0200, Jeremy Chadwick
RK> wrote:

>> On Sun, Sep 18, 2011 at 12:54:13AM -0400, Jason Hellenthal wrote:
>>> On Sun, Sep 18, 2011 at 01:49:15AM +0200, Ronald Klop wrote:
>>> > Hi,
>>> >
>>> > I'm running portupgrade in screen to update all the ports for
>>> > 9-BETA2/9-CURRENT on amd64. While doing this script eats 100% cpu.
>>> > Because portupgrade -fa crashed I'm running this command to update the
>>> > remaining non-updates ports.
>>> > find /var/db/pkg -name +DESC -mtime +2 |cut -d / -f 5 | xargs
>>> time nice -n
>>> > 20 portupgrade -f
>>> >
>>> > The output of truss -p `pgrep script` is this:
>>> > clock_gettime(13,{1316301104.0 }) = 0 (0x0)
>>> > select(5,{0 4},0x0,0x0,{30.00 }) = 1 (0x1)
>>> > read(0,0x7fffcdf0,1024) = 0 (0x0)
>>> > write(4,0x7fffcdf0,0) = 0 (0x0)
>>> > clock_gettime(13,{1316301104.0 }) = 0 (0x0)
>>> > select(5,{0 4},0x0,0x0,{30.00 }) = 1 (0x1)
>>> > read(0,0x7fffcdf0,1024) = 0 (0x0)
>>> > write(4,0x7fffcdf0,0) = 0 (0x0)
>>> > clock_gettime(13,{1316301104.0 }) = 0 (0x0)
>>> > select(5,{0 4},0x0,0x0,{30.00 }) = 1 (0x1)
>>> > read(0,0x7fffcdf0,1024) = 0 (0x0)
>>> > write(4,0x7fffcdf0,0) = 0 (0x0)
>>> > clock_gettime(13,{1316301104.0 }) = 0 (0x0)
>>> > select(5,{0 4},0x0,0x0,{30.00 }) = 1 (0x1)
>>> > read(0,0x7fffcdf0,1024) = 0 (0x0)
>>> > write(4,0x7fffcdf0,0) = 0 (0x0)
>>> >
>>> > So it is really fast in reading and writing 0 bytes most of the time.
>>> >
>>> > I also found
>>> http://web.archiveorange.com/archive/v/6ETvLvjo60Gj9geAUAb6
>>> > and I think I am better of by rewriting my command so stdin/stdout is
>>> > still the terminal. Although the link is a couple of years old.
>>> >
>>> > Is this known? Can somebody explain me why my xargs command is
>>> not working
>>> > well?
>>> >
>>>
>>> Are you absolutely sure that its script(1) causing this ? 100% CPU usage
>>> has been a known side effect of screen(1) for quite some time. Rebuild
>>> it and try again.
>>
>> Jason's referring to this, I believe:
>> http://www.freebsd.org/cgi/cvsweb.cgi/ports/sysutils/screen/Makefile#rev1.55
>>
>> To clarify the what the commit message means: it does not mean "when the
>> package is installed the installation takes up 100% CPU". It means
>> "once the package is installed and screen is used, screen takes up 100%
>> CPU". I know because I've seen this behaviour in the past (one of the
>> many, many reasons I build ports from source).
>>
>> However:
>> http://www.freebsd.org/cgi/cvsweb.cgi/ports/sysutils/screen/Makefile#rev1.78
>>
>> So: If a binary package is being installed through your above
>> portupgrade command, and you're seeing this problem, then it sounds to
>> me like commit revision 1.78 is a regression and NO_PACKAGE should be
>> put back into place + packages removed from all mirrors.
>>
>> There are many reasons to not use GNU screen at all, or if you must have
>> something like it, use tmux. I recently had to provide an analysis of
>> how GNU screen destroys one's terminal[1]; so if the above problem turns
>> out to be caused by GNU screen as well, I'll just add it to my
>> ever-growing list of reasons the software should be nuked from orbit.
>>
>> Otherwise, if this turns out to be a problem with portupgrade (which you
>> found some evidence supporting such), then the solution is simple: stop
>> using portupgrade, use portmaster (if it lacks things you need ask Doug
>> Barton, he's incredibly receptive to adding new features/fixing things).
>> Two databases that aren't compatible, ruby shims, and other crap = not
>> worth it. Think the database ordeal is long over with/fixed/whatever?
>> It isn't[2].
>>
>> [1]:
>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-June/063052.html
>> [2]: &
Re: /usr/bin/script eating 100% cpu with portupgrade and xargs
On Sun, 18 Sep 2011 13:25:26 +0200 Ronald Klop wrote:

RK> It is a while since I programmed C, but why will writing 0 bytes give
RK> the reader an end-of-file? Shouldn't the fd be closed to indicate
RK> end-of-file?

AFAIR, this trick of writing 0 bytes is used to emulate EOF because we
can't close the fd -- we still want to read from it. A poor man's
shutdown(2) for a non-socket :-).

Colin might tell more...

--
Mikolaj Golub
Re: /usr/bin/script eating 100% cpu with portupgrade and xargs
On Sun, 18 Sep 2011 20:24:23 +0300 Kostik Belousov wrote:

KB> On Sun, Sep 18, 2011 at 02:54:34PM +0300, Mikolaj Golub wrote:
>>
>> On Sun, 18 Sep 2011 13:25:26 +0200 Ronald Klop wrote:
>>
>> RK> It is a while since I programmed C, but why will writing 0 bytes give
>> RK> the reader an end-of-file? Shouldn't the fd be closed to indicate
>> RK> end-of-file?
>>
>> AFAIR, this trick with writing 0 to emulate EOF because we can't close the
>> fd -- we still want to read from it. Poor shutdown(2) for non-socket :-).
>>
>> Colin might tell more...

KB> Please note that interpreting the receiving of 0 bytes on the terminal
KB> as EOF is only a convention. If done absolutely properly, script shall
KB> not interpret zero-byte read as EOF. Might be, the reasonable thing to
KB> do would be to only look at the stdin once in a second after receiving
KB> zero-bytes, and switching it back to normal mode if something is read.

Ok. I see. Below is the patch that does something like this.

--
Mikolaj Golub

Index: usr.bin/script/script.c
===================================================================
--- usr.bin/script/script.c	(revision 225653)
+++ usr.bin/script/script.c	(working copy)
@@ -53,6 +53,7 @@ static const char sccsid[] = "@(#)script.c 8.1 (Be
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -86,6 +87,7 @@ main(int argc, char *argv[])
 	char ibuf[BUFSIZ];
 	fd_set rfd;
 	int flushtime = 30;
+	bool readstdin;
 
 	aflg = kflg = 0;
 	while ((ch = getopt(argc, argv, "aqkt:")) != -1)
@@ -155,19 +157,20 @@ main(int argc, char *argv[])
 	doshell(argv);
 	close(slave);
 
-	if (flushtime > 0)
-		tvp = &tv;
-	else
-		tvp = NULL;
-
-	start = time(0);
-	FD_ZERO(&rfd);
+	start = tvec = time(0);
+	readstdin = true;
 	for (;;) {
+		FD_ZERO(&rfd);
 		FD_SET(master, &rfd);
-		FD_SET(STDIN_FILENO, &rfd);
-		if (flushtime > 0) {
-			tv.tv_sec = flushtime;
+		if (readstdin)
+			FD_SET(STDIN_FILENO, &rfd);
+		if (!readstdin || flushtime > 0) {
+			tv.tv_sec = !readstdin ? 1 : flushtime - (tvec - start);
 			tv.tv_usec = 0;
+			tvp = &tv;
+			readstdin = true;
+		} else {
+			tvp = NULL;
 		}
 		n = select(master + 1, &rfd, 0, 0, tvp);
 		if (n < 0 && errno != EINTR)
@@ -176,8 +179,10 @@ main(int argc, char *argv[])
 			cc = read(STDIN_FILENO, ibuf, BUFSIZ);
 			if (cc < 0)
 				break;
-			if (cc == 0)
+			if (cc == 0) {
 				(void)write(master, ibuf, 0);
+				readstdin = false;
+			}
 			if (cc > 0) {
 				(void)write(master, ibuf, cc);
 				if (kflg && tcgetattr(master, &stt) >= 0 &&
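A minimal sketch of why the loop spun and what the readstdin flag in the patch above changes. It is a simplified stand-alone illustration, not the script(1) code: count_wakeups() is a hypothetical helper, and unlike the patch it never rechecks the descriptor once EOF has been seen.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Poll `fd` for `rounds` iterations and return how many times select()
 * reported it readable. A descriptor at EOF stays readable forever, so a
 * loop that keeps it in the set wakes up on every pass. With
 * stop_after_eof != 0 the descriptor is dropped from the set once read()
 * has returned 0, which is the idea behind readstdin in the patch.
 */
int
count_wakeups(int fd, int rounds, int stop_after_eof)
{
	char buf[256];
	fd_set rfd;
	struct timeval tv;
	int eof = 0, wakeups = 0;

	assert(fd >= 0 && rounds >= 0);
	for (int i = 0; i < rounds; i++) {
		FD_ZERO(&rfd);
		if (!eof || !stop_after_eof)
			FD_SET(fd, &rfd);
		tv.tv_sec = 0;
		tv.tv_usec = 1000;	/* short timeout keeps the demo fast */
		if (select(fd + 1, &rfd, NULL, NULL, &tv) <= 0)
			continue;	/* timeout or error: nothing to read */
		if (FD_ISSET(fd, &rfd)) {
			wakeups++;
			if (read(fd, buf, sizeof(buf)) == 0)
				eof = 1;
		}
	}
	return (wakeups);
}
```

Called on the read end of a pipe whose write end is already closed, the unpatched behaviour (stop_after_eof set to 0) wakes up on every round, while the patched behaviour wakes up once.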
Re: /usr/bin/script eating 100% cpu with portupgrade and xargs
On Tue, 04 Oct 2011 18:34:07 +0200 Michiel Boland wrote:

MB> On 10/04/2011 13:15, Mikolaj Golub wrote:
>> On Sun, Sep 18, 2011 at 1:58 PM, Mikolaj Golub wrote:
MB> [...]
>>>
>>> I believe the behaviour is after this commit:
>>>
>>> http://svnweb.freebsd.org/base?view=revision&revision=125848
>>>
>>> I think we should skip select on STDIN after reading EOF from it, like in
>>> the patch below.
>>
>> For the record. The issue has been fixed in CURRENT and the fix has
>> been merged to STABLE.
>>
>> Thanks Kostik and Chris for their comments and suggestions.
>>

MB> Does this mean that bin/72501 can be closed?

Yes, thanks for pointing that out. Closed.

--
Mikolaj Golub
Re: /usr/bin/script eating 100% cpu with portupgrade and xargs
On Wed, 12 Oct 2011 23:25:35 +0100 Adrian Wontroba wrote:

AW> On Sat, Oct 08, 2011 at 01:27:07AM +0100, Adrian Wontroba wrote:
>> I won't be in a position to create a simpler test case, raise a PR or
>> try patches till Tuesday evening (UK) at the earliest.

AW> So far I have been unable to reproduce the problem with portupgrade (and
AW> will probably move to portmaster).

AW> I have however found a different but possibly related problem with the
AW> new version of script in RELENG_8, for which I have raised this PR:

AW> misc/161526: script outputs corrupt if input is not from a terminal

As Jilles wrote, ^D\b\b are echoed by the terminal when script sends VEOF
to the program being scripted.

In my recent commit r225809 the intention was to send VEOF only once if
STDIN was not a terminal. Unfortunately the fix was incorrect and for
flushtime > 0 it keeps sending VEOF. That is why you are observing a series
of ^D\b\b characters.

I am going to commit the attached patch to HEAD, which fixes this. But we
will still have one ^D\b\b in the output.

--
Mikolaj Golub

Index: usr.bin/script/script.c
===================================================================
--- usr.bin/script/script.c	(revision 226349)
+++ usr.bin/script/script.c	(working copy)
@@ -163,12 +163,15 @@ main(int argc, char *argv[])
 		FD_SET(master, &rfd);
 		if (readstdin)
 			FD_SET(STDIN_FILENO, &rfd);
-		if ((!readstdin && ttyflg) || flushtime > 0) {
-			tv.tv_sec = !readstdin && ttyflg ? 1 :
-			    flushtime - (tvec - start);
+		if (!readstdin && ttyflg) {
+			tv.tv_sec = 1;
 			tv.tv_usec = 0;
 			tvp = &tv;
 			readstdin = 1;
+		} else if (flushtime > 0) {
+			tv.tv_sec = flushtime - (tvec - start);
+			tv.tv_usec = 0;
+			tvp = &tv;
 		} else {
 			tvp = NULL;
 		}
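A small sketch of the difference between the two branch structures in the diff above, reduced to the one thing that matters for the repeated VEOF: whether stdin is put back into the select() set on the next pass. The function names are hypothetical, and the timeout computation is left out.

```c
#include <assert.h>

/*
 * readstdin: 0 once EOF has been read from stdin.
 * ttyflg:    non-zero if stdin is a terminal.
 * Both functions return the value readstdin has after the branch.
 */
int
rearm_stdin_old(int readstdin, int ttyflg, int flushtime)
{
	assert(flushtime >= 0);
	/* r225809 logic: one branch for both the EOF recheck and flushing. */
	if ((!readstdin && ttyflg) || flushtime > 0)
		readstdin = 1;
	return (readstdin);
}

int
rearm_stdin_new(int readstdin, int ttyflg, int flushtime)
{
	assert(flushtime >= 0);
	/* Patched logic: only a terminal is rechecked after EOF. */
	if (!readstdin && ttyflg)
		readstdin = 1;
	/* The flushtime > 0 branch sets a timeout but leaves readstdin alone. */
	return (readstdin);
}
```

With stdin not a terminal, EOF already seen and the default flushtime of 30, the old logic re-arms stdin (so the zero-byte read and the VEOF write repeat), while the new logic leaves it disarmed.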
Re: /usr/bin/script eating 100% cpu with portupgrade and xargs
On Fri, 14 Oct 2011 14:03:37 +0200 Jilles Tjoelker wrote:

JT> On Wed, Oct 12, 2011 at 11:25:35PM +0100, Adrian Wontroba wrote:
>> On Sat, Oct 08, 2011 at 01:27:07AM +0100, Adrian Wontroba wrote:
>> > I won't be in a position to create a simpler test case, raise a PR or
>> > try patches till Tuesday evening (UK) at the earliest.

>> So far I have been unable to reproduce the problem with portupgrade (and
>> will probably move to portmaster).

>> I have however found a different but possibly related problem with the
>> new version of script in RELENG_8, for which I have raised this PR:

>> misc/161526: script outputs corrupt if input is not from a terminal

>> Blast, should of course been bin/

JT> The extra ^D\b\b are the EOF character being echoed. These EOF
JT> characters are being generated by the new script(1) to pass through the
JT> EOF condition on stdin.

JT> One fix would be to change the termios settings temporarily to disable
JT> the echoing but this may cause problems if the application is changing
JT> termios settings concurrently and generally feels bad.

JT> It may be best to remove writing EOF characters, perhaps adding an
JT> option to enable it again if there is a concrete use case for it.

Without passing EOF to the program being scripted the following command
will hang forever:

    echo 1 |script /tmp/script.out cat

--
Mikolaj Golub
Re: /usr/bin/script eating 100% cpu with portupgrade and xargs
On Fri, 14 Oct 2011 22:50:32 +0200 Stefan Bethke wrote:

SB> I finally figured out why my ports aren't updating anymore: when
SB> running portupgrade -a --batch from cron, stdin is /dev/null, and that
SB> produces the gobs of ^D in the output, as well as the script file that
SB> portupgrade creates. What's worse is that the upgrade never completes.

SB> You can easily see this for yourself:
SB> # portupgrade -a --batch

SB> This is on 8-stable from October 5th.

Could you please try the patch I attached to my other mail in this thread
to see if it helps?

--
Mikolaj Golub
Re: /usr/bin/script eating 100% cpu with portupgrade and xargs
On Sat, 15 Oct 2011 11:50:22 +0200 Stefan Bethke wrote:

SB> On 15.10.2011 at 09:36, Mikolaj Golub wrote:

>>
>> On Fri, 14 Oct 2011 22:50:32 +0200 Stefan Bethke wrote:
>>
>> SB> I finally figured out why my ports aren't updating anymore: when
>> running portupgrade -a --batch from cron, stdin is /dev/null, and that
>> produces the gobs of ^D in the output, as well as the script file that
>> portupgrade creates. What's worse is that the upgrade never completes.
>>
>> SB> You can easily see this for yourself:
>> SB> # portupgrade -a --batch
>>
>> SB> This is on 8-stable from October 5th.
>>
>> Could you please try the patch I attached to another my mail in this thread
>> to see if it helps?

SB> Seems to do the trick, thanks!

Thanks for testing! Committed. I am going to MFC it soon.

--
Mikolaj Golub
Re: problems with hast
Hi,

On Wed, 18 Jan 2012 20:23:25 +0200 Artem Kajalainen wrote:

AK> Hello,

AK> I'm trying to setup hastd on two servers and got error, which I can't
AK> understand. Box is running as primary, then i reboot it, another box
AK> get primary role by carp events, then 1st box at boot tries to set up
AK> primary role on own hast instance and fails with this:

AK> Jan 18 22:13:03 gw_chlb_2 hastd[1387]: [storage0] (primary)
AK> G_GATE_CMD_DONE failed: No such file or directory.
AK> Jan 18 22:13:08 gw_chlb_2 hastd[1004]: [storage0] (primary) Worker
AK> process exited ungracefully (pid=1387, exitcode=71).

AK> I thought that geom_gate module can be problem, so i compiled it in
AK> kernel. As you can see - it doesn't help. Both servers are
AK> FreeBSD9.0-stable, updated 1 week ago. Hastd use whole disk. More info
AK> from hastd:

AK> gw_chlb_2# hastd -dF -c /etc/hast.conf
AK> [INFO] Started successfully, running protocol version 1.
AK> [DEBUG][1] Listening on control address /var/run/hastctl.
AK> [INFO] Listening on address 192.168.0.1:8457.
AK> [INFO] [storage0] (init) Role changed to primary.
AK> [DEBUG][1] [storage0] (primary) Obtained info about /dev/ada2.
AK> [DEBUG][1] [storage0] (primary) Locked /dev/ada2.
AK> [INFO] [storage0] (primary) Device hast/storage0 created.
AK> [DEBUG][1] [storage0] (primary) Privileges successfully dropped using
AK> jail+setgid+setuid.
AK> [INFO] [storage0] (primary) Privileges successfully dropped.
AK> [INFO] [storage0] (primary) Connected to tcp4://192.168.0.2.
AK> [INFO] [storage0] (primary) Synchronization started. 6.0MB to go.
AK> [ERROR] [storage0] (primary) G_GATE_CMD_DONE failed: No such file or directory.
AK> [INFO] [storage0] (primary) Received cancel from the kernel, exiting.
AK> [DEBUG][1] Unable to receive event header: Socket is not connected.
AK> [ERROR] [storage0] (primary) Worker process exited ungracefully
AK> (pid=1452, exitcode=71).
AK> [INFO] [storage0] (primary) Changing resource role back to init.

AK> Any thoughts?

Sorry, Artem, I read your email only today.

After investigating, it looks like since r226859, when 'async' mode was
added, we have 2 issues with synchronization from secondary to primary
(normally a rather rare case):

1) When the synchronization from secondary to primary is running and the
   primary gets a READ request, the request should be sent to the secondary
   but actually it is lost. As a result the READ operation gets stuck.
   After the synchronization is complete the following READ requests, which
   can now be served by the primary, work ok.

2) In async mode, for synchronization requests, the write_complete()
   function, which sends the G_GATE_CMD_DONE command to ggate, is called
   twice and the second call fails.

Artem, did you run async mode? If you did then I suppose you observed the
second issue.

Could you please try the attached patch?

--
Mikolaj Golub

Index: sbin/hastd/primary.c
===================================================================
--- sbin/hastd/primary.c	(revision 230661)
+++ sbin/hastd/primary.c	(working copy)
@@ -1255,7 +1255,7 @@ ggate_recv_thread(void *arg)
 		pjdlog_debug(2,
 		    "ggate_recv: (%p) Moving request to the send queues.", hio);
 		refcount_init(&hio->hio_countdown, ncomps);
-		for (ii = ncomp; ii < ncomps; ii++)
+		for (ii = ncomp; ncomps != 0; ncomps--, ii++)
 			QUEUE_INSERT1(hio, send, ii);
 	}
 	/* NOTREACHED */
@@ -1326,7 +1326,7 @@ local_send_thread(void *arg)
 		} else {
 			hio->hio_errors[ncomp] = 0;
 			if (hio->hio_replication ==
-			    HAST_REPLICATION_ASYNC) {
+			    HAST_REPLICATION_ASYNC && !ISSYNCREQ(hio)) {
 				ggio->gctl_error = 0;
 				write_complete(res, hio);
 			}
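A sketch of why the first hunk of the patch above fixes the lost READ request. It assumes, which the mail does not state, that component 0 is the local disk, component 1 is the remote node, and that a READ served by the remote is queued with ncomp = 1 and ncomps = 1. The helper names are hypothetical; each returns a bitmask of the components the request would be queued to.

```c
#include <assert.h>

/* Old loop form: ncomps is treated as an end index. */
unsigned
queued_old(int ncomp, int ncomps)
{
	unsigned mask = 0;
	int ii;

	assert(ncomp >= 0 && ncomps >= 0 && ncomp + ncomps <= 32);
	for (ii = ncomp; ii < ncomps; ii++)
		mask |= 1u << ii;
	return (mask);
}

/* Patched loop form: ncomps is a count of components starting at ncomp. */
unsigned
queued_new(int ncomp, int ncomps)
{
	unsigned mask = 0;
	int ii;

	assert(ncomp >= 0 && ncomps >= 0 && ncomp + ncomps <= 32);
	for (ii = ncomp; ncomps != 0; ncomps--, ii++)
		mask |= 1u << ii;
	return (mask);
}
```

Under that assumption, the old form queues a remote-only request to nothing at all, because 1 < 1 is false on the first test, whereas both forms agree when the request starts at component 0.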
Re: 9.0 Stable unable to buildworld, missing KERN_PROC_ENV in kvm_proc.c
On Sun, 5 Feb 2012 20:09:08 +1100 Dewayne wrote:

D> Unfortunately 9.0 Stable fails to compile due to missing declaration of
D> KERN_PROC_ENV in /usr/src/lib/libkvm/kvm_proc.c. csup'ed from today.

D> Please refer to the following changes on 30-Jan-2012:
D> http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libkvm/kvm_proc.c.diff?r1=1.106.2.1;r2=1.106.2.2;f=h

D> Compile error reads:
D> cc -O2 -pipe -pipe -O2 -g0 -DSTRIP_FBSDID -UDEBUGGING -march=prescott -mtune=prescott -DLIBC_SCCS -I/usr/src/lib/libkvm
D> -DNDEBUG -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes
D> -Wmissing-prototypes -Wpointer-arith -Wno-uninitialized -Wno-pointer-sign -c /usr/src/lib/libkvm/kvm_proc.c
D> /usr/src/lib/libkvm/kvm_proc.c: In function 'kvm_argv':
D> /usr/src/lib/libkvm/kvm_proc.c:663: error: 'KERN_PROC_ENV' undeclared (first use in this function)
D> /usr/src/lib/libkvm/kvm_proc.c:663: error: (Each undeclared identifier is reported only once
D> /usr/src/lib/libkvm/kvm_proc.c:663: error: for each function it appears in.)

D> Am I the last person using i386 architecture? ;) I'm half joking. The
D> buildworld completes successfully for architecture=amd64.

And there should not be problems with i386 either. The error does not look
architecture-specific. Could you please recheck your sources and build
procedure, and give more details if the error still exists?

KERN_PROC_ENV is declared in sys/sys/sysctl.h, and this was MFCed in
r230754, before the lib/libkvm MFC (r230780) you are referring to.

--
Mikolaj Golub
Re: problems with hast
On Sun, 5 Feb 2012 10:27:54 +0100 Pawel Jakub Dawidek wrote:

PJD> The analysis and fixes look good to me, please go ahead and commit
PJD> (small nits below).

Thanks. Committed.

-- 
Mikolaj Golub
Re: Issue with hast replication
On Sun, 11 Mar 2012 19:54:57 +0100 Phil Regnauld wrote:

PR> Hi,
PR> I've got a fairly simple setup: two hosts running 9.0-R (will upgrade to stable
PR> if told to, but want to check here first), ZFS and HAST. HAST is configured to
PR> run on top of zvols configured on each host, as illustrated:
PR>      FS                   FS
PR>   +------+             +------+
PR>   | hvol | <- hastd -> | hvol |
PR>   +------+             +------+
PR>   | zvol |             | zvol |
PR>   +------+             +------+
PR>   | zfs  |             | zfs  |
PR>   +------+             +------+
PR>      h1                   h2
PR> Connection is gigabit to the same switch. No issues with large TCP
PR> transfers such as SCP/FTP.
PR> Config is vanilla:
PR> # zfs create -V 10G zfs/hvol
PR> hast.conf:
PR> resource hvol {
PR>     on h1 {
PR>         local /dev/zvol/zfs/hvol
PR>         remote tcp4://192.168.1.100
PR>     }
PR>     on h2 {
PR>         local /dev/zvol/zfs/hvol
PR>         remote tcp4://192.168.1.200
PR>     }
PR> }
PR> h1 is behaving fine as primary, either with h2 turned off or in init -
PR> but as soon as I set the role to secondary for h2, the receiver
PR> repeatedly crashes and restarts - see the traces below.
PR> Primary:
PR> Mar 11 02:02:30 h1 hastd[2282]: [hvol] (primary) Disconnected from tcp4://192.168.1.200.
PR> Mar 11 02:02:30 h1 hastd[2282]: [hvol] (primary) Unable to write synchronization data: Cannot allocate memory.
PR> Mar 11 02:02:41 h1 hastd[2282]: [hvol] (primary) Unable to send request (Cannot allocate memory): WRITE(31642091520, 131072).

31642091520 looks like a rather large offset for a 10 GB volume...

Just to be more confident that this is a HAST issue, could you please try the following experiment?

1) Stop hastd on h2.

2) On h1 run something like this:

dd if=/dev/zvol/zfs/hvol bs=131072 | ssh h2 dd bs=131072 of=/dev/zvol/zfs/hvol

(copy hvol from h1 to h2 without hastd to see if it succeeds).

Note: you will need to recreate the HAST provider on the secondary after this.

-- 
Mikolaj Golub
Re: Issue with hast replication
On Mon, 12 Mar 2012 15:31:27 +0100 Phil Regnauld wrote:

PR> Phil Regnauld (regnauld) writes:
>>
>> 7) ktrace on the destination dd:
>>
>> fstat(0,{ mode=p- ,inode=5,size=16384,blksize=4096 }) = 0 (0x0)
>> lseek(0,0x0,SEEK_CUR) ERR#29 'Illegal seek'
PR> [...]
>> Illegal seek, eh ? Any clues ?
>>
>> The boxes are identical (HP DL380 G6), though the RAM config is
>> different.
>>
>> Summary:
>>
>> - ssh works fine
>> - h1 zvol to h2 zvol over ssh fails
>> - h1 zvol to h2 /tmp/x over ssh is fine
>> - h2 /dev/zero locally to h2 zvol is fine
>> - h2 /tmp/x locally to h2 zvol fails at first, but works afterwards...
PR> A few more data points: dd from a local zvol to a local zvol on either
PR> machine works fine.
PR> Using nc instead of ssh, this time it's the sender nc dying:
PR> ktrace on the sender:
PR> 47704 nc CALL write(0x3,0x7fff5450,0x800)
PR> 47704 nc RET write -1 errno 32 Broken pipe
PR> 47704 nc PSIG SIGPIPE SIG_DFL code=0x10006
PR> truss on the sender:
PR> poll({3/POLLIN 0/POLLIN},2,-1) = 2 (0x2)
PR> read(3,0x7fff5450,2048) ERR#54 'Connection reset by peer'
PR> close(3) = 0 (0x0)
PR> On tcpdump, I do see the receiver send a FIN when using nc.
PR> When using ssh, the sender is sending the FIN.
PR> Anything else I can look for ?

It looks like in the case of hastd it was send(2) that returned ENOMEM, but it would be good to check. Could you please start synchronization again, ktrace the primary worker process when ENOMEM errors are observed, and show the output here?

If it is send(2) that fails, then monitoring netstat and network driver statistics might be helpful. Something like:

netstat -nax
netstat -naT
netstat -m
netstat -nid
sysctl -a dev.

And maybe:

vmstat -m
vmstat -z

-- 
Mikolaj Golub
Re: Issue with hast replication
On Tue, 13 Mar 2012 00:22:23 +0100 Phil Regnauld wrote:

PR> Mikolaj Golub (to.my.trociny) writes:
>>
>> It looks like in the case of hastd this was send(2) who returned ENOMEM, but
>> it would be good to check. Could you please start synchronization again,
>> ktrace primary worker process when ENOMEM errors are observed and show
>> output
>> here?
PR> Ok, took a little while, as running ktrace on the hastd does slow it down
PR> significantly, and the error normally occurs at 30-90 sec intervals.
PR>0x0f90 b2f3 3ad5 e657 7f0f 3e50 698f 5deb 12af |..:..W..>Pi.]...|
PR>0x0fa0 740d c343 6e80 75f3 e1a7 bfdf a4c1 f6a6 |t..Cn.u.|
PR>0x0fb0 ea85 655d e423 bd5e 42f7 7e9a 05d2 363a |..e].#.^B.~...6:|
PR>0x0fc0 025e a7b5 0956 417c f31c a6eb 2cd9 d073 |.^...VA|,..s|
PR>0x0fd0 2589 e8c0 d76a 889f 8345 eeaf f2a0 c2d6 |%j...E..|
PR>0x0fe0 b89e aaef fee2 6593 e515 7271 88aa cf66 |..e...rq...f|
PR>0x0ff0 d272 411a 7289 d6c9 6643 bdbe 3c8c 8ae8 |.rA.r...fC..<...|
PR> 50959 hastd RET sendto 32768/0x8000
PR> 50959 hastd CALL sendto(0x6,0x8024bf000,0x8000,0x2,0,0)
PR> 50959 hastd RET sendto -1 errno 12 Cannot allocate memory
PR> 50959 hastd CALL clock_gettime(0xd,0x7f3f86f0)
PR> 50959 hastd RET clock_gettime 0
PR> 50959 hastd CALL getpid
PR> 50959 hastd RET getpid 50959/0xc70f
PR> 50959 hastd CALL sendto(0x3,0x7f3f8780,0x84,0,0,0)
PR> 50959 hastd GIO fd 3 wrote 132 bytes
PR>"<27>Mar 12 23:42:43 hastd[50959]: [hvol] (primary) Unable to sen\
PR> d request (Cannot allocate memory): WRITE(8626634752, 131072)."
PR> 50959 hastd RET sendto 132/0x84
PR> 50959 hastd CALL close(0x7)
PR> 50959 hastd RET close 0

OK. So it is send(2). I suppose the network driver could generate the error. Did you say which network adapter you have?

>> If it is send(2) who fails then monitoring netstat and network driver
>> statistics might be helpful. Something like
>>
>> netstat -nax
>> netstat -naT
>> netstat -m
>> netstat -nid
PR> I could run this in a loop, but that would be a lot of data, and might
PR> not be appropriate to paste here.
PR> I didn't see any obvious errors, but I'm not sure what I'm looking for.
PR> netstat -m didn't show anything close to running out of buffers or
PR> clusters...
>> sysctl -a dev.
>>
>> And may be
>>
>> vmstat -m
>> vmstat -z
PR> No obvious errors there either, but again what should I look out for ?

I would look at the sysctl -a dev. statistics and try to find out whether there is a correlation between the ENOMEM failures and growth of the error counters.

PR> In the meantime, I've also experimented with a few different scenarios, and
PR> I'm quite puzzled.
PR> For instance, I configured one of the other gigabit cards on each host to
PR> provide a dedicated replication network. The main difference is that up
PR> until now this has been running using tagged vlans. To be on the safe side,
PR> I decided to use an untagged interface (the second gigabit adapter in each
PR> machine).
PR>
PR> Here's where I observed, and it is very odd:
PR>
PR> - doing a dd ... | ssh dd fails in the same fashion as before
PR> - I created a second zvol + hast resource of just 1 GB, and it replicated
PR> without any problems, peaking at 75 MB / sec (!) - maybe 1GB is too small
PR> ?
PR>
PR> (side note: hastd doesn't pick up configuration changes even with SIGHUP,
PR> which makes it hard to provision new resources on the fly)
PR> - I restarted replication on the 100 G hast resource, and it's currently
PR> replicating without any problems over the second ethernet, but it's
PR> dragging along at 9-10 MB/sec, peaking at 29 MB/sec occasionally.

Looking at buffer usage in 'netstat -nax' output taken during synchronization (on both hosts) could provide useful information about where the bottleneck is. top -HS output might be useful too.

PR> Earlier, I was observing peaks at 65-70 MB sec in between failures...
PR> So I don't really know what to conclude :-|

-- 
Mikolaj Golub
Re: Issue with hast replication
On Tue, 13 Mar 2012 22:19:28 +0100 Phil Regnauld wrote:

PR> dev.bce.0.l2fhdr_error_count: 0
PR> dev.bce.0.stat_emac_tx_stat_dot3statsinternalmactransmiterrors: 0
PR> dev.bce.0.stat_Dot3StatsCarrierSenseErrors: 0
PR> dev.bce.0.stat_Dot3StatsFCSErrors: 0
PR> dev.bce.0.stat_Dot3StatsAlignmentErrors: 0

What about failed counters like mbuf_alloc_failed_count, dma_map_addr_rx_failed_count, dma_map_addr_tx_failed_count?

-- 
Mikolaj Golub
Re: Issue with hast replication
On Tue, 13 Mar 2012 00:22:23 +0100 Phil Regnauld wrote:

PR> (side note: hastd doesn't pick up configuration changes even with SIGHUP,
PR> which makes it hard to provision new resources on the fly)

I just tried to reproduce this and failed. For me a new resource was added without problems on reload.

Mar 17 20:04:24 kopusha hastd[52678]: Reloading configuration...
Mar 17 20:04:24 kopusha hastd[52678]: Keep listening on address 0.0.0.0:7771.
Mar 17 20:04:24 kopusha hastd[52678]: Resource rtest added.
Mar 17 20:04:24 kopusha hastd[52678]: Configuration reloaded successfully.

You sent SIGHUP to the master process, and on both hosts, didn't you?

Could you please provide more details (configuration, log messages) if you still fail to add new resources on the fly?

-- 
Mikolaj Golub
Re: svn commit: r233953 - stable/8/usr.bin/procstat
On Sun, 8 Apr 2012 17:12:18 -0400 Jason Hellenthal wrote:

JH> This commit in action does not seem to be doing the correct thing even
JH> though it does report an error when kern.proc.pathname is not known.
JH> Running procstat -a -b produces:
JH> [...]
JH> 1848 ksh 803500 /bin/ksh
JH> procstat: sysctl: kern.proc.pathname: 2208: No such file or directory

Please note, the error was generated by the kern.proc.pathname sysctl, not kern.proc.osrel. I suppose it is because the /bin/ksh binary had been reinstalled and 1848 ran the old binary. My commit has not touched the kern.proc.pathname part, and the behavior was the same before the change.

JH> 2210 ksh 803500 /bin/ksh
JH> [...]
JH> While procstat -a produces:
JH> [...]
JH> 1848 1846 1848 1848 1848 1 jhellenthal wait FreeBSD ELF32 ksh
JH> 2208 1814 2208 2208 0 1 jhellenthal select FreeBSD ELF32 xterm
JH> 2210 2208 2210 2210 2210 1 jhellenthal wait FreeBSD ELF32 ksh
JH> [...]
JH> If process 2208 can be seen during (procstat -a) I do not see a reason
JH> to bailout and print an error when (-b) is used. Just print the
JH> orrelease as (0) and print the rest of the information that should be
JH> seen...
JH> Could someone have a closer look at this?
JH> On Fri, Apr 06, 2012 at 04:32:29PM +, Mikolaj Golub wrote:
>> Author: trociny
>> Date: Fri Apr 6 16:32:29 2012
>> New Revision: 233953
>> URL: http://svn.freebsd.org/changeset/base/233953
>>
>> Log:
>> MFC r233390:
>>
>> When displaying binary information show also osreldate.
>> >> Suggested by:kib >> >> Modified: >> stable/8/usr.bin/procstat/procstat.1 >> stable/8/usr.bin/procstat/procstat_bin.c >> Directory Properties: >> stable/8/usr.bin/procstat/ (props changed) >> >> Modified: stable/8/usr.bin/procstat/procstat.1 >> == >> --- stable/8/usr.bin/procstat/procstat.1Fri Apr 6 16:31:29 2012 >> (r233952) >> +++ stable/8/usr.bin/procstat/procstat.1Fri Apr 6 16:32:29 2012 >> (r233953) >> @@ -25,7 +25,7 @@ >> .\" >> .\" $FreeBSD$ >> .\" >> -.Dd March 7, 2010 >> +.Dd March 23, 2012 >> .Dt PROCSTAT 1 >> .Os >> .Sh NAME >> @@ -98,6 +98,8 @@ Display the process ID, command, and pat >> process ID >> .It COMM >> command >> +.It OSREL >> +osreldate for process binary >> .It PATH >> path to process binary (if available) >> .El >> >> Modified: stable/8/usr.bin/procstat/procstat_bin.c >> == >> --- stable/8/usr.bin/procstat/procstat_bin.cFri Apr 6 16:31:29 >> 2012(r233952) >> +++ stable/8/usr.bin/procstat/procstat_bin.cFri Apr 6 16:32:29 >> 2012(r233953) >> @@ -42,11 +42,11 @@ void >> procstat_bin(pid_t pid, struct kinfo_proc *kipp) >> { >> char pathname[PATH_MAX]; >> -int error, name[4]; >> +int error, osrel, name[4]; >> size_t len; >> >> if (!hflag) >> -printf("%5s %-16s %-53s\n", "PID", "COMM", "PATH"); >> +printf("%5s %-16s %8s %s\n", "PID", "COMM", "OSREL", >> "PATH"); >> >> name[0] = CTL_KERN; >> name[1] = KERN_PROC; >> @@ -64,7 +64,19 @@ procstat_bin(pid_t pid, struct kinfo_pro >> if (len == 0 || strlen(pathname) == 0) >> strcpy(pathname, "-"); >> >> +name[2] = KERN_PROC_OSREL; >> + >> +len = sizeof(osrel); >> +error = sysctl(name, 4, &osrel, &len, NULL, 0); >> +if (error < 0 && errno != ESRCH) { >> +warn("sysctl: kern.proc.osrel: %d", pid); >> +return; >> +} >> +if (error < 0) >> +return; >> + >> printf("%5d ", pid); >> printf("%-16s ", kipp->ki_comm); >> +printf("%8d ", osrel); >> printf("%s\n", pathname); >> } >> ___ >> svn-src-stabl...@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/svn-src-stable-8 >> To unsubscribe, 
send any mail to "svn-src-stable-8-unsubscr...@freebsd.org" JH> -- JH> ;s =; -- Mikolaj Golub ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: nfe0 loses network connectivity (8.0-RELEASE-p2)
On Mon, 7 Jun 2010 16:06:11 +0200 Olaf Seibert wrote:

OS> I do get the impression there is a mbuf leak somehow. On a much older
OS> file server (FreeBSD 6.1, serves a bit of NFS but has no ZFS) the mbuf
OS> cluster useage is much lower, despite a longer uptime:
OS> 256/634/890/25600 mbuf clusters in use (current/cache/total/max)
OS> Also, it shows signs that measures are taken in case of mbuf shortage:
OS> 2259806/466391/598621 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
OS> 1016 calls to protocol drain routines
OS> whereas the FreeBSD 8.0 machine has zero or very low numbers:
OS> 0/3956/1959 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
OS> 0 calls to protocol drain routines
OS> and useage keeps growing:
OS> 26122/1782/27904/32768 mbuf clusters in use (current/cache/total/max)

This looks like the issue that has been fixed in STABLE:

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/144330

-- 
Mikolaj Golub
Re: freeBSD nullfs together with nfs and "silly rename"
On Sun, 6 Jun 2010 16:44:43 +0200 Leon Meßner wrote:

LM> Hi,
LM> I hope this is not the wrong list to ask. Didn't get any answers on
LM> -questions.
LM> When you try to do the following inside a nullfs mounted directory,
LM> where the nullfs origin is itself mounted via nfs you get an error:
LM> # foo
LM> # tail -f foo&
LM> # rm -f foo
LM> tail: foo: Stale NFS file handle
LM> # fg
LM> This is really a problem when running services inside jails and using
LM> NFS as storage. As of [2] it looks like this problem is known for a
LM> while. On a normal NFS mount this does not happen as "silly renaming"
LM> [1] works there (producing nasty little .nfs files).

nfs_sillyrename() is called when the vnode's usecount is more than 1. It is expected that the unlink() syscall increases the vnode's usecount in namei(), so if the file has already been opened, usecount will be more than 1. But with a nullfs layer present, the reference counts are held by the upper node, not the lower (NFS) one. So when unlink() is called, it increases the usecount of the upper vnode, not the NFS vnode, and nfs_sillyrename() is never called.

The straightforward solution seems to be to implement null_remove(), which increases the lower vnode's refcount before calling null_bypass() and decrements it after the call.

See the attached patch (it works for me on both 8-STABLE and CURRENT).

-- 
Mikolaj Golub
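The userland behavior that the "tail -f foo & rm -f foo" sequence relies on can be sketched as a small C helper: a file that is unlinked while still open must remain readable through the open descriptor. This is only an illustration of the expected semantics, not a reproduction of the nullfs-over-NFS setup; read_after_unlink() is a hypothetical name, and the path passed to it is up to the caller. On a local filesystem the read should succeed; on an affected nullfs-over-NFS mount the read would be expected to fail with a stale file handle error, per the report above.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a file containing msg, unlink it while it is still open, then
 * read the contents back through the open descriptor.
 * Returns the number of bytes read back, or -1 on error.
 */
static ssize_t
read_after_unlink(const char *path, const char *msg, char *buf, size_t buflen)
{
	size_t len = strlen(msg);
	ssize_t n;
	int fd;

	assert(buf != NULL && buflen > 0);

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return (-1);
	if (write(fd, msg, len) != (ssize_t)len) {
		/* Clean up the name before reporting the failure. */
		(void)unlink(path);
		close(fd);
		return (-1);
	}
	/* The name goes away here; the open file must stay usable. */
	if (unlink(path) == -1) {
		close(fd);
		return (-1);
	}
	n = pread(fd, buf, buflen, 0);
	close(fd);
	return (n);
}
```

Running the helper once against a path on a plain NFS mount and once against the same directory seen through nullfs would show whether silly renaming took effect in each case.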
Re: freeBSD nullfs together with nfs and "silly rename"
On Sat, 12 Jun 2010 11:56:10 +0300 Mikolaj Golub wrote to Leon Meßner:

MG> See the attached patch (it works for me on both 8-STABLE and CURRENT).

Sorry, actually here is the patch.

-- 
Mikolaj Golub

Index: sys/fs/nullfs/null_vnops.c
===================================================================
--- sys/fs/nullfs/null_vnops.c (revision 208960)
+++ sys/fs/nullfs/null_vnops.c (working copy)
@@ -499,6 +499,23 @@
 }

 /*
+ * Increasing refcount of lower vnode is needed at least for the case
+ * when lower FS is NFS to do sillyrename if the file is in use.
+ */
+static int
+null_remove(struct vop_remove_args *ap)
+{
+    int retval;
+    struct vnode *lvp;
+
+    lvp = NULLVPTOLOWERVP(ap->a_vp);
+    VREF(lvp);
+    retval = null_bypass(&ap->a_gen);
+    vrele(lvp);
+    return (retval);
+}
+
+/*
  * We handle this to eliminate null FS to lower FS
  * file moving. Don't know why we don't allow this,
  * possibly we should.
@@ -809,6 +826,7 @@
     .vop_open = null_open,
     .vop_print = null_print,
     .vop_reclaim = null_reclaim,
+    .vop_remove = null_remove,
     .vop_rename = null_rename,
     .vop_setattr = null_setattr,
     .vop_strategy = VOP_EOPNOTSUPP,
Re: Has anyone usd hast in production yet - opinions ?
On Mon, 04 Oct 2010 16:55:05 +0100 Pete French wrote:

>> Please see the freebsd-fs mailing list, which has quite a large number
>> of problem reports/issues being posted to it on a regular basis (and
>> patches are often provided).
PF> Thanks have signed up - I was signed up to 'geom' but not that one.
PF> A large number of problem reports is not quite what I was hping for, but
PF> good to know,a nd maybe I shall hold off for a while :-)

Being the author of many of the problem reports, I can say that most of them were not critical and concerned marginal cases (like some issues with hooks, or a race that showed up when changing the HAST role in a loop -- you would never do this in production). And fixes were committed within several days of a report. I don't know of any open issue.

-- 
Mikolaj Golub
Re: hast vs ggate+gmirror sychrnoisation speed
On Thu, 21 Oct 2010 13:25:34 +0100 Pete French wrote:

PF> Well, I bit the bullet and moved to using hast - all went beautifully,
PF> and I migrated the pool with no downtime. The one thing I do notice,
PF> however, is that the synchronisation with hast is much slower
PF> than the older ggate+gmirror combination. It's about half the
PF> speed in fact.
PF> When I orginaly setup my ggate configuration I did a lot of tweaks to
PF> get the speed good - these copnsisted of expanding the send and
PF> receive space for the sockets using sysctl.conf, and then providing
PF> large buffers to ggate. Is there a way to control this with hast ?
PF> I still have the sysctls set (as the machines have not rebooted)
PF> but I cant see any options in hast.conf which are equivalent to the
PF> "-S 262144 -R 262144" which I use with ggate
PF> Any advice, or am I barking up the wrong tree here ?

Currently there are no options in hast.conf to change the send and receive buffer sizes. They are hardcoded in sbin/hastd/proto_tcp4.c:

    val = 131072;
    if (setsockopt(tctx->tc_fd, SOL_SOCKET, SO_SNDBUF, &val,
        sizeof(val)) == -1) {
        pjdlog_warning("Unable to set send buffer size on %s", addr);
    }
    val = 131072;
    if (setsockopt(tctx->tc_fd, SOL_SOCKET, SO_RCVBUF, &val,
        sizeof(val)) == -1) {
        pjdlog_warning("Unable to set receive buffer size on %s", addr);
    }

You could change the values and recompile hastd :-). It would be interesting to know the results of your experiment (if you do it).

Also note that there is another hardcoded value, in sbin/hastd/proto_common.c:

    /* Maximum size of packet we want to use when sending data. */
    #define MAX_SEND_SIZE 32768

which looks like it might affect synchronization speed too. Previously we had 128kB here, but this was changed to 32kB because slow synchronization was reported with MAX_SEND_SIZE=128kB.

http://svn.freebsd.org/viewvc/base?view=revision&revision=211452

I wonder whether the slow synchronization with MAX_SEND_SIZE=131072 could be due to SO_SNDBUF/SO_RCVBUF being equal to this size. Maybe by increasing SO_SNDBUF/SO_RCVBUF we could reach better performance with MAX_SEND_SIZE=128kB?

-- 
Mikolaj Golub
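For anyone trying the buffer-size experiment outside hastd first, a minimal standalone sketch is below. It uses only the standard setsockopt(2)/getsockopt(2) calls with SO_SNDBUF and SO_RCVBUF; the helper names set_sock_bufsize() and get_sock_bufsize() are hypothetical and are not part of hastd. Reading the value back matters because the kernel may clamp or adjust the requested size according to system limits.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Request the same send and receive buffer size on a socket.
 * Returns 0 on success, -1 if either setsockopt() call fails.
 */
static int
set_sock_bufsize(int fd, int size)
{
	assert(size > 0);

	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size)) == -1)
		return (-1);
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size)) == -1)
		return (-1);
	return (0);
}

/*
 * Read back the buffer size the kernel actually applied.
 * optname is SO_SNDBUF or SO_RCVBUF. Returns -1 on error.
 */
static int
get_sock_bufsize(int fd, int optname)
{
	socklen_t len;
	int val = 0;

	len = sizeof(val);
	if (getsockopt(fd, SOL_SOCKET, optname, &val, &len) == -1)
		return (-1);
	return (val);
}
```

Calling set_sock_bufsize(fd, 262144) on a fresh TCP socket and then printing both get_sock_bufsize() results would show whether the "-S 262144 -R 262144" equivalent is actually honored on a given host.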
Re: hast vs ggate+gmirror sychrnoisation speed
On Mon, 25 Oct 2010 11:55:34 +0100 Pete French wrote:

>> You could change the values and recompile hastd :-). It would be interesting
>> to know about the results of your experiment (if you do).
PF> I changed the buffer sizes to the same as I was using for ggate, but the speed
PF> is still the same - 44meg/second (about half of what the link can do)

You can check whether the queue size is an issue by monitoring Recv-Q and Send-Q for the hastd connections with netstat during the test, running something like this:

while sleep 1; do netstat -na |grep '\.8457.*ESTAB'; done

Also tcpdump may help :-)

-- 
Mikolaj Golub
Re: hast vs ggate+gmirror sychrnoisation speed
On Tue, 26 Oct 2010 17:01:01 +0100 Pete French wrote:

PF> Actually, I just llooked I dmesg on the secondary - it is full
PF> of messages thus:
PF> Oct 26 15:44:59 serpentine-passive hastd[10394]: [serp0] (secondary) Unable to receive request header: RPC version wrong.
PF> Oct 26 15:45:00 serpentine-passive hastd[782]: [serp0] (secondary) Worker process exited ungracefully (pid=10394, exitcode=75).
PF> Oct 26 15:46:59 serpentine-passive hastd[10421]: [serp0] (secondary) Unable to receive request header: RPC version wrong.
PF> Oct 26 15:47:04 serpentine-passive hastd[782]: [serp0] (secondary) Worker process exited ungracefully (pid=10421, exitcode=75).

I saw this too, but only as sporadic messages, so I forgot about it and did not investigate at the time :-). Now, running synchronization, I see them too (but again only sporadically). Setting an assertion and looking at the received header:

(gdb) list
309             goto fail;
310
311         if (hdr.version != HAST_PROTO_VERSION) {
312             assert(0);
313             errno = ERPCMISMATCH;
314             goto fail;
315         }
316
317         hdr.size = le32toh(hdr.size);
318
(gdb) p/x hdr
$2 = {version = 0x9, size = 0x65657266}

So it looks like garbage.

In hast_proto_send() we send the header and then the data. Couldn't it be that the remote_send and sync threads interfere and their packets get mixed? Maybe some synchronization is needed here?

I put sleep(1) in hast_proto_send() between proto_send(header) and proto_send(data). The error started to occur frequently.

-- 
Mikolaj Golub
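One quick way to examine the "garbage" size field is to view its bytes as text. The sketch below is an editorial illustration, not part of hastd; le32_as_text() is a hypothetical helper. It lists the four bytes lowest-order first, which is the in-memory order on a little-endian host (an assumption here, since the value was printed before le32toh() was applied). Under that assumption 0x65657266 reads as the ASCII string "free", which would be consistent with non-header bytes landing where a header was expected.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Render the four bytes of a 32-bit value, lowest-order byte first,
 * as printable text. Non-printable bytes are shown as '.'.
 * out must have room for 5 characters including the terminating NUL.
 */
static void
le32_as_text(uint32_t v, char out[5])
{
	int i;

	assert(out != NULL);

	for (i = 0; i < 4; i++) {
		unsigned char c = (unsigned char)((v >> (8 * i)) & 0xff);

		out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
	}
	out[4] = '\0';
}
```

The same trick applies to any header field that looks implausible in a debugger: if it decodes to readable text, the stream was probably parsed at the wrong offset.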
Re: hast vs ggate+gmirror sychrnoisation speed
On Thu, 28 Oct 2010 18:30:36 +0200 Pawel Jakub Dawidek wrote:

PJD> On Wed, Oct 27, 2010 at 10:05:20PM +0300, Mikolaj Golub wrote:
>> In hast_proto_send() we send header and then data. Couldn't it be that
>> remote_send and sync threads interfere and their packets are mixed? May be
>> some
>> synchronization is needed here?
>>
>> I set sleep(1) in hast_proto_send() between proto_send(header) and
>> proto_send(data). The error started to occur frequently.
PJD> Synchronization requests are sent through the remote thread just like
PJD> regular I/O requests, exactly because of races that can occur.
PJD> I looked at the code and the keepalive packets arbe sent from another
PJD> thread. Could you try turning them off in primary.c and see if that
PJD> helps?

At first I set RETRY_SLEEP to 1 sec to have more keepalive packets. The errors started to appear frequently:

Oct 28 21:35:53 bolek hastd[1709]: [storage] (secondary) Unable to receive request header: RPC version wrong.
Oct 28 21:35:54 bolek hastd[1632]: [storage] (secondary) Worker process exited ungracefully (pid=1709, exitcode=75).
Oct 28 21:36:12 bolek hastd[1722]: [storage] (secondary) Unable to receive request header: RPC version wrong.
Oct 28 21:36:12 bolek hastd[1632]: [storage] (secondary) Worker process exited ungracefully (pid=1722, exitcode=75).
...

Now I have been running synchronization for more than half an hour with keepalive_send disabled and have not seen any errors.

-- 
Mikolaj Golub
Re: hast vs ggate+gmirror sychrnoisation speed
On Thu, 28 Oct 2010 22:08:54 +0300 Mikolaj Golub wrote to Pawel Jakub Dawidek:

PJD>> I looked at the code and the keepalive packets arbe sent from another
PJD>> thread. Could you try turning them off in primary.c and see if that
PJD>> helps?
MG> At first I set RETRY_SLEEP to 1 sec to have more keepalive packets. The errors
MG> started to observe frequently:
MG> Oct 28 21:35:53 bolek hastd[1709]: [storage] (secondary) Unable to receive request header: RPC version wrong.
MG> Oct 28 21:35:54 bolek hastd[1632]: [storage] (secondary) Worker process exited ungracefully (pid=1709, exitcode=75).
MG> Oct 28 21:36:12 bolek hastd[1722]: [storage] (secondary) Unable to receive request header: RPC version wrong.
MG> Oct 28 21:36:12 bolek hastd[1632]: [storage] (secondary) Worker process exited ungracefully (pid=1722, exitcode=75).
MG> ...
MG> Now I have been running synchronization for more then a half an hour with
MG> keepalive_send disabled and have not seen any error.

So :-) What do you think about sending keepalives in remote_send_thread() to avoid this problem, and sending them only when a connection is idle (there seems to be not much use in sending them all the time)? Something like the patch below (it works for me).
-- Mikolaj Golub Index: sbin/hastd/primary.c === --- sbin/hastd/primary.c (revision 214550) +++ sbin/hastd/primary.c (working copy) @@ -190,6 +190,19 @@ static pthread_mutex_t metadata_lock; hio_next[(ncomp)]); \ mtx_unlock(&hio_##name##_list_lock[(ncomp)]); \ } while (0) +#define QUEUE_TRY1(hio, name, ncomp) do { \ + mtx_lock(&hio_##name##_list_lock[(ncomp)]); \ + (hio) = TAILQ_FIRST(&hio_##name##_list[(ncomp)]); \ + if (hio == NULL) { \ + cv_timedwait(&hio_##name##_list_cond[(ncomp)], \ + &hio_##name##_list_lock[(ncomp)], RETRY_SLEEP); \ + hio = TAILQ_FIRST(&hio_##name##_list[(ncomp)]); \ + }\ + if (hio != NULL) \ + TAILQ_REMOVE(&hio_##name##_list[(ncomp)], hio, \ + hio_next[(ncomp)]); \ + mtx_unlock(&hio_##name##_list_lock[(ncomp)]); \ +} while (0) #define QUEUE_TAKE2(hio, name) do { \ mtx_lock(&hio_##name##_list_lock);\ while (((hio) = TAILQ_FIRST(&hio_##name##_list)) == NULL) { \ @@ -1176,6 +1189,38 @@ local_send_thread(void *arg) return (NULL); } +static void +keepalive_send(struct hast_resource *res, unsigned int ncomp) +{ + struct nv *nv; + + if (!ISCONNECTED(res, ncomp)) + return; + + assert(res->hr_remotein != NULL); + assert(res->hr_remoteout != NULL); + + nv = nv_alloc(); + nv_add_uint8(nv, HIO_KEEPALIVE, "cmd"); + if (nv_error(nv) != 0) { + nv_free(nv); + pjdlog_debug(1, + "keepalive_send: Unable to prepare header to send."); + return; + } + if (hast_proto_send(res, res->hr_remoteout, nv, NULL, 0) < 0) { + pjdlog_common(LOG_DEBUG, 1, errno, + "keepalive_send: Unable to send request"); + nv_free(nv); + rw_unlock(&hio_remote_lock[ncomp]); + remote_close(res, ncomp); + rw_rlock(&hio_remote_lock[ncomp]); + return; + } + nv_free(nv); + pjdlog_debug(2, "keepalive_send: Request sent."); +} + /* * Thread sends request to secondary node. 
*/ @@ -1184,6 +1229,7 @@ remote_send_thread(void *arg) { struct hast_resource *res = arg; struct g_gate_ctl_io *ggio; + time_t lastcheck, now; struct hio *hio; struct nv *nv; unsigned int ncomp; @@ -1194,10 +1240,19 @@ remote_send_thread(void *arg) /* Remote component is 1 for now. */ ncomp = 1; + lastcheck = time(NULL); for (;;) { pjdlog_debug(2, "remote_send: Taking request."); - QUEUE_TAKE1(hio, send, ncomp); + QUEUE_TRY1(hio, send, ncomp); + if (hio == NULL) { + now = time(NULL); + if (lastcheck + RETRY_SLEEP <= now) { +keepalive_send(res, ncomp); +lastcheck = now; + } + continue; + } pjdlog_debug(2, "remote_send: (%p) Got request.", hio); ggio = &hio->hio_ggio; switch (ggio->gctl_cmd) { @@ -1883,32 +1938,6 @@ failed: } static void -keepalive_send(struct hast_resource *res, unsigned int ncomp) -{ - struct nv *nv; - - nv = nv_alloc(); - nv_add_uint8(nv, HIO_KEEPALIVE, "cmd"); - if (nv_error(nv) != 0) { - nv_free(nv); - pjdlog_debug(1, - "keepalive_send: Unable to prepare header to send."); - return; - } - if (hast_proto_send(res, res->hr_remoteout, nv, NULL, 0) < 0) { - pjdlog_common(LOG_DEBUG, 1, errno, - "keepalive_send: Unable to send request"); - nv_free(nv); - rw_unlock(&hio_remote_lock[ncomp]); - remote_close(res, ncomp); - rw_rlock(&hio_remote_lock[ncomp]); - return; - } - nv_free(nv); - pjdlog_debug(2, "keepalive_send: Request sent."); -} - -static void guard_one(struct hast_resource *res, unsigned int ncomp) { struct proto_conn *in, *out; @@ -192
Re: hast vs ggate+gmirror sychrnoisation speed
On Mon, 1 Nov 2010 12:01:00 +0100 Pawel Jakub Dawidek wrote:

 PJD> I like your patch and I agree of course it is better to send keepalive
 PJD> packets only when connection is idle. The only thing I'd change is to
 PJD> modify QUEUE_TAKE1() macro to take additional argument 'timeout' - if we
 PJD> don't want it to time out, we pass 0. Could you modify your patch?

Sure :-). Could you look at the updated version?

Note: so far I have only tested that hastd with this updated patch compiles
and runs. I will do proper testing later today, when I have access to my test
instances, and will report the results.

-- 
Mikolaj Golub

Index: sbin/hastd/primary.c
===================================================================
--- sbin/hastd/primary.c	(revision 214624)
+++ sbin/hastd/primary.c	(working copy)
@@ -180,14 +180,20 @@ static pthread_mutex_t metadata_lock;
 		if (_wakeup)						\
 			cv_signal(&hio_##name##_list_cond);		\
 } while (0)
-#define	QUEUE_TAKE1(hio, name, ncomp)	do {				\
+#define	QUEUE_TAKE1(hio, name, ncomp, timeout)	do {			\
+	bool _last;							\
+									\
 	mtx_lock(&hio_##name##_list_lock[(ncomp)]);			\
-	while (((hio) = TAILQ_FIRST(&hio_##name##_list[(ncomp)])) == NULL) { \
-		cv_wait(&hio_##name##_list_cond[(ncomp)],		\
-		    &hio_##name##_list_lock[(ncomp)]);			\
+	_last = false;							\
+	while (((hio) = TAILQ_FIRST(&hio_##name##_list[(ncomp)])) == NULL && !_last) { \
+		cv_timedwait(&hio_##name##_list_cond[(ncomp)],		\
+		    &hio_##name##_list_lock[(ncomp)], (timeout));	\
+		if ((timeout) != 0)					\
+			_last = true;					\
 	}								\
-	TAILQ_REMOVE(&hio_##name##_list[(ncomp)], (hio),		\
-	    hio_next[(ncomp)]);						\
+	if (hio != NULL)						\
+		TAILQ_REMOVE(&hio_##name##_list[(ncomp)], (hio),	\
+		    hio_next[(ncomp)]);					\
 	mtx_unlock(&hio_##name##_list_lock[(ncomp)]);			\
 } while (0)
 #define	QUEUE_TAKE2(hio, name)	do {					\
@@ -1112,7 +1118,7 @@ local_send_thread(void *arg)
 
 	for (;;) {
 		pjdlog_debug(2, "local_send: Taking request.");
-		QUEUE_TAKE1(hio, send, ncomp);
+		QUEUE_TAKE1(hio, send, ncomp, 0);
 		pjdlog_debug(2, "local_send: (%p) Got request.", hio);
 		ggio = &hio->hio_ggio;
 		switch (ggio->gctl_cmd) {
@@ -1176,6 +1182,38 @@ local_send_thread(void *arg)
 	return (NULL);
 }
 
+static void
+keepalive_send(struct hast_resource *res, unsigned int ncomp)
+{
+	struct nv *nv;
+
+	if (!ISCONNECTED(res, ncomp))
+		return;
+
+	assert(res->hr_remotein != NULL);
+	assert(res->hr_remoteout != NULL);
+
+	nv = nv_alloc();
+	nv_add_uint8(nv, HIO_KEEPALIVE, "cmd");
+	if (nv_error(nv) != 0) {
+		nv_free(nv);
+		pjdlog_debug(1,
+		    "keepalive_send: Unable to prepare header to send.");
+		return;
+	}
+	if (hast_proto_send(res, res->hr_remoteout, nv, NULL, 0) < 0) {
+		pjdlog_common(LOG_DEBUG, 1, errno,
+		    "keepalive_send: Unable to send request");
+		nv_free(nv);
+		rw_unlock(&hio_remote_lock[ncomp]);
+		remote_close(res, ncomp);
+		rw_rlock(&hio_remote_lock[ncomp]);
+		return;
+	}
+	nv_free(nv);
+	pjdlog_debug(2, "keepalive_send: Request sent.");
+}
+
 /*
  * Thread sends request to secondary node.
  */
@@ -1184,6 +1222,7 @@ remote_send_thread(void *arg)
 {
 	struct hast_resource *res = arg;
 	struct g_gate_ctl_io *ggio;
+	time_t lastcheck, now;
 	struct hio *hio;
 	struct nv *nv;
 	unsigned int ncomp;
@@ -1194,10 +1233,19 @@ remote_send_thread(void *arg)
 
 	/* Remote component is 1 for now. */
 	ncomp = 1;
+	lastcheck = time(NULL);
 
 	for (;;) {
 		pjdlog_debug(2, "remote_send: Taking request.");
-		QUEUE_TAKE1(hio, send, ncomp);
+		QUEUE_TAKE1(hio, send, ncomp, RETRY_SLEEP);
+		if (hio == NULL) {
+			now = time(NULL);
+			if (lastcheck + RETRY_SLEEP <= now) {
+				keepalive_send(res, ncomp);
+				lastcheck = now;
+			}
+			continue;
+		}
 		pjdlog_debug(2, "remote_send: (%p) Got request.", hio);
 		ggio = &hio->hio_ggio;
 		switch (ggio->gctl_cmd) {
@@ -1883,32 +1931,6 @@ failed:
 }
 
 static void
-keepalive_send(struct hast_resource *res, unsigned int ncomp)
-{
-	struct nv *nv;
-
-	nv = nv_alloc();
-	nv_add_uint8(nv, HIO_KEEPALIVE, "cmd");
-	if (nv_error(nv) != 0) {
-		nv_free(nv);
-		pjdlog_debug(1,
-		    "keepalive_send: Unable to prepare header to send.");
-		return;
-	}
-	if (hast_proto_send(res, res->hr_remoteout, nv, NULL, 0) < 0) {
-		pjdlog_common(LOG_DEBUG, 1, errno,
-		    "keepalive_send: Unable to send request");
-		nv_free(nv);
-		rw_unlock(&hio_remote_lock[ncomp]);
-		remote_close(res, ncomp);
-		rw_rlock(&hio_remote_lock[ncomp]);
-		return;
-	}
-	nv_free(nv);
-	pjdlog_debug(2, "keepalive_send: Request sent.");
-}
-
-static void
 guard_one(struct hast_resource *res, unsigned int ncomp)
 {
 	struct proto_conn *in, *out;
@@ -1926,12 +1948,6 @@ guard_one(struct
Re: hast vs ggate+gmirror synchronisation speed
On Mon, 01 Nov 2010 17:06:49 +0200 Mikolaj Golub wrote:

 MG> On Mon, 1 Nov 2010 12:01:00 +0100 Pawel Jakub Dawidek wrote:

 PJD>> I like your patch and I agree of course it is better to send keepalive
 PJD>> packets only when connection is idle. The only thing I'd change is to
 PJD>> modify QUEUE_TAKE1() macro to take additional argument 'timeout' - if we
 PJD>> don't want it to time out, we pass 0. Could you modify your patch?

 MG> Sure :-). Could you look at the updated version?

 MG> Note. So far I have only tested that hastd with this updated patch is
 MG> compilable and runnable. I will do normal testing today later when I have
 MG> access to my test instances and will report about the results.

Tested. It works for me.

-- 
Mikolaj Golub
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
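The "take with optional timeout" behaviour discussed in this thread can be sketched in plain userland C with POSIX condition variables. This is only an illustration under stated assumptions: `struct queue`, `queue_take()` and `queue_put()` are hypothetical names, the list is a simple head-insert list rather than hastd's TAILQ-based hio lists, and hastd itself uses its own `cv_timedwait()` wrapper rather than calling `pthread_cond_timedwait()` directly. As in the patch, a timeout of 0 means "wait forever"; otherwise the caller gets NULL back when nothing arrives in time and can use that to send a keepalive.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical work item and queue; not hastd's hio lists. */
struct item {
	struct item *next;
	int value;
};

struct queue {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	struct item *head;
};

/*
 * Take one item from the queue.
 * timeout_sec == 0: block until an item is available.
 * timeout_sec != 0: return NULL if nothing arrives within timeout_sec.
 */
static struct item *
queue_take(struct queue *q, unsigned int timeout_sec)
{
	struct timespec deadline;
	struct item *it;
	int error = 0;

	if (timeout_sec != 0) {
		/* Default condition variables use CLOCK_REALTIME deadlines. */
		clock_gettime(CLOCK_REALTIME, &deadline);
		deadline.tv_sec += timeout_sec;
	}
	pthread_mutex_lock(&q->lock);
	while ((it = q->head) == NULL && error != ETIMEDOUT) {
		if (timeout_sec == 0)
			error = pthread_cond_wait(&q->cond, &q->lock);
		else
			error = pthread_cond_timedwait(&q->cond, &q->lock,
			    &deadline);
	}
	/* Unlike the unconditional remove, only unlink if we got an item. */
	if (it != NULL)
		q->head = it->next;
	pthread_mutex_unlock(&q->lock);
	return (it);
}

/* Insert at the head (ordering is irrelevant for this sketch) and wake a taker. */
static void
queue_put(struct queue *q, struct item *it)
{
	pthread_mutex_lock(&q->lock);
	it->next = q->head;
	q->head = it;
	pthread_cond_signal(&q->cond);
	pthread_mutex_unlock(&q->lock);
}
```

A sender loop would call `queue_take(&q, RETRY_SLEEP)` and treat a NULL result as "the connection was idle for that long", which is the point where the patch calls keepalive_send().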
Re: can't disable hyperthreading on 7.1
On Wed, 24 Dec 2008 15:36:10 +0200 Alexander Melnik wrote:

 AM> Hi

 AM> I have several computers with 2 xeon processors with hyperthreading under FreeBSD 7.1-RC2 and in any case can not turn off hyperthreading:

 AM> [...@vmat ~]$ cat /boot/loader.conf
 AM> machdep.hyperthreading_allowed="0"
 AM> machdep.hlt_logical_cpus="1"
 AM> [...@vmat ~]$ sysctl machdep.hyperthreading_allowed
 AM> machdep.hyperthreading_allowed: 0
 AM> [...@vmat ~]$ sysctl machdep.hlt_logical_cpus
 AM> machdep.hlt_logical_cpus: 1
 AM> [...@vmat ~]$ sysctl hw.ncpu
 AM> hw.ncpu: 4

 AM> If machdep.hyperthreading_allowed = "0", the hw.ncpu must be equal to 2?

 AM> [...@vmat ~]$ top -nd 1
 AM> last pid: 825; load averages: 0.00, 0.00, 0.00 up 0+00:21:19 15:22:24
 AM> 17 processes: 1 running, 16 sleeping
 AM> Mem: 6228K Active, 6984K Inact, 20M Wired, 9520K Buf, 960M Free
 AM> Swap: 2048M Total, 2048M Free

 AM>   PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
 AM>   762 root        1   4    0  8428K  3936K sbwait 2   0:00  0.00% sshd
 AM>   767 old         1   8    0  4396K  2212K wait   2   0:00  0.00% bash
 AM>   765 old         1  44    0  8428K  3952K select 0   0:00  0.00% sshd
 AM>   571 root        1  44    0  3184K  1200K select 1   0:00  0.00% syslogd
 AM>   706 root        1  44    0  5876K  3196K select 0   0:00  0.00% sendmail
 AM>   716 root        1   8    0  3212K  1276K nanslp 2   0:00  0.00% cron
 AM>   759 root        1   5    0  3184K  1088K ttyin  2   0:00  0.00% getty
 AM>   758 root        1   5    0  3184K  1088K ttyin  3   0:00  0.00% getty
 AM>   760 root        1   5    0  3184K  1088K ttyin  0   0:00  0.00% getty
 AM>   700 root        1  44    0  5752K  3276K select 0   0:00  0.00% sshd
 AM>   710 smmsp       1  20    0  5876K  3200K pause  2   0:00  0.00% sendmail
 AM>   297 root        1  96    0  3128K  1208K select 0   0:00  0.00% dhclient
 AM>   737 root        1  96    0  3240K  1152K select 3   0:00  0.00% inetd
 AM>   163 root        1  20    0  1380K   804K pause  0   0:00  0.00% adjkerntz
 AM>   512 root        1  44    0  1888K   564K select 0   0:00  0.00% devd
 AM>   313 _dhcp       1  44    0  3128K  1320K select 0   0:00  0.00% dhclient
 AM>   825 old         1  44    0  3496K  1656K CPU0   0   0:00  0.00% top

 AM> If machdep.hlt_logical_cpus = "1" in the output top in any case should not be seen processors 2 and 3?

You can run

    vmstat -i | grep cpu

to see how many CPUs are actually used.

I also observe on some hosts (6.3) with machdep.hlt_logical_cpus=1 that the C
column of the top output shows numbers of CPUs that are actually halted
according to vmstat -i, and I am curious too what this means.

-- 
Mikolaj Golub
pthread.h: typo in #define pthread_cleanup_push/pthread_cleanup_pop
Hi,

I have problems with compiling our application under 8.0.

It fails due to these definitions in pthread.h that look like a typo or
incorrectly applied patch:

#define	pthread_cleanup_push(cleanup_routine, cleanup_arg)		\
		{								\
			struct _pthread_cleanup_info __cleanup_info__;		\
			__pthread_cleanup_push_imp(cleanup_routine, cleanup_arg,\
				&__cleanup_info__);				\
			{

#define	pthread_cleanup_pop(execute)					\
			}							\
			__pthread_cleanup_pop_imp(execute);			\
		}

This patch fixes the problem for me:

--- pthread.h.orig	2009-11-24 16:44:13.0 +0200
+++ pthread.h	2009-11-24 16:44:45.0 +0200
@@ -172,10 +172,10 @@
 			struct _pthread_cleanup_info __cleanup_info__;		\
 			__pthread_cleanup_push_imp(cleanup_routine, cleanup_arg,\
 				&__cleanup_info__);				\
-			{
+		}
 
 #define	pthread_cleanup_pop(execute)					\
-			}							\
+		{								\
 			__pthread_cleanup_pop_imp(execute);			\
 		}

-- 
Mikolaj Golub
Re: pthread.h: typo in #define pthread_cleanup_push/pthread_cleanup_pop
On Tue, 24 Nov 2009 16:53:35 +0200 Mikolaj Golub wrote:

> Hi,
>
> I have problems with compiling our application under 8.0.
>
> It fails due to these definitions in pthread.h that look like a typo or
> incorrectly applied patch:
>
> #define	pthread_cleanup_push(cleanup_routine, cleanup_arg)		\
> 		{								\
> 			struct _pthread_cleanup_info __cleanup_info__;		\
> 			__pthread_cleanup_push_imp(cleanup_routine, cleanup_arg,\
> 				&__cleanup_info__);				\
> 			{
>
> #define	pthread_cleanup_pop(execute)					\
> 			}							\
> 			__pthread_cleanup_pop_imp(execute);			\
> 		}
>
>
> This patch fixes the problem for me:

I was too hasty when I said that the patch fixed the problem. The application
compiled, but later it crashed in pthread_cleanup_pop:

(gdb) bt
#0  0xbf4f9ee0 in ?? ()
#1  0x287d18c9 in __pthread_cleanup_pop_imp () from /lib/libthr.so.3
#2  0x287d18ed in pthread_cleanup_pop () from /lib/libthr.so.3
#3  0x287d123c in pthread_exit () from /lib/libthr.so.3
#4  0x287c7757 in pthread_getprio () from /lib/libthr.so.3
#5  0x in ?? ()

So I don't know what these macros were actually supposed to be. They were
introduced in r179662:

  Revision 1.43
  Mon Jun 9 01:14:10 2008 UTC (17 months, 2 weeks ago) by davidxu
  Branches: MAIN
  Changes since revision 1.42: +21 -2 lines

  SVN rev 179662 on 2008-06-09 01:14:10Z by davidxu

  Make pthread_cleanup_push() and pthread_cleanup_pop() as a pair of macros,
  use stack space to keep cleanup information, this eliminates overhead of
  calling malloc() and free() in thread library.

  Discussed on: thread@

> --- pthread.h.orig	2009-11-24 16:44:13.0 +0200
> +++ pthread.h	2009-11-24 16:44:45.0 +0200
> @@ -172,10 +172,10 @@
>  			struct _pthread_cleanup_info __cleanup_info__;		\
>  			__pthread_cleanup_push_imp(cleanup_routine, cleanup_arg,\
>  				&__cleanup_info__);				\
> -			{
> +		}
>  
>  #define	pthread_cleanup_pop(execute)					\
> -			}							\
> +		{								\
>  			__pthread_cleanup_pop_imp(execute);			\
>  		}

-- 
Mikolaj Golub
Re: pthread.h: typo in #define pthread_cleanup_push/pthread_cleanup_pop
On Tue, 24 Nov 2009 17:34:22 +0200 Kostik Belousov wrote:

> pthread_cleanup_push/pop are supposed to be used from the common
> lexical scope. Citation from SUSv4:
>
>   These functions may be implemented as macros. The application shall
>   ensure that they appear as statements, and in pairs within the same
>   lexical scope (that is, the pthread_cleanup_push() macro may be
>   thought to expand to a token list whose first token is '{' with
>   pthread_cleanup_pop() expanding to a token list whose last token is the
>   corresponding '}' ).
>
> Your change is wrong.
>
> Basically, the code should do
>	pthread_cleanup_push(some_func, arg);
>	something ...
>	pthread_cleanup_pop(1);
> (1 denotes that some_func should be called).

I see. Thank you. So it really looks like a bug in our application, as the
matching pthread_cleanup_pop(1) is missing. I will tell our developers :-)

-- 
Mikolaj Golub
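The pairing rule quoted from SUSv4 can be illustrated with a small self-contained program fragment. This is a sketch, not code from the application discussed in the thread: `worker()`, `unlock_cleanup()` and `run_worker()` are hypothetical names chosen for the example. The important part is that pthread_cleanup_push() and pthread_cleanup_pop() appear as statements in the same block, so the '{' opened by one macro is closed by the other.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int cleanup_calls;

/* Cleanup handler: runs on pop(1), pthread_exit() or cancellation. */
static void
unlock_cleanup(void *arg)
{
	cleanup_calls++;
	pthread_mutex_unlock(arg);
}

static void *
worker(void *arg)
{
	int *counter = arg;

	pthread_mutex_lock(&lock);
	/* push and pop must be paired inside the same lexical scope. */
	pthread_cleanup_push(unlock_cleanup, &lock);
	(*counter)++;			/* the protected work */
	pthread_cleanup_pop(1);		/* 1: run unlock_cleanup() now */
	return (NULL);
}

/* Run one worker to completion; returns the counter value or -1. */
static int
run_worker(void)
{
	pthread_t tid;
	int counter = 0;

	if (pthread_create(&tid, NULL, worker, &counter) != 0)
		return (-1);
	pthread_join(tid, NULL);
	return (counter);
}
```

Leaving out the pthread_cleanup_pop() call, as apparently happened in the application, leaves an unmatched '{' from the push macro, which is why the compile failed in the first place.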
FreeBSD 7.1: QUOTA: kernel panics in jailed()
Hi,

Today we observed a panic on our FreeBSD 7.1 box built with QUOTA support.
According to the backtrace, ffs_truncate() called chkdq() with NOCRED, but
later jailed() was called and the system crashed dereferencing
cred->cr_prison.

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 07
fault virtual address	= 0x64
fault code		= supervisor read, page not present
instruction pointer	= 0x20:0xc07a1d26
stack pointer	        = 0x28:0xedb2d8b8
frame pointer	        = 0x28:0xedb2d8b8
code segment		= base 0x0, limit 0xf, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 9742 (icoms_agent_cox476)
trap number		= 12
panic: page fault
cpuid = 7
Uptime: 19h54m4s
Physical memory: 3315 MB
Dumping 326 MB: 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7

Reading symbols from /boot/kernel/if_lagg.ko...Reading symbols from /boot/kernel/if_lagg.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/if_lagg.ko
Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/acpi.ko
#0  doadump () at pcpu.h:196
196	pcpu.h: No such file or directory.
	in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:196
#1  0xc07c2b27 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc07c2df9 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0ada1ec in trap_fatal (frame=0xedb2d878, eva=100) at /usr/src/sys/i386/i386/trap.c:939
#4  0xc0ada470 in trap_pfault (frame=0xedb2d878, usermode=0, eva=100) at /usr/src/sys/i386/i386/trap.c:852
#5  0xc0adae2c in trap (frame=0xedb2d878) at /usr/src/sys/i386/i386/trap.c:530
#6  0xc0ac0c9b in calltrap () at /usr/src/sys/i386/i386/exception.s:159
#7  0xc07a1d26 in jailed (cred=0x0) at /usr/src/sys/kern/kern_jail.c:465
#8  0xc07a1da5 in prison_priv_check (cred=0x0, priv=320) at /usr/src/sys/kern/kern_jail.c:581
#9  0xc07b62ce in priv_check_cred (cred=0x0, priv=320, flags=0) at /usr/src/sys/kern/kern_priv.c:86
#10 0xc09e742d in chkdq (ip=0xcb55c980, change=28, cred=0x0, flags=Variable "flags" is not available.
) at /usr/src/sys/ufs/ufs/ufs_quota.c:188
#11 0xc09c24f7 in ffs_truncate (vp=0xcac04cf0, length=0, flags=2048, cred=0xc9871d00, td=0xc95d28c0) at /usr/src/sys/ufs/ffs/ffs_inode.c:276
#12 0xc09ed372 in ufs_setattr (ap=0xedb2db64) at /usr/src/sys/ufs/ufs/ufs_vnops.c:600
#13 0xc0af0582 in VOP_SETATTR_APV (vop=0xc0c2ff80, a=0xedb2db64) at vnode_if.c:583
#14 0xc084c446 in kern_open (td=0xc95d28c0, path=0x4890e68c , pathseg=UIO_USERSPACE, flags=Variable "flags" is not available.
) at vnode_if.h:315
#15 0xc084c5b0 in open (td=0xc95d28c0, uap=0xedb2dcfc) at /usr/src/sys/kern/vfs_syscalls.c:999
#16 0xc0ada7c5 in syscall (frame=0xedb2dd38) at /usr/src/sys/i386/i386/trap.c:1090
#17 0xc0ac0d00 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#18 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) fr 11
#11 0xc09c24f7 in ffs_truncate (vp=0xcac04cf0, length=0, flags=2048, cred=0xc9871d00, td=0xc95d28c0) at /usr/src/sys/ufs/ffs/ffs_inode.c:276
276			(void) chkdq(ip, -datablocks, NOCRED, 0);
(kgdb) list
271			if (ip->i_flag & IN_SPACECOUNTED)
272				fs->fs_pendingblocks -= datablocks;
273			UFS_UNLOCK(ump);
274		} else {
275	#ifdef QUOTA
276			(void) chkdq(ip, -datablocks, NOCRED, 0);
277	#endif
278			softdep_setup_freeblocks(ip, length, needextclean ?
279			    IO_EXT | IO_NORMAL : IO_NORMAL);
280			ASSERT_VOP_LOCKED(vp, "ffs_truncate1");
(kgdb) fr 7
#7  0xc07a1d26 in jailed (cred=0x0) at /usr/src/sys/kern/kern_jail.c:465
465	{
(kgdb) list
460	/*
461	 * Return 1 if the passed credential is in a jail, otherwise 0.
462	 */
463	int
464	jailed(struct ucred *cred)
465	{
466
467		return (cred->cr_prison != NULL);
468	}
469

-- 
Mikolaj Golub
Re: FreeBSD 7.1: QUOTA: kernel panics in jailed()
On Sun, 6 Dec 2009 20:18:13 +0200 Kostik Belousov wrote:

> The kernel panicked because chkdq was supplied NULL credentials and
> _positive_ blocks use count change. Line 276 calls chkdq with
> -datablocks as the change. This could happen if you have problems
> either with hardware (e.g. memory or CPU cache), or your fs
> is damaged.
>
> Another possibility is random corruption of the kernel memory, but
> I recommend to start with fsck and then continue with memory testers
> if fsck have shown no problems.

We have checked the FS and it looks OK. For now we have just rebooted into a
kernel without quota. Checking the hardware is in our plans. Thank you.

-- 
Mikolaj Golub
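The reasoning in the quoted explanation can be made concrete with a tiny userland model. This is an assumption-laden sketch, not the kernel's chkdq(): the names `model_chkdq_path()` and `model_needs_cred()` are invented for illustration, and the model only encodes what the thread states, namely that a release (negative change) is expected to be safe with NOCRED, while a positive change reaches the privilege check that dereferences the credential (frames #7 to #10 of the backtrace, where change=28 and cred=0x0).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a quota change; not kernel code. */
enum chkdq_path {
	PATH_NOOP,	/* change == 0: nothing to do */
	PATH_RELEASE,	/* change < 0: blocks are being freed */
	PATH_ALLOCATE	/* change > 0: blocks are being charged */
};

static enum chkdq_path
model_chkdq_path(long change)
{
	if (change == 0)
		return (PATH_NOOP);
	if (change < 0)
		return (PATH_RELEASE);
	return (PATH_ALLOCATE);
}

/*
 * Only the allocation path is modelled as needing a credential, because
 * that is where the privilege check (and thus jailed()) is reached.
 */
static bool
model_needs_cred(long change)
{
	return (model_chkdq_path(change) == PATH_ALLOCATE);
}
```

Under this model, a caller passing `-datablocks` with a sane (non-negative) datablocks never needs a credential; seeing a positive change arrive from line 276 therefore points at a corrupted value, which matches the advice to check the file system and then the hardware.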
Re: FreeBSD 7.1: QUOTA: kernel panics in jailed()
On Wed, 9 Dec 2009 15:52:23 -0600 Mike Pritchard wrote:

> On Mon, Dec 07, 2009 at 10:23:49AM +0200, Mikolaj Golub wrote:
>> On Sun, 6 Dec 2009 20:18:13 +0200 Kostik Belousov wrote:
>>
>> > The kernel panicked because chkdq was supplied NULL credentials and
>> > _positive_ blocks use count change. Line 276 calls chkdq with
>> > -datablocks as the change. This could happen if you have problems
>> > either with hardware (e.g. memory or CPU cache), or your fs
>> > is damaged.
>> >
>> > Another possibility is random corruption of the kernel memory, but
>> > I recommend to start with fsck and then continue with memory testers
>> > if fsck have shown no problems.
>>
>> We have checked FS -- looks OK. So far we have just rebooted to the kernel
>> without quota. To check the hardware is in our plans. Thank you.
>
> Did you happen to turn quotas off then back on for the file system in
> question?

Do you mean at the moment of the crash? No, our admins were far from the host
then :-).

-- 
Mikolaj Golub
NFS locking issue with FreeBSD7.1 client
call+0x335

73265 100685 ls - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_root+0x3f lookup+0xa1c namei+0x39f kern_stat+0x3d stat+0x2f syscall+0x335

73292 100832 mc - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_root+0x3f lookup+0xa1c namei+0x39f kern_lstat+0x4f lstat+0x2f syscall+0x335

73357 100772 ls - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_root+0x3f lookup+0xa1c namei+0x39f kern_stat+0x3d stat+0x2f syscall+0x335

73796 100746 ls - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_root+0x3f lookup+0xa1c namei+0x39f kern_lstat+0x4f lstat+0x2f syscall+0x335

74074 100800 tcsh - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_root+0x3f lookup+0xa1c namei+0x39f kern_stat+0x3d stat+0x2f syscall+0x335

74125 100543 ls - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_root+0x3f lookup+0xa1c namei+0x39f kern_stat+0x3d stat+0x2f syscall+0x335

74449 100547 df - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_statfs+0x69 __vfs_statfs+0x2f kern_getfsstat+0x2d5 getfsstat+0x2e syscall+0x335 Xint0x80_syscall+0x20

74497 100737 bash - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_root+0x3f lookup+0xa1c namei+0x39f kern_stat+0x3d stat+0x2f syscall+0x335

74650 100837 df - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_statfs+0x69 __vfs_statfs+0x2f kern_getfsstat+0x2d5 getfsstat+0x2e syscall+0x335 Xint0x80_syscall+0x20

76499 100771 perl5.8.9 - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_root+0x3f lookup+0xa1c namei+0x39f kern_stat+0x3d stat+0x2f syscall+0x335

76533 100850 perl5.8.9 - mi_switch+0x146 sleepq_switch+0xcb sleepq_wait+0x36 _sleep+0x2d6 acquire+0x7a _lockmgr+0x45c vop_stdlock+0x40 VOP_LOCK1_APV+0x46 _vn_lock+0x166 vget+0x114 vfs_hash_get+0x143 nfs_nget+0x94 nfs_root+0x3f lookup+0xa1c namei+0x39f kern_stat+0x3d stat+0x2f syscall+0x335

I can send the full output privately if any of the developers is interested
in looking at it.

We removed all NFS shares from this server after the last incident, but we
have other servers where the problem might occur too. So any suggestions on
what we should check or do next time to provide more information would be
helpful.

-- 
Mikolaj Golub
Re: FreeBSD NFS client/Linux NFS server issue
, 15, 31, 52}, nm_sdrtt = {3, 3, 15, 15}, nm_sent = 0, nm_cwnd = 4096, nm_timeouts = 0, nm_deadthresh = 9, nm_rsize = 32768, nm_wsize = 32768, nm_readdirsize = 4096, nm_readahead = 1, nm_wcommitsize = 1177026, nm_acdirmin = 30, nm_acdirmax = 60, nm_acregmin = 3, nm_acregmax = 60, nm_verf = "JК╬W\000\004oМ", nm_bufq = {tqh_first = 0xda82dc70, tqh_last = 0xda8058e0}, nm_bufqlen = 2, nm_bufqwant = 0, nm_bufqiods = 1, nm_maxfilesize = 1099511627775, nm_rpcops = 0xc0c2b5bc, nm_tprintf_initial_delay = 12, nm_tprintf_delay = 30, nm_nfstcpstate = { rpcresid = 0, flags = 1, sock_send_inprog = 0}, nm_hostname = "172.30.10.92\000/var/www/app31", '\0' , nm_clientid = 0, nm_fsid = { val = {0, 0}}, nm_lease_time = 0, nm_last_renewal = 0} buffers on it: (kgdb) p *nmp->nm_bufq.tqh_first $7 = {b_bufobj = 0xc7324960, b_bcount = 31565, b_caller1 = 0x0, b_data = 0xde581000 " valid_lines:", ' ' , "1341\n invalid_lines:", ' ' , "1556\n total_lines:", ' ' , "2897\n\nError summary:\n Inactive pr"..., b_error = 0, b_iocmd = 2 '\002', b_ioflags = 0 '\0', b_iooffset = 196608, b_resid = 0, b_iodone = 0, b_blkno = 384, b_offset = 196608, b_bobufs = {tqe_next = 0x0, tqe_prev = 0xc7324964}, b_left = 0x0, b_right = 0x0, b_vflags = 0, b_freelist = { tqe_next = 0xda805894, tqe_prev = 0xc725d3c0}, b_qindex = 0, b_flags = 536870948, b_xflags = 2 '\002', b_lock = {lk_object = {lo_name = 0xc0b73635 "bufwait", lo_type = 0xc0b73635 "bufwait", lo_flags = 70844416, lo_witness_data = {lod_list = { stqe_next = 0x0}, lod_witness = 0x0}}, lk_interlock = 0xc0c77b50, lk_flags = 262144, lk_sharecount = 0, lk_waitcount = 0, lk_exclusivecount = 1, lk_prio = 80, lk_timo = 0, lk_lockholder = 0xfffe, lk_newlock = 0x0}, b_bufsize = 31744, b_runningbufspace = 0, b_kvabase = 0xde581000 " valid_lines:", ' ' , "1341\n invalid_lines:", ' ' , "1556\n total_lines:", ' ' , "2897\n\nError summary:\n Inactive pr"..., b_kvasize = 32768, b_lblkno = 6, b_vp = 0xc73248a0, b_dirtyoff = 31512, b_dirtyend = 31565, b_rcred = 0x0, 
b_wcred = 0xcebec400, b_saveaddr = 0xde581000, b_pager = { pg_reqpage = 0}, b_cluster = {cluster_head = {tqh_first = 0xda917ec8, tqh_last = 0xda888e94}, cluster_entry = {tqe_next = 0xda917ec8, tqe_prev = 0xda888e94}}, b_pages = {0xc3726e90, 0xc448dca8, 0xc2a55b98, 0xc3bf1a28, 0xc3467ff0, 0xc3299600, 0xc28db130, 0xc2301398, 0x0 }, b_npages = 8, b_dep = {lh_first = 0x0}, b_fsprivate1 = 0x0, b_fsprivate2 = 0x0, b_fsprivate3 = 0x0, b_pin_count = 0}

The b_data contents are entries from our log file.

Note that b_qindex is 0, but bufqueues[0] is empty:

(kgdb) p bufqueues[0]
$8 = {tqh_first = 0x0, tqh_last = 0xc0c83e20}

Also, doesn't it look strange that lk_lockholder of b_lock points to an
invalid location (0xfffe)?

-- 
Mikolaj Golub
Re: FreeBSD NFS client/Linux NFS server issue
On Tue, 19 Jan 2010 10:02:57 +0200 Mikolaj Golub wrote:

> I have found in the Internet that other people have been observed the similar
> problem with FreeBSD6.2 client:
>
> http://forums.freebsd.org/showthread.php?t=1697

Reading that thread through carefully, it looks like the poster did not
experience our problem (stuck processes). He just described the behaviour of
the FreeBSD client when appending to a file.

-- 
Mikolaj Golub
Re: FreeBSD NFS client/Linux NFS server issue
On Tue, 19 Jan 2010 10:02:57 +0200 Mikolaj Golub wrote:

> So, on some of our freebsd7.1 nfs clients (and it looks like we have had
> similar case with 6.3), which have several nfs mounts to the same CentOS 5.3
> NFS server (mount options: rw,-3,-T,-s,-i,-r=32768,-w=32768,-o=noinet6), at
> some moment the access to one of the NFS mount gets stuck, while the access to
> the other mounts works ok.
>
> In all cases we have been observed so far the first gotten stuck process was
> php script (or two) that was (were) writing to logs file (appending). In
> tcpdump we see that every write to the file causes the sequence of the
> following rpc: ACCESS - READ - WRITE - COMMIT. And at some moment this stops
> after READ rpc call and successful reply.
>
> After this in tcpdump successful readdir/access/lookup/fstat calls are
> observed from our other utilities, which just check the presence of some files
> and they work ok (df also works). The php process at this state is in bo_wwait
> invalidating buffer cache [1].
>
> If at this time we try accessing the share with mc then it hangs acquiring the
> vn_lock held by php process [2] and after this any operations with this NFS
> share hang (df hangs too).
>
> If instead some other process is started that writes to some other file on
> this share (append) then the first process "unfreezes" too (starting from
> WRITE rpc, so there is no any retransmits).

So it looks to me like the problem is that the affected nfsmount eventually
ends up in this state:

(kgdb) p *nmp
$1 = {nm_mtx = {lock_object = {lo_name = 0xc0b808ee "NFSmount lock", lo_type = 0xc0b808ee "NFSmount lock", lo_flags = 16973824, lo_witness_data = {lod_list = { stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, nm_flag = 35399, nm_state = 1310720, nm_mountp = 0xc6b472cc, nm_numgrps = 16, nm_fh = "\001\000\000\000\000\223\000\000\...@\003\n", '\0' , nm_fhsize = 12, nm_rpcclnt = {rc_flag = 0, rc_wsize = 0, rc_rsize = 0, rc_name = 0x0, rc_so = 0x0, rc_sotype = 0, rc_soproto = 0, rc_soflags = 0, rc_timeo = 0, rc_retry = 0, rc_srtt = {0, 0, 0, 0}, rc_sdrtt = {0, 0, 0, 0}, rc_sent = 0, rc_cwnd = 0, rc_timeouts = 0, rc_deadthresh = 0, rc_authtype = 0, rc_auth = 0x0, rc_prog = 0x0, rc_proctlen = 0, rc_proct = 0x0}, nm_so = 0xc6e81d00, nm_sotype = 1, nm_soproto = 0, nm_soflags = 44, nm_nam = 0xc6948640, nm_timeo = 6000, nm_retry = 2, nm_srtt = {15, 15, 31, 52}, nm_sdrtt = {3, 3, 15, 15}, nm_sent = 0, nm_cwnd = 4096, nm_timeouts = 0, nm_deadthresh = 9, nm_rsize = 32768, nm_wsize = 32768, nm_readdirsize = 4096, nm_readahead = 1, nm_wcommitsize = 1177026, nm_acdirmin = 30, nm_acdirmax = 60, nm_acregmin = 3, nm_acregmax = 60, nm_verf = "JК╬W\000\004oМ", nm_bufq = {tqh_first = 0xda82dc70, tqh_last = 0xda8058e0}, nm_bufqlen = 2, nm_bufqwant = 0, nm_bufqiods = 1, nm_maxfilesize = 1099511627775, nm_rpcops = 0xc0c2b5bc, nm_tprintf_initial_delay = 12, nm_tprintf_delay = 30, nm_nfstcpstate = { rpcresid = 0, flags = 1, sock_send_inprog = 0}, nm_hostname = "172.30.10.92\000/var/www/app31", '\0' , nm_clientid = 0, nm_fsid = { val = {0, 0}}, nm_lease_time = 0, nm_last_renewal = 0}

We have a non-empty nm_bufq and nm_bufqiods = 1, but there is actually no
nfsiod thread running for this mount. That is wrong: nm_bufq will not be
emptied until some other process starts writing to the nfsmount and starts an
nfsiod thread for this mount.

Reviewing the code to see how this could happen, I see the following path.
Could someone confirm or refute it?

In nfs_bio.c:nfs_asyncio() we have:

   1363		mtx_lock(&nfs_iod_mtx);
   ...
   1374		/*
   1375		 * Find a free iod to process this request.
   1376		 */
   1377		for (iod = 0; iod < nfs_numasync; iod++)
   1378			if (nfs_iodwant[iod]) {
   1379				gotiod = TRUE;
   1380				break;
   1381			}
   1382
   1383		/*
   1384		 * Try to create one if none are free.
   1385		 */
   1386		if (!gotiod) {
   1387			iod = nfs_nfsiodnew();
   1388			if (iod != -1)
   1389				gotiod = TRUE;
   1390		}

Let's consider the situation when a new nfsiod is created.

nfs_nfsiod.c:nfs_nfsiodnew() unlocks nfs_iod_mtx before creating the
nfssvc_iod thread:

    179		mtx_unlock(&nfs_iod_mtx);
    180		error = kthread_create(nfssvc_iod, nfs_asyncdaemon + i, NULL, RFHIGHPID,
    181		    0, "nfsiod %d", newiod);
    182		mtx_lock(&nfs_iod_mtx);

And nfs_nfsiod.c:nfssvc_iod() does the following:

    226		mtx_lock(&nfs_iod_mtx);
    ...
    238		nfs_iodwant[myiod] = curthread->td_proc;
    239		nfs
Re: FreeBSD NFS client/Linux NFS server issue
On Fri, 22 Jan 2010 14:37:48 -0500 (EST) Rick Macklem wrote:

>> --- nfs_bio.c.orig	2010-01-22 15:38:02.0 +
>> +++ nfs_bio.c	2010-01-22 15:39:58.0 +
>> @@ -1385,7 +1385,7 @@ again:
>>  	 */
>>  	if (!gotiod) {
>>  		iod = nfs_nfsiodnew();
>> -		if (iod != -1)
>> +		if ((iod != -1) && (nfs_iodwant[iod] == NULL))
>>  			gotiod = TRUE;
>>  	}
>>
>
> Unfortunately, I don't think the above fixes the problem.
> If another thread that called nfs_asyncio() has "stolen" this "iod",
> it will have set nfs_iodwant[iod] == NULL (set non-NULL at #238)
> and it will remain NULL until the other thread is done with it.

I see. I missed this. Thanks.

>
> There should probably be some sort of 3 way handshake between
> the code in nfs_asyncio() after calling nfs_nfsnewiod() and the
> code near the beginning of nfssvc_iod(), but I think the following
> somewhat cheesy fix might do the trick:
>
>	if (!gotiod) {
>		iod = nfs_nfsiodnew();
>		if (iod != -1) {
>			if (nfs_iodwant[iod] == NULL) {
>				/*
>				 * Either another thread has acquired this
>				 * iod or I acquired the nfs_iod_mtx mutex
>				 * before the new iod thread did in
>				 * nfssvc_iod(). To be safe, go back and
>				 * try again after allowing another thread
>				 * to acquire the nfs_iod_mtx mutex.
>				 */
>				mtx_unlock(&nfs_iod_mtx);
>				/*
>				 * So long as mtx_lock() implements some
>				 * sort of fairness, nfssvc_iod() should
>				 * get nfs_iod_mtx here and set
>				 * nfs_iodwant[iod] != NULL for the case
>				 * where the iod has not been "stolen" by
>				 * another thread for a different mount
>				 * point.
>				 */
>				mtx_lock(&nfs_iod_mtx);
>				goto again;
>			}
>			gotiod = TRUE;
>		}
>	}
>
> Does anyone else have a better solution?
> (Mikolaj, could you by any chance test this? You can test yours, but I
> think it breaks.)

Unfortunately we have observed this only on our production servers.

A week ago we made some configuration changes as a workaround: we
reconfigured cron not to run the scripts simultaneously, and added cron
scripts that periodically write a line to a file on the NFS share (to
"unlock" it if it is locked). We have not observed problems since then, and
we would not like to experiment in production. If I manage to produce a good
test case in a test environment I will be able to test the patch, but I am
not sure I can.

-- 
Mikolaj Golub
Re: top Segmentation faulting on 8.0p2 amd64
On Wed, 20 Jan 2010 08:06:23 +0100 Harald Schmalzbauer wrote:

> Dear all,
>
> I have no idea why top crashes with segmentation fault on my amd64
> machine running FreeBSD 8.0-RELEASE-p2.
> If someone wants to have a look at the core dump:
> http://www.schmalzbauer.de/downloads/top.core

A core file is useless without the binary and libraries. So it is better to
run gdb on your host, produce a backtrace and post it here:

    gdb /usr/bin/top top.core
    bt

And of course a backtrace from a top built with -g would be much better:

    cd /usr/src/usr.bin/top
    CFLAGS=-g make

-- 
Mikolaj Golub
Re: top Segmentation faulting on 8.0p2 amd64 (nss_ldapd problem?)
On Sat, 23 Jan 2010 02:02:04 +0100 Harald Schmalzbauer wrote:

> gdb /usr/bin/top top.core
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
> Core was generated by `top'.
> Program terminated with signal 11, Segmentation fault.
> Reading symbols from /lib/libncurses.so.8...done.
> Loaded symbols for /lib/libncurses.so.8
> Reading symbols from /lib/libm.so.5...done.
> Loaded symbols for /lib/libm.so.5
> Reading symbols from /lib/libkvm.so.5...done.
> Loaded symbols for /lib/libkvm.so.5
> Reading symbols from /lib/libc.so.7...done.
> Loaded symbols for /lib/libc.so.7
> Reading symbols from /usr/local/lib/nss_ldap.so.1...done.
> Loaded symbols for /usr/local/lib/nss_ldap.so.1
> Reading symbols from /libexec/ld-elf.so.1...done.
> Loaded symbols for /libexec/ld-elf.so.1
> bt:
> #0 0x000800d08403 in __nss_compat_gethostbyname () from
> /usr/local/lib/nss_ldap.so.1
> #0 0x000800d08403 in __nss_compat_gethostbyname () from
> /usr/local/lib/nss_ldap.so.1
> #1 0x000800d0606f in _nss_ldap_getpwent_r () from
> /usr/local/lib/nss_ldap.so.1

It is worth rebuilding and installing nss_ldap.so with debugging symbols.

> #2 0x0008009ffc54 in __nss_compat_getpwent_r () from /lib/libc.so.7
> #3 0x000800a84a3d in nsdispatch () from /lib/libc.so.7
> #4 0x000800a50976 in getpwent_r () from /lib/libc.so.7
> #5 0x000800a50596 in sysctlbyname () from /lib/libc.so.7

And maybe libc.so too :-)

> #6 0x00406c6d in machine_init (statics=0x7fffea30,
> do_unames=1 '\001')
> at /usr/src/usr.bin/top/machine.c:257
> #7 0x00407a10 in main (argc=1, argv=0x7fffeb08)
> at /usr/src/usr.bin/top/../../contrib/top/top.c:458
>
> I'm using nss_ldapd-0.7.2 and there's no way to live without ldap...

-- 
Mikolaj Golub
Re: FreeBSD NFS client/Linux NFS server issue
On Fri, 22 Jan 2010 17:13:09 -0500 (EST) Rick Macklem wrote:

> On Fri, 22 Jan 2010, Rick Macklem wrote:
>
>>
>> There should probably be some sort of 3 way handshake between
>> the code in nfs_asyncio() after calling nfs_nfsnewiod() and the
>> code near the beginning of nfssvc_iod(), but I think the following
>> somewhat cheesy fix might do the trick:
>>
> [stuff deleted]
> I know it's a little weird to reply to my own posting, but I think
> this might be a reasonable patch (I have only tested it for a few
> minutes at this point).
>
> I basically redefined nfs_iodwant[] as a tri-state variable (although
> it was a struct proc *, it was only tested NULL/non-NULL).
> 0 - was NULL
> 1 - was non-NULL
> -1 - just created by nfs_asyncio() and will be used by it
>
> I'll keep testing it, but hopefully someone else can test and/or
> review it... rick

I applied your patch to FreeBSD 8.0 (the box I get on weekends :-), mounted 10 shares, set vfs.nfs.iodmaxidle=10 (so that nfsiods are created more frequently) and ran tests for 4 hours, just to check that the patch does not break anything. No issues were detected.

It would be very nice to have this patch committed.

--
Mikolaj Golub
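For readers following the tri-state description in the quoted message, here is a minimal single-threaded sketch of how such a variable might behave. The patch itself is not shown in this thread, so everything below is an assumption made for illustration: the names (iodwant, claim_waiting_iod, reserve_new_iod, iod_go_idle), the table size and the exact transitions are hypothetical, and the real code would do this under the NFS iod mutex.

```c
#include <assert.h>

#define NIOD 4	/* hypothetical table size */

/* The three values listed in the quoted message. */
#define IOD_NOTWAITING	0	/* "was NULL": iod is not waiting for work */
#define IOD_WAITING	1	/* "was non-NULL": iod is idle, waiting for work */
#define IOD_RESERVED	(-1)	/* just created, will be used by its creator */

/* Stand-in for nfs_iodwant[]; zero-initialized, so nobody is waiting. */
static int iodwant[NIOD];

/* Requester side: claim an idle iod, or return -1 if none is waiting. */
static int
claim_waiting_iod(void)
{
	for (int i = 0; i < NIOD; i++) {
		if (iodwant[i] == IOD_WAITING) {
			iodwant[i] = IOD_NOTWAITING;
			return (i);
		}
	}
	return (-1);
}

/* Requester side: mark a freshly created iod as reserved for the creator. */
static void
reserve_new_iod(int i)
{
	assert(i >= 0 && i < NIOD);
	iodwant[i] = IOD_RESERVED;
}

/*
 * iod side (assumed behaviour): called when the iod thread is about to
 * wait for work.  A reserved iod does not advertise itself as waiting,
 * so nobody but its creator can grab it.  Returns 1 if the iod is now
 * waiting, 0 if it was reserved and should take its creator's request.
 */
static int
iod_go_idle(int i)
{
	assert(i >= 0 && i < NIOD);
	if (iodwant[i] == IOD_RESERVED) {
		iodwant[i] = IOD_NOTWAITING;
		return (0);
	}
	iodwant[i] = IOD_WAITING;
	return (1);
}
```

The point of the third state is that a reserved slot is skipped by claim_waiting_iod(), which only matches IOD_WAITING, so a second requester cannot take an iod that was created for somebody else.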
Re: bsnmpd returns incorrect hrProcessorLoad values
On Fri, 29 Jan 2010 12:37:52 +0100 Gustau Pérez wrote:

> Hi,
>
> I'm using cacti to monitor some servers running FBSD. I was using 7.2
> with SCHED_4BSD. With this configuration : bsnmpd+bsnmp-ucd was
> returning right values for the cores' load.
>
> I recently updated the servers (via csup) to RELENG_8 and bsnmpd is
> returning negative values for the cores' load. If I try something like
> in a 4-core system :
>
> snmpwalk -v 2c -c community server .1.3.6.1.2.1.25.3.3.1
>
> what I get is :
>
> .1.3.6.1.2.1.25.3.3.1.1.6 = OID: .0.0
> .1.3.6.1.2.1.25.3.3.1.1.10 = OID: .0.0
> .1.3.6.1.2.1.25.3.3.1.1.14 = OID: .0.0
> .1.3.6.1.2.1.25.3.3.1.1.18 = OID: .0.0
> .1.3.6.1.2.1.25.3.3.1.2.6 = INTEGER: -182
> .1.3.6.1.2.1.25.3.3.1.2.10 = INTEGER: -182
> .1.3.6.1.2.1.25.3.3.1.2.14 = INTEGER: -182
> .1.3.6.1.2.1.25.3.3.1.2.18 = INTEGER: -182
>
> I tried and old bsnmpd-ucd (0.2.1, works fine in a 7,2 system) with a
> 8.0 system. Same wrong results. And it seems bsnmpd in /usr/src/contrib
> has not changed between 7.2 and 8.0.
>
> Any ideas ? I'm not an expert, but with tcpdump I see different
> results. Against an old 7.2 system, the field related to each core load
> gives the right value. Instead, against and 8.0 system, those field show
> (in hex) values like fd 4b. What I don't know is how bsdnmp-ucb retrives
> those values and how it construct the udp response packet.

bsnmp-ucd has nothing to do with HOST-RESOURCES-MIB. These MIBs are provided by the snmp_hostres(3) module (/usr/lib/snmp_hostres.so), so the problem is somewhere there (I suppose it is out of sync with some recent changes in the kernel or libkvm).

--
Mikolaj Golub
virtualbox status on 8.0-STABLE i386
Hi, Recently I have updated my 8.0-STABLE i386 system and have learnt that virtualbox begins to crash my box with the error panic: vm_fault: fault on nofault entry, addr: c1608000 (kgdb) bt #0 doadump () at pcpu.h:246 #1 0xc04ec379 in db_fncall (dummy1=-1064468854, dummy2=0, dummy3=-1, dummy4=0xe865d5bc "пуeХ") at /usr/src/sys/ddb/db_command.c:548 #2 0xc04ec7af in db_command (last_cmdp=0xc0e04c9c, cmd_table=0x0, dopager=0) at /usr/src/sys/ddb/db_command.c:445 #3 0xc04ec864 in db_command_script (command=0xc0e05bc4 "call doadump") at /usr/src/sys/ddb/db_command.c:516 #4 0xc04f09a0 in db_script_exec (scriptname=0xe865d6c8 "kdb.enter.panic", warnifnotfound=Variable "warnifnotfound" is not available. ) at /usr/src/sys/ddb/db_script.c:302 #5 0xc04f0a87 in db_script_kdbenter (eventname=0xc0cc248d "panic") at /usr/src/sys/ddb/db_script.c:324 #6 0xc04ee768 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:228 #7 0xc08d7d06 in kdb_trap (type=3, code=0, tf=0xe865d804) at /usr/src/sys/kern/subr_kdb.c:535 #8 0xc0beb39b in trap (frame=0xe865d804) at /usr/src/sys/i386/i386/trap.c:690 #9 0xc0bccd0b in calltrap () at /usr/src/sys/i386/i386/exception.s:165 #10 0xc08d7e8a in kdb_enter (why=0xc0cc248d "panic", msg=0xc0cc248d "panic") at cpufunc.h:71 #11 0xc08a88b6 in panic (fmt=0xc0cecbc4 "vm_fault: fault on nofault entry, addr: %lx") at /usr/src/sys/kern/kern_shutdown.c:562 #12 0xc0b0c3d7 in vm_fault (map=0xc199, vaddr=3244326912, fault_type=Variable "fault_type" is not available. 
) at /usr/src/sys/vm/vm_fault.c:283 #13 0xc0bea7d6 in trap_pfault (frame=0xe865dac0, usermode=0, eva=3244330720) at /usr/src/sys/i386/i386/trap.c:840 #14 0xc0beb225 in trap (frame=0xe865dac0) at /usr/src/sys/i386/i386/trap.c:533 #15 0xc0bccd0b in calltrap () at /usr/src/sys/i386/i386/exception.s:165 #16 0xc12beed0 in rtR0MemObjNativeGetPagePhysAddr (pMem=0xc5ed3110, iPage=0) at pmap.h:300 #17 0xc12ac354 in SUPR0LockMem (pSession=0xc5c61c10, pvR3=695959552, cPages=1, paPages=0xc5f83668) at SUPDrv.c:2307 #18 0xc12ac8cb in supdrvIOCtl (uIOCtl=536892942, pDevExt=0xc12c9ac0, pSession=0xc5c61c10, pReqHdr=0xc5f83650) at SUPDrv.c:1245 #19 0xc12b0c3a in VBoxDrvFreeBSDIOCtl (pDev=0xc665d800, ulCmd=536892942, pvData=0xe865dd00 "ю8 )\003╬кюq\002", fFile=3, pTd=0xc69556f0) at /usr/ports/emulators/virtualbox-ose-kmod/work/VirtualBox-3.1.2_OSE/out/freebsd.x86/debug/bin/src/vboxdrv/freebsd/SUPDrv-freebsd.c:505 #20 0xc0829658 in devfs_ioctl_f (fp=0xc670fa80, com=536892942, data=0xe865dd00, cred=0xc6bbeb00, td=0xc69556f0) at /usr/src/sys/fs/devfs/devfs_vnops.c:659 #21 0xc08eec8d in kern_ioctl (td=0xc69556f0, fd=7, com=536892942, data=0xe865dd00 "ю8 )\003╬кюq\002") at file.h:262 #22 0xc08eee14 in ioctl (td=0xc69556f0, uap=0xe865dcf8) at /usr/src/sys/kern/sys_generic.c:678 #23 0xc0beaad0 in syscall (frame=0xe865dd38) at /usr/src/sys/i386/i386/trap.c: #24 0xc0bccda0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261 #25 0x0033 in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) fr 16 #16 0xc12beed0 in rtR0MemObjNativeGetPagePhysAddr (pMem=0xc5ed3110, iPage=0) at pmap.h:300 300 pa = (pa & PG_FRAME) | (va & PAGE_MASK); (kgdb) list 295 * access the PTE because it would use the new PDE. It is, 296 * however, safe to use the old PDE because the page table 297 * page is preserved by the promotion. 
298 */
299 pa = KPTmap[i386_btop(va)];
300 pa = (pa & PG_FRAME) | (va & PAGE_MASK);
301 }
302 return (pa);
303 }
304

There were some changes in this part recently (r203182):

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/include/pmap.h.diff?r1=1.140.2.2;r2=1.140.2.3;only_with_tag=RELENG_8

So I replaced KPTmap[i386_btop(va)] with *vtopte(va) and have a working virtualbox again, but I suppose this is rather a problem with virtualbox and not with the kernel code.

In February Alexander Eichner posted a patch to freebsd-emulation@ (thread with the subject "Patch to fix VirtualBox with recent kernel versions"):

http://lists.freebsd.org/pipermail/freebsd-emulation/2010-February/007434.html

But it does not fix my panics. The patch adds additional handling in rtR0MemObjNativeGetPagePhysAddr() for the case pMem.enmType == RTR0MEMOBJTYPE_MAPPING, while I am observing the panics for pMem.enmType == RTR0MEMOBJTYPE_LOCK:

(kgdb) fr 17
#17 0xc12ac354 in SUPR0LockMem (pSession=0xc5c61c10, pvR3=695959552, cPages=1, paPages=0xc5f83668) at SUPDrv.c:2307
2307 paPages[iPage] = RTR0MemObjGetPagePhysAddr(Mem.MemObj, iPage);
(kgdb) p Mem.MemObj.enmType
$1 = RTR0MEMOBJTYPE_LOCK

So, it looks like some additional handling should b
Re: net.inet.tcp.timer_race: does anyone have a non-zero value?
On Sun, 7 Mar 2010 11:59:35 +0000 (GMT) Robert Watson wrote:

> Please check the results of the following command:
>
> % sysctl net.inet.tcp.timer_race
> net.inet.tcp.timer_race: 0

Are results for FreeBSD 7 of interest to you? Currently we have mostly FreeBSD 7.1 hosts in production, and I observe nonzero values on 8 hosts (about 15%). I can send you more details privately if you are interested.

--
Mikolaj Golub
Re: virtualbox status on 8.0-STABLE i386
On Sun, 07 Mar 2010 15:28:48 +0100 Alexander Eichner wrote: > Hi, > > can you try the attached patch please? > This should fix the panic you encountered. Please undo your kernel > changes befoer testing. Unfortunately, the same panic: (kgdb) bt #0 doadump () at pcpu.h:246 #1 0xc04ec379 in db_fncall (dummy1=-1064468854, dummy2=0, dummy3=-1, dummy4=0xe866b5b4 "х╣fХ") at /usr/src/sys/ddb/db_command.c:548 #2 0xc04ec7af in db_command (last_cmdp=0xc0e04c9c, cmd_table=0x0, dopager=0) at /usr/src/sys/ddb/db_command.c:445 #3 0xc04ec864 in db_command_script (command=0xc0e05bc4 "call doadump") at /usr/src/sys/ddb/db_command.c:516 #4 0xc04f09a0 in db_script_exec (scriptname=0xe866b6c0 "kdb.enter.panic", warnifnotfound=Variable "warnifnotfound" is not available. ) at /usr/src/sys/ddb/db_script.c:302 #5 0xc04f0a87 in db_script_kdbenter (eventname=0xc0cc246d "panic") at /usr/src/sys/ddb/db_script.c:324 #6 0xc04ee768 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:228 #7 0xc08d7d06 in kdb_trap (type=3, code=0, tf=0xe866b7fc) at /usr/src/sys/kern/subr_kdb.c:535 #8 0xc0beb38b in trap (frame=0xe866b7fc) at /usr/src/sys/i386/i386/trap.c:690 #9 0xc0bcccfb in calltrap () at /usr/src/sys/i386/i386/exception.s:165 #10 0xc08d7e8a in kdb_enter (why=0xc0cc246d "panic", msg=0xc0cc246d "panic") at cpufunc.h:71 #11 0xc08a88b6 in panic (fmt=0xc0cecba4 "vm_fault: fault on nofault entry, addr: %lx") at /usr/src/sys/kern/kern_shutdown.c:562 #12 0xc0b0c3c7 in vm_fault (map=0xc199, vaddr=3244318720, fault_type=Variable "fault_type" is not available. 
) at /usr/src/sys/vm/vm_fault.c:283 #13 0xc0bea7c6 in trap_pfault (frame=0xe866bab8, usermode=0, eva=3244322776) at /usr/src/sys/i386/i386/trap.c:840 #14 0xc0beb215 in trap (frame=0xe866bab8) at /usr/src/sys/i386/i386/trap.c:533 #15 0xc0bcccfb in calltrap () at /usr/src/sys/i386/i386/exception.s:165 #16 0xc12beef3 in rtR0MemObjNativeGetPagePhysAddr () from /boot/modules/vboxdrv.ko #17 0xc12ac374 in SUPR0LockMem () from /boot/modules/vboxdrv.ko #18 0xc12ac8eb in supdrvIOCtl () from /boot/modules/vboxdrv.ko #19 0xc12b0c5a in VBoxDrvFreeBSDIOCtl () from /boot/modules/vboxdrv.ko #20 0xc0829658 in devfs_ioctl_f (fp=0xc5f1c8c0, com=3321378576, data=0xe866bd00, cred=0xc6972a00, td=0xc728e250) at /usr/src/sys/fs/devfs/devfs_vnops.c:659 #21 0xc08eec8d in kern_ioctl (td=0xc728e250, fd=7, com=536892942, data=0xe866bd00 "@г\023)Ь\023Эю8╫fХЬ\023Эю,╫fХ\005\\╫ю\001") at file.h:262 #22 0xc08eee14 in ioctl (td=0xc728e250, uap=0xe866bcf8) at /usr/src/sys/kern/sys_generic.c:678 #23 0xc0beaac0 in syscall (frame=0xe866bd38) at /usr/src/sys/i386/i386/trap.c: #24 0xc0bccd90 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261 #25 0x0033 in ?? () Previous frame inner to this frame (corrupt stack?) (this time I built the modules without debugging symbols). 
Just to be sure that I did everything properly, here are the steps I took:

1) restored the original pmap.h (with KPTmap), rebuilt the kernel and rebooted

2) rebuilt the virtualbox drivers and virtualbox with the patch (not sure the latter was needed, but just in case...):

  cd emulators/virtualbox-ose-kmod && make patch
  applied this patch and your previous patch ("Patch to fix VirtualBox with recent kernel versions")
  built and reinstalled

  the same for emulators/virtualbox-ose

3) rebooted and started the vm guest

virtualbox-ose-3.1.2_1 A general-purpose full virtualizer for x86 hardware
virtualbox-ose-kmod-3.1.2_1 VirtualBox kernel module for FreeBSD

--
Mikolaj Golub
Re: Fatal trap 12: page fault while in kernel mode/current process: 12 (swi2: cambio)
On Sun, 21 Mar 2010 00:39:01 -0400 jhell wrote:

> DDB as I have heard can be configured AFAIR to textdump but I have no
> knowledge of that.

ddb_enable="YES" in /etc/rc.conf would be enough. But I also remove "textdump set" from the kdb.enter.panic script (/etc/ddb.conf), as I prefer normal dumps (with the output of ddb scripts in the capture buffer) to textdumps. You can't debug a textdump, and crashinfo will fail on it too. And all the info provided in a textdump is retrieved from the vmcore capture buffer by the crashinfo utility automatically.

--
Mikolaj Golub
Re: sysctl(?) problem on boot in 8-STABLE
On Mon, 5 Apr 2010 10:56:22 -0400 Jeff Blank wrote:

> Hi,
>
> I upgraded an 8-STABLE box to r206119 and am now unable to boot
> multi-user. I found that it hangs at line 58/59 of
> /etc/rc.d/initrandom:
>
> ( ps -fauxww; sysctl -a; date; df -ib; dmesg; ps -fauxww ) \
> | dd of=/dev/random bs=8k 2>/dev/null
>
> when I run each of these commands by hand, I get only as far as
> 'sysctl -a', which seems to exit normally but leaves my keyboard
> unresponsive (actually acting like I'm leaning on the key).
> more digging reveals 'sysctl dev.uart' to be what triggers it.

kern/143040 looks similar.

--
Mikolaj Golub
Re: em driver regression
Hi,

On Thu, 8 Apr 2010 14:52:07 -0500 Brandon Gooch wrote:

> On Thu, Apr 8, 2010 at 2:17 PM, Jack Vogel wrote:
>> Try the code I just checked in, it puts in the CRC stripping, but also
>> tweaks the
>> TX code, this may resolve the watchdogs. Let me know.
>>
>> Cheers,
>>
>> Jack
>>
>
> Yes, this is indeed the fix for both the dhclient and VirtualBox issue
> (at least with my setup). There appear to be no ill effects either.

Today I upgraded the kernel in my VirtualBox (3.1.51.r27187) to the latest current and got the "em0: Watchdog timeout -- resetting" issue. My previous kernel was from Mar 12.

Tracking down the revision where the problem appeared, I see that the issue is not observed with r203834 and starts to be observed after r205869.

Interestingly, if I enter ddb and then exit (sometimes I needed to do this twice), the errors stop and the network starts working.

--
Mikolaj Golub
Re: em driver regression
On Sun, 11 Apr 2010 23:40:03 +0300 Mikolaj Golub wrote:

MG> Hi,

MG> Today I have upgraded the kernel in my VirtualBox (3.1.51.r27187) to the
MG> latest current and have "em0: Watchdog timeout -- resetting" issue. My
MG> previous kernel was for Mar 12.

MG> Tracking the revision where the problem appeared I see that the issue is not
MG> observed for r203834 and starts to observe after r205869.

MG> Interestingly, if I enter ddb and then exit (sometimes I needed to do this
MG> twice) the errors stop and network starts working.

Adding some prints I observed the following:

Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 813, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: setting watchdog_check to TRUE in lem_mq_start_locked 1 (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 823, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: setting watchdog_check to TRUE in lem_mq_start_locked 1 (ticks 828, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 923, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 3 (ticks: 923, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 3 (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: Watchdog timeout -- resetting (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 1024, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 1028, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 1128, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 1 (ticks: 1128, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: Watchdog timeout -- resetting (ticks: 1128, watchdog_time: 0)
...

So although adapter->watchdog_check was set to TRUE, adapter->watchdog_time was never set.

I see that before r205869 watchdog_time was set in em_xmit, but lem_xmit does not do this. After adding this line back to lem_xmit (see the first patch below) the problem has gone away on my box.

Also, seeing that in the current em_mq_start_locked() both watchdog_check and watchdog_time are set, I tried another patch that sets watchdog_time in lem_mq_start_locked() too (see the second patch below). This has also fixed the issue for me, but I don't know if this is a correct fix and if this is the only place where watchdog_time should be set (there are other places in the function and in the code where watchdog_check is set to TRUE but watchdog_time is not set).

--
Mikolaj Golub

Index: sys/dev/e1000/if_lem.c
===
--- sys/dev/e1000/if_lem.c (revision 206595)
+++ sys/dev/e1000/if_lem.c (working copy)
@@ -1880,6 +1880,7 @@ lem_xmit(struct adapter *adapter, struct mbuf **m_
     */
    tx_buffer = &adapter->tx_buffer_area[first];
    tx_buffer->next_eop = last;
+   adapter->watchdog_time = ticks;

    /*
     * Advance the Transmit Descriptor Tail (TDT), this tells the E1000

Index: sys/dev/e1000/if_lem.c
===
--- sys/dev/e1000/if_lem.c (revision 206595)
+++ sys/dev/e1000/if_lem.c (working copy)
@@ -873,6 +873,7 @@ lem_mq_start_locked(struct ifnet *ifp, struct mbuf
         */
        ETHER_BPF_MTAP(ifp, m);
        adapter->watchdog_check = TRUE;
+       adapter->watchdog_time = ticks;
    }
 } else if ((error = drbr_enqueue(ifp, adapter->br, m)) != 0)
    return (error);
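The effect described above (watchdog_check set to TRUE while watchdog_time stays 0) can be illustrated with a small standalone sketch. It is not the driver code: the struct, the function names and the 1000-tick threshold are assumptions chosen for illustration, and the check is assumed to have the form "armed and (ticks - watchdog_time) greater than the timeout". The tick values are taken from the log in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical timeout in ticks; the driver's real constant may differ. */
#define WATCHDOG_TICKS 1000

/* Only the two fields discussed in the message; not the real struct adapter. */
struct adapter_sketch {
	bool watchdog_check;	/* watchdog is armed */
	int  watchdog_time;	/* tick count of the last transmit */
};

/* Assumed shape of the periodic check done by the driver's timer. */
static bool
watchdog_expired(const struct adapter_sketch *a, int ticks)
{
	assert(a);
	return (a->watchdog_check &&
	    (ticks - a->watchdog_time) > WATCHDOG_TICKS);
}

/* Transmit path that arms the watchdog but never records the time. */
static void
xmit_without_timestamp(struct adapter_sketch *a, int ticks)
{
	(void)ticks;
	a->watchdog_check = true;
}

/* Transmit path with the timestamp added, as in the patches above. */
static void
xmit_with_timestamp(struct adapter_sketch *a, int ticks)
{
	a->watchdog_check = true;
	a->watchdog_time = ticks;
}
```

With watchdog_time left at 0, the difference is just the uptime in ticks, so the check fires as soon as the system has been up longer than the timeout, regardless of whether the transmitter is actually stuck. That matches the pattern in the log, where the timeout is reported at ticks 1023 and again at 1128 but not at 923.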
Re: em driver regression
On Wed, 14 Apr 2010 09:28:33 -0700 Jack Vogel wrote:

> Oh, didn't realize you were running the lem code :) Will make the changes
> shortly,

r206614 works for me. Thanks :-)

> thanks for your debugging efforts.
>
> Jack

--
Mikolaj Golub
Re: bsnmpd always died on HDD detach
On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
> I am running bsnmpd with basic snmpd.config (only community and location
> changed).
>
> When there is a problem with HDD and disk disapeared from ATA channel
> (eg.: disc physically removed) the bsnmpd always dumps core:
>
> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
>
> I see this for a long rime on all releases of 7.x and 8.x branches (i386
> and amd64). I did not tested 9.x.
>
> Is it a known bug, or should I file PR?

Do you happen to run bsnmp-ucd too? If you do, what version is it? In bsnmp-ucd-0.3.5 I introduced a bug that led to a bsnmpd crash on disk detach. It has been fixed (thanks to Brian Somers) in 0.3.6.

--
Mikolaj Golub
Re: bsnmpd always died on HDD detach
On Mon, Sep 10, 2012 at 04:46:15PM +0200, Miroslav Lachman wrote:
> Mikolaj Golub wrote:
> > On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
> >> I am running bsnmpd with basic snmpd.config (only community and location
> >> changed).
> >>
> >> When there is a problem with HDD and disk disapeared from ATA channel
> >> (eg.: disc physically removed) the bsnmpd always dumps core:
> >>
> >> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
> >>
> >> I see this for a long rime on all releases of 7.x and 8.x branches (i386
> >> and amd64). I did not tested 9.x.
> >>
> >> Is it a known bug, or should I file PR?
> >
> > Do you happen to run bsnmp-ucd too? If you do then what version is it?
> > In bsnmp-ucd-0.3.5 I introduced a bug that lead to bsnmpd crash on a
> > disk detach. It has been fixed (thanks to Brian Somers) in 0.3.6.
>
> No, I never installed bsnmpd-ucd. We are using plain bsnmpd from base
> without any modules.
> It is used by MRTG only for network traffic. Nothing else.

Then the backtrace might be useful.

  gdb /usr/sbin/bsnmpd /path/to/bsnmpd.core
  bt

--
Mikolaj Golub
Re: bsnmpd always died on HDD detach
On Tue, Sep 11, 2012 at 10:16:57PM +0200, Miroslav Lachman wrote:

> (gdb) bt
> #0 0x000801046cba in refresh_disk_storage_tbl () from
> /usr/lib/snmp_hostres.so
> #1 0x0008010478bd in refresh_device_tbl () from
> /usr/lib/snmp_hostres.so
> #2 0x000801047be6 in start_device_tbl () from /usr/lib/snmp_hostres.so
> #3 0x00080065fad5 in poll_dispatch () from /lib/libbegemot.so.4
> #4 0x0040616a in main ()
>
>
> Is it all you need? (I don't know how to use gdb)
>
> It is on FreeBSD 8.3-RELEASE #0: Mon Apr 9 21:23:18 UTC 2012
> r...@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64

Not sure we can get more than that from this core, as snmp_hostres is not built with debugging symbols. You can try rebuilding snmp_hostres with the -g option, installing it and running gdb/bt again:

  DEBUG_FLAGS=-g make -C /usr/src/usr.sbin/bsnmpd/modules/snmp_hostres clean all install

AFAIK it might work or not. If it does not, then wait for another crash :-)

--
Mikolaj Golub
Re: bsnmpd always died on HDD detach
On Wed, Sep 12, 2012 at 10:39:12AM +0200, Miroslav Lachman wrote: > (gdb) bt > #0 0x000801046cba in disk_query_disk (entry=0x0) at > hostres_diskstorage_tbl.c:241 > #1 0x000801dd6a00 in ?? () > #2 0x000801dd6600 in ?? () > #3 0x in ?? () > #4 0x000801048230 in device_entry_create (name=0x0, > location=0x800c14ee0 "0", descr=0x8010482a6 "") at hostres_device_tbl.c:217 > #5 0x000801dd7800 in ?? () > #6 0x000801dd7800 in ?? () > #7 0x000801dd7400 in ?? () > #8 0x in ?? () > #9 0x000801048230 in device_entry_create (name=0x801dd7c00 "", > location=0x801048230 "˙˙I\213|$8čŕ\201˙˙L\211çčŘ\201˙˙é\035ţ˙˙H\215\025", > descr=0x8010482a6 "") at hostres_device_tbl.c:217 > #10 0x000801dd4a00 in ?? () > #11 0x000801dd4a00 in ?? () > #12 0x000801dd1a00 in ?? () > #13 0x in ?? () > #14 0x000801048230 in device_entry_create (name=0x801dd8400 "", > location=0x801048230 "˙˙I\213|$8čŕ\201˙˙L\211çčŘ\201˙˙é\035ţ˙˙H\215\025", > descr=0x8010482a6 "") at hostres_device_tbl.c:217 > #15 0x000801dd1800 in ?? () > #16 0x000801dd1800 in ?? () > #17 0x000800c00ea8 in ?? () > #18 0x0051b1c8 in ?? () > #19 0x000800c00938 in ?? () > #20 0x0051b258 in ?? () > #21 0x000801dc8a00 in ?? () > #22 0x0008009f7be9 in free () from /lib/libc.so.7 > #23 0x in ?? () > #24 0x7fffed98 in ?? () > #25 0x0008010478bd in device_entry_delete () at hostres_device_tbl.c:266 > #26 0x005187d0 in snmp_error () > #27 0x000801047be6 in op_hrDeviceTable (ctx=Variable "ctx" is not > available. > ) at hostres_device_tbl.c:671 > #28 0x0051b840 in ?? () > #29 0x0051b830 in ?? () > #30 0x in ?? () > #31 0x7fffc360 in ?? () > #32 0x0051b830 in ?? () > #33 0x in ?? () > #34 0x0008009efbd2 in _pthread_mutex_init_calloc_cb () from > /lib/libc.so.7 > #35 0x0008009f2d32 in _malloc_prefork () from /lib/libc.so.7 > #36 0x0008009f6e1f in realloc () from /lib/libc.so.7 > #37 0x000800e0b441 in mib_if_is_dyn () from /usr/lib/snmp_mibII.so > #38 0x in ?? () > #39 0x7fffc5cc in ?? () > #40 0x0001 in ?? () > #41 0x7fffc5e0 in ?? 
() > #42 0x31fa39e2fac72819 in ?? () > #43 0x0001 in ?? () > #44 0x00080065fad5 in poll_dispatch () from /lib/libbegemot.so.4 > #45 0x0040616a in main () > > > I hope it helps you to debug this problem.

Looks like we can't trust this output.

--
Mikolaj Golub
Re: bsnmpd always died on HDD detach
On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
> I am running bsnmpd with basic snmpd.config (only community and location
> changed).
>
> When there is a problem with HDD and disk disapeared from ATA channel
> (eg.: disc physically removed) the bsnmpd always dumps core:
>
> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
>
> I see this for a long rime on all releases of 7.x and 8.x branches (i386
> and amd64). I did not tested 9.x.

OK, I was able to reproduce this under qemu by doing

  atacontrol detach ata1

It crashes in the snmp_hostres module, in refresh_device_tbl->refresh_disk_storage_tbl->disk_OS_get_ATA_disks, when traversing the device_map list and dereferencing map->entry_p, which is NULL here.

The device_map table is used for consistent device table indexing. refresh_device_tbl(), the refresh routine for hrDeviceTable, checks the list of available devices and calls device_entry_delete() for devices that have gone. It does not remove the entry from the device_map table, but just sets entry_p to NULL for it (to prevent the index from being reused by another device).

Then refresh_disk_storage_tbl() is called, which in turn calls

  disk_OS_get_ATA_disks();
  disk_OS_get_MD_disks();
  disk_OS_get_disks();

and it crashes in disk_OS_get_ATA_disks() when the removed map entry is dereferenced.

I am attaching the patch that fixes the issue for me.

I was wondering why the issue was not observed after md device removal, as disk_OS_get_MD_disks() does the same thing. It turned out that hostres just does not see md devices, so this function is currently useless. hostres gets devices from devinfo(3), which does not return md devices.

disk_OS_get_disks() calls the kern.disks sysctl to get the list of disks, and uses device_map differently, so it is not affected.

--
Mikolaj Golub

Index: usr.sbin/bsnmpd/modules/snmp_hostres/hostres_diskstorage_tbl.c
===
--- usr.sbin/bsnmpd/modules/snmp_hostres/hostres_diskstorage_tbl.c (revision 240529)
+++ usr.sbin/bsnmpd/modules/snmp_hostres/hostres_diskstorage_tbl.c (working copy)
@@ -287,6 +287,9 @@ disk_OS_get_ATA_disks(void)
    /* Walk over the device table looking for ata disks */
    STAILQ_FOREACH(map, &device_map, link) {
+       /* Skip deleted entries. */
+       if (map->entry_p == NULL)
+           continue;
        for (found = lookup; found->media != DSM_UNKNOWN; found++) {
            if (strncmp(map->name_key, found->dev_name,
                strlen(found->dev_name)) != 0)
@@ -345,6 +348,9 @@ disk_OS_get_MD_disks(void)
    /* Look for md devices */
    STAILQ_FOREACH(map, &device_map, link) {
+       /* Skip deleted entries. */
+       if (map->entry_p == NULL)
+           continue;
        if (sscanf(map->name_key, "md%d", &unit) != 1)
            continue;
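The failure mode and the fix can be shown in isolation with a short sketch. This is not the snmp_hostres code: the types below are simplified stand-ins (a hand-rolled list instead of STAILQ, hypothetical struct and function names), and only name_key and entry_p correspond to fields mentioned in the message. The idea is the same as in the patch: a map entry whose entry_p is NULL stays in the list to keep its index reserved, so every walker has to skip it before dereferencing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a device table entry. */
struct device_entry_sketch {
	int index;
};

/* Simplified stand-in for a device_map element. */
struct device_map_sketch {
	const char *name_key;			/* device name, e.g. "ad0" */
	struct device_entry_sketch *entry_p;	/* NULL once the device is gone */
	struct device_map_sketch *next;
};

/*
 * Count the live devices whose name starts with the given prefix.
 * Entries with entry_p == NULL are deleted devices and are skipped
 * before anything is read through the pointer.
 */
static int
count_live_devices(const struct device_map_sketch *head, const char *prefix)
{
	const struct device_map_sketch *map;
	int n = 0;

	assert(prefix != NULL);
	for (map = head; map != NULL; map = map->next) {
		/* Skip deleted entries. */
		if (map->entry_p == NULL)
			continue;
		if (strncmp(map->name_key, prefix, strlen(prefix)) != 0)
			continue;
		/* Safe to dereference entry_p from here on. */
		if (map->entry_p->index >= 0)
			n++;
	}
	return (n);
}
```

Without the NULL check, the read of map->entry_p->index for a detached device would be a NULL pointer dereference, which is the kind of crash reported in this thread.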
Re: bsnmpd always died on HDD detach
On Sun, Sep 16, 2012 at 05:56:22PM +0400, Andrey V. Elsukov wrote:
> On 15.09.2012 16:50, Mikolaj Golub wrote:
> > I am attaching the patch that fixes the issue for me.
> >
> > I was wandering why the issue was not observed after md device
> > removal, as disk_OS_get_MD_disks() did the same things. It has turned
> > out that hostres just does not see md devices, so this function is
> > currently useless. hostres gets devices from devinfo(3), which does
> > not return md devices.
> >
> > disk_OS_get_disks() calls kern.disks sysctl to get the list of disks,
> > and uses device_map differently, so it is not affected.
>
> I also have a big patch to the hostres module, but it is not yet
> finished. Probably i should commit the part related to the disk
> subsystem. This part has been rewritten to be GEOM aware.

Wonderful! And as I understand it, it will solve this problem too? Then I think there is no need to commit my patch, unless you are not planning to merge yours to stable/[78] (where any fix for this problem is highly desirable).

--
Mikolaj Golub
Re: bsnmpd always died on HDD detach
On Sun, Sep 16, 2012 at 07:07:20PM +0200, Miroslav Lachman wrote:

> I am glad to read that you found the bug!
> The fix (patch) seems trivial - will it be commited / MFCed? :)

Andrey told me that he was not sure when he would be able to commit his work, so I have just committed my fix. I am going to MFC it.

--
Mikolaj Golub
Re: hastctl hang
Sorry, the message went privately to Daisuke, which was not my intention.

-- Forwarded message --
From: Mikolaj Golub
Date: Mon, Nov 26, 2012 at 9:38 AM
Subject: Re: hastctl hang
To: Daisuke Aoyama

On Mon, Nov 26, 2012 at 01:17:46AM +0900, Daisuke Aoyama wrote:
> Hello,
>
> I'm trying to integrate HAST to NAS4Free (FreeBSD 9.1-RC3).
> Now I have created version 9.1.0.1.531.
> http://sourceforge.net/projects/nas4free/files/NAS4Free-9.1.0.1/9.1.0.1.531/
>
> Basic CARP + HAST + iSCSI target setup can be done, but very frequently I
> get hastctl hang when called:
>
> /sbin/hastctl status
> /sbin/hastctl dump
>
> Is it better for this method not to call from a script?
> or somthing wrong to use it?

Normally it is OK to use hastctl for scripting. Does it hang forever or just for a few seconds?

Usually a hung hastctl means that the hastd master process is waiting for its worker (either for its response or for its exit). Could you provide logs from both master and secondary? Also, you might want to run hastd with -d to make it more verbose.

> Also, I don't know how to detect an error of writing to local device from
> hastd.
> Does anyone know about it?

Currently only by monitoring logs. It looks like a good idea to add error counters to the hastctl statistics output...
> Thanks,
> Daisuke Aoyama
>
> -- the procstat shows like this:
> [root@nas4free-nodeb /tmp]# procstat -ka|grep hast
> 11668 100069 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep kern_wait sys_wait4
> amd64_syscall Xfast_syscall
> 17981 100406 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep do_wait
> __umtx_op_wait_uint_private amd64_syscall Xfast_syscall
> 17981 100559 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep soreceive_generic kern_recvit
> recvit sys_recvfrom amd64_syscall Xfast_syscall
> 17981 100560 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep soreceive_generic kern_recvit
> recvit sys_recvfrom amd64_syscall Xfast_syscall
> 17981 100561 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep do_wait
> __umtx_op_wait_uint_private amd64_syscall Xfast_syscall
> 17984 100078 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep do_wait
> __umtx_op_wait_uint_private amd64_syscall Xfast_syscall
> 17984 100562 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep soreceive_generic kern_recvit
> recvit sys_recvfrom amd64_syscall Xfast_syscall
> 17984 100563 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep soreceive_generic kern_recvit
> recvit sys_recvfrom amd64_syscall Xfast_syscall
> 17984 100564 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep do_wait
> __umtx_op_wait_uint_private amd64_syscall Xfast_syscall
> 18218 100145 hastctl - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep soreceive_generic kern_recvit
> recvit sys_recvfrom amd64_syscall Xfast_syscall
>
> [root@nas4free-nodeb /tmp]# procstat -ta|grep hast
> 11668 100069 hastd - 0 120 sleep wait
> 17979 100557 hastd - 2 120 sleep g_waitid

Strange, I don't see process 17979 in the procstat -k output. Again, the logs might be helpful here.
> 17981 100406 hastd- 2 120 sleep uwait > 17981 100559 hastd- 0 120 sleep sbwait > 17981 100560 hastd- 0 120 sleep sbwait > 17981 100561 hastd- 1 120 sleep uwait > 17984 100078 hastd- 2 121 sleep uwait > 17984 100562 hastd- 3 120 sleep sbwait > 17984 100563 hastd- 2 120 sleep sbwait > 17984 100564 hastd- 1 121 sleep uwait > 18218 100145 hastctl - 2 152 sleep sbwait > -- the procstat shows like this: > > > ___________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org" -- Mikolaj Golub ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
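For scripts that poll hastctl, one defensive option is to bound each call with a timeout, so that a hastctl blocked behind a busy hastd master does not stall the whole script. The following is a minimal sketch in plain sh and is not something proposed in this thread: run_with_timeout is a hypothetical helper name, and the 10-second limit in the usage comment is an arbitrary choice.

```shell
# Hypothetical helper: run a command, and send it SIGTERM if it has not
# finished within the given number of seconds.
run_with_timeout() {
    rwt_secs=$1
    shift

    # Start the real command in the background and remember its PID.
    "$@" &
    rwt_cmd=$!

    # Watchdog: after the timeout, terminate the command if it still runs.
    ( sleep "$rwt_secs"; kill -TERM "$rwt_cmd" 2>/dev/null ) >/dev/null 2>&1 &
    rwt_dog=$!

    # Collect the exit status of the command (non-zero if it was killed).
    rwt_status=0
    wait "$rwt_cmd" || rwt_status=$?

    # The command is done, so the watchdog is no longer needed.
    kill -TERM "$rwt_dog" 2>/dev/null || true
    wait "$rwt_dog" 2>/dev/null || true

    return "$rwt_status"
}

# Usage (assumed limit of 10 seconds):
#   run_with_timeout 10 /sbin/hastctl status || echo "hastctl failed or timed out" >&2
```

A timeout only keeps the calling script responsive. It does not explain why the master is waiting for its worker, so the hastd logs are still needed to find the cause.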
libstdc++, libsupc++, delete operators and valgrind
(operator delete[](void*)) redirected to 0x1005700 (operator delete[](void*)) Now the question is: is it ok that we now have the "new" operators still being called via libstdc++ while the "delete" operators are called directly from libsupc++? If it is ok, is the proposed solution of adding redirects for libsupc++ the right way to fix valgrind? -- Mikolaj Golub
Re: libstdc++, libsupc++, delete operators and valgrind
On Sun, Jan 20, 2013 at 02:19:55PM +0200, Mikolaj Golub wrote: > Hi, > > Some time ago I noticed that valgrind started to complain about > "Mismatched free() / delete / delete []" for valid new/delete > combinations. > > For example, the following test program > > int main() > { > char* buf = new char[10]; > delete [] buf; > > return 0; > } > > produced a warning: > > ==38718== Mismatched free() / delete / delete [] > ==38718==at 0x100416E: free (vg_replace_malloc.c:473) > ==38718==by 0x4007BE: main (test.cpp:5) > ==38718== Address 0x2400040 is 0 bytes inside a block of size 10 alloc'd > ==38718==at 0x10047D7: operator new[](unsigned long) > (vg_replace_malloc.c:382) > ==38718==by 0x40079D: main (test.cpp:4) > > For some time I hoped that "someone" would fix the problem but seeing > that after several upgrades it was still there I decided it is time to > do some investigations. > > Running the valgrind with "--trace-redir=yes -v" showed that valgrind > activates redirections for new/delete symbols in libstdc++: > > --6729-- Reading syms from /usr/lib/libstdc++.so.6 (0x1209000) > ... > --6729---- ACTIVE -- > ... 
> --6729-- 0x01260770 (operator new[](unsig) R-> (1001.0) 0x010041b0 > operator new[](unsigned long, std::nothrow_t const&) > --6729-- 0x01260780 (operator new(unsigne) R-> (1001.0) 0x01004270 > operator new(unsigned long, std::nothrow_t const&) > --6729-- 0x012608a0 (operator delete[](vo) R-> (1005.0) 0x01003e40 > operator delete[](void*, std::nothrow_t const&) > --6729-- 0x012608b0 (operator delete(void) R-> (1005.0) 0x01003fa0 > operator delete(void*, std::nothrow_t const&) > --6729-- 0x012dea90 (operator new[](unsig) R-> (1003.0) 0x01004770 > operator new[](unsigned long) > --6729-- 0x012deab0 (operator new(unsigne) R-> (1003.0) 0x01004860 > operator new(unsigned long) > --6729-- 0x012deca0 (operator delete[](vo) R-> (1005.0) 0x01003ef0 > operator delete[](void*) > --6729-- 0x012e2b80 (operator delete(void) R-> (1005.0) 0x01004050 > operator delete(void*) > > But "delete" redirection is not triggered, while "new" is: > > --6729-- REDIR: 0x12dea90 (operator new[](unsigned long)) redirected to > 0x1004770 (operator new[](unsigned long)) > --6729-- REDIR: 0x19dd9a0 (free) redirected to 0x1004100 (free) > ==6729== Mismatched free() / delete / delete [] > ==6729==at 0x100416E: free (vg_replace_malloc.c:473) > ==6729==by 0x400715: main (test.cpp:5) > ==6729== Address 0x1ed7040 is 0 bytes inside a block of size 10 alloc'd > ==6729==at 0x10047D7: operator new[](unsigned long) > (vg_replace_malloc.c:382) > ==6729==by 0x400701: main (test.cpp:4) > > A little research revealed that in this case the delete operator from > libsupc++ is called and valgrind does not provide redirections for the > symbols in libsupc++. 
> > When I added the redirections for libsupc++ to valgrind's > vg_replace_malloc.c: > > #define VG_Z_LIBSUPCXX_SONAME libsupcZpZpZa // libsupc++* > > FREE(VG_Z_LIBSUPCXX_SONAME, _ZdlPv,__builtin_delete ); > FREE(VG_Z_LIBSUPCXX_SONAME, _ZdlPvRKSt9nothrow_t, __builtin_delete ); > FREE(VG_Z_LIBSUPCXX_SONAME, _ZdaPv, __builtin_vec_delete ); > FREE(VG_Z_LIBSUPCXX_SONAME, _ZdaPvRKSt9nothrow_t, __builtin_vec_delete ); > > the issue was fixed: > > --99254-- Reading syms from /usr/lib/libstdc++.so.6 > ... > --99254---- ACTIVE -- > ... > --99254-- 0x012627c0 (operator new[](unsig) R-> (1001.0) 0x01004ce0 > operator new[](unsigned long, std::nothrow_t const&) > --99254-- 0x012627d0 (operator new(unsigne) R-> (1001.0) 0x01004860 > operator new(unsigned long, std::nothrow_t const&) > --99254-- 0x012628d0 (operator delete[](vo) R-> (1005.0) 0x01005b00 > operator delete[](void*, std::nothrow_t const&) > --99254-- 0x012628e0 (operator delete(void) R-> (1005.0) 0x01005500 > operator delete(void*, std::nothrow_t const&) > --99254-- 0x012c27e0 (operator new[](unsig) R-> (1003.0) 0x01004a80 > operator new[](unsigned long) > --99254-- 0x012c2800 (operator new(unsigne) R-> (1003.0) 0x01004430 > operator new(unsigned long) > --99254-- 0x012c29a0 (operator delete[](vo) R-> (1005.0) 0x01005800 > operator delete[](void*) > --99254-- 0x012c3e40 (operator delete(void) R-> (1005.0) 0x01005200 > operator delete(void*) > ... > --99254-- Reading syms from /usr/lib/libsupc++.so.1 > ... > --99254---- ACTIVE -- >
Re: Vimage Jail kernel crashed
On Sat, May 04, 2013 at 02:52:23PM +0900, KIRIYAMA Kazuhiko wrote: > May 4 11:19:46 xx kernel: Fatal trap 12: page fault while in kernel mode > May 4 11:19:46 xx kernel: cpuid = 2; apic id = 02 > May 4 11:19:46 xx kernel: fault virtual address = 0x7818c3798 > May 4 11:19:46 xx kernel: fault code = supervisor write > data, page not present > May 4 11:19:46 xx kernel: instruction pointer= > 0x20:0x8162c19e > May 4 11:19:46 xx kernel: stack pointer = > 0x28:0xff8121b22860 > May 4 11:19:46 xx kernel: frame pointer = > 0x28:0xff8121b22870 > May 4 11:19:46 xx kernel: code segment = base 0x0, limit > 0xf, type 0x1b > May 4 11:19:46 xx kernel: = DPL 0, pres 1, long 1, def32 0, gran 1 > May 4 11:19:46 xx kernel: processor eflags = interrupt enabled, > resume, IOPL = 0 > May 4 11:19:46 xx kernel: current process= 15360 > (ifconfig) > May 4 11:19:46 xx kernel: trap number= 12 > May 4 11:19:46 xx kernel: panic: page fault > May 4 11:19:46 xx kernel: cpuid = 2 > May 4 11:19:46 xx kernel: KDB: stack backtrace: > May 4 11:19:46 xx kernel: #0 0x80923446 at kdb_backtrace+0x66 > May 4 11:19:46 xx kernel: #1 0x808ed0be at panic+0x1ce > May 4 11:19:46 xx kernel: #2 0x80c7e330 at trap_fatal+0x290 > May 4 11:19:46 xx kernel: #3 0x80c7e668 at trap_pfault+0x1e8 > May 4 11:19:46 xx kernel: #4 0x80c7ec6e at trap+0x3be > May 4 11:19:46 xx kernel: #5 0x80c682ef at calltrap+0x8 > May 4 11:19:46 xx kernel: #6 0x8162c76d at > pfi_change_group_event+0x4d > May 4 11:19:46 xx kernel: #7 0x809a0d3b at if_delgroup+0x38b > May 4 11:19:46 xx kernel: #8 0x809a7846 at > if_clone_destroyif+0x136 > May 4 11:19:46 xx kernel: #9 0x809a831a at if_clone_destroy+0x17a > May 4 11:19:46 xx kernel: #10 0x809a5892 at ifioctl+0x482 > May 4 11:19:46 xx kernel: #11 0x80934ef6 at kern_ioctl+0x106 > May 4 11:19:46 xx kernel: #12 0x8093513d at sys_ioctl+0xfd > May 4 11:19:46 xx kernel: #13 0x80c7dc10 at amd64_syscall+0x540 > May 4 11:19:46 xx kernel: #14 0x80c685d7 at Xfast_syscall+0xf7 It looks like it crashed when 
referencing a vnet that had already been destroyed, in the pfi_change_group_event hook. > Is there any suggestions? VIMAGE+pf support is fragile. If it works for someone, it is rather by accident. I expect replacing pf with ipfw_nat or natd will give better results. If you still prefer pf, you may try destroying the epair interface before destroying the vnet, e.g. using prestop rc.d/jail hooks instead of poststop, if that is possible. -- Mikolaj Golub
Re: Vimage Jail kernel crashed
On Sat, May 04, 2013 at 10:41:46PM +0900, KIRIYAMA Kazuhiko wrote: > > If you still prefer pf, you may try destroying epair interface before > > destroying vnet, e.g. using prestop rc.d/jail hooks instead of > > poststop, if it is possible. > > In particular, execute following sequence? > > # ifconfig epairXa destroy > # ifconfig bridge0 deletem epairXa Yes, but in the reverse order: delete it from the bridge first. It is about lines like these in your configuration: export jail_web_exec_poststop0="ifconfig bridge0 deletem epair4a" export jail_web_exec_poststop1="ifconfig epair4a destroy" The crash happened when executing ifconfig epair destroy. You might want to try running the commands manually before using the rc script. -- Mikolaj Golub
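The suggestion above amounts to moving the two teardown commands from the poststop hooks to prestop hooks, keeping the bridge removal first. A sketch of what that could look like, assuming the same jail name (web) and interface names (bridge0, epair4a) as in the quoted configuration, and assuming the rc.d/jail script in use honours numbered exec_prestop variables in the same way as the poststop ones; this is untested against the crashing setup:

```shell
# Run the epair teardown from prestop hooks, i.e. before the jail and its
# vnet are destroyed, instead of from poststop hooks.
# Order matters: remove the interface from the bridge first, then destroy it.
export jail_web_exec_prestop0="ifconfig bridge0 deletem epair4a"
export jail_web_exec_prestop1="ifconfig epair4a destroy"
```

As noted in the message, it is worth running the two ifconfig commands by hand first to check that the interface can be destroyed while the vnet still exists.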
Re: Nullfs leaks i-nodes
On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote: > I created a PR, kern/178238, on this but would like to know if anyone has > any ideas or patches? > > Have updated the system where I see this to FreeBSD 9.1-STABLE #0 r250229 > and still have the problem. I am observing an effect that might look like an inode leak, which I think is due to the caching of free nullfs vnodes, recently added by kib (r240285): the free inode count does not increase after unlink, but if I purge the free vnode cache (temporarily setting vfs.wantfreevnodes to 0 and watching vfs.freevnodes decrease to 0), the free inode count grows back. You have only about 1000 inodes available on your underlying fs, while vfs.wantfreevnodes is, I think, much higher, so you run out of i-nodes. If this is really your case, you can disable caching by mounting nullfs with nocache (it looks like caching is not important in your case). -- Mikolaj Golub
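The check described above can be scripted roughly as follows. This is only a sketch: show_inode_effect is a hypothetical helper name, its argument is the mount point of the underlying file system, and the paths in the trailing mount example are placeholders. The vnode cache is drained in the background, so vfs.freevnodes may have to be re-read a few times before it reaches 0.

```shell
# Hypothetical helper: show free inodes before and after purging the free
# vnode cache, then restore the original vfs.wantfreevnodes value.
show_inode_effect() {
    sie_fs=$1

    # Remember the current setting so it can be restored afterwards.
    sie_saved=$(sysctl -n vfs.wantfreevnodes)

    df -i "$sie_fs"                 # free inodes with cached vnodes held
    sysctl vfs.wantfreevnodes=0     # ask the kernel to drop free vnodes
    sysctl vfs.freevnodes           # should decrease towards 0 over time
    df -i "$sie_fs"                 # free inodes after the purge

    sysctl vfs.wantfreevnodes="$sie_saved"
}

# If the effect is confirmed, nullfs can be mounted without caching
# (placeholder paths; requires a nullfs that supports the nocache option):
#   mount -t nullfs -o nocache /data /jails/web/data
```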
Re: Proposed MFC to hastctl: compact 'status' and introduce 'list' command
On Fri, May 24, 2013 at 12:54:56AM +0400, Dmitry Morozovsky wrote: > Dear colleagues, > > is there any objection for MFCing the change introduced in > > http://svnweb.freebsd.org/changeset/base/248291 > > (the most major change: compacting output of `hastctl status' to one-liner > per > provider; old output is retained as `list' command) > > to at least stable/9 ? If we agree to merge, I would prefer merging to both stable/9 and stable/8, to keep the divergence between the branches as small as possible. > The reason I'm asking is that it could lead to changes in hast-related > scripts > which one use in production. > > If no objections are received I'm (with the generous support from trociny, > thank you Mikolaj!) tend to merge it after, say, 2 weeks. > > Thanks! > > -- > Sincerely, > D.Marck [DM5020, MCK-RIPE, DM3-RIPN] > [ FreeBSD committer: ma...@freebsd.org ] > > *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru *** -- Mikolaj Golub
Re: Proposed MFC to hastctl: compact 'status' and introduce 'list' command
On Fri, May 24, 2013 at 02:08:28PM +0400, Dmitry Morozovsky wrote: > Pete, > > On Fri, 24 May 2013, Pete French wrote: > > > > http://svnweb.freebsd.org/changeset/base/248291 > > ... > > > The reason I'm asking is that it could lead to changes in hast-related > > > scripts > > > which one use in production. > > > > > > Any chance we could do this in 2 stages - first being to add 'list' to give > > us a chance > > to change scripts over, then make the changes to 'status'. I have scripts > > which try and parse the output from 'status' which will need changing, > > and I suspect I am not the only one... > > I see no problem with this, as it is one-line patch (modulo usage/manual page > changes); it would be direct commit to -stable, but as it is temporary, I see > no problem there too. > > Mikolaj, your opinion? It looks like a very good idea. -- Mikolaj Golub
Re: hast and zfs trim possibly causing some problems in 9.2
On Wed, Oct 09, 2013 at 03:47:29PM +0100, Steven Hartland wrote: > ZFS will try to send DELETE requests to the underlying storage to > support TRIM. If that fails then it will disable TRIM support for > that vdev. > > My guess would be you're just seeing hast being a bit verbose > when these initial batch failures happen. If the device on the secondary node does not support DELETE, but the device on the primary does, HAST will report to ZFS that DELETE succeeded (although it failed on the secondary), and ZFS will not disable TRIM. Pete, isn't this your case? > From: "Pete French" > > >I just had a machine fall over on me for the first time in ages - one > > of a pair of machines we have running hast with zfs on top. I haven't > > got any concrete evidence of what made it die as yet, but I > > did notice the logfiles filling up with thousands of lines like this > > just prior to the crash: > > > > serpentine-active hastd[1522]: [serp1] (primary) Remote request failed > > (Operation not supported): DELETE(26847744000, 1536). > > > > so I am guessing that is ZFS trying to send a trim command to hast, and hast > > does not support it. Have disabled zfs trim now, but thought it was > > worth mentioning - I would have not expected zfs to be trying to issue > > a trim command to an underlying device which doesn't support it. These > > machines were rock solid under 8, and the only change I can see with 9 is > > the trim support being added. Another important change that comes to mind is the default replication mode, which changed from fullsync to memsync. Do you have the replication mode explicitly set in your config? -- Mikolaj Golub
Re: hast and zfs trim possibly causing some problems in 9.2
On Fri, Oct 11, 2013 at 11:27:36AM +0100, Pete French wrote: > > If the device on the secondary node does not support DELETE, but the > > device on the primary does, HAST will report to ZFS that DELETE > > succeeded (although it failed on the secondary), and ZFS will not > > disable TRIM. Pete, isn't this your case? > > Afraid not, both machines are running normal "spinning rust" hard > drives as the actual storage layer, so there is nothing TRIM capable > anywhere. > > I didn't get much chance to look at this yesterday, but am looking at the logs > again now, and I see these messages right up to the time the machine > fell over. That machine had been up for a long time, and it was still logging > these messages, so it looks very much as if ZFS did not stop trying to > issue the TRIM. You showed only "Remote request failed" errors from your logs. Do you have "Local request failed" errors too? -- Mikolaj Golub
Re: hast and zfs trim possibly causing some problems in 9.2
On Fri, Oct 11, 2013 at 01:42:39PM +0300, Mikolaj Golub wrote: > You showed only "Remote request failed" errors from your logs. Do you > have "Local request failed" errors too? You should also see them in "local errors" statistics from `hastctl list' output. -- Mikolaj Golub
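To answer the question about local versus remote failures from the logs, the two kinds of hastd messages can simply be counted. A small sketch, assuming the log lines contain the literal strings "Remote request failed" and "Local request failed" as quoted in this thread; count_hast_failures is a hypothetical helper name and /var/log/messages is an assumed log location:

```shell
# Hypothetical helper: read syslog lines on stdin and count hastd
# "Remote request failed" and "Local request failed" messages.
count_hast_failures() {
    awk '
        /hastd/ && /Remote request failed/ { remote++ }
        /hastd/ && /Local request failed/  { loc++ }
        END { printf "remote=%d local=%d\n", remote, loc }
    '
}

# Usage (assumed log path):
#   count_hast_failures < /var/log/messages
```

A non-zero local count would mean the DELETE requests are also failing on the primary's own device, which is the case where ZFS would be expected to notice the failure.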
Re: NICs locking up, "*tcp_sc_h"
On Fri, 13 Mar 2009 20:56:24 +1100 Nick Withers wrote: > I'm sorry to ask what is probably a very simple question, but is there > somewhere I should look to get clues on debugging from a manually > generated dump? I tried "panic" after manually invoking the kernel > debugger but proved highly inept at getting from the dump the same > information "ps" / "where" gave me within the debugger live. You can capture the ddb session in the capture buffer and then extract it from the dump. In ddb, run "capture on", do your debugging, then run "panic" or "call doadump", and after reboot run: ddb capture -M /var/crash/vmcore.X print > out I would recommend increasing the debug.ddb.capture.bufsize sysctl variable to be sure the whole ddb session is captured. -- Mikolaj Golub
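Put together, the capture workflow described above looks roughly like this. It is a sketch under assumptions: the buffer size is an arbitrary example value (the kernel enforces its own upper limit), extract_ddb_capture is a hypothetical wrapper name, and vmcore.X stands for whichever dump file savecore wrote.

```shell
# 1. Before entering the debugger, enlarge the capture buffer
#    (example value only):
#      sysctl debug.ddb.capture.bufsize=1048576
#
# 2. At the db> prompt, turn capture on, debug, then force a dump:
#      capture on
#      ps
#      where
#      call doadump
#
# 3. After reboot, extract the captured session from the dump.
#    Hypothetical wrapper: $1 is the core file, $2 the output text file.
extract_ddb_capture() {
    ddb capture -M "$1" print > "$2"
}

# Example:
#   extract_ddb_capture /var/crash/vmcore.X ddb-session.txt
```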
Re: 7.2-PRERELEASE/sunx2200/bge/msi broken
On Sun, 22 Mar 2009 12:55:02 +0200 Danny Braniss wrote: DB> Hi, DB> between March 16 and now, bge on a Sun X2200 stopped working, DB> turning off msi (via hw.pci.enable_msi=0) got it working again. DB> I tried first replacing bge with an older version but that did not help. It looks related to this report: http://www.freebsd.org/cgi/getmsg.cgi?fetch=1253844+1263253+/usr/local/www/db/text/2009/freebsd-bugs/20090322.freebsd-bugs -- Mikolaj Golub
Re: RELENG_7 crash
On Tue, 21 Apr 2009 01:25:06 -0400 Mike Tancsa wrote: MT> The box has a fairly heavy UDP load. Its RELENG_7 as of today and MT> took 3hrs for it to dump core. MT> Fatal trap 12: page fault while in kernel mode MT> cpuid = 1; apic id = 01 MT> fault virtual address = 0x68 MT> fault code = supervisor read, page not present MT> instruction pointer = 0x20:0xc0637146 MT> stack pointer = 0x28:0xe766eaac MT> frame pointer = 0x28:0xe766eb54 MT> code segment= base 0x0, limit 0xf, type 0x1b MT> = DPL 0, pres 1, def32 1, gran 1 MT> processor eflags= interrupt enabled, resume, IOPL = 0 MT> current process = 761 (bsnmpd) MT> trap number = 12 MT> panic: page fault MT> cpuid = 1 MT> Uptime: 3h47m43s MT> Physical memory: 2036 MB MT> Dumping 83 MB: 68 52 36 20 4 MT> (kgdb) bt MT> #0 doadump () at pcpu.h:196 MT> #1 0xc05964d7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418 MT> #2 0xc05967a9 in panic (fmt=Variable "fmt" is not available. MT> ) at /usr/src/sys/kern/kern_shutdown.c:574 MT> #3 0xc07f64ac in trap_fatal (frame=0xe766ea6c, eva=104) at MT> /usr/src/sys/i386/i386/trap.c:939 MT> #4 0xc07f6730 in trap_pfault (frame=0xe766ea6c, usermode=0, eva=104) MT> at /usr/src/sys/i386/i386/trap.c:852 MT> #5 0xc07f70dc in trap (frame=0xe766ea6c) at /usr/src/sys/i386/i386/trap.c:530 MT> #6 0xc07db7eb in calltrap () at /usr/src/sys/i386/i386/exception.s:159 MT> #7 0xc0637146 in sysctl_ifdata (oidp=0xc08816a0, arg1=0xe766ec24, MT> arg2=2, req=0xe766eba4) at /usr/src/sys/net/if_mib.c:127 MT> #8 0xc059fd77 in sysctl_root (oidp=Variable "oidp" is not available. 
MT> ) at /usr/src/sys/kern/kern_sysctl.c:1413 MT> #9 0xc059ff14 in userland_sysctl (td=0xc5374460, name=0xe766ec14, MT> namelen=6, old=0x0, oldlenp=0xbfbf8478, inkernel=0, new=0x0, MT> newlen=0, retval=0xe766ec10, flags=0) at MT> /usr/src/sys/kern/kern_sysctl.c:1506 MT> #10 0xc05a0064 in __sysctl (td=0xc5374460, uap=0xe766ecfc) at MT> /usr/src/sys/kern/kern_sysctl.c:1443 MT> #11 0xc07f6a85 in syscall (frame=0xe766ed38) at MT> /usr/src/sys/i386/i386/trap.c:1090 MT> #12 0xc07db850 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255 MT> #13 0x0033 in ?? () MT> Previous frame inner to this frame (corrupt stack?) MT> (kgdb) Just FYI, the same problem has already been registered in the PR database as kern/132734. -- Mikolaj Golub