Re: 8.0-RC1 NFS client timeout issue

2009-10-31 Thread Daniel Braniss
> 
> First off, I know that cross posting is evil, but I wanted to try
> and make sure developers saw it.
> 
> On Tue, 27 Oct 2009, Olaf Seibert wrote:
> 
> > I see an annoying behaviour with NFS over TCP. It happens both with nfs
> > and newnfs. This is with FreeBSD/amd64 8.0-RC1 as client. The server is
> > some Linux or perhaps Solaris, I'm not entirely sure.
> >
> > After trying to find something in packet traces, I think I have found
> > something.
> >
> > The scenario seems to be as follows. Sorry for the width of the lines.
> >
> >
> > No.    Time        Source         Destination    Protocol Info
> >   2296 2992.216855 xxx.xxx.31.43  xxx.xxx.16.142 NFS  V3 LOOKUP Call (Reply In 2297), DH:0x3819da36/w
> >   2297 2992.217107 xxx.xxx.16.142 xxx.xxx.31.43  NFS  V3 LOOKUP Reply (Call In 2296) Error:NFS3ERR_NOENT
> >   2298 2992.217141 xxx.xxx.31.43  xxx.xxx.16.142 NFS  V3 LOOKUP Call (Reply In 2299), DH:0x170cb16a/bin
> >   2299 2992.217334 xxx.xxx.16.142 xxx.xxx.31.43  NFS  V3 LOOKUP Reply (Call In 2298), FH:0x61b8eb12
> >   2300 2992.217361 xxx.xxx.31.43  xxx.xxx.16.142 NFS  V3 ACCESS Call (Reply In 2301), FH:0x61b8eb12
> >   2301 2992.217582 xxx.xxx.16.142 xxx.xxx.31.43  NFS  V3 ACCESS Reply (Call In 2300)
> >   2302 2992.217605 xxx.xxx.31.43  xxx.xxx.16.142 NFS  V3 LOOKUP Call (Reply In 2303), DH:0x61b8eb12/w
> >   2303 2992.217860 xxx.xxx.16.142 xxx.xxx.31.43  NFS  V3 LOOKUP Reply (Call In 2302) Error:NFS3ERR_NOENT
> >   2304 2992.318770 xxx.xxx.31.43  xxx.xxx.16.142 TCP  934 > nfs [ACK] Seq=238293 Ack=230289 Win=8192 Len=0 TSV=86492342 TSER=12393434
> >   2306 3011.537520 xxx.xxx.16.142 xxx.xxx.31.43  NFS  V3 GETATTR Reply (Call In 2305)  Directory mode:2755 uid:4100 gid:4100
> >   2307 3011.637744 xxx.xxx.31.43  xxx.xxx.16.142 TCP  934 > nfs [ACK] Seq=238429 Ack=230405 Win=8192 Len=0 TSV=86511662 TSER=12395366
> >   2308 3371.534980 xxx.xxx.16.142 xxx.xxx.31.43  TCP  nfs > 934 [FIN, ACK] Seq=230405 Ack=238429 Win=49232 Len=0 TSV=12431366 TSER=86511662
> >
> > The server decides, for whatever reason, to terminate the
> > connection and sends a FIN.
> >
> >   2309 3371.535018 xxx.xxx.31.43  xxx.xxx.16.142 TCP  934 > nfs [ACK] Seq=238429 Ack=230406 Win=8192 Len=0 TSV=86871578 TSER=12431366
> >
> > Client acknowledges this,
> >
> >   2310 3375.379693 xxx.xxx.31.43  xxx.xxx.16.142 NFS  V3 ACCESS Call, FH:0x008002a2
> >
> > but tries to sneak in another call anyway.  [A]
> >
> Probably not the best behaviour, but I think it is technically allowed by 
> TCP. (My TCP is very rusty, but I think the socket should be in
> TCPS_CLOSE_WAIT at this point and the BSD code will have called
> socantrcvmore(), but not socantsndmore().)
> 
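To illustrate the half-close point made above: once one side has finished sending, the other side sees end-of-file on receive but can still send. Here is a minimal userland sketch of that behaviour. It uses a local AF_UNIX socketpair rather than a real TCP connection to an NFS server, and shutdown(SHUT_WR) stands in for the server's FIN; the function name half_close_demo is made up for this illustration.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical demo: sv[1] plays the "server" that stops sending,
 * sv[0] plays the "client" that still tries to send a call.
 * Returns 0 if the half-close behaved as described, -1 otherwise.
 */
int
half_close_demo(void)
{
	static const char msg[] = "call";
	char buf[8];
	int sv[2];
	int ok = -1;

	/* The message must fit in the receive buffer used below. */
	assert(sizeof(msg) - 1 <= sizeof(buf));

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return (-1);

	/* "Server" closes its send direction (stand-in for the FIN). */
	if (shutdown(sv[1], SHUT_WR) != 0)
		goto done;

	/* "Client" now reads end-of-file: nothing more will arrive. */
	if (read(sv[0], buf, sizeof(buf)) != 0)
		goto done;

	/* The client's own send direction is still open, so this works. */
	if (write(sv[0], msg, sizeof(msg) - 1) != (ssize_t)(sizeof(msg) - 1))
		goto done;

	/* The half-closed "server" can still receive the data. */
	if (read(sv[1], buf, sizeof(buf)) != (ssize_t)(sizeof(msg) - 1))
		goto done;

	ok = 0;
done:
	close(sv[0]);
	close(sv[1]);
	return (ok);
}
```

Whether the peer then answers the data it received is a separate question, which is what [B] is about.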
> >   2311 3375.474788 xxx.xxx.16.142 xxx.xxx.31.43  TCP  nfs > 934 [ACK] Seq=230406 Ack=238569 Win=49232 Len=0 TSV=12431760 TSER=86875423
> >
> > Server ACKs but doesn't send anything else... [B]
> >
> > Time passes...
> >
> This is where it seems interesting. It looks to me like the socket upcall
> for receiving the FIN would have happened before this point, setting the
> ct_error.re_status to RPC_CANTRECV, but the code in clnt_vc_call() doesn't
> check for this. (It does check for it happening during and after the
> sosend(), but not before it, from what I can see.)
> 
> >
> > [B] would be a bug of the server in my opinion. If it ACKs a call, it
> > should send a reply. And if it can't, it shouldn't.
> >
> I'll leave this one for the TCP wizards. I'm not sure what the
> correct behaviour is when data is received on a connection. (I think
> it is waiting for a FIN from the client side at this point.)
> 
> If you could try the following patch and see if it helps, that would be
> appreciated, rick
> ps: I'll try to reproduce the situation here, but I'm not sure if I can.
> --- rpc/clnt_vc.c.sav 2009-10-28 15:44:20.0 -0400
> +++ rpc/clnt_vc.c 2009-10-28 15:49:57.0 -0400
> @@ -413,6 +413,19 @@
> 
>  	cr->cr_xid = xid;
>  	mtx_lock(&ct->ct_lock);
> +	/*
> +	 * Check to see if the other end has already started to close down
> +	 * the connection. If it happens after this point, it will be
> +	 * detected below, when cr->cr_error is checked.
> +	 */
> +	if (ct->ct_error.re_status == RPC_CANTRECV) {
> +		if (errp != &ct->ct_error) {
> +			errp->re_errno = ct->ct_error.re_errno;
> +			errp->re_status = RPC_CANTRECV;
> +		}
> +		stat = RPC_CANTRECV;
> +		goto out;
> +	}
>  	TAILQ_INSERT_TAIL(&ct->ct_pending, cr, cr_link);
>  	mtx_unlock(&ct->ct_lock);
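
The idea behind the patch above can be shown outside the kernel as well: the receive path records a sticky "cannot receive" status on the connection, and the send path checks that status under the same lock before it queues a new request. The following is only a userland sketch of that pattern. All names (sk_conn, sk_err, SK_CANTRECV and so on) are hypothetical and are not the kernel RPC structures, and a pthread mutex stands in for the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch only. */
enum sk_stat { SK_SUCCESS = 0, SK_CANTRECV = 1 };

struct sk_err {
	enum sk_stat status;
	int errnum;
};

struct sk_conn {
	pthread_mutex_t lock;
	struct sk_err error;	/* sticky; set by the receive path */
	size_t npending;	/* requests queued and awaiting a reply */
};

void
sk_conn_init(struct sk_conn *c)
{
	assert(c != NULL);
	pthread_mutex_init(&c->lock, NULL);
	c->error.status = SK_SUCCESS;
	c->error.errnum = 0;
	c->npending = 0;
}

/* What a receive-side callback would do when the peer closes down. */
void
sk_conn_mark_closed(struct sk_conn *c, int errnum)
{
	pthread_mutex_lock(&c->lock);
	c->error.status = SK_CANTRECV;
	c->error.errnum = errnum;
	pthread_mutex_unlock(&c->lock);
}

/*
 * Queue a request unless the peer has already started closing.
 * The check and the queueing happen under one lock, so a close that
 * arrives later has to be caught by the caller's later error checks.
 */
enum sk_stat
sk_conn_enqueue(struct sk_conn *c, struct sk_err *errp)
{
	enum sk_stat stat = SK_SUCCESS;

	assert(c != NULL && errp != NULL);
	pthread_mutex_lock(&c->lock);
	if (c->error.status == SK_CANTRECV) {
		if (errp != &c->error)
			*errp = c->error;
		stat = SK_CANTRECV;
	} else {
		c->npending++;
	}
	pthread_mutex_unlock(&c->lock);
	return (stat);
}
```

A caller that gets SK_CANTRECV back would fail the call right away instead of sending on a connection that will never produce a reply.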

Di

Console has no name?

2009-10-31 Thread Dimitry Andric
Hi,

Somewhere between r198312 and r198702 I started getting this message
during boot, on a machine using serial console:

WARNING: console at 0xc093cea0 has no name

This is just before the copyright banner of the kernel.  Does anybody
have an idea what might cause this, before I start digging? :)

The machine has /boot.config with just "-P", and no console-related
entries in /boot/loader.conf.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Console has no name?

2009-10-31 Thread Dimitry Andric
On 2009-10-31 14:14, Dimitry Andric wrote:
> Somewhere between r198312 and r198702 I started getting this message
> during boot, on a machine using serial console:
> 
> WARNING: console at 0xc093cea0 has no name

Okay, this seems to be caused by r198655, which is an MFC of r197570,
where experimental support for USB serial console was added.

However, the consdev::cn_name field is never initialized in
usb_serial.c, whereas this does happen in other console drivers.

There doesn't seem to be a consensus whether this field needs to be
initialized in the _cnprobe() or _cninit() functions: I count 9 instances
of initialization in the former, and 3 in the latter.  Is there a
preferred way?

Since _cnprobe seems more popular, I propose the following fix:

Index: sys/dev/usb/serial/usb_serial.c
===
--- sys/dev/usb/serial/usb_serial.c (revision 198702)
+++ sys/dev/usb/serial/usb_serial.c (working copy)
@@ -1301,6 +1301,7 @@ static void
 ucom_cnprobe(struct consdev  *cp)
 {
 	cp->cn_pri = CN_NORMAL;
+	strlcpy(cp->cn_name, "ucom", sizeof cp->cn_name);
 }
 
 static void
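
For readers who want to see the effect of the fix in isolation, here is a small userland sketch. It assumes the boot-time warning is triggered simply by an empty name string, which is a guess based on the message text and not taken from the kernel source. The struct, the name length, and the sk_* function names are hypothetical, and snprintf() is used instead of strlcpy() only so that the sketch builds on systems whose libc lacks strlcpy().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SK_CN_NAMELEN 16	/* hypothetical size, for this sketch only */

/* Hypothetical stand-in for the console device structure. */
struct sk_consdev {
	int pri;
	char name[SK_CN_NAMELEN];
};

/* Probe routine as it was before the fix: priority only, no name. */
void
sk_probe_unnamed(struct sk_consdev *cp)
{
	assert(cp != NULL);
	cp->pri = 1;
}

/* Probe routine with the fix: also fills in the name field. */
void
sk_probe_named(struct sk_consdev *cp)
{
	assert(cp != NULL);
	cp->pri = 1;
	snprintf(cp->name, sizeof cp->name, "%s", "ucom");
}

/* Returns 1 and prints a warning if the console has an empty name. */
int
sk_console_check(const struct sk_consdev *cp)
{
	assert(cp != NULL);
	if (cp->name[0] == '\0') {
		fprintf(stderr, "WARNING: console at %p has no name\n",
		    (const void *)cp);
		return (1);
	}
	return (0);
}
```

With the unnamed probe the check complains; with the named probe it stays quiet, which matches what the one-line patch is meant to achieve.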


hostapd "deauthenticated due to local deauth request"

2009-10-31 Thread Ivan Voras
I'm trying to set up an AP with a rum0 interface on the latest 8-STABLE, but
apparently 802.11 association fails:

Oct 31 16:21:30 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
802.11: associated
Oct 31 16:21:33 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
802.11: deauthenticated due to local deauth request
Oct 31 16:21:33 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
802.11: deassociated
Oct 31 16:21:35 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
802.11: associated
Oct 31 16:21:38 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
802.11: deauthenticated due to local deauth request
Oct 31 16:21:38 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
802.11: deassociated

etc. Apparently the client never reaches the phase where it receives a DHCP
address.

The client in this case is WinXP and the setup did work with 7-STABLE,
though with a bug in the rum driver which caused regular kernel panics
on the AP.

The devices are configured like this:

rum0: flags=8843 metric 0 mtu 2290
ether 00:1c:f0:9d:08:b3
media: IEEE 802.11 Wireless Ethernet autoselect mode 11g 
status: running
wlan0: flags=8943
metric 0 mtu 1500
ether 00:1c:f0:9d:08:b3
inet6 fe80::21c:f0ff:fe9d:8b3%wlan0 prefixlen 64 scopeid 0x6
inet 10.0.0.3 netmask 0xff00 broadcast 10.0.0.255
media: IEEE 802.11 Wireless Ethernet autoselect mode 11g 
status: running
ssid Cosmos channel 1 (2412 Mhz 11g) bssid 00:1c:f0:9d:08:b3
country US authmode WPA privacy MIXED deftxkey 3 TKIP 2:128-bit
TKIP 3:128-bit txpower 0 scanvalid 60 protmode CTS dtimperiod 1 -dfs

hostapd.conf contains:

interface=wlan0
debug=3
ctrl_interface=/var/run/hostapd
ctrl_interface_group=wheel
ssid=Cosmos
wpa=1
wpa_passphrase=something
wpa_key_mgmt=WPA-PSK
wpa_pairwise=TKIP

Any ideas?


Re: Console has no name?

2009-10-31 Thread Hans Petter Selasky
Hi,

Your patch has been committed to USB P4:


http://p4web.freebsd.org/chv.cgi?CH=170003


Thanks for reporting!

--HPS



Re: Console has no name?

2009-10-31 Thread Hans Petter Selasky
On Saturday 31 October 2009 15:53:53 Dimitry Andric wrote:
> On 2009-10-31 14:14, Dimitry Andric wrote:
> > Somewhere between r198312 and r198702 I started getting this message
> > during boot, on a machine using serial console:
> >
> > WARNING: console at 0xc093cea0 has no name
>
> Okay, this seems to be caused by r198655, which is an MFC of r197570,
> where experimental support for USB serial console was added.
>
> However, the consdev::cn_name field is never initialized in
> usb_serial.c, whereas this does happen in other console drivers.
>
> There doesn't seem to be a consensus whether this field needs to be
> initialized in the _cnprobe() or _cninit() functions: I count 9 instances
> of initialization in the former, and 3 in the latter.  Is there a
> preferred way?
>
> Since _cnprobe seems more popular, I propose the following fix:
>
> Index: sys/dev/usb/serial/usb_serial.c
> ===
> --- sys/dev/usb/serial/usb_serial.c   (revision 198702)
> +++ sys/dev/usb/serial/usb_serial.c   (working copy)
> @@ -1301,6 +1301,7 @@ static void
>  ucom_cnprobe(struct consdev  *cp)
>  {
>   cp->cn_pri = CN_NORMAL;
> + strlcpy(cp->cn_name, "ucom", sizeof cp->cn_name);
>  }
>
>  static void

Patch looks fine by me.

--HPS


Re: hostapd "deauthenticated due to local deauth request"

2009-10-31 Thread Paul B Mahol
On 10/31/09, Ivan Voras wrote:
> I'm trying to set up an AP with a rum0 interface on the latest 8-STABLE, but
> apparently 802.11 association fails:
>
> Oct 31 16:21:30 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
> 802.11: associated
> Oct 31 16:21:33 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
> 802.11: deauthenticated due to local deauth request
> Oct 31 16:21:33 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
> 802.11: deassociated
> Oct 31 16:21:35 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
> 802.11: associated
> Oct 31 16:21:38 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
> 802.11: deauthenticated due to local deauth request
> Oct 31 16:21:38 ursaminor hostapd: wlan0: STA 00:22:69:07:30:9e IEEE
> 802.11: deassociated
>
> etc. Apparently the client never reaches the phase where it receives a DHCP
> address.
>
> The client in this case is WinXP and the setup did work with 7-STABLE,
> though with a bug in the rum driver which caused regular kernel panics
> on the AP.
>

I tried the same thing with rum(4) as the AP and ndis(4) as the client on
the same machine (some version of 8.0-CURRENT). The ndis client (configured
via wpa_supplicant) kept authenticating and deauthenticating all the time.
I came to the conclusion that rum is broken. But I think I remember that
bwi(4) (as a client) did not have this problem ... (I will test again to see.)


CVE-2009-0689

2009-10-31 Thread Oliver Pinter
Hello list!

I have a small question about CVE-2009-0689[1]:
Did commit r197059 fix this bug or not?

On security lists it is marked as high priority, and it was found in June.

some info:
[1] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-0689
[2] http://securityreason.com/achievement_securityalert/63
[3] http://seclists.org/fulldisclosure/2009/Oct/357


Re: 7.2 Stable Crash - possibly related to if_re

2009-10-31 Thread Pyun YongHyeon
On Fri, Oct 30, 2009 at 06:23:51PM -0700, Norbert Papke wrote:
> On October 30, 2009, Pyun YongHyeon wrote:
> > On Thu, Oct 29, 2009 at 09:56:19PM -0700, Norbert Papke wrote:
> > > This occurred shortly after "scp"ing from a VirtualBox VM to the host. 
> > > The file transfer got stuck.  The "re" interface stopped working. 
> > > Shortly afterwards, the host crashed.  The "re" interface was used by the
> > > host, the guest was using a different NIC in bridged mode.
> > >
> > >
> > > FreeBSD proven.lan 7.2-STABLE FreeBSD 7.2-STABLE #5 r198666: Thu Oct 29
> > > 18:36:57 PDT 2009
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 0; apic id = 00
> > > fault virtual address   = 0x18
> >
> > It looks like a NULL pointer dereference, possibly an mbuf-related
> > one.
> >
> > > fault code  = supervisor write data, page not present
> > > instruction pointer = 0x8:0x80d476ee
> > > stack pointer   = 0x10:0xff878ae0
> > > frame pointer   = 0x10:0xff878b40
> > > code segment= base 0x0, limit 0xf, type 0x1b
> > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > processor eflags= interrupt enabled, resume, IOPL = 0
> > > current process = 18 (swi5: +)
> > > Physical memory: 8177 MB
> > >
> > >
> 
> > > #9  0x807e710e in calltrap ()
> > > at /usr/public/freebsd/sources/stable/sys/amd64/amd64/exception.S:218
> > > #10 0x80d476ee in re_rxeof () from /boot/kernel/if_re.ko
> >
> > Hmm, I think there is some missing information here. Not sure where it
> > dereferenced a NULL pointer in re_rxeof().
> 
> >> #11 0x80d4a481 in re_int_task (arg=Variable "arg" is not available.
> >> )
> >>     at /usr/public/freebsd/sources/stable/sys/modules/re/../../dev/re/if_re.c:2191
> 
> I am not sure how much I trust frame 10.  The instruction 
> at "0x80d476ee" is the one after the "retq" from re_rxeof().  Frame 
> 11 seems OK to me.  The "struct rl_softc*", in particular, looks plausible 
> but I don't know enough to say for sure.
> 
> > Because this is the first report of a NULL pointer dereference in
> > the Rx handler, I need more information on how to reproduce it with a
> > minimal configuration. Can you also reproduce the issue without
> > VirtualBox?
> 
> I am trying but no luck so far.
> 
> > By chance, did you stop the re0 interface with ifconfig when you
> > noticed the file transfer got stuck?
> 
> It is possible.  I had it happen twice.  The first time I definitely tried 
> to "down" re.  I cannot recall what I did the second time.  The crash dump is 
> from the second time.
> 

OK, then would you try the attached patch?
Index: sys/dev/re/if_re.c
===
--- sys/dev/re/if_re.c	(revision 198686)
+++ sys/dev/re/if_re.c	(working copy)
@@ -1817,6 +1817,8 @@
 
 	for (i = sc->rl_ldata.rl_rx_prodidx; maxpkt > 0;
 	i = RL_RX_DESC_NXT(sc, i)) {
+		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
+			break;
 		cur_rx = &sc->rl_ldata.rl_rx_list[i];
 		rxstat = le32toh(cur_rx->rl_cmdstat);
 		if ((rxstat & RL_RDESC_STAT_OWN) != 0)
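
The pattern in the patch above, rechecking the "running" flag on every pass through the receive loop, can be sketched in userland like this. It is only an illustration under stated assumptions: the ring layout, the sk_* names, and the callback that simulates the interface being brought down while a packet is handled are all made up, and none of this is the if_re.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_RING_SIZE 8	/* hypothetical ring size */

struct sk_desc {
	bool owned_by_hw;	/* true: nothing for the driver to read yet */
	int len;
};

struct sk_softc {
	bool running;		/* cleared when the interface is stopped */
	size_t prod;		/* next descriptor to examine */
	struct sk_desc ring[SK_RING_SIZE];
};

typedef void (*sk_input_fn)(struct sk_softc *, const struct sk_desc *);

/* Example callback: the interface is stopped while handling a packet. */
void
sk_input_then_stop(struct sk_softc *sc, const struct sk_desc *d)
{
	(void)d;
	sc->running = false;
}

/* Process up to maxpkt received descriptors; returns how many were handled. */
int
sk_rxeof(struct sk_softc *sc, int maxpkt, sk_input_fn input)
{
	int handled = 0;
	size_t i;

	assert(sc != NULL && input != NULL);
	for (i = sc->prod; maxpkt > 0; i = (i + 1) % SK_RING_SIZE) {
		/* Stop at once if the interface went down in the meantime. */
		if (!sc->running)
			break;
		struct sk_desc *cur = &sc->ring[i];
		if (cur->owned_by_hw)
			break;
		input(sc, cur);
		cur->owned_by_hw = true;	/* hand descriptor back */
		handled++;
		maxpkt--;
	}
	sc->prod = i;
	return (handled);
}
```

Without the flag check, the loop would keep walking descriptors after the stop path may already have torn down the ring, which is the kind of situation the patch tries to rule out.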