Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-14 Thread Robert N. M. Watson

On 13 Oct 2010, at 18:46, Ryan Stone wrote:

> On Fri, Oct 8, 2010 at 9:15 PM, Robert Watson wrote:
>> +   /*
>> +    * get and fill a header mbuf, then chain data as an extended
>> +    * mbuf.
>> +    */
>> +   MGETHDR(m, M_DONTWAIT, MT_DATA);
>> 
>> The idea of calling into the mbuf allocator in this context is just freaky,
>> and may have some truly awful side effects.  I suppose this is the cost of
>> trying to combine code paths in the network device driver rather than have
>> an independent path in the netdump case, but it's quite unfortunate and will
>> significantly reduce the robustness of netdumps in the face of, for example,
>> mbuf starvation.
> 
> Changing this will require very invasive changes to the network
> drivers.  I know that the Intel drivers allocate their own mbufs for
> their receive rings and I imagine that all other drivers have to do
> something similar.  Plus the drivers are responsible for freeing mbufs
> after they have been transmitted.  It seems to me that the cost of
> making significant changes to the network drivers to support an
> alternate lifecycle for netdump mbufs far outweighs the cost of losing
> a couple of kernel dumps in extreme circumstances.

My concern is less about occasional lost dumps than destabilising the dumping
process: calls into the memory allocator can currently trigger a lot of 
interesting behaviours, such as further calls back into the VM system, which 
can then trigger calls into other subsystems. What I'm suggesting is that if we 
want the mbuf allocator to be useful in this context, we need to teach it about 
things not to do in the dumping / crash / ... context, which probably means 
helping uma out a bit in that regard. And a watchdog to make sure the dump is 
making progress.
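
For illustration only (this is not from the patch): a minimal user-space sketch of the watchdog idea, assuming the dump loop can report a running byte count and read a monotonic tick counter. The names `dump_watchdog`, `dump_watchdog_init` and `dump_watchdog_expired` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical progress watchdog for a dump loop (all names illustrative). */
struct dump_watchdog {
	uint64_t last_bytes;	/* byte count when progress was last seen */
	uint64_t last_tick;	/* tick at which progress was last seen */
	uint64_t timeout;	/* ticks allowed without any progress */
};

void
dump_watchdog_init(struct dump_watchdog *wd, uint64_t now, uint64_t timeout)
{
	assert(wd != NULL && timeout > 0);
	wd->last_bytes = 0;
	wd->last_tick = now;
	wd->timeout = timeout;
}

/*
 * Called periodically from the dump loop. Returns true when the byte
 * count has not changed for at least 'timeout' ticks, i.e. the dump
 * appears stuck and should be abandoned.
 */
bool
dump_watchdog_expired(struct dump_watchdog *wd, uint64_t bytes_written,
    uint64_t now)
{
	assert(wd != NULL);
	if (bytes_written != wd->last_bytes) {
		/* Progress was made: remember it and restart the timer. */
		wd->last_bytes = bytes_written;
		wd->last_tick = now;
		return (false);
	}
	return (now - wd->last_tick >= wd->timeout);
}
```

A caller would check `dump_watchdog_expired()` once per loop iteration and give up on the dump (for example, proceed to reset) when it returns true. Nothing here is netdump-specific; the same check would fit a disk dump loop.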

Robert
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: strange resolver behavour

2010-10-14 Thread Eugene Grosbein
On 14.10.2010 13:43, Doug Barton wrote:

>> Hopefully it does not find any, but what if such names existed
>> and had MX records? host(1) would lie to me.
> 
> No, it would act the way it's supposed to.

Is host(1) supposed to do lookups using suffixes from /etc/resolv.conf
for an FQDN with a dot at the end?

$ host koin-nkz.com.
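
For illustration only: a minimal sketch of the usual convention behind the question, under which a trailing dot marks a name as absolute and the search list is skipped. The helper names are hypothetical and this is not host(1) source code; whether host(1) follows the convention in this case is exactly what is being asked.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if 'name' is written as absolute, i.e. ends with a dot. */
bool
name_is_absolute(const char *name)
{
	size_t len;

	assert(name != NULL);
	len = strlen(name);
	return (len > 0 && name[len - 1] == '.');
}

/*
 * Hypothetical policy check: search suffixes are candidates only for
 * names that are not written as absolute.
 */
bool
should_apply_search_list(const char *name)
{
	return (!name_is_absolute(name));
}
```

Under this convention `koin-nkz.com.` would be looked up exactly as written, while `koin-nkz.com` could additionally be tried with each search suffix appended.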

Eugene Grosbein


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-14 Thread Attilio Rao
2010/10/14 Robert N. M. Watson :
>
> On 13 Oct 2010, at 18:46, Ryan Stone wrote:
>
>> On Fri, Oct 8, 2010 at 9:15 PM, Robert Watson wrote:
>>> +               /*
>>> +                * get and fill a header mbuf, then chain data as an extended
>>> +                * mbuf.
>>> +                */
>>> +               MGETHDR(m, M_DONTWAIT, MT_DATA);
>>>
>>> The idea of calling into the mbuf allocator in this context is just freaky,
>>> and may have some truly awful side effects.  I suppose this is the cost of
>>> trying to combine code paths in the network device driver rather than have
>>> an independent path in the netdump case, but it's quite unfortunate and will
>>> significantly reduce the robustness of netdumps in the face of, for example,
>>> mbuf starvation.
>>
>> Changing this will require very invasive changes to the network
>> drivers.  I know that the Intel drivers allocate their own mbufs for
>> their receive rings and I imagine that all other drivers have to do
>> something similar.  Plus the drivers are responsible for freeing mbufs
>> after they have been transmitted.  It seems to me that the cost of
>> making significant changes to the network drivers to support an
>> alternate lifecycle for netdump mbufs far outweighs the cost of losing
>> a couple of kernel dumps in extreme circumstances.
>
> My concern is less about occasional lost dumps than destabilising the dumping
> process: calls into the memory allocator can currently trigger a lot of 
> interesting behaviours, such as further calls back into the VM system, which 
> can then trigger calls into other subsystems. What I'm suggesting is that if 
> we want the mbuf allocator to be useful in this context, we need to teach it 
> about things not to do in the dumping / crash / ... context, which probably 
> means helping uma out a bit in that regard. And a watchdog to make sure the 
> dump is making progress.

I think that this would be way too complicated just to cope with a panic
within the VM/UMA (I'm not sure which other subsystems you mean would
be called). Besides, if we have a panic in the VM, I'm
sure that normal dumps could also be affected.
With netdump, I'm not trying to fix all the bugs in
our dumping infrastructure because, as we already
discussed, we know there are quite a few of them; I'm only trying
to be no more fragile than what we have today.
And again, while I think the "watchdog" idea is good, I think it
applies to normal dumps too; it is not specific to netdump.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-14 Thread Robert N. M. Watson

On 14 Oct 2010, at 15:10, Attilio Rao wrote:

>> My concern is less about occasional lost dumps than destabilising the
>> dumping process: calls into the memory allocator can currently trigger a lot 
>> of interesting behaviours, such as further calls back into the VM system, 
>> which can then trigger calls into other subsystems. What I'm suggesting is 
>> that if we want the mbuf allocator to be useful in this context, we need to 
>> teach it about things not to do in the dumping / crash / ... context, which 
>> probably means helping uma out a bit in that regard. And a watchdog to make 
>> sure the dump is making progress.
> 
> I think that this would be way too complicated just to cope with a panic
> within the VM/UMA (I'm not sure which other subsystems you mean would
> be called). Besides, if we have a panic in the VM, I'm
> sure that normal dumps could also be affected.
> With netdump, I'm not trying to fix all the bugs in
> our dumping infrastructure because, as we already
> discussed, we know there are quite a few of them; I'm only trying
> to be no more fragile than what we have today.
> And again, while I think the "watchdog" idea is good, I think it
> applies to normal dumps too; it is not specific to netdump.

No, what I'm saying is: UMA needs to not call its drain handlers, and ideally 
not call into VM to fill slabs, from the dumping context. That's easy to 
implement and will cause the dump to fail rather than causing the system to 
hang.
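
For illustration only: a minimal user-space model of the behaviour described above, not UMA code. It assumes a global flag that the crash path sets before dumping. `toy_zone`, `toy_zone_alloc`, `toy_zone_refill`, `in_dump_context` and `slow_path_calls` are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_ITEMS 4
#define ITEM_SIZE   256

/* Hypothetical stand-in for "the system is dumping"; set by the crash path. */
bool in_dump_context = false;

/* Counts how often the slow path was entered (for illustration only). */
unsigned slow_path_calls = 0;

/* A toy zone: a fixed cache of preallocated items and their in-use flags. */
struct toy_zone {
	unsigned char items[CACHE_ITEMS][ITEM_SIZE];
	bool used[CACHE_ITEMS];
};

/* Stand-in for draining caches and asking the VM for a new slab. */
void *
toy_zone_refill(struct toy_zone *z)
{
	(void)z;
	slow_path_calls++;
	return (NULL);		/* the toy model never grows */
}

void *
toy_zone_alloc(struct toy_zone *z)
{
	assert(z != NULL);
	/* Fast path: hand out an item that is already cached. */
	for (size_t i = 0; i < CACHE_ITEMS; i++) {
		if (!z->used[i]) {
			z->used[i] = true;
			return (z->items[i]);
		}
	}
	/*
	 * Cache exhausted. While dumping, fail at once instead of
	 * entering the slow path, which may call into other subsystems.
	 */
	if (in_dump_context)
		return (NULL);
	return (toy_zone_refill(z));
}
```

With the flag set, an exhausted cache makes the allocation fail, so the dump fails cleanly; the slow path is never entered and so cannot hang the dump.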

Robert


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-14 Thread Attilio Rao
2010/10/14 Robert N. M. Watson :
>
> On 14 Oct 2010, at 15:10, Attilio Rao wrote:
>
>>> My concern is less about occasional lost dumps than destabilising the
>>> dumping process: calls into the memory allocator can currently trigger a 
>>> lot of interesting behaviours, such as further calls back into the VM 
>>> system, which can then trigger calls into other subsystems. What I'm 
>>> suggesting is that if we want the mbuf allocator to be useful in this 
>>> context, we need to teach it about things not to do in the dumping / crash 
>>> / ... context, which probably means helping uma out a bit in that regard. 
>>> And a watchdog to make sure the dump is making progress.
>>
>> I think that this would be way too complicated just to cope with a panic
>> within the VM/UMA (I'm not sure which other subsystems you mean would
>> be called). Besides, if we have a panic in the VM, I'm
>> sure that normal dumps could also be affected.
>> With netdump, I'm not trying to fix all the bugs in
>> our dumping infrastructure because, as we already
>> discussed, we know there are quite a few of them; I'm only trying
>> to be no more fragile than what we have today.
>> And again, while I think the "watchdog" idea is good, I think it
>> applies to normal dumps too; it is not specific to netdump.
>
> No, what I'm saying is: UMA needs to not call its drain handlers, and ideally 
> not call into VM to fill slabs, from the dumping context. That's easy to 
> implement and will cause the dump to fail rather than causing the system to 
> hang.

OK.
My point, however, is still the same: that should happen not just for
the netdump-specific case but for all the dumping/KDB/panic cases (I
know it is unlikely that current non-netdump code calls into UMA, but
that is not an established prerequisite, and some added code may still
do so).
I still see this as a weakness in the infrastructure, independent
of netdump. I can see that your point is that it is vital to netdump's
correct behaviour, though, so I wonder whether it is worth fixing now or
later.

Comments from more people would be appreciated.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


problem with net subsystem

2010-10-14 Thread Eugen Konkov
Hi

#top -SI
last pid: 67653;  load averages:  1.23,  1.12,  1.24    up 1+09:46:14  20:22:34
94 processes:  3 running, 76 sleeping, 15 waiting
CPU:  0.0% user,  0.0% nice, 33.3% system, 66.7% interrupt,  0.0% idle
Mem: 54M Active, 226M Inact, 103M Wired, 59M Buf, 103M Free
Swap: 2048M Total, 2048M Free

  PID USERNAME THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   13 root       1 106    -     0K     8K RUN    251:36 49.02% ng_queue
   11 root       1 171 ki31     0K     8K RUN     21.5H 35.60% idle
   12 root      15 -44    -     0K   120K WAIT    31:04 12.06% intr
    0 root       9 -68    0     0K    64K -       18:13  0.49% kernel
67618 root       1  44    0 10020K  2312K ttyin    0:00  0.34% systat
18710 root       2  44    0 84660K 62868K select  16:11  0.29% mpd5
67653 root       1  44    0  9944K  2020K RUN      0:00  0.15% top

There are no queues or pipes in ipfw.


#systat -v
[systat -v screen: only the field labels survived in the archive; no values are recoverable]


# vmstat -i
interrupt                          total       rate
irq1: atkbd0                        1203          0
irq14: ata0                       139502          1
irq15: ata1                           35          0
irq19: sis0                            0          0
cpu0:timer                     258485585       2124
Total                          783059828       6436

# ifconfig sis0
sis0: flags=8843 metric 0 mtu 1500
options=82048
ether 00:0b:6a:a6:0c:f0
inet 10.11.8.18 netmask 0xff00 broadcast 10.11.8.255
inet6 fe80::20b:6aff:fea6:cf0%sis0 prefixlen 64 scopeid 0x1
inet R.E.A.L netmask 0xfffc broadcast X.X.X.X
nd6 options=29
media: Ethernet autoselect (100baseTX )
status: active

#netstat -s
tcp:
205416 packets sent
124715 data packets (8022081 bytes)
1425 data packets (99472 bytes) retransmitted
38 data packets unnecessarily retransmitted
0 resends initiated by MTU discovery
75509 ack-only packets (64514 delayed)
0 URG only packets
0 window probe packets
0 window update packets
3824 control packets
1653273 packets received
115656 acks (for 8017539 bytes)
1967 duplicate acks
0 acks for unsent data
104386 packets (4886466 bytes) received in-sequence
62 completely duplicate packets (1512 bytes)
0 old duplicate packets
0 packets with some dup. data (0 bytes duped)
4 out-of-order packets (112 bytes)
0 packets (0 bytes) of data after window
0 window probes
3 window update packets
1 packet received after close
4 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
0 discarded due to memory problems
1441 connection requests
1091 connection accepts
0 bad connection attempts
6726 listen queue overflows
25 ignored RSTs in the windows
2530 connections established (including accepts)
111289 connections closed (including 214 drops)
2185 connections updated cached RTT on close
2267 connections updated cached RTT variance on close
1516 connections updated cached ssthresh on close
2 embryonic connections dropped
107299 segments updated rtt (of 107034 attempts)
14

Re: strange resolver behavour

2010-10-14 Thread Doug Barton

On 10/14/2010 2:43 AM, Eugene Grosbein wrote:

> Is host(1) supposed to do lookups using suffixes from /etc/resolv.conf
> for an FQDN with a dot at the end?


... if only there were a document of some kind that described how the 
tool was supposed to work ... something like a manual ...



:)

Doug

--

Breadth of IT experience, and|   Nothin' ever doesn't change,
depth of knowledge in the DNS.   |   but nothin' changes much.
Yours for the right price.  :)   |  -- OK Go
http://SupersetSolutions.com/


Re: problem with net subsystem

2010-10-14 Thread Коньков Евгений
Hello, Eugen.

You wrote on 14 October 2010, 20:44:50:

EK> Hi

EK> #top -SI
EK> last pid: 67653;  load averages:  1.23,  1.12,  1.24    up 1+09:46:14  20:22:34
EK> 94 processes:  3 running, 76 sleeping, 15 waiting
EK> CPU:  0.0% user,  0.0% nice, 33.3% system, 66.7% interrupt,  0.0% idle
EK> Mem: 54M Active, 226M Inact, 103M Wired, 59M Buf, 103M Free
EK> Swap: 2048M Total, 2048M Free

EK>   PID USERNAME THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
EK>    13 root       1 106    -     0K     8K RUN    251:36 49.02% ng_queue
EK>    11 root       1 171 ki31     0K     8K RUN     21.5H 35.60% idle
EK>    12 root      15 -44    -     0K   120K WAIT    31:04 12.06% intr
EK>     0 root       9 -68    0     0K    64K -       18:13  0.49% kernel
EK> 67618 root       1  44    0 10020K  2312K ttyin    0:00  0.34% systat
EK> 18710 root       2  44    0 84660K 62868K select  16:11  0.29% mpd5
EK> 67653 root       1  44    0  9944K  2020K RUN      0:00  0.15% top

EK> There are no queues or pipes in ipfw.


EK> #systat -v
EK> [systat -v screen: only the field labels survived in the archive; no values are recoverable]


EK> # vmstat -i
EK> interrupt                          total       rate
EK> irq1: atkbd0                        1203          0
EK> irq14: ata0                       139502          1
EK> irq15: ata1                           35          0
EK> irq19: sis0                            0          0
EK> cpu0:timer                     258485585       2124
EK> Total                          783059828       6436

EK> # ifconfig sis0
EK> sis0: flags=8843 metric 0 mtu 1500
EK> options=82048
EK> ether 00:0b:6a:a6:0c:f0
EK> inet 10.11.8.18 netmask 0xff00 broadcast 10.11.8.255
EK> inet6 fe80::20b:6aff:fea6:cf0%sis0 prefixlen 64 scopeid 0x1
EK> inet R.E.A.L netmask 0xfffc broadcast X.X.X.X
EK> nd6 options=29
EK> media: Ethernet autoselect (100baseTX )
EK> status: active

EK> #netstat -s
EK> tcp:
EK> 205416 packets sent
EK> 124715 data packets (8022081 bytes)
EK> 1425 data packets (99472 bytes) retransmitted
EK> 38 data packets unnecessarily retransmitted
EK> 0 resends initiated by MTU discovery
EK> 75509 ack-only packets (64514 delayed)
EK> 0 URG only packets
EK> 0 window probe packets
EK> 0 window update packets
EK> 3824 control packets
EK> 1653273 packets received
EK> 115656 acks (for 8017539 bytes)
EK> 1967 duplicate acks
EK> 0 acks for unsent data
EK> 104386 packets (4886466 bytes) received in-sequence
EK> 62 completely duplicate packets (1512 bytes)
EK> 0 old duplicate packets
EK> 0 packets with some dup. data (0 bytes duped)
EK> 4 out-of-order packets (112 bytes)
EK> 0 packets (0 bytes) of data after window
EK> 0 window probes
EK> 3 window update packets
EK> 1 packet received after close
EK> 4 discarded for bad checksums
EK> 0 discarded for bad header offset fields
EK> 0 discarded because packet too short
EK> 0 discarded due to memory problems
EK> 1441 connection requests
EK> 1091 connection accepts
EK> 0 bad connection attempts
EK> 6726 listen queue overflows
EK> 25 ignored RSTs in the windows
E