from:"Bruce M. Simpson"

Re: Route messages

2008-06-15 Thread Bruce M. Simpson


Paul wrote:

Get these with GRE tunnel on
FreeBSD 7.0-STABLE FreeBSD 7.0-STABLE #5: Sun May 11 19:00:57 EDT 
2008 :/usr/obj/usr/src/sys/ROUTER  amd64

But do not get them with 7.0-RELEASE

Any ideas what changed? :)  Wish there was some sort of changelog..
# of messages per second seems consistent with packets per second on 
GRE interface..
No impact in routing, but definitely impact in cpu usage for all 
processes monitoring the route messages.


RTM_MISS is actually fairly common when you don't have a default route.

Messages which get enqueued don't necessarily get delivered -- and very 
few processes actually listen to the routing socket actively like this, 
so I wouldn't worry about it.


If it's a real concern for you then you could try hacking in a sysctl to 
tell the radix trie code not to issue RTM_MISS messages on the routing 
socket.
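
For a listener that is hurt by the message volume, the cheap userland-side mitigation is to discard RTM_MISS as early as possible. A minimal sketch, assuming a buffer holding one or more routing messages already read from a PF_ROUTE socket. `struct rtm_prefix` is a stand-in that mirrors only the leading fields of struct rt_msghdr, and `RTM_MISS_TYPE` hard-codes the RTM_MISS value from FreeBSD's <net/route.h> so that the sketch builds anywhere; real code would use the system header instead. `count_interesting` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the leading fields of struct rt_msghdr (assumption:
 * real code includes <net/route.h> instead of declaring this). */
struct rtm_prefix {
    unsigned short msglen;   /* length of the whole message */
    unsigned char  version;
    unsigned char  type;     /* RTM_ADD, RTM_MISS, ... */
};

#define RTM_MISS_TYPE 0x7    /* RTM_MISS in FreeBSD's <net/route.h> */

/* Walk a buffer of back-to-back routing messages and count those that
 * are not RTM_MISS, i.e. the ones a monitor would actually process. */
size_t
count_interesting(const unsigned char *buf, size_t len)
{
    size_t off = 0, n = 0;
    struct rtm_prefix h;

    assert(buf != NULL);
    while (len - off >= sizeof(h)) {
        memcpy(&h, buf + off, sizeof(h));   /* avoid unaligned access */
        if (h.msglen < sizeof(h) || h.msglen > len - off)
            break;                          /* truncated or corrupt */
        if (h.type != RTM_MISS_TYPE)
            n++;
        off += h.msglen;
    }
    return n;
}
```

This only reduces the per-message work in the listener; the messages are still generated and enqueued by the kernel.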


___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Re: HEAD UP: non-MPSAFE network drivers to be disabled (was: 8.0 network stack MPsafety goals (fwd))

2008-07-01 Thread Bruce M. Simpson


Robert Watson wrote:


An FYI on the state of things here: in the last month, John has 
updated a number of device drivers to be MPSAFE, and the USB work 
remains in-flight. I'm holding fire a bit on disabling IFF_NEEDSGIANT 
while things settle and I catch up on driver state, and will likely 
send out an update next week regarding which device drivers remain on 
the kill list, and generally what the status of this project is.


Goliath needs to get stoned, it's been a major hurdle in doing 
IGMPv3/SSM because of the locking fandango. I look forward to it.


[For those who ask, what the hell? IGMPv3 potentially makes your 
wireless multicast better with or without little things like SSM, 
because of protocol robustness, compact state-changes, and the use of a 
single link-local IPv4 group for state-change reports, making it easier 
for your switches to actually do their job.]



Re: BPF problems on FreeBSD 7.0

2008-07-14 Thread Bruce M. Simpson


Robin Sommer wrote:

Hi all,

we're seeing some strange effects with our libpcap-based application
(the Bro network intrusion detection system) on a FreeBSD 7-RELEASE
system. As the application has always been running fine on 6.x,
we're wondering whether this might be triggered by any of the
changes that went into 7.
  

...


I'm wondering whether anybody here has seen something similar or
might have an idea where to start looking for the cause. Any ideas?
  


One place to start might be the output of netstat -B in 7.x (I *think* this 
got MFCed). This will let us see what the drop count is for the Bro 
process, and what flags are set on the open BPF descriptors in the system.


I'm not hot on current BPF internals, but I hazard a guess this is 
related to BPF descriptor buffering -- an area where there have been 
changes, some of which I've eyeballed.


cheers
BMS



Re: Small patch to multicast code...

2008-08-21 Thread Bruce M. Simpson


[EMAIL PROTECTED] wrote:

The only thing i can think of is that it's the UDP checksum,
residing beyond hlen, which is overwritten somewhere in the
call to if_simloop -- in which case perhaps a better fix is
to m_pullup() the udp header as well ?



It is the checksum that gets trashed, yes.
...
The m_*() routines actually have reasonable comments, it just seems
the wrong one was used here.
  


Actually, m_copy() has been legacy for some time now -- see comments.

I'd be concerned that the change to m_dup() (which makes a full mbuf 
chain copy) rather than m_copym() (which bumps refcounts) is going to 
eat into the mbuf clusters on fast links, though it's an easy band-aid 
for the problem.
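
The trade-off can be shown with a toy model. This is not the mbuf(9) API: `struct buf`, `buf_share` and `buf_dup` are made-up names, and the sketch only assumes a payload shared through a reference count. A shared copy costs almost nothing but must be treated as read-only; a duplicate is safe to write to but costs an allocation and a copy per packet.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a buffer with shared external storage. */
struct buf {
    unsigned char *data;   /* payload, possibly shared */
    int           *refs;   /* reference count shared by all sharers */
    size_t         len;
};

/* In the spirit of m_copym(): share the payload, bump the refcount. */
struct buf
buf_share(struct buf *b)
{
    (*b->refs)++;
    return *b;
}

/* In the spirit of m_dup(): allocate and copy, so the result is
 * privately writable. */
struct buf
buf_dup(const struct buf *b)
{
    struct buf n;

    n.data = malloc(b->len);
    n.refs = malloc(sizeof(*n.refs));
    assert(n.data != NULL && n.refs != NULL);
    memcpy(n.data, b->data, b->len);
    *n.refs = 1;
    n.len = b->len;
    return n;
}
```

Writing through the result of `buf_share` changes the original payload as well, which is the kind of stomping that trashes a checksum field; writing through the result of `buf_dup` does not.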


I agree with Luigi that some of the API contract for mbuf(9) doesn't 
hold any more now that we have TSO and other offload.


cheers
BMS

Re: Small patch to multicast code...

2008-08-22 Thread Bruce M. Simpson


[EMAIL PROTECTED] wrote:

I gather you mean that a fast link on which also we're looping back
the packet will be an issue?  Since this packet is only going into the
simloop() routine.
  


We end up calling if_simloop() from a few "interesting" places, in 
particular the kernel PIM packet handler.


In this particular case we're going to take a full mbuf chain copy every 
time we send a packet which needs to be looped back to userland.


  
I was actually hoping, as the person who last hacked this code, that
you might have a suggestion as to a "right" fix.  
  


It's been a while since I've done any in-depth FreeBSD work other than 
hacking on the IGMPv3 snap, and my time is largely tied up with other 
work these days, sadly.


It doesn't seem right to my mind that we need to make a full copy of an 
mbuf chain with m_dup() to work around this kind of problem.


Whilst it may suffice for a band-aid workaround, we may see mbuf pool 
fragmentation as packet rates go up.


However we are now in a "new world order" where mbuf chains may be very 
tied to the device where they've originated or to where they're going. 
It isn't clear to me where this kind of intrusion is happening.


In the case of ip_mloopback(), somehow we are stomping on a read-only 
copy of an mbuf chain. The use of m_copy() with m_pullup() there is fine 
according to the documented uses of mbuf(9), although as Luigi pointed 
out, most likely we need to look at the upper-layer protocol too, e.g. 
where UDP checksums are also being offloaded.


Some of the code in the IGMPv3 branch actually reworks how loopback 
happens i.e. the preference is not to loop back wherever possible 
because of the locking implications. Check the bms_netdev branch history 
for more info.


cheers
BMS

Re: Small patch to multicast code...

2008-08-22 Thread Bruce M. Simpson


[EMAIL PROTECTED] wrote:

Somehow the data that the device needs to do the proper checksum
offload is getting trashed here.  Now, since it's clear we need a
writable packet structure so that we don't trash the original, I'm
wondering if the m_pullup() will be sufficient.
  


If it's serious enough to break UDP checksumming on the wire, perhaps we 
should just swallow the mbuf allocator heap churn and do the m_dup() for 
now, but slap in a big comment about why it's there.


BMS

Re: Code review request

2008-08-24 Thread Bruce M. Simpson


M. Warner Losh wrote:

I've been shepherding this patch in my p4 tree for a long time.  It
removes the obsolete support for other systems in if_spppsubr.c.  Is
there a reason I shouldn't commit this?
  


Looks fine to me.

Re: [CFT/R] IPv4 source address selection

2008-08-24 Thread Bruce M. Simpson


Bjoern A. Zeeb wrote:

Hi,

I have a patch, that was inspired by work from Y!, to do proper
IPv4 source address selection for unbound sockets (with multi-IP
jails).


Hi,

This kinda overlaps with some other ideas I'd like to see go in. It 
looks good and if it's already been tested, it should probably go in 
anyway as it disentangles the logic and puts it in a separate function.


I'm thinking we may wish to use criteria other than interface or jailed 
socket to select source address.


I should point out though that we picked some stuff up from KAME to do 
source address selection but it's not in the IPv4 stack.


cheers
BMS

Re: reading routing table

2008-09-01 Thread Bruce M. Simpson


Debarshi Ray wrote:

I am implementing a library/utility which basically encompasses the
features of the traditional route utilities and those of newer tools
(like ip from iproute2), which are mostly specific to a particular
kernel. The overpowering objective is to make the library/utility work
uniformly across all different kernels, so that programs like
NetworkManager have a portable library/utility to use instead of the
Linux-kernel specific ip which is now being used.
  


Why don't you just use XORP's FEA code?
It already does all this under a BSD-type license.

cheers
BMS


Re: reading routing table

2008-09-01 Thread Bruce M. Simpson


Debarshi Ray wrote:

...
I was going through the FreeBSD and NetBSD documentation and the
FreeBSD sources of netstat and route. I was surprised to see that while
NetBSD's route implementation has a 'show' command, FreeBSD does not
offer any such thing. Moreover it seems that one can not read the
entire routing table using the PF_ROUTE sockets and RTM_GET returns
information pertaining to only one destination. This surprised me
because one can do such a thing with the Linux kernel's RTNETLINK.

Is there a reason why this is so? Or is reading from /dev/kmem the
only way to get a dump of the routing tables?
  


You want 'netstat -rn' to dump them. This is a very common command which 
should be covered in a number of online resources on using and 
administering FreeBSD, so I am somewhat surprised that you didn't find it.


P.S. Look in the sysctl tree if you need to snapshot the kernel IP 
forwarding tables. You can use kmem, but it is generally frowned upon 
unless you're working from core dumps -- kernels can be built without 
kmem support, or kmem locked down, etc.


cheers
BMS

Re: reading routing table

2008-09-01 Thread Bruce M. Simpson


Debarshi Ray wrote:

Why don't you just use XORP's FEA code?
It already does all this under a BSD-type license.



I was not aware of it. What does it do? Is it portable across other
OSes or is it *BSD specific?
  


XORP's FEA process is responsible for talking to the underlying 
forwarding plane. It supports *BSD, Linux, MacOS X, and Microsoft Windows.


Over the last year there was a refactoring where the forwarding table 
management got split into plugin-like modules. It is written in C++, 
although this split will likely make integration into other projects 
easier.


Normally that support all goes into a single process, rather than being 
linked into many.


cheers
BMS

Re: how to read dynamic data structures from the kernel (was Re: reading routing table)

2008-09-02 Thread Bruce M. Simpson


Luigi Rizzo wrote:

do you know if any of the *BSD kernels implements some good mechanism
to access a dynamic kernel data structure (e.g. the routing tree/trie,
or even a list or hash table) without the flaws of the two approaches
i indicate above ?
  


Hahaha. I ran into an isomorphic problem with Net-SNMP at work last week.

   There's a need to export the BGP routing table via SNMP. Of course 
doing this in our framework at work requires some IPC calls which always 
require a select() (or WaitForMultipleObjects()) based continuation.
   Net-SNMP doesn't support continuations at the table iterator level, 
so somehow, we need to implement an iterator which can accommodate our 
blocking IPC mechanism.


  [No, we don't use threads, and that would actually create more 
problems than it solves -- running single-threaded with continuations 
lets us run lock free, and we rely on the OS's IPC primitives to 
serialize our code. Works just fine for us so far...]


   So we would end up caching the whole primary key range in the SNMP 
sub-agent on a table OID access, a technique which would allow us to 
defer the IPC calls providing we walk the entire range of the iterator 
and cache the keys -- but even THAT is far too much data for the BGP 
table, which is a trie with ~250,000 entries. I hate SNMP GETNEXT.


   Back to the FreeBSD kernel, though.

   If you look at in_mcast.c, particularly in p4 bms_netdev, this is 
what happens for the per-socket multicast source filters -- there is the 
linearization of an RB-tree for setsourcefilter().
   This is fine for something with a limit of ~256 entries per socket 
(why RB for something so small? this is for space vs time -- and also it 
has to merge into a larger filter list in the IGMPv3 paths.)
   And the lock granularity is per-socket. However it doesn't do for 
something as big as a BGP routing table.


   C++ lends itself well to expressing these kinds of smart-pointer 
idioms, though.
   I'm thinking perhaps we need the notion of a sysctl iterator, which 
allocates a token for walking a shared data structure, and is able to 
guarantee that the token maps to a valid pointer for the same entry, 
until its 'advance pointer' operation is called.
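
A rough userland sketch of that token idea in plain C, to make the contract concrete. Everything here is hypothetical (`struct entry`, `struct iter`, `iter_start`, `iter_advance` and friends are made-up names, and no such sysctl interface exists): the token pins the entry it is parked on with a reference count, a removal of a pinned entry is deferred, and the entry is only reclaimed when the token advances. Locking is left out; the assumption is that a lock would be held inside each call and dropped between calls.

```c
#include <assert.h>
#include <stdlib.h>

struct entry {
    int key;
    int refs;               /* iterators currently parked on this entry */
    int dead;               /* removed while pinned; reclaim on unpin */
    struct entry *next;
};

struct table { struct entry *head; };
struct iter  { struct table *t; struct entry *cur; };

/* Unlink e from the list and free it. */
static void
reclaim(struct table *t, struct entry *e)
{
    struct entry **pp;

    for (pp = &t->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            free(e);
            return;
        }
    }
}

void
table_insert(struct table *t, int key)      /* push at the head */
{
    struct entry *e = calloc(1, sizeof(*e));

    assert(e != NULL);
    e->key = key;
    e->next = t->head;
    t->head = e;
}

void
table_remove(struct table *t, int key)
{
    struct entry *e;

    for (e = t->head; e != NULL && e->key != key; e = e->next)
        ;
    if (e == NULL)
        return;
    if (e->refs > 0)
        e->dead = 1;        /* pinned by a token: defer the free */
    else
        reclaim(t, e);
}

/* Pin the first live entry at or after e and park the token on it. */
static struct entry *
park(struct iter *it, struct entry *e)
{
    while (e != NULL && e->dead)
        e = e->next;
    if (e != NULL)
        e->refs++;
    it->cur = e;
    return e;
}

struct entry *
iter_start(struct iter *it, struct table *t)
{
    it->t = t;
    return park(it, t->head);
}

/* The 'advance pointer' operation: only here may the old entry go away. */
struct entry *
iter_advance(struct iter *it)
{
    struct entry *old = it->cur, *next;

    if (old == NULL)
        return NULL;
    next = park(it, old->next);
    if (--old->refs == 0 && old->dead)
        reclaim(it->t, old);
    return next;
}
```

The cost of this design is that an abandoned token pins its entry forever, so a real implementation would also need some way to expire tokens.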


Question is, who's going to pull the trigger?

cheers
BMS

P.S. I'm REALLY getting fed up with the lack of openness and 
transparency largely incumbent in doing work in p4.


Come one come all -- we shouldn't need accounts for folk to see and 
contribute what's going on, and the stagnation is getting silly. FreeBSD 
development should not be a committer or chum-of-committer in-crowd.


Problem with IFDATA_DRIVERNAME sysctl

2008-09-09 Thread Bruce M Simpson


Whenever I call this sysctl, I get an errno of EPROGUNAVAIL from sysctl():

    name[0] = CTL_NET;
    name[1] = PF_LINK;
    name[2] = NETLINK_GENERIC;
    name[3] = IFMIB_IFDATA;
    name[4] = ifindex;
    name[5] = IFDATA_DRIVERNAME;

    len = IFNAMSIZ;
    if (sysctl(name, 6, dname, &len, NULL, 0) == -1) {
        warnc(EX_OSERR, "cannot obtain driver name for ifname %s",
            ifname);
        return (-1);
    }

The ifindex is valid. "dname" is a pointer to an IFNAMSIZ sized buffer. 
This problem is happening on a 7.0-RELEASE system.


It looks like the switch..case in that path could be fubar'd by the 
compiler, as there are no break statements for each distinct case label. 
Could this be due to gcc friendly fire?


cheers
BMS



Re: Problem with IFDATA_DRIVERNAME sysctl

2008-09-09 Thread Bruce M. Simpson


Bruce M Simpson wrote:


It looks like the switch..case in that path could be fubar'd by the 
compiler, as there are no break statements for each distinct case 
label. Could this be due to gcc friendly fire?


Possibly false alarm or PEBKAC, I wasn't checking return values right in 
some of my code, although we should probably have "break" there anyway.


Patch against RELENG_7_0.
--- if_mib.c.orig   2008-09-10 00:31:25.0 +0100
+++ if_mib.c        2008-09-10 00:32:15.0 +0100
@@ -90,6 +90,7 @@
     switch(name[1]) {
     default:
         return ENOENT;
+        break;
 
     case IFDATA_GENERAL:
         bzero(&ifmd, sizeof(ifmd));
@@ -136,6 +137,7 @@
         error = SYSCTL_IN(req, ifp->if_linkmib, ifp->if_linkmiblen);
         if (error)
             return error;
+        break;
 
     case IFDATA_DRIVERNAME:
         /* 20 is enough for 64bit ints */
@@ -152,6 +154,7 @@
             error = EPERM;
         free(dbuf, M_TEMP);
         return (error);
+        break;
     }
     return 0;
 }
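
For what it's worth, running from one case label into the next is defined C behaviour rather than something the compiler does wrong, and it only matters for a case that does not end in an unconditional return. A small stand-alone sketch of the two shapes; the request names and the `DID_*` flags are made up for illustration and are not taken from if_mib.c.

```c
#include <assert.h>

enum { REQ_GENERAL, REQ_LINKSPECIFIC, REQ_DRIVERNAME };   /* hypothetical */

#define DID_LINK   0x1
#define DID_DRIVER 0x2

/* A case that returns only on error and has no break: on success,
 * control continues into the next case body. */
int
dispatch_fallthrough(int req, int link_error)
{
    int did = 0;

    assert(link_error == 0 || link_error == 1);
    switch (req) {
    case REQ_LINKSPECIFIC:
        did |= DID_LINK;
        if (link_error)
            return -1;
        /* FALLTHROUGH */
    case REQ_DRIVERNAME:
        did |= DID_DRIVER;
        return did;
    default:
        break;
    }
    return did;
}

/* Same switch with the break added: each request runs only its own body. */
int
dispatch_with_break(int req, int link_error)
{
    int did = 0;

    assert(link_error == 0 || link_error == 1);
    switch (req) {
    case REQ_LINKSPECIFIC:
        did |= DID_LINK;
        if (link_error)
            return -1;
        break;
    case REQ_DRIVERNAME:
        did |= DID_DRIVER;
        return did;
    default:
        break;
    }
    return did;
}
```

A break placed directly after an unconditional return is never reached, so of the three hunks only the one after the "if (error) return error;" case can change behaviour.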

Re: reading routing table

2008-09-18 Thread Bruce M. Simpson


Debarshi Ray wrote:

...
By the way, would you want someone to implement 'show' support for
FreeBSD's route implementation? I can give it a go now. :-)
  


For sure, we'd be very happy to see a patch like that.

Many thanks
BMS

Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned by the process.

2008-09-21 Thread Bruce M. Simpson


[EMAIL PROTECTED] wrote:

Old Synopsis: icmp socket receives icmp replies not owned by the process.
New Synopsis: [icmp]: icmp socket receives icmp replies not owned by the 
process.
  


This PR is bogus because:
ICMP has no concept of datagrams being "owned" by a process. There is no 
field in the ICMP protocol which differentiates ICMP "sessions" on a 
per-process basis, and this is because ICMP has no concept of "sessions" 
-- ICMP messages are directed at IP endpoints.


The networking stack will only selectively dispatch ICMP traffic based 
on two conditions:

1. ip_proto number (raw sockets may selectively bind to a protocol) and
2. multicast group membership (not applicable in this instance).

> It also shows that both echo requests have different identifiers in 
> the id field which should keep the icmp streams separated.


There is absolutely no requirement for the kernel code to look at the ID 
field, beyond reporting it to consumers of the SOCK_RAW interface.
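
In practice this means the application has to do the matching itself, as ping(8) does by comparing the identifier against a value it chose when sending. A minimal sketch, assuming the buffer is an IPv4 datagram as delivered by a raw ICMP socket with the IP header included; `echo_reply_is_ours` is a made-up helper, and the offsets are those of the ICMP echo header in RFC 792.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ICMP_TYPE_ECHOREPLY 0   /* ICMP echo reply, RFC 792 */

/* Return 1 if pkt is an echo reply carrying our identifier, else 0. */
int
echo_reply_is_ours(const uint8_t *pkt, size_t len, uint16_t our_id)
{
    size_t ihl;
    uint16_t id;

    assert(pkt != NULL);
    if (len < 20 || (pkt[0] >> 4) != 4)
        return 0;                           /* not an IPv4 datagram */
    ihl = (size_t)(pkt[0] & 0x0f) * 4;      /* IP header length in bytes */
    if (ihl < 20 || len < ihl + 8)
        return 0;                           /* no room for the echo header */
    if (pkt[ihl] != ICMP_TYPE_ECHOREPLY)
        return 0;
    /* identifier is carried in network byte order */
    id = (uint16_t)((pkt[ihl + 4] << 8) | pkt[ihl + 5]);
    return id == our_id;
}
```

Every process with a raw ICMP socket open would run a check like this on each datagram it receives and silently drop the ones that belong to somebody else.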


This PR can be closed, the submitter should consult the pfSense maintainers.

thanks
BMS






Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned by the process.

2008-09-21 Thread Bruce M. Simpson

The following reply was made to PR kern/127528; it has been noted by GNATS.

From: "Bruce M. Simpson" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: freebsd-net@FreeBSD.org, [EMAIL PROTECTED]
Subject: Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned
 by the process.
Date: Sun, 21 Sep 2008 23:12:30 +0100


Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned by the process.

2008-09-22 Thread Bruce M. Simpson


Chris Buechler wrote:


This PR is bogus because:
ICMP has no concept of datagrams being "owned" by a process. There is 
no field in the ICMP protocol which differentiates ICMP "sessions" on 
a per-process basis, and this is because ICMP has no concept of 
"sessions" -- ICMP messages are directed at IP endpoints.


ICMP echo and echo replies do have "sessions" of sorts, at least 
unique identifying fields - identifier and sequence number.


These fields do exist in ICMP, and as you point out, they are sometimes 
used to implement session-like behaviour.  Many NAT implementations use 
them in this way.


However there is no way of specifying them in a bind() call -- ICMP can 
only be received on a raw socket, and raw sockets will not filter these 
things on behalf of a user process, nor have they ever done to the best 
of my knowledge. They are not part of the address structures for a raw 
socket (SOCK_RAW, PF_INET, * or IPPROTO_ICMP).




This was opened by a pfSense maintainer because it's a change in 
behavior from 6.x releases where this was never an issue, and is 
something we feel is a regression.


Robert has replied outlining a few situations where the behaviour might 
have changed.


Raw sockets do support binding laddr/faddr, there is the possibility 
this could have changed, however there is no notion of processes 
"owning" streams of ICMP messages, this has never been part of the ICMP 
protocol and to think in these terms is misleading.


It sounds to me as though the application is relying on a form of 
filtering which isn't happening, and the way to track this down is to 
carefully note what, if anything, changed in the expected behaviour 
between releases.


For example, does the application bind() to any given host addresses? 
This is the only form of filtering, apart from multicast SSM, that raw 
sockets would support, and SSM ain't in the tree [yet].


thanks
BMS

Re: ACE on FreeBSD?

2008-09-24 Thread Bruce M. Simpson


Hi,

I looked at ACE years and years ago (~1997) when Doug Schmidt was first 
promoting the ideas behind it. The whole Reactor/Proactor split pretty 
much hangs on the event dispatch which your particular OS supports.


The key observation is whether your target OS implements events in an 
edge-triggered or level-triggered way; I am borrowing definitions from 
electronic engineering here.


You could do a straight port with Proactor, but performance will 
probably suck, because FreeBSD (and Linux, I believe) needs to 
emulate POSIX asynchronous I/O operations.


Reactor will generally "fare better" on UNIX derived systems such as 
FreeBSD and Linux, because its event handling primitives are geared 
towards the level-triggered facilities provided by select().
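
To make "level-triggered" concrete: select() keeps reporting a descriptor as readable for as long as unread data is sitting there, not just at the moment the data arrives. A small POSIX sketch; `readable_now` is a made-up helper that polls a single descriptor with a zero timeout.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Return 1 if fd is readable right now, 0 if not, -1 on error. */
int
readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };    /* poll, do not block */

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) == -1)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```

Calling this twice on the read end of a pipe with one unread byte in it gives 1 both times; it goes back to 0 only once the byte has been read. A Reactor-style loop leans on exactly that property: a handler that does not drain everything in one go simply gets called again on the next pass.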


In Windows, Winsock events use asynchronous notifications which may be 
tied to Win32 EVENT objects, and the usual Kernel32.DLL thread 
primitives are used around this. This makes Proactor more appropriate in 
that environment.


XORP does some similar stuff to ACE under the hood to support the native 
socket facilities of both Windows and FreeBSD/Linux. It's hybridized but 
it behaves more like Reactor because we run in a single thread, and you 
have to force Winsock's helper thread to run, by preempting you, using 
some file handle and socket tricks.


I don't currently know about stability of ACE on FreeBSD.

cheers
BMS

Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...

2008-09-24 Thread Bruce M. Simpson


Hi,

I agree with the intent of the change that IPv4 and IPv6 input queues 
should have a tunable queue length. However, the change provided is 
going to make the definition of IFQ_MAXLEN global and dependent upon a 
variable.


[EMAIL PROTECTED] wrote:

Hi,

It turns out that the last time anyone looked at this constant was
before 1994 and it's very likely time to turn it into a kernel
tunable.  On hosts that have a high rate of packet transmission
packets can be dropped at the interface queue because this value is
too small.  Rather than make a sweeping code change I propose the
following change to the macro and updating a couple of places in the
IP and IPv6 stacks that were using this macro to set their own global
variables.
  


This isn't appropriate for many uses of ifq's which might be internal to 
a given driver or subsystem, and which may use IFQ_MAXLEN for 
convenience, as Ruslan has pointed out. I have code elsewhere which does 
this.


Can you please do this on a per-protocol stack basis? i.e. give IPv4 and 
IPv6 their own TUNABLE queue length.


thanks
BMS

Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...

2008-09-24 Thread Bruce M. Simpson


[EMAIL PROTECTED] wrote:

...
I found no occurrences of the above in our code base.  I used cscope
to search all of src/sys.  Are you aware of any occurrences of this?
  


I have been using IFQ_MAXLEN to size buffer queues internal to some 
IGMPv3 stuff.


I don't feel comfortable with a change which sizes the queues for both 
IPv4 and IPv6 stacks, from a variable which is obscured by a macro.


thanks
BMS

Re: lost routes

2008-09-24 Thread Bruce M. Simpson


Giulio Ferro wrote:
 
There are no messages in the logs, and no interface has been

touched. Anyway, since there are a lot of routes and only one
gets deleted I don't think it depends on interface changing
(it would delete them all, wouldn't it?)


Normally static routes only get touched if the state of the underlying 
ifp/ifa changes. There are paths in netinet which will cause routes to 
be deleted in this situation.


Occasionally the idea of a floating static re-surfaces... look in the PR 
database with this term for possibly related reports.


cheers
BMS


Re: Initialisation of a networking protocol

2008-09-29 Thread Bruce M. Simpson


Hi Ryan,

Did you initialize the .pr_init member of struct protosw for MPLS?

AFAIK, MPLS does not use an outer IP header, so adding a struct 
ipprotosw won't work; they are similar structs however.


cheers
BMS

Re: Freeing an mbuf cluster

2008-10-02 Thread Bruce M. Simpson


Yony Yossef wrote:

Hi All,

I'm trying to manually build an mbuf chain with clusters in various sizes.
I'm doing it using the MGETHDR and MEXTADD macros, it works fine.
Now I'm looking for the simplest way to free an mbuf cluster, since I want
to free the clusters separately. This function will be given as a parameter
to MEXTADD.

Is there a simple command like 'free(buf)' to free an mbuf cluster?
  


You don't specify if you are trying to add the external storage from a 
pool you manage, in which case, you're on your own.


m_free() for a cluster or mbuf should just "do the right thing". Since 
the UMA cleanup there are destructor functions which should free the 
mbuf or cluster using the right pool.


m_freem() works on chains, of course.

cheers
BMS

How to support an Ethernet PHY without ID registers?

2008-10-07 Thread Bruce M Simpson


Hi,

I have been trying to get FreeBSD onto the Freecom FSG3 Storage Gateway.
It is an xScale based ARM system.

Whilst the npe(4) driver appears to attach, the PHY does not. It is a 
Realtek RTL8305SB switch chip in dual miibus mode. Unfortunately the 
RTL8305SB does not have ID registers. The RTL8305SC does, but it's a 
totally different chip.


We do have a driver in the tree for the RTL8305SC, however these chips 
are different enough for this to cause problems.


Is there any way I could for example force ukphy(4) to attach?

Note: Because there are no ID registers, mii_phy_probe_gen() WILL NOT 
work. It looks like I'd have to override this by hacking if_npe.c 
itself. Can anyone clarify?


cheers
BMS

Re: How to support an Ethernet PHY without ID registers?

2008-10-11 Thread Bruce M. Simpson


Sepherosa Ziehau wrote:

Are you sure you could read from BMSR?  Return invalid value from BMSR
is the usual cause of miibus attaching/probing failure.  For ID1/ID2
reading, you could just fake some values in npe(4)'s miibus_readreg
implementation.
  


Thanks for the tip (from you and Pyun). I had to spoof the BMSR read to 
get npe(4) to attach just to begin with. For whatever reason the chip 
doesn't seem to respond on any of the PHY IDs which the Linux folk are 
using (5 and 4 for npe0 (-B) and npe1 (-C) respectively).


I noticed the ucLinux folk needed a similar patch to force driver attach 
under Linux w/the IXP: 
http://mailman.uclinux.org/pipermail/uclinux-dev/2005-March/031419.html


The switch pretty much disappears after npe(4) attaches, I don't see any 
activity lights or link lights at that point. This seems to happen after 
any mii register access.


If I frob things to allow rlswitch to attach, by using hints and hacking 
if_npe.c, I can get dumps of the PHY register space, but it's all ones, 
suggesting that it failed at xScale register level -- that would suggest 
the PHY IDs are *wrong*, or something else isn't right.


Pyun also suggested trying to manually take the PHYs out of power-down 
mode. I tried that with a code snippet I sent him, but still no dice. I 
can't even be sure that the PHYs are being addressed right.


At this point I kind of have to go, whoah, wish I had a logic analyzer 
and grabbers! I believe the firmware configures the switch chip in a 
certain VLAN configuration which isn't meant to be disrupted, although 
Freecom's own SnapGear-based distro apparently does the right thing.


I've looked through all of their GPL materials and cannot find the 
driver for the switch.


I suppose one thing I could try is re-flashing the box with the official 
Freecom firmware, and using mii-diag to dump out what Linux thinks the 
registers are.


thanks
BMS

Re: Vimage howto

2008-12-08 Thread Bruce M. Simpson


Julian,

Thank you (and Marko) very much for preparing this document.

The VIMAGE import has had me at something of an impasse re: the IGMPv3 
branch and clearly written documentation is a big help indeed.


Julian Elischer wrote:

Well not completely, but I've had a number of questions over the
last few months about what it is, so, as Marko and I have written
the following "how to virtualize your module" document, I've been
directing people to it. After another couple of questions I think
this could do with wider distribution..


Thank you also for providing it here on the list, as opposed to relying 
on Perforce alone. Whilst I understand committers rate p4 for 
experimental work, sadly it is simply not accessible to the 
not-so-silent majority in the FreeBSD sphere who are not committers, 
which makes its continued use questionable at best.


regards,
BMS

Re: how to program a driver?

2008-12-09 Thread Bruce M. Simpson


Espartano wrote:

Actually I know how to program in C at a basic level, but I
don't know anything about hardware or computer organization. What
topics should I study to gain knowledge about net-drivers? Or if
someone can recommend books about this topic, I will be very
thankful.
  


Try "The Indispensable PC Hardware Book" by Hans-Peter Messmer for a 
general overview of PC architecture.


cheers
BMS

Re: how to program a driver?

2008-12-09 Thread Bruce M. Simpson


[Resend to list for everyone]

Espartano wrote:

Actually I know how to program in C at a basic level, but I
don't know anything about hardware or computer organization. What
topics should I study to gain knowledge about net-drivers? Or if
someone can recommend books about this topic, I will be very
thankful.
  


   The seminal work is TCP/IP Illustrated Volume 2 (Gary Wright and W. 
Richard Stevens, Addison-Wesley). Whilst dated it will give you an 
overview of how all the parts in the BSD networking stack fit together.
   It really needs to be updated; however, enough things are in flux 
right now that summarising all the changes would be difficult until, say, 
after the FreeBSD 8.0 dust has settled.


   For computer architecture, probably best to learn PC architecture 
these days -- x86 is here to stay, kids, and Netbooks are something of a 
reactionary response triggered by the One-Laptop-Per-Child (OLPC) 
project. In my day, I learned 68000 assembly and C on the Amiga.


   Hans-Peter Messmer's "The Indispensable PC Hardware Book" is a huge 
book which cost me about 50 GBP new when I first bought it -- I was 
working in a reasonably well paid job at the time, but it can be found 
second hand no doubt around the world.
   Cover to cover it will tell you what you need to know about how the 
PC architecture fits together, but if you need more detail e.g. on stuff 
like FreeBSD network drivers, again, it's best to refer back to the 
source code itself.


Hope this helps.

cheers
BMS

Re: Heads up --- Thinking about UDP and tunneling

2008-12-11 Thread Bruce M. Simpson


Hi,

I am missing the context of what Max's suggestion was; do you have a 
reference to an old email thread?


Style bugs:
* needs style(9) and whitespace cleanup.
* C typedefs should be suffixed with _t for consistency with other 
kernel typedefs.

* Function typedefs are usually named like foo_func_t (see other subsystems).

Have you looked at m_apply() ? It already exists for stuff like this 
i.e. functions which act on an mbuf chain, although it doesn't 
necessarily expect chain heads.


cheers
BMS

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: last call for L2/L3 rewrite code review

2008-12-11 Thread Bruce M. Simpson


Hi,

Just skimming this I notice it uses the if_afdata[AF_INET] pointer 
purely for lltbl purposes; this clashes with the IGMPv3 code drop.


Please look in the bms_netdev branch, where I introduce a 'struct 
ip_ifinfo' to make more general use of that slot. IGMPv3 needs to store 
per-interface state for AF_INET, so this slot really needs to be shared 
with other AF_INET stuff.


Looks like it needs to be updated for VIMAGE also; hopefully others more 
familiar with this can help -- I am too busy with non-programming 
activity as it is to get up to speed on this, although I have at least 
managed to print Julian's write-up...


Other than that, it looks like a much needed improvement and we are all 
very grateful for your work on this.


thanks
BMS

Re: Having problems with limited broadcast

2009-01-07 Thread Bruce M. Simpson


Peter Steele wrote:

..

Based on the discussion in the link above, it doesn't seem like the
problem was entirely resolved by the patches mentioned in this thread.
Has anything been done since this discussion took place? Surely there
must be a way to get limited broadcast to work under FreeBSD.
  


You will need to go to the pcap layer to send limited broadcasts w/o any 
IPv4 addresses configured in a BSD stack for now. If you have an IP on 
the interface, you can just use IP_ONESBCAST.
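
A minimal sketch of the IP_ONESBCAST route, for the case where the interface does have an IPv4 address. It assumes a FreeBSD host; the numeric fallback of 23 for IP_ONESBCAST is taken from FreeBSD's <netinet/in.h> and is an assumption here, since Python's socket module may not export the constant. The function names are hypothetical. The idea is to send to the subnet's directed broadcast address and let the stack rewrite the destination to 255.255.255.255 on the way out.

```python
import ipaddress
import socket

# Assumption: IP_ONESBCAST is option 23 in FreeBSD's <netinet/in.h>; Python's
# socket module may not export it, so fall back to the numeric value.
IP_ONESBCAST = getattr(socket, "IP_ONESBCAST", 23)


def directed_broadcast(iface_cidr: str) -> str:
    """Return the subnet broadcast address for e.g. '192.168.1.10/24'."""
    return str(ipaddress.ip_interface(iface_cidr).network.broadcast_address)


def send_limited_broadcast(iface_cidr: str, port: int, payload: bytes) -> None:
    """Send payload as an all-ones broadcast via the interface owning iface_cidr.

    FreeBSD only: the datagram is addressed to the directed broadcast address,
    which selects the interface; IP_ONESBCAST asks the stack to rewrite the
    destination to 255.255.255.255 before transmission.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.setsockopt(socket.IPPROTO_IP, IP_ONESBCAST, 1)
        s.sendto(payload, (directed_broadcast(iface_cidr), port))
```

This does nothing for the unnumbered case; with no IPv4 address configured there is no directed broadcast address to send to, which is why pcap is needed there.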


thanks
BMS
 



Re: Having problems with limited broadcast

2009-01-08 Thread Bruce M. Simpson


Peter Steele wrote:

...
It's really a matter of time. We didn't anticipate limited broadcast
being broken in FreeBSD and we're scrambling to come up with a solution.
To be quite frank I haven't done anything with IPv6 before so it would
be more research to get up to speed on this option. It seems our best
option is scapy, which unfortunately I also haven't used before...
  


It's not broken -- it has always been this way in all BSD derived 
networking stacks.


Limited broadcast addresses just don't contain any information about 
where the datagram should go, and this is the case in all other 
implementations. They are similar to multicast addresses in that regard.


Linux has a knob, SO_BINDTODEVICE, which is partly there to work around 
this problem; however, it isn't the ideal semantic fit.


The folk who point out that link-local addresses could be used, have an 
interesting suggestion which might work for you.


thanks
BMS

Re: Having problems with limited broadcast

2009-01-08 Thread Bruce M. Simpson


Peter Steele wrote:

The folk who point out that link-local addresses could be used, have an
interesting suggestion which might work for you.



It's definitely interesting, but it is very likely that some of our
customers will want to be able to set their own IP ranges and not be
limited to 169.254/16. So we need a more generic solution.


Sounds like it's bpf/pcap city for you guys.

A similar bump-in-the-stack to SO_BINDTODEVICE (let's call it 
IP_SENDIF) has been on the drawing board, but it needs appropriate 
security screening -- the ability to bypass the forwarding tables, 
whilst specifying an interface e.g. by index or name, would be desirable 
only for certain privileged processes.


BTW: If you guys are already looking at scapy, you may also wish to give 
pcs.sourceforge.net a look as an alternative.


It is a Python project which I did some hacking on with George 
Neville-Neil, who started it. It has BPF/PCAP support out of the box and 
has a number of powerful features, including a packet-level expect() 
facility, which works in a very similar manner to pexpect (Python expect 
for text streams).


I added a scapy-like concatenation syntax ('/' operator) to it as that 
makes plugging packet chains together that much easier.
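
To give a flavour of what a '/' concatenation syntax looks like, here is a minimal sketch using Python operator overloading. The Layer and Chain classes are hypothetical and are not the PCS or scapy API; they only show the technique of building a chain with __truediv__.

```python
class Layer:
    """Hypothetical packet layer: a name plus its already-encoded bytes."""

    def __init__(self, name: str, data: bytes):
        self.name = name
        self.data = data

    def __truediv__(self, other):
        # layer / x starts a new chain containing this layer, then appends x
        return Chain([self]) / other


class Chain:
    """Hypothetical ordered list of layers, outermost first."""

    def __init__(self, layers):
        self.layers = list(layers)

    def __truediv__(self, other):
        # chain / chain concatenates; chain / layer appends one layer
        if isinstance(other, Chain):
            return Chain(self.layers + other.layers)
        return Chain(self.layers + [other])

    def to_bytes(self) -> bytes:
        # Serialise by concatenating each layer's bytes in order
        return b"".join(layer.data for layer in self.layers)
```

With that in place, something like `Layer("ether", ...) / Layer("ip", ...) / Layer("udp", ...)` yields a Chain whose to_bytes() is the three encodings back to back.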


I have the beginnings of an IGMPv3 test suite in my home repo, written 
using PCS; it uses pcap capture. I imagine a DHCP-like protocol could 
easily be implemented using PCS too.


cheers
BMS


Re: Having problems with limited broadcast

2009-01-08 Thread Bruce M. Simpson


Peter Steele wrote:

...
I personally like this idea, but I'm not sure I can sell it to the
others. Are there any restrictions to these 169.254.x.y addresses?
  


169.254.0.0/16 must never appear outside a link -- it is strictly scoped 
to that link.


Currently the IPv4 BSD stack has no concept of link-scoped addresses, 
but IPv6 does. Link is a realized concept there because of KAME's 
support for the % syntax. Internally, interface indexes get used.


In practice this shouldn't be an issue as long as you can guarantee 
different addresses are used for the 169.254.0.0/16 block on each 
interface, however, it would mean any app using sockets would need to 
explicitly bind to the local address to ensure the correct interface is 
used. Furthermore, we effectively need to be able to support multiple 
next-hops for the 169.254.0.0/16 prefix, otherwise we can support only 
one such interface w/o significant kernel code rewrites.


So, really, LL may not buy you anything at all, and it's likely you need 
to go straight to pcap for your app. These restrictions have existed for 
years, and the fact that they haven't been addressed has largely been 
because there has been no community strategy to deal with it. I 
speculate some BSD-using organisations might have already solved these 
problems, however, without evidence (and code sharing), that's pure 
speculation.


cheers
BMS

Re: Having problems with limited broadcast

2009-01-08 Thread Bruce M. Simpson


Bruce M. Simpson wrote:

Peter Steele wrote:

...
I personally like this idea, but I'm not sure I can sell it to the
others. Are there any restrictions to these 169.254.x.y addresses?
  


169.254.0.0/16 must never appear outside a link -- it is strictly 
scoped to that link.


P.S. I checked in a change to ip_forward() a while back which enforces 
this, as forwarding such traffic between interfaces without NATting it 
or otherwise proxying it is a really bad idea (and also breaks the IPv4 
LL RFC).
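
As a rough illustration of the rule (not the kernel code itself), the check amounts to refusing to forward anything whose source or destination falls inside 169.254.0.0/16. The function names below are hypothetical, and the RFC number (3927) is my reading of "the IPv4 LL RFC".

```python
import ipaddress

# The IPv4 link-local block; addresses in it are scoped to a single link.
LINK_LOCAL_V4 = ipaddress.ip_network("169.254.0.0/16")


def is_ipv4_link_local(addr: str) -> bool:
    """True if addr lies inside 169.254.0.0/16."""
    return ipaddress.ip_address(addr) in LINK_LOCAL_V4


def may_forward(src: str, dst: str) -> bool:
    """A router should not forward a datagram between interfaces if either
    the source or the destination is link-local (RFC 3927)."""
    return not (is_ipv4_link_local(src) or is_ipv4_link_local(dst))
```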


Re: howto determine network device unit number? device.hints?

2009-01-15 Thread Bruce M. Simpson


Yony Yossef wrote:

Thanks for the explanation.
 
So there's no way to determine this in advance.
I must build a script that contains my own mapping between MAC addresses and
the wanted interface names and run it after each driver load, renaming the
interfaces if necessary.
It seems quite wrong, don't you agree?

And how come the unit number is given an arbitrary value? Is there a good
reason for that?
  


Normally the PCI probe runs in the opposite direction from that of 
Linux. It's largely to do with how the NEWBUS code walks the PCI bus. 
From a systems management point of view, yeah, it's irritating, however 
it would probably take more effort (i.e. kernel code) to try to patch it 
to work differently, and not everyone has free time to sit down and 
patch the kernel.


That and (unlike Solaris) there is no *direct* mapping between the 
card's driver number on the bus and its network driver number.


In your case I'm not sure why your two cards would flip order. Could it 
be how your BIOS and hardware set up the PCI IDSEL lines at boot?
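
For what it's worth, the MAC-to-name mapping script described in the quoted question could be sketched roughly as below. It only builds 'ifconfig <if> name <newname>' command lines; the interface names and MAC addresses are made up, the function name is hypothetical, and gathering the current name-to-MAC table (e.g. by parsing ifconfig output) is left out.

```python
def rename_commands(current: dict, wanted: dict) -> list:
    """Build ifconfig rename commands.

    current: interface name -> MAC address as currently attached
    wanted:  MAC address (lower case) -> desired interface name
    Returns a list of argv lists; nothing is executed here.
    """
    cmds = []
    for ifname, mac in sorted(current.items()):
        target = wanted.get(mac.lower())
        # Only rename interfaces that are known and not already correct
        if target and target != ifname:
            cmds.append(["ifconfig", ifname, "name", target])
    return cmds
```

Picking target names outside the driver's own namespace (lan0, lan1 rather than swapping two driver-assigned names) avoids a rename colliding with a name that is still in use.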




Re: howto determine network device unit number? device.hints?

2009-01-15 Thread Bruce M. Simpson


Yony,

Bruce M. Simpson wrote:


And how come the unit number is given an arbitrary value? Is there a 
good reason for that?
  

...

In your case I'm not sure why your two cards would flip order. Could 
it be how your BIOS and hardware set up the PCI IDSEL lines at boot?


If this is the case on your system, then you really need to provide more 
data about your hardware, i.e. motherboard, BIOS, vendor information 
etc. as others point out.


Based on the data you've provided about the issue to date, my best guess 
is that something in the above is different on your system (which is why 
I mentioned IDSEL lines -- the mechanism PCI uses to actually assign bus 
numbers electrically).


Normally the behaviour of FreeBSD's bus probes is well known -- nexus is 
walked for child buses, then these buses are plumbed into NEWBUS, e.g. 
cpu0...cpuN on nexus itself, PCI buses, and PCI subordinate buses in 
that order.


* You mention you don't encounter the issue with Linux, but you may 
already be aware that udev can tie driver instance number(s) to specific 
MAC addresses, although this process isn't fully automatic and any given 
distro may or may not create the persistent udev rules on a first run -- 
so this is comparing apples with oranges.


* [PCI-Express is a special case though, and I've had to sit down and do 
some work with commercial clients to make sure their appliance was able 
to detect devices being in particular slot numbers. Again, though, it's 
just as subject to the PCI enumeration order further up on the bus 
hierarchy as non-PCI-Express drivers.]


So your issue may not be a simple matter of "this seems wrong, this 
doesn't work", though I am sorry to hear it isn't working for you right now.


There are a lot of dynamic factors in the overall picture of the system, 
and what seems to work as expected for many users, may not be working 
for you, and we really need basic hardware information, when folk see 
things like this happening, for any volunteer(s) out there to come up 
with the right solution, let alone the true picture of what's actually 
going on in your specific case.


thanks
BMS

Re: howto determine network device unit number? device.hints?

2009-01-15 Thread Bruce M. Simpson


Eygene Ryabinkin wrote:

...
I wanted to stress only one point: a simple 'kldunload ' and
'kldload ' makes the devices flip in Yony's case.  This means
that unless some PCI hotplug stuff is here (which I don't believe to be
present, because no physical cards are touched and there is actually a
small amount of PCI hotplug support in FreeBSD), no physical PCI devices
get added or removed from the PCI child tree.  It looks like that
something goes wrong during the PCI tree reprobe on the driver module
loading.
  


BTW: Thanks for looking further at the software layer first.

VIM is a wee bit easier to use than a bus analyzer.

Most motherboards don't support PCI geographical addressing, so... I 
wager it's the network driver code which may be the source of the 
problem, based on your analysis!


If this code is just doing a blind bump of an instance count and using that 
as a "unit number"... well, that's OK and expected for software virtual 
devices, but is counter-intuitive for something like hardware.


But I don't have any mtnic source, so this is pure speculation on my part.


Correct me if I am wrong, but pci_driver_added from /sys/dev/pci/pci.c will
invoke device_get_children() to get the list of the attached devices,
and for PCI case the list should be static.
  


Yup, that's right.


I guess that when Yony will enable verbose boot and will show us kernel
messages from two successive kldunload/kldload sequences, we will get
some additional information about what's going on.
  


Hopefully he will chime in...

[bms does some google searching *before* he thinks about throwing his 
toys out of the pram at the Original.Poster.]


ding :-) [a light bulb above bms' head]

So... Yony, you're writing a driver.
Maybe there's a bug in it?
That's cool, dude.
Hope it's a nice card and you plan on sharing the sweets with the rest 
of the class. ;-)


But seriously, please mention that you are writing a driver in general 
questions you might ask about the whole system, otherwise, FreeBSD 
volunteers will run around going "Is core code broken?" and that's not 
so good for community stress levels as a whole.


with lemonade,
BMS

Re: IGMP+WiFi panic on recent kernel - in igmp_fasttimo()

2009-03-14 Thread Bruce M Simpson


Sam,

Sam Leffler wrote:
This patch avoids the crash.  Not sure how ifma_protospec is 
supposed to be handled so I'm not committing it.


Thanks for this.

I have a test machine ready to be prepped but it's missing a CF card (I 
have none) so need to pick one up from a friend. I have a pci-cardbus 
adapter + a ral(4) CardBus card, but no CardBus ath(4) -- I imagine this 
ain't specific to ath(4) so that should be fine.


I'll try to look at this Sun/Mon, I have a -CURRENT image built for the 
1U box now that just needs bootstrapping (it has a CF slot).


thanks,
BMS

Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work

2009-03-23 Thread Bruce M Simpson


Matthias Apitz wrote:

I went today evening with my EeePC and CURRENT on USB key
to that Greek restaurant; DHCP does not get IP in CURRENT either;
this is somehow good news, isn't it :-)
  


This may be orthogonal, but:
   A lab colleague and I have been seeing a sporadic problem where the 
ath0 exhibits the symptoms of being disassociated from its AP. We are 
running RELENG_7 on the EeePC 701 since the open source HAL merge.
   In the behaviour we're seeing, we don't see any problem with the 
initial dhclient run, the ath0 just seems to get disassociated within 
5-10 minutes of associating.


If we leave 'ping ' running in the background, we don't 
see this problem.


   We have yet to produce a tcpdump to catch it 'in the act' and 
observe the DLT_IEEE80211 traffic when it actually happens, I have only 
seen the symptoms. The AP does not show the EeePC units as being 
associated any more at this point, but ath0 still shows 'status: 
associated'. The AP involved is a Netgear WG602 V2, and is running the 
vendor's firmware.


I'll try to get set up with 'tcpdump -y ieee802_11' from initial boot 
(including dhcp and anything we bump into).


cheers
BMS

Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work

2009-03-23 Thread Bruce M Simpson

The following reply was made to PR kern/132722; it has been noted by GNATS.

From: Bruce M Simpson 
To: Matthias Apitz 
Cc: bug-follo...@freebsd.org, Sam Leffler , 
 freebsd-net@freebsd.org, "Sean C. Farley" 
Subject: Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP
 or IP does not work
Date: Mon, 23 Mar 2009 18:44:42 +

 Matthias Apitz wrote:
 > I went today evening with my EeePC and CURRENT on USB key
 > to that Greek restaurant; DHCP does not get IP in CURRENT either;
 > this is somehow good news, isn't it :-)
 >   

 This may be orthogonal, but:
 A lab colleague and I have been seeing a sporadic problem where the 
 ath0 exhibits the symptoms of being disassociated from its AP. We are 
 running RELENG_7 on the EeePC 701 since the open source HAL merge.
 In the behaviour we're seeing, we don't see any problem with the 
 initial dhclient run, the ath0 just seems to get disassociated within 
 5-10 minutes of associating.

 If we leave 'ping ' running in the background, we don't 
 see this problem.

 We have yet to produce a tcpdump to catch it 'in the act' and 
 observe the DLT_IEEE80211 traffic when it actually happens, I have only 
 seen the symptoms. The AP does not show the EeePC units as being 
 associated any more at this point, but ath0 still shows 'status: 
 associated'. The AP involved is a Netgear WG602 V2, and is running the 
 vendor's firmware.

 I'll try to get set up with 'tcpdump -y ieee802_11' from initial boot 
 (including dhcp and anything we bump into).

 cheers
 BMS

ath0 apparent silent disassociation

2009-03-23 Thread Bruce M Simpson


[Repost without attachment]

OK. We've managed to reproduce this set of symptoms now in our work area.

[If anyone needs to see a pcap, please Cc: me offlist.]

Timebase: beginning of the pcap is in sync with a bringup from
single-user mode; the tcpdump runs in the background from init whilst
the system is brought up.

OK, so I timed the apparent loss of connectivity as 6m 30s from the
point I hit the stopwatch to the point I hit it again, when the AP's Web GUI
no longer showed the affected STA as being associated.
Obviously such a timing is subject to human/visual jitter, and how
often Netgear's firmware pulls the STA association list from the AP into
the web GUI.

What stands out in the pcap is that 302.291s in (almost 5m exactly),
the STA (ath0) sends an IEEE 802.11 NULL frame to the AP with the PWR
MGT bit set (I'm going to sleep!). This more or less coincides with a
normal beacon from the Netgear AP. It does not advertise Auto Power Save
Delivery (apsd), that bit is 0.
This is puzzling as we don't enable power management by default. As
I understand it, this may be an AP feature in some environments... I can
try reproducing this with an explicit 'ifconfig ath0 -powersave' and see
if it reoccurs.
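
For anyone wanting to pick such frames out of a capture, here is a minimal sketch of decoding the 802.11 Frame Control field. It assumes the buffer starts at the 802.11 header (i.e. any radiotap header has already been skipped); the function names are hypothetical.

```python
import struct

TYPE_DATA = 2         # Frame Control type field value for data frames
SUBTYPE_NULL = 4      # data subtype "Null function (no data)"
FC_PWR_MGT = 0x1000   # bit 12 of the 16-bit Frame Control field


def parse_frame_control(frame: bytes):
    """Return (type, subtype, pwr_mgt) from the first two bytes of a frame."""
    (fc,) = struct.unpack_from("<H", frame)  # Frame Control is little-endian
    ftype = (fc >> 2) & 0x3
    subtype = (fc >> 4) & 0xF
    return ftype, subtype, bool(fc & FC_PWR_MGT)


def is_null_with_pwr_mgt(frame: bytes) -> bool:
    """True for a NULL data frame with the power management bit set."""
    ftype, subtype, pwr = parse_frame_control(frame)
    return ftype == TYPE_DATA and subtype == SUBTYPE_NULL and pwr
```

A frame starting 0x48 0x11 (NULL data, ToDS and PWR MGT set) matches; a beacon (first byte 0x80) does not.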

You'll see that after this NULL frame is sent, there is another
Probe Request, and the Netgear AP does Probe Respond, but this makes no
difference (I ended the capture around 150s after the NULL frame was sent).

At this point we can't send traffic from the ath0, or rather, the AP
is acting as though it never even heard the STA. The STA learns the AP's
IP address/MAC mapping through passive ARP -- we still see broadcasts on
the SSID -- but the AP has started to totally ignore the STA, and seemed
to have ignored its ARP requests also.
We are using MAC address ACL control with this AP, and the ath0
affected is definitely listed in its ACL table, configured up, rebooted etc.

It is as though the STA is entering power saving mode when not
explicitly told to, and the AP is not waking up the STA as it should.

If any more information is needed, or you can suggest where to look,
please let me know what's involved (I MFCed the change after all, so I'll
help where I can until I'm on holiday this week...)

My lab colleague is just working around this with 'ping ' for
now, that keeps things up, as does OpenVPN...

cheers
BMS




Re: kern/124282: [libc] socket(2): INP_PORTHIGH and INP_ONESBCAST share same value

2009-03-23 Thread Bruce M. Simpson


bru...@freebsd.org wrote:

Synopsis: [libc] socket(2): INP_PORTHIGH and INP_ONESBCAST share same value

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: brucec
Responsible-Changed-When: Mon Mar 23 21:45:54 UTC 2009
Responsible-Changed-Why: 
Over to maintainer(s).
  


rwatson@ saw this crop up in -CURRENT and I believe he has a fix. Not 
sure about MFC but it clearly needs to get fixed...


cheers,
BMS

Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work

2009-03-23 Thread Bruce M Simpson


John Hay wrote:

I found that doing a -bgscan before it happens makes it not happen. I now
have -bgscan in my rc.conf.
  


That's exactly the workaround I needed. Thanks John.

As Sam points out, the root fix is probably already in HEAD; it would be 
nice to find time to backport, but this works for us for now as a 
workaround (we are just using ath0 as a STA for testing in the lab at 
the moment, it is likely we will use hostap later).


cheers,
BMS

Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work

2009-03-23 Thread Bruce M Simpson

The following reply was made to PR kern/132722; it has been noted by GNATS.

From: Bruce M Simpson 
To: John Hay 
Cc: Matthias Apitz , freebsd-net@freebsd.org, 
 Sam Leffler ,
 "Sean C. Farley" , bug-follo...@freebsd.org
Subject: Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP
 or IP does not work
Date: Tue, 24 Mar 2009 01:08:33 +

 John Hay wrote:
 > I found that doing a -bgscan before it happens makes it not happen. I now
 > have -bgscan in my rc.conf.
 >   

 That's exactly the workaround I needed. Thanks John.

 As Sam points out, the root fix is probably already in HEAD; it would be 
 nice to find time to backport, but this works for us for now as a 
 workaround (we are just using ath0 as a STA for testing in the lab at 
 the moment, it is likely we will use hostap later).

 cheers,
 BMS

Re: CARP as a module; followup thoughts

2009-04-22 Thread Bruce M. Simpson


Hi,

Will Andrews wrote:

Hello,

I've written a patch (against 8.0-CURRENT as of r191369) which makes
it possible to build, load, run, & unload CARP as a module, using the
GENERIC kernel.  It can be obtained from:

http://firepipe.net/patches/carp-as-module-20090421.diff
  


There's no need to implement the in*_proto_register() stuff in that 
patch, you should just be able to re-use the encap_attach_func() 
functions. Look at how PIM is implemented in ip_mroute.c for an example.


Other than that it looks like a good start... but I would hold off on 
committing as-is; the more general case of registering a MAC address on 
an interface should be considered.


cheers,
BMS

Re: intel 802.11 2200BG routing

2007-04-01 Thread Bruce M. Simpson


Da Rock wrote:


So I could use some guidance as to what I can do to rectify this 
problem. I have 2 goals:

1. set up iwi to start on boot, and attach to my ap whenever it's in range.
2. make sure iwi stays connected without manually monitoring it.
3. prioritise my routes via the rl0 and iwi if's so that cable is used 
over wifi, but both can be used to access the network. 


Umm, that's 3 goals. :^) The short answer is, you can't do what you're 
trying to do, yet.


You can cut over without rebooting, you just need to remember to kill 
off all dhclient processes and manually remove the default route, as in 
FreeBSD all forwarding entries ('routes') reference an interface 
pointer, and the PRC_IFDOWN handler will not touch routes marked RTF_STATIC.


No one as far as I know has rolled a 'cutover' script. What would be 
really useful is a port which can do this cutover in a more general way 
until the stack is changed. This isn't that different from say Microsoft 
Windows where a manual cutover is needed, although the OS having a 
multipath FIB ('routing table') helps.
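
A very rough sketch of what such a 'cutover' script might do, following the manual steps above: kill off dhclient, delete the default route by hand, then start dhclient on the interface being cut over to. The last step is my assumption about how the new default route gets installed; the function names are hypothetical, and the default is a dry run that only returns the commands.

```python
import subprocess


def cutover_commands(new_ifname: str) -> list:
    """Return the command lines for a manual wired/wireless cutover."""
    return [
        ["pkill", "dhclient"],           # kill off all dhclient processes
        ["route", "delete", "default"],  # remove the default route by hand
        ["dhclient", new_ifname],        # assumption: re-lease on the new interface
    ]


def cutover(new_ifname: str, dry_run: bool = True) -> list:
    """Run (or, by default, just return) the cutover commands."""
    cmds = cutover_commands(new_ifname)
    if not dry_run:
        for cmd in cmds:
            # check=False: pkill exits non-zero when no dhclient is running
            subprocess.run(cmd, check=False)
    return cmds
```

Deciding when to trigger it (link state change, AP in range) is the part that really wants support in the stack.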


The long answer is, it's possible, and it requires some things in the 
network stack to be carefully reworked. I have looked at these issues in 
some depth; there are at least 3 items on the Network Stack Wiki which 
are directly relevant to making the kind of clean cut-over between 
wireless/wired interfaces possible.


The notable ones are: looking at the PRC_IFDOWN handler in netinet, making 
forwarding entry lookup skip interfaces marked down, and introducing route 
preference into the routing trie. There are historical reasons why the 
code is the way it is. It will take a while to get these issues 
addressed going forward.


Regards,
BMS

P.S. routed isn't going to help you at all in this situation, it's just 
an implementation of the RIPv2 routing protocol; it may have helped as 
the routes it introduces to the kernel are !RTF_STATIC.


One thing I haven't tried is IPv4 Router Discovery (rdisc), that may 
help update the default route quickly. The problem with this of course 
is the additional network configuration in the infrastructure itself.


Re: IPv6 Router Alert breaks forwarding

2007-04-05 Thread Bruce M. Simpson

I can only speak about IPv4 router alert in detail; we do nothing with 
IPv4 RA nor would it appear that it would make any real difference in 
performance given how the code is laid out. RSVP packets should be 
passed verbatim to userland from ip_input() via rip_input() there.


I think your IPv6 fix is good for now but will wait to hear further from 
[EMAIL PROTECTED]


I am heading out the door so if someone could add an item for this to 
http://wiki.FreeBSD.org/Networking I should be most grateful.


Regards,
BMS

Source-specific multicast

2007-04-06 Thread Bruce M Simpson

I am very close to merging support for RFC 3678 to -CURRENT.  I will 
make a patch available before I commit.


The only userland consumer in the tree which is likely to be affected by 
the removal of ip_multicast_if() from the kernel is routed, which I will 
update to use the new setsourcefilter() API.


The SSM code does change some of the coupling between sockets and IGMP, 
and changes some logic in udp_input; strict multicast membership becomes 
the default. For systems which deal with many multicast sockets and 
traffic, they may benefit from an additional hash table. I haven't 
finished touching the raw IP input path.
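
To make "strict multicast membership" concrete, here is a toy model of the per-socket delivery decision with RFC 3678 style include/exclude source filters. It is only an illustration of the semantics, not the kernel code; the Membership class and should_deliver() are hypothetical names.

```python
from dataclasses import dataclass, field

INCLUDE = "include"   # deliver only from the listed sources
EXCLUDE = "exclude"   # deliver from everyone except the listed sources


@dataclass
class Membership:
    """One socket's membership of a group on an interface (hypothetical)."""
    group: str
    ifname: str
    mode: str = EXCLUDE                       # plain any-source join
    sources: set = field(default_factory=set)


def should_deliver(memberships, group: str, ifname: str, source: str) -> bool:
    """Decide whether a datagram is delivered to a socket."""
    for m in memberships:
        if m.group != group or m.ifname != ifname:
            continue
        if m.mode == INCLUDE:
            return source in m.sources
        return source not in m.sources
    # Strict membership: no matching join on this interface, no delivery
    return False
```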


Given current looming commitments I'm open to someone volunteering to 
finish the work of merging IGMPv3 and MLDv2, or possibly to fund the work.


I wish to get at least the socket part of ASM/SSM merged before I come 
back to Yar's PR with vlan and pfsync, which I have not had reason to 
investigate thoroughly; I have had no further reports of problems with 
carp(4) in -CURRENT.


regards,
BMS

Re: A radical restructuring of IPsec...

2007-04-06 Thread Bruce M. Simpson

I'm all for this in principle. I believe that the case for FAST_IPSEC 
over KAME IPSEC is fairly clear for those of us who have read the USENIX 
paper. Qualitatively speaking I can say FAST_IPSEC has been more 
pleasant to work with when introducing the TCP-MD5 support.


I will try to look at the patch in more detail as time permits.

Regards,
BMS

Re: Spillover routing?

2007-04-07 Thread Bruce M. Simpson


Rajkumar S wrote:

Hi,

I have a low cost 128kbps and a high cost 512 kbps link to internet.
Is it possible to do a "spillover" routing so that the high cost link
is used only when the low cost link is, say, used more than 80%.

This feature is almost certainly not going to be present in the base 
system. To implement it, you would need to configure a part of the 
kernel to perform bandwidth measurements and make an upcall to bring up 
the other link in a dial-on-demand style configuration. Add NAT into 
the mix and it gets even more interesting. I believe pf+altq may have 
the potential to do this; however, I could not tell you where to begin 
configuring it, so I wish you the best of luck in your research.


Regards,
BMS

Call for testers: olsrd and IP_ONESBCAST

2007-04-09 Thread Bruce M Simpson


Hi,

For a while now I have had a patch available to teach olsrd to use 
IP_ONESBCAST instead of using libnet/bpf just to send broadcast 
datagrams in FreeBSD, which has had IP_ONESBCAST for a few years now.
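
For readers unfamiliar with the option, here is a minimal sketch of how a daemon might use IP_ONESBCAST. It is not taken from the olsrd patch; the function names are made up for illustration. The assumption, based on ip(4), is that with the option set a datagram sent to an interface's subnet broadcast address leaves that interface addressed to 255.255.255.255. On systems without IP_ONESBCAST the sketch simply fails with ENOPROTOOPT.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: prepare a datagram socket for undirected
 * broadcast on a single interface.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
enable_onesbcast(int fd)
{
	int on = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on)) == -1)
		return (-1);
#ifdef IP_ONESBCAST
	return (setsockopt(fd, IPPROTO_IP, IP_ONESBCAST, &on, sizeof(on)));
#else
	/* Option not available on this platform. */
	errno = ENOPROTOOPT;
	return (-1);
#endif
}

/*
 * Hypothetical helper: send to the subnet broadcast address of the
 * chosen interface (e.g. 192.168.2.255). The port is expected in
 * network byte order.
 */
ssize_t
send_onesbcast(int fd, struct in_addr ifbcast, in_port_t port,
    const void *buf, size_t len)
{
	struct sockaddr_in sin;

	assert(buf != NULL || len == 0);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = port;
	sin.sin_addr = ifbcast;
	return (sendto(fd, buf, len, 0, (struct sockaddr *)&sin,
	    sizeof(sin)));
}
```

The appeal over libnet/bpf is that the socket stays an ordinary UDP socket, so no raw frame construction is needed.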


If anyone is using olsrd on FreeBSD I would greatly appreciate testing 
and feedback for this patch: 
http://people.freebsd.org/~bms/dump/olsrd-onesbcast.diff


Thanks!
BMS

Re: Interface index hack in IP_ADD_MEMBERSHIP

2007-04-09 Thread Bruce M Simpson


Yar Tikhiy wrote:

Quagga still uses it, too, if its configure script detects FreeBSD
or NetBSD.  I'm afraid it was me who submitted the patch to the
Quagga folks when I'd found that Quagga's ospfd couldn't handle
unnumbered P2P interfaces in FreeBSD because their local IPs weren't
unique.  Unfortunately, Quagga doesn't seem to use the protocol
independent part of the RFC 3678 API yet.
  


A preliminary patch for the Rhyolite.com routed is available at:
   http://people.freebsd.org/~bms/dump/routed.rfc3678.diff

The upcoming rewrite of IPv4 multicast host-mode logic (currently in 
bms_netdev) adds support for the Linux-derived 'struct ip_mreqn' for 
specifying interface indexes to IP_MULTICAST_IF. The RFC 3678 API is 
implemented; IGMPv3 and MLDv2 may be hooked in later on subject to 
available resources.


The RFC 1724 hack has been completely removed from the kernel in this 
spin. The new code passes the existing regression tests for any-source 
multicast. I hope to have source-specific multicast regression tests in 
the main tree ASAP; I am very close to a code drop.
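
To make the difference concrete, here is a rough sketch contrasting the two ways of naming an interface by index. The legacy encoding shown (an address in 0.0.0.0/8 whose low 24 bits carry the index) is my understanding of the RFC 1724 style hack, not a quote of the kernel code, and all function names here are invented for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Legacy style: smuggle an interface index inside an IPv4 address. */
struct in_addr
legacy_index_to_addr(unsigned int ifindex)
{
	struct in_addr a;

	assert(ifindex > 0 && ifindex <= 0x00ffffffU);
	a.s_addr = htonl(ifindex);	/* yields 0.x.y.z */
	return (a);
}

/* Returns the encoded index, or 0 if the address is a real address. */
unsigned int
legacy_addr_to_index(struct in_addr a)
{
	uint32_t h = ntohl(a.s_addr);

	if ((h >> 24) != 0)
		return (0);
	return (h & 0x00ffffffU);
}

/* New style: struct ip_mreqn carries the index in its own field. */
void
fill_mreqn(struct ip_mreqn *mreqn, unsigned int ifindex)
{
	memset(mreqn, 0, sizeof(*mreqn));
	mreqn->imr_ifindex = (int)ifindex;
}

/* Select the outgoing multicast interface by index. */
int
set_multicast_if_by_index(int fd, unsigned int ifindex)
{
	struct ip_mreqn mreqn;

	fill_mreqn(&mreqn, ifindex);
	return (setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF, &mreqn,
	    sizeof(mreqn)));
}
```

The index would normally come from if_nametoindex(3). The explicit field avoids overloading the address, which is what made unnumbered point-to-point interfaces awkward.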


Whilst the radical approach of rewriting this stuff may break legacy 
applications, they should probably be updated to support the new APIs 
anyway, given that Linux 2.6 and Microsoft Windows "Longhorn" both 
support RFC 3678.
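
For application authors wondering what the update looks like, a minimal sketch of a source-specific join using the protocol-independent part of RFC 3678 follows. The helper names, the group 232.1.1.1, and the source 192.0.2.10 are placeholders of mine; only struct group_source_req and MCAST_JOIN_SOURCE_GROUP come from the RFC.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Parse a dotted-quad string into a sockaddr_storage holding AF_INET. */
static int
fill_sin(struct sockaddr_storage *ss, const char *text)
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
#ifdef __FreeBSD__
	sin.sin_len = sizeof(sin);	/* BSD sockaddrs carry a length */
#endif
	if (inet_pton(AF_INET, text, &sin.sin_addr) != 1)
		return (-1);
	memset(ss, 0, sizeof(*ss));
	memcpy(ss, &sin, sizeof(sin));
	return (0);
}

/* Build the request: "group G, but only traffic from source S". */
int
fill_ssm_request(struct group_source_req *gsr, unsigned int ifindex,
    const char *group, const char *source)
{
	assert(gsr != NULL);
	memset(gsr, 0, sizeof(*gsr));
	gsr->gsr_interface = ifindex;
	if (fill_sin(&gsr->gsr_group, group) == -1 ||
	    fill_sin(&gsr->gsr_source, source) == -1)
		return (-1);
	return (0);
}

/* Join (S,G) on the given interface; returns 0 or -1 with errno set. */
int
join_ssm(int fd, unsigned int ifindex, const char *group,
    const char *source)
{
	struct group_source_req gsr;

	if (fill_ssm_request(&gsr, ifindex, group, source) == -1)
		return (-1);
	return (setsockopt(fd, IPPROTO_IP, MCAST_JOIN_SOURCE_GROUP, &gsr,
	    sizeof(gsr)));
}
```

A receiver would call something like join_ssm(fd, if_nametoindex("em0"), "232.1.1.1", "192.0.2.10") after binding the socket; the interface name is again only an example.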


Regards,
BMS

[CODE DROP] KDE support for Avahi service browsing on FreeBSD

2007-04-10 Thread Bruce M Simpson


Hi,

As part of my ongoing work to support Zero-configuration networking in 
FreeBSD, please check out the following patches. I have been able to 
browse and connect to services from within the KDE environment using these.


At the moment, the basic kernel support for Zeroconf/Bonjour is in place 
in FreeBSD. The next challenge there is to support address scope and 
preference for IPv4.


http://people.freebsd.org/~bms/dump/nss_mdns.diff
is a patch for the FreeBSD port of nss_mdns which I may commit shortly 
as it fixes a dynamic symbol issue found by Pat Lashley. nss_mdns must 
be installed and configured in FreeBSD's /etc/nsswitch.conf file before 
proceeding.


http://people.freebsd.org/~bms/dump/avahi-qt3.diff
is a patch for the FreeBSD avahi port to build and install the QT3 
bindings for Avahi. After applying this patch and reinstalling the avahi 
port, please manually edit ${LOCALBASE}/libdata/pkgconfig/avahi-qt3.pc 
and change its 'prefix' line to point to ${X11BASE}, e.g. 
"prefix=/usr/local" would become "prefix=/usr/X11R6".
This is to allow kdnssd-avahi's configure script to find QT's 
Meta-object compiler (moc) which the FreeBSD ports system installs under 
${X11BASE} by default.


[Help from a ports committer to convert this patch into a 'slave port' 
would be very appreciated.]


http://people.freebsd.org/~bms/dump/kdnssd_avahi.tar
is a port for kdnssd-avahi. Installing this port will overwrite the 
default libkdnssd.so.1 library which is installed by the kdelibs port.


After applying both of these changes to your system, you must completely 
restart KDE for them to take effect. Please read the pkg-message file in 
kdnssd_avahi.tar for step-by-step information on how to test the Avahi 
support for KDE in FreeBSD.


I would greatly welcome your further testing and feedback. I apologise 
in advance for the unpolished nature of this work, however, integration 
of Avahi with KDE is an ongoing challenge for many other open source 
projects, and I would hope that the loose ends on FreeBSD become tied 
together in the near future.


Thanks again,
BMS


Re: [PATCH] kern/681110 re-roll of RFC3522 (Eifel detection) patchset

2007-04-12 Thread Bruce M. Simpson


First of all,

Xin: Many thanks for your excellent work on bringing the code up to date.

Mike Silbersack wrote:


No.  That is not going into FreeBSD if I can help it.
http://www.ietf.org/ietf/IPR/ERICSSON-EIFEL
On top of that, we don't need yet another complication to the already 
too-complex retransmission code.

I wasn't aware of Ericsson's submission on this basis. Whilst FreeBSD's 
license is recognised by the OSI, the implications of having code in the 
kernel which is covered by an Ericsson patent are quite grim if anyone 
wishes to use FreeBSD for commercial purposes.


I therefore agree with you that this change should not go in, and 
have removed it from the Wiki.


Kind regards,
BMS

Re: ipv6 multicast refcnt panic

2007-04-12 Thread Bruce M. Simpson


Andrew Thompson wrote:

I have come across this panic which appears to be from incorrect
refcounting on the inet6 multicast code.
  
I'm assuming this is in -CURRENT, as the refcount code has not yet been 
MFCed.


...


in6m_refcount is still 1 so the in6_multi is not freed.


I'll try to investigate further as time permits. Thanks for pointing 
this out, I suspect the same problem affects vlan and other nested cloners.


Regards,
BMS

Re: altq unfortunately queuing vlan traffic.

2007-04-12 Thread Bruce M. Simpson

I can't speak for ALTQ at the moment; however, I believe dummynet may 
work on vlan devices.


I was careful not to break this when rewriting ether_input() in 
-CURRENT, as ip_dn_check_rule() is always called any time ether_demux() 
is entered (regardless if ether_input() has been re-entered due to the 
presence of M_PROMISC on a given mbuf chain).


Regards,
BMS

Re: ipv6 multicast refcnt panic

2007-04-12 Thread Bruce M. Simpson


Andrew Thompson wrote:

I have come across this panic which appears to be from incorrect
refcounting on the inet6 multicast code.
  


I can reproduce this panic, however I don't entirely understand what's 
going on. When the same IPv6 unicast address is configured twice on the 
edsc0 interface, the ifmcstat(8) utility reports that the refcnt for two 
IPv6 multicast addresses changed. I do not understand why the duplicate 
unicast address isn't rejected, or why groups are being joined twice for 
the same address.


I strongly suspect this is a bug in KAME of the same kind as one which 
existed in netinet (whereby the 224.0.0.1 address was being joined more 
than once per ifnet), and which the refcounting change has exposed as a 
panic. A very brief look at the ifaddr code in netinet6 suggests this is 
the case.


Before second address assignment:

edsc0:
   inet6 f00f::1
   group ff01::1%edsc0 refcnt 1
   mcast-macaddr 33:33:00:00:00:01 refcnt 1
   group ff02::2:f23c:3567%edsc0 refcnt 1
   mcast-macaddr 33:33:f2:3c:35:67 refcnt 1
   group ff02::1%edsc0 refcnt 1
   mcast-macaddr 33:33:00:00:00:01 refcnt 1
   group ff02::1:ff00:1%edsc0 refcnt 1
   mcast-macaddr 33:33:ff:00:00:01 refcnt 1

After second address assignment:

edsc0:
   inet6 f00f::1
   group ff02::1:ff00:1%edsc0 refcnt 1
   mcast-macaddr 33:33:ff:00:00:01 refcnt 1
   group ff01::1%edsc0 refcnt 2
   mcast-macaddr 33:33:00:00:00:01 refcnt 1
   group ff02::2:f23c:3567%edsc0 refcnt 2
   mcast-macaddr 33:33:f2:3c:35:67 refcnt 1
   group ff02::1%edsc0 refcnt 2
   mcast-macaddr 33:33:00:00:00:01 refcnt 1

The order of the addresses in the list has flipped around, which makes 
visual comparison that much more difficult. Flipping those around to the 
same order as the first sample yields:


edsc0:
   inet6 f00f::1
   group ff01::1%edsc0 refcnt 2
   mcast-macaddr 33:33:00:00:00:01 refcnt 1
   group ff02::2:f23c:3567%edsc0 refcnt 2
   mcast-macaddr 33:33:f2:3c:35:67 refcnt 1
   group ff02::1%edsc0 refcnt 2
   mcast-macaddr 33:33:00:00:00:01 refcnt 1
   group ff02::1:ff00:1%edsc0 refcnt 1
   mcast-macaddr 33:33:ff:00:00:01 refcnt 1

So we can be sure the addresses themselves haven't changed, but the 
refcount on the IPv6 multicast entries has gone up by 1. The refcount is 
no longer proxied to the ifnet-level ifma object since the code was changed.
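
As an aside for anyone reading the ifmcstat output above, the group and mcast-macaddr pairs follow the standard IPv6 mappings: the solicited-node group is ff02::1:ff00:0/104 plus the low 24 bits of the unicast address, and the Ethernet address is 33:33 followed by the low 32 bits of the group. A small sketch, with function names of my own choosing:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Derive the solicited-node multicast group for a unicast address. */
void
solicited_node(const struct in6_addr *ucast, struct in6_addr *mcast)
{
	/* ff02:0:0:0:0:1:ff00::/104 as 13 leading bytes. */
	static const uint8_t prefix[13] = {
		0xff, 0x02, 0, 0, 0, 0, 0, 0, 0, 0, 0x00, 0x01, 0xff
	};

	assert(ucast != NULL && mcast != NULL);
	memcpy(mcast->s6_addr, prefix, sizeof(prefix));
	/* Low 24 bits come from the unicast address. */
	memcpy(mcast->s6_addr + 13, ucast->s6_addr + 13, 3);
}

/* Map an IPv6 multicast group to its Ethernet multicast address. */
void
mcast_to_mac(const struct in6_addr *mcast, uint8_t mac[6])
{
	mac[0] = 0x33;
	mac[1] = 0x33;
	/* Low 32 bits of the group address. */
	memcpy(mac + 2, mcast->s6_addr + 12, 4);
}
```

For f00f::1 this yields ff02::1:ff00:1 and 33:33:ff:00:00:01, matching the listing. It also shows why ff01::1 and ff02::1 share the same mcast-macaddr.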


I don't entirely understand the relationship between the protocol-level 
multicast addresses and the unicast address in netinet6, or why 
attempting to configure the same unicast address on the same interface 
more than once wasn't rejected with an error.


As far as I can tell the code is correct for the single address case. 
I've attached a patch which makes the netinet6 detach path more like the 
netinet one, though this isn't going to make a great deal of difference 
apart from code style; the net code already calls in6_ifdetach() in the 
right order.


We can weaken the error checking in if_delmulti() to get an operational 
kernel, but this kind of defeats the point of doing the error checking 
(which is there to expose such problems). When reporting problems with 
the networking code it is helpful to use ifmcstat, INVARIANTS, and the 
DIAGNOSTIC kernel option as I tend to add code to catch cases like this.


Regards,
BMS


 //depot/user/bms/netdev/sys/netinet6/in6_ifattach.c#1 - /home/bms/p4/netdev/sys/netinet6/in6_ifattach.c 
--- /tmp/tmp.3746.0	Thu Apr 12 12:32:23 2007
+++ /home/bms/p4/netdev/sys/netinet6/in6_ifattach.c	Thu Apr 12 12:25:55 2007
@@ -76,6 +76,7 @@
 static int get_ifid __P((struct ifnet *, struct ifnet *, struct in6_addr *));
 static int in6_ifattach_linklocal __P((struct ifnet *, struct ifnet *));
 static int in6_ifattach_loopback __P((struct ifnet *));
+static void in6_purgemaddrs __P((struct ifnet *));
 
 #define EUI64_GBIT	0x01
 #define EUI64_UBIT	0x02
@@ -731,8 +732,6 @@
 	struct rtentry *rt;
 	short rtflags;
 	struct sockaddr_in6 sin6;
-	struct in6_multi *in6m;
-	struct in6_multi *in6m_next;
 
 	/* remove neighbor management table */
 	nd6_purge(ifp);
@@ -790,18 +789,10 @@
 		IFAFREE(&oia->ia_ifa);
 	}
 
-	/* leave from all multicast groups joined */
-
 	in6_pcbpurgeif0(&udbinfo, ifp);
 	in6_pcbpurgeif0(&ripcbinfo, ifp);
-
-	for (in6m = LIST_FIRST(&in6_multihead); in6m; in6m = in6m_next) {
-		in6m_next = LIST_NEXT(in6m, in6m_entry);
-		if (in6m->in6m_ifp != ifp)
-			continue;
-		in6_delmulti(in6m);
-		in6m = NULL;
-	}
+	/* leave from all multicast groups joined */
+	in6_purgemaddrs(ifp);
 
 	/*
 	 * remove neighbor management table.  we call it twice just to make
@@ -889,4 +880,23 @@
 	}
 
 	splx(s);
+}

Re: ipv6 multicast refcnt panic

2007-04-13 Thread Bruce M. Simpson

I speculate that the problem you are seeing in netinet6 is due to it not 
freeing referenced in6_multi objects when the interface address changes 
or the same address is re-added, as the same bug was present in netinet. 
Previous to the introduction of refcounting, FreeBSD would just leak memory.


Further to this:

The problem Yar was seeing with vlan and pfsync, which I pointed out, 
was an older bug which has been progressively shuffled around the stack 
due to code rewrites. I have a fix for the kernel panic caused by 
pfsync's member interface being detached which is now checked into 
bms_netdev; it should probably go straight into -CURRENT.


The fix is cumulative -- pfsync's detach handler is called after netinet 
has torn down all inet state for an instance of ifnet, therefore it 
should not be trying to call in_delmulti(), however it should mark the 
ifp as no longer valid for pfsync's use.


A suggested architectural fix going forward, is to change the semantics 
of objects owned by the netinet and netinet6 protocol domains, such as 
multicast group objects, to tear down hardware state when the ifnet 
instance goes away, yet allow consumers elsewhere in the kernel to 
retain handles for such objects. This is what the lower-level net code 
now does for ifmultiaddr objects.


if_delmulti_locked() accepts an argument which specifies whether it is 
being called from if_detach(). If so, hardware state is torn down, and 
internal structures are freed, but the object *is not* freed if its 
reference count is not zero as someone still holds a pointer.
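
A userland toy model of that rule may help. This is not the kernel code; struct maddr and its functions are hypothetical stand-ins, and the only behaviour modelled is "detach tears down hardware state immediately, but the object itself lives until the last reference is dropped".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ifmultiaddr-like object. */
struct maddr {
	int	refcount;
	bool	hw_programmed;	/* filter entry present in the NIC */
	bool	detached;	/* parent interface has gone away */
};

/* Allocate with one reference held and hardware state programmed. */
struct maddr *
maddr_alloc(void)
{
	struct maddr *m = calloc(1, sizeof(*m));

	if (m != NULL) {
		m->refcount = 1;
		m->hw_programmed = true;
	}
	return (m);
}

void
maddr_ref(struct maddr *m)
{
	assert(m != NULL && m->refcount > 0);
	m->refcount++;
}

/*
 * Drop one reference. When called on the detach path, hardware state
 * is torn down regardless of how many references remain. Returns true
 * only if the object itself was freed.
 */
bool
maddr_release(struct maddr *m, bool detaching)
{
	assert(m != NULL && m->refcount > 0);
	if (detaching) {
		m->hw_programmed = false;
		m->detached = true;
	}
	if (--m->refcount > 0)
		return (false);	/* someone still holds a pointer */
	m->hw_programmed = false;
	free(m);
	return (true);
}
```

A consumer such as pfsync would then see a detached but still valid object, and drop its own reference later without touching freed memory.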


In plainer language: netinet and netinet6 should probably be doing the 
same thing as net now does. At present this only applies to ifmultiaddr; 
the same should be done for in_multi and in6_multi.


Of course, it would be easier to do this if per-protocol-domain state in 
ifnet were e.g. moved to the if_afdata[] array currently defined in 
ifnet for this purpose; however, this is guaranteed to break the ABI. 
The situation in ifnet as it stands just now strikes me as one of 
confusion.


Regards,
BMS

Re: [PATCH] Zeroconf: avahi-autoipd support for FreeBSD

2007-04-13 Thread Bruce M. Simpson


Bruce M Simpson wrote:


Comments and feedback, particularly more in-depth testing by another 
contributor, are very welcome. I have tested this on my local 802.11 
wireless segment with ath(4).


Before this can be committed to ports or pushed upstream, it is 
missing an rc script.


There has been feedback from the Avahi guys. I've updated the 
BSD-specific patch (which is against our present port), and the code has 
just been checked into Avahi SVN with some fixes. It would be great if 
someone could find time to look at integrating this.


This way, we get a working autoipd until Fredrik (who will be working on 
Zeroconf for his Google SoC project) can make progress on a flavour of 
autoipd which is suitable for the base system.


P.S. If anyone out there is working on wide-area DNS-SD, please make 
yourselves known to us...


Regards,
BMS


[CODE DROP] Source-Specific Multicast for FreeBSD 7: Phase 1

2007-04-13 Thread Bruce M Simpson


I am proud to announce the first code drop of SSM support for FreeBSD 7.0.

From the README file:
%%%

Source-Specific Multicast for FreeBSD 7.0 -- Phase I

This change brings FreeBSD closer to the standard of multicast API
support offered by Linux 2.6 and Microsoft Windows "Longhorn". It is
mostly of interest to organizations and individuals working with
Internet multimedia applications, and IPv4/IPv6 routing, such as ISPs.
It represents several weeks of work.

The code is written to accommodate IPv6 and MLDv2 with only a little
additional work. A regression test is included under
src/tools/regression/netinet/ipmulticast in the code drop.

The code is available in the bms_netdev branch on perforce.freebsd.org,
or as a patch against -CURRENT extracted from this branch (with additional
files, relative to src) available at:
   http://people.freebsd.org/~bms/ssm_phase1.tar

The work is based on Wilbert de Graaf's IGMPv3 code drop for FreeBSD 4.6,
which is available at: http://www.kloosterhof.com/wilbert/igmpv3.html

%%%

Regards,
BMS

Re: WiFi channel bonding with netgraph - possible ? Back end needed ?

2007-04-14 Thread Bruce M. Simpson


Gore Jarold wrote:

Comments ?  The idea is to bond things into one, single, usable connection that 
could provide multiple connections worth of bandwidth for single-threaded 
network transactions (like downloading a single file from an ftp server).  
Perhaps there is a better tool to do this with than netgraph ?

 


You could try pf's load-balanced NAT feature to deal with the NAT. This 
of course assumes you can configure a FreeBSD router directly with line 
presentation i.e. not using an intermediate box between it and the link 
to the ISP. However, I can't say I much like the idea of trying to tie 
all those nodes together with tunnels transiting the ISP.


Sounds like a clear cut case for 802.11s ESS Mesh... which isn't 
available yet.


BMS

Re: fake MAC addresses and ARP

2007-04-18 Thread Bruce M. Simpson


Some ideas:

1. Enable IFF_STATICARP on your interface to stop ARP sending out to 
resolve the IP/MAC address tuple.


2. Consider that you can deal with resolution in userland (RTF_RESOLVE) 
but this involves changing the net's entry (route) in the FTE. You'd 
then process RTM_RESOLVE messages and install routes yourself -- it's 
possible to do arp in userland with this.


3. Try to avoid using the 169.254.0.0/16 prefix as it has a specific 
meaning. We don't implement interface scoping for these addresses yet so 
the FTE can't deal with them appearing more than once for the same 
subnet; it may be easier to pick something else -- note that if ARP is 
enabled for an interface with one of these addresses, all ARP traffic is 
forced to be broadcast as per the zeroconf RFCs.
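
If your driver needs to check whether it has been handed one of these addresses, the test is a simple mask comparison. A minimal sketch follows; the function name is my own rather than an existing kernel macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* True if the address lies in 169.254.0.0/16 (IPv4 link-local). */
bool
in_is_linklocal(struct in_addr a)
{
	uint32_t h = ntohl(a.s_addr);

	/* 169 = 0xa9, 254 = 0xfe; compare the top 16 bits only. */
	return ((h & 0xffff0000U) == 0xa9fe0000U);
}
```

Picking a prefix for which this returns false sidesteps both the scoping problem and the forced-broadcast ARP behaviour.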


BMS



Re: rtentry and rtrequest

2007-04-18 Thread Bruce M. Simpson


Alan Garfield wrote:

Hi all!

One word HOW! :)

I've no clue what this FreeBSD ARP stuff is all about, there is little
or no documentation, there are 14 different sock_addr's which seem to
have a bazillion different fields, and I cannot output a simple debug
statement without getting 'error: dereferencing pointer to incomplete
type' errors!
  


The ARP code is pretty well documented in TCP/IP Illustrated Volume 2 
and hasn't really significantly changed. Whilst I personally dislike how 
reentry happens in some of the paths, it works. In BSD, ARP lives in the 
routing table, which can be confusing to newcomers; such entries have 
the RTF_LLINFO flag set.


From the sounds of it, if you are having to fake MAC addresses, you 
would be better off just enabling static mode ARP on the interface, 
possibly also enabling IFF_SMART ('manages own routes') on your 
interface and explicitly purging and re-adding your ARP entries from 
within your driver rather than trying to hack the rtrequest code to 
munge things on the fly. arp_rtrequest() is driver-independent code and 
will get hooked up to your code anyway when the net/ framework notices 
that your driver is one of IFT_ETHER.


Regards,
BMS

MFC of ether_input() changes

2007-04-20 Thread Bruce M Simpson


Hi,

Does anyone want to see these changes MFCed, or otherwise object to such 
an MFC?

The introduction of M_PROMISC did the following:

  * Drop frames immediately if the interface is not marked IFF_UP.
  * Always trim off the frame checksum if present.
  * Always use M_VLANTAG in preference to passing 802.1Q frames
to consumers.
  * Use __func__ consistently for KASSERT().
  * Use the M_PROMISC flag to detect situations where ether_input()
may reenter itself on the same call graph with the same mbuf which
was promiscuously received on behalf of subsystems such as
netgraph, carp, and vlan.
  * 802.1P frames (that is, VLAN frames with an ID of 0) will now be
passed to layer 3 input paths.
  * Deal with the special case for CARP in a sane way.

For end users the main change of interest will be the ability for 
FreeBSD to receive 802.1p frames, even if it doesn't do anything with 
the priority fields right now.
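
For reference, the fields in question live in the 16-bit Tag Control Information word of the 802.1Q header: 3 bits of priority, 1 drop-eligible/CFI bit, and a 12-bit VLAN ID. A small sketch of pulling them apart follows; the struct and function names are illustrative only and do not correspond to anything in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded 802.1Q Tag Control Information. */
struct vlan_tci {
	uint8_t		pcp;	/* priority code point, 0..7 */
	uint8_t		dei;	/* drop eligible / CFI bit */
	uint16_t	vid;	/* VLAN identifier, 0..4095 */
};

/* Split a TCI value (already in host byte order) into its fields. */
struct vlan_tci
parse_tci(uint16_t tci)
{
	struct vlan_tci t;

	t.pcp = (uint8_t)(tci >> 13);
	t.dei = (uint8_t)((tci >> 12) & 0x1);
	t.vid = (uint16_t)(tci & 0x0fff);
	return (t);
}

/* A priority-tagged (802.1p) frame is one whose VLAN ID is 0. */
bool
is_priority_tagged(uint16_t tci)
{
	return ((tci & 0x0fff) == 0);
}
```

So a TCI of 0xa000 is priority 5 on "no VLAN", which is the kind of frame that now reaches the layer 3 input paths.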


If I hear 'yeses' I will try to MFC this as time permits.

Regards,
BMS

Re: [CODE DROP] Source-Specific Multicast for FreeBSD 7: Phase 1

2007-04-20 Thread Bruce M. Simpson

I've had some feedback from Robert Watson which has been factored into 
the branch. Thanks, Robert!


If I hear no objections I'll aim to commit this code to -CURRENT within 
the next week, subject to approval. No MFC is planned because of the 
magnitude of the change.


Regards,
BMS

Re: MFC of ether_input() changes

2007-04-22 Thread Bruce M Simpson

Actually, I thought the change which moved the VLAN tag out of the mbuf 
tag pool and into the mbuf packet header had also been MFCed. It has not.


As CURRENT is the branch normally used for feature development it is 
probably best I don't MFC this unless the VLAN tag change is MFCed also.


Therefore there is not a lot of point in merging this change, other than 
to gain the code cleanup which M_PROMISC offers, so I'll back off for 
now.


Cheers...
BMS

Re: Why can't I sendto() to 127.255.255.255

2007-04-30 Thread Bruce M. Simpson


Abraham K. Mathen wrote:


 Is it possible to successfully sendto() on a UDP socket
with 127.255.255.255 as the destination address? If yes,
how can that be done. 


No, because in FreeBSD, lo(4) is not implemented as a broadcast 
interface. It is a multicast capable software loopback interface. It has 
no concept of a broadcast domain. Unicast traffic, as well as multicast 
traffic, is looped back on this interface.


You can see that in the output of 'ifconfig lo0', the BROADCAST flag is 
not set.


RFC 3330 says:
"A datagram sent by a higher level protocol to an address anywhere 
within this block should loop back inside the host."


A few quick tests suggest this does not happen by default on FreeBSD. I 
suspect that this is because although lo0 is configured with 127.0.0.1/8 
by default, a cloning interface route is not added as ARP does not run 
on such an interface. Therefore only a host route for 127.0.0.1 appears 
in the table.


To tell the stack to transmit datagrams destined for 127/8 via lo0 you'd 
do the following:

route -n add 127.0.0.0/8 -net -iface lo0

Nothing will reply as nothing is listening on that address 
(127.255.255.255).


You can configure multiple lo interfaces; they just don't participate in 
a broadcast domain, as they are not broadcast interfaces. However, how 
lo(4) is implemented has the peculiar side-effect that all loopback 
interfaces are in the same 'transmission domain'... tcpdumping on lo0 
will show you traffic on lo1. All loopback ifnet instances see each 
other's traffic, it's just up to the stack to reject it if it's not 
destined for a configured address on that instance.


To try that, you'd 'ifconfig lo1 create' and 'ifconfig lo1 127.0.0.2/32' 
as FreeBSD's network stack does not really allow you to have more than 
one interface configured on the same subnet.


Regards,
BMS

Re: IPPNP

2007-05-21 Thread Bruce M. Simpson


Ozgur Ozdemircili wrote:

Hello,

I have a network of 10.10.10.0 and the gw is at 10.10.10.1. GW is
giving out ip with DHCP. If the client pc is configured with DHCP they
can get the ip from the server and go out to internet easily. But if
the client has* static Ip configured*, for example 192.168.0.2 with gw
192.168.0.1, they cannot find the GW and cannot go out. I need the
clients to be able to find the gw and *go out to internet without
changing their ip configurations.
*
I have searched for technology behind this and it seems like IPPNP is
the solution. This technology is implemented in most of the hotspot
gateways (nomadix, dlink etc)
  


This sounds like another buzzword that the vendors just made up when 
everyone with a clue wasn't looking. Someone's created a Sourceforge 
project for it which is still empty, and search engine matches find many 
vendors describing this feature as 'unique'. The way it has been 
verbally beaten on Linux related lists is something I can't disagree 
with. :^)

   http://article.gmane.org/gmane.linux.network.bridge.ebtables.user/896

The short answer is that this appears to be some kind of MAC based 
gateway protocol; basically, an 'internet access device' (stub router 
for home/small office) will forward any traffic it sees for a subnet 
which it isn't configured with, by spoofing ARP traffic so as to make it 
appear as though it is on that subnet.


This sounds like a configuration nightmare to implement correctly, and 
goodness help you if you have more than one of these things connected to 
the same network. Whilst it probably can be done in the network stack, I 
speculate it couldn't be turned on at the same time as a number of other 
features such as Proxy ARP, or CARP, and may have problems scaling to 
more than a two-armed router (that is, 1 WAN uplink, and 1 Ethernet 
interface running this stuff). It also seems to rely on a few 
assumptions (subnet is a /24, and 0 is the subnetwork address). I think 
it also assumes that the network is 802.1x or MAC address ACL 
authenticated, and that clients are directly attached to the Layer 2 
domain to which the interface which runs the quirk is attached.


I see no IETF standard for this quirk, and it doesn't seem to have been 
as well thought out as the Zeroconf proposals.


So whilst it may seem like a quick fix, it would have to be implemented, 
and the easiest thing for FreeBSD users to do in this situation is 
probably just to configure a separate network alias on their internal 
LAN interface -- something which is obviously more difficult to do with 
the kind of device the quirk is intended for.


Now that I think on it, if IPv4 addresses are scoped, it might be 
possible to implement a knob which says "All IPv4 addresses learned on 
*this* interface have local scope", which in turn implies a 1:M NAT. It 
relies of course on the NAT module e.g. pf being able to join the dots 
and notice that an IP address outside of a configured subnet is being 
used on that interface, however, it would stop the forwarding code doing 
the wrong thing and forwarding the datagram back out the WAN interface 
again after pf has demuxed the inbound datagram there.


To implement this feature properly requires that the forwarding code is 
changed to allow it. The changes required are similar to those needed 
for doing unnumbered IP. An interim solution could probably be 
implemented in userland using bpf which would stash the appropriate 
firewall rules to rewrite the outgoing traffic i.e. make the FreeBSD 
router appear on the subnet which the client thinks it's on.


Regards,
BMS

Re: kern/108197: [ipv6] IPv6-related crash if if_delmulti

2007-05-21 Thread Bruce M. Simpson


Andre Oppermann wrote:

Synopsis: [ipv6] IPv6-related crash if if_delmulti

Responsible-Changed-From-To: freebsd-net->bms
Responsible-Changed-By: andre
Responsible-Changed-When: Sun May 13 18:36:25 UTC 2007
Responsible-Changed-Why: 
Send over to BMS.  He's active in that area and may have fixed the bug already.


http://www.freebsd.org/cgi/query-pr.cgi?pr=108197
  


Sorry, but I have no time to look at this at the moment. Is someone else 
free to look at it?
The fix probably needs to be borrowed from the IPv4 code which adds an 
address to an interface.


This wouldn't be the final fix; the root issue, to my mind, is that 
protocol specific state is contained within struct ifnet, when it 
probably shouldn't be. The address configuration code in both cases is 
therefore somewhat convoluted; FreeBSD lazy-allocates protocol domain 
structures for an instance of struct ifnet, rather than making the 
attachment of a protocol domain to an ifnet an explicit operation.


Thanks,
BMS

Re: asymetric speeds over gigE link

2007-05-21 Thread Bruce M. Simpson

Wilkinson, Alex wrote:
0n Mon, May 21, 2007 at 07:39:06PM +0100, Tom Judge wrote: 

> I have also seen 700Mb/s sustained FreeBSD - FreeBSD using the openssh HPN
> patch set and no extra tuning of the network stack.  Which makes me 
> think that maybe the linux stack needs some tuning?

What is the "HPN patch" ?

http://www.psc.edu/networking/projects/hpn-ssh/

Pittsburgh Supercomputing Center high performance networking patches, 
which have been around for a few years and are maintained, available as 
part of ports/security/openssh-portable.

Sadly, my patches for the ROT13 cipher have not made it into 
OpenSSH/OpenSSL as of yet. +

Regards,
BMS

+ Very capable of line rate encryption. And based on a mature USENET 
technology...


Re: Implement Multi-Protocol Label Switching (MPLS)

2007-05-29 Thread Bruce M. Simpson


Luca Da Col wrote:

Hello,

I would like to know if someone is actively working on MPLS project for
FreeBSD. I would also like to know if James Leu's MPLS implementation for
Linux has been considered as starting point for this project.


No one is actively working on this to the best of my knowledge, however, 
there has been work on updating the kernel support for the Click Modular 
Router which would probably be a more appropriate starting point for 
producing an MPLS implementation which would work in the FreeBSD kernel.


BMS

Re: driver packet coalesce

2007-05-31 Thread Bruce M. Simpson

Jack Vogel wrote:

On 5/31/07, Wilkinson, Alex <[EMAIL PROTECTED]> wrote:

0n Wed, May 30, 2007 at 04:45:05PM -0700, Jack Vogel wrote:

> Does any driver do this now? And if a driver were to coalesce
> packets and send something up the stack that violates mss
> will it barf?

erm, what is meant by "coalesce" ?

combining packets before sending to the stack, aka LRO.

Yup - the firmware for the card's LRO engine would have to know not to 
coalesce packets not destined for the local host. I speculate many cards 
are not smart enough to do this, and LRO is an all-or-nothing 
proposition, as it's a technology designed to optimize for hosts, not 
routers; see recent discussions/slanging matches on end2end.

At the moment there is no central place where we track all layer 2 
addresses for which traffic should be delivered locally. This would 
logically belong in struct ifnet, and clients e.g. CARP would have to be 
taught to add their layer 2 endpoint addresses there.

It seems acceptable to disable LRO if bridging is on and document this 
behaviour.

BMS

Re: Firewalling NFS

2007-06-15 Thread Bruce M. Simpson


Eygene Ryabinkin wrote:

NFSD binds to the port nfsd (2049) and for my -CURRENT both lockd
and statd have '-p' options:
-
$ man rpc.lockd rpc.statd | grep -- -p
 rpc.lockd [-d debug_level] [-g grace period] [-p port]
 -p  The -p option allow to force the daemon to bind to the specified
 rpc.statd [-d] [-p port]
 -p  The -p option allow to force the daemon to bind to the specified
-
Are we talking about same entities?
  


I added the -p switch to mountd(8) a few years ago, as I needed to run a 
read-only NFS server exposed to the outside world; to firewall it I 
needed a deterministic RPC port number, which is what -p gives you. 
Otherwise you have to rely on the TCP wrapper support built into 
rpcbind(8). The rpc.lockd and rpc.statd daemons were recently changed to 
incorporate this switch too, although I don't think it has been 
backported to the 6-STABLE branch yet.


Regards,
BMS


Re: new ARP code review

2007-06-19 Thread Bruce M. Simpson


Julian Elischer wrote:


I have some thoughts on this.
firstly, while it is interesting to have an arp table (ok LLA table) 
on each interface, I'm not sure that it gains you very much.


Unfortunately maintaining a single ARP table is insufficient for 
supporting multiple paths within the IPv4 stack. Even without supporting 
multiple routing paths, we would still need to break out the ARP cache 
in this way so as to support being attached to the same layer 2 domain 
properly (ie two network cards on the same Ethernet segment or switch). 
At the moment if_bridge and netgraph are our get-out-of-jail-free cards, 
they cause the IPv4 stack to be bypassed.


As mentioned elsewhere, the connection of the arp information with the 
routing table means that the arp lookup is virtually free.
Or, at least it used to be in the Uniprocessor world. It's hard to 
beat free.


It's hard to beat hard figures, which is something we don't have at the 
moment.


What we do have is a set of design considerations. Intuition would 
suggest that one lock performs better than two, however, it depends on 
the nature of the lock and on the nature of the data structure lookup.




The comment "Eventually, with this structure you can do the route lookup
only when you need to find the next hop (e.g. when a route
changes etc.) and just the much-cheaper L3-L2 map in other cases."
makes me wonder.. If we are not caching the arp code in the route any more,
then how do we avoid doing a route lookup on each packet?


I don't think you can ever avoid doing a lookup of any kind per packet 
if you're running a router. What you can do is amortize lookup cost over 
time, e.g. two expensive initial lookups followed by one cheaper lookup 
for subsequent packets.


Whatever happens, though, has to play nice with policy forwarding and 
source selection.


This is what complicates matters - otherwise I'd just suggest keeping a 
per-interface hash of ARP entries, an IPv4 routing trie, and a 
per-destination cache hash which returns the combined lookup against the 
trie and the L2 hash -- pretty much what Luigi is suggesting.
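
A minimal sketch of that arrangement, in Python rather than kernel C, and ignoring policy forwarding and source selection entirely. The class names (Fib, Forwarder), the linear longest-prefix scan standing in for the routing trie, and the addresses are all assumptions made for illustration; none of this is FreeBSD code. It shows the amortization: the first packet to a destination pays for two lookups, later packets pay for one.

```python
import ipaddress


class Fib:
    """Longest-prefix-match table; a stand-in for the IPv4 routing trie."""

    def __init__(self):
        self.routes = []   # (network, next_hop or None, ifname)
        self.lookups = 0   # counts expensive lookups, for demonstration

    def add(self, prefix, next_hop, ifname):
        self.routes.append((ipaddress.ip_network(prefix), next_hop, ifname))

    def lookup(self, dst):
        self.lookups += 1
        addr = ipaddress.ip_address(dst)
        best = None
        for net, hop, ifname in self.routes:
            if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
                best = (net, hop, ifname)
        if best is None:
            return None
        # An on-link route has no gateway: the destination is its own next hop.
        return (best[1] or dst, best[2])


class Forwarder:
    """Per-destination cache in front of the FIB and per-interface ARP hashes."""

    def __init__(self, fib, arp_tables):
        self.fib = fib
        self.arp = arp_tables   # {ifname: {ip: mac}}
        self.dst_cache = {}     # dst -> (ifname, mac)

    def resolve(self, dst):
        hit = self.dst_cache.get(dst)
        if hit is not None:
            return hit                                  # the one cheap lookup
        route = self.fib.lookup(dst)                    # expensive lookup 1
        if route is None:
            return None
        next_hop, ifname = route
        mac = self.arp.get(ifname, {}).get(next_hop)    # expensive lookup 2
        if mac is None:
            return None   # a real stack would start ARP resolution here
        self.dst_cache[dst] = (ifname, mac)
        return self.dst_cache[dst]

    def invalidate(self):
        """Drop cached results, e.g. after a route or ARP entry changes."""
        self.dst_cache.clear()
```

The hard part, as noted above, is deciding when invalidate() has to run and what the cache key must include once policy forwarding and source selection are involved; the sketch keys on destination only.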




BTW having a per interface arp table does make sense if there is a particular
thread that is responsible for that interface as only it would need
access to the table and it could be done lock-free if one was careful enough.


The ARP code has to change, that much is certain, but the locking 
strategy has yet to be decided. ARP entries are read far more often than 
they are written, so it seems reasonable that a different lock is used.


BMS

Re: how do you bring IPv6 live without reboot?

2007-06-22 Thread Bruce M. Simpson


ghozzy wrote:


I've found a way:

# sysctl net.inet6.ip6.auto_linklocal=1
# ifconfig em0 down up
will assign link-local address to interface.

after all required interfaces have link-local addresses,
run /etc/rc.d/network_ipv6 start and all will be set ! :)

Well, this may work now, however, don't depend on this behaviour in 
future releases.


The fact that it does work at all is to do with how protocol domain 
attach works with struct ifnet. I am thinking that in future a lot of 
this should change, in order to avoid a number of issues we currently 
have -- this (the inability to re-attach IPv6 without taking down the 
entire interface) is one of them.


BMS

Re: Vimage virtual networking and 7.0

2007-06-22 Thread Bruce M. Simpson


Julian Elischer wrote:


In the future I am hoping to be able to use vimage in our products.
They are based at the moment on 6.1, but I can see in a year they will 
be based on 7.x.


Patches for 7.0 and vimage are currently available in perforce.
What I would like to see is if there are any parts of that patch that
would allow us to make adding of vimage to 7.1 an easier task.

For example, Anything that would prevent vimage from
needing an API change that would prevent it from being added later.


My concern is that this may have already happened. I've been trying to 
do my bit as the years edge on to clean up the networking stack and fix 
bugs. One of my concerns is that the vimage change, which attempts to 
take network stack globals and wrap them into one big structure, may 
intrude on this or be subject to bitrot due to other development.





I am quite disappointed that despite Marko's best efforts, we miss the 7.0
release but if it can be made nonintrusive enough I'd really like to see
if it can get in 7.1.




I appreciate all the hard work Marko has done on this, though I wonder 
if even 7.1 is ambitious.


Personally, if I were "god" I'd put it in now because it can be compiled out
and it wouldn't be compiled by default. Maybe only just bits of it..
for sure I want the ability to have many routing tables.
and I'm not thrilled about the requirement to have my own patch sets 
for this and thus not allowing others to use this feature.


I think there are deeper issues in the network stack overall which need 
to be addressed, such as our lack of support for multipathing, scoped 
addresses, and all the tidyups which need to happen in struct ifnet to 
deal with this.


My concern is that vimage may be a very intrusive change indeed where 
these matters are concerned, unless the vimage patches are being kept 
up-to-date and regression tested as issues are resolved and new features 
added.


BMS

Re: how do you bring IPv6 live without reboot?

2007-06-24 Thread Bruce M. Simpson


George Michaelson wrote:

its interesting that when I sent-pr'd this, I got tut-tutted back to
freebsd questions. In my books, not being able to do this kind of V6
maintenance work on the interface without taking it down probably
deserves to be kept as an open bug!
  


I agree. Please mail me the PR number and I'll reopen it. However, I 
can't make any commitment about when I personally would get time to do 
this as I need to go off and work for a living.


It does however strike me as a sound design choice to make. The network 
stack design in Windows mandates that this is how it has to be -- TDI 
bindings must be explicitly made between the stack and the NDIS 
driver(s). Loopback is handled at TDI layer and does not appear until 
the PF_INET6 domain is attached to the system. Linux has gone part of 
the way down this road.


I believe BSD should do so as well, for a plethora of reasons including 
this one, as well as disentangling protocol domain stuff from struct 
ifnet. As a result ifconfig would get shaken up a bit.


At the moment, the way the BSD stack works, neither IPv4 nor IPv6 are 
attached to struct ifnet until an address is explicitly configured, 
either by the user or by the kernel configuring a link-local address, so 
if you do need to purge protocol domain wide state, your current options 
are to remove the interface (which has been shown to cause problems, 
some of which I have been trying to fix) or reboot.


regards,
BMS

Re: kern/113842: enabling IPv6 post-boot didn't work: required reboot

2007-06-24 Thread Bruce M Simpson

Synopsis: enabling IPv6 post-boot didn't work: required reboot

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: bms
Responsible-Changed-When: Sun Jun 24 22:37:28 UTC 2007
Responsible-Changed-Why: 
Real issue. I may get around to this but am currently allocated
on non FreeBSD stuff.

http://www.freebsd.org/cgi/query-pr.cgi?pr=113842

Re: IPv6 Woes...

2007-06-25 Thread Bruce M. Simpson

Is your routing table correct? My default route entry for IPv6 just 
looks like this:


default   fe80::%gif0   UGS   gif0


and gif0 just looks like this:

gif0: flags=8051 mtu 1280
   tunnel inet a.b.c.d -> x.x.x.x
   inet6 fe80::XXX:XXX:%gif0 prefixlen 64 scopeid 0x8
   inet6 2001:ZZZ:ZZZ::: prefixlen 128

In the output you posted, the next-hop of 2001:4980:1::5 will need to be 
resolved via NDP (hence the LW flags).


You already have a 1:1 endpoint mapping due to the use of the gif IPIP 
header, so the upstream shouldn't need any other tag to demux your 
traffic. You shouldn't need to do anything special with Ethernet in your 
configuration.


Hope this helps.

BMS


Re: IPv6 Woes...

2007-06-25 Thread Bruce M. Simpson


Eric F Crist wrote:


My problem isn't getting out to 2001:4980:1::5, it's getting to my 
LAN, the 2001:4980:1:111::/64 network.  My gateway, the machine from 
which I posted the routing and ifconfig information, is able to ping 
across the tunnel, and to the internet just fine.  Nothing is able to 
get from the gateway to my LAN, however.  Is it a problem with the fxp 
driver, or perhaps my setup with the ethernet bridging?


You appear to have a /64 network address on the inside of your v6 
router. Are you using stateless address auto-configuration? You appear 
to have statically assigned ::145 as a host address on that net.


My setup works fine if I ping the network address of my v6 router from 
the v6 enabled hosts in my lab.


When you ping local machines on the inside LAN from that router, do you 
see NDP entries being created?


You shouldn't need to use bridging to achieve what you want in this 
scenario, in fact it makes no sense because you want to route v6 traffic 
over the gif, therefore ethernet bridging is not relevant here.


regards
BMS

Re: 6.2 mtu now limits size of incomming packet

2007-07-13 Thread Bruce M. Simpson


Mike Karels wrote:

I'd be happy to see the change undone as well.  I (well, our test
group) found this change in a similar way, and it didn't agree with
our previous usage.
  


In -CURRENT my changes to the ethernet input path maintain the use of 
ETHER_MAX_FRAME() however the check is folded under #ifdef DIAGNOSTIC. I 
don't recall adding this conditional or touching it so it seems to be 
something which was already there or added by someone else.


Could be pilot error; its use in -CURRENT seems to apply strictly to the 
use of large-receive offload (LRO).
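
For readers unfamiliar with the check being discussed, here is a rough Python rendering of the arithmetic. The formula is written from memory of the ETHER_MAX_FRAME() macro in net/ethernet.h and should be treated as an assumption rather than a quotation; frame_too_big() is a hypothetical helper, not a kernel function.

```python
ETHER_HDR_LEN = 14          # destination MAC + source MAC + EtherType
ETHER_CRC_LEN = 4           # frame check sequence
ETHER_VLAN_ENCAP_LEN = 4    # one 802.1Q tag
ETHERTYPE_VLAN = 0x8100


def ether_max_frame(mtu, etype, has_fcs):
    """Largest frame expected on an interface with the given MTU (assumed formula)."""
    size = mtu + ETHER_HDR_LEN
    if has_fcs:
        size += ETHER_CRC_LEN
    if etype == ETHERTYPE_VLAN:
        size += ETHER_VLAN_ENCAP_LEN
    return size


def frame_too_big(frame_len, mtu, etype, has_fcs):
    """Hypothetical input-path test: True if the frame exceeds the limit."""
    return frame_len > ether_max_frame(mtu, etype, has_fcs)
```

With an MTU of 1500 this gives 1514 bytes without the FCS, 1518 with it, and 1522 for a VLAN-tagged frame with FCS, which is why a receive-side check of this kind rejects frames larger than the configured MTU allows.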


regards,
BMS

Call for testers: multicast forwarding

2007-07-22 Thread Bruce M Simpson


Hi,

I may have some commercial work coming up which requires me to make 
modifications to the IPv4 multicast forwarding code in Linux. It is 
likely I will prototype the work in FreeBSD. It will probably not be 
released publicly.


To prepare for this I have started cleaning up the MROUTING code; using 
more appropriate data structures, working on removal of the 32 vif 
limitation and other refinements, removal of legacy code which is no 
longer useful.


I'd like to hear from anyone using multicast forwarding on FreeBSD who 
would be interested in testing these changes and suggesting other 
improvements. They will most likely not make the 7.0 release but may 
appear in future versions.


Code is not yet available as a patch set. I am working in the p4 branch 
bms_netdev.


regards,
BMS

P.S. It would be good if there were a way of giving the general public 
read-only access to the p4 tree, this is becoming a blocking limitation 
of the tool for open development.


[PATCH] add check for IP Router Alert

2007-07-25 Thread Bruce M. Simpson

Please see the following patch which adds a check for the IP Router 
Alert option, for use by in-kernel IPv4 protocol domain consumers:

   http://people.freebsd.org/~bms/dump/ipoptions-routeralert.patch

Comments/review before commit appreciated.

regards
BMS

Re: divert and deadlock issues

2007-07-31 Thread Bruce M. Simpson


Christian S.J. Peron wrote:

...
One idea was to duplicate the socket options mbuf and pass in a NULL pointer
for the multi-cast options.  Keep in mind that these are multicast options
associated with a divert socket.

So I guess the questions:

(1) Are there any users that are specifying multicast options on divert sockets?
(2) Are there any users that are specifying socket options in general for
divert sockets?
  


The LOR is obviously being triggered by ip_output()'s acquisition of 
in_multi_mtx, due to a datagram being sent to a multicast destination 
and a subsequent lookup being required.


I can't think of a reason why a user would wish to supply any multicast 
socket options to a divert socket, other than the 'small' ones, i.e. 
IP_MULTICAST_TTL/IF/LOOP/VIF.


See the comments about idempotence inside in_mcast.c on the HEAD branch, 
about why you can't just wish them away. It seems reasonable that this 
subset of the multicast options are supported for divert sockets given 
the likely use cases, even if IPPROTO_DIVERT supports IP_HDRINCL, 
because IP_MULTICAST_TTL does not do what you think it does (see 
in_mcast.c comments again).


Joining groups on a divert socket SHOULD NOT be supported (it does not 
make sense semantically) and we should deliberately return EINVAL for 
multicast options other than the above subset.
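
The proposed policy can be sketched as a simple option filter. This is Python, not the kernel's setsockopt path; the function name divert_mcast_opt_check() is hypothetical, option names are plain strings rather than real constants, and the two membership options are just examples of what would be refused.

```python
import errno

# The "small" options named above, which would remain supported.
ALLOWED_ON_DIVERT = frozenset({
    "IP_MULTICAST_TTL",
    "IP_MULTICAST_IF",
    "IP_MULTICAST_LOOP",
    "IP_MULTICAST_VIF",
})

# Examples of group membership options, which make no sense on a divert socket.
OTHER_MULTICAST = frozenset({
    "IP_ADD_MEMBERSHIP",
    "IP_DROP_MEMBERSHIP",
})


def divert_mcast_opt_check(optname):
    """Return 0 if the option is accepted, errno.EINVAL if it is refused."""
    if optname in ALLOWED_ON_DIVERT:
        return 0
    if optname in OTHER_MULTICAST:
        return errno.EINVAL
    return 0   # not a multicast option; outside the scope of this sketch
```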


Dropping the inpcb lock over ip_output() looks like the easy option. 
Alternatively, we could just not support multicast options on divert 
sockets given that it is a rare use case as per above.


BMS

Re: divert and deadlock issues

2007-07-31 Thread Bruce M. Simpson


Christian S.J. Peron wrote:
I can't think of a reason why a user would wish to supply any multicast 
socket options to a divert socket, other than the 'small' ones, i.e. 
IP_MULTICAST_TTL/IF/LOOP/VIF.



Why would these options ever be set on the divert socket itself though?
To me it would make sense if these options were set on the network
socket that originally sent the multicast packet itself.
  
They shouldn't be necessary, however I can foresee situations where 
someone might well want to redirect multicast datagrams traversing an 
IPPROTO_DIVERT socket, by using these socket options. [Recall that 
FreeBSD's IPv4 stack currently uses the destination address as the sole 
primary key for lookups in the forwarding information base's radix trie.]


This is however very unlikely, so my last suggestion, that multicast 
options be deprecated or forbidden for IPPROTO_DIVERT sockets, stands.


Kind regards
BMS

Re: divert and deadlock issues

2007-08-01 Thread Bruce M. Simpson


Christian S.J. Peron wrote:

Well, it's still the intent to keep the ability to divert and re-inject
multicast packets.  This change would basically say: "You cant specify
multicast options via the divert socket". Which in practice doesn't
happen anyway (where I looked).

I dont think we should be specifying multicast options on divert sockets.
It's not the right place to be manipulating multicast parameters.  Multicast
parameters should be set on the sockets that originally transmitted or
received the packets.  I dont think divert falls into this category.
  

Correct.

The definition of what a divert socket is and does, falls outside the 
definition of what a multicast socket endpoint is.

Divert sockets exist to munge packets as they flow up or down the stack.

If the additional complexity of treating divert sockets as multicast 
endpoints causes locking issues in the stack, common sense suggests we 
should deprecate that behaviour.


BMS



Re: routing local traffic w/o using loopback interface

2007-08-17 Thread Bruce M. Simpson


rajneesh rana wrote:

hello all,

i am opening up two tap interfaces, both connected to bridge, assigning them
IP addresses and want to open up tcp connection b/w them without using
loopback interface, so i bind client socket to first tap using
SO_BINDTODEVICE option and socket server listening on other tap device.
The problem is that when i m calling connect, it is giving timeout error.
  
I am confused by your question because to the best of my knowledge the 
SO_BINDTODEVICE socket option does not exist in FreeBSD.

Is it possible to route traffic b/w two interfaces of same machine w/o
using loopback interface and kernel hacking.

Yes, I use if_bridge for this on a daily basis.

regards,
BMS

Re: Failover default route?

2007-08-18 Thread Bruce M. Simpson


Tuc at T-B-O-H.NET wrote:

In my case, as always, its a bit "special". I have
2 OPENVPN tunnels, which I sent over different transits to
the same end host. On that host, I do my NAT. SO, without
getting into all sorts of hot/heavy things, is there a simple
program to install to ping something via the first tunnel,
and if it can't then switch my default route to the second
tunnel? Or, do I just use a script like here :

As Bill correctly points out, reachability detection using a routing 
protocol is often the preferred method, however this isn't always 
available. Pinging is NOT the best practice, see RFC 1122 3.3.1.4:
http://www.freesoft.org/CIE/RFC/1122/56.htm


You could use ifstated to detect changes in the tunnel interface status 
and switch default routes accordingly, though it doesn't significantly 
reduce the amount of manual scripting you have to do.
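
The scripting that remains is roughly the following decision, sketched here in Python. The interface names (tun0, tun1), the gateway addresses, and both function names are assumptions for illustration; how the up/down state is obtained (ifstated, a routing protocol, or something else) is deliberately left out, and the command is only built, never executed.

```python
def pick_default_route(tunnels, is_up):
    """Return the first usable tunnel, in order of preference, or None.

    tunnels: interface names, most preferred first.
    is_up:   mapping of interface name to its current link state.
    """
    for ifname in tunnels:
        if is_up.get(ifname, False):
            return ifname
    return None


def route_commands(current, wanted, gateways):
    """Build the route(8) invocation needed to move the default route."""
    if wanted is None or wanted == current:
        return []   # nothing usable, or nothing to change
    return [["route", "change", "default", gateways[wanted]]]
```

For example, with tun0 down and tun1 up, pick_default_route(["tun0", "tun1"], ...) selects tun1, and route_commands() yields a single "route change default" command pointing at tun1's gateway.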


Microsoft's TCP implementation performs dead gateway detection based on 
triggered reselection as per RFC 816, however, they have a multipath 
capable FIB which can hold the multiple next-hops and their state -- 
something to consider for later.


An incremental piecemeal change which folks might find OK may be to add 
cost metrics back to the kernel radix trie, but that still has all the 
aggro of changing the API.


regards
BMS





Re: Route caching ?

2007-08-22 Thread Bruce M. Simpson


Ivo Vachkov wrote:

Does FreeBSD rtalloc*() (or any other) functions implement route
caching and how ? I looked at the code but it's not exactly easiest
thing to read / understand :)

Not really, at least, not in the way one would think. rtalloc() is a 
legacy function.


ip_output() will still call rtalloc() if you pass it a filled out 
'struct route', a structure which is not a route, but an internal 
request to look up a route.


This is a wrapper for rtalloc_ign(), which in turn is a wrapper for 
rtalloc1(), the function which does the actual lookup.


rtalloc_ign() is pretty straightforward. Note however that this approach 
only checks the RTF_UP flag and ifp, nothing more. This makes it 
suitable for implementing floating statics, but nothing more dynamic 
than that.


regards,
BMS

Re: Allocating AF constants for vendors.

2007-08-22 Thread Bruce M. Simpson

I second Max. If you are going to introduce a bunch of AF_* constants 
into the tree you have to be very careful as AF_MAX is used to size 
arrays and figure out how many radix trie heads to allocate.


It could be argued this wastes a bunch of CPU time and memory, though I 
speculate 'not much' at the moment; I am just a bit concerned that we 
have ifnet->if_afdata which is also sized based on AF_MAX, 37, even 
though most of the protocols in it are never attached to ifnets.


The only domain I've seen which really uses if_afdata is PF_INET6. 
PF_INET does not use it at all. In my opinion, there are structures 
per-family per-ifnet which really belong hung-off ifnet on a 1:1 basis 
and would simplify some of the lazy allocations we have further down in 
the stack.


If AF_MAX increases significantly so will wasted memory. If you are 
going to make any significant changes here, please consider moving 
this stuff to a more dynamic method of allocation.
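
To make the trade-off concrete, here is a small Python model of the two allocation styles. The class names are hypothetical, AF_MAX uses the value quoted above, and the vendor family number 200 is invented for the example; this is a model of the sizing argument only, not of struct ifnet.

```python
AF_MAX = 37      # value quoted in the message above
AF_INET6 = 28    # FreeBSD's value, used here only as a realistic index


class IfnetFixed:
    """if_afdata as a fixed array sized by AF_MAX: one slot per family, used or not."""

    def __init__(self):
        self.if_afdata = [None] * AF_MAX

    def attach(self, af, data):
        self.if_afdata[af] = data   # raises IndexError for af >= AF_MAX

    def slots(self):
        return len(self.if_afdata)


class IfnetDynamic:
    """if_afdata allocated on demand: one slot per family actually attached."""

    def __init__(self):
        self.if_afdata = {}

    def attach(self, af, data):
        self.if_afdata[af] = data

    def slots(self):
        return len(self.if_afdata)
```

Attaching only IPv6 state leaves the fixed layout holding 37 slots against 1 for the dynamic one, and raising AF_MAX to cover vendor constants grows every fixed array on every interface, whereas the dynamic layout accepts a family number such as 200 at no extra cost.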


On the other hand, if you don't need to reference these constants in the 
kernel at all, and they will all exist beyond AF_MAX, then you can 
disregard what I've said and append them to the rest of the list.


That is pretty much what happens for the libpcap/bpf DLT constants 
(which are not an exact analogue of the AF constants - we don't allocate 
other, larger kernel structures based on their value).


regards,
BMS

Re: Route caching ?

2007-08-22 Thread Bruce M. Simpson


Ivo Vachkov wrote:

Actually there is:

struct  route_in6 ip6_forward_rt;

that "caches" the last route used (thanks blue !!!) but i think this
technique is pointless in a multiflow traffic.
  


Yes, this is why OpenBSD got rid of this form of 'route caching'.


Is it reasonable to believe that route caches can improve networking
performance or we should leave it up to the routing table itself ?
  


I believe that if one goes beyond a single radix trie, as is needed for 
multi-pathing with multicast and source policy routing, route caching is 
*required* to achieve good performance.


Also, if FreeBSD moves ARP and NDP out of the radix trie, a route cache 
would be highly preferable as it amortizes the lock acquisition which 
would otherwise be required for ARP/NDP/other layer 2 next-hop resolution.


BMS

Re: Route caching ?

2007-08-22 Thread Bruce M. Simpson


Claudio Jeker wrote:

Just because you believe that route caches are great doesn't mean it is
true. Show some real code and include benchmarks with various workloads
(e.g. a core router that is hit by many many many sessions).
  


It is a reasonable approach, for a uniprocessor design, to focus on 
optimizing the route lookup as much as possible. Does this approach 
scale to SMP, though? This is still a very much open question and from 
what I have seen of the OpenBSD implementation, it only addresses the 
uniprocessor case - again please correct me here if I have missed any 
details.


I believe the Linux dst cache is strongly tied to the IBM-patented 
Read-Copy-Update (RCU) algorithm based on what I've read about their 
LC-trie implementation.



Until now all caching solutions resulted in very bad performance on busy
boxes. Remember ip_fastforward or how was it called? Another example are
all crappy L3 switches that burn down if the CAM (cache) is flooded.
  


I assume you are referring to NetBSD's flow-based IP forwarding cache, 
which was implemented outside of the scope of SMP; spl-style interrupt 
priority masking was still in use at that time.


It is established that saturating content-addressable memory is going to 
lead to the slow path being taken, however, that's the trade-off one 
makes with these designs.
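
The saturation effect is easy to reproduce with a toy model. The sketch below is a plain LRU cache in Python, not a model of any particular CAM or of any BSD forwarding cache; FlowCache and hit_rate() are hypothetical names, and the cyclic traffic pattern is chosen as a worst case for LRU.

```python
from collections import OrderedDict


class FlowCache:
    """Bounded least-recently-used cache in front of a slow-path lookup."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, dst, slow_path):
        if dst in self.entries:
            self.entries.move_to_end(dst)      # mark as most recently used
            self.hits += 1
            return self.entries[dst]
        self.misses += 1
        result = slow_path(dst)                # fall back to the full lookup
        self.entries[dst] = result
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used
        return result


def hit_rate(capacity, flows, rounds):
    """Cycle through `flows` destinations `rounds` times; return the hit ratio."""
    cache = FlowCache(capacity)
    for _ in range(rounds):
        for flow in range(flows):
            cache.lookup(flow, lambda d: ("next-hop", d))
    return cache.hits / (cache.hits + cache.misses)
```

With as many flows as cache entries, only the first pass misses; with a single flow more than the cache can hold, this access pattern evicts each entry just before it is needed again and every lookup takes the slow path.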



IMO it is better to make the route lookup faster and forget about caching.
  

My concern is that you may be comparing apples with oranges here.

In the case of SMP, locking does become a consideration, and caches, if 
carefully implemented, are one way of addressing this.


On the other hand, CPU affinity has been proposed as a limited solution, 
however it depends how this is implemented - affinity for lookups, 
forwarding, or both?


Perhaps there is something I am missing about how the OpenBSD 
implementation deals with SMP, as I am not as familiar with their code 
as FreeBSD's.


regards,
BMS
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Re: quagga 0.99.8 on current, tcpmd5 config confusion

2007-08-24 Thread Bruce M. Simpson


Randy Bush wrote:

just did a cvsup build and portupgrade of a six month old -current
i386 system running quagga.  quagga cranked to 0.99.8.  i got
slammed by bgp tcpmd5 requirement.

bgpd[469]: can't set sockopt TCP_MD5SIG 0 to socket 17
bgpd[469]: can't set sockopt TCP_MD5SIG 0 to socket 18
bgpd[469]: can't set sockopt TCP_MD5SIG 0 to socket 22

madly googled and found that i needed to hack kernel for tcp md5
hash, even though i am not using md5 auth (these are not really
infrastructure peerings.  yes i know better for production).
  


This I haven't seen before, then again, it's been years since I've used 
Zebra/Quagga let alone hacked the patch for md5 support, which is now 
~3.5 years old. It was only ever intended as a belt-and-braces attempt 
at getting things up in a way which the sponsor was satisfied with, with 
no other refinements.


I wasn't 100% happy about how I ended up doing the kernel support, and 
had to go with what I had working in my tree because of that old demon 
'economics', rather than doing things 'the right way': i.e. in the IPSEC 
Security Policy Database (SPD), with the routing daemon loading the 
keys, rather than the Security Associations Database (SADB) and keys 
loaded manually using setkey(8).


Other individuals have since made changes to this code. Now that we have 
settled on FAST_IPSEC thanks to gnn's hard work, it will be easier for 
Someone(tm) to pick this up, as KAME IPSEC and FAST_IPSEC interfaced to 
key sockets differently enough to change the implementation of the SPD.



with this kernel, i got a lot of whining about no keys

tcp_signature_compute: SADB lookup failed for 666.42.69.96
  


I remember putting in the SADB lookup failed message to help people 
track down problems with their configuration. If TCP_MD5SIG is not 
enabled on the tcp socket, no SADB lookup should happen, so you 
shouldn't be seeing this message.


It sounds to me as though Quagga may be enabling the TCP_MD5SIG option 
unconditionally based on all of the output you've posted. This is 
obviously incorrect. I can't speak for Quagga, though it seems 
reasonable to suggest that it shouldn't be doing that unless you tell it 
to. I believe the MD5 patches only get pulled in if you request them, 
and that md5 auth specifically needs to be enabled per peer.


Still, this is nearly 4 years on and I have other things going on now.

regards,
BMS

Re: nc captures 1024 bytes

2007-08-28 Thread Bruce M. Simpson


Looks like a netcat bug, if it doesn't tune buffers to the interface MTU.

I'm not sure if nc has a 'de facto' maintainer however I believe it is 
something which was recently imported into the FreeBSD base system.


Still, it is better to try to field patches with the upstream maintainer 
before filing a FreeBSD PR with your patches.


regards,
BMS

Re: [EMAIL PROTECTED]: Re: rtfree: 0xffffff00036fb1e0 has 1 refs]

2007-08-28 Thread Bruce M. Simpson


Christian S.J. Peron wrote:

I am not sure who has their hands in the routing code these days so
I figured I would just forward this message off here.  Does the
following look reasonable?
  

I'm looking, but mostly with long range goggles on.

Yes, this looks like the right change. rtalloc1() always returns an 
rtentry with the mutex for that rtentry held.


regards
BMS

Re: nc captures 1024 bytes

2007-08-28 Thread Bruce M. Simpson


Weiguang Shi wrote:

nc might be waiting on all the interfaces; enumerating MTUs and choosing the largest
sounds complicated, especially when some interfaces can be configured to receive
jumbo frames. Why not just use something like 64KB as the other user
suggested or something even larger?
  


That is the easy fix, yes. :^)

If the socket's pcb laddr is bound to an IP, and the IP to which it is bound 
stays on the same physical interface, then the MTU may easily be 
obtained. If it's INADDR_ANY, or you expect the IP to be dynamically 
reconfigured on another interface, then auto-tuning is not possible.
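
Put as code, the choice looks something like this. It is a Python sketch of the decision only, not a patch for nc; recv_buffer_size() and the mtu_by_addr mapping are hypothetical, with the mapping standing in for whatever interface query a real program would make.

```python
FALLBACK_BUFSIZE = 64 * 1024          # the "easy fix" suggested in the thread
WILDCARDS = frozenset({"0.0.0.0", "::"})


def recv_buffer_size(local_addr, mtu_by_addr):
    """Pick a receive buffer size for a socket bound to local_addr.

    mtu_by_addr maps a locally configured address to the MTU of the
    interface it currently lives on (an assumed, precomputed lookup).
    """
    if local_addr in WILDCARDS:
        # Bound to INADDR_ANY or its IPv6 equivalent: no single interface
        # to tune for, so use the fixed fallback.
        return FALLBACK_BUFSIZE
    return mtu_by_addr.get(local_addr, FALLBACK_BUFSIZE)
```

Note that the tuned value is only as good as the mapping: if the address later moves to an interface with a larger MTU, the buffer is too small again, which is the argument for simply using the fallback everywhere.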


regards,
BMS

Re: [EMAIL PROTECTED]: Re: rtfree: 0xffffff00036fb1e0 has 1 refs]

2007-08-29 Thread Bruce M. Simpson

BTW: Casual inspection with kscope suggests there is a similar 
free-while-locked issue in nd6_ns_input() (netinet6/nd6_nbr.c) and 
in_arpinput() (netinet/if_ether.c).


nd6_ns_input() references rt->rt_gateway after rtfree(), a potential 
race not to mention a use-after-free.


I haven't checked Coverity for this, but it just doesn't look right.

BMS


Re: vlan stacking

2007-08-29 Thread Bruce M. Simpson


Ivan Alexandrovich wrote:

Hi

I'm wondering is anybody using double vlans ("q-in-q",
"vlan stacking", any name you like) on production hosts?
Does it play well with common ethernet device drivers in freebsd
(concerning the frame size) -  fxp, em, for example?

Looks like that almost nobody mentions q-in-q in freebsd maillists/forums,
except that nesting ng_vlan can be used to implement it.


I'm sure you or someone else can come up with a creative solution for 
Q-in-Q or arbitrary nesting levels. It's not something I use, so, I pass.


The mainline code doesn't support it without Netgraph; it would be 
necessary to allow vlan(4) to be nested. The ether_input() code demuxes 
802.1q encapsulation, but only one level. The reason for this is that 
the outer VLAN tag got moved into the mbuf pkthdr structure for 
if_bridge to be able to process it.
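
To show what one level versus arbitrary nesting means on the wire, here is a small userland parser in Python. It is not the ether_input() logic; parse_vlan_tags() and its max_depth parameter are hypothetical, and the only protocol facts relied on are the 802.1Q tag layout (a 16-bit TPID followed by a 16-bit TCI whose low 12 bits are the VLAN ID) and the TPID values 0x8100 and 0x88A8.

```python
import struct

TPIDS = frozenset({0x8100, 0x88A8})   # 802.1Q and 802.1ad tag protocol IDs


def parse_vlan_tags(frame, max_depth=None):
    """Strip VLAN tags from an Ethernet frame.

    Returns (vids, ethertype, payload_offset), where vids lists the VLAN IDs
    from outermost to innermost. max_depth limits how many tags are removed;
    None means keep going until a non-VLAN EtherType is found.
    """
    offset = 12                                   # skip both MAC addresses
    vids = []
    (etype,) = struct.unpack_from("!H", frame, offset)
    while etype in TPIDS and (max_depth is None or len(vids) < max_depth):
        (tci,) = struct.unpack_from("!H", frame, offset + 2)
        vids.append(tci & 0x0FFF)                 # low 12 bits are the VLAN ID
        offset += 4                               # one tag is four bytes
        (etype,) = struct.unpack_from("!H", frame, offset)
    return vids, etype, offset + 2
```

Called with max_depth=1 on a double-tagged frame, the parser returns the outer VLAN ID and reports the inner tag's TPID as the EtherType, which is the situation a single-level demultiplexer leaves for the next layer to deal with.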


I can't comment on the netgraph solution however.

regards
BMS
