Re: Route messages
Paul wrote:
> Get these with GRE tunnel on FreeBSD 7.0-STABLE
> FreeBSD 7.0-STABLE #5: Sun May 11 19:00:57 EDT 2008 :/usr/obj/usr/src/sys/ROUTER amd64
> But do not get them with 7.0-RELEASE
> Any ideas what changed? :) Wish there was some sort of changelog..
> # of messages per second seems consistent with packets per second on GRE interface..
> No impact in routing, but definitely impact in cpu usage for all processes monitoring the route messages.

RTM_MISS is actually fairly common when you don't have a default route.

Messages which get enqueued don't necessarily get delivered -- and very few processes actually listen to the routing socket actively like this, so I wouldn't worry about it.

If it's a real concern for you then you could try hacking in a sysctl to tell the radix trie code not to issue RTM_MISS messages on the routing socket.

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: HEAD UP: non-MPSAFE network drivers to be disabled (was: 8.0 network stack MPsafety goals (fwd))
Robert Watson wrote:
> An FYI on the state of things here: in the last month, John has updated a number of device drivers to be MPSAFE, and the USB work remains in-flight. I'm holding fire a bit on disabling IFF_NEEDSGIANT while things settle and I catch up on driver state, and will likely send out an update next week regarding which device drivers remain on the kill list, and generally what the status of this project is.

Goliath needs to get stoned, it's been a major hurdle in doing IGMPv3/SSM because of the locking fandango. I look forward to it.

[For those who ask, what the hell? IGMPv3 potentially makes your wireless multicast better with or without little things like SSM, because of protocol robustness, compact state-changes, and the use of a single link-local IPv4 group for state-change reports, making it easier for your switches to actually do their job.]
Re: BPF problems on FreeBSD 7.0
Robin Sommer wrote:
> Hi all, we're seeing some strange effects with our libpcap-based application (the Bro network intrusion detection system) on a FreeBSD 7-RELEASE system. As the application has always been running fine on 6.x, we're wondering whether this might be triggered by any of the changes that went into 7.
> ...
> I'm wondering whether anybody here has seen something similar or might have an idea where to start looking for the cause. Any ideas?

One place to start might be: netstat -B output in 7.x (I *think* this got MFCed), this will let us see what the drop count is for the Bro process, and what the flags are for the open BPF descriptors in the system.

I'm not hot on current BPF internals, but I hazard a guess this is related to BPF descriptor buffering -- an area where there have been changes, some of which I've eyeballed.

cheers
BMS
Re: Small patch to multicast code...
[EMAIL PROTECTED] wrote:
> The only thing i can think of is that it's the UDP checksum, residing beyond hlen, which is overwritten somewhere in the call to if_simloop -- in which case perhaps a better fix is to m_pullup() the udp header as well ?
> It is the checksum that gets trashed, yes.
> ...
> The m_*() routines actually have reasonable comments, it just seems the wrong one was used here.

Actually, m_copy() has been legacy for some time now -- see comments.

I'd be concerned that the change to m_dup() (which makes a full mbuf chain copy) rather than m_copym() (which bumps refcounts) is going to eat into the mbuf clusters on fast links, though it's an easy band-aid for the problem.

I agree with Luigi that some of the API contract for mbuf(9) doesn't hold any more now that we have TSO and other offload.

cheers
BMS
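The difference between a reference-bumping copy and a full copy can be pictured with a small userland model. This is only a sketch under stated assumptions: `struct storage`, `struct buf`, `buf_share()` and `buf_dup()` are hypothetical stand-ins for cluster-backed storage, m_copym() and m_dup(); none of it is kernel code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for cluster storage with a reference count. */
struct storage {
    int refs;
    size_t len;
    unsigned char *data;
};

/* Hypothetical stand-in for an mbuf pointing at (possibly shared) storage. */
struct buf {
    struct storage *st;
};

/* Allocate storage and copy len bytes in; aborts on failure for brevity. */
struct buf
buf_new(const void *src, size_t len)
{
    struct buf b;

    b.st = malloc(sizeof(*b.st));
    if (b.st == NULL)
        abort();
    b.st->data = malloc(len);
    if (b.st->data == NULL)
        abort();
    b.st->refs = 1;
    b.st->len = len;
    memcpy(b.st->data, src, len);
    return (b);
}

/* Models the refcount-bumping copy: cheap, but both handles see the same bytes. */
struct buf
buf_share(struct buf b)
{
    b.st->refs++;
    return (b);
}

/* Models the full copy: costs an allocation, but writes stay private. */
struct buf
buf_dup(struct buf b)
{
    return (buf_new(b.st->data, b.st->len));
}

/* Drop one reference and release the storage with the last one. */
void
buf_free(struct buf b)
{
    if (--b.st->refs == 0) {
        free(b.st->data);
        free(b.st);
    }
}
```

In this model a write through the shared handle shows up in the original, which is the kind of trashing discussed in the thread, while a write through the duplicated handle does not. The price of the duplicate is one extra allocation per packet.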
Re: Small patch to multicast code...
[EMAIL PROTECTED] wrote:
> I gather you mean that a fast link on which also we're looping back the packet will be an issue? Since this packet is only going into the simloop() routine.

We end up calling if_simloop() from a few "interesting" places, in particular the kernel PIM packet handler.

In this particular case we're going to take a full mbuf chain copy every time we send a packet which needs to be looped back to userland.

> I was actually hoping, as the person who last hacked this code, that you might have a suggestion as to a "right" fix.

It's been a while since I've done any in-depth FreeBSD work other than hacking on the IGMPv3 snap, and my time is largely tied up with other work these days, sadly.

It doesn't seem right to my mind that we need to make a full copy of an mbuf chain with m_dup() to workaround this kind of problem. Whilst it may suffice for a band-aid workaround, we may see mbuf pool fragmentation as packet rates go up.

However we are now in a "new world order" where mbuf chains may be very tied to the device where they've originated or to where they're going.

It isn't clear to me where this kind of intrusion is happening.

In the case of ip_mloopback(), somehow we are stomping on a read-only copy of an mbuf chain. The use of m_copy() with m_pullup() there is fine according to the documented uses of mbuf(9), although as Luigi pointed out, most likely we need to look at the upper-layer protocol too, e.g. where UDP checksums are also being offloaded.

Some of the code in the IGMPv3 branch actually reworks how loopback happens i.e. the preference is not to loop back wherever possible because of the locking implications. Check the bms_netdev branch history for more info.

cheers
BMS
Re: Small patch to multicast code...
[EMAIL PROTECTED] wrote:
> Somehow the data that the device needs to do the proper checksum offload is getting trashed here. Now, since it's clear we need a writable packet structure so that we don't trash the original, I'm wondering if the m_pullup() will be sufficient.

If it's serious enough to break UDP checksumming on the wire, perhaps we should just swallow the mbuf allocator heap churn and do the m_dup() for now, but slap in a big comment about why it's there.

BMS
Re: Code review request
M. Warner Losh wrote:
> I've been shepherding this patch in my p4 tree for a long time. It removes the obsolete support for other systems in if_spppsubr.c. Is there a reason I shouldn't commit this?

Looks fine to me.
Re: [CFT/R] IPv4 source address selection
Bjoern A. Zeeb wrote:
> Hi, I have a patch, that was inspired by work from Y!, to do proper IPv4 source address selection for unbound sockets (with multi-IP jails).

Hi,

This kinda overlaps with some other ideas I'd like to see go in. It looks good and if it's already been tested, it should probably go in anyway as it disentangles the logic and puts it in a separate function.

I'm thinking we may wish to use criteria other than interface or jailed socket to select source address. I should point out though that we picked some stuff up from KAME to do source address selection but it's not in the IPv4 stack.

cheers
BMS
Re: reading routing table
Debarshi Ray wrote:
> I am implementing a library/utility which basically encompasses the features of the traditional route utilities and those of newer tools (like ip from iproute2), which are mostly specific to a particular kernel. The overpowering objective is to make the library/utility work uniformly across all different kernels, so that programs like NetworkManager have a portable library/utility to use instead of the Linux-kernel specific ip which is now being used.

Why don't you just use XORP's FEA code? It already does all this under a BSD-type license.

cheers
BMS
Re: reading routing table
Debarshi Ray wrote:
> ...
> I was going through the FreeBSD and NetBSD documentation and the FreeBSD sources of netstat and route. I was surprised to see that while NetBSD's route implementation has a 'show' command, FreeBSD does not offer any such thing. Moreover it seems that one can not read the entire routing table using the PF_ROUTE sockets and RTM_GET returns information pertaining to only one destination. This surprised me because one can do such a thing with the Linux kernel's RTNETLINK.
> Is there a reason why this is so? Or is reading from /dev/kmem the only way to get a dump of the routing tables?

You want 'netstat -rn' to dump them. This is a very common command which should be present in a number of online resources on using and administering FreeBSD, so I am somewhat surprised that you didn't find it.

P.S. Look in the sysctl tree if you need to snapshot the kernel IP forwarding tables. You can use kmem, but it is generally frowned upon unless you're working from core dumps -- kernels can be built without kmem support, or kmem locked down, etc.

cheers
BMS
Re: reading routing table
Debarshi Ray wrote:
>> Why don't you just use XORP's FEA code? It already does all this under a BSD-type license.
> I was not aware of it. What does it do? Is it portable across other OSes or is it *BSD specific?

XORP's FEA process is responsible for talking to the underlying forwarding plane. It supports *BSD, Linux, MacOS X, and Microsoft Windows.

Over the last year there was a refactoring where the forwarding table management got split into plugin-like modules. It is written in C++ although it's likely this split might make integration into other projects easier. Normally that support all goes into a single process, rather than being linked into many.

cheers
BMS
Re: how to read dynamic data structures from the kernel (was Re: reading routing table)
Luigi Rizzo wrote:
> do you know if any of the *BSD kernels implements some good mechanism to access a dynamic kernel data structure (e.g. the routing tree/trie, or even a list or hash table) without the flaws of the two approaches i indicate above ?

Hahaha. I ran into an isomorphic problem with Net-SNMP at work last week.

There's a need to export the BGP routing table via SNMP. Of course doing this in our framework at work requires some IPC calls which always require a select() (or WaitForMultipleObjects()) based continuation.

Net-SNMP doesn't support continuations at the table iterator level, so somehow, we need to implement an iterator which can accommodate our blocking IPC mechanism.

[No, we don't use threads, and that would actually create more problems than it solves -- running single-threaded with continuations lets us run lock free, and we rely on the OS's IPC primitives to serialize our code. works just fine for us so far...]

So we would end up caching the whole primary key range in the SNMP sub-agent on a table OID access, a technique which would allow us to defer the IPC calls providing we walk the entire range of the iterator and cache the keys -- but even THAT is far too much data for the BGP table, which is a trie with ~250,000 entries. I hate SNMP GETNEXT.

Back to the FreeBSD kernel, though. If you look at in_mcast.c, particularly in p4 bms_netdev, this is what happens for the per-socket multicast source filters -- there is the linearization of an RB-tree for setsourcefilter(). This is fine for something with a limit of ~256 entries per socket (why RB for something so small? this is for space vs time -- and also it has to merge into a larger filter list in the IGMPv3 paths.) And the lock granularity is per-socket.

However it doesn't do for something as big as a BGP routing table. C++ lends itself well to expressing these kinds of smart-pointer idioms, though.

I'm thinking perhaps we need the notion of a sysctl iterator, which allocates a token for walking a shared data structure, and is able to guarantee that the token maps to a valid pointer for the same entry, until its 'advance pointer' operation is called.

Question is, who's going to pull the trigger?

cheers
BMS

P.S. I'm REALLY getting fed up with the lack of openness and transparency largely incumbent in doing work in p4. Come one come all -- we shouldn't need accounts for folk to see and contribute what's going on, and the stagnation is getting silly. FreeBSD development should not be a committer or chum-of-committer in-crowd.
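The "token that stays valid until advanced" idea from the message above could look roughly like the following userland sketch. Everything here is hypothetical (`struct table`, `struct token`, `token_begin()`, `token_advance()`); it models one possible design, pinning the current entry with a counter and deferring the unlink of removed entries, and it leaves out the locking a kernel version would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct entry {
    struct entry *next;
    int key;
    int pins;   /* tokens currently parked on this entry */
    int dead;   /* logically removed; unlink deferred while pinned */
};

struct table {
    struct entry *head;
};

struct token {
    struct table *tbl;
    struct entry *cur;  /* pinned entry, NULL once the walk is done */
};

/* Prepend a new entry; aborts on allocation failure to keep the sketch short. */
void
table_insert(struct table *t, int key)
{
    struct entry *e = calloc(1, sizeof(*e));

    if (e == NULL)
        abort();
    e->key = key;
    e->next = t->head;
    t->head = e;
}

/* Physically unlink and free an entry that no token pins any more. */
static void
table_unlink(struct table *t, struct entry *e)
{
    struct entry **pp;

    for (pp = &t->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            free(e);
            return;
        }
    }
}

/* Remove by key: immediate if unpinned, deferred if a token holds it. */
void
table_remove(struct table *t, int key)
{
    struct entry *e;

    for (e = t->head; e != NULL; e = e->next) {
        if (e->key == key && !e->dead) {
            e->dead = 1;
            if (e->pins == 0)
                table_unlink(t, e);
            return;
        }
    }
}

/* Skip over entries that were removed but are still linked. */
static struct entry *
first_live(struct entry *e)
{
    while (e != NULL && e->dead)
        e = e->next;
    return (e);
}

/* Start a walk: pin the first live entry, if there is one. */
struct token
token_begin(struct table *t)
{
    struct token tok = { t, first_live(t->head) };

    if (tok.cur != NULL)
        tok.cur->pins++;
    return (tok);
}

/* Pin the next live entry, then unpin the old one and reclaim it if dead. */
int
token_advance(struct token *tok)
{
    struct entry *prev = tok->cur, *next;

    if (prev == NULL)
        return (0);
    next = first_live(prev->next);
    if (next != NULL)
        next->pins++;
    if (--prev->pins == 0 && prev->dead)
        table_unlink(tok->tbl, prev);
    tok->cur = next;
    return (next != NULL);
}
```

The consumer can block between calls to `token_advance()` without the entry under the token being freed. The cost is that removed entries linger until every token parked on them has moved on, so an abandoned walk would need a separate release operation.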
Problem with IFDATA_DRIVERNAME sysctl
Whenever I call this sysctl, I get an errno of EPROGNOTAVAIL from sysctl():

	name[0] = CTL_NET;
	name[1] = PF_LINK;
	name[2] = NETLINK_GENERIC;
	name[3] = IFMIB_IFDATA;
	name[4] = ifindex;
	name[5] = IFDATA_DRIVERNAME;
	len = IFNAMSIZ;
	if (sysctl(name, 6, dname, &len, NULL, 0) == -1) {
		warnc(EX_OSERR, "cannot obtain driver name for ifname %s",
		ifname);
		return (-1);
	}

The ifindex is valid. "dname" is a pointer to an IFNAMSIZ sized buffer.

This problem is happening on a 7.0-RELEASE system. It looks like the switch..case in that path could be fubar'd by the compiler as there are no break statements for each distinct case label; could this be due to gcc friendly fire?

cheers
BMS
Re: Problem with IFDATA_DRIVERNAME sysctl
Bruce M Simpson wrote:
> It looks like the switch..case in that path could be fubar'd by the compiler as there are not break statements for each distinct case label, could this be due to gcc friendly fire?

Possibly false alarm or PEBKAC, I wasn't checking return values right in some of my code, although we should probably have "break" there anyway.

Patch against RELENG_7_0.

--- if_mib.c.orig 2008-09-10 00:31:25.0 +0100
+++ if_mib.c 2008-09-10 00:32:15.0 +0100
@@ -90,6 +90,7 @@
 	switch(name[1]) {
 	default:
 		return ENOENT;
+		break;
 
 	case IFDATA_GENERAL:
 		bzero(&ifmd, sizeof(ifmd));
@@ -136,6 +137,7 @@
 		error = SYSCTL_IN(req, ifp->if_linkmib, ifp->if_linkmiblen);
 		if (error)
 			return error;
+		break;
 
 	case IFDATA_DRIVERNAME:
 		/* 20 is enough for 64bit ints */
@@ -152,6 +154,7 @@
 		error = EPERM;
 		free(dbuf, M_TEMP);
 		return (error);
+		break;
 	}
 	return 0;
 }
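Of the three added "break" statements, the two that follow a return are unreachable; the one that changes behaviour is the one before the IFDATA_DRIVERNAME label. The effect can be shown with a standalone sketch. The selector names and both handlers below are hypothetical, and the sketch assumes, as the patch context suggests, that the link-specific case previously ran on into the driver-name case.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical selectors; the real ones are defined in <net/if_mib.h>. */
enum { SEL_GENERAL = 1, SEL_LINKSPECIFIC = 2, SEL_DRIVERNAME = 3 };

/* Each case body records itself in *ran as the bit (1 << selector). */
int
handler_fallthrough(int sel, unsigned *ran)
{
    switch (sel) {
    default:
        return (ENOENT);
    case SEL_GENERAL:
        *ran |= 1u << SEL_GENERAL;
        break;
    case SEL_LINKSPECIFIC:
        *ran |= 1u << SEL_LINKSPECIFIC;
        /* No break here, so the driver-name body runs as well. */
        /* FALLTHROUGH */
    case SEL_DRIVERNAME:
        *ran |= 1u << SEL_DRIVERNAME;
        break;
    }
    return (0);
}

/* Same handler with the break added, as in the patch. */
int
handler_fixed(int sel, unsigned *ran)
{
    switch (sel) {
    default:
        return (ENOENT);
    case SEL_GENERAL:
        *ran |= 1u << SEL_GENERAL;
        break;
    case SEL_LINKSPECIFIC:
        *ran |= 1u << SEL_LINKSPECIFIC;
        break;
    case SEL_DRIVERNAME:
        *ran |= 1u << SEL_DRIVERNAME;
        break;
    }
    return (0);
}
```

This is defined C behaviour rather than a compiler problem: control simply continues into the next label when no break or return intervenes.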
Re: reading routing table
Debarshi Ray wrote:
> ...
> By the way, would you want someone to implement 'show' support for FreeBSD's route implementation? I can give it a go now. :-)

For sure, we'd be very happy to see a patch like that.

Many thanks
BMS
Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned by the process.
[EMAIL PROTECTED] wrote:
> Old Synopsis: icmp socket receives icmp replies not owned by the process.
> New Synopsis: [icmp]: icmp socket receives icmp replies not owned by the process.

This PR is bogus because:

ICMP has no concept of datagrams being "owned" by a process. There is no field in the ICMP protocol which differentiates ICMP "sessions" on a per-process basis, and this is because ICMP has no concept of "sessions" -- ICMP messages are directed at IP endpoints.

The networking stack will only selectively dispatch ICMP traffic based on two conditions:
1. ip_proto number (raw sockets may selectively bind to a protocol) and
2. multicast group membership (not applicable in this instance).

> It also shows that both echo requests have different identifiers in the id field which should keep the icmp streams separated.

There is absolutely no requirement for the kernel code to look at the ID field, beyond reporting it to consumers of the SOCK_RAW interface.

This PR can be closed, the submitter should consult the pfSense maintainers.

thanks
BMS
Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned by the process.
The following reply was made to PR kern/127528; it has been noted by GNATS.

From: "Bruce M. Simpson" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: freebsd-net@FreeBSD.org, [EMAIL PROTECTED]
Subject: Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned by the process.
Date: Sun, 21 Sep 2008 23:12:30 +0100

[EMAIL PROTECTED] wrote:
> Old Synopsis: icmp socket receives icmp replies not owned by the process.
> New Synopsis: [icmp]: icmp socket receives icmp replies not owned by the process.

This PR is bogus because:

ICMP has no concept of datagrams being "owned" by a process. There is no field in the ICMP protocol which differentiates ICMP "sessions" on a per-process basis, and this is because ICMP has no concept of "sessions" -- ICMP messages are directed at IP endpoints.

The networking stack will only selectively dispatch ICMP traffic based on two conditions:
1. ip_proto number (raw sockets may selectively bind to a protocol) and
2. multicast group membership (not applicable in this instance).

> It also shows that both echo requests have different identifiers in the id field which should keep the icmp streams separated.

There is absolutely no requirement for the kernel code to look at the ID field, beyond reporting it to consumers of the SOCK_RAW interface.

This PR can be closed, the submitter should consult the pfSense maintainers.

thanks
BMS
Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned by the process.
Chris Buechler wrote:
>> This PR is bogus because: ICMP has no concept of datagrams being "owned" by a process. There is no field in the ICMP protocol which differentiates ICMP "sessions" on a per-process basis, and this is because ICMP has no concept of "sessions" -- ICMP messages are directed at IP endpoints.
> ICMP echo and echo replies do have "sessions" of sorts, at least unique identifying fields - identifier and sequence number.

These fields do exist in ICMP, and as you point out, they are sometimes used to implement session-like behaviour. Many NAT implementations use them in this way.

However there is no way of specifying them in a bind() call -- ICMP can only be received on a raw socket, and raw sockets will not filter these things on behalf of a user process, nor have they ever done to the best of my knowledge. They are not part of the address structures for a raw socket (SOCK_RAW, PF_INET, * or IPPROTO_ICMP).

> This was opened by a pfSense maintainer because it's a change in behavior from 6.x releases where this was never an issue, and is something we feel is a regression.

Robert has replied outlining a few situations where the behaviour might have changed.

Raw sockets do support binding laddr/faddr, there is the possibility this could have changed, however there is no notion of processes "owning" streams of ICMP messages, this has never been part of the ICMP protocol and to think in these terms is misleading.

It sounds to me as though the application is relying on a form of filtering which isn't happening, and the way to track this down is to carefully note what, if anything, changed in the expected behaviour between releases.

For example, does the application bind() to any given host addresses? This is the only form of filtering, apart from multicast SSM, that raw sockets would support, and SSM ain't in the tree [yet].

thanks
BMS
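Since raw sockets do not filter on the identifier, an application that wants only its own echo replies has to check each received datagram itself. A minimal sketch of such a check follows. The function name `icmp_reply_is_ours()` is hypothetical, the offsets follow the IPv4 and ICMP echo header layouts from RFC 791 and RFC 792, and it assumes the buffer begins with the IP header, as datagrams read from a raw IPv4 socket normally do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPPROTO_ICMP_NUM    1   /* IPv4 protocol number for ICMP */
#define ICMP_TYPE_ECHOREPLY 0   /* ICMP type of an echo reply */

/*
 * Return 1 if pkt is an IPv4 datagram carrying an ICMP echo reply whose
 * identifier equals want_id (host byte order), otherwise 0.
 */
int
icmp_reply_is_ours(const uint8_t *pkt, size_t len, uint16_t want_id)
{
    size_t hlen;
    uint16_t id;

    /* Need at least a minimal IPv4 header, and the version must be 4. */
    if (len < 20 || (pkt[0] >> 4) != 4)
        return (0);
    /* Header length is the low nibble, counted in 32-bit words. */
    hlen = (size_t)(pkt[0] & 0x0f) * 4;
    /* The echo header (type, code, checksum, id, seq) is 8 bytes. */
    if (hlen < 20 || len < hlen + 8)
        return (0);
    if (pkt[9] != IPPROTO_ICMP_NUM)
        return (0);
    if (pkt[hlen] != ICMP_TYPE_ECHOREPLY)
        return (0);
    /* The identifier sits at offset 4 of the ICMP header, big-endian. */
    id = (uint16_t)((pkt[hlen + 4] << 8) | pkt[hlen + 5]);
    return (id == want_id);
}
```

A receive loop would call this on every datagram returned by recv() and silently skip those that do not match, which is the usual approach in ping-style tools.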
Re: ACE on FreeBSD?
Hi,

I looked at ACE years and years ago (~1997) when Doug Schmidt was first promoting the ideas behind it.

The whole Reactor/Proactor split pretty much hangs on the event dispatch which your particular OS supports. The key observation is whether your target OS implements events in an edge-triggered or level-triggered way; I am borrowing definitions from electronic engineering here.

You could do a straight port with Proactor, but performance will probably suck, because both FreeBSD (and Linux, I believe) need to emulate POSIX asynchronous I/O operations.

Reactor will generally "fare better" on UNIX derived systems such as FreeBSD and Linux, because its event handling primitives are geared towards the level-triggered facilities provided by select().

In Windows, Winsock events use asynchronous notifications which may be tied to Win32 EVENT objects, and the usual Kernel32.DLL thread primitives are used around this. This makes Proactor more appropriate in that environment.

XORP does some similar stuff to ACE under the hood to support the native socket facilities of both Windows and FreeBSD/Linux. It's hybridized but it behaves more like Reactor because we run in a single thread, and you have to force Winsock's helper thread to run, by preempting you, using some file handle and socket tricks.

I don't currently know about stability of ACE on FreeBSD.

cheers
BMS
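What "level-triggered" means for select() can be checked with a few lines of POSIX C. The helper below is a hypothetical illustration and is not taken from ACE or XORP: it polls a descriptor several times without reading from it and counts how often select() reports it readable.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Poll fd for readability `rounds` times with a zero timeout, never
 * reading from it, and return how many polls reported it readable.
 */
int
count_readable_polls(int fd, int rounds)
{
    int i, hits = 0;

    for (i = 0; i < rounds; i++) {
        fd_set rfds;
        struct timeval tv = { 0, 0 };   /* return immediately */

        /* The set and the timeout are rebuilt before every call. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        if (select(fd + 1, &rfds, NULL, NULL, &tv) > 0 &&
            FD_ISSET(fd, &rfds))
            hits++;
    }
    return (hits);
}
```

With one unread byte sitting in a pipe, every poll reports the read end as readable until the byte is consumed. An edge-triggered facility would report the arrival once. A Reactor built on select() can therefore simply re-dispatch the handler on each loop iteration until the handler drains the descriptor.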
Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...
Hi,

I agree with the intent of the change that IPv4 and IPv6 input queues should have a tunable queue length. However, the change provided is going to make the definition of IFQ_MAXLEN global and dependent upon a variable.

[EMAIL PROTECTED] wrote:
> Hi, It turns out that the last time anyone looked at this constant was before 1994 and it's very likely time to turn it into a kernel tunable. On hosts that have a high rate of packet transmission packets can be dropped at the interface queue because this value is too small. Rather than make a sweeping code change I propose the following change to the macro and updating a couple of places in the IP and IPv6 stacks that were using this macro to set their own global variables.

This isn't appropriate for many uses of ifq's which might be internal to a given driver or subsystem, and which may use IFQ_MAXLEN for convenience, as Ruslan has pointed out. I have code elsewhere which does this.

Can you please do this on a per-protocol stack basis? i.e. give IPv4 and IPv6 their own TUNABLE queue length.

thanks
BMS
Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...
[EMAIL PROTECTED] wrote:
> ...
> I found no occurrences of the above in our code base. I used cscope to search all of src/sys. Are you aware of any occurrences of this?

I have been using IFQ_MAXLEN to size buffer queues internal to some IGMPv3 stuff.

I don't feel comfortable with a change which sizes the queues for both IPv4 and IPv6 stacks, from a variable which is obscured by a macro.

thanks
BMS
Re: lost routes
Giulio Ferro wrote:
> There are no messages in the logs, and no interface has been touched. Anyway, since there are a lot of routes and only one gets deleted I don't think it depends on interface changing (it would delete them all, wouldn't it?)

Normally static routes only get touched if the state of the underlying ifp/ifa changes. There are paths in netinet which will cause routes to be deleted in this situation.

Occasionally the idea of a floating static re-surfaces... look in the PR database with this term for possibly related reports.

cheers
BMS
Re: Initialisation of a networking protocol
Hi Ryan,

Did you initialize the .pr_init member of struct protosw for MPLS?

AFAIK, MPLS does not use an outer IP header, so adding a struct ipprotosw won't work; they are similar structs however.

cheers
BMS
Re: Freeing an mbuf cluster
Yony Yossef wrote:
> Hi All, I'm trying to manually build an mbuf chain with clusters in various sizes. I'm doing it using the MGETHDR and MEXTADD macros, it works fine. Now I'm looking for the simplest way to free an mbuf cluster, since I want to free the clusters separately. This function will be given as a parameter to MEXTADD. Is there a simple command like 'free(buf)' to free an mbuf cluster?

You don't specify if you are trying to add the external storage from a pool you manage, in which case, you're on your own.

m_free() for a cluster or mbuf should just "do the right thing". Since the UMA cleanup there are destructor functions which should free the mbuf or cluster using the right pool. m_freem() works on chains, of course.

cheers
BMS
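For the "pool you manage" case, the general shape is a free callback that is attached together with the storage and invoked when the buffer is released. The userland model below is purely illustrative: `struct xbuf`, `xbuf_extadd()`, `xbuf_free()` and the tiny `struct pool` are hypothetical names, and the two-argument callback is an assumption made for the sketch, since the kernel's external-free callback signature has differed between FreeBSD versions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf that carries external storage. */
struct xbuf {
    void *ext_buf;
    void (*ext_free)(void *buf, void *arg);
    void *ext_arg;
};

/* Rough analogue of attaching storage plus the routine that frees it. */
void
xbuf_extadd(struct xbuf *b, void *buf, void (*freef)(void *, void *),
    void *arg)
{
    b->ext_buf = buf;
    b->ext_free = freef;
    b->ext_arg = arg;
}

/* Rough analogue of freeing the buffer: storage goes back via the callback. */
void
xbuf_free(struct xbuf *b)
{
    if (b->ext_free != NULL)
        b->ext_free(b->ext_buf, b->ext_arg);
    b->ext_buf = NULL;
    b->ext_free = NULL;
}

/* Hypothetical private pool: four fixed slots with in-use flags. */
struct pool {
    unsigned char slots[4][2048];
    int in_use[4];
};

/* Hand out the first free slot, or NULL if the pool is exhausted. */
void *
pool_get(struct pool *p)
{
    int i;

    for (i = 0; i < 4; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            return (p->slots[i]);
        }
    }
    return (NULL);
}

/* Free callback: mark the slot that owns buf as available again. */
void
pool_put(void *buf, void *arg)
{
    struct pool *p = arg;
    int i;

    for (i = 0; i < 4; i++) {
        if ((void *)p->slots[i] == buf)
            p->in_use[i] = 0;
    }
}
```

The point of the callback is that the code releasing the buffer does not need to know which pool the storage came from; it only calls whatever routine was registered when the storage was attached.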
How to support an Ethernet PHY without ID registers?
Hi,

I have been trying to get FreeBSD onto the Freecom FSG3 Storage Gateway. It is an xScale based ARM system. Whilst the npe(4) driver appears to attach, the PHY does not. It is a Realtek RTL8305SB switch chip in dual miibus mode.

Unfortunately the RTL8305SB does not have ID registers. The RTL8305SC does, but it's a totally different chip. We do have a driver in the tree for the RTL8305SC, however these chips are different enough for this to cause problems.

Is there any way I could for example force ukphy(4) to attach? Note: Because there are no ID registers, mii_phy_probe_gen() WILL NOT work. It looks like I'd have to override this by hacking if_npe.c itself.

Can anyone clarify?

cheers
BMS
Re: How to support an Ethernet PHY without ID registers?
Sepherosa Ziehau wrote:
> Are you sure you could read from BMSR? Return invalid value from BMSR is the usual cause of miibus attaching/probing failure. For ID1/ID2 reading, you could just fake some values in npe(4)'s miibus_readreg implementation.

Thanks for the tip (from you and Pyun). I had to spoof the BMSR read to get npe(4) to attach just to begin with. For whatever reason the chip doesn't seem to respond on any of the PHY IDs which the Linux folk are using (5 and 4 for npe0 (-B) and npe1 (-C) respectively).

I noticed the ucLinux folk needed a similar patch to force driver attach under Linux w/the IXP:
http://mailman.uclinux.org/pipermail/uclinux-dev/2005-March/031419.html

The switch pretty much disappears after npe(4) attaches, I don't see any activity lights or link lights at that point. This seems to happen after any mii register access.

If I frob things to allow rlswitch to attach, by using hints and hacking if_npe.c, I can get dumps of the PHY register space, but it's all ones, suggesting that it failed at xScale register level -- that would suggest the PHY IDs are *wrong*, or something else isn't right.

Pyun also suggested trying to manually take the PHYs out of power-down mode. I tried that with a code snippet I sent him, but still no dice. I can't even be sure that the PHYs are being addressed right. At this point I kind of have to go, whoah, wish I had a logic analyzer and grabbers!

I believe the firmware configures the switch chip in a certain VLAN configuration which isn't meant to be disrupted, although Freecom's own SnapGear-based distro apparently does the right thing. I've looked through all of their GPL materials and cannot find the driver for the switch.

I suppose one thing I could try is re-flashing the box with the official Freecom firmware, and using mii-diag to dump out what Linux thinks the registers are.

thanks
BMS
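The "fake some values in miibus_readreg" suggestion amounts to intercepting a few register numbers before the hardware is touched. The sketch below models that in userland. The register numbers are the standard MII ones (BMSR is 1, the two PHY identifier registers are 2 and 3); everything else is an assumption for illustration: `spoofed_readreg()`, `dead_hw_read()` and the FAKE_* values are hypothetical and are not real RTL8305SB identifiers, which the chip does not provide.

```c
#include <assert.h>
#include <stdint.h>

/* Standard MII management register numbers. */
#define MII_BMSR    0x01
#define MII_PHYIDR1 0x02
#define MII_PHYIDR2 0x03

/* Hypothetical placeholder IDs for a PHY that has no ID registers. */
#define FAKE_PHYIDR1 0x1234
#define FAKE_PHYIDR2 0x5678
/* Hypothetical status: 10/100 half and full duplex capable, link up. */
#define FAKE_BMSR    0x7804

/* Signature of the (hypothetical) real hardware accessor. */
typedef uint16_t hw_readreg_func_t(int phy, int reg);

/* Answer the ID and status registers locally, pass the rest through. */
uint16_t
spoofed_readreg(hw_readreg_func_t *hw_read, int phy, int reg)
{
    switch (reg) {
    case MII_PHYIDR1:
        return (FAKE_PHYIDR1);
    case MII_PHYIDR2:
        return (FAKE_PHYIDR2);
    case MII_BMSR:
        return (FAKE_BMSR);
    default:
        return (hw_read(phy, reg));
    }
}

/* Stub hardware that reads back all ones, like the dumps described above. */
uint16_t
dead_hw_read(int phy, int reg)
{
    (void)phy;
    (void)reg;
    return (0xffff);
}
```

In a real driver the same switch would sit at the top of the bus read method; whether a PHY driver then matches depends on which IDs are faked, which this sketch does not attempt to answer.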
Re: Vimage howto
Julian,

Thank you (and Marko) very much for preparing this document. The VIMAGE import has had me at something of an impasse re: the IGMPv3 branch and clearly written documentation is a big help indeed.

Julian Elischer wrote:
> Well not completely, but I've had a number of questions over the last few months about what it is, so, as Marko and I have written the following "how to virtualize your module" document, I've been directing people to it. After another couple of questions I think this could do with wider distribution..

Thank you also for providing it here on the list, as opposed to relying on Perforce alone.

Whilst I understand committers rate p4 for experimental work in the FreeBSD sphere, sadly it is simply not accessible to the not-so-silent majority in the FreeBSD sphere who are not committers, which makes its continued use questionable at best.

regards,
BMS
Re: how to program a driver?
Espartano wrote:
> Actually i know how to program with C language in a basic level but i don't know nothing about hardware or computer organization, what topics i should study for gain knowledges about net-drivers ? or if someone can recommend me books about this topic i will be very thankful.

Try "The Indispensable PC Hardware Book" by Hans-Peter Messmer for a general overview of PC architecture.

cheers
BMS
Re: how to program a driver?
[Resend to list for everyone]

Espartano wrote:
> Actually i know how to program with C language in a basic level but i don't know nothing about hardware or computer organization, what topics i should study for gain knowledges about net-drivers ? or if someone can recommend me books about this topic i will be very thankful.

The seminal work is TCP/IP Illustrated Volume 2 (Gary Wright and W. Richard Stevens, Addison-Wesley). Whilst dated it will give you an overview of how all the parts in the BSD networking stack fit together.

It really needs to be updated, however enough things are in flux right now that summarising all the changes would be difficult until say after FreeBSD 8.0 dust is settled.

For computer architecture, probably best to learn PC architecture these days -- x86 is here to stay, kids, and Netbooks are something of a reactionary response triggered by the One-Laptop-Per-Child (OLPC) project. In my day, I learned 68000 assembly and C on the Amiga.

Hans-Peter Messmer's "The Indispensable PC Hardware Book" is a huge book which cost me about 50 GBP new when I first bought it -- I was working in a reasonably well paid job at the time, but it can be found second hand no doubt around the world.

Cover to cover it will tell you what you need to know about how the PC architecture fits together, but if you need more detail e.g. on stuff like FreeBSD network drivers, again, it's best to refer back to the source code itself.

Hope this helps.

cheers
BMS
Re: Heads up --- Thinking about UDP and tunneling
Hi, I am missing context of what Max's suggestion was, do you have a reference to an old email thread?

Style bugs:
* Needs style(9) and whitespace cleanup.
* C typedefs should be suffixed with _t for consistency with other kernel typedefs.
* Function typedefs are usually named like foo_func_t (see other subsystems).

Have you looked at m_apply()? It already exists for stuff like this i.e. functions which act on an mbuf chain, although it doesn't necessarily expect chain heads. cheers BMS
Re: last call for L2/L3 rewrite code review
Hi, Just skimming this I notice it uses the if_afdata[AF_INET] pointer purely for lltbl purposes; this clashes with the IGMPv3 code drop. Please look in the bms_netdev branch, where I introduce a 'struct ip_ifinfo' to make more general use of that slot. IGMPv3 needs to store per-interface state for AF_INET, so this slot really needs to be shared with other AF_INET stuff. Looks like it needs to be updated for VIMAGE also, hopefully others more familiar with this can help -- I am too busy with non-programming activity as it is to get up to speed on this, although I have at least managed to print Julian's write-up... Other than that, it looks like a much needed improvement and we are all very grateful for your work on this. thanks BMS
Re: Having problems with limited broadcast
Peter Steele wrote: .. Based on the discussion in the link above, it doesn't seem like the problem was entirely resolved by the patches mentioned in this thread. Has anything been done since this discussion took place? Surely there must be a way to get limited broadcast to work under FreeBSD. You will need to go to the pcap layer to send limited broadcasts w/o any IPv4 addresses configured in a BSD stack for now. If you have an IP on the interface, you can just use IP_ONESBCAST. thanks BMS
Re: Having problems with limited broadcast
Peter Steele wrote: ... It's really a matter of time. We didn't anticipate limited broadcast being broken in FreeBSD and we're scrambling to come up with a solution. To be quite frank I haven't done anything with IPv6 before so it would be more research to get up to speed on this option. It seems our best option is scapy, which unfortunately I also haven't used before... It's not broken -- it has always been this way in all BSD derived networking stacks. Limited broadcast addresses just don't contain any information about where the datagram should go, and this is the case in all other implementations. They are similar to multicast addresses in that regard. Linux has a knob SO_BINDTODEVICE which is partly there to work around this problem, however it isn't the ideal semantic fit. The folk who point out that link-local addresses could be used have an interesting suggestion which might work for you. thanks BMS
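To make the point about limited broadcasts concrete, here is a minimal Python sketch. The interface names and prefixes are made up for illustration; nothing here is taken from the thread. It contrasts the directed broadcast address, which is derived from an interface's own prefix, with the limited broadcast address, which is the same value whichever interface is meant:

```python
import ipaddress

# The limited broadcast address is one fixed value for every interface.
LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def directed_broadcast(prefix: str) -> ipaddress.IPv4Address:
    """Return the directed (subnet) broadcast address for an IPv4 prefix."""
    return ipaddress.ip_network(prefix, strict=False).broadcast_address


# Hypothetical interfaces and addresses, for illustration only.
interfaces = {"em0": "192.0.2.10/24", "em1": "198.51.100.7/25"}

for name, prefix in interfaces.items():
    # The directed broadcast differs per interface; the limited one does not.
    print(name, directed_broadcast(prefix), LIMITED_BROADCAST)
```

Because the directed broadcast is tied to a configured prefix, a stack can map it back to an interface; the all-ones address gives it nothing to go on.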
Re: Having problems with limited broadcast
Peter Steele wrote: The folk who point out that link-local addresses could be used, have an interesting suggestion which might work for you. It's definitely interesting, but it is very likely that some of our customers will want to be able to set their own IP ranges and not be limited to 169.254/16. So we need a more generic solution. Sounds like it's bpf/pcap city for you guys. A similar bump-in-the-stack to SO_BINDTODEVICE, e.g. let's call it IP_SENDIF, has been on the drawing board, but it needs appropriate security screening -- the ability to bypass the forwarding tables, whilst specifying an interface e.g. by index or name, would be desirable only for certain privileged processes. BTW: If you guys are already looking at scapy, you may also wish to give pcs.sourceforge.net a look as an alternative. It is a Python project which I did some hacking on with George Neville-Neil, who started it. It has BPF/PCAP support out of the box and has a number of powerful features, including a packet-level expect() facility, which works in a very similar manner to pexpect (Python expect for text streams). I added a scapy-like concatenation syntax ('/' operator) to it as that makes plugging packet chains together that much easier. I have the beginnings of an IGMPv3 test suite in my home repo written using PCS; it uses pcap capture. I imagine a DHCP-like protocol could easily be implemented using PCS too. cheers BMS
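For readers unfamiliar with the '/' concatenation style mentioned above, the following toy Python sketch shows how such an operator can be built with `__truediv__`. The `Layer` and `Chain` classes are hypothetical and are not the PCS or scapy API; they only illustrate the idea of plugging layers into a chain:

```python
class Chain:
    """An ordered list of layers (toy class, not the PCS or scapy API)."""

    def __init__(self, layers):
        self.layers = list(layers)

    def __truediv__(self, other):
        # chain / layer or chain / chain -> a longer chain
        extra = other.layers if isinstance(other, Chain) else [other]
        return Chain(self.layers + extra)

    def names(self):
        return [layer.name for layer in self.layers]


class Layer:
    """A toy protocol layer with a name and arbitrary fields."""

    def __init__(self, name, **fields):
        self.name = name
        self.fields = fields

    def __truediv__(self, other):
        # layer / layer -> a chain starting with this layer
        return Chain([self]) / other


packet = Layer("ethernet") / Layer("ipv4", dst="224.0.0.22") / Layer("igmpv3")
print(packet.names())
```

The real libraries do considerably more (field encoding, checksums, and so on); this only shows the operator plumbing.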
Re: Having problems with limited broadcast
Peter Steele wrote: ... I personally like this idea, but I'm not sure I can sell it to the others. Are there any restrictions to these 169.254.x.y addresses? 169.254.0.0/16 must never appear outside a link -- it is strictly scoped to that link. Currently the IPv4 BSD stack has no concept of link-scoped addresses, but IPv6 does. Link is a realized concept there because of KAME's support for the % syntax. Internally, interface indexes get used. In practice this shouldn't be an issue as long as you can guarantee different addresses are used for the 169.254.0.0/16 block on each interface, however, it would mean any app using sockets would need to explicitly bind to the local address to ensure the correct interface is used. Furthermore, we effectively need to be able to support multiple next-hops for the 169.254.0.0/16 prefix, otherwise we can support only one such interface w/o significant kernel code rewrites. So, really, LL may not buy you anything at all, and it's likely you need to go straight to pcap for your app. These restrictions have existed for years, and the fact that they haven't been addressed has largely been because there has been no community strategy to deal with it. I speculate some BSD-using organisations might have already solved these problems, however, without evidence (and code sharing), that's pure speculation. cheers BMS
Re: Having problems with limited broadcast
Bruce M. Simpson wrote: Peter Steele wrote: ... I personally like this idea, but I'm not sure I can sell it to the others. Are there any restrictions to these 169.254.x.y addresses? 169.254.0.0/16 must never appear outside a link -- it is strictly scoped to that link. P.S. I checked in a change to ip_forward() a while back which enforces this, as forwarding such traffic between interfaces without NATting it or otherwise proxying it is a really bad idea (and also breaks the IPv4 LL RFC).
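The forwarding rule described above can be sketched in a few lines of Python. This is an illustration of the rule as stated in the IPv4 link-local RFC (no forwarding when either address is in 169.254.0.0/16), not a transcription of the actual check in ip_forward(); the function name `may_forward` is made up:

```python
import ipaddress

LINK_LOCAL_V4 = ipaddress.ip_network("169.254.0.0/16")


def may_forward(src: str, dst: str) -> bool:
    """Return False if either the source or destination is IPv4 link-local."""
    return not any(
        ipaddress.ip_address(addr) in LINK_LOCAL_V4 for addr in (src, dst)
    )


print(may_forward("192.0.2.1", "198.51.100.2"))   # ordinary unicast traffic
print(may_forward("169.254.10.20", "192.0.2.1"))  # link-local source
```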
Re: howto determine network device unit number? device.hints?
Yony Yossef wrote: Thanks for the explanation. So there's no way to determine this in advance.. I must build a script that contains my own mapping between MAC addresses and the wanted interface names and run it after each driver load, rename the interfaces if necessary. It seems quite wrong, don't you agree? And how come the unit number is given an arbitrary value? Is there a good reason for that? Normally the PCI probe runs in the opposite direction from that of Linux. It's largely to do with how the NEWBUS code walks the PCI bus. From a systems management point of view, yeah, it's irritating, however it would probably take more effort (i.e. kernel code) to try to patch it to work differently, and not everyone has free time to sit down and patch the kernel. That and (unlike Solaris) there is no *direct* mapping between the card's driver number on the bus and its network driver number. In your case I'm not sure why your two cards would flip order. Could it be how your BIOS and hardware set up the PCI IDSEL lines at boot?
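The MAC-to-name mapping script mentioned in the quoted text could look roughly like the Python sketch below. It only builds `ifconfig <interface> name <newname>` command strings; the interface names, MAC addresses, and target names are invented for illustration, and gathering the current MAC addresses (for example by parsing `ifconfig` output) is left out:

```python
def rename_commands(current, wanted):
    """Build ifconfig rename commands.

    current: {interface name: MAC address} as found after the driver loads.
    wanted:  {MAC address: desired interface name}.
    """
    commands = []
    for ifname, mac in sorted(current.items()):
        target = wanted.get(mac.lower())
        if target and target != ifname:
            commands.append(f"ifconfig {ifname} name {target}")
    return commands


# Hypothetical data: the two cards came up in the "wrong" order.
current = {"mtnic0": "00:11:22:33:44:66", "mtnic1": "00:11:22:33:44:55"}
wanted = {"00:11:22:33:44:55": "lan0", "00:11:22:33:44:66": "wan0"}

for command in rename_commands(current, wanted):
    print(command)
```

Using target names outside the driver's own namespace (lan0, wan0 here) sidesteps the collision that would occur when trying to swap mtnic0 and mtnic1 directly.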
Re: howto determine network device unit number? device.hints?
Yony, Bruce M. Simpson wrote: And how come the unit number is given an arbitrary value? Is there a good reason for that? ... In your case I'm not sure why your two cards would flip order. Could it be how your BIOS and hardware set up the PCI IDSEL lines at boot? If this is the case on your system, then you really need to provide more data about your hardware, i.e. motherboard, BIOS, vendor information etc. as others point out. Based on the data you've provided about the issue to date, my best guess is that something in the above is different on your system (which is why I mentioned IDSEL lines -- the mechanism PCI uses to actually assign bus numbers electrically). Normally the behaviour of FreeBSD's bus probes is well known -- nexus is walked for child buses, then these buses are plumbed into NEWBUS, e.g. cpu0...cpuN on nexus itself, PCI buses, and PCI subordinate buses in that order. * You mention you don't encounter the issue with Linux, but you may already be aware that udev can tie driver instance number(s) to specific MAC addresses, although this process isn't fully automatic and any given distro may or may not create the persistent udev rules on a first run -- so this is comparing apples with oranges. * [PCI-Express is a special case though, and I've had to sit down and do some work with commercial clients to make sure their appliance was able to detect devices being in particular slot numbers. Again, though, it's just as subject to the PCI enumeration order further up on the bus hierarchy as non-PCI-Express drivers.] So your issue may not be a simple matter of "this seems wrong, this doesn't work", though I am sorry to hear it isn't working for you right now. 
There are a lot of dynamic factors in the overall picture of the system, and what seems to work as expected for many users may not be working for you. When folk see things like this happening, we really need basic hardware information for any volunteer(s) out there to come up with the right solution, let alone the true picture of what's actually going on in your specific case. thanks BMS
Re: howto determine network device unit number? device.hints?
Eygene Ryabinkin wrote: ... I wanted to stress only one point: simple 'kldunload ' and 'kldload ' makes devices flip in Yony's case. This means that unless some PCI hotplug stuff is here (which I don't believe to be present, because no physical cards are touched and there is actually a small amount of PCI hotplug support in FreeBSD), no physical PCI devices get added or removed from the PCI child tree. It looks like something goes wrong during the PCI tree reprobe on the driver module loading. BTW: Thanks for looking further at the software layer first. VIM is a wee bit easier to use than a bus analyzer. Most motherboards don't support PCI geographical addressing, so... I wager it's the network driver code which may be the source of the problem, based on your analysis! If this code is just doing a blind bump of an instance count and using that as a "unit number"... well, that's OK and expected for software virtual devices, but is counter-intuitive for something like hardware. But I don't have any mtnic source, so this is pure speculation on my part. Correct me if I am wrong, but pci_driver_added from /sys/pci/pci.c will invoke device_get_children() to get the list of the attached devices, and for PCI case the list should be static. Yup, that's right. I guess that when Yony will enable verbose boot and will show us kernel messages from two successive kldunload/kldload sequences, we will get some additional information about what's going on. Hopefully he will chime in... [bms does some google searching *before* he thinks about throwing his toys out of the pram at the Original.Poster.] ding :-) [a light bulb above bms' head] So... Yony, you're writing a driver. Maybe there's a bug in it? That's cool, dude. Hope it's a nice card and you plan on sharing the sweets with the rest of the class. 
;-) But seriously, please mention that you are writing a driver in general questions you might ask about the whole system, otherwise, FreeBSD volunteers will run around going "Is core code broken?" and that's not so good for community stress levels as a whole. with lemonade, BMS
Re: IGMP+WiFi panic on recent kernel - in igmp_fasttimo()
Sam, Sam Leffler wrote: This patch avoids the crash. Not sure how ifma_protospec is supposed to be handled so I'm not committing it. Thanks for this. I have a test machine ready to be prepped but it's missing a CF card (I have none) so need to pick one up from a friend. I have a pci-cardbus adapter + a ral(4) CardBus card, but no CardBus ath(4) -- I imagine this ain't specific to ath(4) so that should be fine. I'll try to look at this Sun/Mon, I have a -CURRENT image built for the 1U box now that just needs bootstrapping (it has a CF slot). thanks, BMS
Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work
Matthias Apitz wrote: I went today evening with my EeePC and CURRENT on USB key to that Greek restaurant; DHCP does not get IP in CURRENT either; this is somehow good news, isn't it :-) This may be orthogonal, but: A lab colleague and I have been seeing a sporadic problem where the ath0 exhibits the symptoms of being disassociated from its AP. We are running RELENG_7 on the EeePC 701 since the open source HAL merge. In the behaviour we're seeing, we don't see any problem with the initial dhclient run, the ath0 just seems to get disassociated within 5-10 minutes of associating. If we leave 'ping ' running in the background, we don't see this problem. We have yet to produce a tcpdump to catch it 'in the act' and observe the DLT_IEEE80211 traffic when it actually happens, I have only seen the symptoms. The AP does not show the EeePC units as being associated any more at this point, but ath0 still shows 'status: associated'. The AP involved is a Netgear WG602 V2, and is running the vendor's firmware. I'll try to get set up with 'tcpdump -y ieee802_11' from initial boot (including dhcp and anything we bump into). cheers BMS
Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work
The following reply was made to PR kern/132722; it has been noted by GNATS.

From: Bruce M Simpson
To: Matthias Apitz
Cc: bug-follo...@freebsd.org, Sam Leffler , freebsd-net@freebsd.org, "Sean C. Farley"
Subject: Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work
Date: Mon, 23 Mar 2009 18:44:42 +

Matthias Apitz wrote:
> I went today evening with my EeePC and CURRENT on USB key
> to that Greek restaurant; DHCP does not get IP in CURRENT either;
> this is somehow good news, isn't it :-)

This may be orthogonal, but: A lab colleague and I have been seeing a sporadic problem where the ath0 exhibits the symptoms of being disassociated from its AP. We are running RELENG_7 on the EeePC 701 since the open source HAL merge. In the behaviour we're seeing, we don't see any problem with the initial dhclient run, the ath0 just seems to get disassociated within 5-10 minutes of associating. If we leave 'ping ' running in the background, we don't see this problem. We have yet to produce a tcpdump to catch it 'in the act' and observe the DLT_IEEE80211 traffic when it actually happens, I have only seen the symptoms. The AP does not show the EeePC units as being associated any more at this point, but ath0 still shows 'status: associated'. The AP involved is a Netgear WG602 V2, and is running the vendor's firmware. I'll try to get set up with 'tcpdump -y ieee802_11' from initial boot (including dhcp and anything we bump into). cheers BMS
ath0 apparent silent disassociation
[Repost without attachment] OK. We've managed to reproduce this set of symptoms now in our work area. [If anyone needs to see a pcap, please Cc: me offlist.] Timebase: beginning of the pcap is in sync with a bringup from single-user mode; the tcpdump runs in the background from init whilst the system is brought up. OK, so I timed the apparent loss of connectivity as 6m 30s from that point I hit the stopwatch, to when I hit it again when the AP's Web GUI no longer shows the STA affected as being associated. Obviously such a timing is subject to human/visual jitter, and how often Netgear's firmware pulls the STA association list from the AP into the web GUI. What stands out in the pcap is that 302.291s in (almost 5m exactly), the STA (ath0) sends an IEEE 802.11 NULL frame to the AP with the PWR MGT bit set (I'm going to sleep!). This more or less coincides with a normal beacon from the Netgear AP. It does not advertise Auto Power Save Delivery (apsd), that bit is 0. This is puzzling as we don't enable power management by default. As I understand it, this may be an AP feature in some environments... I can try reproducing this with an explicit 'ifconfig ath0 -powersave' and see if it reoccurs. You'll see that after this NULL frame is sent, there is another Probe Request, and the Netgear AP does Probe Respond, but this makes no difference (I ended the capture around 150s after the NULL frame was sent). At this point we can't send traffic from the ath0, or rather, the AP is acting as though it never even heard the STA. The STA learns the AP's IP address/MAC mapping through passive ARP -- we still see broadcasts on the SSID -- but the AP has started to totally ignore the STA, and seemed to have ignored its ARP requests also. We are using MAC address ACL control with this AP, and the ath0 affected is definitely listed in its ACL table, configured up, rebooted etc. 
It is as though the STA is entering power saving mode when not explicitly told to, and the AP is not waking up the STA as it should. If any more information is needed, or you know where to look, please let me know what's involved (I MFCed the change after all, so I'll help where I can until I'm on holiday this week...) My lab colleague is just working around this with 'ping ' for now, that keeps things up, as does OpenVPN... cheers BMS
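When scanning a capture for the kind of frame described above, the relevant bits all sit in the two-byte IEEE 802.11 Frame Control field. The Python sketch below decodes that field from the start of an 802.11 header (after any radiotap or similar capture header has been stripped, which is assumed here) and flags NULL data frames with the power management bit set. The function names are made up for illustration:

```python
def parse_frame_control(header: bytes) -> dict:
    """Decode type, subtype and the power management bit from an 802.11 header."""
    fc = int.from_bytes(header[:2], "little")  # Frame Control is little-endian
    return {
        "type": (fc >> 2) & 0x3,          # 0 = management, 1 = control, 2 = data
        "subtype": (fc >> 4) & 0xF,       # 4 = Null (no data) when type is data
        "power_mgmt": bool(fc & 0x1000),  # bit 12: station is entering power save
    }


def is_null_with_power_save(header: bytes) -> bool:
    """True for a data/Null frame that has the power management bit set."""
    info = parse_frame_control(header)
    return info["type"] == 2 and info["subtype"] == 4 and info["power_mgmt"]


# Example header bytes: data/Null frame, To DS and Pwr Mgmt flags set.
print(is_null_with_power_save(bytes([0x48, 0x11])))
```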
Re: kern/124282: [libc] socket(2): INP_PORTHIGH and INP_ONESBCAST share same value
bru...@freebsd.org wrote: Synopsis: [libc] socket(2): INP_PORTHIGH and INP_ONESBCAST share same value Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: brucec Responsible-Changed-When: Mon Mar 23 21:45:54 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). rwatson@ saw this crop up in -CURRENT and I believe he has a fix. Not sure about MFC but it clearly needs to get fixed... cheers, BMS
Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work
John Hay wrote: I found doing a -bgscan before it happens, make it not happen. I now have -bgscan in my rc.conf. That's exactly the workaround I needed. Thanks John. As Sam points out, the root fix is probably already in HEAD; it would be nice to find time to backport, but this works for us for now as a workaround (we are just using ath0 as a STA for testing in the lab at the moment, it is likely we will use hostap later). cheers, BMS
Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work
The following reply was made to PR kern/132722; it has been noted by GNATS.

From: Bruce M Simpson
To: John Hay
Cc: Matthias Apitz , freebsd-net@freebsd.org, Sam Leffler , "Sean C. Farley" , bug-follo...@freebsd.org
Subject: Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work
Date: Tue, 24 Mar 2009 01:08:33 +

John Hay wrote:
> I found doing a -bgscan before it happens, make it not happen. I now
> have -bgscan in my rc.conf.

That's exactly the workaround I needed. Thanks John. As Sam points out, the root fix is probably already in HEAD; it would be nice to find time to backport, but this works for us for now as a workaround (we are just using ath0 as a STA for testing in the lab at the moment, it is likely we will use hostap later). cheers, BMS
Re: CARP as a module; followup thoughts
Hi, Will Andrews wrote: Hello, I've written a patch (against 8.0-CURRENT as of r191369) which makes it possible to build, load, run, & unload CARP as a module, using the GENERIC kernel. It can be obtained from: http://firepipe.net/patches/carp-as-module-20090421.diff There's no need to implement the in*_proto_register() stuff in that patch, you should just be able to re-use the encap_attach_func() functions. Look at how PIM is implemented in ip_mroute.c for an example. Other than that it looks like a good start... but I would hold off on committing as-is. The more general case of registering a MAC address on an interface should be considered. cheers, BMS
Re: intel 802.11 2200BG routing
Da Rock wrote: So I could use some guidance as to what I can do to rectify this problem. I have 2 goals: 1. setup iwi to start on boot, and attach to my ap whenever it's in range. 2. make sure iwi stays connected without manually monitoring it. 3. prioritise my routes via the rl0 and iwi if's so that cable is used over wifi, but both can be used to access the network. Umm, that's 3 goals. :^) The short answer is, you can't do what you're trying to do, yet. You can cut over without rebooting, you just need to remember to kill off all dhclient processes and manually remove the default route, as in FreeBSD all forwarding entries ('routes') reference an interface pointer, and the PRC_IFDOWN handler will not touch routes marked RTF_STATIC. No one as far as I know has rolled a 'cutover' script. What would be really useful is a port which can do this cutover in a more general way until the stack is changed. This isn't that different from say Microsoft Windows where a manual cutover is needed, although the OS having a multipath FIB ('routing table') helps. The long answer is, it's possible, and it requires some things in the network stack to be carefully reworked. I have looked at these issues in some depth; there are at least 3 items on the Network Stack Wiki which are directly relevant to making the kind of clean cut-over between wireless/wired interfaces possible. Notably looking at the PRC_IFDOWN handler in netinet, making forwarding entry lookup skip interfaces marked down, and introducing route preference into the routing trie. There are historical reasons why the code is the way it is. It will take a while to get these issues addressed going forward. Regards, BMS P.S. routed isn't going to help you at all in this situation, it's just an implementation of the RIPv2 routing protocol; it may have helped as the routes it introduces to the kernel are !RTF_STATIC. One thing I haven't tried is IPv4 Router Discovery (rdisc), that may help update the default route quickly. 
The problem with this of course is the additional network configuration in the infrastructure itself.
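Two of the stack changes listed above (skipping forwarding entries whose interface is down, and adding a route preference) can be sketched in userland Python. This is only a model of the selection rule, not kernel code; the `Route` class, its `preference` field, and the addresses are hypothetical:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Route:
    ifname: str
    gateway: str
    preference: int  # hypothetical field: lower value is preferred
    if_up: bool      # whether the outgoing interface is currently up


def pick_default(routes: Iterable[Route]) -> Optional[Route]:
    """Choose the most preferred default route whose interface is up."""
    usable = [route for route in routes if route.if_up]
    return min(usable, key=lambda route: route.preference, default=None)


# Wired rl0 preferred over wireless iwi0; both point at made-up gateways.
routes = [
    Route("rl0", "192.0.2.1", preference=10, if_up=True),
    Route("iwi0", "198.51.100.1", preference=20, if_up=True),
]
print(pick_default(routes).ifname)

routes[0].if_up = False  # cable unplugged: fall back to wireless
print(pick_default(routes).ifname)
```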
Re: IPv6 Router Alert breaks forwarding
I can only speak about IPv4 router alert in detail; we do nothing with IPv4 RA nor would it appear that it would make any real difference in performance given how the code is laid out. RSVP packets should be passed verbatim to userland from ip_input() via rip_input() there. I think your IPv6 fix is good for now but will wait to hear further from [EMAIL PROTECTED] I am heading out the door so if someone could add an item for this to http://wiki.FreeBSD.org/Networking I should be most grateful. Regards, BMS
Source-specific multicast
I am very close to merging support for RFC 3678 to -CURRENT. I will make a patch available before I commit. The only userland consumer in the tree which is likely to be affected by the removal of ip_multicast_if() from the kernel is routed, which I will update to use the new setsourcefilter() API. The SSM code does change some of the coupling between sockets and IGMP, and changes some logic in udp_input; strict multicast membership becomes the default. For systems which deal with many multicast sockets and traffic, they may benefit from an additional hash table. I haven't finished touching the raw IP input path. Given current looming commitments I'm open to someone volunteering to finish the work of merging IGMPv3 and MLDv2, or possibly to fund the work. I wish to get at least the socket part of ASM/SSM merged before I come back to Yar's PR with vlan and pfsync, which I have not had reason to investigate thoroughly; I have had no further reports of problems with carp(4) in -CURRENT. regards, BMS
Re: A radical restructuring of IPsec...
I'm all for this in principle. I believe that the case for FAST_IPSEC over KAME IPSEC is fairly clear for those of us who have read the USENIX paper. Qualitatively speaking I can say FAST_IPSEC has been more pleasant to work with when introducing the TCP-MD5 support. I will try to look at the patch in more detail as time permits. Regards, BMS
Re: Spillover routing?
Rajkumar S wrote: Hi, I have a low cost 128kbps and a high cost 512 kbps link to internet. Is it possible to do a "spillover" routing so that the high cost link is used only when the low cost link is, say, used more than 80%. This feature is almost certainly not going to be present in the base system. What you would need to do to implement this is to configure a part of the kernel to perform bandwidth measurements and make an upcall to bring up the other link in a dial-on-demand style configuration. Add NAT into the mix and it gets even more interesting. I believe pf+altq may have the potential to do this however I could not help you with where to begin re configuring it to do so, so I wish you best of luck in your research. Regards, BMS
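The decision at the heart of the spillover question is a simple utilisation threshold. The Python sketch below shows only that decision, using the 128 kbps capacity and 80% figure from the question; how the rate is measured and how the second link is brought up are not covered, and the function name is made up:

```python
def choose_link(low_cost_rate_kbps: float,
                low_cost_capacity_kbps: float = 128.0,
                threshold: float = 0.8) -> str:
    """Pick a link based on measured utilisation of the low-cost link."""
    utilisation = low_cost_rate_kbps / low_cost_capacity_kbps
    # Spill over only once the cheap link is used beyond the threshold.
    return "high-cost" if utilisation > threshold else "low-cost"


# With a 128 kbps link and an 80% threshold, spillover starts above 102.4 kbps.
print(choose_link(64.0))
print(choose_link(110.0))
```

A real implementation would also want some hysteresis so the choice does not flap when the measured rate hovers around the threshold.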
Call for testers: olsrd and IP_ONESBCAST
Hi, For a while now I have had a patch available to teach olsrd to use IP_ONESBCAST instead of using libnet/bpf just to send broadcast datagrams in FreeBSD, which has had IP_ONESBCAST for a few years now. If anyone is using olsrd on FreeBSD I would greatly appreciate testing and feedback for this patch: http://people.freebsd.org/~bms/dump/olsrd-onesbcast.diff Thanks! BMS
Re: Interface index hack in IP_ADD_MEMBERSHIP
Yar Tikhiy wrote: Quagga still uses it, too, if its configure script detects FreeBSD or NetBSD. I'm afraid it was me who submitted the patch to the Quagga folks when I'd found that Quagga's ospfd couldn't handle unnumbered P2P interfaces in FreeBSD because their local IPs weren't unique. Unfortunately, Quagga doesn't seem to use the protocol independent part of the RFC 3678 API yet. A preliminary patch for the Rhyolite.com routed is available at: http://people.freebsd.org/~bms/dump/routed.rfc3678.diff The upcoming rewrite of IPv4 multicast host-mode logic (currently in bms_netdev) adds support for the Linux-derived 'struct ip_mreqn' for specifying interface indexes to IP_MULTICAST_IF. The RFC 3678 API is implemented; IGMPv3 and MLDv2 may be hooked in later on subject to available resources. The RFC 1724 hack has been completely removed from the kernel in this spin. The new code passes the existing regression tests for any-source multicast. I hope to have source-specific multicast regression tests in the main tree ASAP, I am very close to a code drop. Whilst the radical approach of rewriting this stuff may break legacy applications, they should probably be updated to support the new APIs anyway, given that Linux 2.6 and Microsoft Windows "Longhorn" both support RFC 3678. Regards, BMS
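As a rough illustration of the 'struct ip_mreqn' layout referred to above (multicast group address, local address, interface index), here is a Python sketch that packs such a structure with the `struct` module. The layout follows the Linux definition; whether a given FreeBSD build accepts it for IP_MULTICAST_IF depends on the code described in this message, so the commented-out `setsockopt` call is an assumption, as is the helper name `pack_ip_mreqn`:

```python
import socket
import struct


def pack_ip_mreqn(group: str, ifindex: int, local: str = "0.0.0.0") -> bytes:
    """Pack group address, local address and interface index as struct ip_mreqn."""
    return struct.pack(
        "4s4si",                  # two in_addr fields followed by a native int
        socket.inet_aton(group),  # imr_multiaddr
        socket.inet_aton(local),  # imr_address
        ifindex,                  # imr_ifindex
    )


mreqn = pack_ip_mreqn("224.0.0.9", 2)  # RIPv2 group; index 2 is an arbitrary example
print(len(mreqn))

# Possible use (assumes the platform accepts ip_mreqn for this option):
# sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, mreqn)
```

In practice the index would come from `socket.if_nametoindex()` rather than a literal.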
[CODE DROP] KDE support for Avahi service browsing on FreeBSD
Hi, As part of my ongoing work to support Zero-configuration networking in FreeBSD, please check out the following patches. I have been able to browse and connect to services from within the KDE environment using these. At the moment, the basic kernel support for Zeroconf/Bonjour is in place in FreeBSD. The next challenge there is to support address scope and preference for IPv4. http://people.freebsd.org/~bms/dump/nss_mdns.diff is a patch for the FreeBSD port of nss_mdns which I may commit shortly as it fixes a dynamic symbol issue found by Pat Lashley. nss_mdns must be installed and configured in FreeBSD's /etc/nsswitch.conf file before proceeding. http://people.freebsd.org/~bms/dump/avahi-qt3.diff is a patch for the FreeBSD avahi port to build and install the QT3 bindings for Avahi. After applying this patch and reinstalling the avahi port, please manually change the 'prefix' line in ${LOCALBASE}/libdata/pkgconfig/avahi-qt3.pc to point to ${X11BASE}, e.g. "prefix=/usr/local" would become "prefix=/usr/X11R6". This is to allow kdnssd-avahi's configure script to find QT's Meta-object compiler (moc) which the FreeBSD ports system installs under ${X11BASE} by default. [Help from a ports committer to convert this patch into a 'slave port' would be very appreciated.] http://people.freebsd.org/~bms/dump/kdnssd_avahi.tar is a port for kdnssd-avahi. Installing this port will overwrite the default libkdnssd.so.1 library which is installed by the kdelibs port. After applying both of these changes to your system, you must completely restart KDE for them to take effect. Please read the pkg-message file in kdnssd_avahi.tar for step-by-step information on how to test the Avahi support for KDE in FreeBSD. I would greatly welcome your further testing and feedback. 
I apologise in advance for the unpolished nature of this work, however, integration of Avahi with KDE is an ongoing challenge for many other open source projects, and I would hope that the loose ends on FreeBSD become tied together in the near future. Thanks again, BMS
Re: [PATCH] kern/681110 re-roll of RFC3522 (Eifel detection) patchset
First of all, Xin: Many thanks for your excellent work on bringing the code up to date.

Mike Silbersack wrote:
> No. That is not going into FreeBSD if I can help it. http://www.ietf.org/ietf/IPR/ERICSSON-EIFEL On top of that, we don't need yet another complication to the already too-complex retransmission code.

I wasn't aware of Ericsson's submission on this basis. Whilst FreeBSD's license is recognised by the OSI, the implications of having code in the kernel which is covered by an Ericsson patent are quite grim if anyone wishes to use FreeBSD for commercial purposes. I therefore agree with you that this change should not go in, and have removed it from the Wiki.

Kind regards,
BMS
Re: ipv6 multicast refcnt panic
Andrew Thompson wrote: I have come across this panic which appears to be from incorrect refcounting on the inet6 multicast code. I'm assuming this is in -CURRENT, as the refcount code has not yet been MFCed. ... in6m_refcount is still 1 so the in6_multi is not freed. I'll try to investigate further as time permits. Thanks for pointing this out, I suspect the same problem affects vlan and other nested cloners. Regards, BMS
Re: altq unfortunately queuing vlan traffic.
I can't speak for ALTQ at the moment; however, I believe dummynet may work on vlan devices. I was careful not to break this when rewriting ether_input() in -CURRENT, as ip_dn_check_rule() is always called any time ether_demux() is entered (regardless of whether ether_input() has been re-entered due to the presence of M_PROMISC on a given mbuf chain).

Regards,
BMS
Re: ipv6 multicast refcnt panic
Andrew Thompson wrote:
> I have come across this panic which appears to be from incorrect refcounting on the inet6 multicast code.

I can reproduce this panic; however, I don't entirely understand what's going on. When the same IPv6 unicast address is configured twice on the edsc0 interface, the ifmcstat(8) utility reports that the refcnt for three of the IPv6 multicast addresses changed.

I do not understand why the duplicate unicast address isn't rejected, or why groups are being joined twice for the same address. I strongly suspect this is a KAME bug of the same kind that existed in netinet (whereby the 224.0.0.1 address was being joined more than once per ifnet), which the refcounting change has exposed as a panic; a very brief look at the ifaddr code in netinet6 suggests this is the case.

Before second address assignment:

edsc0:
        inet6 f00f::1
                group ff01::1%edsc0 refcnt 1
                        mcast-macaddr 33:33:00:00:00:01 refcnt 1
                group ff02::2:f23c:3567%edsc0 refcnt 1
                        mcast-macaddr 33:33:f2:3c:35:67 refcnt 1
                group ff02::1%edsc0 refcnt 1
                        mcast-macaddr 33:33:00:00:00:01 refcnt 1
                group ff02::1:ff00:1%edsc0 refcnt 1
                        mcast-macaddr 33:33:ff:00:00:01 refcnt 1

After second address assignment:

edsc0:
        inet6 f00f::1
                group ff02::1:ff00:1%edsc0 refcnt 1
                        mcast-macaddr 33:33:ff:00:00:01 refcnt 1
                group ff01::1%edsc0 refcnt 2
                        mcast-macaddr 33:33:00:00:00:01 refcnt 1
                group ff02::2:f23c:3567%edsc0 refcnt 2
                        mcast-macaddr 33:33:f2:3c:35:67 refcnt 1
                group ff02::1%edsc0 refcnt 2
                        mcast-macaddr 33:33:00:00:00:01 refcnt 1

The order of the addresses in the list has flipped around, which makes visual comparison that much more difficult. Flipping those around to the same order as the first sample yields:

edsc0:
        inet6 f00f::1
                group ff01::1%edsc0 refcnt 2
                        mcast-macaddr 33:33:00:00:00:01 refcnt 1
                group ff02::2:f23c:3567%edsc0 refcnt 2
                        mcast-macaddr 33:33:f2:3c:35:67 refcnt 1
                group ff02::1%edsc0 refcnt 2
                        mcast-macaddr 33:33:00:00:00:01 refcnt 1
                group ff02::1:ff00:1%edsc0 refcnt 1
                        mcast-macaddr 33:33:ff:00:00:01 refcnt 1

So we can be sure the addresses themselves haven't changed, but the refcount on three of the four IPv6 multicast entries has gone up by 1. The refcount is no longer proxied to the ifnet-level ifma object since the code was changed.

I don't entirely understand the relationship between the protocol-level multicast addresses and the unicast address in netinet6, or why attempting to configure the same unicast address on the same interface more than once wasn't rejected with an error. As far as I can tell the code is correct for the single address case.

I've attached a patch which makes the netinet6 detach path more like the netinet one, though this isn't going to make a great deal of difference apart from code style; the net code already calls in6_ifdetach() in the right order.

We can weaken the error checking in if_delmulti() to get an operational kernel, but this rather defeats the point of doing the error checking (which is there to expose such problems).

When reporting problems with the networking code it is helpful to use ifmcstat, INVARIANTS, and the DIAGNOSTIC kernel option, as I tend to add code to catch cases like this.
Regards,
BMS

//depot/user/bms/netdev/sys/netinet6/in6_ifattach.c#1 - /home/bms/p4/netdev/sys/netinet6/in6_ifattach.c
--- /tmp/tmp.3746.0	Thu Apr 12 12:32:23 2007
+++ /home/bms/p4/netdev/sys/netinet6/in6_ifattach.c	Thu Apr 12 12:25:55 2007
@@ -76,6 +76,7 @@
 static int get_ifid __P((struct ifnet *, struct ifnet *, struct in6_addr *));
 static int in6_ifattach_linklocal __P((struct ifnet *, struct ifnet *));
 static int in6_ifattach_loopback __P((struct ifnet *));
+static void in6_purgemaddrs __P((struct ifnet *));
 
 #define EUI64_GBIT	0x01
 #define EUI64_UBIT	0x02
@@ -731,8 +732,6 @@
 	struct rtentry *rt;
 	short rtflags;
 	struct sockaddr_in6 sin6;
-	struct in6_multi *in6m;
-	struct in6_multi *in6m_next;
 
 	/* remove neighbor management table */
 	nd6_purge(ifp);
@@ -790,18 +789,10 @@
 		IFAFREE(&oia->ia_ifa);
 	}
 
-	/* leave from all multicast groups joined */
-
 	in6_pcbpurgeif0(&udbinfo, ifp);
 	in6_pcbpurgeif0(&ripcbinfo, ifp);
-
-	for (in6m = LIST_FIRST(&in6_multihead); in6m; in6m = in6m_next) {
-		in6m_next = LIST_NEXT(in6m, in6m_entry);
-		if (in6m->in6m_ifp != ifp)
-			continue;
-		in6_delmulti(in6m);
-		in6m = NULL;
-	}
+	/* leave from all multicast groups joined */
+	in6_purgemaddrs(ifp);
 
 	/*
 	 * remove neighbor management table. we call it twice just to make
@@ -889,4 +880,23 @@
 	}
 
 	splx(s);
+}
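The group and mcast-macaddr pairs in the ifmcstat output quoted in this message follow two standard mappings, which may help when reading such listings: the solicited-node group is built from the low 24 bits of the unicast address (RFC 4291), and the Ethernet group address is 33:33 followed by the low 32 bits of the IPv6 group (RFC 2464). The following is a minimal userland sketch of both; the function names are invented for this illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Map a 16-byte IPv6 multicast address to its Ethernet group address:
 * 33:33 followed by the low-order 32 bits of the address (RFC 2464).
 * Hypothetical helper, not a kernel function.
 */
static void
ip6_mcast_to_ether(const uint8_t ip6[16], uint8_t mac[6])
{
	assert(ip6[0] == 0xff);		/* must be a multicast address */
	mac[0] = 0x33;
	mac[1] = 0x33;
	memcpy(&mac[2], &ip6[12], 4);
}

/*
 * Build the solicited-node multicast address ff02::1:ffXX:XXXX from
 * the low-order 24 bits of a unicast address (RFC 4291).
 * Hypothetical helper, not a kernel function.
 */
static void
ip6_solicited_node(const uint8_t ucast[16], uint8_t mcast[16])
{
	memset(mcast, 0, 16);
	mcast[0] = 0xff;
	mcast[1] = 0x02;
	mcast[11] = 0x01;
	mcast[12] = 0xff;
	memcpy(&mcast[13], &ucast[13], 3);
}
```

Feeding f00f::1 through ip6_solicited_node() and then ip6_mcast_to_ether() gives ff02::1:ff00:1 and 33:33:ff:00:00:01, matching the listing. The same mapping explains why ff01::1 and ff02::1 share the link-layer address 33:33:00:00:00:01: only the low 32 bits survive.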
Re: ipv6 multicast refcnt panic
I speculate that the problem you are seeing in netinet6 is due to it not freeing referenced in6_multi objects when the interface address changes or the same address is re-added, as the same bug was present in netinet. Previous to the introduction of refcounting, FreeBSD would just leak memory.

Further to this: The problem Yar was seeing with vlan and pfsync, which I pointed out, was an older bug which has been progressively shuffled around the stack due to code rewrites. I have a fix for the kernel panic caused by pfsync's member interface being detached, which is now checked into bms_netdev; it should probably go straight into -CURRENT. The fix is cumulative -- pfsync's detach handler is called after netinet has torn down all inet state for an instance of ifnet, therefore it should not be trying to call in_delmulti(); however, it should mark the ifp as no longer valid for pfsync's use.

A suggested architectural fix going forward is to change the semantics of objects owned by the netinet and netinet6 protocol domains, such as multicast group objects, to tear down hardware state when the ifnet instance goes away, yet allow consumers elsewhere in the kernel to retain handles for such objects. This is what the lower-level net code now does for ifmultiaddr objects. if_delmulti_locked() accepts an argument which specifies whether it is being called from if_detach(). If so, hardware state is torn down, and internal structures are freed, but the object *is not* freed if its reference count is not zero, as someone still holds a pointer.

In plainer language: netinet and netinet6 should probably be doing the same thing as net now does. At present this only applies to ifmultiaddr; the same should be done for in_multi and in6_multi.

Of course, it would be easier to do this if per-protocol-domain state in ifnet were e.g. moved to the if_afdata[] array currently defined in ifnet for this purpose, but this is guaranteed to break the ABI. The situation in ifnet as it stands just now strikes me as one of confusion.

Regards,
BMS
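The detach semantics described in this message (tear down hardware state when the interface goes away, but keep the object alive while anyone still holds a reference) can be sketched as a small userland model. Everything here is hypothetical: struct toy_maddr and its functions are invented for illustration and are not the kernel's ifmultiaddr, in_multi or if_delmulti_locked() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a multicast membership object. */
struct toy_maddr {
	int	 refcount;	/* handles held by consumers */
	bool	 hw_programmed;	/* filter entry present in the "hardware" */
	void	*ifp;		/* owning interface, NULL once detached */
};

/* Allocate an object with one reference, attached to ifp. */
static struct toy_maddr *
toy_maddr_alloc(void *ifp)
{
	struct toy_maddr *m = calloc(1, sizeof(*m));

	if (m != NULL) {
		m->refcount = 1;
		m->hw_programmed = true;
		m->ifp = ifp;
	}
	return (m);
}

/* Take an additional reference on behalf of another consumer. */
static void
toy_maddr_hold(struct toy_maddr *m)
{
	m->refcount++;
}

/*
 * The interface is going away: tear down hardware state and sever the
 * back-pointer, but leave the object for any remaining holders.
 */
static void
toy_maddr_detach(struct toy_maddr *m)
{
	m->hw_programmed = false;
	m->ifp = NULL;
}

/* Drop one reference; returns true if the object was freed. */
static bool
toy_maddr_release(struct toy_maddr *m)
{
	assert(m->refcount > 0);
	if (--m->refcount > 0)
		return (false);
	free(m);
	return (true);
}
```

In this model a consumer such as pfsync would check m->ifp for NULL before touching the interface, and would simply call toy_maddr_release() when it is done, regardless of whether the interface still exists.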
Re: [PATCH] Zeroconf: avahi-autoipd support for FreeBSD
Bruce M Simpson wrote: Comments and feedback, particularly more in-depth testing by another contributor, are very welcome. I have tested this on my local 802.11 wireless segment with ath(4). Before this can be committed to ports or pushed upstream, it is missing an rc script. There has been feedback from the Avahi guys. I've updated the BSD-specific patch (which is against our present port), and the code has just been checked into Avahi SVN with some fixes. It would be great if someone could find time to look at integrating this. This way, we get a working autoipd until Fredrik (who will be working on Zeroconf for his Google SoC project) can make progress on a flavour of autoipd which is suitable for the base system. P.S. If anyone out there is working on wide-area DNS-SD, please make yourselves known to us... Regards, BMS ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
[CODE DROP] Source-Specific Multicast for FreeBSD 7: Phase 1
I am proud to announce the first code drop of SSM support for FreeBSD 7.0. From the README file:

%%%
Source-Specific Multicast for FreeBSD 7.0 -- Phase I

This change brings FreeBSD closer to the standard of multicast API support offered by Linux 2.6 and Microsoft Windows "Longhorn". It is mostly of interest to organizations and individuals working with Internet multimedia applications, and IPv4/IPv6 routing, such as ISPs. It represents several weeks of work.

The code is written to accommodate IPv6 and MLDv2 with only a little additional work. A regression test is included under src/tools/regression/netinet/ipmulticast in the code drop.

The code is available in the bms_netdev branch on perforce.freebsd.org, or as a patch against -CURRENT extracted from this branch (with additional files, relative to src) available at: http://people.freebsd.org/~bms/ssm_phase1.tar

The work is based on Wilbert de Graaf's IGMPv3 code drop for FreeBSD 4.6, which is available at: http://www.kloosterhof.com/wilbert/igmpv3.html
%%%

Regards,
BMS
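For readers who have not used source-specific multicast from an application, here is a minimal sketch of a source-specific join using the protocol-independent RFC 3678 calls (struct group_source_req with MCAST_JOIN_SOURCE_GROUP). It assumes the platform headers expose those RFC 3678 definitions; the helper names ssm_fill_v4() and ssm_join_v4() are invented for this example and do not come from the code drop.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* expose RFC 3678 types on glibc */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill an RFC 3678 request for an IPv4 (source, group) pair on the
 * interface with the given index. Returns 0, or -1 on a bad address.
 * Hypothetical helper for illustration only.
 */
static int
ssm_fill_v4(struct group_source_req *gsr, uint32_t ifindex,
    const char *group, const char *source)
{
	struct sockaddr_in *g, *s;

	assert(gsr != NULL);
	memset(gsr, 0, sizeof(*gsr));
	gsr->gsr_interface = ifindex;
	g = (struct sockaddr_in *)&gsr->gsr_group;
	s = (struct sockaddr_in *)&gsr->gsr_source;
	g->sin_family = AF_INET;
	s->sin_family = AF_INET;
#ifdef __FreeBSD__
	/* BSD sockaddrs carry an explicit length field. */
	g->sin_len = sizeof(struct sockaddr_in);
	s->sin_len = sizeof(struct sockaddr_in);
#endif
	if (inet_pton(AF_INET, group, &g->sin_addr) != 1 ||
	    inet_pton(AF_INET, source, &s->sin_addr) != 1)
		return (-1);
	return (0);
}

/* Join (source, group) on an existing UDP socket; -1 on failure. */
static int
ssm_join_v4(int sock, uint32_t ifindex, const char *group,
    const char *source)
{
	struct group_source_req gsr;

	if (ssm_fill_v4(&gsr, ifindex, group, source) != 0)
		return (-1);
	return (setsockopt(sock, IPPROTO_IP, MCAST_JOIN_SOURCE_GROUP,
	    &gsr, sizeof(gsr)));
}
```

A caller would typically obtain the index with if_nametoindex() and pick a group from the 232.0.0.0/8 SSM range, e.g. ssm_join_v4(s, if_nametoindex("em0"), "232.1.1.1", "192.0.2.10"); the interface name and addresses here are placeholders.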
Re: WiFi channel bonding with netgraph - possible ? Back end needed ?
Gore Jarold wrote: Comments ? The idea is to bond things into one, single, usable connection that could provide multiple connections worth of bandwidth for single-threaded network transactions (like downloading a single file from an ftp server). Perhaps there is a better tool to do this with than netgraph ? You could try pf's load-balanced NAT feature to deal with the NAT. This of course assumes you can configure a FreeBSD router directly with line presentation i.e. not using an intermediate box between it and the link to the ISP. However, I can't say I like the idea much of trying to tie all those nodes together with tunnels transiting the ISP. Sounds like a clear cut case for 802.11s ESS Mesh... which isn't available yet. BMS ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: fake MAC addresses and ARP
Some ideas:

1. Enable IFF_STATICARP on your interface to stop ARP sending out to resolve the IP/MAC address tuple.

2. Consider that you can deal with resolution in userland (RTF_RESOLVE) but this involves changing the net's entry (route) in the FTE. You'd then process RTM_RESOLVE messages and install routes yourself -- it's possible to do arp in userland with this.

3. Try to avoid using the 169.254.0.0/16 prefix as it has a specific meaning. We don't implement interface scoping for these addresses yet so the FTE can't deal with them appearing more than once for the same subnet; it may be easier to pick something else -- note that if ARP is enabled for an interface with one of these addresses, all ARP traffic is forced to be broadcast as per the zeroconf RFCs.

BMS
Re: rtentry and rtrequest
Alan Garfield wrote: Hi all! One word HOW! :) I've no clue what this FreeBSD ARP stuff is all about, there is little or no documentation, there are 14 different sock_addr's which seem to have a bazillion different fields, and I cannot output a simple debug statement without getting 'error: dereferencing pointer to incomplete type' errors! The ARP code is pretty well documented in TCP/IP Illustrated Volume 2 and hasn't really significantly changed. Whilst I personally dislike how reentry happens in some of the paths, it works. In BSD, ARP lives in the routing table, which can be confusing to newcomers; such entries have the RTF_LLINFO flag set. From the sounds of it, if you are having to fake MAC addresses, you would be better off just enabling static mode ARP on the interface, possibly also enabling IFF_SMART ('manages own routes') on your interface and explicitly purging and re-adding your ARP entries from within your driver rather than trying to hack the rtrequest code to munge things on the fly. arp_rtrequest() is driver-independent code and will get hooked up to your code anyway when the net/ framework notices that your driver is one of IFT_ETHER. Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
MFC of ether_input() changes
Hi,

Does anyone want to see these changes MFCed, or otherwise object to such an MFC?

The introduction of M_PROMISC did the following:

* Drop frames immediately if the interface is not marked IFF_UP.
* Always trim off the frame checksum if present.
* Always use M_VLANTAG in preference to passing 802.1Q frames to consumers.
* Use __func__ consistently for KASSERT().
* Use the M_PROMISC flag to detect situations where ether_input() may reenter itself on the same call graph with the same mbuf which was promiscuously received on behalf of subsystems such as netgraph, carp, and vlan.
* 802.1p frames (that is, VLAN frames with an ID of 0) will now be passed to layer 3 input paths.
* Deal with the special case for CARP in a sane way.

For end users the main change of interest will be the ability for FreeBSD to receive 802.1p frames, even if it doesn't do anything with the priority fields right now.

If I hear 'yeses' I will try to MFC this as time permits.

Regards,
BMS
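To make the "VLAN frames with an ID of 0" remark concrete: the 16-bit 802.1Q Tag Control Information word carries a 3-bit priority in its top bits and a 12-bit VLAN ID in its low bits, so a priority-tagged frame is simply one whose VLAN ID field is zero. The sketch below assumes the TCI has already been converted to host byte order; the toy_ names are invented here and are not the kernel's macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_VLAN_VID_MASK	0x0fff	/* low 12 bits: VLAN ID */

/* Hypothetical decoded form of an 802.1Q tag. */
struct toy_vlan_tag {
	uint8_t  pcp;	/* 3-bit priority code point */
	uint16_t vid;	/* 12-bit VLAN identifier */
};

/* Split a host-byte-order Tag Control Information word. */
static struct toy_vlan_tag
toy_vlan_parse(uint16_t tci)
{
	struct toy_vlan_tag t;

	t.pcp = (uint8_t)(tci >> 13);
	t.vid = (uint16_t)(tci & TOY_VLAN_VID_MASK);
	return (t);
}

/* A priority-tagged ("802.1p") frame carries VLAN ID 0. */
static bool
toy_vlan_is_priority_tagged(uint16_t tci)
{
	return ((tci & TOY_VLAN_VID_MASK) == 0);
}
```

A receive path modelled on the list above would hand a frame to the layer 3 input path when toy_vlan_is_priority_tagged() is true, and to the vlan demultiplexer otherwise.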
Re: [CODE DROP] Source-Specific Multicast for FreeBSD 7: Phase 1
I've had some feedback from Robert Watson which has been factored into the branch. Thanks, Robert! If I hear no objections I'll aim to commit this code to -CURRENT within the next week, subject to approval. No MFC is planned because of the magnitude of the change. Regards, BMS
Re: MFC of ether_input() changes
Actually, I thought the change which moved the VLAN tag out of the mbuf tag pool and into the mbuf packet header had also been MFCed. It has not. As CURRENT is the branch normally used for feature development it is probably best I don't MFC this unless the VLAN tag change is MFCed also. Therefore there is not a lot of point in merging this change apart from to benefit from the code cleanup which M_PROMISC offers, so I'll back off for now. Cheers... BMS
Re: Why can't I sendto() to 127.255.255.255
Abraham K. Mathen wrote:
> Is it possible to successfully sendto() on a UDP socket with 127.255.255.255 as the destination address? If yes, how can that be done.

No, because in FreeBSD, lo(4) is not implemented as a broadcast interface. It is a multicast capable software loopback interface. It has no concept of a broadcast domain. Unicast traffic, as well as multicast traffic, is looped back on this interface. You can see in the output of 'ifconfig lo0' that the BROADCAST flag is not set.

RFC 3330 says: "A datagram sent by a higher level protocol to an address anywhere within this block should loop back inside the host."

A few quick tests suggest this does not happen by default on FreeBSD. I suspect that this is because although lo0 is configured with 127.0.0.1/8 by default, a cloning interface route is not added, as ARP does not run on such an interface. Therefore only a host route for 127.0.0.1 appears in the table.

To tell the stack to transmit datagrams destined for 127/8 via lo0 you'd do the following:

    route -n add 127.0.0.0/8 -net -iface lo0

Nothing will reply as nothing is listening on that address (127.255.255.255).

You can configure multiple lo interfaces; they just don't participate in a broadcast domain, as they are not broadcast interfaces. However, how lo(4) is implemented has the peculiar side-effect that all loopback interfaces are in the same 'transmission domain'... tcpdumping on lo0 will show you traffic on lo1. All loopback ifnet instances see each other's traffic; it's just up to the stack to reject it if it's not destined for a configured address on that instance.

To try that, you'd 'ifconfig lo1 create' and 'ifconfig lo1 127.0.0.2/32', as FreeBSD's network stack does not really allow you to have more than one interface configured on the same subnet.

Regards,
BMS
Re: IPPNP
Ozgur Ozdemircili wrote: Hello, I have a network of 10.10.10.0 and the gw is at 10.10.10.1. GW is giving out ip with DHCP. If the client pc is configured with DHCP they can get the ip from the server and go out to internet easily. But if the client has* static Ip configured*, for example 192.168.0.2 with gw 192.168.0.1, they cannot find the GW and cannot go out. I need the clients to be able to find the gw and *go out to internet without changing their ip configurations. * I have searched for technology behind this and it seems like IPPNP is the solution. This technology is implemented in most of the hotspot gateways (nomadix, dlink etc) This sounds like another buzzword that the vendors just made up when everyone with a clue wasn't looking. Someone's created a Sourceforge project for it which is still empty, and search engine matches find many vendors describing this feature as 'unique'. The way it has been verbally beaten on Linux related lists is something I can't disagree with. :^) http://article.gmane.org/gmane.linux.network.bridge.ebtables.user/896 The short answer is that this appears to be some kind of MAC based gateway protocol; basically, an 'internet access device' (stub router for home/small office) will forward any traffic it sees for a subnet which it isn't configured with, by spoofing ARP traffic so as to make it appear as though it is on that subnet. This sounds like a configuration nightmare to implement correctly, and goodness help you if you have more than one of these things connected to the same network. Whilst it probably can be done in the network stack, I speculate it couldn't be turned on at the same time as a number of other features such as Proxy ARP, or CARP, and may have problems scaling to more than a two-armed router (that is, 1 WAN uplink, and 1 Ethernet interface running this stuff). It also seems to rely on a few assumptions (subnet is a /24, and 0 is the subnetwork address). 
I think it also assumes that the network is 802.1x or MAC address ACL authenticated, and that clients are directly attached to the Layer 2 domain to which the interface which runs the quirk is attached. I see no IETF standard for this quirk, and it doesn't seem to have been as well thought out as the Zeroconf proposals. So whilst it may seem like a quick fix, it would have to be implemented, and the easiest thing for FreeBSD users to do in this situation is probably just to configure a separate network alias on their internal LAN interface -- something which is obviously more difficult to do with the kind of device the quirk is intended for. Now that I think on it, if IPv4 addresses are scoped, it might be possible to implement a knob which says "All IPv4 addresses learned on *this* interface have local scope", which in turn implies a 1:M NAT. It relies of course on the NAT module e.g. pf being able to join the dots and notice that an IP address outside of a configured subnet is being used on that interface, however, it would stop the forwarding code doing the wrong thing and forwarding the datagram back out the WAN interface again after pf has demuxed the inbound datagram there. To implement this feature properly requires that the forwarding code is changed to allow it. The changes required are similar to those needed for doing unnumbered IP. An interim solution could probably be implemented in userland using bpf which would stash the appropriate firewall rules to rewrite the outgoing traffic i.e. make the FreeBSD router appear on the subnet which the client thinks it's on. Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: kern/108197: [ipv6] IPv6-related crash if if_delmulti
Andre Oppermann wrote: Synopsis: [ipv6] IPv6-related crash if if_delmulti Responsible-Changed-From-To: freebsd-net->bms Responsible-Changed-By: andre Responsible-Changed-When: Sun May 13 18:36:25 UTC 2007 Responsible-Changed-Why: Send over to BMS. He's active in that area and may have fixed the bug already. http://www.freebsd.org/cgi/query-pr.cgi?pr=108197 Sorry, but I have no time to look at this at the moment. Is someone else free to look at it? The fix probably needs to be borrowed from the IPv4 code which adds an address to an interface. This wouldn't be the final fix; the root issue, to my mind, is that protocol specific state is contained within struct ifnet, when it probably shouldn't be. The address configuration code in both cases is therefore somewhat convoluted; FreeBSD lazy-allocates protocol domain structures for an instance of struct ifnet, rather than making the attachment of a protocol domain to an ifnet an explicit operation. Thanks, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: asymetric speeds over gigE link
Wilkinson, Alex wrote: 0n Mon, May 21, 2007 at 07:39:06PM +0100, Tom Judge wrote: > I have also seen 700Mb/s sustained FreeBSD - FreeBSD using the openssh HPN > patch set and no extra tuning of the network stack. Which makes me > think that maybe the linux stack needs some tuning? What is the "HPN patch" ? http://www.psc.edu/networking/projects/hpn-ssh/ Pittsburgh Supercomputing Center high performance networking patches, which have been around for a few years and are maintained, available as part of ports/security/openssh-portable. Sadly, my patches for the ROT13 cipher have not made it into OpenSSH/OpenSSL as of yet. + Regards, BMS + Very capable of line rate encryption. And based on a mature USENET technology... ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Implement Multi-Protocol Label Switching (MPLS)
Luca Da Col wrote: Hello, I would like to know if someone is actively working on MPLS project for FreeBSD. I would also like to know if James Leu's MPLS implementation for Linux has been considered as starting point for this project. No one is actively working on this to the best of my knowledge, however, there has been work on updating the kernel support for the Click Modular Router which would probably be a more appropriate starting point for producing an MPLS implementation which would work in the FreeBSD kernel. BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: driver packet coalesce
Jack Vogel wrote: On 5/31/07, Wilkinson, Alex <[EMAIL PROTECTED]> wrote: 0n Wed, May 30, 2007 at 04:45:05PM -0700, Jack Vogel wrote: > Does any driver do this now? And if a driver were to coalesce > packets and send something up the stack that violates mss > will it barf? erm, what is meant by "coalesce" ? combining packets before sending to the stack, aka LRO. Yup - the firmware for the card's LRO engine would have to know not to coalesce packets not destined for the local host. I speculate many cards are not smart enough to do this, and LRO is an all-or-nothing proposition, as it's a technology designed to optimize for hosts, not routers; see recent discussions/slanging matches on end2end. At the moment there is no central place where we track all layer 2 addresses for which traffic should be delivered locally. This would logically belong in struct ifnet, and clients e.g. CARP would have to be taught to add their layer 2 endpoint addresses there. It seems acceptable to disable LRO if bridging is on and document this behaviour. BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Firewalling NFS
Eygene Ryabinkin wrote:
> NFSD binds to the port nfsd (2049) and for my -CURRENT both lockd and statd have '-p' options:
> -
> $ man rpc.lockd rpc.statd | grep -- -p
>      rpc.lockd [-d debug_level] [-g grace period] [-p port]
>      -p      The -p option allow to force the daemon to bind to the specified
>      rpc.statd [-d] [-p port]
>      -p      The -p option allow to force the daemon to bind to the specified
> -
> Are we talking about same entities?

I added the -p switch to mountd(8) a few years ago, as I needed to run a read-only NFS server exposed to the outside world; to firewall it I needed a deterministic RPC port number, which is what -p gives you. Otherwise you have to rely on the TCP wrapper support built into rpcbind(8).

The rpc.lockd and rpc.statd daemons were recently changed to incorporate this switch too, although I don't think it has been backported to the 6-STABLE branch yet.

Regards,
BMS
Re: new ARP code review
Julian Elischer wrote:
> I have some thoughts on this. Firstly, while it is interesting to have an arp table (ok LLA table) on each interface, I'm not sure that it gains you very much.

Unfortunately maintaining a single ARP table is insufficient for supporting multiple paths within the IPv4 stack.

Even without supporting multiple routing paths, we would still need to break out the ARP cache in this way so as to support being attached to the same layer 2 domain properly (i.e. two network cards on the same Ethernet segment or switch). At the moment if_bridge and netgraph are our get-out-of-jail-free cards; they cause the IPv4 stack to be bypassed.

> As mentioned elsewhere, the connection of the arp information with the routing table means that the arp lookup is virtually free. Or, at least it used to be in the Uniprocessor world. It's hard to beat free.

It's hard to beat hard figures, which is something we don't have at the moment. What we do have is a set of design considerations. Intuition would suggest that one lock performs better than two; however, it depends on the nature of the lock and on the nature of the data structure lookup.

> The comment "Eventually, with this structure you can do the route lookup only when you need to find the next hop (e.g. when a route changes etc.) and just the much-cheaper L3-L2 map in other cases." makes me wonder... If we are not caching the arp code in the route any more, then how do we avoid doing a route lookup on each packet?

I don't think you can ever avoid doing a lookup of any kind per packet if you're running a router. What you can do is amortize lookup cost over time, e.g. two expensive initial lookups followed by one cheaper lookup for subsequent packets.

Whatever happens, though, has to play nice with policy forwarding and source selection. This is what complicates matters - otherwise I'd just suggest keeping a per-interface hash of ARP entries, an IPv4 routing trie, and a per-destination cache hash which returns the combined lookup against the trie and the L2 hash -- pretty much what Luigi is suggesting.

> BTW having a per interface arp table does make sense if there is a particular thread that is responsible for that interface, as only it would need access to the table and it could be done lock-free if one was careful enough.

The ARP code has to change, that much is certain, but the locking strategy has yet to be decided. ARP entries are read far more often than they are written, so it seems reasonable that a different lock is used.

BMS
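The amortization idea in this message (pay for the route lookup and the L2 lookup once, then serve later packets for the same destination from a per-destination cache) can be shown with a toy model. This is a sketch under loud assumptions: the slow path below is a placeholder that fabricates a MAC address instead of walking a routing trie and an ARP hash, the cache is direct-mapped with no locking, and it ignores the policy forwarding and source selection concerns raised above. None of the names come from the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_CACHE_SLOTS	64

/* Hypothetical cached result of a combined L3 + L2 lookup. */
struct toy_nexthop {
	uint32_t dst;		/* IPv4 destination, host byte order */
	uint8_t  mac[6];	/* resolved layer 2 address */
	int	 valid;
};

static struct toy_nexthop toy_cache[TOY_CACHE_SLOTS];
static unsigned toy_slow_lookups;	/* counts full lookups performed */

/*
 * Placeholder slow path standing in for the route trie walk plus the
 * per-interface L2 hash lookup; it fabricates a MAC from the address.
 */
static void
toy_slow_lookup(uint32_t dst, uint8_t mac[6])
{
	toy_slow_lookups++;
	mac[0] = 0x02;
	mac[1] = 0x00;
	mac[2] = (uint8_t)(dst >> 24);
	mac[3] = (uint8_t)(dst >> 16);
	mac[4] = (uint8_t)(dst >> 8);
	mac[5] = (uint8_t)dst;
}

/* Resolve dst, taking the slow path only on a cache miss. */
static const uint8_t *
toy_resolve(uint32_t dst)
{
	struct toy_nexthop *e = &toy_cache[dst % TOY_CACHE_SLOTS];

	if (!e->valid || e->dst != dst) {
		toy_slow_lookup(dst, e->mac);
		e->dst = dst;
		e->valid = 1;
	}
	assert(e->valid && e->dst == dst);
	return (e->mac);
}

/* Invalidate everything, e.g. after a route or ARP change. */
static void
toy_cache_flush(void)
{
	memset(toy_cache, 0, sizeof(toy_cache));
}
```

Two destinations that hash to the same slot simply evict each other in this model; a real design would also need an invalidation scheme finer than a full flush, which is where the locking question discussed above comes back in.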
Re: how do you bring IPv6 live without reboot?
ghozzy wrote: I've found a way: # sysctl net.inet6.ip6.auto_linklocal=1 # ifconfig em0 down up will assign link-local address to interface. after all required interfaces have link-local addresses, run /etc/rc.d/network_ipv6 start and all will be set ! :) Well, this may work now, however, don't depend on this behaviour in future releases. The fact that it does work at all is to do with how protocol domain attach works with struct ifnet. I am thinking that in future a lot of this should change, in order to avoid a number of issues we currently have -- this (the inability to re-attach IPv6 without taking down the entire interface) is one of them. BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Vimage virtual networking and 7.0
Julian Elischer wrote: In the future I am hoping to be able to use vimage in our products. They are based at the moment on 6.1, but I can see in a year they will be based on 7.x. Patches for 7.0 and vimage are currently available in perforce. What I would like to see is if there are any parts of that patch that would allow us to make adding of vimage to 7.1 an easier task. For example, Anything that would prevent vimage from needing an API change that would prevent it from being added later. My concern is that this may have already happened. I've been trying to do my bit as the years edge on to clean up the networking stack and fix bugs. One of my concerns is that the vimage change, which attempts to take network stack globals and wrap them into one big structure, may intrude on this or be subject to bitrot due to other development. I am quite disappointed that despite Marko's best efforts, we miss the 7.0 release but if it can be made nonintrusive enough I'd really like to see if it can get in 7.1. I appreciate all the hard work Marko has done on this, though I wonder if even 7.1 is ambitious. Personally, if I were "god" I'd put it in now because it can be compiled out. and it wouldn't be compiled by default.Maybe only just bits of it.. for sure I want the ability to have many routing tables. and I'm not thrilled about the requirement to have my own patch sets for this and thus not allowing others to use this feature. I think there are deeper issues in the network stack overall which need to be addressed, such as our lack of support for multipathing, scoped addresses, and all the tidyups which need to happen in struct ifnet to deal with this. My concern is that vimage may be a very intrusive change indeed where these matters are concerned, unless the vimage patches are being kept up-to-date and regression tested as issues are resolved and new features added. 
BMS
Re: how do you bring IPv6 live without reboot?
George Michaelson wrote: it's interesting that when I sent-pr'd this, I got tut-tutted back to freebsd questions. In my books, not being able to do this kind of V6 maintenance work on the interface without taking it down probably deserves to be kept as an open bug!

I agree. Please mail me the PR number and I'll reopen it. However, I can't make any commitment about when I personally would get time to do this as I need to go off and work for a living. It does however strike me as a sound design choice to make.

The network stack design in Windows mandates that this is how it has to be -- TDI bindings must be explicitly made between the stack and the NDIS driver(s). Loopback is handled at TDI layer and does not appear until the PF_INET6 domain is attached to the system. Linux has gone part of the way down this road. I believe BSD should do so as well, for a plethora of reasons including this one, as well as disentangling protocol domain stuff from struct ifnet. As a result ifconfig would get shaken up a bit.

At the moment, the way the BSD stack works, neither IPv4 nor IPv6 is attached to struct ifnet until an address is explicitly configured, either by the user or by the kernel configuring a link-local address, so if you do need to purge protocol domain wide state, your current options are to remove the interface (which has been shown to cause problems, some of which I have been trying to fix) or reboot. regards, BMS
Re: kern/113842: enabling IPv6 post-boot didn't work: required reboot
Synopsis: enabling IPv6 post-boot didn't work: required reboot
Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: bms
Responsible-Changed-When: Sun Jun 24 22:37:28 UTC 2007
Responsible-Changed-Why: Real issue. I may get around to this but am currently allocated on non-FreeBSD stuff.
http://www.freebsd.org/cgi/query-pr.cgi?pr=113842
Re: IPv6 Woes...
Is your routing table correct? My default route entry for IPv6 just looks like this:

default fe80::%gif0 UGS gif0

and gif0 just looks like this:

gif0: flags=8051 mtu 1280
        tunnel inet a.b.c.d -> x.x.x.x
        inet6 fe80::XXX:XXX:%gif0 prefixlen 64 scopeid 0x8
        inet6 2001:ZZZ:ZZZ::: prefixlen 128

In the output you posted, the next-hop of 2001:4980:1::5 will need to be resolved via NDP (hence the LW flags). You already have a 1:1 endpoint mapping due to the use of the gif IPIP header, so the upstream shouldn't need any other tag to demux your traffic. You shouldn't need to do anything special with Ethernet in your configuration. Hope this helps. BMS
Re: IPv6 Woes...
Eric F Crist wrote: My problem isn't getting out to 2001:4980:1::5, it's getting to my LAN, the 2001:4980:1:111::/64 network. My gateway, the machine from which I posted the routing and ifconfig information, is able to ping across the tunnel, and to the internet just fine. Nothing is able to get from the gateway to my LAN, however. Is it a problem with the fxp driver, or perhaps my setup with the ethernet bridging? You appear to have a /64 network address on the inside of your v6 router. Are you using stateless address auto-configuration? You appear to have statically assigned ::145 as a host address on that net. My setup works fine if I ping the network address of my v6 router from the v6 enabled hosts in my lab. When you ping local machines on the inside LAN from that router, do you see NDP entries being created? You shouldn't need to use bridging to achieve what you want in this scenario, in fact it makes no sense because you want to route v6 traffic over the gif, therefore ethernet bridging is not relevant here. regards BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
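The distinction drawn in the reply above, between addresses that are on-link on the inside /64 (and so are resolved with NDP) and the tunnel next-hop, can be checked with a few lines of Python. This is only a sketch: the prefix and next-hop come from the thread, but the full host address is an assumption built from the "::145" host part mentioned in the reply.

```python
import ipaddress

# Prefix and tunnel next-hop as quoted in the thread.
LAN_PREFIX = ipaddress.ip_network("2001:4980:1:111::/64")
TUNNEL_NEXT_HOP = ipaddress.ip_address("2001:4980:1::5")

# Assumption: the statically assigned "::145" host lives in the LAN prefix.
HOST = ipaddress.ip_address("2001:4980:1:111::145")


def on_link(addr, prefix):
    """True if addr falls inside prefix, i.e. the router should reach it
    directly on the inside interface (via NDP) rather than over the gif."""
    return addr in prefix


if __name__ == "__main__":
    print("host on LAN prefix:    ", on_link(HOST, LAN_PREFIX))
    print("next-hop on LAN prefix:", on_link(TUNNEL_NEXT_HOP, LAN_PREFIX))
```

If the host is on-link but no NDP entry ever appears for it on the router, the problem is more likely on the inside segment than in the tunnel.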
Re: 6.2 mtu now limits size of incomming packet
Mike Karels wrote: I'd be happy to see the change undone as well. I (well, our test group) found this change in a similar way, and it didn't agree with our previous usage.

In -CURRENT my changes to the ethernet input path maintain the use of ETHER_MAX_FRAME() however the check is folded under #ifdef DIAGNOSTIC. I don't recall adding this conditional or touching it so it seems to be something which was already there or added by someone else. Could be pilot error; its use in -CURRENT seems to apply strictly to the use of large-receive offload (LRO). regards, BMS
Call for testers: multicast forwarding
Hi, I may have some commercial work coming up which requires me to make modifications to the IPv4 multicast forwarding code in Linux. It is likely I will prototype the work in FreeBSD. It will probably not be released publicly. To prepare for this I have started cleaning up the MROUTING code: using more appropriate data structures, working on removal of the 32 vif limitation and other refinements, and removing legacy code which is no longer useful. I'd like to hear from anyone using multicast forwarding on FreeBSD who would be interested in testing these changes and suggesting other improvements. They will most likely not make the 7.0 release but may appear in future versions. Code is not yet available as a patch set. I am working in the p4 branch bms_netdev. regards, BMS

P.S. It would be good if there were a way of giving the general public read-only access to the p4 tree; this is becoming a blocking limitation of the tool for open development.
[PATCH] add check for IP Router Alert
Please see the following patch which adds a check for the IP Router Alert option, for use by in-kernel IPv4 protocol domain consumers: http://people.freebsd.org/~bms/dump/ipoptions-routeralert.patch Comments/review before commit appreciated. regards BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
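For readers unfamiliar with the option, here is a rough sketch of what a Router Alert check involves. It is not the code from the patch above (which is not reproduced here); it is a user-space model that walks the IPv4 options area looking for option type 148 with length 4, as defined in RFC 2113. The function and helper names are made up for illustration.

```python
IPOPT_EOL = 0    # end of option list
IPOPT_NOP = 1    # one-byte padding
IPOPT_RA = 148   # Router Alert (RFC 2113): copied=1, class=0, number=20


def has_router_alert(header: bytes) -> bool:
    """Return True if the IPv4 header carries a well-formed Router Alert."""
    if len(header) < 20:
        return False
    ihl = (header[0] & 0x0F) * 4          # header length in bytes
    if ihl < 20 or ihl > len(header):
        return False
    i = 20                                # options start after the fixed part
    while i < ihl:
        opt = header[i]
        if opt == IPOPT_EOL:
            break
        if opt == IPOPT_NOP:
            i += 1
            continue
        if i + 1 >= ihl:                  # no room for the length byte
            return False
        optlen = header[i + 1]
        if optlen < 2 or i + optlen > ihl:
            return False                  # malformed option
        if opt == IPOPT_RA and optlen == 4:
            return True
        i += optlen
    return False


def make_header(options: bytes = b"") -> bytes:
    """Hypothetical helper: minimal IPv4 header (all other fields zero)."""
    if len(options) % 4:
        raise ValueError("options must be padded to a multiple of 4 bytes")
    ihl_words = 5 + len(options) // 4
    return bytes([0x40 | ihl_words]) + bytes(19) + options


if __name__ == "__main__":
    print(has_router_alert(make_header(bytes([148, 4, 0, 0]))))
    print(has_router_alert(make_header()))
```

The bounds checks matter more than the match itself: an in-kernel consumer has to treat a malformed options area as "no alert" rather than read past the header.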
Re: divert and deadlock issues
Christian S.J. Peron wrote: ... One idea was to duplicate the socket options mbuf and pass in a NULL pointer for the multi-cast options. Keep in mind that these are multicast options associated with a divert socket. So I guess the questions: (1) Are there any users that are specifying multicast options on divert sockets? (2) Are there any users that are specifying socket options in general for divert sockets? The LOR is obviously being triggered by ip_output()'s acquisition of in_multi_mtx, due to a datagram being sent to a multicast destination and a subsequent lookup being required. I can't think of a reason why a user would wish to supply any multicast socket options to a divert socket, other than the 'small' ones, i.e. IP_MULTICAST_TTL/IF/LOOP/VIF. See the comments about idempotence inside in_mcast.c on the HEAD branch, about why you can't just wish them away. It seems reasonable that this subset of the multicast options are supported for divert sockets given the likely use cases, even if IPPROTO_DIVERT supports IP_HDRINCL, because IP_MULTICAST_TTL does not do what you think it does (see in_mcast.c comments again). Joining groups on a divert socket SHOULD NOT be supported (it does not make sense semantically) and we should deliberately return EINVAL for multicast options other than the above subset. Dropping the inpcb lock over ip_output() looks like the easy option. Alternatively, we could just not support multicast options on divert sockets given that it is a rare use case as per above. BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
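The policy suggested above (accept only the 'small' multicast options on a divert socket and return EINVAL for the rest) amounts to a simple filter. The sketch below models it in Python with option names as strings; the function name and the exact list of rejected options are illustrative assumptions, not the kernel's actual code.

```python
import errno

# The 'small' options named in the message.
ALLOWED_ON_DIVERT = {
    "IP_MULTICAST_TTL",
    "IP_MULTICAST_IF",
    "IP_MULTICAST_LOOP",
    "IP_MULTICAST_VIF",
}

# Assumption: group membership options stand in for "everything else".
MULTICAST_OPTIONS = ALLOWED_ON_DIVERT | {
    "IP_ADD_MEMBERSHIP",
    "IP_DROP_MEMBERSHIP",
}


def divert_sockopt_check(optname: str) -> int:
    """Return 0 if the option may be set on a divert socket, else EINVAL.

    Non-multicast options are passed through (0) because this sketch only
    models the multicast policy discussed in the thread.
    """
    if optname in MULTICAST_OPTIONS and optname not in ALLOWED_ON_DIVERT:
        return errno.EINVAL
    return 0


if __name__ == "__main__":
    for name in ("IP_MULTICAST_TTL", "IP_ADD_MEMBERSHIP", "IP_HDRINCL"):
        print(name, divert_sockopt_check(name))
```

The stricter alternative mentioned at the end of the message (no multicast options at all on divert sockets) would correspond to emptying the allowed set.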
Re: divert and deadlock issues
Christian S.J. Peron wrote: I can't think of a reason why a user would wish to supply any multicast socket options to a divert socket, other than the 'small' ones, i.e. IP_MULTICAST_TTL/IF/LOOP/VIF. Why would these options ever be set on the divert socket itself though? To me it would make sense if these options were set on the network socket that originally sent the multicast packet itself. They shouldn't be necessary, however I can foresee situations where someone might well want to redirect multicast datagrams traversing an IPPROTO_DIVERT socket, by using these socket options. [Recall that FreeBSD's IPv4 stack currently uses the destination address as the sole primary key for lookups in the forwarding information base's radix trie.] This is however very unlikely, so my last suggestion, that multicast options be deprecated or forbidden for IPPROTO_DIVERT sockets, stands. Kind regards BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: divert and deadlock issues
Christian S.J. Peron wrote: Well, it's still the intent to keep the ability to divert and re-inject multicast packets. This change would basically say: "You can't specify multicast options via the divert socket". Which in practice doesn't happen anyway (where I looked). I don't think we should be specifying multicast options on divert sockets. It's not the right place to be manipulating multicast parameters. Multicast parameters should be set on the sockets that originally transmitted or received the packets. I don't think divert falls into this category.

Correct. The definition of what a divert socket is and does falls outside the definition of what a multicast socket endpoint is. Divert sockets exist to munge packets as they flow up or down the stack. If the additional complexity of treating divert sockets as multicast endpoints causes locking issues in the stack, common sense suggests we should deprecate that behaviour. BMS
Re: routing local traffic w/o using loopback interface
rajneesh rana wrote: hello all, i am opening up two tap interfaces, both connected to bridge, assigning them IP addresses and want to open up tcp connection b/w them without using loopback interface, so i bind client socket to first tap using SO_BINDTODEVICE option and socket server listening on other tap device. The problem is that when i am calling connect, it is giving timeout error.

I am confused by your question because to the best of my knowledge the SO_BINDTODEVICE socket option does not exist in FreeBSD.

Is it possible to route traffic b/w two interfaces of same machine w/o using loopback interface and kernel hacking.

Yes, I use if_bridge for this on a daily basis. regards, BMS
Re: Failover default route?
Tuc at T-B-O-H.NET wrote: In my case, as always, it's a bit "special". I have 2 OPENVPN tunnels, which I sent over different transits to the same end host. On that host, I do my NAT. SO, without getting into all sorts of hot/heavy things, is there a simple program to install to ping something via the first tunnel, and if it can't then switch my default route to the second tunnel? Or, do I just use a script like here :

As Bill correctly points out, reachability detection using a routing protocol is often the preferred method, however this isn't always available. Pinging is NOT the best practice, see RFC 1122 3.3.1.4: http://www.freesoft.org/CIE/RFC/1122/56.htm

You could use ifstated to detect changes in the tunnel interface status and switch default routes accordingly, though it doesn't significantly reduce the amount of manual scripting you have to do. Microsoft's TCP implementation performs dead gateway detection based on triggered reselection as per RFC 816, however, they have a multipath capable FIB which can hold the multiple next-hops and their state -- something to consider for later. An incremental piecemeal change which folks might find OK may be to add cost metrics back to the kernel radix trie, but that still has all the aggro of changing the API. regards BMS
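Whatever detects the tunnel state (ifstated, a routing daemon, or a script), the switching step itself is small. The sketch below shows only that decision logic; it does not probe anything or change any routes. The gateway addresses are documentation placeholders, the function names are made up, and the `route change default` command it builds is an assumption about how one would apply the result on FreeBSD.

```python
PRIMARY_GW = "192.0.2.1"     # placeholder: next-hop via the first tunnel
BACKUP_GW = "198.51.100.1"   # placeholder: next-hop via the second tunnel


def choose_gateway(current, primary_up, backup_up,
                   primary=PRIMARY_GW, backup=BACKUP_GW):
    """Prefer the primary tunnel, fall back to the backup, and keep the
    current default route if neither tunnel is reported as up."""
    if primary_up:
        return primary
    if backup_up:
        return backup
    return current


def route_command(current, wanted):
    """Return the command (as an argv list) needed to switch the default
    route, or None when no change is required."""
    if wanted == current:
        return None
    return ["route", "change", "default", wanted]


if __name__ == "__main__":
    wanted = choose_gateway(PRIMARY_GW, primary_up=False, backup_up=True)
    print(route_command(PRIMARY_GW, wanted))
```

Keeping the decision separate from the detection makes it easy to swap the ping-based check the original poster asked about for interface state changes, as suggested in the reply.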
Re: Route caching ?
Ivo Vachkov wrote: Does FreeBSD rtalloc*() (or any other) functions implement route caching and how ? I looked at the code but it's not exactly easiest thing to read / understand :) Not really, at least, not in the way one would think. rtalloc() is a legacy function. ip_output() will still call rtalloc() if you pass it a filled out 'struct route', a structure which is not a route, but an internal request to look up a route. This is a wrapper for rtalloc_ign(), which in turn is a wrapper for rtalloc1(), the function which does the actual lookup. rtalloc_ign() is pretty straightforward. Note however that this approach only checks the RTF_UP flag and ifp, nothing more. This makes it suitable for implementing floating statics, but nothing more dynamic than that. regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
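The behaviour described above, where a 'struct route' holds a lookup request plus the last result and the result is reused only while it is RTF_UP and still has an interface, can be modelled in a few lines. This is a simplified user-space sketch, not the kernel code: the classes are hypothetical, the routing table is a plain dict standing in for the radix trie, and locking and reference counting are left out.

```python
RTF_UP = 0x1  # "route usable" flag, as defined in net/route.h


class RtEntry:
    """Stand-in for struct rtentry: just the fields the check looks at."""
    def __init__(self, gateway, ifp, flags=RTF_UP):
        self.gateway = gateway
        self.ifp = ifp
        self.flags = flags


class Route:
    """Stand-in for struct route: a destination plus the cached result."""
    def __init__(self, dst):
        self.dst = dst
        self.rt = None


def rtalloc(ro, table):
    """Reuse ro.rt if it is still up and has an interface; otherwise drop
    it and look the destination up again (the rtalloc1() step)."""
    rt = ro.rt
    if rt is not None and (rt.flags & RTF_UP) and rt.ifp is not None:
        return rt
    ro.rt = table.get(ro.dst)
    return ro.rt


if __name__ == "__main__":
    primary = RtEntry("192.0.2.1", "em0")
    table = {"default": primary}
    ro = Route("default")
    print(rtalloc(ro, table).gateway)   # first call performs the lookup
    primary.flags &= ~RTF_UP            # interface goes away
    table["default"] = RtEntry("198.51.100.1", "em1")
    print(rtalloc(ro, table).gateway)   # stale entry is replaced
```

Because only RTF_UP and the interface pointer are checked, a cached entry keeps being used even if a better route appears in the table, which is why the message describes this as suitable for floating statics and nothing more dynamic.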
Re: Allocating AF constants for vendors.
I second Max. If you are going to introduce a bunch of AF_* constants into the tree you have to be very careful as AF_MAX is used to size arrays and figure out how many radix trie heads to allocate. It could be argued this wastes a bunch of CPU time and memory, though I speculate 'not much' at the moment; I am just a bit concerned that we have ifnet->if_afdata which is also sized based on AF_MAX, 37, even though most of the protocols in it are never attached to ifnets. The only domain I've seen which really uses if_afdata is PF_INET6. PF_INET does not use it at all.

In my opinion, there are structures per-family per-ifnet which really belong hung-off ifnet on a 1:1 basis and would simplify some of the lazy allocations we have further down in the stack. If AF_MAX increases significantly so will wasted memory. If you are going to make any significant changes here, please consider moving this stuff to a more dynamic method of allocation.

On the other hand, if you don't need to reference these constants in the kernel at all, and they will all exist beyond AF_MAX, then you can disregard what I've said and append them to the rest of the list. That is pretty much what happens for the libpcap/bpf DLT constants (which are not an exact analogue of the AF constants - we don't allocate other, larger kernel structures based on their value). regards, BMS
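A back-of-the-envelope calculation gives a feel for the 'not much' above. The sketch assumes if_afdata is an array of AF_MAX pointers per ifnet and a 64-bit kernel with 8-byte pointers; both are assumptions for illustration, as is the interface count, and the radix trie heads are not counted.

```python
AF_MAX = 37        # value quoted in the message
POINTER_SIZE = 8   # bytes; assumes a 64-bit kernel such as amd64


def afdata_bytes(num_ifnets, af_max=AF_MAX, ptr=POINTER_SIZE):
    """Total bytes spent on per-ifnet arrays sized by AF_MAX."""
    return num_ifnets * af_max * ptr


def unused_bytes(num_ifnets, families_in_use, af_max=AF_MAX, ptr=POINTER_SIZE):
    """Bytes in those arrays whose slots are never attached to anything."""
    return num_ifnets * (af_max - families_in_use) * ptr


if __name__ == "__main__":
    n = 100  # hypothetical number of interfaces
    print("per ifnet:", afdata_bytes(1), "bytes")
    print("total:    ", afdata_bytes(n), "bytes")
    print("unused:   ", unused_bytes(n, families_in_use=1), "bytes")
```

With only PF_INET6 using its slot, nearly all of the array is idle, and the figure grows linearly with every constant added below AF_MAX.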
Re: Route caching ?
Ivo Vachkov wrote: Actually there is: struct route_in6 ip6_forward_rt; that "caches" the last route used (thanks blue !!!) but i think this technique is pointless in a multiflow traffic.

Yes, this is why OpenBSD got rid of this form of 'route caching'.

Is it reasonable to believe that route caches can improve networking performance or we should leave it up to the routing table itself ?

I believe that if one goes beyond a single radix trie, as is needed for multi-pathing with multicast and source policy routing, route caching is *required* to achieve good performance. Also, if FreeBSD moves ARP and NDP out of the radix trie, a route cache would be highly preferable as it amortizes the lock acquisition which would otherwise be required for ARP/NDP/other layer 2 next-hop resolution. BMS
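The point about a single 'last route used' cache being pointless under multiflow traffic is easy to see with a toy model. The sketch below counts hits for a one-entry cache keyed on destination; it is a simulation for illustration, not a measurement of any real stack, and the addresses are placeholders.

```python
def hit_rate(destinations):
    """Fraction of packets whose destination matches the single cached
    entry left behind by the previous packet."""
    cached = None
    hits = 0
    for dst in destinations:
        if dst == cached:
            hits += 1
        else:
            cached = dst      # miss: full lookup, then replace the entry
    return hits / len(destinations) if destinations else 0.0


if __name__ == "__main__":
    single_flow = ["2001:db8::1"] * 1000
    two_flows = ["2001:db8::1", "2001:db8::2"] * 500   # interleaved packets
    print("single flow:", hit_rate(single_flow))
    print("two flows:  ", hit_rate(two_flows))
```

Two interleaved flows are already enough to make every packet miss, which is the degenerate case a forwarding router sees all the time. A cache that helps there needs many entries and a sensible keying and eviction policy, which is a different design from ip6_forward_rt.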
Re: Route caching ?
Claudio Jeker wrote: Just because you believe that route caches are great doesn't mean it is true. Show some real code and include benchmarks with various workloads (e.g. a core router that is hit by many many many sessions).

It is a reasonable approach, for a uniprocessor design, to focus on optimizing the route lookup as much as possible. Does this approach scale to SMP, though? This is still a very much open question and from what I have seen of the OpenBSD implementation, it only addresses the uniprocessor case - again please correct me here if I have missed any details. I believe the Linux dst cache is strongly tied to the IBM-patented Read-Copy-Update algorithm based on what I've read about their LC-trie implementation.

Until now all caching solutions resulted in very bad performance on busy boxes. Remember ip_fastforward or how was it called? Another example are all crappy L3 switches that burn down if the CAM (cache) is flooded.

I assume you are referring to NetBSD's flow-based IP forwarding cache, which was implemented outside of the scope of SMP; spl-style interrupt priority masking was still in use at that time. It is established that saturating content-addressable memory is going to lead to the slow path being taken, however, that's the trade-off one makes with these designs.

IMO it is better to make the route lookup faster and forget about caching.

My concern is that you may be comparing apples with oranges here. In the case of SMP, locking does become a consideration, and caches, if carefully implemented, are one way of addressing this. On the other hand, CPU affinity has been proposed as a limited solution, however it depends how this is implemented - affinity for lookups, forwarding, or both? Perhaps there is something I am missing about how the OpenBSD implementation deals with SMP, as I am not as familiar with their code as FreeBSD's.
regards, BMS
Re: quagga 0.99.8 on current, tcpmd5 config confusion
Randy Bush wrote: just did a cvsup build and portupgrade of a six month old -current i386 system running quagga. quagga cranked to 0.99.8. i got slammed by bgp tcpmd5 requirement. bgpd[469]: can't set sockopt TCP_MD5SIG 0 to socket 17 bgpd[469]: can't set sockopt TCP_MD5SIG 0 to socket 18 bgpd[469]: can't set sockopt TCP_MD5SIG 0 to socket 22 madly googled and found that i needed to hack kernel for tcp md5 hash, even though i am not using md5 auth (these are not really infrastructure peerings. yes i know better for production). This I haven't seen before, then again, it's been years since I've used Zebra/Quagga let alone hacked the patch for md5 support, which is now ~3.5 years old. It was only ever intended as a belt-and-braces attempt at getting things up in a way which the sponsor was satisfied with, with no other refinements. I wasn't 100% happy about how I ended up doing the kernel support, and had to go with what I had working in my tree because of that old demon 'economics', rather than doing things 'the right way': i.e. in the IPSEC Security Policy Database (SPD), with the routing daemon loading the keys, rather than the Security Associations Database (SADB) and keys loaded manually using setkey(8). Other individuals have since made changes to this code. Now that we have settled on FAST_IPSEC thanks to gnn's hard work, it will be easier for Someone(tm) to pick this up, as KAME IPSEC and FAST_IPSEC interfaced to key sockets differently enough to change the implementation of the SPD. with this kernel, i got a lot of whining about no keys tcp_signature_compute: SADB lookup failed for 666.42.69.96 I remember putting in the SADB lookup failed message to help people track down problems with their configuration. If TCP_MD5SIG is not enabled on the tcp socket, no SADB lookup should happen, so you shouldn't be seeing this message. It sounds to me as though Quagga may be enabling the TCP_MD5SIG option unconditionally based on all of the output you've posted. 
This is obviously incorrect. I can't speak for Quagga, though it seems reasonable to suggest that it shouldn't be doing that unless you tell it to. I believe the MD5 patches only get pulled in if you request them, and that md5 auth specifically needs to be enabled per peer. Still, this is nearly 4 years on and I have other things going on now. regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: nc captures 1024 bytes
Looks like a netcat bug, if it doesn't tune buffers to the interface MTU. I'm not sure if nc has a 'de facto' maintainer however I believe it is something which was recently imported into the freebsd base system. Still, it is better to try to field patches with the upstream maintainer before filing a FreeBSD PR with your patches. regards, BMS
Re: [EMAIL PROTECTED]: Re: rtfree: 0xffffff00036fb1e0 has 1 refs]
Christian S.J. Peron wrote: I am not sure who has their hands in the routing code these days so I figured I would just forward this message off here. Does the following look reasonable? I'm looking, but mostly with long range goggles on. Yes, this looks like the right change. rtalloc1() always returns an rtentry with the mutex for that rtentry held. regards BMS
Re: nc captures 1024 bytes
Weiguang Shi wrote: nc might be waiting on all the interfaces; enumerating MTUs and choosing the largest sounds complicated, especially when some interfaces can be configured to receive jumbo frames. Why not just use something like 64KB as the other user suggested or something even larger?

That is the easy fix, yes. :^) If the socket's pcb laddr is bound to an IP, and the IP to which it is bound stays on the same physical interface, then the MTU may easily be obtained. If it's INADDR_ANY, or you expect the IP to be dynamically reconfigured on another interface, then auto-tuning is not possible. regards, BMS
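To illustrate why the buffer size matters, the sketch below sends one datagram over a local socket pair and reads it back with two different buffer sizes. It assumes the symptom in this thread comes from a fixed 1024-byte read buffer on a datagram socket, which the messages here do not spell out; it uses a Unix-domain datagram socket pair rather than nc or UDP so that it runs without any network setup.

```python
import socket


def receive_one(payload: bytes, bufsize: int) -> bytes:
    """Send payload as a single datagram and read it with recv(bufsize).

    On datagram sockets, whatever does not fit in the buffer is discarded
    rather than left for the next read.
    """
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        a.send(payload)
        return b.recv(bufsize)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    payload = b"x" * 4000
    print("1024-byte buffer:", len(receive_one(payload, 1024)), "bytes")
    print("64KB buffer:     ", len(receive_one(payload, 65536)), "bytes")
```

A 64KB buffer covers the largest possible IPv4 datagram, which is why it works as the easy fix without needing to know any interface MTU.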
Re: [EMAIL PROTECTED]: Re: rtfree: 0xffffff00036fb1e0 has 1 refs]
BTW: Casual inspection with kscope suggests there is a similar free-while-locked issue in nd6_ns_input() (netinet6/nd6_nbr.c) and in_arpinput() (netinet/if_ether.c). nd6_ns_input() references rt->rt_gateway after rtfree(), a potential race not to mention a use-after-free. I haven't checked Coverity for this, but it just doesn't look right. BMS
Re: vlan stacking
Ivan Alexandrovich wrote: Hi I'm wondering is anybody using double vlans ("q-in-q", "vlan stacking", any name you like) on production hosts? Does it play well with common ethernet device drivers in freebsd (concerning the frame size) - fxp, em, for example? Looks like that almost nobody mentions q-in-q in freebsd maillists/forums, except that nesting ng_vlan can be used to implement it. I'm sure you or someone else can come up with a creative solution for Q-in-Q or arbitrary nesting levels. It's not something I use, so, I pass. The mainline code doesn't support it without Netgraph; it would be necessary to allow vlan(4) to be nested. The ether_input() code demuxes 802.1q encapsulation but only 1 level. The reason for this is because the outer VLAN tag got moved into the mbuf pkthdr structure for if_bridge to be able to process it. I can't comment on the netgraph solution however. regards BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
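For reference, demultiplexing more than one level of tagging is mostly a loop over the 4-byte tag headers. The sketch below parses an Ethernet frame in user space and returns every VLAN ID it finds, outermost first. It illustrates the frame layout only and says nothing about how vlan(4) or ng_vlan would have to change; treating both 0x8100 and 0x88A8 as tag identifiers is an assumption about which encapsulations are in use.

```python
import struct

# Tag protocol identifiers: 802.1Q and 802.1ad (service tag).
TPIDS = {0x8100, 0x88A8}


def parse_vlan_stack(frame: bytes):
    """Return (vlan_ids, ethertype): VLAN IDs outermost first, plus the
    EtherType of the payload that follows the last tag."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    offset = 12                                   # skip dst + src MAC
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    vlan_ids = []
    while ethertype in TPIDS:
        if len(frame) < offset + 4:
            raise ValueError("truncated VLAN tag")
        tci, ethertype = struct.unpack_from("!HH", frame, offset)
        vlan_ids.append(tci & 0x0FFF)             # low 12 bits are the VID
        offset += 4
    return vlan_ids, ethertype


if __name__ == "__main__":
    macs = bytes(12)                              # zeroed dst and src MACs
    qinq = macs + struct.pack("!HHHHH", 0x88A8, 100, 0x8100, 200, 0x0800)
    print(parse_vlan_stack(qinq))
```

Each extra tag adds 4 bytes to the frame, which is where the frame size concern raised in the question comes from.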