Re: reassembled packets and pfil

2010-04-13 Thread Matthew Luckie

Is there any particular reason why reassembled packets were not
checked?  If the answer is no, I'll send in a PR.


I think it was just a random decision -- either pass packets to
the firewall before reassembly as we do, or after reassembly, as
linux does. Both have pros and cons.
Going through the firewall twice, however, is problematic because
far too many things (counters, dummynet, etc.) expect to see each
packet only once.


ok, thanks for letting me know.


I think that a patch like the one you propose is very useful (for
ipv4 as well) but it requires a sysctl or other mechanism to make
sure that when it is enabled we don't pass fragments through the
firewall.


I've looked further into this and I now wonder if it is a byproduct of my
use of ipfw.  The problem seems to be that offset will always be
non-zero with any packet with a v6 fragment header, so a rule requiring
offset to be zero is never run.  I'll spend a bit more time on this
tomorrow, and come back with a patch for ipfw.


Note this code:

  offset = ((struct ip6_frag *)ulp)->ip6f_offlg & IP6F_OFF_MASK;
  /* Add IP6F_MORE_FRAG for offset of first
   * fragment to be != 0. */
  offset |= ((struct ip6_frag *)ulp)->ip6f_offlg & IP6F_MORE_FRAG;
  if (offset == 0) {
          printf("IPFW2: IPV6 - Invalid Fragment Header\n");
          if (fw_deny_unknown_exthdrs)
                  return (IP_FW_DENY);
          break;
  }

This code seems to be incorrect, per RFC 2460:

   In response to an IPv6 packet that is sent to an IPv4 destination
   (i.e., a packet that undergoes translation from IPv6 to IPv4), the
   originating IPv6 node may receive an ICMP Packet Too Big message
   reporting a Next-Hop MTU less than 1280.  In that case, the IPv6 node
   is not required to reduce the size of subsequent packets to less than
   1280, but must include a Fragment header in those packets so that the
   IPv6-to-IPv4 translating router can obtain a suitable Identification
   value to use in resulting IPv4 fragments.  Note that this means the
   payload may have to be reduced to 1232 octets (1280 minus 40 for the
   IPv6 header and 8 for the Fragment header), and smaller still if
   additional extension headers are used.

A stack can send an IPv6 packet with a fragment header attached that 
does not have the MF bit set.  I'm 90% sure that FreeBSD itself will do 
this when it receives a PTB with an MTU value of (say) 1000.


Matthew
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: Host only TCP/IP implementation based on FreeBSD

2010-04-13 Thread Ivo Vachkov
Hello,

You can look at http://www.rtems.com/ Their TCP/IP stack is derived
from FreeBSD and is probably better suited for 'extraction' than the
current FreeBSD TCP/IP implementation. Also, the QNX public SVN
repository should contain QNX's fork of the NetBSD network stack as
a separate resource manager.

On Tue, Apr 13, 2010 at 12:54 AM, Vineet Dixit  wrote:
> Hi -
>
> My apologies in advance in case my question is not appropriate for this list.
>
> I am evaluating TCP/IP stack implementations for a device which
> requires only the end-device network features. I am keen on FreeBSD's
> TCP/IP implementation due to its long history of development, use on
> a wide range of devices and code maturity. However my requirements are
> limited to only transport protocols, IPv4 and IPv6, ARP, DHCP client
> and so on. I don't need advanced routing and forwarding, multicast,
> IPSec, QoS which are part of the distribution. FreeBSD's SMP support
> and fine-grained locking are certainly a bonus but aren't MUST-have
> features.
>
> Is there an implementation that is a trimmed down TCP/IP stack based
> on BSD which I could port to another RTOS? Looking for either
> commercial or open source implementation.
>
> Thanks.
>
> -- Vineet



-- 
"UNIX is basically a simple operating system, but you have to be a
genius to understand the simplicity." Dennis Ritchie


Re: kern/145462

2010-04-13 Thread Gleb Smirnoff
The following reply was made to PR kern/145462; it has been noted by GNATS.

From: Gleb Smirnoff 
To: Aleksey 
Cc: bug-follo...@freebsd.org
Subject: Re: kern/145462
Date: Tue, 13 Apr 2010 15:36:58 +0400

  IMO, this patch would be better:
 
 Index: ng_ipfw.c
 ===
 --- ng_ipfw.c   (revision 206495)
 +++ ng_ipfw.c   (working copy)
 @@ -264,11 +264,8 @@
  * Node must be loaded and corresponding hook must be present.
  */
 if (fw_node == NULL || 
 -  (hook = ng_ipfw_findhook1(fw_node, fwa->rule.info)) == NULL) {
 -   if (tee == 0)
 -   m_freem(*m0);
 +  (hook = ng_ipfw_findhook1(fw_node, fwa->rule.info)) == NULL)
 return (ESRCH); /* no hook associated with this rule */
 -   }
  
 /*
  * We have two modes: in normal mode we add a tag to packet, which is
 
 
 Can you please test it?  If you don't mind, I will then commit it.
 
 -- 
 Totus tuus, Glebius.


pf stalls connection when using route-to

2010-04-13 Thread Lin Jui-Nan Eric
Hi listers,

We recently found that when traffic passes through pf with route-to, the
connection stalls.
Turning off TSO solves the problem. Our pf.conf is very simple:

table  const {10/8, 172.16/12, 192.168/16}
pass out quick route-to (em0 10.1.1.1) from  to !  no state

We also have a tcpdump capture file. It shows that there are lots of
duplicate packets and retransmissions while TSO is enabled. Our NIC is
an Intel PRO/1000:

em0:  port 0x2000-0x201f
mem 0xdf20-0xdf21 irq 18 at device 0.0 on pci4
em0: Using MSI interrupt
em0: [FILTER]

Screenshot: http://cf.files.jnlin.org/with-tso.png

Any suggestions? For now I have just turned off TSO, but I think that is
only a workaround.



Sincerely,

                Jui-Nan


m_copymdata() bug?

2010-04-13 Thread Jacques Fourie
It seems as if the m_copymdata() function defined in uipc_mbuf.c has a
bug. It uses m_apply to copy data from the source mbuf to the target
but in the callback function m_bcopyxxx() the arguments are
interpreted in the wrong order. Swapping the 's' and 't' arguments in
the declaration of m_bcopyxxx() fixes the problem for me.


kernel - userspace communication, ng_socket

2010-04-13 Thread serena zanetta
I have a question involving the userspace and kernel communication.

I need a point where I can exchange data between netgraph nodes and a
program in userland.

I have a netgraph node which receives network packets and sends them to the
program to be processed. The modified packets are then sent back to the same
node.

I thought of doing it with the ng_socket node type. I’ve written a program that
creates two sockets (data and control) as described in “All about netgraph”.
Then I’ve tried to assign the node a name with bind(), as follows:

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

#include 

#define NGSA_OVERHEAD   (offsetof(struct sockaddr_ng, sg_data))

int
main()
{
  int s_control = -1, s_data = -1;
  struct sockaddr_ng *sg;

  s_control = socket(PF_NETGRAPH,SOCK_DGRAM,NG_CONTROL);

  strcpy(sg->sg_data,"SOCKET");
  sg->sg_family = AF_NETGRAPH;
  sg->sg_len = strlen(sg->sg_data) + 1 + NGSA_OVERHEAD;
  bind(s_control,(struct sockaddr *)sg,sg->sg_len)<0);
   …

  return 0;
}


If I look at ngctl to verify the ng_socket creation, no trace of it can be
found.

So I don't know how to proceed .. any suggestion?

Thank you,

Serena


Re: pf stalls connection when using route-to

2010-04-13 Thread Lin Jui-Nan Eric
On Tue, Apr 13, 2010 at 11:19 PM, Jeremy Chadwick
 wrote:
>
> What FreeBSD version?  uname -a output please.
>
I have tried 7.2-R and 8.0-R. Both versions stall.

8.0-RELEASE:
# uname -a
FreeBSD bsd8 8.0-RELEASE-p2 FreeBSD 8.0-RELEASE-p2 #3: Wed Mar  3
17:15:52 CST 2010 r...@bsd8:/usr/obj/usr/src/sys/KERNEL  amd64

We only added "carp" in kernel config for HA.

# cat /etc/sysctl.conf
# $FreeBSD: src/etc/sysctl.conf,v 1.8.34.1.2.1 2009/10/25 01:10:29
kensmith Exp $
#
#  This file is read when going to multi-user and its contents piped thru
#  ``sysctl'' to adjust kernel values.  ``man 5 sysctl.conf'' for details.
#

# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
#security.bsd.see_other_uids=0
debug.bootverbose=1
kern.ipc.maxsockbuf=2097152
kern.ipc.somaxconn=8192
kern.maxfiles=65536
kern.maxfilesperproc=32768
kern.maxprocperuid=65536
net.inet.tcp.delayed_ack=0
debug.bootverbose=1
kern.ipc.maxsockbuf=2097152
kern.ipc.somaxconn=8192
kern.maxfiles=65536
kern.maxfilesperproc=32768
kern.maxprocperuid=65536
net.inet.tcp.delayed_ack=0
net.inet.carp.preempt=1
net.inet.carp.arpbalance=1
kern.randompid=9
net.inet.flowtable.enable=0

# cat /boot/loader.conf
#
coretemp_load="YES"
geom_mirror_load="YES"
geom_stripe_load="YES"
if_em_load="YES"
kbdmux_load="YES"
random_load="YES"
ukdb_load="YES"
zfs_load="YES"
#
kern.ipc.nmbclusters="0"
kern.maxproc="65536"
net.inet.tcp.reass.maxsegments="1600"


7.2-RELEASE:
# uname -a
FreeBSD bsd7 7.2-RELEASE-p7 FreeBSD 7.2-RELEASE-p7 #0: Fri Feb 26
22:28:05 UTC 2010
r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
# cat /etc/sysctl.conf
debug.bootverbose=1
kern.ipc.maxsockbuf=2097152
kern.ipc.somaxconn=32768
kern.maxfiles=65536
kern.maxfilesperproc=32768
kern.maxprocperuid=65536
kern.randompid=9
net.inet.icmp.icmplim=65536
net.inet.ip.fastforwarding=1
net.inet.ip.portrange.first=4096
net.inet.tcp.delayed_ack=0
net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.maxtcptw=65535
net.inet.tcp.msl=1500
net.inet.tcp.nolocaltimewait=1
vfs.lookup_shared=1
vfs.nfs.prime_access_cache=0
vm.pmap.shpgperproc=2000
# cat /boot/loader.conf
#
coretemp_load="YES"
geom_mirror_load="YES"
geom_stripe_load="YES"
kbdmux_load="YES"
random_load="YES"
ukdb_load="YES"
zfs_load="YES"
#
kern.ipc.nmbclusters="0"
kern.maxproc="65536"
vfs.zfs.prefetch_disable="1"
vm.kmem_size="1G"
vm.kmem_size_max="1G"
net.inet.tcp.reass.maxsegments="1600"


[PATCH FOR REVIEW] Fix SIOCGIFDESCR when buffer is too small

2010-04-13 Thread Xin LI
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

Here is a patch that addresses the issue that occurs when SIOCGIFDESCR
is given a buffer that is too small.  As reported by Bernhard, this
would cause an infinite loop in ifconfig(8).

The previous implementation claims that the 'length' field would be set
to the length of the description, and an error would be returned.
However, our ioctl(2) system call will not do copyout if an error is
returned, as discussed on -arch@, and thus the API needs to be tweaked.

To minimize the impact on the ABI I have chosen to use the buffer field
as an indicator that the buffer length from userland is not sufficient
(the kernel sets it to NULL), instead of returning ENAMETOOLONG.

I'll also submit a patch for libpcap if this proposed change is
considered to be a good one.  The libpcap in contrib/libpcap is not
affected since it doesn't support dynamic length description.

Cheers,
- -- 
Xin LI http://www.delphij.net/
FreeBSD - The Power to Serve!  Live free or die
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.14 (FreeBSD)

iQEcBAEBAgAGBQJLxMXWAAoJEATO+BI/yjfBWc4H/jO7i2Rm+GqeYXX2eNWUjE2W
5dpNFq0kxqQWpLTr8qPskQ7o/ZDIl8ASbNJPdr/G+U1mYGVwNWVa6z0TR3huZZCB
gPnR+84a+C/8rwtJjhOuyFKt/fdZfD4kI+rnWB+9Cq/uLX4aqziY1YO7SIAtb/1b
RrjyM6rgYsMcnrqJKrmAQQEU1k6Yqkcy5PEEzU6MTSsHYL4wuKujZzmIYdZRg4rI
OLSdLQEWq+u4PuOnrRMrvrrZZCObOURCWpjnJiP1yyMBE/ZW6itfMp6BE6k29vUz
vZcDtqUFj3j1tVvaA4MzuX+isMUqnO8DvcnIawjwefs9Rq0mWY796kGSEjZYxuQ=
=lyPJ
-END PGP SIGNATURE-
Index: sbin/ifconfig/ifconfig.c
===
--- sbin/ifconfig/ifconfig.c(revision 206558)
+++ sbin/ifconfig/ifconfig.c(working copy)
@@ -922,19 +922,21 @@
ifr.ifr_buffer.buffer = descr;
ifr.ifr_buffer.length = descrlen;
if (ioctl(s, SIOCGIFDESCR, &ifr) == 0) {
-   if (strlen(descr) > 0)
-   printf("\tdescription: %s\n", descr);
-   break;
-   } else if (errno == ENAMETOOLONG)
-   descrlen = ifr.ifr_buffer.length;
-   else
-   break;
-   } else {
+   if (ifr.ifr_buffer.buffer == descr) {
+   if (strlen(descr) > 0)
+   printf("\tdescription: %s\n",
+   descr);
+   break;
+   } else if (ifr.ifr_buffer.length > descrlen) {
+   descrlen = ifr.ifr_buffer.length;
+   continue;
+   }
+   }
+   } else
warn("unable to allocate memory for interface"
"description");
-   break;
-   }
-   };
+   break;
+   }
 
if (ioctl(s, SIOCGIFCAP, (caddr_t)&ifr) == 0) {
if (ifr.ifr_curcap != 0) {
Index: share/man/man4/netintro.4
===
--- share/man/man4/netintro.4   (revision 206558)
+++ share/man/man4/netintro.4   (working copy)
@@ -292,8 +292,11 @@
 struct passed in as parameter, and the length would include
 the terminating nul character.
 If there is not enough space to hold the interface length,
-no copy would be done and an
-error would be returned.
+no copy would be done and the
+.Va buffer
+field of
+.Va ifru_buffer
+would be set to NULL.
 The kernel will store the buffer length in the
 .Va length
 field upon return, regardless whether the buffer itself is
Index: sys/net/if.c
===
--- sys/net/if.c(revision 206558)
+++ sys/net/if.c(working copy)
@@ -2049,14 +2049,13 @@
case SIOCGIFDESCR:
error = 0;
sx_slock(&ifdescr_sx);
-   if (ifp->if_description == NULL) {
-   ifr->ifr_buffer.length = 0;
+   if (ifp->if_description == NULL)
error = ENOMSG;
-   } else {
+   else {
/* space for terminating nul */
descrlen = strlen(ifp->if_description) + 1;
if (ifr->ifr_buffer.length < descrlen)
-   error = ENAMETOOLONG;
+   ifr->ifr_buffer.buffer = NULL;
else
error = copyout(ifp->if_description,
ifr->ifr_buffer.buffer, descrlen);

Re: reassembled packets and pfil

2010-04-13 Thread Matthew Luckie
> >I think that a patch like the one you propose is very useful (for
> >ipv4 as well) but it requires a sysctl or other mechanism to make
> >sure that when it is enabled we don't pass fragments through the
> >firewall.
> 
> I've looked further into this and I now wonder if it is a byproduct of my
> use of ipfw.  The problem seems to be that offset will always be
> non-zero with any packet with a v6 fragment header, so a rule requiring
> offset to be zero is never run.  I'll spend a bit more time on this
> tomorrow, and come back with a patch for ipfw.

Here's a patch to ipfw.  We keep a copy of the MF bit for IPv6
fragments so it can be passed to ipfw_log.  Otherwise, the offset
field no longer has the MF bit embedded in it as before.

Note that apart from the various transport-layer checks that require
offset to be zero, the O_FRAG opcode now has a different behaviour.
Only subsequent fragments will match this rule.  If you want the exact
same behaviour as before, then

   case O_FRAG:
 match = (offset != 0);
 break;

should become

   case O_FRAG:
 match = (offset != 0 || ext_hd & EXT_FRAGMENT);
 break;

If you are generally happy with this patch, let me know and I'll file
a PR so it doesn't get lost.


--- ip_fw2.c.orig   2008-11-25 15:59:29.0 +1300
+++ ip_fw2.c2010-04-14 10:05:46.0 +1200
@@ -758,6 +758,7 @@ ipfw_log(struct ip_fw *f, u_int hlen, st
char *action;
int limit_reached = 0;
char action2[40], proto[128], fragment[32];
+   u_short mf = 0;
 
fragment[0] = '\0';
proto[0] = '\0';
@@ -903,6 +904,8 @@ ipfw_log(struct ip_fw *f, u_int hlen, st
snprintf(dst, sizeof(dst), "[%s]",
ip6_sprintf(ip6buf, &args->f_id.dst_ip6));
 
+   mf = offset & IP6F_MORE_FRAG;
+   offset &= IP6F_OFF_MASK;
ip6 = (struct ip6_hdr *)ip;
tcp = (struct tcphdr *)(((char *)ip) + hlen);
udp = (struct udphdr *)(((char *)ip) + hlen);
@@ -972,13 +975,13 @@ ipfw_log(struct ip_fw *f, u_int hlen, st
 
 #ifdef INET6
if (IS_IP6_FLOW_ID(&(args->f_id))) {
-   if (offset & (IP6F_OFF_MASK | IP6F_MORE_FRAG))
+   if (offset || mf)
snprintf(SNPARGS(fragment, 0),
" (frag %08x:%d@%d%s)",
args->f_id.frag_id6,
ntohs(ip6->ip6_plen) - hlen,
-   ntohs(offset & IP6F_OFF_MASK) << 3,
-   (offset & IP6F_MORE_FRAG) ? "+" : "");
+   ntohs(offset) << 3,
+   mf ? "+" : "");
} else
 #endif
{
@@ -2151,16 +2154,13 @@ ipfw_chk(struct ip_fw_args *args)
 
/*
 * offset   The offset of a fragment. offset != 0 means that
-*  we have a fragment at this offset of an IPv4 packet.
-*  offset == 0 means that (if this is an IPv4 packet)
-*  this is the first or only fragment.
-*  For IPv6 offset == 0 means there is no Fragment Header. 
-*  If offset != 0 for IPv6 always use correct mask to
-*  get the correct offset because we add IP6F_MORE_FRAG
-*  to be able to dectect the first fragment which would
-*  otherwise have offset = 0.
+*  we have a fragment at this offset.
+*  offset == 0 means that this is the first or only fragment.
+*
+* mf   The MF bit masked out of IPv6 packets.
 */
u_short offset = 0;
+   u_short mf = 0;
 
/*
 * Local copies of addresses. They are only valid if we have
@@ -2311,17 +2311,8 @@ do {\
proto = ((struct ip6_frag *)ulp)->ip6f_nxt;
offset = ((struct ip6_frag *)ulp)->ip6f_offlg &
IP6F_OFF_MASK;
-   /* Add IP6F_MORE_FRAG for offset of first
-* fragment to be != 0. */
-   offset |= ((struct ip6_frag *)ulp)->ip6f_offlg &
+   mf = ((struct ip6_frag *)ulp)->ip6f_offlg &
IP6F_MORE_FRAG;
-   if (offset == 0) {
-   printf("IPFW2: IPV6 - Invalid Fragment "
-   "Header\n");
-   if (fw_deny_unknown_exthdrs)
-   return (IP_FW_DENY);
-   break;
-   }
args->f_id.frag_id6 =