OK, so, after getting some feedback, here is a new patch: the inappropriate change in kern/queue.h has been removed, and a useful comment has been added. I'll also describe the changes to the packet filter interface more clearly.
The main changes to the interface are visible in include/device/net_status.h.

First, keep in mind that we support two types of packet filters: NETF (also called CSPF) and BPF. The interface must provide a way to specify the type of filter sent to the kernel.

Next, with this patch, egress packets are also sent to packet filters. The interface must therefore provide a way to specify which packets (egress, ingress, or both) a filter will be applied to.

Another requirement is that, once packets are delivered to userspace listeners, those listeners must be able to tell whether the captured packet is an egress or an ingress packet (it can't be both here).

To achieve all this, we decided to use the first filter_t object of any filter as a header. Regular filter_t objects are split into two parts, one for the operator (6 bits) and one for the argument (10 bits). We matched this separation in the header, which contains the filter type in its 6 most significant bits and flags in the 10 remaining bits. Four macros were added to net_status.h:

 o NETF_TYPE_MASK: mask to apply to the header to get the filter type
   (its complement can be used to get only the flags)
 o NETF_BPF: type value for BPF filters
 o NETF_IN: flag to apply the filter to ingress packets
 o NETF_OUT: same for egress packets

NETF filters are considered native filters, so there is no macro for them (the only requirement for non-native filter types is that they be non-zero).

Macros were also added to include/device/bpf.h:

#define BPF_BEGIN	NETF_BPF
#define BPF_IN		NETF_IN
#define BPF_OUT		NETF_OUT

Here is the NETF filter that would be used in pfinet:

static short ether_filter[] =
  {
    NETF_IN,			/* header */
    NETF_PUSHLIT | NETF_NOP,
    1
  };

And here is the BPF filter used in the BPF translator currently under development:

static const struct bpf_insn bpf_header =
  {
    BPF_BEGIN | BPF_IN,		/* BPF_OUT is OR'ed in if the ``see sent''
				   flag is enabled - see bpf(4) on a BSD
				   system. */
    0,
    0,
    0
  };

FYI, this structure is called bpf_header because a compliant BPF implementation doesn't have such a header. When a listener such as tcpdump sends a filter to the BPF translator, a new filter (bpf_header + the filter received) is forged and sent to GNU Mach (a sketch of this step follows the ChangeLog below).

Finally, a new field of type boolean_t was added to struct net_rcv_msg to tell listeners whether a packet is egress or ingress. The field is named ``sent'' to match the BIOCSEESENT ioctl(). It is also used inside the kernel to select which list of filters (ifp->if_(rcv|snd)_port_list) will be used when applying filters to a packet.

Here are the two patches we've been working on. The second one makes Hurd's pfinet use the new interface just discussed.

18_packet_filters.patch:

2006-04-14  Richard Braun  <[EMAIL PROTECTED]>

	* device/if_hdr.h: Added a port list for egress packets and its
	lock.
	* device/net_io.c: Added and fixed a patch from Manuel Menal
	<[EMAIL PROTECTED]> to improve BPF support.  Filters can be applied
	to ingress packets, egress packets, or both.
	* device/subrs.c: Initialize the new port list and its lock.
	* include/device/bpf.h: Uncommented and added some macros and
	type definitions.
	* include/device/net_status.h: Added macros and changed the
	definition of struct net_rcv_msg.  This changes the interface to
	packet filters.
	* linux/dev/glue/net.c: Mark ingress packets as received and
	inject egress packets into packet filters.
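Before the diff itself, here is a minimal sketch of the forging step mentioned above, i.e. what a BPF translator would do with a program received from a listener: prepend the mandatory header (OR'ing BPF_OUT in when the listener wants to see sent packets) and install the result with device_set_filter(), the same RPC pfinet already uses. This is an illustration only, not part of the patches: the names install_bpf_filter, listener_prog, listener_len and see_sent are made up for the example, and the include paths are assumed to be the GNU Mach device headers as used from the Hurd.

/* Example only -- not part of the patches.  Forge the mandatory header
   in front of a BPF program received from a listener and install the
   result on an ethernet device.  */

#include <device/bpf.h>         /* BPF_BEGIN, BPF_IN, BPF_OUT, struct bpf_insn */
#include <device/net_status.h>  /* filter_t */
#include <device/device.h>      /* device_set_filter() */
#include <mach.h>
#include <stdlib.h>
#include <string.h>

static kern_return_t
install_bpf_filter (mach_port_t ether_port, mach_port_t rcv_port,
                    const struct bpf_insn *listener_prog,
                    unsigned int listener_len, boolean_t see_sent)
{
  struct bpf_insn *prog;
  kern_return_t err;

  prog = malloc ((listener_len + 1) * sizeof (struct bpf_insn));
  if (prog == NULL)
    return KERN_RESOURCE_SHORTAGE;

  /* The header: filter type in the 6 MSB, direction flags in the 10
     LSB.  BPF_OUT is OR'ed in when the listener asked to see sent
     (egress) packets, e.g. through BIOCSEESENT.  */
  prog[0].code = BPF_BEGIN | BPF_IN | (see_sent ? BPF_OUT : 0);
  prog[0].jt = 0;
  prog[0].jf = 0;
  prog[0].k = 0;

  /* Append the listener's program unchanged.  */
  memcpy (prog + 1, listener_prog,
          listener_len * sizeof (struct bpf_insn));

  /* The kernel sees the whole program as an array of filter_t words.  */
  err = device_set_filter (ether_port, rcv_port, MACH_MSG_TYPE_MAKE_SEND,
                           /* priority */ 0, (filter_t *) prog,
                           (listener_len + 1) * sizeof (struct bpf_insn)
                           / sizeof (filter_t));
  free (prog);
  return err;
}

The NETF case in pfinet is even simpler: the ether_filter array shown above (now starting with NETF_IN) is passed as-is to the same device_set_filter() call, with a count expressed in filter_t words.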
diff -Nurp gnumach-20060408.dfsg.1.orig/device/if_hdr.h gnumach-20060408.dfsg.1/device/if_hdr.h --- gnumach-20060408.dfsg.1.orig/device/if_hdr.h 2006-04-14 10:29:33.000000000 +0000 +++ gnumach-20060408.dfsg.1/device/if_hdr.h 2006-04-14 10:31:16.000000000 +0000 @@ -79,8 +79,11 @@ struct ifnet { char *if_address; /* pointer to hardware address */ struct ifqueue if_snd; /* output queue */ queue_head_t if_rcv_port_list; /* input filter list */ + queue_head_t if_snd_port_list; /* output filter list */ decl_simple_lock_data(, - if_rcv_port_list_lock) /* lock for filter list */ + if_rcv_port_list_lock) /* lock for input filter list */ + decl_simple_lock_data(, + if_snd_port_list_lock) /* lock for output filter list */ /* statistics */ int if_ipackets; /* packets received */ int if_ierrors; /* input errors */ diff -Nurp gnumach-20060408.dfsg.1.orig/device/net_io.c gnumach-20060408.dfsg.1/device/net_io.c --- gnumach-20060408.dfsg.1.orig/device/net_io.c 2006-04-14 10:29:32.000000000 +0000 +++ gnumach-20060408.dfsg.1/device/net_io.c 2006-04-14 13:28:04.000000000 +0000 @@ -288,7 +288,8 @@ net_kmsg_more(void) * filter for a single session. */ struct net_rcv_port { - queue_chain_t chain; /* list of open_descriptors */ + queue_chain_t input; /* list of input open_descriptors */ + queue_chain_t output; /* list of output open_descriptors */ ipc_port_t rcv_port; /* port to send packet to */ int rcv_qlimit; /* port's qlimit */ int rcv_count; /* number of packets received */ @@ -348,15 +349,15 @@ decl_simple_lock_data(,net_hash_header_l } while ((elt) != (head)); -#define FILTER_ITERATE(ifp, fp, nextfp) \ - for ((fp) = (net_rcv_port_t) queue_first(&(ifp)->if_rcv_port_list);\ - !queue_end(&(ifp)->if_rcv_port_list, (queue_entry_t)(fp)); \ - (fp) = (nextfp)) { \ - (nextfp) = (net_rcv_port_t) queue_next(&(fp)->chain); +#define FILTER_ITERATE(if_port_list, fp, nextfp, chain) \ + for ((fp) = (net_rcv_port_t) queue_first(if_port_list); \ + !queue_end(if_port_list, (queue_entry_t)(fp)); \ + (fp) = (nextfp)) { \ + (nextfp) = (net_rcv_port_t) queue_next(chain); #define FILTER_ITERATE_END } /* entry_p must be net_rcv_port_t or net_hash_entry_t */ -#define ENQUEUE_DEAD(dead, entry_p) { \ +#define ENQUEUE_DEAD(dead, entry_p, chain) { \ queue_next(&(entry_p)->chain) = (queue_entry_t) (dead); \ (dead) = (queue_entry_t)(entry_p); \ } @@ -711,23 +712,36 @@ net_filter(kmsg, send_list) queue_entry_t dead_entp = (queue_entry_t) 0; unsigned int ret_count; + queue_head_t *if_port_list; + int count = net_kmsg(kmsg)->net_rcv_msg_packet_count; ifp = (struct ifnet *) kmsg->ikm_header.msgh_remote_port; ipc_kmsg_queue_init(send_list); + if (net_kmsg(kmsg)->sent) + if_port_list = &ifp->if_snd_port_list; + else + if_port_list = &ifp->if_rcv_port_list; + /* * Unfortunately we can't allocate or deallocate memory - * while holding this lock. And we can't drop the lock - * while examining the filter list. + * while holding these locks. And we can't drop the locks + * while examining the filter lists. + * Both locks are hold in case a filter is removed from both + * queues. */ simple_lock(&ifp->if_rcv_port_list_lock); - FILTER_ITERATE(ifp, infp, nextfp) - { + simple_lock(&ifp->if_snd_port_list_lock); + FILTER_ITERATE(if_port_list, infp, nextfp, + net_kmsg(kmsg)->sent ? 
&infp->output : &infp->input) + { entp = (net_hash_entry_t) 0; - if (infp->filter[0] == NETF_BPF) { - ret_count = bpf_do_filter(infp, net_kmsg(kmsg)->packet, count, - net_kmsg(kmsg)->header, - &hash_headp, &entp); + if ((infp->filter[0] & NETF_TYPE_MASK) == NETF_BPF) { + ret_count = bpf_do_filter(infp, net_kmsg(kmsg)->packet + + sizeof(struct packet_header), + count, net_kmsg(kmsg)->header, + ifp->if_header_size, &hash_headp, + &entp); if (entp == (net_hash_entry_t) 0) dest = infp->rcv_port; else @@ -754,9 +768,15 @@ net_filter(kmsg, send_list) */ if (entp == (net_hash_entry_t) 0) { - queue_remove(&ifp->if_rcv_port_list, infp, - net_rcv_port_t, chain); - ENQUEUE_DEAD(dead_infp, infp); + if (infp->filter[0] & NETF_IN) + queue_remove(&ifp->if_rcv_port_list, infp, + net_rcv_port_t, input); + if (infp->filter[0] & NETF_OUT) + queue_remove(&ifp->if_snd_port_list, infp, + net_rcv_port_t, output); + + /* Use input only for queues of dead filters. */ + ENQUEUE_DEAD(dead_infp, infp, input); continue; } else { hash_ent_remove (ifp, @@ -808,22 +828,27 @@ net_filter(kmsg, send_list) * See if ordering of filters is wrong */ if (infp->priority >= NET_HI_PRI) { - prevfp = (net_rcv_port_t) queue_prev(&infp->chain); - /* - * If infp is not the first element on the queue, - * and the previous element is at equal priority - * but has a lower count, then promote infp to - * be in front of prevfp. - */ - if ((queue_t)prevfp != &ifp->if_rcv_port_list && - infp->priority == prevfp->priority) { - /* - * Threshold difference to prevent thrashing - */ - if (net_filter_queue_reorder - && (100 + prevfp->rcv_count < rcount)) - reorder_queue(&prevfp->chain, &infp->chain); +#define REORDER_PRIO(chain) \ + prevfp = (net_rcv_port_t) queue_prev(&infp->chain); \ + /* \ + * If infp is not the first element on the queue, \ + * and the previous element is at equal priority \ + * but has a lower count, then promote infp to \ + * be in front of prevfp. \ + */ \ + if ((queue_t)prevfp != if_port_list && \ + infp->priority == prevfp->priority) { \ + /* \ + * Threshold difference to prevent thrashing \ + */ \ + if (net_filter_queue_reorder \ + && (100 + prevfp->rcv_count < rcount)) \ + reorder_queue(&prevfp->chain, &infp->chain);\ } + + REORDER_PRIO(input); + REORDER_PRIO(output); + /* * High-priority filter -> no more deliveries */ @@ -833,7 +858,7 @@ net_filter(kmsg, send_list) } } FILTER_ITERATE_END - + simple_unlock(&ifp->if_snd_port_list_lock); simple_unlock(&ifp->if_rcv_port_list_lock); /* @@ -872,7 +897,7 @@ net_do_filter(infp, data, data_count, he #define header_word ((unsigned short *)header) sp = &stack[NET_FILTER_STACK_DEPTH]; - fp = &infp->filter[0]; + fp = &infp->filter[1]; /* filter[0] used for flags */ fpe = infp->filter_end; *sp = TRUE; @@ -999,6 +1024,10 @@ parse_net_filter(filter, count) register filter_t *fpe = &filter[count]; register filter_t op, arg; + /* + * count is at least 1, and filter[0] is used for flags. + */ + filter++; sp = NET_FILTER_STACK_DEPTH; for (; filter < fpe; filter++) { @@ -1099,6 +1128,7 @@ net_set_filter(ifp, rcv_port, priority, int i; int ret, is_new_infp; io_return_t rval; + boolean_t in, out; /* * Check the filter syntax. 
@@ -1107,13 +1137,19 @@ net_set_filter(ifp, rcv_port, priority, filter_bytes = CSPF_BYTES(filter_count); match = (bpf_insn_t) 0; - if (filter_count > 0 && filter[0] == NETF_BPF) { + if (filter_count == 0) { + return (D_INVALID_OPERATION); + } else if (!((filter[0] & NETF_IN) || (filter[0] & NETF_OUT))) { + return (D_INVALID_OPERATION); /* NETF_IN or NETF_OUT required */ + } else if ((filter[0] & NETF_TYPE_MASK) == NETF_BPF) { ret = bpf_validate((bpf_insn_t)filter, filter_bytes, &match); if (!ret) return (D_INVALID_OPERATION); - } else { + } else if ((filter[0] & NETF_TYPE_MASK) == 0) { if (!parse_net_filter(filter, filter_count)) return (D_INVALID_OPERATION); + } else { + return (D_INVALID_OPERATION); } rval = D_SUCCESS; /* default return value */ @@ -1129,8 +1165,8 @@ net_set_filter(ifp, rcv_port, priority, is_new_infp = TRUE; } else { /* - * If there is a match instruction, we assume there will - * multiple session with a common substructure and allocate + * If there is a match instruction, we assume there will be + * multiple sessions with a common substructure and allocate * a hash table to deal with them. */ my_infp = 0; @@ -1143,70 +1179,87 @@ net_set_filter(ifp, rcv_port, priority, * Look for filters with dead ports (for GC). * Look for a filter with the same code except KEY insns. */ - - simple_lock(&ifp->if_rcv_port_list_lock); - - FILTER_ITERATE(ifp, infp, nextfp) + void check_filter_list(queue_head_t *if_port_list) { + FILTER_ITERATE(if_port_list, infp, nextfp, + (if_port_list == &ifp->if_rcv_port_list) + ? &infp->input : &infp->output) + { if (infp->rcv_port == MACH_PORT_NULL) { - if (match != 0 - && infp->priority == priority - && my_infp == 0 - && (infp->filter_end - infp->filter) == filter_count - && bpf_eq((bpf_insn_t)infp->filter, - filter, filter_bytes)) - { - my_infp = infp; - } - - for (i = 0; i < NET_HASH_SIZE; i++) { - head = &((net_hash_header_t) infp)->table[i]; - if (*head == 0) - continue; - - /* - * Check each hash entry to make sure the - * destination port is still valid. Remove - * any invalid entries. - */ - entp = *head; - do { - nextentp = (net_hash_entry_t) entp->he_next; + if (match != 0 + && infp->priority == priority + && my_infp == 0 + && (infp->filter_end - infp->filter) == filter_count + && bpf_eq((bpf_insn_t)infp->filter, + filter, filter_bytes)) + my_infp = infp; + + for (i = 0; i < NET_HASH_SIZE; i++) { + head = &((net_hash_header_t) infp)->table[i]; + if (*head == 0) + continue; + + /* + * Check each hash entry to make sure the + * destination port is still valid. Remove + * any invalid entries. + */ + entp = *head; + do { + nextentp = (net_hash_entry_t) entp->he_next; - /* checked without - ip_lock(entp->rcv_port) */ - if (entp->rcv_port == rcv_port - || !IP_VALID(entp->rcv_port) - || !ip_active(entp->rcv_port)) { - - ret = hash_ent_remove (ifp, - (net_hash_header_t)infp, - (my_infp == infp), - head, - entp, - &dead_entp); - if (ret) - goto hash_loop_end; - } + /* checked without + ip_lock(entp->rcv_port) */ + if (entp->rcv_port == rcv_port + || !IP_VALID(entp->rcv_port) + || !ip_active(entp->rcv_port)) { + ret = hash_ent_remove (ifp, + (net_hash_header_t)infp, + (my_infp == infp), + head, + entp, + &dead_entp); + if (ret) + goto hash_loop_end; + } - entp = nextentp; - /* While test checks head since hash_ent_remove - might modify it. - */ - } while (*head != 0 && entp != *head); - } + entp = nextentp; + /* While test checks head since hash_ent_remove + might modify it. 
+ */ + } while (*head != 0 && entp != *head); + } + hash_loop_end: ; - } else if (infp->rcv_port == rcv_port || !IP_VALID(infp->rcv_port) || !ip_active(infp->rcv_port)) { - /* Remove the old filter from list */ - remqueue(&ifp->if_rcv_port_list, (queue_entry_t)infp); - ENQUEUE_DEAD(dead_infp, infp); + + /* Remove the old filter from lists */ + if (infp->filter[0] & NETF_IN) + queue_remove(&ifp->if_rcv_port_list, infp, + net_rcv_port_t, input); + if (infp->filter[0] & NETF_OUT) + queue_remove(&ifp->if_snd_port_list, infp, + net_rcv_port_t, output); + + ENQUEUE_DEAD(dead_infp, infp, input); } + } + FILTER_ITERATE_END } - FILTER_ITERATE_END + + in = (filter[0] & NETF_IN) != 0; + out = (filter[0] & NETF_OUT) != 0; + + simple_lock(&ifp->if_rcv_port_list_lock); + simple_lock(&ifp->if_snd_port_list_lock); + + if (in) + check_filter_list(&ifp->if_rcv_port_list); + if (out) + check_filter_list(&ifp->if_snd_port_list); if (my_infp == 0) { /* Allocate a dummy infp */ @@ -1217,6 +1270,7 @@ net_set_filter(ifp, rcv_port, priority, } if (i == N_NET_HASH) { simple_unlock(&net_hash_header_lock); + simple_unlock(&ifp->if_snd_port_list_lock); simple_unlock(&ifp->if_rcv_port_list_lock); ipc_port_release_send(rcv_port); @@ -1257,10 +1311,21 @@ net_set_filter(ifp, rcv_port, priority, } /* Insert my_infp according to priority */ - queue_iterate(&ifp->if_rcv_port_list, infp, net_rcv_port_t, chain) - if (priority > infp->priority) - break; - enqueue_tail((queue_t)&infp->chain, (queue_entry_t)my_infp); + if (in) { + queue_iterate(&ifp->if_rcv_port_list, infp, net_rcv_port_t, input) + if (priority > infp->priority) + break; + + queue_enter(&ifp->if_rcv_port_list, my_infp, net_rcv_port_t, input); + } + + if (out) { + queue_iterate(&ifp->if_snd_port_list, infp, net_rcv_port_t, output) + if (priority > infp->priority) + break; + + queue_enter(&ifp->if_snd_port_list, my_infp, net_rcv_port_t, output); + } } if (match != 0) @@ -1284,9 +1349,9 @@ net_set_filter(ifp, rcv_port, priority, ((net_hash_header_t)my_infp)->ref_count++; hash_entp->rcv_qlimit = net_add_q_info(rcv_port); - } + simple_unlock(&ifp->if_snd_port_list_lock); simple_unlock(&ifp->if_rcv_port_list_lock); clean_and_return: @@ -1537,11 +1602,12 @@ net_io_init() */ int -bpf_do_filter(infp, p, wirelen, header, hash_headpp, entpp) +bpf_do_filter(infp, p, wirelen, header, hlen, hash_headpp, entpp) net_rcv_port_t infp; char * p; /* packet data */ unsigned int wirelen; /* data_count (in bytes) */ char * header; + unsigned int hlen; /* header len (in bytes) */ net_hash_entry_t **hash_headpp, *entpp; /* out */ { register bpf_insn_t pc, pc_end; @@ -1551,8 +1617,11 @@ bpf_do_filter(infp, p, wirelen, header, register int k; long mem[BPF_MEMWORDS]; + /* Generic pointer to either HEADER or P according to the specified offset. 
*/ + char *data = NULL; + pc = ((bpf_insn_t) infp->filter) + 1; - /* filter[0].code is BPF_BEGIN */ + /* filter[0].code is (NETF_BPF | flags) */ pc_end = (bpf_insn_t)infp->filter_end; buflen = NET_RCV_MAX; *entpp = 0; /* default */ @@ -1596,58 +1665,53 @@ bpf_do_filter(infp, p, wirelen, header, case BPF_LD|BPF_W|BPF_ABS: k = pc->k; - if ((u_int)k + sizeof(long) <= buflen) { -#ifdef BPF_ALIGN - if (((int)(p + k) & 3) != 0) - A = EXTRACT_LONG(&p[k]); - else -#endif - A = ntohl(*(long *)(p + k)); - continue; - } - k -= BPF_DLBASE; - if ((u_int)k + sizeof(long) <= NET_HDW_HDR_MAX) { + load_word: + if ((u_int)k + sizeof(long) <= hlen) + data = header; + else if ((u_int)k + sizeof(long) <= buflen) { + k -= hlen; + data = p; + } else + return 0; + #ifdef BPF_ALIGN - if (((int)(header + k) & 3) != 0) - A = EXTRACT_LONG(&header[k]); - else + if (((int)(data + k) & 3) != 0) + A = EXTRACT_LONG(&data[k]); + else #endif - A = ntohl(*(long *)(header + k)); - continue; - } else { - return 0; - } + A = ntohl(*(long *)(data + k)); + continue; case BPF_LD|BPF_H|BPF_ABS: k = pc->k; - if ((u_int)k + sizeof(short) <= buflen) { - A = EXTRACT_SHORT(&p[k]); - continue; - } - k -= BPF_DLBASE; - if ((u_int)k + sizeof(short) <= NET_HDW_HDR_MAX) { - A = EXTRACT_SHORT(&header[k]); - continue; - } else { - return 0; - } + load_half: + if ((u_int)k + sizeof(short) <= hlen) + data = header; + else if ((u_int)k + sizeof(short) <= buflen) { + k -= hlen; + data = p; + } else + return 0; + + A = EXTRACT_SHORT(&data[k]); + continue; case BPF_LD|BPF_B|BPF_ABS: - k = pc->k; - if ((u_int)k < buflen) { - A = p[k]; - continue; - } - - k -= BPF_DLBASE; - if ((u_int)k < NET_HDW_HDR_MAX) { - A = header[k]; - continue; - } else { - return 0; - } + k = pc->k; + + load_byte: + if ((u_int)k < hlen) + data = header; + else if ((u_int)k < buflen) { + data = p; + k -= hlen; + } else + return 0; + + A = data[k]; + continue; case BPF_LD|BPF_W|BPF_LEN: A = wirelen; @@ -1659,35 +1723,27 @@ bpf_do_filter(infp, p, wirelen, header, case BPF_LD|BPF_W|BPF_IND: k = X + pc->k; - if (k + sizeof(long) > buflen) - return 0; -#ifdef BPF_ALIGN - if (((int)(p + k) & 3) != 0) - A = EXTRACT_LONG(&p[k]); - else -#endif - A = ntohl(*(long *)(p + k)); - continue; - + goto load_word; + case BPF_LD|BPF_H|BPF_IND: k = X + pc->k; - if (k + sizeof(short) > buflen) - return 0; - A = EXTRACT_SHORT(&p[k]); - continue; + goto load_half; case BPF_LD|BPF_B|BPF_IND: k = X + pc->k; - if (k >= buflen) - return 0; - A = p[k]; - continue; + goto load_byte; case BPF_LDX|BPF_MSH|BPF_B: k = pc->k; - if (k >= buflen) - return 0; - X = (p[pc->k] & 0xf) << 2; + if (k < hlen) + data = header; + else if (k < buflen) { + data = p; + k -= hlen; + } else + return 0; + + X = (data[k] & 0xf) << 2; continue; case BPF_LD|BPF_IMM: @@ -1855,7 +1911,11 @@ bpf_validate(f, bytes, match) register bpf_insn_t p; len = BPF_BYTES2LEN(bytes); - /* f[0].code is already checked to be BPF_BEGIN. So skip f[0]. */ + + /* + * f[0].code is already checked to be (NETF_BPF | flags). + * So skip f[0]. + */ for (i = 1; i < len; ++i) { /* @@ -1984,7 +2044,7 @@ bpf_match (hash, n_keys, keys, hash_head /* * Removes a hash entry (ENTP) from its queue (HEAD). * If the reference count of filter (HP) becomes zero and not USED, - * HP is removed from ifp->if_rcv_port_list and is freed. + * HP is removed from the corresponding port lists and is freed. 
*/ int @@ -1998,13 +2058,18 @@ hash_ent_remove (ifp, hp, used, head, en hp->ref_count--; if (*head == entp) { - if (queue_empty((queue_t) entp)) { *head = 0; - ENQUEUE_DEAD(*dead_p, entp); + ENQUEUE_DEAD(*dead_p, entp, chain); if (hp->ref_count == 0 && !used) { - remqueue((queue_t) &ifp->if_rcv_port_list, - (queue_entry_t)hp); + if (((net_rcv_port_t)hp)->filter[0] & NETF_IN) + queue_remove(&ifp->if_rcv_port_list, + (net_rcv_port_t)hp, + net_rcv_port_t, input); + if (((net_rcv_port_t)hp)->filter[0] & NETF_OUT) + queue_remove(&ifp->if_snd_port_list, + (net_rcv_port_t)hp, + net_rcv_port_t, output); hp->n_keys = 0; return TRUE; } @@ -2015,7 +2080,7 @@ hash_ent_remove (ifp, hp, used, head, en } remqueue((queue_t)*head, (queue_entry_t)entp); - ENQUEUE_DEAD(*dead_p, entp); + ENQUEUE_DEAD(*dead_p, entp, chain); return FALSE; } @@ -2069,7 +2134,7 @@ net_free_dead_infp (dead_infp) for (infp = (net_rcv_port_t) dead_infp; infp != 0; infp = nextfp) { - nextfp = (net_rcv_port_t) queue_next(&infp->chain); + nextfp = (net_rcv_port_t) queue_next(&infp->input); ipc_port_release_send(infp->rcv_port); net_del_q_info(infp->rcv_qlimit); zfree(net_rcv_zone, (vm_offset_t) infp); diff -Nurp gnumach-20060408.dfsg.1.orig/device/subrs.c gnumach-20060408.dfsg.1/device/subrs.c --- gnumach-20060408.dfsg.1.orig/device/subrs.c 2006-04-14 10:29:32.000000000 +0000 +++ gnumach-20060408.dfsg.1/device/subrs.c 2006-04-14 10:31:16.000000000 +0000 @@ -82,7 +82,9 @@ void if_init_queues(ifp) { IFQ_INIT(&ifp->if_snd); queue_init(&ifp->if_rcv_port_list); + queue_init(&ifp->if_snd_port_list); simple_lock_init(&ifp->if_rcv_port_list_lock); + simple_lock_init(&ifp->if_snd_port_list_lock); } diff -Nurp gnumach-20060408.dfsg.1.orig/include/device/bpf.h gnumach-20060408.dfsg.1/include/device/bpf.h --- gnumach-20060408.dfsg.1.orig/include/device/bpf.h 2006-04-14 10:30:37.000000000 +0000 +++ gnumach-20060408.dfsg.1/include/device/bpf.h 2006-04-14 10:31:16.000000000 +0000 @@ -72,7 +72,8 @@ #ifndef _DEVICE_BPF_H_ #define _DEVICE_BPF_H_ -#if 0 /* not used in MK now */ +#include <sys/types.h> /* u_short */ + /* * Alignment macros. BPF_WORDALIGN rounds up to the next * even multiple of BPF_ALIGNMENT. @@ -115,14 +116,14 @@ struct bpf_version { #define DLT_PPP 9 /* Point-to-point Protocol */ #define DLT_FDDI 10 /* FDDI */ -#endif /* 0 */ - /* * The instruction encondings. */ -/* Magic number for the first instruction */ -#define BPF_BEGIN NETF_BPF +/* Magic number and flags for the first instruction */ +#define BPF_BEGIN NETF_BPF +#define BPF_IN NETF_IN +#define BPF_OUT NETF_OUT /* instruction classes */ #define BPF_CLASS(code) ((code) & 0x07) diff -Nurp gnumach-20060408.dfsg.1.orig/include/device/net_status.h gnumach-20060408.dfsg.1/include/device/net_status.h --- gnumach-20060408.dfsg.1.orig/include/device/net_status.h 2006-04-14 10:30:37.000000000 +0000 +++ gnumach-20060408.dfsg.1/include/device/net_status.h 2006-04-14 10:31:16.000000000 +0000 @@ -98,6 +98,11 @@ struct net_status { * If the final value of the filter operation is true, then the packet is * accepted for the filter. * + * The first filter_t object is a header which allows to set flags for the + * filter code. Main flags concern the direction of packets. This header is + * split in the same way NETF words are : the 6 MSB bits indicate the type + * of filter while the 10 LSB bits are the flags. For native NETF filters, + * clear the 6 MSB bits (which is why there is no dedicated macro). 
*/ typedef unsigned short filter_t; @@ -112,6 +117,14 @@ typedef filter_t *filter_array_t; #define NETF_ARG(word) ((word) & 0x3ff) #define NETF_OP(word) (((word)>>NETF_NBPA)&0x3f) +/* filter types */ +#define NETF_TYPE_MASK (((1 << NETF_NBPO) - 1) << NETF_NBPA) +#define NETF_BPF (1 << NETF_NBPA) + +/* flags */ +#define NETF_IN 0x1 +#define NETF_OUT 0x2 + /* binary operators */ #define NETF_NOP (0<<NETF_NBPA) #define NETF_EQ (1<<NETF_NBPA) @@ -131,7 +144,6 @@ typedef filter_t *filter_array_t; #define NETF_RSH (15<<NETF_NBPA) #define NETF_ADD (16<<NETF_NBPA) #define NETF_SUB (17<<NETF_NBPA) -#define NETF_BPF (((1 << NETF_NBPO) - 1) << NETF_NBPA) /* stack arguments */ @@ -178,6 +190,7 @@ struct net_rcv_msg { char header[NET_HDW_HDR_MAX]; mach_msg_type_t packet_type; char packet[NET_RCV_MAX]; + boolean_t sent; }; typedef struct net_rcv_msg *net_rcv_msg_t; #define net_rcv_msg_packet_count packet_type.msgt_number diff -Nurp gnumach-20060408.dfsg.1.orig/linux/dev/glue/net.c gnumach-20060408.dfsg.1/linux/dev/glue/net.c --- gnumach-20060408.dfsg.1.orig/linux/dev/glue/net.c 2006-04-14 10:29:42.000000000 +0000 +++ gnumach-20060408.dfsg.1/linux/dev/glue/net.c 2006-04-14 10:50:43.000000000 +0000 @@ -290,6 +290,9 @@ netif_rx (struct sk_buff *skb) eh = (struct ether_header *) (net_kmsg (kmsg)->header); ph = (struct packet_header *) (net_kmsg (kmsg)->packet); memcpy (eh, skb->data, sizeof (struct ether_header)); + + /* packet is prefixed with a struct packet_header, + see include/device/net_status.h. */ memcpy (ph + 1, skb->data + sizeof (struct ether_header), skb->len - sizeof (struct ether_header)); ph->type = eh->ether_type; @@ -298,6 +301,8 @@ netif_rx (struct sk_buff *skb) dev_kfree_skb (skb, FREE_READ); + net_kmsg(kmsg)->sent = FALSE; /* Mark packet as received. */ + /* Pass packet up to the microkernel. */ net_packet (&dev->net_data->ifnet, kmsg, ph->length, ethernet_priority (kmsg)); @@ -484,6 +489,34 @@ device_write (void *d, ipc_port_t reply_ } splx (s); + /* Send packet to filters. */ + { + struct packet_header *packet; + struct ether_header *header; + ipc_kmsg_t kmsg; + + kmsg = net_kmsg_get (); + + if (kmsg != IKM_NULL) + { + /* Suitable for Ethernet only. */ + header = (struct ether_header *) (net_kmsg (kmsg)->header); + packet = (struct packet_header *) (net_kmsg (kmsg)->packet); + memcpy (header, skb->data, sizeof (struct ether_header)); + + /* packet is prefixed with a struct packet_header, + see include/device/net_status.h. */ + memcpy (packet + 1, skb->data + sizeof (struct ether_header), + skb->len - sizeof (struct ether_header)); + packet->length = skb->len - sizeof (struct ether_header) + + sizeof (struct packet_header); + packet->type = header->ether_type; + net_kmsg (kmsg)->sent = TRUE; /* Mark packet as sent. */ + net_packet (&dev->net_data->ifnet, kmsg, packet->length, + ethernet_priority (kmsg)); + } + } + return MIG_NO_REPLY; } pfinet_packet_filter.patch: 2006-03-06 Richard Braun <[EMAIL PROTECTED]> * pfinet/ethernet.c: Update the NETF filter to include the new mandatory header. diff -Nurp pfinet/ethernet.c.orig pfinet/ethernet.c --- pfinet/ethernet.c.orig 2006-03-06 10:48:04.000000000 +0100 +++ pfinet/ethernet.c 2006-03-06 07:28:10.000000000 +0100 @@ -68,6 +68,7 @@ ethernet_set_multi (struct device *dev) static short ether_filter[] = { + NETF_IN, NETF_PUSHLIT | NETF_NOP, 1 }; -- Richard Braun