On Fri, Sep 23, 2011 at 11:04:43AM -0700, Jesse Gross wrote:
> On Fri, Sep 23, 2011 at 9:16 AM, Ben Pfaff <b...@nicira.com> wrote:
> > On Thu, Sep 22, 2011 at 03:59:30PM -0700, Jesse Gross wrote:
> >> On Thu, Sep 22, 2011 at 1:11 PM, Ben Pfaff <b...@nicira.com> wrote:
> >> > On Mon, Sep 19, 2011 at 03:00:08PM -0700, Jesse Gross wrote:
> >> >> Currently it is possible for a client on a single port to generate
> >> >> a huge number of packets that miss in the kernel flow table and
> >> >> monopolize the userspace/kernel communication path.  This
> >> >> effectively DoS's the machine because no new flow setups can take
> >> >> place.  This adds some additional fairness by separating each upcall
> >> >> type for each object in the datapath onto a separate socket, each
> >> >> with its own queue.  Userspace then reads round-robin from each
> >> >> socket so other flow setups can still succeed.
> >> >>
> >> >> Since the number of objects can potentially be large, we don't always
> >> >> have a unique socket for each.  Instead, we create 16 sockets and
> >> >> spread the load around them in a round-robin fashion.  It's
> >> >> theoretically possible to do better than this with some kind of
> >> >> active load balancing scheme but this seems like a good place to
> >> >> start.
> >> >>
> >> >> Feature #6485
> >> >
> >> > I'm not sure that we should increment last_assigned_upcall from
> >> > dpif_linux_execute__() as well as from dpif_flow_put().  Most
> >> > dpif_flow_put() calls are right after a dpif_linux_execute__() for
> >> > the same flow (manually sending the first packet), so this will tend
> >> > to use only the even-numbered upcall sockets.
> >> >
> >> > Also, manually sending the first packet of a flow with one
> >> > upcall_sock, then setting up a kernel flow that uses a different
> >> > upcall_sock could mean that userspace sees packet reordering, if it
> >> > checks the upcall_socks in the "wrong" order.
> >> >
> >> > One way to avoid these problems would be to choose the upcall_sock
> >> > using a hash of the flow instead of a counter.
> >>
> >> I think that probably makes sense and at the same time it addresses
> >> Pravin's concerns about assigning ports to sockets in a purely
> >> round-robin fashion.
> >>
> >> This is easier to do correctly if we have "struct flow" here instead
> >> of the Netlink attributes.  It actually seems more correct to do the
> >> conversion in dpif-linux anyways.  In theory, it could result in some
> >> unnecessary conversions for things that are bouncing back and forth
> >> between userspace and kernel but I think in practice we end up doing
> >> the conversions for most operations anyways.  Is there a reason to
> >> layer it the way it is?
> >
> > It's layered this way to support decoupling the kernel and user flow
> > keys, as described in commit 856081f683:
> >
> >     datapath: Report kernel's flow key when passing packets up to
> >     userspace.
> >
> >     One of the goals for Open vSwitch is to decouple kernel and
> >     userspace software, so that either one can be upgraded or rolled
> >     back independent of the other.  To do this in full generality, it
> >     must be possible to change the kernel's idea of the flow key
> >     separately from the userspace version.
> >
> >     This commit takes one step in that direction by making the kernel
> >     report its idea of the flow that a packet belongs to whenever it
> >     passes a packet up to userspace.  This means that userspace can
> >     intelligently figure out what to do:
> >
> >       - If userspace's notion of the flow for the packet matches the
> >         kernel's, then nothing special is necessary.
> >
> >       - If the kernel has a more specific notion for the flow than
> >         userspace, for example if the kernel decoded IPv6 headers but
> >         userspace stopped at the Ethernet type (because it does not
> >         understand IPv6), then again nothing special is necessary:
> >         userspace can still set up the flow in the usual way.
> >
> >       - If userspace has a more specific notion for the flow than the
> >         kernel, for example if userspace decoded an IPv6 header but the
> >         kernel stopped at the Ethernet type, then userspace can forward
> >         the packet manually, without setting up a flow in the kernel.
> >         (This case is bad from a performance point of view, but at
> >         least it is correct.)
> >
> >     This commit does not actually make userspace flexible enough to
> >     handle changes in the kernel flow key structure, although userspace
> >     does now have enough information to do that intelligently.  This
> >     will have to wait for later commits.
> >
> > At the time I thought that we'd want to implement the userspace half
> > of this quickly, but now it's become clear that it's a low priority,
> > so it's fine with me if you want to change the layering for now.
>
> I don't want to move away from the direction that we plan to go long
> term.  I think that we can achieve both platform independence and
> userspace/kernel decoupling at the same time, but for what I'm trying
> to achieve right now it's not really necessary, so I'll wait.
>
> Pravin suggested tying flows to the socket for their associated in
> port, which is probably the best policy for DoS.  It also solves the
> issues with flow add/execute packet combinations and is easy enough to
> pull out of the Netlink flow key, so I'm going to do that instead.
OK, sounds good to me.