On Tue, Nov 24, 2015 at 08:16:25PM +0100, Hannes Frederic Sowa wrote:
> Hello,
>
> On Tue, Nov 24, 2015, at 19:59, Alexei Starovoitov wrote:
> > On Tue, Nov 24, 2015 at 07:23:30PM +0100, Hannes Frederic Sowa wrote:
> > > Hello,
> > >
> > > On Tue, Nov 24, 2015, at 17:25, Florian Westphal wrote:
> > > > It's a well-written document, but I don't see how moving the burden of
> > > > locking a single logical tcp connection (to prevent threads from
> > > > reading a partial record) from userspace to kernel is an improvement.
> > > >
> > > > If you really have 100 threads and must use a single tcp connection
> > > > to multiplex some arbitrarily complex record-format in atomic fashion,
> > > > then your requirements suck.
> > >
> > > Right, if we are in a datacenter I would probably write a script and use
> > > all those IPv6 addresses to set up mappings a la:
> > >
> > > for each $cpu; do
> > >   $ip address add 2000::$host:$cpu/64 dev if0 pref_cpu $cpu
> > > done
> >
> > interesting idea, but then remote host will be influencing local cpu
> > selection?
> > how remote can figure out the number of local cpus?
>
> Via rpc! :)
>
> The configuration shouldn't change all the time and some get_info rpc
> call could provide info for the topology of the machine, or...

Configuration changes all the time. Machines crash, traffic is redirected
because of load, etc.

> > Consider scenario where you have a ton of tcp sockets feeding into
> > bigger or smaller set of kcm sockets processed by threads or fibers.
> > Pinning sockets to cpu is not going to work.
> >
> > Also note that optimizing byte copies between kernel and user space is
> > important,
> > but we lose a lot more in user space due to scheduling and re-scheduling
> > when demux-ing user space thread is feeding other worker threads.
>
> ...also ipvs/netfilter could be used to only inspect the header and
> reroute the packet to some better fitting CPU. Complete hierarchies
> could be built with NUMA and addresses, packets could be rerouted into
> namespaces, etc.

or tc+bpf redirect... but the reason it won't work is the same reason
af_packet+bpf fanout doesn't apply: it's not packet-based demuxing.
The kernel needs to deal with the TCP stream first, and different messages
within a single TCP stream go to different workers.
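
To illustrate why per-packet steering cannot do this job, here is a minimal user-space sketch of stream-level demuxing. It is not the KCM interface: the 4-byte length-prefix framing, the `StreamDemux` class, and the round-robin worker choice are all assumptions made up for the example. The point it shows is that message boundaries do not line up with segment boundaries, so the byte stream has to be reassembled before any message can be handed to a worker.

```python
import struct
from collections import defaultdict

# Hypothetical framing for this sketch: 4-byte big-endian length, then payload.
HEADER = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length."""
    return HEADER.pack(len(payload)) + payload


class StreamDemux:
    """Reassemble a byte stream and hand complete messages to workers."""

    def __init__(self, n_workers: int):
        self.buf = bytearray()           # bytes received but not yet dispatched
        self.n_workers = n_workers
        self.next_worker = 0             # round-robin cursor (an assumption)
        self.queues = defaultdict(list)  # worker id -> list of messages

    def feed(self, segment: bytes) -> None:
        """Accept one arbitrary chunk of the stream, e.g. one TCP segment."""
        self.buf += segment
        while len(self.buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self.buf)
            end = HEADER.size + length
            if len(self.buf) < end:
                return  # message still incomplete, wait for more bytes
            msg = bytes(self.buf[HEADER.size:end])
            del self.buf[:end]
            self.queues[self.next_worker].append(msg)
            self.next_worker = (self.next_worker + 1) % self.n_workers


# Three messages on one connection, cut into segments at arbitrary offsets.
stream = frame(b"alpha") + frame(b"bravo-bravo") + frame(b"c")
segments = [stream[:3], stream[3:14], stream[14:]]

demux = StreamDemux(n_workers=2)
for seg in segments:
    demux.feed(seg)

for worker, msgs in sorted(demux.queues.items()):
    print(worker, msgs)
```

The first segment here does not even contain a full header, and the second one carries the end of one message plus the start of the next, so a filter that looks at one packet at a time has nothing stable to steer on.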