[ovs-dev] [PATCH v15 3/3] Add MPLS recirculation tests
From: Joe Stringer This patch introduces a python script to generate about 1500 tests for permutations of mpls_push,mpls_pop,dec_mpls_ttl,dec_ttl where recirculation occurs up to (and including) three times. Signed-off-by: Joe Stringer Signed-off-by: Simon Horman --- v15 * Add additional flow removal tests to cover double,multiple recirculation v14 * Allow testing remote datapaths - Expose egress bridge, datapath under test, destination mac parameters * Add option to remove flows after each test v13 * Use the new force-miss-model config option to test MPLS on ofproto-dpif through the code paths for both handle_flow_miss_with_facet() and handle_flow_miss_without_facet(). v12 * First post --- tests/automake.mk | 1 + tests/ofproto-dpif.at | 153 tests/test-mpls.py| 491 ++ 3 files changed, 645 insertions(+) create mode 100644 tests/test-mpls.py diff --git a/tests/automake.mk b/tests/automake.mk index 755d88e..9e6935f 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -326,6 +326,7 @@ CHECK_PYFILES = \ tests/test-jsonrpc.py \ tests/test-ovsdb.py \ tests/test-reconnect.py \ + tests/test-mpls.py \ tests/MockXenAPI.py \ tests/test-unix-socket.py \ tests/test-unixctl.py \ diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index dd45a9b..b873a5b 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -1224,6 +1224,159 @@ NXST_FLOW reply: OVS_VSWITCHD_STOP AT_CLEANUP +AT_SETUP([ofproto-dpif - MPLS recirculation]) +OVS_VSWITCHD_START([dnl + add-port br0 p1 -- set Interface p1 type=dummy -- \ + set Open_vSwitch . other_config:force-miss-model=with-facets +]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) +AT_CHECK([$PYTHON $srcdir/test-mpls.py 1]) + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - MPLS recirculation, no facets]) +OVS_VSWITCHD_START([dnl + add-port br0 p1 -- set Interface p1 type=dummy -- \ + set Open_vSwitch . other_config:force-miss-model=without-facets +]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) +AT_CHECK([$PYTHON $srcdir/test-mpls.py 1]) + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - MPLS recirculation, delete rules between]) +OVS_VSWITCHD_START([dnl + add-port br0 p1 -- set Interface p1 type=dummy +]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) +AT_CHECK([$PYTHON $srcdir/test-mpls.py -c 1]) + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - MPLS double-recirculate]) +OVS_VSWITCHD_START([dnl + add-port br0 p1 -- set Interface p1 type=dummy -- \ + set Open_vSwitch . other_config:force-miss-model=with-facets +]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) + +for i in {0..150..50}; do +AT_CHECK([$PYTHON $srcdir/test-mpls.py 2 $i]) +done + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - MPLS double-recirculate, no facets]) +OVS_VSWITCHD_START([dnl + add-port br0 p1 -- set Interface p1 type=dummy -- \ + set Open_vSwitch . other_config:force-miss-model=without-facets +]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) + +for i in {0..150..50}; do +AT_CHECK([$PYTHON $srcdir/test-mpls.py 2 $i]) +done + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - MPLS double-recirculate, delete rules between]) +OVS_VSWITCHD_START([dnl + add-port br0 p1 -- set Interface p1 type=dummy +]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) + +for i in {0..150..50}; do +AT_CHECK([$PYTHON $srcdir/test-mpls.py -c 2 $i]) +done + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - MPLS multi-recirculate]) +OVS_VSWITCHD_START([dnl + add-port br0 p1 -- set Interface p1 type=dummy -- \ + set Open_vSwitch . other_config:force-miss-model=with-facets +]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) + +for i in {0..550..50}; do +AT_CHECK([$PYTHON $srcdir/test-mpls.py 3 $i]) +done + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - MPLS multi-recirculate, no facets]) +OVS_VSWITCHD_START([dnl + add-port br0 p1 -- set Interface p1 type=dummy -- \ + set Open_vSwitch . other_config:force-miss-model=without-facets +]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) + +for i in {0..550..50}; do +AT_CHECK([$PYTHON $srcdir/test-mpls.py 3 $i]) +done + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - MPLS multi-recirculate, delete rules between]) +OVS_VSWITCHD_START([dnl + add-port br0 p1 -- set Interface p1 type=dummy +]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) + +for i in {0..550..50}; do +AT_CHECK([$PYTHON $srcdir/test-mpls.py -c 3 $i]) +done + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - MPLS multi-recirculate 2]) +OVS_VSWITCHD_START([dnl + add-port br0 p1 -- set Interface p1 type=dummy -- \ + set Open_vSwitch . other_config:force-miss-model=with-facets +]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) + +for i in {600..1200..50}; do +AT_CHECK([$PYTHON $srcdir/test-mpls.py 3 $i]) +done + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - MPLS multi-recircula
[ovs-dev] [PATCH v15 0/3] Add packet recirculation
[ CC Ethan Jackson. Ethan, this makes changes to ofproto-dpif and ofproto-dpif-xlate. We would appreciate some feedback with how these changes fit with your ongoing refactoring in this area. ] Recirculation is a technique to allow a frame to re-enter frame processing. This is intended to be used after actions have been applied to the frame with modify the frame in some way that makes it possible for richer processing to occur. An example is and indeed targeted use case is MPLS. If an MPLS frame has an mpls_pop action applied with the IPv4 ethernet type then it becomes possible to decode the IPv4 portion of the frame. This may be used to construct a facet that modifies the IPv4 portion of the frame. This is not possible prior to the mpls_pop action as the contents of the frame after the MPLS stack is not known to be IPv4. Design: * New recirculation action. ovs-vswitchd adds a recirculation action to the end of a list of datapath actions for a flow when the actions are truncated because insufficient flow match information is available to add the next OpenFlow action. The recirculation action is preceded by an action to set the skb_mark to an id which can be used to scope a facet lookup of a recirculated packet. e.g. pop_mpls(0x0800),dec_ttl becomes pop_mpls(0x800),set(skb_mark(id)),recirculate * Datapath behaviour Then the datapath encounters a recirculate action it: + Recalculates the flow key based on the packet which will typically have been modified by previous actions + As the recirculate action is preceded by a set(skb_mark(id)) action, the new match key will now include skb_mark=id. + Performs a lookup using the new match key + Processes the packet if a facet matches the key or; + Makes an upcall if necessary * No facet behaviour + Loop: 1) translate actions 2) If there is a recirculate action, execute packet and go back to 1) for remaining actions. Base/Pre-requisites: This series applies to the git master branch on openvswtich.org and does not have any pre-requisites. Availability: To aid review and testing this series is available in git at: git://github.com/horms/openvswitch.git devel/mpls-recirculate-v15 Change Log Highlights: v14 * Enhance "ofproto-dpif: Add 'force-miss-model' configuration" patch to use strings. * Enhance "Add MPLS recirculation tests" to allow testing of remote datapaths - this was used to create an environment to test the kernel datapath * Allow matching of zero skb_mark - Include skb_mark in flow key passed from datapath - Include skb_mark in flow key constructed from flow if the mask is non-zero * Set skb_mark mask for recirculated facets during revalidation * Run through facets in reverse order in facet_revalidate(), so that children are always destroyed before their parents. * Recirculate on more than one pop otherwise intermediate eth type is lost v13 * Add patch to allow configuration of flow miss handling to force handling with or without facets. This is used by the patch to exercise recirculation using ofproto-dpif. * Extensive rebase for megaflow changes v12 * Rebase * Add much more extensive tests by Joe Stringer * Ensure pre_push_mpls_lse is never used uninitialised in do_xlate_actions() * Copy flow tunnel when recirculating in xlate_and_recirculate(). This avoids a bug that occurred when recirculating more than once and thus using flow_storage as the flow passed to flow_extract() and flow_storage.tunnel as the tunnel passed to flow_extract(). The tunnel parameter passed to flow_extract() must not be the tunnel element of its flow parameter. * In xlate_and_recirculate() ensure that there is sufficient headroom in packet for both the key, which is present if xlate_and_recirculate() is called by handle_flow_miss_without_facet(), and an MPLS LSE in case mpls_push() occurs. * As suggested by Ben Pfaff - Do not unnecessarily rename flow variable in dp_netdev_port_input() - Rename tun_key_from_attr() as odp_tun_key_from_attr() as it is now exported v11 * Add patches to handle multiple levels of push and pop * Addressed review by Jarno Rajahalme v10 * As suggested by Ben Pfaff - Rename lib/execute-actions.[ch] as lib/odp-execute.[ch] - Sort headers in alphabetical order - Minimise includes in lib/odp-execute.h - Include lib/odp-execute.h immediately after config.h in lib/odp-execute.c - Correct coding style of prototype of tun_key_from_attr - Remove dubious memset from set_tunnel_action - use {} with if statement for OVS_ACTION_ATTR_OUTPUT in execute_actions() - Do not put required side effects in conditions for ovs_assert(); - Remove execute_actions_for_recircualtion() helper and use odp_execute_actions directly() v9 * As suggested by Jesse Gross - Follow the convention on other code that when passing around a void pointer and casting it, the variable named with an underscore is the void pointer. - Move extraction of userspace data int
[ovs-dev] [PATCH] tests: Add ofproto-dpif push_vlan test
Signed-off-by: Joe Stringer --- tests/ofproto-dpif.at | 21 + 1 file changed, 21 insertions(+) diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index 5e2afb9..7204195 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -265,6 +265,7 @@ cookie=0x6 table=4 in_port=83 actions=load:4->NXM_NX_REG3[[]],mod_nw_src:83.83.8 cookie=0x7 table=5 in_port=84 actions=load:5->NXM_NX_REG4[[]],load:6->NXM_NX_TUN_ID[[]],mod_nw_dst:84.84.84.84,controller,resubmit(85,6) cookie=0x8 table=6 in_port=85 actions=mod_tp_src:85,controller,resubmit(86,7) cookie=0x9 table=7 in_port=86 actions=mod_tp_dst:86,controller,controller +cookie=0xa dl_src=40:44:44:44:44:41 actions=push_vlan:0x8100,mod_vlan_vid:99,mod_vlan_pcp:1,controller cookie=0xa dl_src=40:44:44:44:44:42 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],controller cookie=0xa dl_src=40:44:44:44:44:43 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],controller cookie=0xa dl_src=40:44:44:44:44:44 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],controller @@ -335,6 +336,25 @@ OFPT_PACKET_IN (xid=0x0): total_len=64 in_port=1 (via action) data_len=64 (unbuf tcp,metadata=0,in_port=0,dl_vlan=15,dl_vlan_pcp=0,dl_src=30:33:33:33:33:33,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=10 tcp_csum:0 ]) +dnl Modified VLAN controller action. +AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log]) + +for i in 1 2 3; do +ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:44:41,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)' +done + +OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit]) +AT_CHECK([cat ofctl_monitor.log], [0], [dnl +NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered) +tcp,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:44:41,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64 tcp_csum:0 +dnl +NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered) +tcp,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:44:41,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64 tcp_csum:0 +dnl +NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered) +tcp,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:44:41,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64 tcp_csum:0 +]) + dnl Modified MPLS controller action. AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log]) @@ -620,6 +640,7 @@ AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl cookie=0x7, table=5, n_packets=2, n_bytes=120, in_port=84 actions=load:0x5->NXM_NX_REG4[[]],load:0x6->NXM_NX_TUN_ID[[]],mod_nw_dst:84.84.84.84,CONTROLLER:65535,resubmit(85,6) cookie=0x8, table=6, n_packets=2, n_bytes=120, in_port=85 actions=mod_tp_src:85,CONTROLLER:65535,resubmit(86,7) cookie=0x9, table=7, n_packets=2, n_bytes=120, in_port=86 actions=mod_tp_dst:86,CONTROLLER:65535,CONTROLLER:65535 + cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:44:41 actions=mod_vlan_vid:99,mod_vlan_pcp:1,CONTROLLER:65535 cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:44:42 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],CONTROLLER:65535 cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:44:43 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],CONTROLLER:65535 cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:44:44 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],CONTROLLER:65535 -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] vswitchd: Fix typo in comment
"if_cfg_queue" is not a member of "struct bridge" in bridge.c Signed-off-by: Min Chen --- vswitchd/bridge.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c index 686eb93..608bc5c 100644 --- a/vswitchd/bridge.c +++ b/vswitchd/bridge.c @@ -505,7 +505,7 @@ bridge_reconfigure(const struct ovsrec_open_vswitch *ovs_cfg) "force-miss-model")); /* Destroy "struct bridge"s, "struct port"s, and "struct iface"s according - * to 'ovs_cfg' while update the "if_cfg_queue", with only very minimal + * to 'ovs_cfg' while update the "if_cfg_todo", with only very minimal * configuration otherwise. * * This is mostly an update to bridge data structures. Nothing is pushed @@ -2786,7 +2786,7 @@ bridge_queue_if_cfg(struct bridge *br, } /* Deletes "struct port"s and "struct iface"s under 'br' which aren't - * consistent with 'br->cfg'. Updates 'br->if_cfg_queue' with interfaces which + * consistent with 'br->cfg'. Updates 'br->if_cfg_todo' with interfaces which * 'br' needs to complete its configuration. */ static void bridge_add_del_ports(struct bridge *br, -- 1.7.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 07/28] timeval: Make reading the current time thread-safe.
On 10 July 2013 19:03, Ben Pfaff wrote: > Signed-off-by: Ben Pfaff Acked-by: Ed Maste (I rebased on master and had a trivial conflict to resolve.) X-CudaMail-Whitelist-To: dev@openvswitch.org ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 10/28] timeval: Make CPU usage and wakeup tracking per-thread.
On 10 July 2013 19:03, Ben Pfaff wrote: > This should make the "timeval" module thread-safe, except for its > calls into vlog (because vlog is not yet thread-safe). > > Signed-off-by: Ben Pfaff Acked-by: Ed Maste X-CudaMail-Whitelist-To: dev@openvswitch.org ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2.34] datapath: Add basic MPLS support to kernel
On Thu, Jul 11, 2013 at 6:55 PM, Simon Horman wrote: > On Thu, Jul 11, 2013 at 06:26:32PM -0700, Jesse Gross wrote: >> On Wed, Jul 10, 2013 at 5:16 PM, Simon Horman wrote: >> > On Mon, Jul 08, 2013 at 11:51:18PM -0700, Jesse Gross wrote: >> >> On Tue, Jul 2, 2013 at 6:31 PM, Simon Horman wrote: >> >> > On Tue, Jul 02, 2013 at 02:59:51PM -0700, Jesse Gross wrote: >> >> >> On Mon, Jul 1, 2013 at 9:16 PM, Simon Horman >> >> >> wrote: >> >> >> > On Mon, Jul 01, 2013 at 04:42:46PM -0700, Jesse Gross wrote: >> >> >> >> On Wed, Jun 26, 2013 at 1:18 AM, Simon Horman >> >> >> >> wrote: >> >> >> >> > Allow datapath to recognize and extract MPLS labels into flow keys >> >> >> >> > and execute actions which push, pop, and set labels on packets. >> >> >> >> > >> >> >> >> > Based heavily on work by Leo Alterman, Ravi K, Isaku Yamahata and >> >> >> >> > Joe Stringer. >> >> >> >> > >> >> >> >> > Cc: Ravi K >> >> >> >> > Cc: Leo Alterman >> >> >> >> > Cc: Isaku Yamahata >> >> >> >> > Cc: Joe Stringer >> >> >> >> > Signed-off-by: Simon Horman >> >> >> >> >> >> >> >> Simon, have you thought any more about the header ordering issues? I >> >> >> >> don't think we've reached a conclusion at this point. >> >> >> > >> >> >> > I believe that you then raised the issue of QinQ, which somehow >> >> >> > I failed to respond to, I apologise for that. >> >> >> > >> >> >> > In particular, my understanding is that you are concerned the code >> >> >> > will >> >> >> > miss-calculate the end of L2 headers in the presence of multiple >> >> >> > VLAN tags. >> >> >> > Thus resulting in an MPLS push action inserting an MPLS LSE after >> >> >> > the first >> >> >> > rather than the last VLAN tag. And that would likely change if QinQ >> >> >> > support >> >> >> > was added to Open vSwtich. >> >> >> > >> >> >> > I wonder if a good way forwards is to enhance the calculation >> >> >> > of the end of L2 headers (mac_len) and the beginning of L3 headers >> >> >> > (network_header) in ovs_flow_extract() such that it takes into >> >> >> > account the presence of more than one VLAN tag. >> >> >> >> >> >> The problem is that this requires being able to calculate the length >> >> >> of all possible headers that we might want to support in the future. >> >> >> In the case of QinQ, this would mean the various EtherTypes. You could >> >> >> also imagine other things like MAC-in-MAC that are farther afield from >> >> >> what we currently support. >> >> > >> >> > That is true. >> >> > >> >> > I think that the key problem is it that it is hard for us >> >> > to correctly determine where the end of the L2 header is, >> >> > or more specifically where the MPLS tag should go, for all cases. >> >> > Particularly cases which are yet to be defined. >> >> > >> >> > Having spoken with Joe about this we see two main options: >> >> > >> >> > 1. The status quo as of this patch. Which is that MPLS actions >> >> >may be invalid for some cases. >> >> > >> >> >While it should be be possible to solve individual cases, >> >> >this doesn't solve the general case. >> >> > >> >> > 2. Only allow MPLS actions on ether types where the implementation >> >> >is known to work. >> >> > >> >> >This could act as a white list of sorts. It could start with >> >> >the obvious candidates: IPv4, IPv6, ARP, 802.1Q,... >> >> >And support for more protocols could be added in the future. >> >> > >> >> >This would seem to reflect the somewhat special nature of MPLS. >> >> >> >> I think what is really necessary at the kernel level is some >> >> flexibility about where to put the newly inserted MPLS header. Then >> >> you could basically chose either of the two options above or export >> >> the flexibility through OpenFlow (which by my reading of the spec is >> >> already supposed to be allowed). Furthermore, you could do different >> >> things in different situations as OpenFlow evolves, which really has >> >> to be done at the userspace level since only it has the full set of >> >> knowledge about the protocol. >> > >> > I wonder if this can be achieved by adding a list of features to >> > the MPLS push action, for example as a possibly zero-length array of >> > integers which encode feature bits. >> > >> > At the time that MPLS support is added to the datapath we could define that >> > all the bits are zero and the behaviour of the code at that time is the >> > expected behaviour. >> > >> > Suppose that a new feature is added in the future. I will use the example >> > of support for 802.1ad (the standardised variant of Q-in-Q). >> > >> > The logic in the datapath to determine the end of the L2 header and thus >> > the top of the MPLS LSE stack could be guarded by a new feature bit, >> > the ad-bit. >> > >> > If an MPLS push action, supplied by userspace, has the ad-bit set then the >> > new logic is used and the MPLS LSE is inserted accordingly. >> > Conversely, if the MPLS push action does not have the bit set then the >> > old logic is used and the
Re: [ovs-dev] [PATCH] tests: Add ofproto-dpif push_vlan test
On Fri, Jul 12, 2013 at 04:58:13PM +0900, Joe Stringer wrote: > Signed-off-by: Joe Stringer Thanks Joe, I applied this to master. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] tests: Fix theoretical races in "ofproto-dpif - controller" test.
I don't see anything that guaranteed that ovs-ofctl would receive and print the packets before it exited. This commit inserts explicit waits, to avoid the problem. Found by inspection. Signed-off-by: Ben Pfaff --- tests/ofproto-dpif.at | 46 +++--- 1 files changed, 31 insertions(+), 15 deletions(-) diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index 7204195..330edd8 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -251,6 +251,7 @@ AT_SETUP([ofproto-dpif - controller]) OVS_VSWITCHD_START([dnl add-port br0 p1 -- set Interface p1 type=dummy ]) +ON_EXIT([kill `cat ovs-ofctl.pid`]) AT_CAPTURE_FILE([ofctl_monitor.log]) AT_DATA([flows.txt], [dnl @@ -285,8 +286,9 @@ AT_CHECK([ovs-ofctl monitor -P openflow10 br0 65534 --detach --no-chdir --pidfil for i in 1 2 3 ; do ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)' done +OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6]) +ovs-appctl -t ovs-ofctl exit -OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit]) AT_CHECK([cat ofctl_monitor.log], [0], [dnl OFPT_PACKET_IN (xid=0x0): total_len=60 in_port=1 (via no_match) data_len=60 (unbuffered) tcp,metadata=0,in_port=0,vlan_tci=0x,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=9 tcp_csum:0 @@ -304,8 +306,9 @@ AT_CHECK([ovs-ofctl monitor -P openflow10 br0 65534 --detach --no-chdir --pidfil for i in 1 2 3 ; do ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=10:11:11:11:11:11,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=10)' done +OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6]) +ovs-appctl -t ovs-ofctl exit -OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit]) AT_CHECK([cat ofctl_monitor.log], [0], [dnl OFPT_PACKET_IN (xid=0x0): total_len=60 in_port=1 (via action) data_len=60 (unbuffered) tcp,metadata=0,in_port=0,vlan_tci=0x,dl_src=10:11:11:11:11:11,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=10 tcp_csum:0 @@ -323,8 +326,9 @@ AT_CHECK([ovs-ofctl monitor -P openflow10 br0 65534 --detach --no-chdir --pidfil for i in 1 2 3 ; do ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=30:33:33:33:33:33,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=10)' done +OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6]) +ovs-appctl -t ovs-ofctl exit -OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit]) AT_CHECK([cat ofctl_monitor.log], [0], [dnl OFPT_PACKET_IN (xid=0x0): total_len=64 in_port=1 (via action) data_len=64 (unbuffered) tcp,metadata=0,in_port=0,dl_vlan=15,dl_vlan_pcp=0,dl_src=30:33:33:33:33:33,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=8,tp_dst=10 tcp_csum:0 @@ -342,8 +346,9 @@ AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor for i in 1 2 3; do ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:44:41,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)' done +OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6]) +ovs-appctl -t ovs-ofctl exit -OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit]) AT_CHECK([cat ofctl_monitor.log], [0], [dnl NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered) tcp,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:44:41,dl_dst=50:54:00:00:00:07,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64 tcp_csum:0 @@ -361,8 +366,9 @@ AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor for i in 1 2 3; do ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:44:42,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)' done +OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6]) +ovs-appctl -t ovs-ofctl exit -OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit]) AT_CHECK([cat ofctl_monitor.log], [0], [dnl NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered) mpls,metadata=0,in_port=0,vlan_tci=0x,dl_src=40:44:44:44:44:42,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1 @@ -401,8 +407,9 @@ AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor for i in 1 2 3; do ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:44:44,dst=50:54:00:00:00:07),eth_type(0x8100),vlan(vid=99,pcp=7),encap(eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no))'
Re: [ovs-dev] [PATCH] vswitchd: Fix typo in comment
On Fri, Jul 12, 2013 at 05:59:34PM +0800, Min Chen wrote: > "if_cfg_queue" is not a member of "struct bridge" in bridge.c > > Signed-off-by: Min Chen I tried to apply your patch but failed: Applying: vswitchd: Fix typo in comment fatal: corrupt patch at line 10 Patch failed at 0001 vswitchd: Fix typo in comment When you have resolved this problem run "git am --resolved". If you would prefer to skip this patch, instead run "git am --skip". To restore the original branch and stop patching run "git am --abort". ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] vswitchd: Fix typo in comment
On Jul 12, 2013, at 9:41 AM, Ben Pfaff wrote: > On Fri, Jul 12, 2013 at 05:59:34PM +0800, Min Chen wrote: >> "if_cfg_queue" is not a member of "struct bridge" in bridge.c >> >> Signed-off-by: Min Chen > > I tried to apply your patch but failed: > > Applying: vswitchd: Fix typo in comment > fatal: corrupt patch at line 10 > Patch failed at 0001 vswitchd: Fix typo in comment > When you have resolved this problem run "git am --resolved". > If you would prefer to skip this patch, instead run "git am --skip". > To restore the original branch and stop patching run "git am --abort". FWIW, I had a similar problem when I tried to apply your patch over the weekend. I ended up just redoing the fix manually, since it was small and I thought the problem was a one-off. --Justin ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] datapath: Use kernel eth_mac_addr() on old kernels.
The OVS MAC address set function was removed in favor of the version in the kernel but the function pointer for older kernels was not updated. Reported-by: Cali Ente Signed-off-by: Jesse Gross --- datapath/vport-internal_dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datapath/vport-internal_dev.c b/datapath/vport-internal_dev.c index 7f0b3c1..9ee1c42 100644 --- a/datapath/vport-internal_dev.c +++ b/datapath/vport-internal_dev.c @@ -173,7 +173,7 @@ static void do_setup(struct net_device *netdev) netdev->hard_start_xmit = internal_dev_xmit; netdev->open = internal_dev_open; netdev->stop = internal_dev_stop; - netdev->set_mac_address = internal_dev_mac_addr; + netdev->set_mac_address = eth_mac_addr; netdev->change_mtu = internal_dev_change_mtu; #endif -- 1.8.1.2 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] datapath: Use kernel eth_mac_addr() on old kernels.
On Fri, Jul 12, 2013 at 10:15 AM, Jesse Gross wrote: > The OVS MAC address set function was removed in favor of the version > in the kernel but the function pointer for older kernels was not > updated. > > Reported-by: Cali Ente > Signed-off-by: Jesse Gross > --- looks good. Acked-by: Pravin B Shelar . > datapath/vport-internal_dev.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/datapath/vport-internal_dev.c b/datapath/vport-internal_dev.c > index 7f0b3c1..9ee1c42 100644 > --- a/datapath/vport-internal_dev.c > +++ b/datapath/vport-internal_dev.c > @@ -173,7 +173,7 @@ static void do_setup(struct net_device *netdev) > netdev->hard_start_xmit = internal_dev_xmit; > netdev->open = internal_dev_open; > netdev->stop = internal_dev_stop; > - netdev->set_mac_address = internal_dev_mac_addr; > + netdev->set_mac_address = eth_mac_addr; > netdev->change_mtu = internal_dev_change_mtu; > #endif > > -- > 1.8.1.2 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] datapath: Use kernel eth_mac_addr() on old kernels.
On Fri, Jul 12, 2013 at 11:06 AM, Pravin Shelar wrote: > On Fri, Jul 12, 2013 at 10:15 AM, Jesse Gross wrote: >> The OVS MAC address set function was removed in favor of the version >> in the kernel but the function pointer for older kernels was not >> updated. >> >> Reported-by: Cali Ente >> Signed-off-by: Jesse Gross >> --- > looks good. > > Acked-by: Pravin B Shelar . Applied, thanks. X-CudaMail-Whitelist-To: dev@openvswitch.org ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 07/28] timeval: Make reading the current time thread-safe.
On Fri, Jul 12, 2013 at 09:16:10AM -0400, Ed Maste wrote: > On 10 July 2013 19:03, Ben Pfaff wrote: > > Signed-off-by: Ben Pfaff > > Acked-by: Ed Maste Thanks, I'll apply to master soon. > (I rebased on master and had a trivial conflict to resolve.) Oh, sorry, I forgot to update my github "threads" branch, didn't I? I'll try to keep it up-to-date as I finish with batches of reviews. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 10/28] timeval: Make CPU usage and wakeup tracking per-thread.
On Fri, Jul 12, 2013 at 09:19:39AM -0400, Ed Maste wrote: > On 10 July 2013 19:03, Ben Pfaff wrote: > > This should make the "timeval" module thread-safe, except for its > > calls into vlog (because vlog is not yet thread-safe). > > > > Signed-off-by: Ben Pfaff > > Acked-by: Ed Maste Thanks, I applied this to master. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] bfd: Implements BFD decay
Thanks for getting this together. Comments below. Major stuff: When anything changes about the configuration of the decay feature, the decay_detect_time needs to be reset and we should switch back to the original min_rx. To be safe, I'd also reset it when the min_rx changes. I wouldn't bother with a default decay_min_rx value. If someone fails to set the appropriate value (something >= min_rx) I would simply warn and disable the feature. Changing the min_rx is actually quite a bit more complicated than what you've implemented here. It's insufficient to just set the value, instead we have to tell the other end of the BFD session we're going to do so, before we actually pull the trigger. We do this by by initiating a poll sequence. And setting the bfd->poll_min_rx the the new value. I think we can pretty cleanly achieve the correct behavior by (1) initiating a poll in bfd_run() if we should decay, but the min_rx does not equal the decay_min_rx. And (2) in bfd_poll() setting poll_min_rx to decay_min_rx if decay is enabled and we should decay. In bfd_run() the final if statement should run whether or not decay is enabled. It will have to become a bit more sophisticated, but assuming my suggestion in the previous patch works, it should be straight forward. With the above two comments implemented the bfd_decay function should become quite a bit simpler. For example, we shouldn't be manually mucking with the detect_time. The current min_rx changing code should handle that. The port_modified() function in ofproto-dpif.c needs to update the "struct bfd"s netdev like it does for cfm. Otherwise you'll run into some subtle bugs. Minor stuff: The subject line of the commit message should have a period. It should also change "Implements" to "Implement". Please change the commit message to refer to "interfaces" instead of "tunnels". BFD can run on anything. This prexisted your patch, but in bfd_configure() the 'cfg' argument should be on the previous line. Instead of directly holding a pointer to 'netdev', please take a reference of it using netdev_ref(). See ofproto-dpif-xlate for an example. We can probably get rid of decay_enabled flag altogher and just use the decay_detect_time. If it's 0, we'll say the feature is disabled. In bfd_rx_packets() why would bfd->netdev be null? In bfd.h instead of including netdev.h you can simply declare struct netdev. cfm.h has an example. Ethan On Thu, Jul 11, 2013 at 6:39 PM, Alex Wang wrote: > When there is no incoming data traffic in the tunnel for a period, > BFD decay allows the bfd session to increase the min_rx. This is > helpful in that usually some tunnels are rarely used. And cpu > consumption can be reduced by processing fewer bfd control packets. > > Signed-off-by: Alex Wang > --- > lib/bfd.c | 89 > ++-- > lib/bfd.h |5 ++- > ofproto/ofproto-dpif.c |3 +- > vswitchd/vswitch.xml | 17 + > 4 files changed, 109 insertions(+), 5 deletions(-) > > diff --git a/lib/bfd.c b/lib/bfd.c > index aa1a3f7..0d4805a 100644 > --- a/lib/bfd.c > +++ b/lib/bfd.c > @@ -152,6 +152,9 @@ struct bfd { > bool cpath_down; /* Concatenated Path Down. */ > uint8_t mult; /* bfd.DetectMult. */ > > +struct netdev *netdev; > +uint64_t rx_packets; /* Packets received by 'netdev'. */ > + > enum state state; /* bfd.SessionState. */ > enum state rmt_state; /* bfd.RemoteSessionState. */ > > @@ -182,6 +185,11 @@ struct bfd { > > int ref_cnt; > int forwarding_override; /* Manual override of 'forwarding' status. > */ > + > +/* BFD decay related variables. */ > +bool decay_enabled; /* Flag that enables bfd decay. */ > +int decay_min_rx; > +long long int decay_detect_time; > }; > > static bool bfd_in_poll(const struct bfd *); > @@ -191,6 +199,9 @@ static const char *bfd_state_str(enum state); > static long long int bfd_min_tx(const struct bfd *); > static long long int bfd_tx_interval(const struct bfd *); > static long long int bfd_rx_interval(const struct bfd *); > +static uint64_t bfd_rx_packets(const struct bfd *); > +static void bfd_decay_min_rx_set_default(struct bfd *); > +static void bfd_decay(struct bfd *); > static void bfd_set_next_tx(struct bfd *); > static void bfd_set_state(struct bfd *, enum state, enum diag); > static uint32_t generate_discriminator(void); > @@ -243,7 +254,8 @@ bfd_get_status(const struct bfd *bfd, struct smap *smap) > * Also returns NULL if cfg is NULL. */ > struct bfd * > bfd_configure(struct bfd *bfd, const char *name, > - const struct smap *cfg) > + const struct smap *cfg, > + struct netdev *netdev) > { > static uint16_t udp_src = 0; > static bool init = false; > @@ -276,6 +288,9 @@ bfd_configure(struct bfd *bfd, const char *name, > bfd->min_tx = 1000; >
Re: [ovs-dev] [threads 15/28] system-stats: Remove worker process support.
On Wed, Jul 10, 2013 at 4:03 PM, Ben Pfaff wrote: > The worker process implementation isn't thread-safe and, once OVS > itself is threaded, it doesn't make much sense to have a worker > process anyway. > > Signed-off-by: Ben Pfaff > Looks good. > --- > vswitchd/system-stats.c | 145 > --- > 1 files changed, 24 insertions(+), 121 deletions(-) > > diff --git a/vswitchd/system-stats.c b/vswitchd/system-stats.c > index f0f53c0..e7c1d73 100644 > --- a/vswitchd/system-stats.c > +++ b/vswitchd/system-stats.c > @@ -41,7 +41,6 @@ > #include "smap.h" > #include "timeval.h" > #include "vlog.h" > -#include "worker.h" > > VLOG_DEFINE_THIS_MODULE(system_stats); > > @@ -505,46 +504,17 @@ get_filesys_stats(struct smap *stats OVS_UNUSED) > > #define SYSTEM_STATS_INTERVAL (5 * 1000) /* In milliseconds. */ > > -/* Whether the client wants us to report system stats. */ > -static bool enabled; > +/* The next time to wake up, or LLONG_MAX if stats are disabled. */ > +static long long int next_refresh = LLONG_MAX; > > -static enum { > -S_DISABLED, /* Not enabled, nothing going on. */ > -S_WAITING, /* Sleeping for SYSTEM_STATS_INTERVAL ms. > */ > -S_REQUEST_SENT, /* Sent a request to worker. */ > -S_REPLY_RECEIVED/* Received a reply from worker. */ > -} state; > - > -/* In S_WAITING state: the next time to wake up. > - * In other states: not meaningful. */ > -static long long int next_refresh; > - > -/* In S_REPLY_RECEIVED: the stats that have just been received. > - * In other states: not meaningful. */ > -static struct smap *received_stats; > - > -static worker_request_func system_stats_request_cb; > -static worker_reply_func system_stats_reply_cb; > - > -/* Enables or disables system stats collection, according to 'new_enable'. > - * > - * Even if system stats are disabled, the caller should still > periodically call > - * system_stats_run(). */ > +/* Enables or disables system stats collection, according to 'enable'. */ > void > -system_stats_enable(bool new_enable) > +system_stats_enable(bool enable) > { > -if (new_enable != enabled) { > -if (new_enable) { > -if (state == S_DISABLED) { > -state = S_WAITING; > -next_refresh = time_msec(); > -} > -} else { > -if (state == S_WAITING) { > -state = S_DISABLED; > -} > -} > -enabled = new_enable; > +if (!enable) { > +next_refresh = LLONG_MAX; > +} else if (next_refresh == LLONG_MAX) { > +next_refresh = time_msec(); > } > } > > @@ -553,38 +523,26 @@ system_stats_enable(bool new_enable) > * > * When a new snapshot is available (which only occurs if system stats are > * enabled), returns it as an smap owned by the caller. The caller must > use > - * both smap_destroy() and free() to complete free the returned data. > + * both smap_destroy() and free() to completely free the returned data. > * > * When no new snapshot is available, returns NULL. */ > struct smap * > system_stats_run(void) > { > -switch (state) { > -case S_DISABLED: > -break; > - > -case S_WAITING: > -if (time_msec() >= next_refresh) { > -worker_request(NULL, 0, NULL, 0, system_stats_request_cb, > - system_stats_reply_cb, NULL); > -state = S_REQUEST_SENT; > -} > -break; > - > -case S_REQUEST_SENT: > -break; > - > -case S_REPLY_RECEIVED: > -if (enabled) { > -state = S_WAITING; > -next_refresh = time_msec() + SYSTEM_STATS_INTERVAL; > -return received_stats; > -} else { > -smap_destroy(received_stats); > -free(received_stats); > -state = S_DISABLED; > -} > -break; > +if (time_msec() >= next_refresh) { > +struct smap *stats; > + > +stats = xmalloc(sizeof *stats); > +smap_init(stats); > +get_cpu_cores(stats); > +get_load_average(stats); > +get_memory_stats(stats); > +get_process_stats(stats); > +get_filesys_stats(stats); > + > +next_refresh = time_msec() + SYSTEM_STATS_INTERVAL; > + > +return stats; > } > > return NULL; > @@ -595,62 +553,7 @@ system_stats_run(void) > void > system_stats_wait(void) > { > -switch (state) { > -case S_DISABLED: > -break; > - > -case S_WAITING: > +if (next_refresh != LLONG_MAX) { > poll_timer_wait_until(next_refresh); > -break; > - > -case S_REQUEST_SENT: > -/* Someone else should be calling worker_wait() to wake up when > the > - * reply arrives, otherwise there's a bug. */ > -break; > - > -case S_REPLY_RECEIVED: > -poll_immediate_wake(); > -break; > } > } > - > -static void > -system_stats_reques
[ovs-dev] [PATCH] util: Set thread name via pthreads in set_subprogram_name().
This makes "top" and "ps" output more readable on FreeBSD at least, and the names are also visible in debuggers. Suggested-by: Ed Maste Signed-off-by: Ben Pfaff --- One reason that I didn't know about these functions was that my main dev box runs a distro so old that its glibc doesn't even have this function. That means that this patch is effectively untested. configure.ac |3 ++- lib/util.c |8 +++- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/configure.ac b/configure.ac index e4f9991..078ff4f 100644 --- a/configure.ac +++ b/configure.ac @@ -64,7 +64,8 @@ AC_CHECK_DECLS([sys_siglist], [], [], [[#include ]]) AC_CHECK_MEMBERS([struct stat.st_mtim.tv_nsec, struct stat.st_mtimensec], [], [], [[#include ]]) AC_CHECK_MEMBERS([struct ifreq.ifr_flagshigh], [], [], [[#include ]]) -AC_CHECK_FUNCS([mlockall strnlen getloadavg statvfs getmntent_r]) +AC_CHECK_FUNCS([mlockall strnlen getloadavg statvfs getmntent_r \ + pthread_setname_np pthread_set_name_np]) AC_CHECK_HEADERS([mntent.h sys/statvfs.h linux/types.h linux/if_ether.h stdatomic.h]) AC_CHECK_HEADERS([net/if_mib.h], [], [], [[#include #include ]]) diff --git a/lib/util.c b/lib/util.c index c69d7d1..6a72107 100644 --- a/lib/util.c +++ b/lib/util.c @@ -395,11 +395,17 @@ get_subprogram_name(void) } /* Sets 'name' as the name of the currently running thread or process. (This - * appears in log messages.) */ + * appears in log messages and may also be visible in system process listings + * and debuggers.) */ void set_subprogram_name(const char *name) { free(subprogram_name_set(xstrdup(name))); +#if HAVE_PTHREAD_SETNAME_NP +pthread_setname_np(pthread_self(), name); +#elif HAVE_PTHREAD_SET_NAME_NP +pthread_set_name_np(pthread_self(), name); +#endif } /* Returns a pointer to a string describing the program version. The -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 23/28] util: Make subprogram_name thread-specific.
On Thu, Jul 11, 2013 at 05:16:33PM -0700, Ben Pfaff wrote: > On Thu, Jul 11, 2013 at 09:40:57AM -0400, Ed Maste wrote: > > On 10 July 2013 19:04, Ben Pfaff wrote: > > > Signed-off-by: Ben Pfaff > > > > > +void > > > +set_subprogram_name(const char *name) > > > +{ > > > +free(subprogram_name_set(xstrdup(name))); > > > +} > > > > How about calling (through appropriate abstraction) pthread_setname_np > > (Linux) or pthread_set_name_np (FreeBSD)? The thread names will then > > appear in gdb or lldb, and in top(1) (on FreeBSD at least). > > I didn't know about those functions. Thanks, I'll look into it. I decided to apply this as-is and then send out a separate patch to add this functionality. It is here: http://openvswitch.org/pipermail/dev/2013-July/029589.html Thanks, Ben. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 24/28] vlog: Make vlog_should_drop() thread-safe.
On Thu, Jul 11, 2013 at 09:26:58AM -0400, Ed Maste wrote: > On 10 July 2013 19:04, Ben Pfaff wrote: > > Signed-off-by: Ben Pfaff > > Acked-by: Ed Maste Thanks, I'll apply this soon. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 26/28] vlog: Remove unused function vlog_get_log_file().
On Thu, Jul 11, 2013 at 10:03:07AM -0400, Ed Maste wrote: > On 10 July 2013 19:04, Ben Pfaff wrote: > > Signed-off-by: Ben Pfaff > > Acked-by: Ed Maste Thanks, I'll apply this soon. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 27/28] vlog: Remove unused function vlog_exit().
On Thu, Jul 11, 2013 at 09:56:43AM -0400, Ed Maste wrote: > On 10 July 2013 19:04, Ben Pfaff wrote: > > This is harder to implement once threads are introduced. > > > > Signed-off-by: Ben Pfaff > > Acked-by: Ed Maste Thanks, I'll apply this soon. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 16/28] worker: Delete library.
On Wed, Jul 10, 2013 at 4:03 PM, Ben Pfaff wrote: > It had no remaining users. > > Signed-off-by: Ben Pfaff > Looks good to me. > --- > Makefile.am |2 +- > lib/automake.mk |4 +- > lib/util.c |3 +- > lib/worker.c| 472 > --- > lib/worker.h| 68 --- > vswitchd/ovs-vswitchd.c |5 - > 6 files changed, 3 insertions(+), 551 deletions(-) > delete mode 100644 lib/worker.c > delete mode 100644 lib/worker.h > > diff --git a/Makefile.am b/Makefile.am > index 08aea0f..5b9e0ac 100644 > --- a/Makefile.am > +++ b/Makefile.am > @@ -202,7 +202,7 @@ ALL_LOCAL += check-assert-h-usage > check-assert-h-usage: > @if test -e $(srcdir)/.git && (git --version) >/dev/null 2>&1 && \ > (cd $(srcdir) && git --no-pager grep -l -E '[<]assert.h[>]') | > \ > - $(EGREP) -v '^lib/(sflow_receiver|vlog|worker).c$$|^tests/'; \ > + $(EGREP) -v '^lib/(sflow_receiver|vlog).c$$|^tests/'; \ > then \ > echo "Files listed above unexpectedly #include > <""assert.h"">."; \ > echo "Please use ovs_assert (from util.h) instead of assert."; > \ > diff --git a/lib/automake.mk b/lib/automake.mk > index 83ec520..280fc8b 100644 > --- a/lib/automake.mk > +++ b/lib/automake.mk > @@ -218,9 +218,7 @@ lib_libopenvswitch_a_SOURCES = \ > lib/vlog.c \ > lib/vlog.h \ > lib/vswitch-idl.c \ > - lib/vswitch-idl.h \ > - lib/worker.c \ > - lib/worker.h > + lib/vswitch-idl.h > > nodist_lib_libopenvswitch_a_SOURCES = \ > lib/dirs.c > diff --git a/lib/util.c b/lib/util.c > index 0ba1ed5..8e628cc 100644 > --- a/lib/util.c > +++ b/lib/util.c > @@ -38,8 +38,7 @@ COVERAGE_DEFINE(util_xalloc); > /* argv[0] without directory names. */ > const char *program_name; > > -/* Ordinarily "" but set to "monitor" for a monitor process or "worker" > for a > - * worker process. */ > +/* Ordinarily "" but set to "monitor" for a monitor process. */ > const char *subprogram_name = ""; > > /* --version option output. */ > diff --git a/lib/worker.c b/lib/worker.c > deleted file mode 100644 > index 6904fdd..000 > --- a/lib/worker.c > +++ /dev/null > @@ -1,472 +0,0 @@ > -/* Copyright (c) 2012, 2013 Nicira, Inc. > - * > - * Licensed under the Apache License, Version 2.0 (the "License"); > - * you may not use this file except in compliance with the License. > - * You may obtain a copy of the License at: > - * > - * http://www.apache.org/licenses/LICENSE-2.0 > - * > - * Unless required by applicable law or agreed to in writing, software > - * distributed under the License is distributed on an "AS IS" BASIS, > - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or > implied. > - * See the License for the specific language governing permissions and > - * limitations under the License. > - */ > - > -#include > - > -#include "worker.h" > - > -#include > -#include > -#include > -#include > -#include > -#include > -#include > -#include > -#include > - > -#include "command-line.h" > -#include "daemon.h" > -#include "ofpbuf.h" > -#include "poll-loop.h" > -#include "socket-util.h" > -#include "util.h" > -#include "vlog.h" > - > -VLOG_DEFINE_THIS_MODULE(worker); > - > -/* ovs_assert() logs the assertion message and logging sometimes goes > through a > - * worker, so using ovs_assert() in this source file could cause > recursion. */ > -#undef ovs_assert > -#define ovs_assert use_assert_instead_of_ovs_assert_in_this_module > - > -/* Header for an RPC request. */ > -struct worker_request { > -size_t request_len; /* Length of the payload in bytes. */ > -worker_request_func *request_cb; /* Function to call in worker > process. */ > -worker_reply_func *reply_cb; /* Function to call in main process. > */ > -void *reply_aux; /* Auxiliary data for 'reply_cb'. */ > -}; > - > -/* Header for an RPC reply. */ > -struct worker_reply { > -size_t reply_len;/* Length of the payload in bytes. */ > -worker_reply_func *reply_cb; /* Function to call in main process. */ > -void *reply_aux; /* Auxiliary data for 'reply_cb'. */ > -}; > - > -/* Receive buffer for a RPC request or reply. */ > -struct rxbuf { > -/* Header. */ > -struct ofpbuf header; /* Header data. */ > -int fds[SOUTIL_MAX_FDS];/* File descriptors. */ > -size_t n_fds; > - > -/* Payload. */ > -struct ofpbuf payload; /* Payload data. */ > -}; > - > -static int client_sock = -1; > -static struct rxbuf client_rx; > - > -static void rxbuf_init(struct rxbuf *); > -static void rxbuf_clear(struct rxbuf *); > -static int rxbuf_run(struct rxbuf *, int sock, size_t header_len); > - > -static struct iovec *prefix_iov(void *data, size_t len, > -const struct iovec *iovs, size_t n_iovs); > - > -static void worker_broke(void); > - > -stat
Re: [ovs-dev] [threads 16/28] worker: Delete library.
On Fri, Jul 12, 2013 at 02:37:34PM -0700, Gurucharan Shetty wrote: > On Wed, Jul 10, 2013 at 4:03 PM, Ben Pfaff wrote: > > > It had no remaining users. > > > > Signed-off-by: Ben Pfaff > > > Looks good to me. Thanks, applied. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 15/28] system-stats: Remove worker process support.
On Fri, Jul 12, 2013 at 02:18:41PM -0700, Gurucharan Shetty wrote: > On Wed, Jul 10, 2013 at 4:03 PM, Ben Pfaff wrote: > > > The worker process implementation isn't thread-safe and, once OVS > > itself is threaded, it doesn't make much sense to have a worker > > process anyway. > > > > Signed-off-by: Ben Pfaff > > > Looks good. Thanks, applied. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads 28/28] vlog: Make thread-safe.
On Thu, Jul 11, 2013 at 08:59:22AM -0400, Ed Maste wrote: > On 10 July 2013 19:04, Ben Pfaff wrote: > > static void format_log_message(const struct vlog_module *, enum vlog_level, > > - enum vlog_facility, unsigned int msg_num, > > + enum vlog_facility, > > const char *message, va_list, struct ds *) > > PRINTF_FORMAT(5, 0); > > This should be PRINTF_FORMAT(4, 0) now. Thanks, I've fixed that up. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [threads v2 00/13] make some important libraries thread-safe
Many patches have been committed since v1. Of the remaining patches, I think only "vlog: Make thread-safe." has changed, and only to fix a PRINTF_FORMAT annotation. Ben Pfaff (13): stress: Remove essentially unused library. netlink-socket: Simplify use of transactions and dumps. netlink-socket: Make thread-safe. poll-loop: Simplify and speed up polling. poll-loop: Make poll loop data structures per-thread. latch: New module for a thread-safe, signal-safe, pollable doorbell. system-stats: Move into separate thread. byteq: Make the queue size variable instead of fixed at BYTEQ_SIZE bytes. util: New macros ROUND_UP_POW2, ROUND_DOWN_POW2. async-append: New library to allow asynchronous appending to a log file. vlog: Use async I/O. vlog: Make VLOG_ONCE thread-safe. vlog: Make thread-safe. configure.ac |1 + lib/async-append-aio.c | 178 + lib/async-append-sync.c| 62 lib/async-append.h | 67 + lib/automake.mk| 17 ++-- lib/byteq.c| 26 +++-- lib/byteq.h| 10 +- lib/dpif-linux.c | 20 ++--- lib/jsonrpc.c |5 +- lib/latch.c| 72 ++ lib/latch.h| 37 +++ lib/netdev-linux.c | 18 +--- lib/netlink-socket.c | 233 +--- lib/netlink-socket.h | 15 +++- lib/poll-loop.c| 154 ++--- lib/poll-loop.h| 20 ++-- lib/route-table.c | 26 +- lib/stream-fd.c| 17 --- lib/stress-unixctl.man | 72 -- lib/stress.c | 219 - lib/stress.h | 94 -- lib/util.h | 19 lib/vlog.c | 180 +- lib/vlog.h | 23 +++-- m4/openvswitch.m4 |5 + manpages.mk|4 - ovsdb/ovsdb-server.1.in|1 - ovsdb/ovsdb-server.c |2 - tests/library.at |2 + tests/test-util.c | 65 vswitchd/bridge.c |3 + vswitchd/ovs-vswitchd.8.in |1 - vswitchd/ovs-vswitchd.c|2 - vswitchd/system-stats.c| 105 34 files changed, 974 insertions(+), 801 deletions(-) create mode 100644 lib/async-append-aio.c create mode 100644 lib/async-append-sync.c create mode 100644 lib/async-append.h create mode 100644 lib/latch.c create mode 100644 lib/latch.h delete mode 100644 lib/stress-unixctl.man delete mode 100644 lib/stress.c delete mode 100644 lib/stress.h -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [threads v2 01/13] stress: Remove essentially unused library.
The "stress" library was introduced years ago. We intended at the time to start using it to provoke errors in testing, to make sure that Open vSwitch was resilient against those errors. The intention was good, but there were few actual implementations of stress options, and the testing never materialized. Rather than adapt the stress library for thread safety, this seems like a good opportunity to remove it, so this commit does so. Signed-off-by: Ben Pfaff --- lib/automake.mk|8 -- lib/netlink-socket.c | 14 --- lib/stream-fd.c| 17 lib/stress-unixctl.man | 72 --- lib/stress.c | 219 lib/stress.h | 94 --- manpages.mk|4 - ovsdb/ovsdb-server.1.in|1 - ovsdb/ovsdb-server.c |2 - vswitchd/ovs-vswitchd.8.in |1 - vswitchd/ovs-vswitchd.c|2 - 11 files changed, 0 insertions(+), 434 deletions(-) delete mode 100644 lib/stress-unixctl.man delete mode 100644 lib/stress.c delete mode 100644 lib/stress.h diff --git a/lib/automake.mk b/lib/automake.mk index 507ca97..280fc8b 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -182,8 +182,6 @@ lib_libopenvswitch_a_SOURCES = \ lib/stream-unix.c \ lib/stream.c \ lib/stream.h \ - lib/stress.c \ - lib/stress.h \ lib/string.c \ lib/string.h \ lib/svec.c \ @@ -306,7 +304,6 @@ MAN_FRAGMENTS += \ lib/ssl-peer-ca-cert.man \ lib/ssl.man \ lib/ssl-syn.man \ - lib/stress-unixctl.man \ lib/table.man \ lib/unixctl.man \ lib/unixctl-syn.man \ @@ -383,11 +380,6 @@ lib/coverage.def: $(DIST_SOURCES) sed -n 's|^COVERAGE_DEFINE(\([_a-zA-Z0-9]\{1,\}\)).*$$|COVERAGE_COUNTER(\1)|p' $(all_sources) | LC_ALL=C sort -u > $@ CLEANFILES += lib/coverage.def -lib/stress.$(OBJEXT): lib/stress.def -lib/stress.def: $(DIST_SOURCES) - sed -n '/^STRESS_OPTION(/,/);$$/{s/);$$/)/;p}' $(all_sources) > $@ -CLEANFILES += lib/stress.def - lib/vlog.$(OBJEXT): lib/vlog-modules.def lib/vlog-modules.def: $(DIST_SOURCES) sed -n 's|^VLOG_DEFINE_\(THIS_\)\{0,1\}MODULE(\([_a-zA-Z0-9]\{1,\}\)).*$$|VLOG_MODULE(\2)|p' $(all_sources) | LC_ALL=C sort -u > $@ diff --git a/lib/netlink-socket.c b/lib/netlink-socket.c index dfe39ac..aa7fca2 100644 --- a/lib/netlink-socket.c +++ b/lib/netlink-socket.c @@ -31,7 +31,6 @@ #include "ofpbuf.h" #include "poll-loop.h" #include "socket-util.h" -#include "stress.h" #include "util.h" #include "vlog.h" @@ -309,15 +308,6 @@ nl_sock_send_seq(struct nl_sock *sock, const struct ofpbuf *msg, return nl_sock_send__(sock, msg, nlmsg_seq, wait); } -/* This stress option is useful for testing that OVS properly tolerates - * -ENOBUFS on NetLink sockets. Such errors are unavoidable because they can - * occur if the kernel cannot temporarily allocate enough GFP_ATOMIC memory to - * reply to a request. They can also occur if messages arrive on a multicast - * channel faster than OVS can process them. */ -STRESS_OPTION( -netlink_overflow, "simulate netlink socket receive buffer overflow", -5, 1, -1, 100); - static int nl_sock_recv__(struct nl_sock *sock, struct ofpbuf *buf, bool wait) { @@ -373,10 +363,6 @@ nl_sock_recv__(struct nl_sock *sock, struct ofpbuf *buf, bool wait) return EPROTO; } -if (STRESS(netlink_overflow)) { -return ENOBUFS; -} - buf->size = MIN(retval, buf->allocated); if (retval > buf->allocated) { COVERAGE_INC(netlink_recv_jumbo); diff --git a/lib/stream-fd.c b/lib/stream-fd.c index d102582..1171f32 100644 --- a/lib/stream-fd.c +++ b/lib/stream-fd.c @@ -26,7 +26,6 @@ #include "fatal-signal.h" #include "poll-loop.h" #include "socket-util.h" -#include "stress.h" #include "util.h" #include "stream-provider.h" #include "stream.h" @@ -89,38 +88,22 @@ fd_connect(struct stream *stream) return check_connection_completion(s->fd); } -STRESS_OPTION( -stream_flaky_recv, "simulate failure of fd stream recvs", -100, 0, -1, 0); - static ssize_t fd_recv(struct stream *stream, void *buffer, size_t n) { struct stream_fd *s = stream_fd_cast(stream); ssize_t retval; -if (STRESS(stream_flaky_recv)) { -return -EIO; -} - retval = read(s->fd, buffer, n); return retval >= 0 ? retval : -errno; } -STRESS_OPTION( -stream_flaky_send, "simulate failure of fd stream sends", -100, 0, -1, 0); - static ssize_t fd_send(struct stream *stream, const void *buffer, size_t n) { struct stream_fd *s = stream_fd_cast(stream); ssize_t retval; -if (STRESS(stream_flaky_send)) { -return -EIO; -} - retval = write(s->fd, buffer, n); return (retval > 0 ? retval : retval == 0 ? -EAGAIN diff --git a/lib/stress-unixctl.man b/lib/stress-unixctl.man deleted file mode 100644 index d2
[ovs-dev] [threads v2 03/13] netlink-socket: Make thread-safe.
The uses of vlog in this module are not thread-safe, because vlog itself is not yet thread-safe. Signed-off-by: Ben Pfaff --- lib/netlink-socket.c | 24 +++- lib/netlink-socket.h |6 ++ 2 files changed, 25 insertions(+), 5 deletions(-) diff --git a/lib/netlink-socket.c b/lib/netlink-socket.c index 8e08841..da32284 100644 --- a/lib/netlink-socket.c +++ b/lib/netlink-socket.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2008, 2009, 2010, 2011, 2012 Nicira, Inc. + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -29,6 +29,7 @@ #include "netlink.h" #include "netlink-protocol.h" #include "ofpbuf.h" +#include "ovs-thread.h" #include "poll-loop.h" #include "socket-util.h" #include "util.h" @@ -85,13 +86,14 @@ static void nl_pool_release(struct nl_sock *); int nl_sock_create(int protocol, struct nl_sock **sockp) { +static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; struct nl_sock *sock; struct sockaddr_nl local, remote; socklen_t local_size; int rcvbuf; int retval = 0; -if (!max_iovs) { +if (ovsthread_once_start(&once)) { int save_errno = errno; errno = 0; @@ -106,6 +108,7 @@ nl_sock_create(int protocol, struct nl_sock **sockp) } errno = save_errno; +ovsthread_once_done(&once); } *sockp = NULL; @@ -1010,17 +1013,25 @@ struct nl_pool { }; static struct nl_pool pools[MAX_LINKS]; +static pthread_mutex_t pool_mutex = PTHREAD_ADAPTIVE_MUTEX_INITIALIZER; static int nl_pool_alloc(int protocol, struct nl_sock **sockp) { +struct nl_sock *sock = NULL; struct nl_pool *pool; ovs_assert(protocol >= 0 && protocol < ARRAY_SIZE(pools)); +xpthread_mutex_lock(&pool_mutex); pool = &pools[protocol]; if (pool->n > 0) { -*sockp = pool->socks[--pool->n]; +sock = pool->socks[--pool->n]; +} +xpthread_mutex_unlock(&pool_mutex); + +if (sock) { +*sockp = sock; return 0; } else { return nl_sock_create(protocol, sockp); @@ -1033,11 +1044,14 @@ nl_pool_release(struct nl_sock *sock) if (sock) { struct nl_pool *pool = &pools[sock->protocol]; +xpthread_mutex_lock(&pool_mutex); if (pool->n < ARRAY_SIZE(pool->socks)) { pool->socks[pool->n++] = sock; -} else { -nl_sock_destroy(sock); +sock = NULL; } +xpthread_mutex_unlock(&pool_mutex); + +nl_sock_destroy(sock); } } diff --git a/lib/netlink-socket.h b/lib/netlink-socket.h index c77050e..986b574 100644 --- a/lib/netlink-socket.h +++ b/lib/netlink-socket.h @@ -30,6 +30,12 @@ * are Linux-specific. For Netlink protocol definitions, see * netlink-protocol.h. For helper functions for working with Netlink messages, * see netlink.h. + * + * + * Thread-safety + * = + * + * Only a single thread may use a given nl_sock or nl_dump at one time. */ #include -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [threads v2 04/13] poll-loop: Simplify and speed up polling.
The simplification comes from dropping support for canceling a poll_waiter, which was a feature that was never used. The speedup comes from avoiding a malloc() for every call to poll_fd_wait(). (I doubt that this significantly improves performance.) This prepares for making the polling structures per-thread in the next commit. Signed-off-by: Ben Pfaff --- lib/poll-loop.c | 85 +++--- lib/poll-loop.h |6 +--- 2 files changed, 25 insertions(+), 66 deletions(-) diff --git a/lib/poll-loop.c b/lib/poll-loop.c index 5f4b16c..567c19f 100644 --- a/lib/poll-loop.c +++ b/lib/poll-loop.c @@ -41,18 +41,13 @@ COVERAGE_DEFINE(poll_zero_timeout); /* An event that will wake the following call to poll_block(). */ struct poll_waiter { -/* Set when the waiter is created. */ -struct list node; /* Element in global waiters list. */ -int fd; /* File descriptor. */ -short int events; /* Events to wait for (POLLIN, POLLOUT). */ const char *where; /* Where the waiter was created. */ - -/* Set only when poll_block() is called. */ -struct pollfd *pollfd; /* Pointer to element of the pollfds array. */ }; /* All active poll waiters. */ -static struct list waiters = LIST_INITIALIZER(&waiters); +static struct poll_waiter *waiters; +static struct pollfd *pollfds; +static size_t n_waiters, allocated_waiters; /* Time at which to wake up the next call to poll_block(), in milliseconds as * returned by time_msec(), LLONG_MIN to wake up immediately, or LLONG_MAX to @@ -62,8 +57,7 @@ static long long int timeout_when = LLONG_MAX; /* Location where waiter created. */ static const char *timeout_where; -static struct poll_waiter *new_waiter(int fd, short int events, - const char *where); +static void new_waiter(int fd, short int events, const char *where); /* Registers 'fd' as waiting for the specified 'events' (which should be POLLIN * or POLLOUT or POLLIN | POLLOUT). The following call to poll_block() will @@ -75,11 +69,11 @@ static struct poll_waiter *new_waiter(int fd, short int events, * * Ordinarily the 'where' argument is supplied automatically; see poll-loop.h * for more information. */ -struct poll_waiter * +void poll_fd_wait(int fd, short int events, const char *where) { COVERAGE_INC(poll_fd_wait); -return new_waiter(fd, events, where); +new_waiter(fd, events, where); } /* Causes the following call to poll_block() to block for no more than 'msec' @@ -210,11 +204,6 @@ log_wakeup(const char *where, const struct pollfd *pollfd, int timeout) void poll_block(void) { -static struct pollfd *pollfds; -static size_t max_pollfds; - -struct poll_waiter *pw, *next; -int n_waiters, n_pollfds; int elapsed; int retval; @@ -222,70 +211,44 @@ poll_block(void) * poll_block. */ fatal_signal_wait(); -n_waiters = list_size(&waiters); -if (max_pollfds < n_waiters) { -max_pollfds = n_waiters; -pollfds = xrealloc(pollfds, max_pollfds * sizeof *pollfds); -} - -n_pollfds = 0; -LIST_FOR_EACH (pw, node, &waiters) { -pw->pollfd = &pollfds[n_pollfds]; -pollfds[n_pollfds].fd = pw->fd; -pollfds[n_pollfds].events = pw->events; -pollfds[n_pollfds].revents = 0; -n_pollfds++; -} - if (timeout_when == LLONG_MIN) { COVERAGE_INC(poll_zero_timeout); } -retval = time_poll(pollfds, n_pollfds, timeout_when, &elapsed); +retval = time_poll(pollfds, n_waiters, timeout_when, &elapsed); if (retval < 0) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); VLOG_ERR_RL(&rl, "poll: %s", ovs_strerror(-retval)); } else if (!retval) { log_wakeup(timeout_where, NULL, elapsed); -} +} else if (get_cpu_usage() > 50 || VLOG_IS_DBG_ENABLED()) { +size_t i; -LIST_FOR_EACH_SAFE (pw, next, node, &waiters) { -if (pw->pollfd->revents) { -log_wakeup(pw->where, pw->pollfd, 0); +for (i = 0; i < n_waiters; i++) { +if (pollfds[i].revents) { +log_wakeup(waiters[i].where, &pollfds[i], 0); +} } -poll_cancel(pw); } timeout_when = LLONG_MAX; timeout_where = NULL; +n_waiters = 0; /* Handle any pending signals before doing anything else. */ fatal_signal_run(); } - -/* Cancels the file descriptor event registered with poll_fd_wait() using 'pw', - * the struct poll_waiter returned by that function. - * - * An event registered with poll_fd_wait() may be canceled from its time of - * registration until the next call to poll_block(). At that point, the event - * is automatically canceled by the system and its poll_waiter is freed. */ -void -poll_cancel(struct poll_waiter *pw) -{ -if (pw) { -list_remove(&pw->node); -free(pw); -} -}
[ovs-dev] [threads v2 06/13] latch: New module for a thread-safe, signal-safe, pollable doorbell.
Signed-off-by: Ben Pfaff --- lib/automake.mk |2 + lib/latch.c | 72 +++ lib/latch.h | 37 3 files changed, 111 insertions(+), 0 deletions(-) create mode 100644 lib/latch.c create mode 100644 lib/latch.h diff --git a/lib/automake.mk b/lib/automake.mk index 280fc8b..6b0972b 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -71,6 +71,8 @@ lib_libopenvswitch_a_SOURCES = \ lib/jsonrpc.h \ lib/lacp.c \ lib/lacp.h \ + lib/latch.c \ + lib/latch.h \ lib/learn.c \ lib/learn.h \ lib/learning-switch.c \ diff --git a/lib/latch.c b/lib/latch.c new file mode 100644 index 000..90df61a --- /dev/null +++ b/lib/latch.c @@ -0,0 +1,72 @@ +/* + * Copyright (c) 2013 Nicira, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include + +#include "latch.h" +#include +#include +#include +#include "poll-loop.h" +#include "socket-util.h" + +void +latch_init(struct latch *latch) +{ +xpipe_nonblocking(latch->fds); +} + +void +latch_destroy(struct latch *latch) +{ +close(latch->fds[0]); +close(latch->fds[1]); +} + +bool +latch_poll(struct latch *latch) +{ +char buffer[_POSIX_PIPE_BUF]; + +return read(latch->fds[0], buffer, sizeof buffer) > 0; +} + +void +latch_set(struct latch *latch) +{ +ignore(write(latch->fds[1], "", 1)); +} + +bool +latch_is_set(const struct latch *latch) +{ +struct pollfd pfd; +int retval; + +pfd.fd = latch->fds[0]; +pfd.events = POLLIN; +do { +retval = poll(&pfd, 1, 0); +} while (retval < 0 && errno == EINTR); + +return pfd.revents & POLLIN; +} + +void +(latch_wait)(const struct latch *latch, const char *where) +{ +(poll_fd_wait)(latch->fds[0], POLLIN, where); +} diff --git a/lib/latch.h b/lib/latch.h new file mode 100644 index 000..7e916e7 --- /dev/null +++ b/lib/latch.h @@ -0,0 +1,37 @@ +/* + * Copyright (c) 2013 Nicira, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef LATCH_H +#define LATCH_H 1 + +#include +#include "util.h" + +struct latch { +int fds[2]; +}; + +void latch_init(struct latch *); +void latch_destroy(struct latch *); + +bool latch_poll(struct latch *); +void latch_set(struct latch *); + +bool latch_is_set(const struct latch *); +void latch_wait(const struct latch *, const char *where); +#define latch_wait(latch) latch_wait(latch, SOURCE_LOCATOR) + +#endif /* latch.h */ -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [threads v2 05/13] poll-loop: Make poll loop data structures per-thread.
Signed-off-by: Ben Pfaff --- lib/poll-loop.c | 107 +++ lib/poll-loop.h | 14 +-- 2 files changed, 78 insertions(+), 43 deletions(-) diff --git a/lib/poll-loop.c b/lib/poll-loop.c index 567c19f..97fc806 100644 --- a/lib/poll-loop.c +++ b/lib/poll-loop.c @@ -25,6 +25,7 @@ #include "dynamic-string.h" #include "fatal-signal.h" #include "list.h" +#include "ovs-thread.h" #include "socket-util.h" #include "timeval.h" #include "vlog.h" @@ -39,25 +40,20 @@ VLOG_DEFINE_THIS_MODULE(poll_loop); COVERAGE_DEFINE(poll_fd_wait); COVERAGE_DEFINE(poll_zero_timeout); -/* An event that will wake the following call to poll_block(). */ -struct poll_waiter { -const char *where; /* Where the waiter was created. */ -}; - -/* All active poll waiters. */ -static struct poll_waiter *waiters; -static struct pollfd *pollfds; -static size_t n_waiters, allocated_waiters; +struct poll_loop { +/* All active poll waiters. */ +struct pollfd *pollfds; /* Events to pass to poll(). */ +const char **where; /* Where each pollfd was created. */ +size_t n_waiters; /* Number of elems in 'where' and 'pollfds'. */ +size_t allocated_waiters; /* Allocated elems in 'where' and 'pollfds'. */ -/* Time at which to wake up the next call to poll_block(), in milliseconds as - * returned by time_msec(), LLONG_MIN to wake up immediately, or LLONG_MAX to - * wait forever. */ -static long long int timeout_when = LLONG_MAX; - -/* Location where waiter created. */ -static const char *timeout_where; +/* Time at which to wake up the next call to poll_block(), LLONG_MIN to + * wake up immediately, or LLONG_MAX to wait forever. */ +long long int timeout_when; /* In msecs as returned by time_msec(). */ +const char *timeout_where; /* Where 'timeout_when' was set. */ +}; -static void new_waiter(int fd, short int events, const char *where); +static struct poll_loop *poll_loop(void); /* Registers 'fd' as waiting for the specified 'events' (which should be POLLIN * or POLLOUT or POLLIN | POLLOUT). The following call to poll_block() will @@ -72,8 +68,21 @@ static void new_waiter(int fd, short int events, const char *where); void poll_fd_wait(int fd, short int events, const char *where) { +struct poll_loop *loop = poll_loop(); + COVERAGE_INC(poll_fd_wait); -new_waiter(fd, events, where); +if (loop->n_waiters >= loop->allocated_waiters) { +loop->where = x2nrealloc(loop->where, &loop->allocated_waiters, + sizeof *loop->where); +loop->pollfds = xrealloc(loop->pollfds, + (loop->allocated_waiters + * sizeof *loop->pollfds)); +} + +loop->where[loop->n_waiters] = where; +loop->pollfds[loop->n_waiters].fd = fd; +loop->pollfds[loop->n_waiters].events = events; +loop->n_waiters++; } /* Causes the following call to poll_block() to block for no more than 'msec' @@ -120,9 +129,10 @@ poll_timer_wait(long long int msec, const char *where) void poll_timer_wait_until(long long int when, const char *where) { -if (when < timeout_when) { -timeout_when = when; -timeout_where = where; +struct poll_loop *loop = poll_loop(); +if (when < loop->timeout_when) { +loop->timeout_when = when; +loop->timeout_where = where; } } @@ -204,6 +214,7 @@ log_wakeup(const char *where, const struct pollfd *pollfd, int timeout) void poll_block(void) { +struct poll_loop *loop = poll_loop(); int elapsed; int retval; @@ -211,44 +222,62 @@ poll_block(void) * poll_block. */ fatal_signal_wait(); -if (timeout_when == LLONG_MIN) { +if (loop->timeout_when == LLONG_MIN) { COVERAGE_INC(poll_zero_timeout); } -retval = time_poll(pollfds, n_waiters, timeout_when, &elapsed); + +retval = time_poll(loop->pollfds, loop->n_waiters, + loop->timeout_when, &elapsed); if (retval < 0) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); VLOG_ERR_RL(&rl, "poll: %s", ovs_strerror(-retval)); } else if (!retval) { -log_wakeup(timeout_where, NULL, elapsed); +log_wakeup(loop->timeout_where, NULL, elapsed); } else if (get_cpu_usage() > 50 || VLOG_IS_DBG_ENABLED()) { size_t i; -for (i = 0; i < n_waiters; i++) { -if (pollfds[i].revents) { -log_wakeup(waiters[i].where, &pollfds[i], 0); +for (i = 0; i < loop->n_waiters; i++) { +if (loop->pollfds[i].revents) { +log_wakeup(loop->where[i], &loop->pollfds[i], 0); } } } -timeout_when = LLONG_MAX; -timeout_where = NULL; -n_waiters = 0; +loop->timeout_when = LLONG_MAX; +loop->timeout_where = NULL; +loop->n_waiters = 0; /* Handle any pending signals before doing any
[ovs-dev] [threads v2 02/13] netlink-socket: Simplify use of transactions and dumps.
This disentangles "struct nl_dump" from "struct nl_sock", clearing the way to make the use of either one thread-safe in an obviously correct manner. Signed-off-by: Ben Pfaff --- lib/dpif-linux.c | 20 ++--- lib/netdev-linux.c | 18 + lib/netlink-socket.c | 201 +++-- lib/netlink-socket.h |9 ++- lib/route-table.c| 26 +-- 5 files changed, 115 insertions(+), 159 deletions(-) diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c index 804a90f..958873c 100644 --- a/lib/dpif-linux.c +++ b/lib/dpif-linux.c @@ -162,7 +162,6 @@ static int ovs_flow_family; static int ovs_packet_family; /* Generic Netlink socket. */ -static struct nl_sock *genl_sock; static struct nln *nln = NULL; static int dpif_linux_init(void); @@ -692,7 +691,7 @@ dpif_linux_port_dump_start(const struct dpif *dpif_, void **statep) buf = ofpbuf_new(1024); dpif_linux_vport_to_ofpbuf(&request, buf); -nl_dump_start(&state->dump, genl_sock, buf); +nl_dump_start(&state->dump, NETLINK_GENERIC, buf); ofpbuf_delete(buf); return 0; @@ -898,7 +897,7 @@ dpif_linux_flow_dump_start(const struct dpif *dpif_, void **statep) buf = ofpbuf_new(1024); dpif_linux_flow_to_ofpbuf(&request, buf); -nl_dump_start(&state->dump, genl_sock, buf); +nl_dump_start(&state->dump, NETLINK_GENERIC, buf); ofpbuf_delete(buf); state->buf = NULL; @@ -1005,7 +1004,7 @@ dpif_linux_execute__(int dp_ifindex, const struct dpif_execute *execute) ofpbuf_use_stub(&request, request_stub, sizeof request_stub); dpif_linux_encode_execute(dp_ifindex, execute, &request); -error = nl_sock_transact(genl_sock, &request, NULL); +error = nl_transact(NETLINK_GENERIC, &request, NULL); ofpbuf_uninit(&request); return error; @@ -1090,7 +1089,7 @@ dpif_linux_operate__(struct dpif *dpif_, struct dpif_op **ops, size_t n_ops) for (i = 0; i < n_ops; i++) { txnsp[i] = &auxes[i].txn; } -nl_sock_transact_multiple(genl_sock, txnsp, n_ops); +nl_transact_multiple(NETLINK_GENERIC, txnsp, n_ops); for (i = 0; i < n_ops; i++) { struct op_auxdata *aux = &auxes[i]; @@ -1464,9 +1463,6 @@ dpif_linux_init(void) &ovs_packet_family); } if (!error) { -error = nl_sock_create(NETLINK_GENERIC, &genl_sock); -} -if (!error) { error = nl_lookup_genl_mcgroup(OVS_VPORT_FAMILY, OVS_VPORT_MCGROUP, &ovs_vport_mcgroup, OVS_VPORT_MCGROUP_FALLBACK_ID); @@ -1659,7 +1655,7 @@ dpif_linux_vport_transact(const struct dpif_linux_vport *request, request_buf = ofpbuf_new(1024); dpif_linux_vport_to_ofpbuf(request, request_buf); -error = nl_sock_transact(genl_sock, request_buf, bufp); +error = nl_transact(NETLINK_GENERIC, request_buf, bufp); ofpbuf_delete(request_buf); if (reply) { @@ -1780,7 +1776,7 @@ dpif_linux_dp_dump_start(struct nl_dump *dump) buf = ofpbuf_new(1024); dpif_linux_dp_to_ofpbuf(&request, buf); -nl_dump_start(dump, genl_sock, buf); +nl_dump_start(dump, NETLINK_GENERIC, buf); ofpbuf_delete(buf); } @@ -1801,7 +1797,7 @@ dpif_linux_dp_transact(const struct dpif_linux_dp *request, request_buf = ofpbuf_new(1024); dpif_linux_dp_to_ofpbuf(request, request_buf); -error = nl_sock_transact(genl_sock, request_buf, bufp); +error = nl_transact(NETLINK_GENERIC, request_buf, bufp); ofpbuf_delete(request_buf); if (reply) { @@ -1965,7 +1961,7 @@ dpif_linux_flow_transact(struct dpif_linux_flow *request, request_buf = ofpbuf_new(1024); dpif_linux_flow_to_ofpbuf(request, request_buf); -error = nl_sock_transact(genl_sock, request_buf, bufp); +error = nl_transact(NETLINK_GENERIC, request_buf, bufp); ofpbuf_delete(request_buf); if (reply) { diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c index 8790f14..197e51d 100644 --- a/lib/netdev-linux.c +++ b/lib/netdev-linux.c @@ -409,9 +409,6 @@ static const struct netdev_rx_class netdev_rx_linux_class; /* Sockets used for ioctl operations. */ static int af_inet_sock = -1; /* AF_INET, SOCK_DGRAM. */ -/* A Netlink routing socket that is not subscribed to any multicast groups. */ -static struct nl_sock *rtnl_sock; - /* This is set pretty low because we probably won't learn anything from the * additional log messages. */ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20); @@ -477,15 +474,6 @@ netdev_linux_init(void) if (status) { VLOG_ERR("failed to create inet socket: %s", ovs_strerror(status)); } - -/* Create rtnetlink socket. */ -if (!status) { -status = nl_sock_create(NETLINK_ROUTE, &rtnl_sock); -if (status) { -VLOG_ERR_RL(&rl, "failed to create rtnetlink socket: %s", -
[ovs-dev] [threads v2 07/13] system-stats: Move into separate thread.
Signed-off-by: Ben Pfaff --- vswitchd/system-stats.c | 105 +- 1 files changed, 84 insertions(+), 21 deletions(-) diff --git a/vswitchd/system-stats.c b/vswitchd/system-stats.c index e7c1d73..ed63899 100644 --- a/vswitchd/system-stats.c +++ b/vswitchd/system-stats.c @@ -35,7 +35,9 @@ #include "dirs.h" #include "dynamic-string.h" #include "json.h" +#include "latch.h" #include "ofpbuf.h" +#include "ovs-thread.h" #include "poll-loop.h" #include "shash.h" #include "smap.h" @@ -504,17 +506,32 @@ get_filesys_stats(struct smap *stats OVS_UNUSED) #define SYSTEM_STATS_INTERVAL (5 * 1000) /* In milliseconds. */ -/* The next time to wake up, or LLONG_MAX if stats are disabled. */ -static long long int next_refresh = LLONG_MAX; +static pthread_mutex_t mutex = PTHREAD_ADAPTIVE_MUTEX_INITIALIZER; +static pthread_cond_t cond = PTHREAD_COND_INITIALIZER; +static struct latch latch; +static bool enabled; +static bool started; +static struct smap *system_stats; + +static void *system_stats_thread_func(void *); +static void discard_stats(void); /* Enables or disables system stats collection, according to 'enable'. */ void system_stats_enable(bool enable) { -if (!enable) { -next_refresh = LLONG_MAX; -} else if (next_refresh == LLONG_MAX) { -next_refresh = time_msec(); +if (enabled != enable) { +xpthread_mutex_lock(&mutex); +if (enable) { +if (!started) { +xpthread_create(NULL, NULL, system_stats_thread_func, NULL); +latch_init(&latch); +} +discard_stats(); +xpthread_cond_signal(&cond); +} +enabled = enable; +xpthread_mutex_unlock(&mutex); } } @@ -529,23 +546,22 @@ system_stats_enable(bool enable) struct smap * system_stats_run(void) { -if (time_msec() >= next_refresh) { -struct smap *stats; +struct smap *stats = NULL; -stats = xmalloc(sizeof *stats); -smap_init(stats); -get_cpu_cores(stats); -get_load_average(stats); -get_memory_stats(stats); -get_process_stats(stats); -get_filesys_stats(stats); - -next_refresh = time_msec() + SYSTEM_STATS_INTERVAL; +xpthread_mutex_lock(&mutex); +if (system_stats) { +latch_poll(&latch); -return stats; +if (enabled) { +stats = system_stats; +system_stats = NULL; +} else { +discard_stats(); +} } +xpthread_mutex_unlock(&mutex); -return NULL; +return stats; } /* Causes poll_block() to wake up when system_stats_run() needs to be @@ -553,7 +569,54 @@ system_stats_run(void) void system_stats_wait(void) { -if (next_refresh != LLONG_MAX) { -poll_timer_wait_until(next_refresh); +if (enabled) { +latch_wait(&latch); +} +} + +static void +discard_stats(void) +{ +if (system_stats) { +smap_destroy(system_stats); +free(system_stats); +system_stats = NULL; +} +} + +static void * NO_RETURN +system_stats_thread_func(void *arg OVS_UNUSED) +{ +pthread_detach(pthread_self()); + +for (;;) { +long long int next_refresh; +struct smap *stats; + +xpthread_mutex_lock(&mutex); +while (!enabled) { +xpthread_cond_wait(&cond, &mutex); +} +xpthread_mutex_unlock(&mutex); + +stats = xmalloc(sizeof *stats); +smap_init(stats); +get_cpu_cores(stats); +get_load_average(stats); +get_memory_stats(stats); +get_process_stats(stats); +get_filesys_stats(stats); + +xpthread_mutex_lock(&mutex); +discard_stats(); +system_stats = stats; +latch_set(&latch); +xpthread_mutex_unlock(&mutex); + +next_refresh = time_msec() + SYSTEM_STATS_INTERVAL; +do { +poll_timer_wait_until(next_refresh); +poll_block(); +} while (time_msec() < next_refresh); } } -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [threads v2 08/13] byteq: Make the queue size variable instead of fixed at BYTEQ_SIZE bytes.
Signed-off-by: Ben Pfaff --- lib/byteq.c | 26 +++--- lib/byteq.h | 10 -- lib/jsonrpc.c |5 +++-- 3 files changed, 22 insertions(+), 19 deletions(-) diff --git a/lib/byteq.c b/lib/byteq.c index 2ee4a65..3f865cf 100644 --- a/lib/byteq.c +++ b/lib/byteq.c @@ -1,4 +1,4 @@ -/* Copyright (c) 2008, 2009, 2012 Nicira, Inc. +/* Copyright (c) 2008, 2009, 2012, 2013 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -20,13 +20,17 @@ #include #include "util.h" -/* The queue size must be a power of 2. */ -BUILD_ASSERT_DECL(!(BYTEQ_SIZE & (BYTEQ_SIZE - 1))); - -/* Initializes 'q' as empty. */ +/* Initializes 'q' as an empty byteq that uses the 'size' bytes of 'buffer' to + * store data. 'size' must be a power of 2. + * + * The caller must ensure that 'buffer' remains available to the byteq as long + * as 'q' is in use. */ void -byteq_init(struct byteq *q) +byteq_init(struct byteq *q, uint8_t *buffer, size_t size) { +ovs_assert(is_pow2(size)); +q->buffer = buffer; +q->size = size; q->head = q->tail = 0; } @@ -41,7 +45,7 @@ byteq_used(const struct byteq *q) int byteq_avail(const struct byteq *q) { -return BYTEQ_SIZE - byteq_used(q); +return q->size - byteq_used(q); } /* Returns true if no bytes are queued in 'q', @@ -147,7 +151,7 @@ int byteq_tailroom(const struct byteq *q) { int used = byteq_used(q); -int tail_to_end = BYTEQ_SIZE - (q->tail & (BYTEQ_SIZE - 1)); +int tail_to_end = q->size - (q->tail & (q->size - 1)); return MIN(used, tail_to_end); } @@ -156,7 +160,7 @@ byteq_tailroom(const struct byteq *q) const uint8_t * byteq_tail(const struct byteq *q) { -return &q->buffer[q->tail & (BYTEQ_SIZE - 1)]; +return &q->buffer[q->tail & (q->size - 1)]; } /* Removes 'n' bytes from the tail of 'q', which must have at least 'n' bytes @@ -173,7 +177,7 @@ byteq_advance_tail(struct byteq *q, unsigned int n) uint8_t * byteq_head(struct byteq *q) { -return &q->buffer[q->head & (BYTEQ_SIZE - 1)]; +return &q->buffer[q->head & (q->size - 1)]; } /* Returns the number of contiguous bytes of free space starting at the head @@ -182,7 +186,7 @@ int byteq_headroom(const struct byteq *q) { int avail = byteq_avail(q); -int head_to_end = BYTEQ_SIZE - (q->head & (BYTEQ_SIZE - 1)); +int head_to_end = q->size - (q->head & (q->size - 1)); return MIN(avail, head_to_end); } diff --git a/lib/byteq.h b/lib/byteq.h index 5fa51fd..d73e368 100644 --- a/lib/byteq.h +++ b/lib/byteq.h @@ -1,4 +1,4 @@ -/* Copyright (c) 2008, 2009 Nicira, Inc. +/* Copyright (c) 2008, 2009, 2013 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -20,17 +20,15 @@ #include #include -/* Maximum number of bytes in a byteq. */ -#define BYTEQ_SIZE 512 - /* General-purpose circular queue of bytes. */ struct byteq { -uint8_t buffer[BYTEQ_SIZE]; /* Circular queue. */ +uint8_t *buffer;/* Circular queue. */ +unsigned int size; /* Number of bytes allocated for 'buffer'. */ unsigned int head; /* Head of queue. */ unsigned int tail; /* Chases the head. */ }; -void byteq_init(struct byteq *); +void byteq_init(struct byteq *, uint8_t *buffer, size_t size); int byteq_used(const struct byteq *); int byteq_avail(const struct byteq *); bool byteq_is_empty(const struct byteq *); diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c index 56b4cce..b4bbc84 100644 --- a/lib/jsonrpc.c +++ b/lib/jsonrpc.c @@ -41,6 +41,7 @@ struct jsonrpc { /* Input. */ struct byteq input; +uint8_t input_buffer[512]; struct json_parser *parser; struct jsonrpc_msg *received; @@ -87,7 +88,7 @@ jsonrpc_open(struct stream *stream) rpc = xzalloc(sizeof *rpc); rpc->name = xstrdup(stream_get_name(stream)); rpc->stream = stream; -byteq_init(&rpc->input); +byteq_init(&rpc->input, rpc->input_buffer, sizeof rpc->input_buffer); list_init(&rpc->output); return rpc; @@ -330,7 +331,7 @@ jsonrpc_recv(struct jsonrpc *rpc, struct jsonrpc_msg **msgp) jsonrpc_received(rpc); if (rpc->status) { const struct byteq *q = &rpc->input; -if (q->head <= BYTEQ_SIZE) { +if (q->head <= q->size) { stream_report_content(q->buffer, q->head, STREAM_JSONRPC, THIS_MODULE, rpc->name); -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [threads v2 10/13] async-append: New library to allow asynchronous appending to a log file.
This will be hooked into the vlog library in an upcoming commit. Signed-off-by: Ben Pfaff --- configure.ac|1 + lib/async-append-aio.c | 178 +++ lib/async-append-sync.c | 62 lib/async-append.h | 67 ++ lib/automake.mk |7 ++ m4/openvswitch.m4 |5 ++ 6 files changed, 320 insertions(+), 0 deletions(-) create mode 100644 lib/async-append-aio.c create mode 100644 lib/async-append-sync.c create mode 100644 lib/async-append.h diff --git a/configure.ac b/configure.ac index e4f9991..2f3e474 100644 --- a/configure.ac +++ b/configure.ac @@ -86,6 +86,7 @@ OVS_CHECK_ATOMIC_ALWAYS_LOCK_FREE(1) OVS_CHECK_ATOMIC_ALWAYS_LOCK_FREE(2) OVS_CHECK_ATOMIC_ALWAYS_LOCK_FREE(4) OVS_CHECK_ATOMIC_ALWAYS_LOCK_FREE(8) +OVS_CHECK_POSIX_AIO OVS_ENABLE_OPTION([-Wall]) OVS_ENABLE_OPTION([-Wno-sign-compare]) diff --git a/lib/async-append-aio.c b/lib/async-append-aio.c new file mode 100644 index 000..48edc38 --- /dev/null +++ b/lib/async-append-aio.c @@ -0,0 +1,178 @@ +/* Copyright (c) 2013 Nicira, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include + +/* This implementation of the async-append.h interface uses the POSIX + * asynchronous I/O interface. */ + +#include "async-append.h" + +#include +#include +#include +#include + +#include "byteq.h" +#include "ovs-thread.h" +#include "util.h" + +/* Maximum number of bytes of buffered data. */ +enum { BUFFER_SIZE = 65536 }; + +/* Maximum number of aiocbs to use. + * + * aiocbs are big (144 bytes with glibc 2.11 on i386) so we try to allow for a + * reasonable number by basing the number we allocate on the amount of buffer + * space. */ +enum { MAX_CBS = ROUND_DOWN_POW2(BUFFER_SIZE / sizeof(struct aiocb)) }; +BUILD_ASSERT_DECL(IS_POW2(MAX_CBS)); + +struct async_append { +int fd; + +struct aiocb *aiocbs; +unsigned int aiocb_head, aiocb_tail; + +uint8_t *buffer; +struct byteq byteq; +}; + +static bool async_append_enabled; + +void +async_append_enable(void) +{ +assert_single_threaded(); +forbid_forking("async i/o enabled"); +async_append_enabled = true; +} + +struct async_append * +async_append_create(int fd) +{ +struct async_append *ap; + +ap = xmalloc(sizeof *ap); +ap->fd = fd; +ap->aiocbs = xmalloc(MAX_CBS * sizeof *ap->aiocbs); +ap->aiocb_head = ap->aiocb_tail = 0; +ap->buffer = xmalloc(BUFFER_SIZE); +byteq_init(&ap->byteq, ap->buffer, BUFFER_SIZE); + +return ap; +} + +void +async_append_destroy(struct async_append *ap) +{ +if (ap) { +async_append_flush(ap); +free(ap->aiocbs); +free(ap->buffer); +free(ap); +} +} + +static bool +async_append_is_full(const struct async_append *ap) +{ +return (ap->aiocb_head - ap->aiocb_tail >= MAX_CBS +|| byteq_is_full(&ap->byteq)); +} + +static bool +async_append_is_empty(const struct async_append *ap) +{ +return byteq_is_empty(&ap->byteq); +} + +static void +async_append_wait(struct async_append *ap) +{ +int n = 0; + +while (!async_append_is_empty(ap)) { +struct aiocb *aiocb = &ap->aiocbs[ap->aiocb_tail & (MAX_CBS - 1)]; +int error = aio_error(aiocb); + +if (error == EINPROGRESS) { +const struct aiocb *p = aiocb; +if (n > 0) { +return; +} +aio_suspend(&p, 1, NULL); +} else { +ignore(aio_return(aiocb)); +ap->aiocb_tail++; +byteq_advance_tail(&ap->byteq, aiocb->aio_nbytes); +n++; +} +} +} + +void +async_append_write(struct async_append *ap, const void *data_, size_t size) +{ +const uint8_t *data = data_; + +if (!async_append_enabled) { +ignore(write(ap->fd, data, size)); +return; +} + +while (size > 0) { +struct aiocb *aiocb; +size_t chunk_size; +void *chunk; + +while (async_append_is_full(ap)) { +async_append_wait(ap); +} + +chunk = byteq_head(&ap->byteq); +chunk_size = byteq_headroom(&ap->byteq); +if (chunk_size > size) { +chunk_size = size; +} +memcpy(chunk, data, chunk_size); + +aiocb = &ap->aiocbs[ap->aiocb_head & (MAX_CBS - 1)]; +memset(aiocb, 0, sizeof *aiocb); +aiocb->aio_fildes = ap->fd; +aiocb->aio_offset = 0; +aiocb-
[ovs-dev] [threads v2 09/13] util: New macros ROUND_UP_POW2, ROUND_DOWN_POW2.
Signed-off-by: Ben Pfaff --- lib/util.h| 19 +++ tests/library.at |2 + tests/test-util.c | 65 + 3 files changed, 86 insertions(+), 0 deletions(-) diff --git a/lib/util.h b/lib/util.h index ae8bfd7..2159594 100644 --- a/lib/util.h +++ b/lib/util.h @@ -108,6 +108,25 @@ is_pow2(uintmax_t x) return IS_POW2(x); } +/* Returns X rounded up to a power of 2. X must be a constant expression. */ +#define ROUND_UP_POW2(X) RUP2__(X) +#define RUP2__(X) (RUP2_1(X) + 1) +#define RUP2_1(X) (RUP2_2(X) | (RUP2_2(X) >> 16)) +#define RUP2_2(X) (RUP2_3(X) | (RUP2_3(X) >> 8)) +#define RUP2_3(X) (RUP2_4(X) | (RUP2_4(X) >> 4)) +#define RUP2_4(X) (RUP2_5(X) | (RUP2_5(X) >> 2)) +#define RUP2_5(X) (RUP2_6(X) | (RUP2_6(X) >> 1)) +#define RUP2_6(X) ((X) - 1) + +/* Returns X rounded down to a power of 2. X must be a constant expression. */ +#define ROUND_DOWN_POW2(X) RDP2__(X) +#define RDP2__(X) (RDP2_1(X) - (RDP2_1(X) >> 1)) +#define RDP2_1(X) (RDP2_2(X) | (RDP2_2(X) >> 16)) +#define RDP2_2(X) (RDP2_3(X) | (RDP2_3(X) >> 8)) +#define RDP2_3(X) (RDP2_4(X) | (RDP2_4(X) >> 4)) +#define RDP2_4(X) (RDP2_5(X) | (RDP2_5(X) >> 2)) +#define RDP2_5(X) ( (X) | ( (X) >> 1)) + #ifndef MIN #define MIN(X, Y) ((X) < (Y) ? (X) : (Y)) #endif diff --git a/tests/library.at b/tests/library.at index f84a55b..92ec49b 100644 --- a/tests/library.at +++ b/tests/library.at @@ -112,6 +112,8 @@ AT_CLEANUP m4_foreach( [testname], [[ctz], + [round_up_pow2], + [round_down_pow2], [popcount], [log_2_floor], [bitwise_copy], diff --git a/tests/test-util.c b/tests/test-util.c index 5afe2f7..6bf8ed5 100644 --- a/tests/test-util.c +++ b/tests/test-util.c @@ -90,6 +90,69 @@ test_ctz(int argc OVS_UNUSED, char *argv[] OVS_UNUSED) check_ctz(0, 32); } +/* Returns a random number in the range 'min'...'max' inclusive. */ +static uint32_t +random_in_range(uint32_t min, uint32_t max) +{ +return min == max ? min : min + random_range(max - min + 1); +} + +static void +check_rup2(uint32_t x, int n) +{ +uint32_t rup2 = ROUND_UP_POW2(x); +if (rup2 != n) { +fprintf(stderr, "ROUND_UP_POW2(%#"PRIx32") is %#"PRIx32" " +"but should be %#"PRIx32"\n", x, rup2, n); +abort(); +} +} + +static void +test_round_up_pow2(int argc OVS_UNUSED, char *argv[] OVS_UNUSED) +{ +int n; + +for (n = 0; n < 32; n++) { +/* Min, max value for which ROUND_UP_POW2 should yield (1 << n). */ +uint32_t min = ((1u << n) >> 1) + 1; +uint32_t max = 1u << n; + +check_rup2(min, 1u << n); +check_rup2(max, 1u << n); +check_rup2(random_in_range(min, max), 1u << n); +} +check_rup2(0, 0); +} + +static void +check_rdp2(uint32_t x, int n) +{ +uint32_t rdp2 = ROUND_DOWN_POW2(x); +if (rdp2 != n) { +fprintf(stderr, "ROUND_DOWN_POW2(%#"PRIx32") is %#"PRIx32" " +"but should be %#"PRIx32"\n", x, rdp2, n); +abort(); +} +} + +static void +test_round_down_pow2(int argc OVS_UNUSED, char *argv[] OVS_UNUSED) +{ +int n; + +for (n = 0; n < 32; n++) { +/* Min, max value for which ROUND_DOWN_POW2 should yield (1 << n). */ +uint32_t min = 1u << n; +uint32_t max = ((1u << n) << 1) - 1; + +check_rdp2(min, 1u << n); +check_rdp2(max, 1u << n); +check_rdp2(random_in_range(min, max), 1u << n); +} +check_rdp2(0, 0); +} + static void shuffle(unsigned int *p, size_t n) { @@ -346,6 +409,8 @@ test_assert(int argc OVS_UNUSED, char *argv[] OVS_UNUSED) static const struct command commands[] = { {"ctz", 0, 0, test_ctz}, +{"round_up_pow2", 0, 0, test_round_up_pow2}, +{"round_down_pow2", 0, 0, test_round_down_pow2}, {"popcount", 0, 0, test_popcount}, {"log_2_floor", 0, 0, test_log_2_floor}, {"bitwise_copy", 0, 0, test_bitwise_copy}, -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [threads v2 13/13] vlog: Make thread-safe.
Signed-off-by: Ben Pfaff --- lib/vlog.c | 182 +++- lib/vlog.h |9 +++ 2 files changed, 116 insertions(+), 75 deletions(-) diff --git a/lib/vlog.c b/lib/vlog.c index d134c29..074225e 100644 --- a/lib/vlog.c +++ b/lib/vlog.c @@ -33,6 +33,7 @@ #include "dirs.h" #include "dynamic-string.h" #include "ofpbuf.h" +#include "ovs-thread.h" #include "sat-math.h" #include "svec.h" #include "timeval.h" @@ -84,7 +85,7 @@ struct vlog_module *vlog_modules[] = { /* Information about each facility. */ struct facility { const char *name; /* Name. */ -char *pattern; /* Current pattern. */ +char *pattern; /* Current pattern (see 'pattern_rwlock'). */ bool default_pattern; /* Whether current pattern is the default. */ }; static struct facility facilities[VLF_N_FACILITIES] = { @@ -93,18 +94,27 @@ static struct facility facilities[VLF_N_FACILITIES] = { #undef VLOG_FACILITY }; -/* VLF_FILE configuration. */ +/* Protects the 'pattern' in all "struct facility"s, so that a race between + * changing and reading the pattern does not cause an access to freed + * memory. */ +static pthread_rwlock_t pattern_rwlock = PTHREAD_RWLOCK_INITIALIZER; + +/* Sequence number for the message currently being composed. */ +DEFINE_PER_THREAD_DATA(unsigned int, msg_num, 0); + +/* VLF_FILE configuration. + * + * All of the following is protected by 'log_file_mutex', which nests inside + * pattern_rwlock. */ +static pthread_mutex_t log_file_mutex = PTHREAD_ADAPTIVE_MUTEX_INITIALIZER; static char *log_file_name; static int log_fd = -1; static struct async_append *log_writer; -/* vlog initialized? */ -static bool vlog_inited; - static void format_log_message(const struct vlog_module *, enum vlog_level, - enum vlog_facility, unsigned int msg_num, + enum vlog_facility, const char *message, va_list, struct ds *) -PRINTF_FORMAT(5, 0); +PRINTF_FORMAT(4, 0); /* Searches the 'n_names' in 'names'. Returns the index of a match for * 'target', or 'n_names' if no name matches. */ @@ -191,7 +201,7 @@ vlog_get_level(const struct vlog_module *module, enum vlog_facility facility) return module->levels[facility]; } -static void +static void OVS_MUST_HOLD(log_file_mutex) update_min_level(struct vlog_module *module) { enum vlog_facility facility; @@ -214,6 +224,7 @@ set_facility_level(enum vlog_facility facility, struct vlog_module *module, assert(facility >= 0 && facility < VLF_N_FACILITIES); assert(level < VLL_N_LEVELS); +xpthread_mutex_lock(&log_file_mutex); if (!module) { struct vlog_module **mp; @@ -225,6 +236,7 @@ set_facility_level(enum vlog_facility facility, struct vlog_module *module, module->levels[facility] = level; update_min_level(module); } +xpthread_mutex_unlock(&log_file_mutex); } /* Sets the logging level for the given 'module' and 'facility' to 'level'. A @@ -248,12 +260,15 @@ static void do_set_pattern(enum vlog_facility facility, const char *pattern) { struct facility *f = &facilities[facility]; + +xpthread_rwlock_wrlock(&pattern_rwlock); if (!f->default_pattern) { free(f->pattern); } else { f->default_pattern = false; } f->pattern = xstrdup(pattern); +xpthread_rwlock_unlock(&pattern_rwlock); } /* Sets the pattern for the given 'facility' to 'pattern'. */ @@ -276,49 +291,67 @@ vlog_set_pattern(enum vlog_facility facility, const char *pattern) int vlog_set_log_file(const char *file_name) { -char *old_log_file_name; +char *new_log_file_name; struct vlog_module **mp; -int error; +struct stat old_stat; +struct stat new_stat; +int new_log_fd; +bool same_file; + +/* Open new log file. */ +new_log_file_name = (file_name + ? xstrdup(file_name) + : xasprintf("%s/%s.log", ovs_logdir(), program_name)); +new_log_fd = open(new_log_file_name, O_WRONLY | O_CREAT | O_APPEND, 0666); +if (new_log_fd < 0) { +VLOG_WARN("failed to open %s for logging: %s", + new_log_file_name, ovs_strerror(errno)); +free(new_log_file_name); +return errno; +} + +/* If the new log file is the same one we already have open, bail out. */ +xpthread_mutex_lock(&log_file_mutex); +same_file = (log_fd >= 0 + && new_log_fd >= 0 + && !fstat(log_fd, &old_stat) + && !fstat(new_log_fd, &new_stat) + && old_stat.st_dev == new_stat.st_dev + && old_stat.st_ino == new_stat.st_ino); +xpthread_mutex_unlock(&log_file_mutex); +if (same_file) { +close(new_log_fd); +free(new_log_file_name); +return 0; +} -/* Close old log file. */ +/* Log closing old log
[ovs-dev] [threads v2 11/13] vlog: Use async I/O.
Signed-off-by: Ben Pfaff --- lib/vlog.c| 12 +++- vswitchd/bridge.c |3 +++ 2 files changed, 14 insertions(+), 1 deletions(-) diff --git a/lib/vlog.c b/lib/vlog.c index ded434f..d134c29 100644 --- a/lib/vlog.c +++ b/lib/vlog.c @@ -28,6 +28,7 @@ #include #include #include +#include "async-append.h" #include "coverage.h" #include "dirs.h" #include "dynamic-string.h" @@ -95,6 +96,7 @@ static struct facility facilities[VLF_N_FACILITIES] = { /* VLF_FILE configuration. */ static char *log_file_name; static int log_fd = -1; +static struct async_append *log_writer; /* vlog initialized? */ static bool vlog_inited; @@ -281,6 +283,10 @@ vlog_set_log_file(const char *file_name) /* Close old log file. */ if (log_fd >= 0) { VLOG_INFO("closing log file"); + +async_append_destroy(log_writer); +log_writer = NULL; + close(log_fd); log_fd = -1; } @@ -307,6 +313,7 @@ vlog_set_log_file(const char *file_name) log_file_name, ovs_strerror(errno)); error = errno; } else { +log_writer = async_append_create(log_fd); VLOG_INFO("opened log file %s", log_file_name); error = 0; } @@ -786,7 +793,10 @@ vlog_valist(const struct vlog_module *module, enum vlog_level level, format_log_message(module, level, VLF_FILE, msg_num, message, args, &s); ds_put_char(&s, '\n'); -ignore(write(log_fd, s.string, s.length)); +async_append_write(log_writer, s.string, s.length); +if (level == VLL_EMER) { +async_append_flush(log_writer); +} } ds_destroy(&s); diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c index 686eb93..a73379c 100644 --- a/vswitchd/bridge.c +++ b/vswitchd/bridge.c @@ -18,6 +18,7 @@ #include #include #include +#include "async-append.h" #include "bfd.h" #include "bitmap.h" #include "bond.h" @@ -2444,6 +2445,8 @@ bridge_run(void) * process that forked us to exit successfully. */ daemonize_complete(); +async_append_enable(); + VLOG_INFO_ONCE("%s (Open vSwitch) %s", program_name, VERSION); } } -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [threads v2 12/13] vlog: Make VLOG_ONCE thread-safe.
Of course, the code it calls into isn't thread-safe itself yet. Signed-off-by: Ben Pfaff --- lib/vlog.h | 14 +++--- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/lib/vlog.h b/lib/vlog.h index 3466d96..eeec5fc 100644 --- a/lib/vlog.h +++ b/lib/vlog.h @@ -231,13 +231,13 @@ void vlog_usage(void); vlog_rate_limit(THIS_MODULE, level__, RL, __VA_ARGS__); \ } \ } while (0) -#define VLOG_ONCE(LEVEL, ...) \ -do {\ -static bool already_logged; \ -if (!already_logged) { \ -already_logged = true; \ -vlog(THIS_MODULE, LEVEL, __VA_ARGS__); \ -} \ +#define VLOG_ONCE(LEVEL, ...) \ +do {\ +static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; \ +if (ovsthread_once_start(&once)) { \ +vlog(THIS_MODULE, LEVEL, __VA_ARGS__); \ +ovsthread_once_done(&once); \ +} \ } while (0) #define VLOG_DEFINE_MODULE__(MODULE)\ -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] bfd: Implements BFD decay
Thanks Ethan, for the comments, On Fri, Jul 12, 2013 at 2:15 PM, Ethan Jackson wrote: > In bfd_run() the final if statement should run whether or not decay is > enabled. It will have to become a bit more sophisticated, but > assuming my suggestion in the previous patch works, it should be > straight forward. > > With the above two comments implemented the bfd_decay function should > become quite a bit simpler. For example, we shouldn't be manually > mucking with the detect_time. The current min_rx changing code should > handle that. This makes sense. Modifying the bfd_poll() will make life easier for sure. > The port_modified() function in ofproto-dpif.c needs to update the "struct bfd"s netdev like it does for cfm. Otherwise you'll run into > some subtle bugs. > Thanks for this reminder! > Minor stuff: > The subject line of the commit message should have a period. It > should also change "Implements" to "Implement". > Please change the commit message to refer to "interfaces" instead of > "tunnels". BFD can run on anything. > > This prexisted your patch, but in bfd_configure() the 'cfg' argument > should be on the previous line. > > Instead of directly holding a pointer to 'netdev', please take a > reference of it using netdev_ref(). See ofproto-dpif-xlate for an > example. > Thanks for pointing these out. > We can probably get rid of decay_enabled flag altogher and just use > the decay_detect_time. If it's 0, we'll say the feature is disabled. Makes sense, I'll adjust accordingly > In bfd_rx_packets() why would bfd->netdev be null? > Yes, it is an unnecessary check. > In bfd.h instead of including netdev.h you can simply declare struct > netdev. cfm.h has an example. > Thanks pointing this out. On Fri, Jul 12, 2013 at 2:15 PM, Ethan Jackson wrote: > Thanks for getting this together. Comments below. > > Major stuff: > > When anything changes about the configuration of the decay feature, > the decay_detect_time needs to be reset and we should switch back to > the original min_rx. To be safe, I'd also reset it when the min_rx > changes. > > I wouldn't bother with a default decay_min_rx value. If someone fails > to set the appropriate value (something >= min_rx) I would simply warn > and disable the feature. > > Changing the min_rx is actually quite a bit more complicated than what > you've implemented here. It's insufficient to just set the value, > instead we have to tell the other end of the BFD session we're going > to do so, before we actually pull the trigger. We do this by by > initiating a poll sequence. And setting the bfd->poll_min_rx the the > new value. I think we can pretty cleanly achieve the correct behavior > by (1) initiating a poll in bfd_run() if we should decay, but the > min_rx does not equal the decay_min_rx. And (2) in bfd_poll() setting > poll_min_rx to decay_min_rx if decay is enabled and we should decay. > > In bfd_run() the final if statement should run whether or not decay is > enabled. It will have to become a bit more sophisticated, but > assuming my suggestion in the previous patch works, it should be > straight forward. > > With the above two comments implemented the bfd_decay function should > become quite a bit simpler. For example, we shouldn't be manually > mucking with the detect_time. The current min_rx changing code should > handle that. > > The port_modified() function in ofproto-dpif.c needs to update the > "struct bfd"s netdev like it does for cfm. Otherwise you'll run into > some subtle bugs. > > Minor stuff: > The subject line of the commit message should have a period. It > should also change "Implements" to "Implement". > Please change the commit message to refer to "interfaces" instead of > "tunnels". BFD can run on anything. > > This prexisted your patch, but in bfd_configure() the 'cfg' argument > should be on the previous line. > > Instead of directly holding a pointer to 'netdev', please take a > reference of it using netdev_ref(). See ofproto-dpif-xlate for an > example. > > We can probably get rid of decay_enabled flag altogher and just use > the decay_detect_time. If it's 0, we'll say the feature is disabled. > > In bfd_rx_packets() why would bfd->netdev be null? > > In bfd.h instead of including netdev.h you can simply declare struct > netdev. cfm.h has an example. > > Ethan > > > On Thu, Jul 11, 2013 at 6:39 PM, Alex Wang wrote: > > When there is no incoming data traffic in the tunnel for a period, > > BFD decay allows the bfd session to increase the min_rx. This is > > helpful in that usually some tunnels are rarely used. And cpu > > consumption can be reduced by processing fewer bfd control packets. > > > > Signed-off-by: Alex Wang > > --- > > lib/bfd.c | 89 > ++-- > > lib/bfd.h |5 ++- > > ofproto/ofproto-dpif.c |3 +- > > vswitchd/vswitch.xml | 17 + > > 4 files changed, 109 insertions(+), 5
[ovs-dev] [PATCH] datapath: Use RCU lock for dp dump operation.
RCUfy dp-dump operation which is already read-only. This makes all ovs dump operations lockless. Signed-off-by: Pravin B Shelar --- datapath/datapath.c |7 --- 1 files changed, 4 insertions(+), 3 deletions(-) diff --git a/datapath/datapath.c b/datapath/datapath.c index 2f02f71..09367cf 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -972,9 +972,10 @@ static struct genl_ops dp_packet_genl_ops[] = { static void get_dp_stats(struct datapath *dp, struct ovs_dp_stats *stats) { + struct flow_table *table; int i; - struct flow_table *table = ovsl_dereference(dp->table); + table = rcu_dereference_check(dp->table, lockdep_ovsl_is_held()); stats->n_flows = ovs_flow_tbl_count(table); stats->n_hit = stats->n_missed = stats->n_lost = 0; @@ -1902,7 +1903,7 @@ static int ovs_dp_cmd_dump(struct sk_buff *skb, struct netlink_callback *cb) int skip = cb->args[0]; int i = 0; - ovs_lock(); + rcu_read_lock(); list_for_each_entry(dp, &ovs_net->dps, list_node) { if (i >= skip && ovs_dp_cmd_fill_info(dp, skb, NETLINK_CB(cb->skb).portid, @@ -1911,7 +1912,7 @@ static int ovs_dp_cmd_dump(struct sk_buff *skb, struct netlink_callback *cb) break; i++; } - ovs_unlock(); + rcu_read_unlock(); cb->args[0] = i; -- 1.7.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads v2 07/13] system-stats: Move into separate thread.
On Fri, Jul 12, 2013 at 2:54 PM, Ben Pfaff wrote: > Signed-off-by: Ben Pfaff > --- > vswitchd/system-stats.c | 105 > +- > 1 files changed, 84 insertions(+), 21 deletions(-) > > diff --git a/vswitchd/system-stats.c b/vswitchd/system-stats.c > index e7c1d73..ed63899 100644 > --- a/vswitchd/system-stats.c > +++ b/vswitchd/system-stats.c > @@ -35,7 +35,9 @@ > #include "dirs.h" > #include "dynamic-string.h" > #include "json.h" > +#include "latch.h" > #include "ofpbuf.h" > +#include "ovs-thread.h" > #include "poll-loop.h" > #include "shash.h" > #include "smap.h" > @@ -504,17 +506,32 @@ get_filesys_stats(struct smap *stats OVS_UNUSED) > > #define SYSTEM_STATS_INTERVAL (5 * 1000) /* In milliseconds. */ > > -/* The next time to wake up, or LLONG_MAX if stats are disabled. */ > -static long long int next_refresh = LLONG_MAX; > +static pthread_mutex_t mutex = PTHREAD_ADAPTIVE_MUTEX_INITIALIZER; > +static pthread_cond_t cond = PTHREAD_COND_INITIALIZER; > +static struct latch latch; > +static bool enabled; > +static bool started; > +static struct smap *system_stats; > + > +static void *system_stats_thread_func(void *); > +static void discard_stats(void); > > /* Enables or disables system stats collection, according to 'enable'. */ > void > system_stats_enable(bool enable) > { > -if (!enable) { > -next_refresh = LLONG_MAX; > -} else if (next_refresh == LLONG_MAX) { > -next_refresh = time_msec(); > +if (enabled != enable) { > +xpthread_mutex_lock(&mutex); > +if (enable) { > +if (!started) { I do not see "started" getting changed anywhere else. > +xpthread_create(NULL, NULL, system_stats_thread_func, > NULL); > +latch_init(&latch); > +} > +discard_stats(); > +xpthread_cond_signal(&cond); > +} > +enabled = enable; > +xpthread_mutex_unlock(&mutex); > } > } > > @@ -529,23 +546,22 @@ system_stats_enable(bool enable) > struct smap * > system_stats_run(void) > { > -if (time_msec() >= next_refresh) { > -struct smap *stats; > +struct smap *stats = NULL; > > -stats = xmalloc(sizeof *stats); > -smap_init(stats); > -get_cpu_cores(stats); > -get_load_average(stats); > -get_memory_stats(stats); > -get_process_stats(stats); > -get_filesys_stats(stats); > - > -next_refresh = time_msec() + SYSTEM_STATS_INTERVAL; > +xpthread_mutex_lock(&mutex); > +if (system_stats) { > +latch_poll(&latch); > > -return stats; > +if (enabled) { > +stats = system_stats; > +system_stats = NULL; > +} else { > +discard_stats(); > +} > } > +xpthread_mutex_unlock(&mutex); > > -return NULL; > +return stats; > } > > /* Causes poll_block() to wake up when system_stats_run() needs to be > @@ -553,7 +569,54 @@ system_stats_run(void) > void > system_stats_wait(void) > { > -if (next_refresh != LLONG_MAX) { > -poll_timer_wait_until(next_refresh); > +if (enabled) { > +latch_wait(&latch); > +} > +} > + > +static void > +discard_stats(void) > +{ > +if (system_stats) { > +smap_destroy(system_stats); > +free(system_stats); > +system_stats = NULL; > +} > +} > + > +static void * NO_RETURN > +system_stats_thread_func(void *arg OVS_UNUSED) > +{ > +pthread_detach(pthread_self()); > + > +for (;;) { > +long long int next_refresh; > +struct smap *stats; > + > +xpthread_mutex_lock(&mutex); > +while (!enabled) { > +xpthread_cond_wait(&cond, &mutex); > +} > +xpthread_mutex_unlock(&mutex); > + > +stats = xmalloc(sizeof *stats); > +smap_init(stats); > +get_cpu_cores(stats); > +get_load_average(stats); > +get_memory_stats(stats); > +get_process_stats(stats); > +get_filesys_stats(stats); > + > +xpthread_mutex_lock(&mutex); > +discard_stats(); > +system_stats = stats; > +latch_set(&latch); > +xpthread_mutex_unlock(&mutex); > + > +next_refresh = time_msec() + SYSTEM_STATS_INTERVAL; > +do { > +poll_timer_wait_until(next_refresh); > The poll_timer_wait_until function sets the value of "loop->timeout_when". The way I see it, it is not protected by any locks and can be overwritten by multiple threads at the same time. > +poll_block(); > +} while (time_msec() < next_refresh); > } > } > -- > 1.7.2.5 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] ofproto-dpif: Zero-out subfacet counters when installation fails.
When deleting subfacets, subfacet_uninstall() asserts that the subfacet's counters are zero to make sure we don't lose counters. We have seen cases where a subfacet cannot be installed, but the counters have values. This should not happen, since we shouldn't create a subfacet if the datapath has a flow. To prevent asserts, this commit zeros out the counters and logs an error, since it's possible for a datapath to trigger this. Bug #18460. Signed-off-by: Justin Pettit --- ofproto/ofproto-dpif.c | 10 ++ 1 files changed, 10 insertions(+), 0 deletions(-) diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index 67e6c7a..8b216e5 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -3760,6 +3760,16 @@ handle_miss_upcalls(struct dpif_backer *backer, struct dpif_upcall *upcalls, COVERAGE_INC(subfacet_install_fail); +/* Zero-out subfacet counters when installation failed, but + * datapath reported hits. This should not happen, since if + * the datapath flow exists, we should not be attempting to + * create a new subfacet. */ +if (subfacet->dp_packet_count || subfacet->dp_byte_count) { +VLOG_ERR_RL(&rl, "failed to install subfacet for which " +"datapath reported hits"); +subfacet->dp_packet_count = subfacet->dp_byte_count = 0; +} + subfacet->path = SF_NOT_INSTALLED; } -- 1.7.5.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [threads v2 01/13] stress: Remove essentially unused library.
On Fri, Jul 12, 2013 at 2:54 PM, Ben Pfaff wrote: > The "stress" library was introduced years ago. We intended at the time to > start using it to provoke errors in testing, to make sure that Open vSwitch > was resilient against those errors. The intention was good, but there were > few actual implementations of stress options, and the testing never > materialized. > > Rather than adapt the stress library for thread safety, this seems like a > good opportunity to remove it, so this commit does so. > > Signed-off-by: Ben Pfaff > Looks good to me. > --- > lib/automake.mk|8 -- > lib/netlink-socket.c | 14 --- > lib/stream-fd.c| 17 > lib/stress-unixctl.man | 72 --- > lib/stress.c | 219 > > lib/stress.h | 94 --- > manpages.mk|4 - > ovsdb/ovsdb-server.1.in|1 - > ovsdb/ovsdb-server.c |2 - > vswitchd/ovs-vswitchd.8.in |1 - > vswitchd/ovs-vswitchd.c|2 - > 11 files changed, 0 insertions(+), 434 deletions(-) > delete mode 100644 lib/stress-unixctl.man > delete mode 100644 lib/stress.c > delete mode 100644 lib/stress.h > > diff --git a/lib/automake.mk b/lib/automake.mk > index 507ca97..280fc8b 100644 > --- a/lib/automake.mk > +++ b/lib/automake.mk > @@ -182,8 +182,6 @@ lib_libopenvswitch_a_SOURCES = \ > lib/stream-unix.c \ > lib/stream.c \ > lib/stream.h \ > - lib/stress.c \ > - lib/stress.h \ > lib/string.c \ > lib/string.h \ > lib/svec.c \ > @@ -306,7 +304,6 @@ MAN_FRAGMENTS += \ > lib/ssl-peer-ca-cert.man \ > lib/ssl.man \ > lib/ssl-syn.man \ > - lib/stress-unixctl.man \ > lib/table.man \ > lib/unixctl.man \ > lib/unixctl-syn.man \ > @@ -383,11 +380,6 @@ lib/coverage.def: $(DIST_SOURCES) > sed -n > 's|^COVERAGE_DEFINE(\([_a-zA-Z0-9]\{1,\}\)).*$$|COVERAGE_COUNTER(\1)|p' > $(all_sources) | LC_ALL=C sort -u > $@ > CLEANFILES += lib/coverage.def > > -lib/stress.$(OBJEXT): lib/stress.def > -lib/stress.def: $(DIST_SOURCES) > - sed -n '/^STRESS_OPTION(/,/);$$/{s/);$$/)/;p}' $(all_sources) > $@ > -CLEANFILES += lib/stress.def > - > lib/vlog.$(OBJEXT): lib/vlog-modules.def > lib/vlog-modules.def: $(DIST_SOURCES) > sed -n > 's|^VLOG_DEFINE_\(THIS_\)\{0,1\}MODULE(\([_a-zA-Z0-9]\{1,\}\)).*$$|VLOG_MODULE(\2)|p' > $(all_sources) | LC_ALL=C sort -u > $@ > diff --git a/lib/netlink-socket.c b/lib/netlink-socket.c > index dfe39ac..aa7fca2 100644 > --- a/lib/netlink-socket.c > +++ b/lib/netlink-socket.c > @@ -31,7 +31,6 @@ > #include "ofpbuf.h" > #include "poll-loop.h" > #include "socket-util.h" > -#include "stress.h" > #include "util.h" > #include "vlog.h" > > @@ -309,15 +308,6 @@ nl_sock_send_seq(struct nl_sock *sock, const struct > ofpbuf *msg, > return nl_sock_send__(sock, msg, nlmsg_seq, wait); > } > > -/* This stress option is useful for testing that OVS properly tolerates > - * -ENOBUFS on NetLink sockets. Such errors are unavoidable because they > can > - * occur if the kernel cannot temporarily allocate enough GFP_ATOMIC > memory to > - * reply to a request. They can also occur if messages arrive on a > multicast > - * channel faster than OVS can process them. */ > -STRESS_OPTION( > -netlink_overflow, "simulate netlink socket receive buffer overflow", > -5, 1, -1, 100); > - > static int > nl_sock_recv__(struct nl_sock *sock, struct ofpbuf *buf, bool wait) > { > @@ -373,10 +363,6 @@ nl_sock_recv__(struct nl_sock *sock, struct ofpbuf > *buf, bool wait) > return EPROTO; > } > > -if (STRESS(netlink_overflow)) { > -return ENOBUFS; > -} > - > buf->size = MIN(retval, buf->allocated); > if (retval > buf->allocated) { > COVERAGE_INC(netlink_recv_jumbo); > diff --git a/lib/stream-fd.c b/lib/stream-fd.c > index d102582..1171f32 100644 > --- a/lib/stream-fd.c > +++ b/lib/stream-fd.c > @@ -26,7 +26,6 @@ > #include "fatal-signal.h" > #include "poll-loop.h" > #include "socket-util.h" > -#include "stress.h" > #include "util.h" > #include "stream-provider.h" > #include "stream.h" > @@ -89,38 +88,22 @@ fd_connect(struct stream *stream) > return check_connection_completion(s->fd); > } > > -STRESS_OPTION( > -stream_flaky_recv, "simulate failure of fd stream recvs", > -100, 0, -1, 0); > - > static ssize_t > fd_recv(struct stream *stream, void *buffer, size_t n) > { > struct stream_fd *s = stream_fd_cast(stream); > ssize_t retval; > > -if (STRESS(stream_flaky_recv)) { > -return -EIO; > -} > - > retval = read(s->fd, buffer, n); > return retval >= 0 ? retval : -errno; > } > > -STRESS_OPTION( > -stream_flaky_send, "simulate failure of fd stream sends", > -100, 0, -1, 0); > - > static ssize_t > fd_send(struct stream *stream, const void *buffer, size_t n
Re: [ovs-dev] [PATCH] ofproto-dpif: Zero-out subfacet counters when installation fails.
One thing that isn't entirely clear from the commit message and comment is whether we believe this message, when it appears, indicates that there is a bug somewhere. I think the intention is that the message indicates a bug, I'm just saying this to check that I'm right. The code change definitely fixes a potential problem and I'm in favor of it. Thanks, Ben. On Jul 12, 2013 5:12 PM, "Justin Pettit" wrote: > When deleting subfacets, subfacet_uninstall() asserts that the > subfacet's counters are zero to make sure we don't lose counters. We > have seen cases where a subfacet cannot be installed, but the counters > have values. This should not happen, since we shouldn't create a > subfacet if the datapath has a flow. To prevent asserts, this commit > zeros out the counters and logs an error, since it's possible for a > datapath to trigger this. > > Bug #18460. > > Signed-off-by: Justin Pettit > --- > ofproto/ofproto-dpif.c | 10 ++ > 1 files changed, 10 insertions(+), 0 deletions(-) > > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c > index 67e6c7a..8b216e5 100644 > --- a/ofproto/ofproto-dpif.c > +++ b/ofproto/ofproto-dpif.c > @@ -3760,6 +3760,16 @@ handle_miss_upcalls(struct dpif_backer *backer, > struct dpif_upcall *upcalls, > > COVERAGE_INC(subfacet_install_fail); > > +/* Zero-out subfacet counters when installation failed, but > + * datapath reported hits. This should not happen, since if > + * the datapath flow exists, we should not be attempting to > + * create a new subfacet. */ > +if (subfacet->dp_packet_count || subfacet->dp_byte_count) { > +VLOG_ERR_RL(&rl, "failed to install subfacet for which " > +"datapath reported hits"); > +subfacet->dp_packet_count = subfacet->dp_byte_count = 0; > +} > + > subfacet->path = SF_NOT_INSTALLED; > } > > -- > 1.7.5.4 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] ofproto-dpif: Don't put new subfacets as result of "userspace" action.
Don't install a flow if it's the result of the "userspace" action and it contains a slow-path action. This can occur when a datapath flow with wildcards has a "userspace" action and flows sent to userspace result in a different subfacet, which will then be rejected as overlapping by the datapath. Signed-off-by: Justin Pettit --- ofproto/ofproto-dpif.c | 18 +- 1 files changed, 17 insertions(+), 1 deletions(-) diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index 283336c..f5f2c75 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -3426,7 +3426,23 @@ handle_flow_miss_with_facet(struct flow_miss *miss, struct facet *facet, } } -if (miss->upcall_type == DPIF_UC_MISS || subfacet->path != want_path) { +/* Put down a flow in the following cases: + * + * - The upcall is a miss. (This should only happen if flows we + * expect in the kernel have disappeared (e.g., as the result of + * "ovs-dpctl del-flows".)) + * - The current path is not what we want. + * + * However, we do not want to install the flow if it's the result of + * the "userspace" action and it contains a slow-path action. + * This can occur when a datapath flow with wildcards has a + * "userspace" action and flows sent to userspace result in a + * different subfacet, which will then be rejected as overlapping by + * the datapath. + */ +if (!(miss->upcall_type == DPIF_UC_ACTION && want_path == SF_SLOW_PATH) +&& (miss->upcall_type == DPIF_UC_MISS || +subfacet->path != want_path)) { struct flow_miss_op *op = &ops[(*n_ops)++]; struct dpif_flow_put *put = &op->dpif_op.u.flow_put; -- 1.7.5.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] ofproto-dpif: Don't put new subfacets as result of "userspace" action.
This makes me a bit uncomfortable because we're going to have facet's floating around which have never been installed. I would expect the rest of ofproto-dpif to be unhappy about that, though I can't say exactly how. At any rate, how about an alternative approach. If a packet came in as DPIF_UC_ACTION, simply never install it. Something like what I have below. Now that I look at it, in the below patch, we'll have to add the stats to facet if we decide to bail out, but you get the gist of it. Thoughts? Ethan --- ofproto/ofproto-dpif.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index 67e6c7a..31d463f 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -3395,11 +3395,7 @@ handle_flow_miss_with_facet(struct flow_miss *miss, struct facet *facet, struct subfacet *subfacet; struct ofpbuf *packet; -subfacet = subfacet_create(facet, miss, now); want_path = facet->xout.slow ? SF_SLOW_PATH : SF_FAST_PATH; -if (stats) { -subfacet_update_stats(subfacet, stats); -} LIST_FOR_EACH (packet, list_node, &miss->packets) { struct flow_miss_op *op = &ops[*n_ops]; @@ -3426,6 +3422,16 @@ handle_flow_miss_with_facet(struct flow_miss *miss, struct facet *facet, } } +if (miss->upcall_type == DPIF_UC_ACTION +&& !list_is_empty(&facet->subfacets)) { +return; +} + +subfacet = subfacet_create(facet, miss, now); +if (stats) { +subfacet_update_stats(subfacet, stats); +} + if (miss->upcall_type == DPIF_UC_MISS || subfacet->path != want_path) { struct flow_miss_op *op = &ops[(*n_ops)++]; struct dpif_flow_put *put = &op->dpif_op.u.flow_put; -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] datapath: Use RCU lock for dp dump operation.
On Fri, Jul 12, 2013 at 4:37 PM, Pravin B Shelar wrote: > diff --git a/datapath/datapath.c b/datapath/datapath.c > index 2f02f71..09367cf 100644 > --- a/datapath/datapath.c > +++ b/datapath/datapath.c > @@ -1902,7 +1903,7 @@ static int ovs_dp_cmd_dump(struct sk_buff *skb, struct > netlink_callback *cb) > int skip = cb->args[0]; > int i = 0; > > - ovs_lock(); > + rcu_read_lock(); > list_for_each_entry(dp, &ovs_net->dps, list_node) { I don't think we're maintaining this list in an RCU-safe manner at the moment. X-CudaMail-Whitelist-To: dev@openvswitch.org ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] bfd: Implement BFD decay.
When there is no incoming data traffic at the interface for a period, BFD decay allows the bfd session to increase the min_rx. This is helpful in that some interfaces usually idle for long time. And cpu consumption can be reduced by processing fewer bfd control packets. Signed-off-by: Alex Wang --- v1 -> v2: - remove bfd:decay_enable option, only use bfd:decay_min_rx. - add bfd_set_netdev() function. - reset decay_min_rx when itself or min_rx is reconfigured. - use bfd_poll() to update the decay changes. - refine the code as suggested by Ethan. --- lib/bfd.c | 97 ++-- lib/bfd.h |5 ++- ofproto/ofproto-dpif.c |7 +++- vswitchd/vswitch.xml | 10 + 4 files changed, 113 insertions(+), 6 deletions(-) diff --git a/lib/bfd.c b/lib/bfd.c index aa1a3f7..e5b04ec 100644 --- a/lib/bfd.c +++ b/lib/bfd.c @@ -24,6 +24,7 @@ #include "hash.h" #include "hmap.h" #include "list.h" +#include "netdev.h" #include "netlink.h" #include "odp-util.h" #include "ofpbuf.h" @@ -152,6 +153,9 @@ struct bfd { bool cpath_down; /* Concatenated Path Down. */ uint8_t mult; /* bfd.DetectMult. */ +struct netdev *netdev; +uint64_t rx_packets; /* Packets received by 'netdev'. */ + enum state state; /* bfd.SessionState. */ enum state rmt_state; /* bfd.RemoteSessionState. */ @@ -182,6 +186,10 @@ struct bfd { int ref_cnt; int forwarding_override; /* Manual override of 'forwarding' status. */ + +/* BFD decay related variables. */ +int decay_min_rx; +long long int decay_detect_time; /* Decay detection time. */ }; static bool bfd_in_poll(const struct bfd *); @@ -191,6 +199,8 @@ static const char *bfd_state_str(enum state); static long long int bfd_min_tx(const struct bfd *); static long long int bfd_tx_interval(const struct bfd *); static long long int bfd_rx_interval(const struct bfd *); +static uint64_t bfd_rx_packets(const struct bfd *); +static void bfd_decay(struct bfd *); static void bfd_set_next_tx(struct bfd *); static void bfd_set_state(struct bfd *, enum state, enum diag); static uint32_t generate_discriminator(void); @@ -242,12 +252,13 @@ bfd_get_status(const struct bfd *bfd, struct smap *smap) * handle for the session, or NULL if BFD is not enabled according to 'cfg'. * Also returns NULL if cfg is NULL. */ struct bfd * -bfd_configure(struct bfd *bfd, const char *name, - const struct smap *cfg) +bfd_configure(struct bfd *bfd, const char *name, const struct smap *cfg, + struct netdev *netdev) { static uint16_t udp_src = 0; static bool init = false; +int decay_min_rx; long long int min_tx, min_rx; bool cpath_down; @@ -276,6 +287,9 @@ bfd_configure(struct bfd *bfd, const char *name, bfd->min_tx = 1000; bfd->mult = 3; bfd->ref_cnt = 1; +bfd->netdev = netdev_ref(netdev); +bfd->decay_detect_time = 0; +bfd->rx_packets = bfd_rx_packets(bfd); /* RFC 5881 section 4 * The source port MUST be in the range 49152 through 65535. The same @@ -295,6 +309,7 @@ bfd_configure(struct bfd *bfd, const char *name, || (!bfd_in_poll(bfd) && bfd->cfg_min_tx < bfd->min_tx)) { bfd->min_tx = bfd->cfg_min_tx; } +bfd->decay_min_rx = 0; bfd_poll(bfd); } @@ -309,6 +324,19 @@ bfd_configure(struct bfd *bfd, const char *name, bfd_poll(bfd); } +decay_min_rx = smap_get_int(cfg, "decay_min_rx", 0); +if (bfd->decay_min_rx != decay_min_rx ) { +if (decay_min_rx > 0 && decay_min_rx < bfd->cfg_min_rx) { +VLOG_WARN("%s: decay_min_rx cannot be less than %lld ms", + bfd->name, bfd->cfg_min_rx); +bfd->decay_min_rx = 0; +} else { +bfd->decay_min_rx = decay_min_rx; +} +bfd->min_rx = bfd->cfg_min_rx; +bfd_poll(bfd); +} + cpath_down = smap_get_bool(cfg, "cpath_down", false); if (bfd->cpath_down != cpath_down) { bfd->cpath_down = cpath_down; @@ -340,6 +368,7 @@ bfd_unref(struct bfd *bfd) hmap_remove(&all_bfds, &bfd->node); free(bfd->name); free(bfd); +netdev_close(bfd->netdev); } } } @@ -360,11 +389,23 @@ bfd_wait(const struct bfd *bfd) void bfd_run(struct bfd *bfd) { -if (bfd->state > STATE_DOWN && time_msec() >= bfd->detect_time) { +long long int now = time_msec(); + +if (bfd->state > STATE_DOWN && now >= bfd->detect_time) { bfd_set_state(bfd, STATE_DOWN, DIAG_EXPIRED); } +if (bfd->state == STATE_UP && bfd->decay_min_rx > 0 +&& now >= bfd->decay_detect_time) { +bfd_decay(bfd); +} + if (bfd->min_tx != bfd->cfg_min_tx || bfd->min_rx != bfd->cfg_min_rx) { +/* Do not poll if already decayed to decay_min_rx. */
Re: [ovs-dev] [PATCH] bfd: Implement BFD decay.
Hey Ethan, Sorry, I didn't add the version number. Should be [PATCH 1/2 V2] One clarification: """ +bfd->decay_detect_time = (bfd->decay_min_rx < 2000 ? + 2000 : bfd->decay_min_rx) + time_msec(); """ The decay_detect_time should be greater than 2000, since "facet_push_stats()" is called every 2 seconds. If the "decay_detect_time" is less than 2 sec, there can be flips between decay and non-decay. >From what I experimented on ejja and ejjb, this patch works fine. Also I think I can finish the bfd decay tests soon and check if there is any hidden bugs. Hope to hear comments from you. Kind Regards, Alex Wang, On Fri, Jul 12, 2013 at 7:03 PM, Alex Wang wrote: > When there is no incoming data traffic at the interface for a period, > BFD decay allows the bfd session to increase the min_rx. This is > helpful in that some interfaces usually idle for long time. And cpu > consumption can be reduced by processing fewer bfd control packets. > > Signed-off-by: Alex Wang > --- > > v1 -> v2: > - remove bfd:decay_enable option, only use bfd:decay_min_rx. > - add bfd_set_netdev() function. > - reset decay_min_rx when itself or min_rx is reconfigured. > - use bfd_poll() to update the decay changes. > - refine the code as suggested by Ethan. > > --- > lib/bfd.c | 97 > ++-- > lib/bfd.h |5 ++- > ofproto/ofproto-dpif.c |7 +++- > vswitchd/vswitch.xml | 10 + > 4 files changed, 113 insertions(+), 6 deletions(-) > > diff --git a/lib/bfd.c b/lib/bfd.c > index aa1a3f7..e5b04ec 100644 > --- a/lib/bfd.c > +++ b/lib/bfd.c > @@ -24,6 +24,7 @@ > #include "hash.h" > #include "hmap.h" > #include "list.h" > +#include "netdev.h" > #include "netlink.h" > #include "odp-util.h" > #include "ofpbuf.h" > @@ -152,6 +153,9 @@ struct bfd { > bool cpath_down; /* Concatenated Path Down. */ > uint8_t mult; /* bfd.DetectMult. */ > > +struct netdev *netdev; > +uint64_t rx_packets; /* Packets received by 'netdev'. */ > + > enum state state; /* bfd.SessionState. */ > enum state rmt_state; /* bfd.RemoteSessionState. */ > > @@ -182,6 +186,10 @@ struct bfd { > > int ref_cnt; > int forwarding_override; /* Manual override of 'forwarding' > status. */ > + > +/* BFD decay related variables. */ > +int decay_min_rx; > +long long int decay_detect_time; /* Decay detection time. */ > }; > > static bool bfd_in_poll(const struct bfd *); > @@ -191,6 +199,8 @@ static const char *bfd_state_str(enum state); > static long long int bfd_min_tx(const struct bfd *); > static long long int bfd_tx_interval(const struct bfd *); > static long long int bfd_rx_interval(const struct bfd *); > +static uint64_t bfd_rx_packets(const struct bfd *); > +static void bfd_decay(struct bfd *); > static void bfd_set_next_tx(struct bfd *); > static void bfd_set_state(struct bfd *, enum state, enum diag); > static uint32_t generate_discriminator(void); > @@ -242,12 +252,13 @@ bfd_get_status(const struct bfd *bfd, struct smap > *smap) > * handle for the session, or NULL if BFD is not enabled according to > 'cfg'. > * Also returns NULL if cfg is NULL. */ > struct bfd * > -bfd_configure(struct bfd *bfd, const char *name, > - const struct smap *cfg) > +bfd_configure(struct bfd *bfd, const char *name, const struct smap *cfg, > + struct netdev *netdev) > { > static uint16_t udp_src = 0; > static bool init = false; > > +int decay_min_rx; > long long int min_tx, min_rx; > bool cpath_down; > > @@ -276,6 +287,9 @@ bfd_configure(struct bfd *bfd, const char *name, > bfd->min_tx = 1000; > bfd->mult = 3; > bfd->ref_cnt = 1; > +bfd->netdev = netdev_ref(netdev); > +bfd->decay_detect_time = 0; > +bfd->rx_packets = bfd_rx_packets(bfd); > > /* RFC 5881 section 4 > * The source port MUST be in the range 49152 through 65535. The > same > @@ -295,6 +309,7 @@ bfd_configure(struct bfd *bfd, const char *name, > || (!bfd_in_poll(bfd) && bfd->cfg_min_tx < bfd->min_tx)) { > bfd->min_tx = bfd->cfg_min_tx; > } > +bfd->decay_min_rx = 0; > bfd_poll(bfd); > } > > @@ -309,6 +324,19 @@ bfd_configure(struct bfd *bfd, const char *name, > bfd_poll(bfd); > } > > +decay_min_rx = smap_get_int(cfg, "decay_min_rx", 0); > +if (bfd->decay_min_rx != decay_min_rx ) { > +if (decay_min_rx > 0 && decay_min_rx < bfd->cfg_min_rx) { > +VLOG_WARN("%s: decay_min_rx cannot be less than %lld ms", > + bfd->name, bfd->cfg_min_rx); > +bfd->decay_min_rx = 0; > +} else { > +bfd->decay_min_rx = decay_min_rx; > +} > +bfd->min_rx = bfd->cfg_min_rx; > +bfd_poll(bfd); > +} > + > cpath_down = smap_get_
Re: [ovs-dev] [PATCH] ofproto-dpif: Zero-out subfacet counters when installation fails.
Okay, I made it more explicit in both and pushed to affected branches. Thanks for the review and strategy discussion. --Justin On Jul 12, 2013, at 5:21 PM, Ben Pfaff wrote: > One thing that isn't entirely clear from the commit message and comment is > whether we believe this message, when it appears, indicates that there is a > bug somewhere. I think the intention is that the message indicates a bug, I'm > just saying this to check that I'm right. > > The code change definitely fixes a potential problem and I'm in favor of it. > > Thanks, > > Ben. > > On Jul 12, 2013 5:12 PM, "Justin Pettit" wrote: > When deleting subfacets, subfacet_uninstall() asserts that the > subfacet's counters are zero to make sure we don't lose counters. We > have seen cases where a subfacet cannot be installed, but the counters > have values. This should not happen, since we shouldn't create a > subfacet if the datapath has a flow. To prevent asserts, this commit > zeros out the counters and logs an error, since it's possible for a > datapath to trigger this. > > Bug #18460. > > Signed-off-by: Justin Pettit > --- > ofproto/ofproto-dpif.c | 10 ++ > 1 files changed, 10 insertions(+), 0 deletions(-) > > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c > index 67e6c7a..8b216e5 100644 > --- a/ofproto/ofproto-dpif.c > +++ b/ofproto/ofproto-dpif.c > @@ -3760,6 +3760,16 @@ handle_miss_upcalls(struct dpif_backer *backer, struct > dpif_upcall *upcalls, > > COVERAGE_INC(subfacet_install_fail); > > +/* Zero-out subfacet counters when installation failed, but > + * datapath reported hits. This should not happen, since if > + * the datapath flow exists, we should not be attempting to > + * create a new subfacet. */ > +if (subfacet->dp_packet_count || subfacet->dp_byte_count) { > +VLOG_ERR_RL(&rl, "failed to install subfacet for which " > +"datapath reported hits"); > +subfacet->dp_packet_count = subfacet->dp_byte_count = 0; > +} > + > subfacet->path = SF_NOT_INSTALLED; > } > > -- > 1.7.5.4 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] missing log directory on rhel?
The install guide here: http://n40lab.wordpress.com/2013/06/03/centos-6-4-openvswitch-installation/ shows unexpected startup errors: #/etc/init.d/openvswitch start /usr/share/openvswitch/scripts/ovs-lib: line 49: /var/log/openvswitch/ovs-ctl.log: No such file or directory tee: /var/log/openvswitch/ovs-ctl.log: No such file or directory which imply that our RHEL packaging fails to create /var/log/openvswitch. Does that make any sense? (Maybe we need to list /var/log/openvswitch at the end of openvswitch.spec.in?) Thanks, Ben. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] missing log directory on rhel?
On Fri, Jul 12, 2013 at 7:31 PM, Ben Pfaff wrote: > The install guide here: > > > http://n40lab.wordpress.com/2013/06/03/centos-6-4-openvswitch-installation/ > > shows unexpected startup errors: > > #/etc/init.d/openvswitch start > /usr/share/openvswitch/scripts/ovs-lib: line 49: > /var/log/openvswitch/ovs-ctl.log: No such file or directory > tee: /var/log/openvswitch/ovs-ctl.log: No such file or directory > > which imply that our RHEL packaging fails to create > /var/log/openvswitch. Does that make any sense? (Maybe we need to list > /var/log/openvswitch at the end of openvswitch.spec.in?) > Yes. Looks like it. I probably never saw it because the directory is not removed during package removal. > > Thanks, > > Ben. > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] rhel, xenserver: Create /var/log/openvswitch directory.
During installation create the /var/log/openvswitch directory so that openvswitch startup script is able to write the ovs-ctl.log Signed-off-by: Gurucharan Shetty --- rhel/openvswitch-fedora.spec.in |2 ++ rhel/openvswitch.spec.in |2 ++ xenserver/openvswitch-xen.spec.in |2 ++ 3 files changed, 6 insertions(+) diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in index dc40403..cc0f9db 100644 --- a/rhel/openvswitch-fedora.spec.in +++ b/rhel/openvswitch-fedora.spec.in @@ -68,6 +68,7 @@ install python/compat/uuid.py $RPM_BUILD_ROOT/usr/share/openvswitch/python install python/compat/argparse.py $RPM_BUILD_ROOT/usr/share/openvswitch/python install -d -m 755 $RPM_BUILD_ROOT/var/lib/openvswitch +install -d -m 755 $RPM_BUILD_ROOT/var/log/openvswitch %clean rm -rf $RPM_BUILD_ROOT @@ -174,6 +175,7 @@ systemctl start openvswitch.service %doc /usr/share/man/man8/ovs-test.8.gz %doc /usr/share/man/man8/ovs-l3ping.8.gz /var/lib/openvswitch +/var/log/openvswitch /usr/share/openvswitch/scripts/ovs-ctl %exclude /etc/openvswitch %exclude /usr/bin/ovs-benchmark diff --git a/rhel/openvswitch.spec.in b/rhel/openvswitch.spec.in index 53512bc..37ec271 100644 --- a/rhel/openvswitch.spec.in +++ b/rhel/openvswitch.spec.in @@ -66,6 +66,7 @@ rm \ $RPM_BUILD_ROOT/usr/share/man/man8/ovs-vlan-bug-workaround.8 install -d -m 755 $RPM_BUILD_ROOT/var/lib/openvswitch +install -d -m 755 $RPM_BUILD_ROOT/var/log/openvswitch %clean rm -rf $RPM_BUILD_ROOT @@ -155,3 +156,4 @@ exit 0 /usr/share/doc/openvswitch-%{version}/FAQ /usr/share/doc/openvswitch-%{version}/README.RHEL /var/lib/openvswitch +/var/log/openvswitch diff --git a/xenserver/openvswitch-xen.spec.in b/xenserver/openvswitch-xen.spec.in index dff18d0..a8c6395 100644 --- a/xenserver/openvswitch-xen.spec.in +++ b/xenserver/openvswitch-xen.spec.in @@ -134,6 +134,7 @@ rm \ $RPM_BUILD_ROOT/usr/share/man/man8/ovs-test.8 install -d -m 755 $RPM_BUILD_ROOT/var/lib/openvswitch +install -d -m 755 $RPM_BUILD_ROOT/var/log/openvswitch %clean rm -rf $RPM_BUILD_ROOT @@ -452,6 +453,7 @@ exit 0 /usr/share/man/man8/ovs-vsctl.8.gz /usr/share/man/man8/ovs-vswitchd.8.gz /var/lib/openvswitch +/var/log/openvswitch %exclude /usr/lib/xsconsole/plugins-base/*.py[co] %exclude /usr/share/openvswitch/scripts/*.py[co] %exclude /usr/share/openvswitch/python/*.py[co] -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] rhel, xenserver: Create /var/log/openvswitch directory.
Thanks, this looks good. But I am surprised that "make install" does not create the log directory. Perhaps that is an underlying problem that we should fix instead of adding a directory creation call to every packager. On Jul 12, 2013 9:29 PM, "Gurucharan Shetty" wrote: > During installation create the /var/log/openvswitch directory > so that openvswitch startup script is able to write the ovs-ctl.log > > Signed-off-by: Gurucharan Shetty > --- > rhel/openvswitch-fedora.spec.in |2 ++ > rhel/openvswitch.spec.in |2 ++ > xenserver/openvswitch-xen.spec.in |2 ++ > 3 files changed, 6 insertions(+) > > diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/ > openvswitch-fedora.spec.in > index dc40403..cc0f9db 100644 > --- a/rhel/openvswitch-fedora.spec.in > +++ b/rhel/openvswitch-fedora.spec.in > @@ -68,6 +68,7 @@ install python/compat/uuid.py > $RPM_BUILD_ROOT/usr/share/openvswitch/python > install python/compat/argparse.py > $RPM_BUILD_ROOT/usr/share/openvswitch/python > > install -d -m 755 $RPM_BUILD_ROOT/var/lib/openvswitch > +install -d -m 755 $RPM_BUILD_ROOT/var/log/openvswitch > > %clean > rm -rf $RPM_BUILD_ROOT > @@ -174,6 +175,7 @@ systemctl start openvswitch.service > %doc /usr/share/man/man8/ovs-test.8.gz > %doc /usr/share/man/man8/ovs-l3ping.8.gz > /var/lib/openvswitch > +/var/log/openvswitch > /usr/share/openvswitch/scripts/ovs-ctl > %exclude /etc/openvswitch > %exclude /usr/bin/ovs-benchmark > diff --git a/rhel/openvswitch.spec.in b/rhel/openvswitch.spec.in > index 53512bc..37ec271 100644 > --- a/rhel/openvswitch.spec.in > +++ b/rhel/openvswitch.spec.in > @@ -66,6 +66,7 @@ rm \ > $RPM_BUILD_ROOT/usr/share/man/man8/ovs-vlan-bug-workaround.8 > > install -d -m 755 $RPM_BUILD_ROOT/var/lib/openvswitch > +install -d -m 755 $RPM_BUILD_ROOT/var/log/openvswitch > > %clean > rm -rf $RPM_BUILD_ROOT > @@ -155,3 +156,4 @@ exit 0 > /usr/share/doc/openvswitch-%{version}/FAQ > /usr/share/doc/openvswitch-%{version}/README.RHEL > /var/lib/openvswitch > +/var/log/openvswitch > diff --git a/xenserver/openvswitch-xen.spec.in b/xenserver/ > openvswitch-xen.spec.in > index dff18d0..a8c6395 100644 > --- a/xenserver/openvswitch-xen.spec.in > +++ b/xenserver/openvswitch-xen.spec.in > @@ -134,6 +134,7 @@ rm \ > $RPM_BUILD_ROOT/usr/share/man/man8/ovs-test.8 > > install -d -m 755 $RPM_BUILD_ROOT/var/lib/openvswitch > +install -d -m 755 $RPM_BUILD_ROOT/var/log/openvswitch > > %clean > rm -rf $RPM_BUILD_ROOT > @@ -452,6 +453,7 @@ exit 0 > /usr/share/man/man8/ovs-vsctl.8.gz > /usr/share/man/man8/ovs-vswitchd.8.gz > /var/lib/openvswitch > +/var/log/openvswitch > %exclude /usr/lib/xsconsole/plugins-base/*.py[co] > %exclude /usr/share/openvswitch/scripts/*.py[co] > %exclude /usr/share/openvswitch/python/*.py[co] > -- > 1.7.9.5 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev