Re: [ovs-dev] Buffer in openvswitch
Hello, First thanks for the tip. You said, "NICs only trigger an interrupt after the packet has been completely DMAed into memory, which makes cut-through impossible to implement on today's servers". The triggers that you speak about is done at driver level, at kernel level or is something done by hardware? Do someone knows if this is possible to change? I think that some switch do cut-trough, did someone knows how this is implemented, or know some doc about this. Best Regards, Marco Citando Wes Felter : On 8/7/12 8:21 AM, mvp...@iol.pt wrote: Can someone tell me if is in mind an cut-through implementation for openvSwitch? I was thinking in do something like that. AFAIK, NICs only trigger an interrupt after the packet has been completely DMAed into memory, which makes cut-through impossible to implement on today's servers. There may be some possible optimizations that would make OVS faster, but they would have to be something other than cut-through. I would assume OVS is already zero-copy, so reducing copies probably won't help either. -- Wes Felter IBM Research - Austin ___ dev mailing list dev@openvswitch.orghttp://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] (no subject)
Qué usted necesita un préstamo?___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] timeval: On 64-bit systems refresh time whenever it is requested.
On Tue, Aug 07, 2012 at 05:00:59PM -0700, Leo Alterman wrote: > 64-bit glibc appears to avoid syscalls for clock_gettime(), so we can get > higher resolution timing and avoid having a timer firing off SIGALRM > without introducing extra overhead. > > Signed-off-by: Leo Alterman I'd heard that this was fast on Linux x86-64, but I'd never tested it. That seems like a good idea before committing it, so I wrote the following program: #include #include int main(void) { struct timespec start, end; int i; clock_gettime(CLOCK_MONOTONIC, &start); for (i = 0; i < 100; i++) { struct timespec ts; clock_gettime(CLOCK_MONOTONIC, &ts); } clock_gettime(CLOCK_MONOTONIC, &end); printf("%.3f nsec per call\n", (10.0 * (end.tv_sec - start.tv_sec) + end.tv_nsec - start.tv_nsec) / 100.0); return 0; } and ran it on identical i386 and x86-64 Debian GNU/Linux "wheezy" prerelease systems with gcc version 4.6.3 and eglibc 2.13-33, compiling without out any optimizations. On i386: about 312 nsec per call. On x86-64: about 29 nsec per call. Changing CLOCK_MONOTONIC to CLOCK_REALTIME doesn't change the speed. When I comment out the clock_gettime() call on either system, it's about 3 nsec per "call", which one can figure as the overhead of the loop. So, my take is that this is fast enough on x86-64 to drop the caching. (This isn't a review of your code, just an analysis of whether it's a worthwhile change.) ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] Buffer in openvswitch
On 8/8/12 4:06 AM, mvp...@iol.pt wrote: You said, "NICs only trigger an interrupt after the packet has been completely DMAed into memory, which makes cut-through impossible to implement on today's servers". The triggers that you speak about is done at driver level, at kernel level or is something done by hardware? Do someone knows if this is possible to change? It's a hardware limitation of the NIC. I think designing an efficient cut-through NIC would be pretty challenging given commodity I/O architecture. I think that some switch do cut-trough, did someone knows how this is implemented, or know some doc about this. Dally and Towles wrote a book about how switch chips work: http://store.elsevier.com/product.jsp?isbn=9780122007514 -- Wes Felter IBM Research - Austin ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] Bump up the supported kernel versions to include 3.5.x.
On Tue, Aug 7, 2012 at 3:48 PM, Kyle Mestery wrote: > Signed-off-by: Kyle Mestery > --- > datapath/datapath.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Applied, thanks Kyle. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] Bump up the supported kernel versions to include 3.5.x.
On Aug 8, 2012, at 1:08 PM, Jesse Gross wrote: > On Tue, Aug 7, 2012 at 3:48 PM, Kyle Mestery wrote: >> Signed-off-by: Kyle Mestery >> --- >> datapath/datapath.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) > > Applied, thanks Kyle. Thanks Jesse. Fedora rolled up to this version, so to keep using OVS, this was necessary. Just curious, does this check basically ensure that someone at least runs OVS with future versions of the kernel before committing it into the OVS tree? Is it preventing people from shooting themselves in the foot in case an API changes upstream? Thanks! Kyle ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] lib: Correct "old-style function definition" warning.
Signed-off-by: Ed Maste --- lib/route-table-bsd.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/route-table-bsd.c b/lib/route-table-bsd.c index 1c29071..eb4a168 100644 --- a/lib/route-table-bsd.c +++ b/lib/route-table-bsd.c @@ -104,7 +104,7 @@ route_table_get_name(ovs_be32 ip, char name[IFNAMSIZ]) } void -route_table_register() +route_table_register(void) { if (!register_count) { @@ -115,7 +115,7 @@ route_table_register() } void -route_table_unregister() +route_table_unregister(void) { register_count--; } -- 1.7.10.3 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] Bump up the supported kernel versions to include 3.5.x.
On Wed, Aug 8, 2012 at 11:13 AM, Kyle Mestery (kmestery) wrote: > On Aug 8, 2012, at 1:08 PM, Jesse Gross wrote: >> On Tue, Aug 7, 2012 at 3:48 PM, Kyle Mestery wrote: >>> Signed-off-by: Kyle Mestery >>> --- >>> datapath/datapath.c | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> Applied, thanks Kyle. > > Thanks Jesse. Fedora rolled up to this version, so to keep using OVS, this was > necessary. > > Just curious, does this check basically ensure that someone at least runs OVS > with > future versions of the kernel before committing it into the OVS tree? Is it > preventing people from shooting themselves in the foot in case an API changes > upstream? The latter. In the best case, when something changes upstream users get a bunch of compiler errors and complain that OVS doesn't work. In the worst case, the API that OVS uses doesn't change but the meaning does and things compile but crash or otherwise misbehave. In this case nothing needed to change but that's pretty rare. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] lib: Add header #include for writev
This fixes a warning on FreeBSD. Signed-off-by: Ed Maste --- lib/socket-util.c |1 + 1 file changed, 1 insertion(+) diff --git a/lib/socket-util.c b/lib/socket-util.c index 314761d..7c40ab8 100644 --- a/lib/socket-util.c +++ b/lib/socket-util.c @@ -30,6 +30,7 @@ #include #include #include +#include #include #include #include "dynamic-string.h" -- 1.7.10.3 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] Bump up the supported kernel versions to include 3.5.x.
On Aug 8, 2012, at 1:34 PM, Jesse Gross wrote: > On Wed, Aug 8, 2012 at 11:13 AM, Kyle Mestery (kmestery) > wrote: >> On Aug 8, 2012, at 1:08 PM, Jesse Gross wrote: >>> On Tue, Aug 7, 2012 at 3:48 PM, Kyle Mestery wrote: Signed-off-by: Kyle Mestery --- datapath/datapath.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> Applied, thanks Kyle. >> >> Thanks Jesse. Fedora rolled up to this version, so to keep using OVS, this >> was >> necessary. >> >> Just curious, does this check basically ensure that someone at least runs >> OVS with >> future versions of the kernel before committing it into the OVS tree? Is it >> preventing people from shooting themselves in the foot in case an API changes >> upstream? > > The latter. In the best case, when something changes upstream users > get a bunch of compiler errors and complain that OVS doesn't work. In > the worst case, the API that OVS uses doesn't change but the meaning > does and things compile but crash or otherwise misbehave. In this > case nothing needed to change but that's pretty rare. Thanks for the explanation Jesse, this makes sense to me. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] netdev-bsd: Initialize variable to silence a compiler warning.
FreeBSD's system compiler is a somewhat old version of GCC that produced a spurious warning about a potential unitialized variable use. Signed-off-by: Ed Maste --- lib/netdev-bsd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c index 0b1a37c..f8b1188 100644 --- a/lib/netdev-bsd.c +++ b/lib/netdev-bsd.c @@ -453,7 +453,7 @@ netdev_bsd_listen(struct netdev *netdev_) struct netdev_bsd *netdev = netdev_bsd_cast(netdev_); char errbuf[PCAP_ERRBUF_SIZE]; int error; -int fd; +int fd = -1; int one = 1; if (netdev->netdev_fd >= 0) { -- 1.7.10.3 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] timeval: On 64-bit systems refresh time whenever it is requested.
On Tue, Aug 07, 2012 at 05:00:59PM -0700, Leo Alterman wrote: > 64-bit glibc appears to avoid syscalls for clock_gettime(), so we can get > higher resolution timing and avoid having a timer firing off SIGALRM > without introducing extra overhead. > > Signed-off-by: Leo Alterman I agree with Ed Maste that this should be limited to Linux on x86-64 (not ia64 == Itanium). Preprocessor directives should be at the left margin, not indented to the same level as the code. However, after looking at it, I think that we shouldn't be using preprocessor directives. Instead, do something like vswitchd/system-stats.c, which has: /* #ifdefs make it a pain to maintain code: you have to try to build both ways. * Thus, this file tries to compile as much of the code as possible regardless * of the target, by writing "if (LINUX)" instead of "#ifdef __linux__" where * this is possible. */ #ifdef __linux__ #include #define LINUX 1 #else #define LINUX 0 #endif and then uses "if" instead of "#ifdef" wherever it can. Does it pass the unit tests? Thanks, Ben. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] lib: Correct "old-style function definition" warning.
On Wed, Aug 08, 2012 at 06:29:32PM +, Ed Maste wrote: > Signed-off-by:Ed Maste Thanks, applied to master. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] lib: Add header #include for writev
On Wed, Aug 08, 2012 at 06:35:13PM +, Ed Maste wrote: > This fixes a warning on FreeBSD. > > Signed-off-by:Ed Maste Applied to master, thanks. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] netdev-bsd: Initialize variable to silence a compiler warning.
On Wed, Aug 08, 2012 at 06:48:01PM +, Ed Maste wrote: > FreeBSD's system compiler is a somewhat old version of GCC that produced > a spurious warning about a potential unitialized variable use. > > Signed-off-by:Ed Maste Applied to master, thanks. By the way, somehow a tab keeps sneaking into your Signed-off-by: lines. A space is conventional and doesn't confuse my scripts. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] Relax y mariscada por 60¤ / Toalla Antonio Miró por 21,69¤ / Kit de 18 piezas para Nintendo por 19,12¤
Cabestan. Layout Doble Asegúrate de no perderte ninguna oferta, añade ofer...@globalbono.com a tu lista de contactos. Si no ves correctamente las imágenes, pulsa [ http://email.globalbono.com/E07082012123134.cfm?WL=969&WS=208833_8707685&WA=417 ] aquí [ http://email.globalbono.com/Go/index.cfm?WL=564&WS=208833_8707685&WA=417 ] Tus Ofertas de hoy Síguenos en: [ http://email.globalbono.com/Go/index.cfm?WL=50&WS=208833_8707685&WA=417 ] [ http://email.globalbono.com/Go/index.cfm?WL=51&WS=208833_8707685&WA=417 ] [ http://email.globalbono.com/Go/index.cfm?WL=674&WS=208833_8707685&WA=417 ] 58% Dto. Silla Super Plegable Llévate donde quieras esta silla más que plegable. Aguanta hasta 100 kilos de peso por s... 10,47¤ [ http://email.globalbono.com/Go/index.cfm?WL=674&WS=208833_8707685&WA=417 ] Antes 25,00¤Dto: 58% Ahorro: 14,53¤ [ http://email.globalbono.com/Go/index.cfm?WL=194&WS=208833_8707685&WA=417 ] 56% Dto. Máquina de Sushi Máquina para hacer sushi que te ayudará a prepararlo perfectamente formado y de manera m... 12,67¤ [ http://email.globalbono.com/Go/index.cfm?WL=194&WS=208833_8707685&WA=417 ] Antes 29,00¤Dto: 56% Ahorro: 16,33¤ [ http://email.globalbono.com/Go/index.cfm?WL=948&WS=208833_8707685&WA=417 ] 54% Dto. Noche en Santiago, Hártate de Mariscada, Duerme en Hotel Tres Estrellas Disfruta de una fantástica noche en Santiago, y de una maravillosa mariscada. Porque con... 60,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=948&WS=208833_8707685&WA=417 ] Antes 130,00¤Dto: 54% Ahorro: 70,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=962&WS=208833_8707685&WA=417 ] 56% Dto. Amor De Luxe en Sevilla: Una Noche Perfecta en un Hotel Cinco Estrellas Junto al Guadalquivir Visitar Sevilla, salir a tomar una copa, pasear de vuelta al hotel junto al Guadalquivir... 46,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=962&WS=208833_8707685&WA=417 ] Antes 105,00¤Dto: 56% Ahorro: 59,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=886&WS=208833_8707685&WA=417 ] 25% Dto. Toalla Antonio Miro Llévate a casa esta fantástica toalla 75 x 150 diseño de Antonio Miro, ves a la playa o ... 21,69¤ [ http://email.globalbono.com/Go/index.cfm?WL=886&WS=208833_8707685&WA=417 ] Antes 29,00¤Dto: 25% Ahorro: 7,31¤ [ http://email.globalbono.com/Go/index.cfm?WL=904&WS=208833_8707685&WA=417 ] 45% Dto. Kit 18 en 1 para Nintendo DS/DSL Completo kit de accesorios para la consola Nintendo DS/DSL, con: funda de transporte, pu... 19,12¤ [ http://email.globalbono.com/Go/index.cfm?WL=904&WS=208833_8707685&WA=417 ] Antes 34,97¤Dto: 45% Ahorro: 15,85¤ [ http://email.globalbono.com/Go/index.cfm?WL=706&WS=208833_8707685&WA=417 ] De conformidad con lo que establece la Ley Orgánica 15/1999 de Protección de Datos de Carácter Personal, le informamos que sus datos personales serán incorporados a un fichero bajo la responsabilidad de GlobalBono, con la finalidad de poder atender los compromisos derivados de la relación que mantenemos con usted. Puede ejercer sus derechos de acceso, cancelación, rectificación y oposición mediante un escrito a la dirección email i...@globalbono.com. Si en el plazo de 30 días no nos comunica lo contrario, entenderemos que los datos no han sido modificados, que se compromete a notificarnos cualquier variación y que tenemos el consentimiento para utilizarlos a fin de poder fidelizar la relación entre las partes. ¿Quieres dejar de recibir este email? Haz click [ http://email.globalbono.com/baja/form_baja_GB.cfm?WL=182&WS=208833_8707685&WA=417 ] aquí. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 1/3] reconnect: Rename reconnect_received() to reconnect_activity().
Receiving data is not the only reasonable way to verify that a connection is up. For example, on a TCP connection, receiving an acknowledgment that the remote side has accepted data that we sent is also a reasonable means. Therefore, this commit generalizes the naming. Also, similarly for the Python implementation: Reconnect.received() becomes Reconnect.activity(). Signed-off-by: Ben Pfaff --- lib/jsonrpc.c |2 +- lib/reconnect.c | 32 +++--- lib/reconnect.h |6 ++-- python/ovs/jsonrpc.py |2 +- python/ovs/reconnect.py | 29 ++- tests/reconnect.at | 68 +++--- tests/test-reconnect.c | 14 +- tests/test-reconnect.py | 14 +- 8 files changed, 84 insertions(+), 83 deletions(-) diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c index 5761369..c4d7dd2 100644 --- a/lib/jsonrpc.c +++ b/lib/jsonrpc.c @@ -977,7 +977,7 @@ jsonrpc_session_recv(struct jsonrpc_session *s) struct jsonrpc_msg *msg; jsonrpc_recv(s->rpc, &msg); if (msg) { -reconnect_received(s->reconnect, time_msec()); +reconnect_activity(s->reconnect, time_msec()); if (msg->type == JSONRPC_REQUEST && !strcmp(msg->method, "echo")) { /* Echo request. Send reply. */ struct jsonrpc_msg *reply; diff --git a/lib/reconnect.c b/lib/reconnect.c index 0333d96..0f1e062 100644 --- a/lib/reconnect.c +++ b/lib/reconnect.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2008, 2009, 2010 Nicira, Inc. + * Copyright (c) 2008, 2009, 2010, 2012 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -58,7 +58,7 @@ struct reconnect { enum state state; long long int state_entered; int backoff; -long long int last_received; +long long int last_activity; long long int last_connected; long long int last_disconnected; unsigned int max_tries; @@ -105,7 +105,7 @@ reconnect_create(long long int now) fsm->state = S_VOID; fsm->state_entered = now; fsm->backoff = 0; -fsm->last_received = now; +fsm->last_activity = now; fsm->last_connected = LLONG_MAX; fsm->last_disconnected = LLONG_MAX; fsm->max_tries = UINT_MAX; @@ -176,9 +176,9 @@ reconnect_get_max_backoff(const struct reconnect *fsm) /* Returns the "probe interval" for 'fsm' in milliseconds. If this is zero, it * disables the connection keepalive feature. If it is nonzero, then if the - * interval passes while 'fsm' is connected and without reconnect_received() + * interval passes while 'fsm' is connected and without reconnect_activity() * being called for 'fsm', reconnect_run() returns RECONNECT_PROBE. If the - * interval passes again without reconnect_received() being called, + * interval passes again without reconnect_activity() being called, * reconnect_run() returns RECONNECT_DISCONNECT for 'fsm'. */ int reconnect_get_probe_interval(const struct reconnect *fsm) @@ -233,8 +233,8 @@ reconnect_set_backoff(struct reconnect *fsm, int min_backoff, int max_backoff) /* Sets the "probe interval" for 'fsm' to 'probe_interval', in milliseconds. * If this is zero, it disables the connection keepalive feature. If it is * nonzero, then if the interval passes while 'fsm' is connected and without - * reconnect_received() being called for 'fsm', reconnect_run() returns - * RECONNECT_PROBE. If the interval passes again without reconnect_received() + * reconnect_activity() being called for 'fsm', reconnect_run() returns + * RECONNECT_PROBE. If the interval passes again without reconnect_activity() * being called, reconnect_run() returns RECONNECT_DISCONNECT for 'fsm'. * * If 'probe_interval' is nonzero, then it will be forced to a value of at @@ -360,7 +360,7 @@ reconnect_disconnected(struct reconnect *fsm, long long int now, int error) } /* Back off. */ if (fsm->state & (S_ACTIVE | S_IDLE) - && (fsm->last_received - fsm->last_connected >= fsm->backoff + && (fsm->last_activity - fsm->last_connected >= fsm->backoff || fsm->passive)) { fsm->backoff = fsm->passive ? 0 : fsm->min_backoff; } else { @@ -441,7 +441,7 @@ reconnect_listen_error(struct reconnect *fsm, long long int now, int error) /* Tell 'fsm' that the connection was successful. * * The FSM will start the probe interval timer, which is reset by - * reconnect_received(). If the timer expires, a probe will be sent (by + * reconnect_activity(). If the timer expires, a probe will be sent (by * returning RECONNECT_PROBE from reconnect_run()). If the timer expires * again without being reset, the connection will be aborted (by returning * RECONNECT_DISCONNECT from reconnect_run()). */ @@ -467,15 +467,15 @@ reconnect_connect_failed(struct reconnect *fsm, long lon
[ovs-dev] [PATCH 2/3] jsonrpc: Treat draining data from send queue as activity.
Until now, the jsonrpc module has used messages received from the remote peer as the sole means to determine that the JSON-RPC connection is up. This could in theory interact badly with a remote peer that stops reading and processing messages from the receive queue when there is a backlog in the send queue for a given connection (ovsdb-server is an example of a program that behaves this way). This commit fixes the problem by expanding the definition of "activity" to include successfully sending JSON-RPC data that was previously queued. The above change is exactly analogous to the similar change made to the rconn library in commit 133f2dc95454 (rconn: Treat draining a message from the send queue as activity.). Bug #12789. Signed-off-by: Ben Pfaff --- lib/jsonrpc.c | 13 + python/ovs/jsonrpc.py | 11 ++- 2 files changed, 23 insertions(+), 1 deletions(-) diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c index c4d7dd2..9c0aaa9 100644 --- a/lib/jsonrpc.c +++ b/lib/jsonrpc.c @@ -880,9 +880,22 @@ jsonrpc_session_run(struct jsonrpc_session *s) } if (s->rpc) { +size_t backlog; int error; +backlog = jsonrpc_get_backlog(s->rpc); jsonrpc_run(s->rpc); +if (jsonrpc_get_backlog(s->rpc) < backlog) { +/* Data previously caught in a queue was successfully sent. + * + * We don't count data that is successfully sent immediately as + * activity, because there's a lot of queuing downstream from us, + * which means that we can push a lot of data into a connection + * that has stalled and won't ever recover. + */ +reconnect_activity(s->reconnect, time_msec()); +} + error = jsonrpc_get_status(s->rpc); if (error) { reconnect_disconnected(s->reconnect, time_msec(), error); diff --git a/python/ovs/jsonrpc.py b/python/ovs/jsonrpc.py index cb471cb..4496bba 100644 --- a/python/ovs/jsonrpc.py +++ b/python/ovs/jsonrpc.py @@ -502,9 +502,18 @@ class Session(object): def recv(self): if self.rpc is not None: +backlog = self.rpc.get_backlog() error, msg = self.rpc.recv() -if not error: +if self.rpc.get_backlog() < backlog: +# Data previously caught in a queue was successfully sent. +# +# We don't count data that is successfully sent immediately as +# activity, because there's a lot of queuing downstream from +# us, which means that we can push a lot of data into a +# connection that has stalled and won't ever recover. self.reconnect.activity(ovs.timeval.msec()) + +if not error: if msg.type == Message.T_REQUEST and msg.method == "echo": # Echo request. Send reply. self.send(Message.create_reply(msg.params, msg.id)) -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 0/3] Fix bugs in jsonrpc that we previously fixed in rconn
The problem fixed in commit 612be57384 also exists in the jsonrpc code, although we haven't actually observed it there. This series fixes that problem in jsonrpc too, as a preemptive measure. Ben Pfaff (3): reconnect: Rename reconnect_received() to reconnect_activity(). jsonrpc: Treat draining data from send queue as activity. jsonrpc: Treat receiving part of a message as activity. lib/jsonrpc.c | 34 ++- lib/jsonrpc.h |1 + lib/reconnect.c | 32 +++--- lib/reconnect.h |6 ++-- python/ovs/jsonrpc.py | 25 - python/ovs/reconnect.py | 29 ++- tests/reconnect.at | 68 +++--- tests/test-reconnect.c | 14 +- tests/test-reconnect.py | 14 +- 9 files changed, 140 insertions(+), 83 deletions(-) -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 3/3] jsonrpc: Treat receiving part of a message as activity.
Until now, the jsonrpc code has only counted receiving a full JSON-RPC messages as activity. This could theoretically time out, then, while a very long message is in transit or if a slow link is involved. This commit changes this code to count receiving any part of a message as activity. This isn't a problem for OpenFlow connections because OpenFlow messages are at most 64 kB in size. This problem hasn't actually been observed in practice. Bug #12789. Signed-off-by: Ben Pfaff --- lib/jsonrpc.c | 21 - lib/jsonrpc.h |1 + python/ovs/jsonrpc.py | 14 ++ 3 files changed, 35 insertions(+), 1 deletions(-) diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c index 9c0aaa9..81a6e53 100644 --- a/lib/jsonrpc.c +++ b/lib/jsonrpc.c @@ -178,6 +178,14 @@ jsonrpc_get_backlog(const struct jsonrpc *rpc) return rpc->status ? 0 : rpc->backlog; } +/* Returns the number of bytes that have been received on 'rpc''s underlying + * stream. (The value wraps around if it exceeds UINT_MAX.) */ +unsigned int +jsonrpc_get_received_bytes(const struct jsonrpc *rpc) +{ +return rpc->input.head; +} + /* Returns 'rpc''s name, that is, the name returned by stream_get_name() for * the stream underlying 'rpc' when 'rpc' was created. */ const char * @@ -987,10 +995,21 @@ struct jsonrpc_msg * jsonrpc_session_recv(struct jsonrpc_session *s) { if (s->rpc) { +unsigned int received_bytes; struct jsonrpc_msg *msg; + +received_bytes = jsonrpc_get_received_bytes(s->rpc); jsonrpc_recv(s->rpc, &msg); -if (msg) { +if (received_bytes != jsonrpc_get_received_bytes(s->rpc)) { +/* Data was successfully received. + * + * Previously we only counted receiving a full message as activity, + * but with large messages or a slow connection that policy could + * time out the session mid-message. */ reconnect_activity(s->reconnect, time_msec()); +} + +if (msg) { if (msg->type == JSONRPC_REQUEST && !strcmp(msg->method, "echo")) { /* Echo request. Send reply. */ struct jsonrpc_msg *reply; diff --git a/lib/jsonrpc.h b/lib/jsonrpc.h index cd78170..44ae06f 100644 --- a/lib/jsonrpc.h +++ b/lib/jsonrpc.h @@ -50,6 +50,7 @@ void jsonrpc_wait(struct jsonrpc *); int jsonrpc_get_status(const struct jsonrpc *); size_t jsonrpc_get_backlog(const struct jsonrpc *); +unsigned int jsonrpc_get_received_bytes(const struct jsonrpc *); const char *jsonrpc_get_name(const struct jsonrpc *); int jsonrpc_send(struct jsonrpc *, struct jsonrpc_msg *); diff --git a/python/ovs/jsonrpc.py b/python/ovs/jsonrpc.py index 4496bba..67e798a 100644 --- a/python/ovs/jsonrpc.py +++ b/python/ovs/jsonrpc.py @@ -186,6 +186,7 @@ class Connection(object): self.input = "" self.output = "" self.parser = None +self.received_bytes = 0 def close(self): self.stream.close() @@ -221,6 +222,9 @@ class Connection(object): else: return len(self.output) +def get_received_bytes(self): +return self.received_bytes + def __log_msg(self, title, msg): vlog.dbg("%s: %s %s" % (self.name, title, msg)) @@ -271,6 +275,7 @@ class Connection(object): return EOF, None else: self.input += data +self.received_bytes += len(data) else: if self.parser is None: self.parser = ovs.json.Parser() @@ -444,7 +449,16 @@ class Session(object): self.pstream = None if self.rpc: +received_bytes = self.rpc.get_received_bytes() self.rpc.run() +if received_bytes != self.rpc.get_received_bytes(): +# Data was successfully received. +# +# Previously we only counted receiving a full message as +# activity, but with large messages or a slow connection that +# policy could time out the session mid-message. +self.reconnect.activity(ovs.timeval.msec()) + error = self.rpc.get_status() if error != 0: self.reconnect.disconnected(ovs.timeval.msec(), error) -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 07/41 v11] ofp-util: Prepare Packet Out encoder for other Open Flow versions
On Wed, Aug 08, 2012 at 12:19:57PM +0900, Simon Horman wrote: > Signed-off-by: Simon Horman > -size_t size; > +size_t size = 0; > > size = po->ofpacts_len; > if (po->buffer_id == UINT32_MAX) { > -size += po->packet_len; > +size = po->packet_len; > } I know of a reason for either of these changes, so I omitted them. Thanks, Ben. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 08/41] ofp-util: Allow encoding of Open Flow 1.1 and 1.2 Packet Out Messages
On Wed, Aug 08, 2012 at 06:49:43AM +0900, Simon Horman wrote: > Signed-off-by: Simon Horman Thanks. I applied the series up to this point to master. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 09/41] ofp-util: Prepare Packet Out decoder for other Open Flow versions
On Wed, Aug 08, 2012 at 06:49:44AM +0900, Simon Horman wrote: > Signed-off-by: Simon Horman > { > -const struct ofp_packet_out *opo; > -enum ofperr error; > enum ofpraw raw; > struct ofpbuf b; > +enum ofperr bad_in_port_err = OFPERR_OFPET_BAD_REQUEST; > > ofpbuf_use_const(&b, oh, ntohs(oh->length)); > raw = ofpraw_pull_assert(&b); > -assert(raw == OFPRAW_OFPT10_PACKET_OUT); > > -opo = ofpbuf_pull(&b, sizeof *opo); > -po->buffer_id = ntohl(opo->buffer_id); > -po->in_port = ntohs(opo->in_port); > +if (raw == OFPRAW_OFPT10_PACKET_OUT) { > +enum ofperr error; > +const struct ofp_packet_out *opo = ofpbuf_pull(&b, sizeof *opo); > + > +po->buffer_id = ntohl(opo->buffer_id); > +po->in_port = ntohs(opo->in_port); > + > +error = ofpacts_pull_openflow10(&b, ntohs(opo->actions_len), > ofpacts); > +if (error) { > +return error; > +} > +bad_in_port_err = OFPERR_NXBRC_BAD_IN_PORT; > +} else { > +NOT_REACHED(); > +} I think that we should consistently use OFPERR_NXBRC_BAD_IN_PORT. OFPERR_OFPET_BAD_REQUEST isn't a good substitute because it isn't encodable as an error at all (it is an error category, not an error itself). If there's a better choice for a standard error code in some OpenFlow version then I'm happy to use that. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 07/41 v11] ofp-util: Prepare Packet Out encoder for other Open Flow versions
On Wed, Aug 08, 2012 at 04:32:49PM -0700, Ben Pfaff wrote: > On Wed, Aug 08, 2012 at 12:19:57PM +0900, Simon Horman wrote: > > Signed-off-by: Simon Horman > > > -size_t size; > > +size_t size = 0; > > > > size = po->ofpacts_len; > > if (po->buffer_id == UINT32_MAX) { > > -size += po->packet_len; > > +size = po->packet_len; > > } > > I know of a reason for either of these changes, so I omitted them. Sorry about that, they are a legacy from much earlier revisions of this code. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 09/41] ofp-util: Prepare Packet Out decoder for other Open Flow versions
On Wed, Aug 08, 2012 at 04:42:30PM -0700, Ben Pfaff wrote: > On Wed, Aug 08, 2012 at 06:49:44AM +0900, Simon Horman wrote: > > Signed-off-by: Simon Horman > > { > > -const struct ofp_packet_out *opo; > > -enum ofperr error; > > enum ofpraw raw; > > struct ofpbuf b; > > +enum ofperr bad_in_port_err = OFPERR_OFPET_BAD_REQUEST; > > > > ofpbuf_use_const(&b, oh, ntohs(oh->length)); > > raw = ofpraw_pull_assert(&b); > > -assert(raw == OFPRAW_OFPT10_PACKET_OUT); > > > > -opo = ofpbuf_pull(&b, sizeof *opo); > > -po->buffer_id = ntohl(opo->buffer_id); > > -po->in_port = ntohs(opo->in_port); > > +if (raw == OFPRAW_OFPT10_PACKET_OUT) { > > +enum ofperr error; > > +const struct ofp_packet_out *opo = ofpbuf_pull(&b, sizeof *opo); > > + > > +po->buffer_id = ntohl(opo->buffer_id); > > +po->in_port = ntohs(opo->in_port); > > + > > +error = ofpacts_pull_openflow10(&b, ntohs(opo->actions_len), > > ofpacts); > > +if (error) { > > +return error; > > +} > > +bad_in_port_err = OFPERR_NXBRC_BAD_IN_PORT; > > +} else { > > +NOT_REACHED(); > > +} > > I think that we should consistently use OFPERR_NXBRC_BAD_IN_PORT. > OFPERR_OFPET_BAD_REQUEST isn't a good substitute because it isn't > encodable as an error at all (it is an error category, not an error > itself). > > If there's a better choice for a standard error code in some OpenFlow > version then I'm happy to use that. I'm comfortable with using OFPERR_NXBRC_BAD_IN_PORT if you are. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 09/41] ofp-util: Prepare Packet Out decoder for other Open Flow versions
On Thu, Aug 09, 2012 at 08:56:10AM +0900, Simon Horman wrote: > On Wed, Aug 08, 2012 at 04:42:30PM -0700, Ben Pfaff wrote: > > On Wed, Aug 08, 2012 at 06:49:44AM +0900, Simon Horman wrote: > > > Signed-off-by: Simon Horman > > > { > > > -const struct ofp_packet_out *opo; > > > -enum ofperr error; > > > enum ofpraw raw; > > > struct ofpbuf b; > > > +enum ofperr bad_in_port_err = OFPERR_OFPET_BAD_REQUEST; > > > > > > ofpbuf_use_const(&b, oh, ntohs(oh->length)); > > > raw = ofpraw_pull_assert(&b); > > > -assert(raw == OFPRAW_OFPT10_PACKET_OUT); > > > > > > -opo = ofpbuf_pull(&b, sizeof *opo); > > > -po->buffer_id = ntohl(opo->buffer_id); > > > -po->in_port = ntohs(opo->in_port); > > > +if (raw == OFPRAW_OFPT10_PACKET_OUT) { > > > +enum ofperr error; > > > +const struct ofp_packet_out *opo = ofpbuf_pull(&b, sizeof *opo); > > > + > > > +po->buffer_id = ntohl(opo->buffer_id); > > > +po->in_port = ntohs(opo->in_port); > > > + > > > +error = ofpacts_pull_openflow10(&b, ntohs(opo->actions_len), > > > ofpacts); > > > +if (error) { > > > +return error; > > > +} > > > +bad_in_port_err = OFPERR_NXBRC_BAD_IN_PORT; > > > +} else { > > > +NOT_REACHED(); > > > +} > > > > I think that we should consistently use OFPERR_NXBRC_BAD_IN_PORT. > > OFPERR_OFPET_BAD_REQUEST isn't a good substitute because it isn't > > encodable as an error at all (it is an error category, not an error > > itself). > > > > If there's a better choice for a standard error code in some OpenFlow > > version then I'm happy to use that. > > I'm comfortable with using OFPERR_NXBRC_BAD_IN_PORT if you are. Thanks. This is what I ended up with: --8<--cut here-->8-- From: Simon Horman Date: Wed, 8 Aug 2012 06:49:44 +0900 Subject: [PATCH] ofp-util: Prepare Packet Out decoder for other Open Flow versions Signed-off-by: Simon Horman Signed-off-by: Ben Pfaff --- lib/ofp-util.c | 25 +++-- 1 files changed, 15 insertions(+), 10 deletions(-) diff --git a/lib/ofp-util.c b/lib/ofp-util.c index 4632482..64eb5ca 100644 --- a/lib/ofp-util.c +++ b/lib/ofp-util.c @@ -2149,18 +2149,27 @@ ofputil_decode_packet_out(struct ofputil_packet_out *po, const struct ofp_header *oh, struct ofpbuf *ofpacts) { -const struct ofp_packet_out *opo; -enum ofperr error; enum ofpraw raw; struct ofpbuf b; ofpbuf_use_const(&b, oh, ntohs(oh->length)); raw = ofpraw_pull_assert(&b); -assert(raw == OFPRAW_OFPT10_PACKET_OUT); -opo = ofpbuf_pull(&b, sizeof *opo); -po->buffer_id = ntohl(opo->buffer_id); -po->in_port = ntohs(opo->in_port); +if (raw == OFPRAW_OFPT10_PACKET_OUT) { +enum ofperr error; +const struct ofp_packet_out *opo = ofpbuf_pull(&b, sizeof *opo); + +po->buffer_id = ntohl(opo->buffer_id); +po->in_port = ntohs(opo->in_port); + +error = ofpacts_pull_openflow10(&b, ntohs(opo->actions_len), ofpacts); +if (error) { +return error; +} +} else { +NOT_REACHED(); +} + if (po->in_port >= OFPP_MAX && po->in_port != OFPP_LOCAL && po->in_port != OFPP_NONE && po->in_port != OFPP_CONTROLLER) { VLOG_WARN_RL(&bad_ofmsg_rl, "packet-out has bad input port %#"PRIx16, @@ -2168,10 +2177,6 @@ ofputil_decode_packet_out(struct ofputil_packet_out *po, return OFPERR_NXBRC_BAD_IN_PORT; } -error = ofpacts_pull_openflow10(&b, ntohs(opo->actions_len), ofpacts); -if (error) { -return error; -} po->ofpacts = ofpacts->data; po->ofpacts_len = ofpacts->size; -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 10/41] ofp-util: Allow decoding of Open Flow 1.1 and 1.2 Packet Out Messages
On Wed, Aug 08, 2012 at 06:49:45AM +0900, Simon Horman wrote: > Signed-off-by: Simon Horman > +bad_in_port_err = OFPERR_OFPBMC_BAD_VALUE; Oh, I see, I misunderstood that there *was* a reasonable error code in other versions of OpenFlow. Sorry about that. I applied the following incremental to your patch which, I think, makes 9+10 the same as 9+10 as you sent, just with bits moved between the patches: diff --git a/lib/ofp-util.c b/lib/ofp-util.c index 1139bb1..6b92dcd 100644 --- a/lib/ofp-util.c +++ b/lib/ofp-util.c @@ -2149,6 +2149,7 @@ ofputil_decode_packet_out(struct ofputil_packet_out *po, const struct ofp_header *oh, struct ofpbuf *ofpacts) { +enum ofperr bad_in_port_err; enum ofpraw raw; struct ofpbuf b; @@ -2183,6 +2184,8 @@ ofputil_decode_packet_out(struct ofputil_packet_out *po, if (error) { return error; } + +bad_in_port_err = OFPERR_NXBRC_BAD_IN_PORT; } else { NOT_REACHED(); } @@ -2191,7 +2194,7 @@ ofputil_decode_packet_out(struct ofputil_packet_out *po, && po->in_port != OFPP_NONE && po->in_port != OFPP_CONTROLLER) { VLOG_WARN_RL(&bad_ofmsg_rl, "packet-out has bad input port %#"PRIx16, po->in_port); -return OFPERR_NXBRC_BAD_IN_PORT; +return bad_in_port_err; } po->ofpacts = ofpacts->data; ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 09/41] ofp-util: Prepare Packet Out decoder for other Open Flow versions
On Wed, Aug 08, 2012 at 05:00:02PM -0700, Ben Pfaff wrote: > On Thu, Aug 09, 2012 at 08:56:10AM +0900, Simon Horman wrote: > > On Wed, Aug 08, 2012 at 04:42:30PM -0700, Ben Pfaff wrote: > > > On Wed, Aug 08, 2012 at 06:49:44AM +0900, Simon Horman wrote: > > > > Signed-off-by: Simon Horman > > > > { > > > > -const struct ofp_packet_out *opo; > > > > -enum ofperr error; > > > > enum ofpraw raw; > > > > struct ofpbuf b; > > > > +enum ofperr bad_in_port_err = OFPERR_OFPET_BAD_REQUEST; > > > > > > > > ofpbuf_use_const(&b, oh, ntohs(oh->length)); > > > > raw = ofpraw_pull_assert(&b); > > > > -assert(raw == OFPRAW_OFPT10_PACKET_OUT); > > > > > > > > -opo = ofpbuf_pull(&b, sizeof *opo); > > > > -po->buffer_id = ntohl(opo->buffer_id); > > > > -po->in_port = ntohs(opo->in_port); > > > > +if (raw == OFPRAW_OFPT10_PACKET_OUT) { > > > > +enum ofperr error; > > > > +const struct ofp_packet_out *opo = ofpbuf_pull(&b, sizeof > > > > *opo); > > > > + > > > > +po->buffer_id = ntohl(opo->buffer_id); > > > > +po->in_port = ntohs(opo->in_port); > > > > + > > > > +error = ofpacts_pull_openflow10(&b, ntohs(opo->actions_len), > > > > ofpacts); > > > > +if (error) { > > > > +return error; > > > > +} > > > > +bad_in_port_err = OFPERR_NXBRC_BAD_IN_PORT; > > > > +} else { > > > > +NOT_REACHED(); > > > > +} > > > > > > I think that we should consistently use OFPERR_NXBRC_BAD_IN_PORT. > > > OFPERR_OFPET_BAD_REQUEST isn't a good substitute because it isn't > > > encodable as an error at all (it is an error category, not an error > > > itself). > > > > > > If there's a better choice for a standard error code in some OpenFlow > > > version then I'm happy to use that. > > > > I'm comfortable with using OFPERR_NXBRC_BAD_IN_PORT if you are. > > Thanks. > > This is what I ended up with: Thanks, looks good. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 11/41] ofp-util: Allow encoding of Open Flow 1.2 Flow Removed messages
On Wed, Aug 08, 2012 at 06:49:46AM +0900, Simon Horman wrote: > Signed-off-by: Simon Horman Hmm, ofp12_flow_removed has a 'hard_timeout' member that this doesn't populate. I queued this up anyway. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 12/41] ofp-util: Allow encoding Open Flow 1.2 Flow Stats Request Messages
On Wed, Aug 08, 2012 at 06:49:47AM +0900, Simon Horman wrote: > Signed-off-by: Simon Horman > +case OFPUTIL_P_OF12: { > +struct ofp11_flow_stats_request *ofsr; > + > +raw = (fsr->aggregate > + ? OFPRAW_OFPST_AGGREGATE_REQUEST > + : OFPRAW_OFPST_FLOW_REQUEST); > +msg = ofpraw_alloc(raw, OFP12_VERSION, 0); I changed 0 to NXM_TYPICAL_LEN here, and in the existing OFPUTIL_P_NXM case too. > +ofsr = ofpbuf_put_zeros(msg, sizeof *ofsr); > +ofsr->table_id = fsr->table_id; > +ofsr->out_port = ofputil_port_to_ofp11(fsr->out_port); > +ofsr->out_group = htonl(OFPG11_ANY); > +ofsr->cookie = fsr->cookie; > +ofsr->cookie_mask = fsr->cookie_mask; > +oxm_put_match(msg, &fsr->match); > +break; > +} ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 12/41] ofp-util: Allow encoding Open Flow 1.2 Flow Stats Request Messages
On Wed, Aug 08, 2012 at 05:10:18PM -0700, Ben Pfaff wrote: > On Wed, Aug 08, 2012 at 06:49:47AM +0900, Simon Horman wrote: > > Signed-off-by: Simon Horman > > +case OFPUTIL_P_OF12: { > > +struct ofp11_flow_stats_request *ofsr; > > + > > +raw = (fsr->aggregate > > + ? OFPRAW_OFPST_AGGREGATE_REQUEST > > + : OFPRAW_OFPST_FLOW_REQUEST); > > +msg = ofpraw_alloc(raw, OFP12_VERSION, 0); > > I changed 0 to NXM_TYPICAL_LEN here, and in the existing OFPUTIL_P_NXM > case too. I pushed the series to this point to master. I'll continue looking it over in time, no need to resend or rebase at this point. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 1/2] timeval: On Linux x86-64 systems refresh time whenever it is requested.
64-bit glibc appears to avoid syscalls for clock_gettime(), so we can get higher resolution timing and avoid having a timer firing off SIGALRM without introducing extra overhead. Signed-off-by: Leo Alterman --- lib/timeval.c| 36 +--- lib/timeval.h| 15 +++ tests/test-timeval.c |8 3 files changed, 52 insertions(+), 7 deletions(-) diff --git a/lib/timeval.c b/lib/timeval.c index d29b661..3d339e4 100644 --- a/lib/timeval.c +++ b/lib/timeval.c @@ -40,7 +40,7 @@ VLOG_DEFINE_THIS_MODULE(timeval); * to CLOCK_REALTIME. */ static clockid_t monotonic_clock; -/* Has a timer tick occurred? +/* Has a timer tick occurred? Only relevant if CACHE_TIME is 1. * * We initialize these to true to force time_init() to get called on the first * call to time_msec() or another function that queries the current time. */ @@ -94,8 +94,11 @@ time_init(void) VLOG_DBG("monotonic timer not available"); } -set_up_signal(SA_RESTART); -set_up_timer(); +if (CACHE_TIME) { +set_up_signal(SA_RESTART); +set_up_timer(); +} + boot_time = time_msec(); } @@ -168,7 +171,16 @@ void time_postfork(void) { time_init(); -set_up_timer(); + +if (CACHE_TIME) { +set_up_timer(); +} else { +/* If we are not caching kernel time, the only reason the timer should + * exist is if time_alarm() was called and deadline is set */ +if (deadline != TIME_MIN) { +set_up_timer(); +} +} } static void @@ -199,7 +211,9 @@ refresh_monotonic(void) /* Forces a refresh of the current time from the kernel. It is not usually * necessary to call this function, since the time will be refreshed - * automatically at least every TIME_UPDATE_INTERVAL milliseconds. */ + * automatically at least every TIME_UPDATE_INTERVAL milliseconds. If + * CACHE_TIME is 0, we will always refresh the current time so this + * function has no effect. */ void time_refresh(void) { @@ -275,9 +289,17 @@ time_alarm(unsigned int secs) sigset_t oldsigs; time_init(); + block_sigalrm(&oldsigs); deadline = secs ? time_add(time_now(), secs) : TIME_MIN; unblock_sigalrm(&oldsigs); + +if (!CACHE_TIME) { +/* If we aren't timing the gaps between kernel time refreshes we need to + * to start the timer up now */ +set_up_signal(SA_RESTART); +set_up_timer(); +} } /* Like poll(), except: @@ -366,7 +388,7 @@ sigalrm_handler(int sig_nr) static void refresh_wall_if_ticked(void) { -if (wall_tick) { +if (!CACHE_TIME || wall_tick) { refresh_wall(); } } @@ -374,7 +396,7 @@ refresh_wall_if_ticked(void) static void refresh_monotonic_if_ticked(void) { -if (monotonic_tick) { +if (!CACHE_TIME || monotonic_tick) { refresh_monotonic(); } } diff --git a/lib/timeval.h b/lib/timeval.h index e8413ff..12fbbc3 100644 --- a/lib/timeval.h +++ b/lib/timeval.h @@ -36,6 +36,21 @@ BUILD_ASSERT_DECL(TYPE_IS_INTEGER(time_t)); * ever encounter such a platform. */ BUILD_ASSERT_DECL(TYPE_IS_SIGNED(time_t)); +/* On Linux x86-64 systems, glibc avoids using syscalls for clock_gettime(). + * + * For systems which do invoke a system call we wait at least + * TIME_UPDATE_INTERVAL ms between clock_gettime() calls and cache the time for + * the interim. + * + * For systems which do not invoke a system call, we just call clock_gettime() + * whenever the time is requested. As a result we don't start the background + * SIGALRM timer unless explicitly needed by time_alarm() */ +#if defined __LP64__ && defined __linux__ +#define CACHE_TIME 0 +#else +#define CACHE_TIME 1 +#endif + #define TIME_MAX TYPE_MAXIMUM(time_t) #define TIME_MIN TYPE_MINIMUM(time_t) diff --git a/tests/test-timeval.c b/tests/test-timeval.c index d277fc9..504ee9e 100644 --- a/tests/test-timeval.c +++ b/tests/test-timeval.c @@ -97,6 +97,11 @@ main(int argc, char *argv[]) if (argc != 2) { usage(); } else if (!strcmp(argv[1], "plain")) { +/* If we're not caching time there isn't much to test and SIGALRM won't + * be around to pull us out of the select() call, so just skip out */ +if (!CACHE_TIME) +exit (77); + do_test(); } else if (!strcmp(argv[1], "daemon")) { /* Test that time still advances even in a daemon. This is an @@ -104,6 +109,9 @@ main(int argc, char *argv[]) char cwd[1024], *pidfile; FILE *success; +if (!CACHE_TIME) +exit (77); + assert(getcwd(cwd, sizeof cwd) == cwd); unlink("test-timeval.success"); -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 2/2] lockfile: Remove lockfile_lock timeout argument
lockfile_lock() accepts a timeout argument but, aside from unit tests pertaining to timeout, its value is always 0. Since this feature relies on a periodic SIGALRM signal, which is no longer a given after refactoring timeval.c, the cleanest solution is to just remove it. Signed-off-by: Leo Alterman --- lib/lockfile.c| 35 - lib/lockfile.h|2 +- ovsdb/file.c |2 +- ovsdb/log.c |2 +- ovsdb/ovsdb-tool.c|4 +- tests/lockfile.at |2 - tests/test-lockfile.c | 101 ++--- 7 files changed, 41 insertions(+), 107 deletions(-) diff --git a/lib/lockfile.c b/lib/lockfile.c index 7ac3465..0e52941 100644 --- a/lib/lockfile.c +++ b/lib/lockfile.c @@ -87,16 +87,12 @@ lockfile_name(const char *filename_) /* Locks the configuration file against modification by other processes and * re-reads it from disk. * - * The 'timeout' specifies the maximum number of milliseconds to wait for the - * config file to become free. Use 0 to avoid waiting or INT_MAX to wait - * forever. - * * Returns 0 on success, otherwise a positive errno value. On success, * '*lockfilep' is set to point to a new "struct lockfile *" that may be * unlocked with lockfile_unlock(). On failure, '*lockfilep' is set to * NULL. */ int -lockfile_lock(const char *file, int timeout, struct lockfile **lockfilep) +lockfile_lock(const char *file, struct lockfile **lockfilep) { /* Only exclusive ("write") locks are supported. This is not a problem * because the Open vSwitch code that currently uses lock files does so in @@ -110,33 +106,16 @@ lockfile_lock(const char *file, int timeout, struct lockfile **lockfilep) COVERAGE_INC(lockfile_lock); lock_name = lockfile_name(file); -time_refresh(); -start = time_msec(); - -do { -error = lockfile_try_lock(lock_name, timeout > 0, lockfilep); -time_refresh(); -elapsed = time_msec() - start; -if (elapsed > warn_elapsed) { -warn_elapsed *= 2; -VLOG_WARN("%s: waiting for lock file, %lld ms elapsed", - lock_name, elapsed); -} -} while (error == EINTR && (timeout == INT_MAX || elapsed < timeout)); - -if (error == EINTR) { -COVERAGE_INC(lockfile_timeout); -VLOG_WARN("%s: giving up on lock file after %lld ms", - lock_name, elapsed); -error = ETIMEDOUT; -} else if (error) { + +error = lockfile_try_lock(lock_name, 0, lockfilep); + +if (error) { COVERAGE_INC(lockfile_error); if (error == EACCES) { error = EAGAIN; } -VLOG_WARN("%s: failed to lock file " - "(after %lld ms, with %d-ms timeout): %s", - lock_name, elapsed, timeout, strerror(error)); +VLOG_WARN("%s: failed to lock file: %s", + lock_name, strerror(error)); } free(lock_name); diff --git a/lib/lockfile.h b/lib/lockfile.h index 202134b..3e6b54c 100644 --- a/lib/lockfile.h +++ b/lib/lockfile.h @@ -19,7 +19,7 @@ struct lockfile; char *lockfile_name(const char *file); -int lockfile_lock(const char *file, int timeout, struct lockfile **); +int lockfile_lock(const char *file, struct lockfile **); void lockfile_unlock(struct lockfile *); void lockfile_postfork(void); diff --git a/ovsdb/file.c b/ovsdb/file.c index 9e2dd50..43fcb95 100644 --- a/ovsdb/file.c +++ b/ovsdb/file.c @@ -649,7 +649,7 @@ ovsdb_file_compact(struct ovsdb_file *file) /* Lock temporary file. */ tmp_name = xasprintf("%s.tmp", file->file_name); -retval = lockfile_lock(tmp_name, 0, &tmp_lock); +retval = lockfile_lock(tmp_name, &tmp_lock); if (retval) { error = ovsdb_io_error(retval, "could not get lock on %s", tmp_name); goto exit; diff --git a/ovsdb/log.c b/ovsdb/log.c index 9b882dc..b79535a 100644 --- a/ovsdb/log.c +++ b/ovsdb/log.c @@ -80,7 +80,7 @@ ovsdb_log_open(const char *name, enum ovsdb_log_open_mode open_mode, locking = open_mode != OVSDB_LOG_READ_ONLY; } if (locking) { -int retval = lockfile_lock(name, 0, &lockfile); +int retval = lockfile_lock(name, &lockfile); if (retval) { error = ovsdb_io_error(retval, "%s: failed to lock lockfile", name); diff --git a/ovsdb/ovsdb-tool.c b/ovsdb/ovsdb-tool.c index f31bdd1..6b75f49 100644 --- a/ovsdb/ovsdb-tool.c +++ b/ovsdb/ovsdb-tool.c @@ -229,14 +229,14 @@ compact_or_convert(const char *src_name_, const char *dst_name_, /* Lock the source, if we will be replacing it. */ if (in_place) { -retval = lockfile_lock(src_name, 0, &src_lock); +retval = lockfile_lock(src_name, &src_lock); if (retval) { ovs_fatal(retval, "%s: failed to lock lockfile", src_name); } } /* Get (temporary) destination and lock it. */ -retval = lock
Re: [ovs-dev] [PATCH] timeval: On 64-bit systems refresh time whenever it is requested.
Thanks guys. Submitted revised patch series which checks for Linux, uses fewer preprocessor conditions, and deals with some of the resulting unit test fallout. ~Leo On Wed, Aug 8, 2012 at 12:24 PM, Ben Pfaff wrote: > On Tue, Aug 07, 2012 at 05:00:59PM -0700, Leo Alterman wrote: > > 64-bit glibc appears to avoid syscalls for clock_gettime(), so we can get > > higher resolution timing and avoid having a timer firing off SIGALRM > > without introducing extra overhead. > > > > Signed-off-by: Leo Alterman > > I agree with Ed Maste that this should be limited to Linux on x86-64 > (not ia64 == Itanium). > > Preprocessor directives should be at the left margin, not indented to > the same level as the code. However, after looking at it, I think > that we shouldn't be using preprocessor directives. Instead, do > something like vswitchd/system-stats.c, which has: > > /* #ifdefs make it a pain to maintain code: you have to try to build both > ways. > * Thus, this file tries to compile as much of the code as possible > regardless > * of the target, by writing "if (LINUX)" instead of "#ifdef __linux__" > where > * this is possible. */ > #ifdef __linux__ > #include > #define LINUX 1 > #else > #define LINUX 0 > #endif > > and then uses "if" instead of "#ifdef" wherever it can. > > Does it pass the unit tests? > > Thanks, > > Ben. > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] tests: Add delay in NetFlow unit tests before killing processes.
At the end of the NetFlow unit tests we warp time to force any remaining flows to expire and then immediately kill OVS and the collector. However, this creates a race where sometimes these processes are killed before the last records are sent or collected. It's possible to force OVS to go through the run loop one last time before exiting but it's harder to enforce that the collector receives the packet. This simply avoids the problem by adding a 1 second delay before killing the processes, which should be more than enough time. Signed-off-by: Jesse Gross --- tests/ofproto-dpif.at |2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index 300d091..8818316 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -958,6 +958,7 @@ for delay in 1000 3; do ovs-appctl time/warp $delay done +sleep 1 OVS_VSWITCHD_STOP ovs-appctl -t test-netflow exit @@ -1006,6 +1007,7 @@ done ovs-appctl time/warp 1 +sleep 1 OVS_VSWITCHD_STOP ovs-appctl -t test-netflow exit -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 12/41] ofp-util: Allow encoding Open Flow 1.2 Flow Stats Request Messages
On Wed, Aug 08, 2012 at 05:10:18PM -0700, Ben Pfaff wrote: > On Wed, Aug 08, 2012 at 06:49:47AM +0900, Simon Horman wrote: > > Signed-off-by: Simon Horman > > +case OFPUTIL_P_OF12: { > > +struct ofp11_flow_stats_request *ofsr; > > + > > +raw = (fsr->aggregate > > + ? OFPRAW_OFPST_AGGREGATE_REQUEST > > + : OFPRAW_OFPST_FLOW_REQUEST); > > +msg = ofpraw_alloc(raw, OFP12_VERSION, 0); > > I changed 0 to NXM_TYPICAL_LEN here, and in the existing OFPUTIL_P_NXM > case too. Thanks, that looks good to me. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 11/41] ofp-util: Allow encoding of Open Flow 1.2 Flow Removed messages
On Wed, Aug 08, 2012 at 05:07:32PM -0700, Ben Pfaff wrote: > On Wed, Aug 08, 2012 at 06:49:46AM +0900, Simon Horman wrote: > > Signed-off-by: Simon Horman > > Hmm, ofp12_flow_removed has a 'hard_timeout' member that this doesn't > populate. Sorry for missing that. Do you think it should be set to the same value as idle_timeout? Or perhaps htonl(UINT32_MAX)? ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 10/41] ofp-util: Allow decoding of Open Flow 1.1 and 1.2 Packet Out Messages
On Wed, Aug 08, 2012 at 05:04:01PM -0700, Ben Pfaff wrote: > On Wed, Aug 08, 2012 at 06:49:45AM +0900, Simon Horman wrote: > > Signed-off-by: Simon Horman > > > +bad_in_port_err = OFPERR_OFPBMC_BAD_VALUE; > > Oh, I see, I misunderstood that there *was* a reasonable error code in > other versions of OpenFlow. Sorry about that. Sorry for the confusion. > > I applied the following incremental to your patch which, I think, > makes 9+10 the same as 9+10 as you sent, just with bits moved between > the patches: Looks good. I think the only difference in my code was to set bad_in_port_err on the line it is declared on, just in case it isn't set. I believe that is what lead to the confusion. > diff --git a/lib/ofp-util.c b/lib/ofp-util.c > index 1139bb1..6b92dcd 100644 > --- a/lib/ofp-util.c > +++ b/lib/ofp-util.c > @@ -2149,6 +2149,7 @@ ofputil_decode_packet_out(struct ofputil_packet_out *po, >const struct ofp_header *oh, >struct ofpbuf *ofpacts) > { > +enum ofperr bad_in_port_err; > enum ofpraw raw; > struct ofpbuf b; > > @@ -2183,6 +2184,8 @@ ofputil_decode_packet_out(struct ofputil_packet_out *po, > if (error) { > return error; > } > + > +bad_in_port_err = OFPERR_NXBRC_BAD_IN_PORT; > } else { > NOT_REACHED(); > } > @@ -2191,7 +2194,7 @@ ofputil_decode_packet_out(struct ofputil_packet_out *po, > && po->in_port != OFPP_NONE && po->in_port != OFPP_CONTROLLER) { > VLOG_WARN_RL(&bad_ofmsg_rl, "packet-out has bad input port %#"PRIx16, > po->in_port); > -return OFPERR_NXBRC_BAD_IN_PORT; > +return bad_in_port_err; > } > > po->ofpacts = ofpacts->data; > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [controller 4/6] learning-switch: Delay sending handshake until version negotiation is done.
On Tue, Aug 07, 2012 at 11:50:19AM -0700, Ben Pfaff wrote: > The learning-switch implementation needs to know the OpenFlow version in > use to send the initial handshake messages (e.g. the feature request), but > the version is not always available at the time that the code currently > sends the handshake. This can cause an assertion failure later when > ofputil_encode_flow_mod() checks the protocol, which will be 0 if the > version wasn't known. > > This commit fixes the problem by introducing a state machine that sends the > handshake messages only after version negotiation has finished. I have done a limited amount of testing of this, using Ryu and ovs-controller, and the results look good. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [controller 4/6] learning-switch: Delay sending handshake until version negotiation is done.
On Thu, Aug 09, 2012 at 12:21:36PM +0900, Simon Horman wrote: > On Tue, Aug 07, 2012 at 11:50:19AM -0700, Ben Pfaff wrote: > > The learning-switch implementation needs to know the OpenFlow version in > > use to send the initial handshake messages (e.g. the feature request), but > > the version is not always available at the time that the code currently > > sends the handshake. This can cause an assertion failure later when > > ofputil_encode_flow_mod() checks the protocol, which will be 0 if the > > version wasn't known. > > > > This commit fixes the problem by introducing a state machine that sends the > > handshake messages only after version negotiation has finished. > > I have done a limited amount of testing of this, > using Ryu and ovs-controller, and the results look good. I don't see how Ryu would be relevant, since it is also a controller, but I'll take your word for it that it works for you too. Thanks, Ben. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/2] lockfile: Remove lockfile_lock timeout argument
On Wed, Aug 08, 2012 at 05:52:58PM -0700, Leo Alterman wrote: > lockfile_lock() accepts a timeout argument but, aside from unit tests > pertaining to timeout, its value is always 0. Since this feature relies on > a periodic SIGALRM signal, which is no longer a given after refactoring > timeval.c, the cleanest solution is to just remove it. > > Signed-off-by: Leo Alterman I think that this fixes a failure in the unit tests that the first patch causes. If so, then that means that this patch should be the first in the series, not the second, so that there is no point at which the testsuite is broken. (One way to reorder the commits is to run "git rebase -i origin/master", then reorder the two "pick" lines in the editor that comes up and save and exit the editor.) The lockfile_lock() comment should mention that it won't block waiting for the lock. I believe that lockfile_try_lock()'s 'block' parameter may now be removed because it is always passed as false. Thanks, Ben. (I'll look at patch 1/2 now.) ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] timeval: On Linux x86-64 systems refresh time whenever it is requested.
On Wed, Aug 08, 2012 at 05:52:57PM -0700, Leo Alterman wrote: > 64-bit glibc appears to avoid syscalls for clock_gettime(), so we can get > higher resolution timing and avoid having a timer firing off SIGALRM > without introducing extra overhead. > > Signed-off-by: Leo Alterman "git am" reported some trailing whitespace: Applying: timeval: On Linux x86-64 systems refresh time whenever it is requested. /home/blp/nicira/ovs/.git/rebase-apply/patch:108: trailing whitespace. * /home/blp/nicira/ovs/.git/rebase-apply/patch:112: trailing whitespace. * /home/blp/nicira/ovs/.git/rebase-apply/patch:121: trailing whitespace. > +/* On Linux x86-64 systems, glibc avoids using syscalls for clock_gettime(). I don't think that this is a feature of glibc. I think that it is a feature of the kernel system call interface, which includes a stub that implements clock_gettime() without doing a user-to-supervisor ring transition and thereby avoids most of the normal system call overhead. > +#if defined __LP64__ && defined __linux__ > +#define CACHE_TIME 0 > +#else > +#define CACHE_TIME 1 > +#endif I think that __LP64__ is meant to indicate that the system has "long" and pointer types that are 64-bits wide. (But I can't find it in the usually comprehensive listing at predef.sf.net.) I think it would be better to test for __x86_64__ instead. In test-timeval.c please add {} around the conditional statements. Thanks, Ben. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [controller 4/6] learning-switch: Delay sending handshake until version negotiation is done.
On Wed, Aug 08, 2012 at 08:57:56PM -0700, Ben Pfaff wrote: > On Thu, Aug 09, 2012 at 12:21:36PM +0900, Simon Horman wrote: > > On Tue, Aug 07, 2012 at 11:50:19AM -0700, Ben Pfaff wrote: > > > The learning-switch implementation needs to know the OpenFlow version in > > > use to send the initial handshake messages (e.g. the feature request), but > > > the version is not always available at the time that the code currently > > > sends the handshake. This can cause an assertion failure later when > > > ofputil_encode_flow_mod() checks the protocol, which will be 0 if the > > > version wasn't known. > > > > > > This commit fixes the problem by introducing a state machine that sends > > > the > > > handshake messages only after version negotiation has finished. > > > > I have done a limited amount of testing of this, > > using Ryu and ovs-controller, and the results look good. > > I don't see how Ryu would be relevant, since it is also a controller, > but I'll take your word for it that it works for you too. True, scratch the Ryu portion of the comment above. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev