On Sat, Jun 22, 2013 at 5:32 AM, Andy Zhou <[email protected]> wrote:
> For architectures that can load and store unaligned longs efficiently,
> use 4- or 8-byte operations. This improves efficiency compared to
> byte-wise operations.
>
> This patch uses ideas and code from a patch submitted by Peter Klausler
> titled "replace memcmp() with specialized comparator". The flow compare
> function is essentially his implementation. The original patch
> mentioned a 7x speedup with this optimization.
>
> Co-authored-by: Peter Klausler <[email protected]>
> Signed-off-by: Andy Zhou <[email protected]>
OK, I think the time has come for this patch...
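For reference, the long-wise compare the commit message describes is
roughly the following shape (my sketch, not the exact patch code;
cmp_key() and its key_start/key_len parameters are illustrative, and it
assumes key_len - key_start is a multiple of sizeof(long)):

static bool cmp_key(const struct sw_flow_key *key1,
		    const struct sw_flow_key *key2,
		    int key_start, int key_len)
{
	/* Compare sizeof(long) bytes at a time; cheap on arches with
	 * efficient unaligned loads. */
	const long *cp1 = (const long *)((const u8 *)key1 + key_start);
	const long *cp2 = (const long *)((const u8 *)key2 + key_start);
	long diffs = 0;
	int i;

	for (i = key_start; i < key_len; i += sizeof(long))
		diffs |= *cp1++ ^ *cp2++;

	return diffs == 0;
}

Accumulating XORs into "diffs" instead of branching per word is what
lets this beat a byte-wise memcmp(): there is a single test at the end.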
> diff --git a/datapath/flow.c b/datapath/flow.c
> index 39de931..273cbea 100644
> --- a/datapath/flow.c
> +++ b/datapath/flow.c
> @@ -343,16 +350,26 @@ static void flow_key_mask(struct sw_flow_key *dst,
> const struct sw_flow_key *src,
> const struct sw_flow_mask *mask)
> {
> - u8 *m = (u8 *)&mask->key + mask->range.start;
> - u8 *s = (u8 *)src + mask->range.start;
> - u8 *d = (u8 *)dst + mask->range.start;
> - int i;
> + const u8 *m = (u8 *)&mask->key;
> + const u8 *s = (u8 *)src;
> + u8 *d = (u8 *)dst;
> + int len = sizeof(*dst);
What's the rationale for dropping the mask->range calculations here?
It shouldn't cause a problem, but I think that offset should always be
aligned. Since we also use the full length, this ends up pulling in
extra data on both sides. A range-based sketch follows below.
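To make that concrete, the range-based masked copy I have in mind looks
something like this (a sketch with a hypothetical helper name, assuming
mask->range.start/end stay long-aligned):

static void mask_key_range(struct sw_flow_key *dst,
			   const struct sw_flow_key *src,
			   const struct sw_flow_mask *mask)
{
	int start = mask->range.start;
	int end = mask->range.end;
	const long *m = (const long *)((const u8 *)&mask->key + start);
	const long *s = (const long *)((const u8 *)src + start);
	long *d = (long *)((u8 *)dst + start);
	int i;

	/* Only touch the bytes the mask actually covers; everything
	 * outside [start, end) is left alone. */
	for (i = start; i < end; i += sizeof(long))
		*d++ = *s++ & *m++;
}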
I wonder if it makes sense to just force a common length for all of
our operations so there isn't a potential for mismatch. That might
also eliminate the need to do any tail checking, depending on the
length we choose.
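For example (an assumption about how we might do it, not something in
this patch), rounding the range out to sizeof(long) boundaries once,
when the mask is installed, would give compare, copy, and hash the same
long-aligned length with no byte tail:

static void mask_set_range(struct sw_flow_key_range *range,
			   int start, int end)
{
	/* Round start down and end up to sizeof(long) boundaries so
	 * every long-wise loop covers the range exactly. */
	range->start = start & ~(sizeof(long) - 1);
	range->end = (end + sizeof(long) - 1) & ~(sizeof(long) - 1);
}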