From: 'Simon Horman' <[email protected]>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/

Simon says:

There is also an AI-generated review of this patch on sashiko.dev.
And AI-generated reviews of other patches of this series.
But I believe what is forwarded here covers what should
be considered in the context of this patch-set.

---
selftests: net: add vxlan vnifilter notification test

This patch adds a selftest that exercises VXLAN vnifilter netlink
notifications by running bridge monitor vni in the background while
adding, updating, deleting and re-adding VNIs, then counting the
matching lines in the captured output.

> diff --git a/tools/testing/selftests/net/test_vxlan_vnifilter_notify.sh 
> b/tools/testing/selftests/net/test_vxlan_vnifilter_notify.sh
> new file mode 100755
> index 000000000000..9d51a9e02ae0
> --- /dev/null
> +++ b/tools/testing/selftests/net/test_vxlan_vnifilter_notify.sh
> @@ -0,0 +1,184 @@

[ ... ]

> +# Run bridge monitor in the background, execute a command, then count
> +# the notification lines.
> +# Usage: vni_notify_check <command> [args...]
> +# Sets: NOTIFY_COUNT with the number of notifications observed.
> +vni_notify_check()
> +{
> +     local tmpf cmd_ret monitor_pid
> +
> +     tmpf=$(mktemp)
> +     defer rm "$tmpf"
> +
> +     defer_scope_push
> +             ip netns exec "$NS1" bridge monitor vni > "$tmpf" 2>/dev/null &
> +             monitor_pid=$!
> +             defer kill_process "$monitor_pid"
> +
> +             sleep 0.5
> +             if [ ! -e "/proc/$monitor_pid" ]; then
> +                     RET=$ksft_skip
> +                     log_test "iproute2 'bridge monitor vni' not supported"
> +                     return "$RET"
> +             fi

[Medium]
Could the skip path here turn a legitimate skip into a failure?

When bridge monitor vni is not supported, the helper sets RET=$ksft_skip,
calls log_test from inside the helper, and then returns ksft_skip (4).
Every caller follows the helper with:

    vni_notify_check \
        bridge -n "$NS1" vni add vni 1000 dev "$VXLAN_DEV"
    check_err $? "Failed to add VNI"

Since ksft_skip is non-zero, check_err treats the return as a failure
and calls ret_set_ksft_status with ksft_fail. Per the merge weights in
ksft_status_merge (pass < xfail < skip < fail), that demotes RET from
skip to fail.

NOTIFY_COUNT is also never assigned because the function returned
before the grep, so the next:

    [ "$NOTIFY_COUNT" -eq 1 ]
    check_err $? "Expected 1 notification for VNI add, got $NOTIFY_COUNT"

also records a failure, and the caller's log_test then prints FAIL for
the real test name in addition to the SKIP printed by the helper. The
script ends up reporting both a SKIP and a FAIL line for the same case
and exits with a failure status.

Two related observations on this same path:

The bare return is issued between defer_scope_push and defer_scope_pop,
so defer_scope_pop is skipped on the skip path. Does that leave
__DEFER__SCOPE_ID off by one and the registered defer kill_process
"$monitor_pid" only unwound through the EXIT trap rather than the
normal scope unwind?

Would it be cleaner to signal skip back to the caller without going
through check_err (for example, set RET plus a flag and have the caller
short-circuit), and either move log_test to the caller or restructure
the skip path so it does not straddle defer_scope_push/defer_scope_pop?
fdb_notify.sh uses a similar pattern without mixing log_test and an
early return inside the scope.

> +
> +             "$@"
> +             cmd_ret=$?
> +             sleep 0.2
> +     defer_scope_pop
> +
> +     NOTIFY_COUNT=$(grep -c "$VXLAN_DEV" "$tmpf")
> +     NOTIFY_COUNT=${NOTIFY_COUNT:-0}
> +     return "$cmd_ret"
> +}

[Low]
Does this counting strategy actually distinguish RTM_NEWTUNNEL from
RTM_DELTUNNEL?

The commit message states:

    verifies that RTM_NEWTUNNEL and RTM_DELTUNNEL are sent correctly
    when VNIs are added, deleted, or updated

but NOTIFY_COUNT comes from:

    NOTIFY_COUNT=$(grep -c "$VXLAN_DEV" "$tmpf")

which matches both add and delete events emitted by bridge monitor vni.
A regression that emitted RTM_NEWTUNNEL on delete (or vice versa) would
still produce NOTIFY_COUNT=1 and pass every positive test. Would
differentiating the two (for instance, by separately counting lines
that begin with "Deleted " versus those that do not) make the test
match what the commit message describes?

Reply via email to