On 03/06/2024 14.48, Daniel P. Berrangé wrote:
On Wed, May 29, 2024 at 02:53:37PM +0100, Peter Maydell wrote:
On Wed, 29 May 2024 at 14:32, Thomas Huth <th...@redhat.com> wrote:
Casting function pointers from one type to another causes undefined
behavior errors when compiling with -fsanitize=undefined with Clang v18:
$ QTEST_QEMU_BINARY=./qemu-system-mips64 tests/qtest/netdev-socket
TAP version 13
# random seed: R02S4424f4f460de783fdd3d72c5571d3adc
1..10
# Start of mips64 tests
# Start of netdev tests
# Start of stream tests
# starting QEMU: exec ./qemu-system-mips64 -qtest
unix:/tmp/qtest-1213196.sock -qtest-log /dev/null -chardev
socket,path=/tmp/qtest-1213196.qmp,id=char0 -mon chardev=char0,mode=control
-display none -audio none -nodefaults -M none -netdev
stream,id=st0,addr.type=fd,addr.str=3 -accel qtest
../io/task.c:78:13: runtime error: call to function qapi_free_SocketAddress
through pointer to incorrect function type 'void (*)(void *)'
/tmp/qemu-sanitize/qapi/qapi-types-sockets.c:170: note:
qapi_free_SocketAddress defined here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../io/task.c:78:13
It's a pity the sanitizer error doesn't tell you the actual
function type as well as the incorrect one it got cast to
(especially since in this case the function and its declaration
are both in generated code in the build tree not conveniently
findable with 'git grep'.)
In this case the function being called is:
void qapi_free_SocketAddress(SocketAddress *obj)
and it's cast to a GDestroyNotify, which is
typedef void (*GDestroyNotify) (gpointer data);
(and gpointer is void*)
and although you can pass a foo* to a function expecting void*,
you can't treat a pointer to a function taking foo* as if it was
a pointer to a function taking void*, just in case the compiler
needs to do some clever trickery with the pointer value.
So the wrapper function looks like it doesn't do anything,
but it's doing the permitted implicit-cast of the argument
I guess that's the letter of the law in C, but does that actually
matter in practice, historically ?
The use of "(GDestroyNotify)blah" casts is standard practice
across any application using GLib, and even in QEMU this is
far from the only place that does such a cast:
...
I presume this means the rest of the code merely isn't exercised by the
CI job test case this is fixing, but if we're saying this needs fixing,
then we should be fixing all usage.
There are more spots discovered by the CI, I just didn't tackle them yet.
Another problem is e.g. the call_rcu macro from include/qemu/rcu.h...
So overall, I'm not in favour of this patch unless there's a compelling
functional reason why just this 1 case is special and all the others
can be safely ignored.
We'd need to tackle them all. I thought it might be a good idea since we
then could also switch to a more strict CFI mode (i.e. stop using
-fsanitize-cfi-icall-generalize-pointers) ... but yes, looking at the amount
of spots that need fixes, it feels like tilting at windmills.
Maybe best if we really just add -fno-sanitize=function to the CFLAGS when
we're compiling with the sanitizer enabled.
Thomas