On Tue, May 19, 2020 at 05:24:18PM -0300, Adhemerval Zanella wrote:
> On 19/05/2020 16:54, Arnd Bergmann wrote:
> > Jack Schmidt reported a bug for the arm32 clock_gettimeofday64 vdso call
> > last month: https://github.com/richfelker/musl-cross-make/issues/96 and
> > https://github.com/raspberrypi/linux/issues/3579
> >
> > As Will Deacon pointed out, this was never reported on the mailing list,
> > so I'll try to summarize what we know, so this can hopefully be resolved
> > soon.
> >
> > - This happened reproducibly on Linux-5.6 on a 32-bit Raspberry Pi patched
> >   kernel running on a 64-bit Raspberry Pi 4b (bcm2711) when calling
> >   clock_gettime64(CLOCK_REALTIME)
>
> Does it happen with other clocks as well?
>
> > - The kernel tree is at https://github.com/raspberrypi/linux/, but I could
> >   see no relevant changes compared to a mainline kernel.
>
> Is this bug reproducible with a mainline kernel, or can a mainline kernel
> not be booted on bcm2711?
>
> > - From the report, I see that the returned time value is larger than the
> >   expected time, by 3.4 to 14.5 million seconds in four samples, my
> >   guess is that a random number gets added in at some point.
>
> What kind of code are you using to reproduce it? Is it threaded, or does
> it issue clock_gettime from signal handlers?

Original report thread is here:

https://github.com/richfelker/musl-cross-make/issues/96

The reporter originally misunderstood the issue and wrongly attributed
it to a difference between gettimeofday and clock_gettime, but it was
just big jumps between successive vdso clock_gettime64 calls.

No transformation was being done on the output of the vdso function;
as long as it succeeds, musl just returns directly with the value it
stored in the timespec. No threads or anything fancy were involved.

Current musl will no longer call it, but you should be able to
dlopen("linux-gate.so.1", RTLD_NOW|RTLD_LOCAL), then use dlsym to get
its address and call it (not tested; I've never used it this way).

> > - The current musl git tree has been patched to not call clock_gettime64
> >   on ARM because of this problem, so it cannot be used for reproducing it.
>
> So should glibc follow musl and remove arm clock_gettime64 vDSO support,
> or is this bug localized to a specific kernel version running on
> specific hardware?

For musl it was important to disable it asap pending a fix, because
users are expected to generate static binaries, and these could make
it into the wild without anyone realizing they're broken until much
later when run on an affected kernel (especially since pre-5.6 kernels
would hide the issue entirely due to lacking the vdso function).

Ideally a fix will be something we can detect (e.g. a new symbol
version) so as not to risk calling the broken one, but whether that's
necessary may depend on what's affected.

I'm not sure if glibc should do the same; it's not often used in
static linking, and replacing libc (the shared lib, or re-static-linking,
which the LGPL requires you to facilitate in order to distribute static
binaries) could solve the issue on affected systems.

Rich
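
A minimal, untested sketch of the dlopen/dlsym reproduction route
mentioned above, for anyone who wants to try it. Everything here is an
assumption rather than code from musl or the reports: the helper names
(find_vdso_gettime64, count_time_jumps) and the ts64 struct are
hypothetical, the symbol name __vdso_clock_gettime64 is what I believe
the arm vdso exports, and the vdso name is left as a parameter because
it may be "linux-vdso.so.1" rather than "linux-gate.so.1" on arm.
Whether dlopen can see the vdso at all depends on the libc in use.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Assumed result layout for the 64-bit time call on a 32-bit ABI:
 * two 64-bit fields, as in the kernel's struct __kernel_timespec. */
struct ts64 {
    int64_t tv_sec;
    int64_t tv_nsec;
};

typedef int (*vdso_gettime64_fn)(clockid_t, struct ts64 *);

/* Look up the vdso entry point by name. Returns NULL if the vdso
 * cannot be opened under that name or does not export the symbol. */
static vdso_gettime64_fn find_vdso_gettime64(const char *vdso_name)
{
    void *handle = dlopen(vdso_name, RTLD_NOW | RTLD_LOCAL);
    if (!handle)
        return NULL;

    void *sym = dlsym(handle, "__vdso_clock_gettime64");
    if (!sym) {
        dlclose(handle);
        return NULL;
    }
    return (vdso_gettime64_fn)sym;
}

/* Call fn repeatedly and count successive samples whose tv_sec moves
 * backwards or forwards by more than max_step_sec. Returns -1 if any
 * call reports failure. */
static int count_time_jumps(vdso_gettime64_fn fn, int iterations,
                            int64_t max_step_sec)
{
    struct ts64 prev, cur;
    int jumps = 0;

    assert(fn != NULL && iterations > 0);

    if (fn(CLOCK_REALTIME, &prev) != 0)
        return -1;

    for (int i = 1; i < iterations; i++) {
        if (fn(CLOCK_REALTIME, &cur) != 0)
            return -1;

        int64_t delta = cur.tv_sec - prev.tv_sec;
        if (delta < 0 || delta > max_step_sec) {
            jumps++;
            printf("jump of %lld s at sample %d\n", (long long)delta, i);
        }
        prev = cur;
    }
    return jumps;
}
```

To use it, call find_vdso_gettime64() with the vdso name, check the
result for NULL, and then pass it to count_time_jumps() with, say, a
million iterations and a max_step_sec of 1. A nonzero return on an
affected kernel would match the behavior in the report. On older glibc
this needs -ldl at link time.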