On Fri, Mar 22, 2024 at 7:34 PM Stephen Hemminger <step...@networkplumber.org> wrote:
>
> Could you build a simple test to see if TSC every runs backwards on
> this machine. Or there could be yet another math error.
> Or maybe container TSC is huge an wrapping around?
>
> The point of the test is to make sure that there wasn't wraparound errors.

Sorry about the wait on this one. We did write a simple C program to check whether TSC ever runs backwards on this system. It reads TSC using __rdtsc(), because that is the same approach used in the x86 rte_cycles.c. It loops for 10 seconds or so, comparing each TSC reading with the previous one, and prints a message if a reading is ever lower than the one before it. Otherwise, at the end it prints that TSC is working normally. On the first run, it showed TSC never running backwards.

Another thing I can do is trigger a full set of testing (so that the system is under normal load) and run the TSC checking program concurrently.

Another idea: maybe multiple timestamps within the same test are gathered from the TSC registers of different CPU cores, and they are misaligned for that reason. We can try reducing the cores for each unit test to 1 and check whether the issue persists. Or there could be another math error, as you say.

I should also mention that, now that I'm looking at this more closely, all of these fail results are unfortunately coming from a new Debian 12 x86 environment which was added a few weeks ago but mistakenly labeled as Debian 11 x86. So the fact that the failures started can be explained by our recently adding this new Debian 12 container.

I'll try a few more things and keep you all updated.
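
For reference, here is a minimal sketch of the kind of check described above. It is not the exact program we ran. The function names and the TSC_CHECK_STANDALONE macro are made up for illustration, and it assumes GCC or Clang on x86, where __rdtsc() comes from <x86intrin.h>.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() */
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <x86intrin.h>  /* __rdtsc(); GCC/Clang on x86 only */

/* Returns 1 if the current reading is lower than the previous one. */
static int tsc_went_backwards(uint64_t prev, uint64_t cur)
{
    return cur < prev;
}

/* Monotonic wall-clock time in seconds, used only to bound the run time. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Read the TSC in a loop for roughly `seconds` and count backward steps. */
static uint64_t count_tsc_backward_steps(double seconds)
{
    const double deadline = now_seconds() + seconds;
    uint64_t prev = __rdtsc();
    uint64_t backward = 0;

    do {
        /* Check the wall clock once per batch to keep the inner loop tight. */
        for (int i = 0; i < 4096; i++) {
            uint64_t cur = __rdtsc();
            if (tsc_went_backwards(prev, cur)) {
                backward++;
                printf("TSC went backwards: %llu -> %llu\n",
                       (unsigned long long)prev, (unsigned long long)cur);
            }
            prev = cur;
        }
    } while (now_seconds() < deadline);

    return backward;
}

#ifdef TSC_CHECK_STANDALONE  /* hypothetical macro: build with -DTSC_CHECK_STANDALONE */
int main(void)
{
    uint64_t backward = count_tsc_backward_steps(10.0);

    if (backward == 0)
        printf("TSC is working normally\n");
    else
        printf("TSC ran backwards %llu time(s)\n", (unsigned long long)backward);
    return backward != 0;
}
#endif
```

Note that this sketch does not pin the thread to a core, so the scheduler may move it between cores during the run. Since the readings are compared as unsigned 64-bit values, a counter wraparound would also be reported as a backward step.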