On Jan 5, 2015, at 12:49 AM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote: > On Sun, 4 Jan 2015 14:18:59, Mike Stump wrote: >> >>> But tsan still gets at least 90% chance to spot that. As a matter of fact, >>> the tsan runtime is just _incredibly_ fast, >>> and catches errors, with a very high reliability. But it is racy by design. >> >> You say by design. I’m skeptical of that. Can you quote the email or the >> code were they said I wanted to design a system that wasn’t 100% reliable, >> because the best they could do was 2% slower, and they didn’t want it to be >> 2% slower? I tend to think someone didn’t figure out they could atomically, >> locklessly do the update, or they didn’t understand the performance hit on >> the lockless code was small, or worse, they had no clue it was racy. I’d be >> curious to know. Maybe, there are other complications that prevent it from >> being non-racy. > > see for instance this thread "Is TSan an exact tool?": > https://groups.google.com/forum/#!topic/thread-sanitizer/mB73m6Nltaw
Yeah, that’s not the issue. The issue is, in this one narrow case, was is possible to use a lockless method of updating the data so that the tool would be slightly better. That post doesn’t address this question. For example, if a patch were submitted to locklessly update so both can notice who is first, was it rejected and why? Anyway, that’s a side issue, and we can safely ignore it. I’m fine with making testing reliable with how it works today. > Nevertheless I would prefer to silence the test now with sleep(1), > And come back later with a follow-up patch to use a un-instrumented and > non-interceptable synchronization primitive in all tsan tests. As it is at > least > not obvious how to do that in the test suite. But it must be possible. I thought the code was posted, I thought how to put it in a file was posted. I think it is a 2 minute endeavor to put all those pieces together and run a test case? You do know how to run just 1 tsan test case with make RUNTESTFLAGS=tsan.exp=aligned_vs_unaligned_race.C check-g++ sort of thing? This can shorten the testing time from hours per iteration to seconds per iteration. Slightly non-obvious, but very handy when developing some types of test cases. So, to help you out, I tried my hand at wiring up the extra library code: gcc/gcc/testsuite/gcc.dg/tsan$ cat test1.c /* { dg-additional-options "-O2" } */ /* { dg-additional-sources "lib.c" } */ void sync(int); int main(int argc, char**argv) { sync(1); sync(2); sync(3); } gcc/gcc/testsuite/gcc.dg/tsan$ cat lib.c /* { dg-do compile } */ #include <stdlib.h> volatile int serial; __attribute__((no_sanitize_address)) void sync(int i) { while (serial != i-1) ; while (1) { if (++serial == i) return; --serial; sleep (); } } and it ran just fine. The options are for both files. I read some of the tsan documentation, and they seem to claim you can have tsan not process a function by merely asking. I did that. I don’t know if that is enough to hide all the semantics that need to be hidden. In the above code, the ++ and -- should be an atomic increment/decrement. The idea is that can can introduce a deterministic ordering to the execution of the code by adding sync(n), where n is the step number, whose range starts at 1. So, for example, here is the test case, with the bits added, so you can see how I transformed what would have been the sleep into the use of the synchronizing primitive from the library. /* { dg-additional-sources "lib.c" } */ /* { dg-shouldfail "tsan" } */ #include <pthread.h> #include <stdio.h> #include <stdint.h> #include <unistd.h> void sync(int); uint64_t Global[2]; void *Thread1(void *x) { sync (2); Global[1]++; return NULL; } void *Thread2(void *x) { char *p1 = reinterpret_cast<char *>(&Global[0]); struct __attribute__((packed, aligned(1))) u_uint64_t { uint64_t val; }; u_uint64_t *p4 = reinterpret_cast<u_uint64_t *>(p1 + 1); (*p4).val++; sync (1); return NULL; } int main() { pthread_t t[2]; pthread_create(&t[0], NULL, Thread1, NULL); pthread_create(&t[1], NULL, Thread2, NULL); pthread_join(t[0], NULL); pthread_join(t[1], NULL); printf("Pass\n"); /* { dg-output "WARNING: ThreadSanitizer: data race.*(\n|\r\n|\r)" } */ /* { dg-output "Pass.*" } */ return 0; } The question is, does this 100% reliably solve the problem?