I've skimmed through the changes. They look sane. You know what you are doing. So unless you have any other open issues, I don't have much to add.
On Wed, 17 May 2023 at 12:21, Dmitry Vyukov <dvyu...@google.com> wrote: > > Hi Dmitry, > > > > > Where can I see the actual changes? Preferably side-by-side diffs with > > > full context. Are they (or can they be) uploaded somewhere on > > > github/gerrit? > > > > Here's the link to the first patch of the set on our Gerrit > > <https://review.lttng.org/c/userspace-rcu/+/9737>. > > > > > Are there any remaining open questions? > > > > On the Gerrit no. But I would add this for the known issues: > > > > We have a regression test for forking. We get the following when > > running it: > > > > ==23432==ThreadSanitizer: starting new threads after multi-threaded fork is > > not supported. Dying (set die_after_fork=0 to override) > > > With TSAN_OPTIONS=die_after_fork=0, here are the results for GCC and > > Clang. > > > > * gcc 11.3.0 > > ==25266==ThreadSanitizer: dup thread with used id 0x7fd40dafe600 > > > > Looks like this was fixed with recent merged of TSAN in gcc 13. > > > > * clang 14.0.6 > > Hi Olivier, > > Forking under tsan is a bit tricky. But I see we now take a number of > internal mutexes around fork (but I think still not all of them), so > maybe it's not that bad. If TSAN_OPTIONS=die_after_fork=0 works > reasonably reliably for you, then export it for testing. > > Older compilers are missing a number of bug fixes. > So if you can restrict testing to newer compilers only, that's the way to go. > > > > Tests pass but are very slow. This seems to be because we're calling > > exit(3) in the childs. Changing it to _exit(2) solves the issue. The > > only thing I can think of is that exit(3) is not async-signal-safe like > > _exit(2), yet we do other none-async safe calls like malloc(3) and > > free(3) in the childs. > > By default tsan sleeps for 1 second at exit, since exit sequence is a > common source of races. Perhaps it's just these sleeps. Try > TSAN_OPTIONS=atexit_sleep_ms=0 (or maybe =50). > > > > Here are some perf records of that: > > > > With exit(3) > > ``` > > 12.35% test_urcu_fork. [unknown] [k] 0xffffffff89ebd22b > > 8.69% test_urcu_fork. [unknown] [k] 0xffffffff89ebd8fa > > 7.50% test_urcu_fork. [unknown] [k] 0xffffffff8a001360 > > 6.11% test_urcu_fork. test_urcu_fork.tap [.] > > __sanitizer::internal_memset > > 5.85% test_urcu_fork. test_urcu_fork.tap [.] > > __tsan::ForkChildAfter > > 1.96% test_urcu_fork. [unknown] [k] 0xffffffff892f8efd > > 1.88% test_urcu_fork. [unknown] [k] 0xffffffff89ec8e2c > > 1.29% test_urcu_fork. [unknown] [k] 0xffffffff890a99d1 > > 0.79% test_urcu_fork. [unknown] [k] 0xffffffff8918c566 > > 0.75% test_urcu_fork. test_urcu_fork.tap [.] > > __sanitizer::ThreadRegistry::OnFork > > 0.71% test_urcu_fork. [unknown] [k] 0xffffffff89349574 > > 0.64% test_urcu_fork. [unknown] [k] 0xffffffff89ee424f > > 0.57% test_urcu_fork. test_urcu_fork.tap [.] > > __tsan::MetaMap::AllocBlock > > 0.55% test_urcu_fork. [unknown] [k] 0xffffffff89367249 > > 0.53% test_urcu_fork. test_urcu_fork.tap [.] __tsan_read8 > > 0.51% test_urcu_fork. [unknown] [k] 0xffffffff89ec8e05 > > ``` > > > > With _exit(2) > > ``` > > 12.26% test_urcu_fork. [unknown] [k] 0xffffffff89ebd8fa > > 9.51% test_urcu_fork. [unknown] [k] 0xffffffff89ebd22b > > 6.78% test_urcu_fork. test_urcu_fork.tap [.] > > __sanitizer::internal_memset > > 6.49% test_urcu_fork. [unknown] [k] 0xffffffff8a001360 > > 4.92% test_urcu_fork. test_urcu_fork.tap [.] __tsan::ForkChildAfter > > 2.92% test_urcu_fork. [unknown] [k] 0xffffffff892f8efd > > 2.19% test_urcu_fork. [unknown] [k] 0xffffffff89ec8e2c > > 1.88% test_urcu_fork. [unknown] [k] 0xffffffff890a99d1 > > 0.83% test_urcu_fork. test_urcu_fork.tap [.] > > __tsan::MetaMap::AllocBlock > > 0.72% test_urcu_fork. [unknown] [k] 0xffffffff892f8f1c > > 0.67% test_urcu_fork. [unknown] [k] 0xffffffff8918c566 > > 0.65% test_urcu_fork. [unknown] [k] 0xffffffff89ec16dd > > 0.62% test_urcu_fork. test_urcu_fork.tap [.] > > __sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocator32<__sanitizer::AP32>, > > __sanitizer::LargeMmapAllocatorPtrArrayStatic>::Allocate > > 0.61% test_urcu_fork. test_urcu_fork.tap [.] __tsan_write4 > > 0.60% test_urcu_fork. [unknown] [k] 0xffffffff89367249 > > 0.59% test_urcu_fork. [unknown] [k] 0xffffffff89ee424f > > 0.50% test_urcu_fork. test_urcu_fork.tap [.] > > __sanitizer::ThreadRegistry::OnFork > > 0.50% test_urcu_fork. test_urcu_fork.tap [.] __tsan_write8 > > 0.47% test_urcu_fork. test_urcu_fork.tap [.] __tsan::ForkBefore > > 0.45% test_urcu_fork. test_urcu_fork.tap [.] __tsan_func_entry > > ``` > > > > > Is there CI coverage with -fsanitize=thread? > > > > Since the urcu-signal flavor deadlocks with TSAN (see reproducer in > > commit log), the CI simply timeouts. I do however have a job with TSAN > > as an matrix axis here <https://ci.lttng.org/job/dev_odion_liburcu/>. > > Sounds good. _______________________________________________ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev