Intent to Deploy: ThreadSanitizer

2020-02-06 Thread Christian Holler
Data races in C/C++ code are a class of bugs that can severely impact
stability of the product while being hard to reproduce and debug.
Furthermore, data races are undefined behavior and can lead to
unforeseeable code behavior once compilers exploit this fact for better
optimizations. We have evidence that data races can cause intermittent
crashes and use-after-free memory safety violations that are hard to detect
by the existing sanitizers (e.g. AddressSanitizer) due to their
intermittent behavior.

ThreadSanitizer (TSan) is another sanitizer, specifically aimed at detecting
data races and related problems (e.g. mutex ordering issues, potential
deadlock situations, etc.).
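
To make the kind of bug concrete, here is a minimal sketch of a data race and one possible fix. The function names (CountRacy, CountLocked) are hypothetical and not taken from our codebase; the example only assumes a C++11 compiler with thread support.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical example, for illustration only.

// RACY: every thread performs an unsynchronized read-modify-write on the
// same int. This is a data race and therefore undefined behavior; a TSan
// build would be expected to report it.
int CountRacy(int numThreads, int perThread) {
  int counter = 0;
  std::vector<std::thread> workers;
  for (int t = 0; t < numThreads; ++t) {
    workers.emplace_back([&counter, perThread] {
      for (int i = 0; i < perThread; ++i) {
        ++counter;  // data race
      }
    });
  }
  for (auto& w : workers) {
    w.join();
  }
  return counter;
}

// FIXED: the same work, with every access to the counter guarded by a mutex.
int CountLocked(int numThreads, int perThread) {
  int counter = 0;
  std::mutex lock;
  std::vector<std::thread> workers;
  for (int t = 0; t < numThreads; ++t) {
    workers.emplace_back([&counter, &lock, perThread] {
      for (int i = 0; i < perThread; ++i) {
        std::lock_guard<std::mutex> guard(lock);
        ++counter;
      }
    });
  }
  for (auto& w : workers) {
    w.join();
  }
  return counter;  // all workers are joined, so this read is ordered after them
}
```

With Clang or GCC, a standalone file like this can be built with -fsanitize=thread -g to try TSan on it; calling CountRacy in such a build is what should produce a report, while CountLocked should stay quiet.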

One of the problems with deploying ThreadSanitizer in CI is that we have a
fair number of existing data races that turn pretty much every test we have
orange. To resolve this, we are currently working on the following
strategy:


   1. Add a Linux TSan build as Tier1 to avoid build regressions (done in
      bug 1590162).
   2. Run a set of tests and generate a runtime suppression list for all of
      the existing issues.
   3. File the existing issues so we can track them (tracking bug is bug
      929478).
   4. Enable now-green tests to avoid further regressions (tracked in bug
      1612711).


Because part of this process is to file existing race reports, you might
already have seen related bug reports in your component. There is no need to
react to these reports immediately, but we would of course very much
appreciate it if they could eventually be triaged and fixed (many of you
have done so already, thank you!). Keep in mind that some of these reports
might point to potential sources of instability and other intermittent
misbehavior, so there might be potential to eliminate some nasty bugs. In
fact, we have already identified several major issues in our codebase just
from running tests. If you identify such a case, we would also ask you to
indicate this in the bug, as we track such bugs separately to assess the
value of the tool.

It is also likely that you will see benign race reports (or at least
reports that look benign). Unfortunately, it is incredibly hard to tell if
a race is really benign or not [1][2][3][4], so if an issue is easy to fix,
we suggest just fixing it rather than spending too much time on the
analysis. There might be cases where fixing a confirmed-benign race is not
worth the investment. In this case, we can add a permanent suppression.
Since every suppression costs some performance, we should use them
sparingly, though.
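
As an illustration of an "easy to fix" case, consider a stop flag that one thread sets and another polls. The sketch below is hypothetical (the Worker class and RunWorkerOnce helper are made up for this example) and shows the fixed form, where the flag is a std::atomic<bool> accessed with relaxed ordering.

```cpp
#include <atomic>
#include <thread>

// Hypothetical worker, for illustration only.
class Worker {
 public:
  void Start() {
    mThread = std::thread([this] { Run(); });
  }

  // Ask the loop to exit, then wait for the thread to finish.
  void Stop() {
    mStop.store(true, std::memory_order_relaxed);
    mThread.join();
  }

  // Only valid after Stop(): join() orders the worker's writes before
  // this read.
  long Iterations() const { return mIterations; }

 private:
  void Run() {
    // With a plain `bool mStop`, this would be a data race, and the
    // compiler would be allowed to hoist the load out of the loop.
    while (!mStop.load(std::memory_order_relaxed)) {
      ++mIterations;
      std::this_thread::yield();
    }
  }

  std::atomic<bool> mStop{false};
  long mIterations = 0;  // touched only by the worker thread until join()
  std::thread mThread;
};

// Starts a worker, stops it right away, and returns its iteration count.
long RunWorkerOnce() {
  Worker worker;
  worker.Start();
  worker.Stop();
  return worker.Iterations();
}
```

A change like this is usually small and avoids both the analysis of whether the race is truly benign and the need for a suppression.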

Overall we hope that this tool will make it easier for all of us to produce
more stable and secure code, debug existing issues more effectively and
maybe even move the needle when it comes to inexplicable crashes in
crash-stats.

[1] https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
[2] https://blog.mozilla.org/nfroyd/2015/02/20/finding-races-in-firefox-with-threadsanitizer/
[3] https://blog.mozilla.org/nnethercote/2015/02/24/fix-your-damned-data-races/
[4] https://www.usenix.org/legacy/events/hotpar11/tech/final_files/Boehm.pdf
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Eliminating data races (ThreadSanitizer Project Update)

2020-11-19 Thread Christian Holler
tl;dr: TSan has proven useful for finding even critical problems in our
codebase, and we are extending the project by tackling existing issues,
adding more test suites and working on better Rust support. This may mean
that you see more TSan-related bugs in your component. Please prioritize
triaging and helping to fix them accordingly.

In February 2020 we communicated our plans to deploy ThreadSanitizer (TSan)
for Firefox to eliminate data races in our codebase. Since then, we have
successfully deployed a Tier1 build and a basic set of Tier1 tests
(xpcshell-tests and mochitest-plain). In the process, we have found even
more evidence that data races have an impact on the security and stability
of the browser, cause intermittent failures in CI and even cause
intermittent correctness and performance issues (see e.g. a list of critical
bugs that have been opened up already). Therefore, we plan to continue and
extend the project by the following means:


   - Decrease the number of active data race suppressions - Suppressions are
     an important mechanism in TSan to temporarily ignore a data race until
     we can fix it (or even permanently if it is deemed unfixable). However,
     some suppressions have already caused us to accidentally hide more
     (unrelated) data races. For this reason, and to make sure the bugs are
     actually addressed at some point, we will look into each bug and either
     provide a fix ourselves or request your help in doing so. Open TSan
     bugs are currently tracked in bug 929478.

   - Run more tests - Developers have already started to run more test
     suites on their own, thereby uncovering several bugs, including
     critical ones. We should make sure to run all applicable tests so that
     these bugs do not go undetected and to avoid regressions in these
     areas.

   - Fully support Rust - While (safe) Rust code does not suffer from data
     races, we do have a lot of cases where Rust and C++ code interact with
     each other and run in parallel. Due to how TSan works, we need to build
     the full Rust standard library with TSan for it to function properly
     and not produce false positives with Rust code (tracked in bug
     1671691).


Here’s how you can help: If you see bugs related to data races being filed
in your component, it would greatly help us if you could triage the issue,
let us know about the potential impact and advise us on how the issue can
be fixed or (in more complex cases) devise a fix yourself.

We greatly appreciate your support in making Firefox more stable and secure,
and we expect the project to save developers time when debugging complex
issues caused by data races.

If you have any questions, concerns or suggestions, please reach out to me.


- Chris (:decoder)