Thanks for your note of appreciation, cpasmaboiteaspam.

On Friday, June 6, 2025 at 7:05:35 AM UTC+1 cpasmaboiteaspam wrote:
> Hello Jason,
>
> Intuitively, to play with timing of goroutines,
> *to manually inject delays where none exist,*
> is what newcomers do,
> to experiment with deadlocks,
> or simply build better understanding,
> at least *I* did that very often.
> Therefore your suggestion definitely makes sense,
> to me, a *non* low level programmer perspective.
>
> (*not a useful email*, I only want to show some support
> to your various posts which I read with lots of interest,
> others too... but physically speaking
> it is hard to follow everybody,
> I liked the last post from Robert Griesemer
> I like all the posts of the Go blog anyway......
> and the coroutines.... *it's endless*... =)
>
> Thank you,
> Thank you all.
>
> On Thursday, June 5, 2025 at 10:28:07 PM UTC+2, Jason E. Aten wrote:
>
>> Hmm. Maybe rr found a bug in runtime GC code(?)... or
>> maybe it will be hard to use rr on Go. Let's see what the runtime folks say
>> on this issue:
>>
>> https://github.com/golang/go/issues/74019
>>
>> On Wednesday, June 4, 2025 at 12:18:45 PM UTC+1 Jason E. Aten wrote:
>>
>>> This is a fascinating approach to finding hard to
>>> reproduce event-interleaving related bugs.
>>>
>>> I'm particularly interested in this approach
>>> because rr record and replay plus chaos
>>> mode is directly applicable to
>>> Go programs -- whereas deterministic simulation
>>> testing (DST) is next to impossible in a Go program
>>> using more than 4GB of memory (like most of my
>>> programs) because this rules out wasm.
>>>
>>> In contrast to DST, the rr+chaos approach
>>> accepts you will be randomly
>>> sampling executions, but by recording all of them you
>>> can still get reproducibility when you do hit the issue.
>>>
>>> rr is very efficient at recording. Green test runs can be quickly
>>> discarded.
>>>
>>> In a blog from 2016, Robert O'Callahan, one of the principal rr authors,
>>> talks about the design of rr's chaos mode for provoking hard to find
>>> concurrency bugs:
>>>
>>> > To cut a long story short, here's an approach that works.
>>> > Use just two thread priorities, "high" and "low". Make
>>> > most threads high-priority; I give each thread a 0.1
>>> > probability of being low priority. Periodically re-randomize
>>> > thread priorities. Randomize timeslice lengths.
>>> >
>>> > Here's the good part: periodically choose a short random interval,
>>> > up to a few seconds long, and during that interval do not
>>> > allow low-priority threads to run at all, even if they're
>>> > the only runnable threads. Since these intervals can
>>> > prevent all forward progress (no control of priority inversion),
>>> > limit their length to no more than 20% of total run time.
>>> >
>>> > The intuition is that many of our intermittent test failures
>>> > depend on CPU starvation (e.g. a machine temporarily
>>> > hanging), so we're emulating intense starvation of a few
>>> > "victim" threads, and allowing high-priority threads to
>>> > wait for timeouts or input from the environment
>>> > without interruption.
>>> >
>>> > With this approach, rr can reproduce my bug
>>> > <https://bugzilla.mozilla.org/show_bug.cgi?id=1213938> in
>>> > several runs out of a thousand. I've also been able
>>> > to reproduce a top intermittent
>>> > <https://bugzilla.mozilla.org/show_bug.cgi?id=1237176> (now being fixed),
>>> > an intermittent test failure
>>> > <https://bugzilla.mozilla.org/show_bug.cgi?id=1203417> that was assigned to me,
>>> > and an intermittent shutdown hang in IndexedDB
>>> > <https://bugzilla.mozilla.org/show_bug.cgi?id=1150737#c197>
>>> > we've been chasing for a while. A couple of other
>>> > people have found this enabled reproducing their
>>> > bugs. I'm sure there are still bugs this approach
>>> > can't reproduce, but it's good progress.
>>> >
>>> > I just landed all this work on rr master. The
>>> > normal scheduler doesn't do this randomization,
>>> > because it reduces throughput, i.e. slows down
>>> > recording for easy-to-reproduce bugs.
>>> > Run rr record -h to enable chaos mode for
>>> > hard-to-reproduce bugs.
>>> -- https://robert.ocallahan.org/2016/02/introducing-rr-chaos-mode.html
>>>
>>> Links to more info and background on rr:
>>>
>>> https://rr-project.org/
>>> https://github.com/rr-debugger/rr
>>> https://github.com/rr-debugger/rr/wiki/Usage
>>> https://github.com/rr-debugger/rr/wiki/Testimonials
>>> https://github.com/rr-debugger/rr/wiki/Building-And-Installing
>>>
>>> https://arxiv.org/pdf/1705.05937
>>>
>>> https://fitzgen.com/2015/11/02/back-to-the-futurre.html
>>>
>>> https://www.percona.com/blog/replay-the-execution-of-mysql-with-rr-record-and-replay/
>>> https://www.youtube.com/watch?v=61kD3x4Pu8I
>>>
>>> Robert's talk, "Taming Non-determinism" from 9 years ago is
>>> a good technical introduction to rr.
>>> https://www.youtube.com/watch?v=H4iNuufAe_8
>>>
>>> NB The Delve debugger for Go supports rr, so you can get goroutine stack
>>> traces.
>>>
>>> Enjoy.
>>>
>>> - Jason