It sounds like a paradox.

Did adding another goroutine really make my testing/synctest-based
network simulation fully deterministic, suitable for DST?

Yep. 

On the fifth rewrite, I finally discovered the fundamental
way to leverage the testing/synctest package and get 
a fully deterministic network simulation.

The trick, now implemented and available in the latest
release of my network package and its simulation engine

https://github.com/glycerine/rpc25519 and
https://github.com/glycerine/gosimnet

is to use one additional goroutine to accept and queue all channel
operations "in the background".

Don't try to interleave synctest.Wait with select 
and channel operations on the same goroutine. 

It's too much of a mess. More importantly, it didn't work.

It was incredibly hard to get determinism out of it. I tried four 
different ways that did not work. They would look like they were
going to work, but then under load testing I would get straggling
requests that missed their previous batch. This created
non-determinism, that is, a non-reproducible simulation.

That's not good. We want the determinism of DST so that any bug 
we find in our distributed system is instantly reproducible. 
If DST is a new idea to you, this talk is a great motivating introduction [1].

Instead of mixing client requests over channels directly with the
sleep/synctest.Wait logic, buffer all client goroutine channel
requests into a master event queue (MEQ), filled by a separate
goroutine that runs completely independently of the main scheduler
goroutine (the one that will sleep and call synctest.Wait).

Let that background accumulator goroutine be the one 
with your big for/select loop to service client requests. 

Those requests that used to go directly to 
the scheduler goroutine now all get queued, and then handled in
one batch once the scheduling time quantum ends.
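
A minimal sketch of such an accumulator, to make the shape concrete.
This is not the actual rpc25519/gosimnet code: the names event, meq,
accumulate and drain are made up for illustration. The demo also runs
outside a synctest bubble, so it waits for the accumulator to exit
before draining; inside a bubble, the scheduler would call
synctest.Wait before draining instead.

```go
package main

import (
	"fmt"
	"sync"
)

// event is a hypothetical client request; the fields are illustrative only.
type event struct {
	client string
	seq    int
}

// meq is a hypothetical master event queue, guarded by a mutex so the
// accumulator and the scheduler never touch the slice at the same time.
type meq struct {
	mu     sync.Mutex
	events []event
}

// accumulate is the background goroutine. It owns the for/select loop
// and does nothing but append incoming requests to the queue.
func (q *meq) accumulate(requests <-chan event, halt <-chan struct{}, done chan<- struct{}) {
	defer close(done)
	for {
		select {
		case ev := <-requests:
			q.mu.Lock()
			q.events = append(q.events, ev)
			q.mu.Unlock()
		case <-halt:
			return
		}
	}
}

// drain is what the scheduler calls once its quantum has ended: it takes
// the whole accumulated batch and leaves an empty queue behind.
func (q *meq) drain() []event {
	q.mu.Lock()
	defer q.mu.Unlock()
	batch := q.events
	q.events = nil
	return batch
}

func main() {
	q := &meq{}
	requests := make(chan event)
	halt := make(chan struct{})
	done := make(chan struct{})
	go q.accumulate(requests, halt, done)

	requests <- event{client: "a", seq: 1}
	requests <- event{client: "b", seq: 1}
	requests <- event{client: "a", seq: 2}

	// Stand-in for the barrier: stop the accumulator and wait for it,
	// so every received event is known to be in the queue.
	close(halt)
	<-done

	fmt.Println("events in batch:", len(q.drain()))
}
```

The point of the split is that the accumulator never sleeps and never
calls the barrier, and the scheduler never selects on client channels.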

The scheduler simply sleeps for its time quantum, invokes the barrier
synctest.Wait(), and then locks and reads out the accumulated events
from the MEQ. It then unlocks the MEQ so the background goroutine has
access again when the scheduler restarts the clock (with its next
sleep).

The scheduler sorts the accumulated batch of events using deterministic
sorting criteria, dispatches them (matching sends and reads and firing
timers in the network), and then deterministically orders any newly
available replies.
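
To illustrate what "deterministic sorting criteria" can mean, here is
a small sketch. The sort key (due time, then origin name, then a
per-origin serial number) is an assumption chosen for the example, not
necessarily what gosimnet uses, and pending and sortBatch are
hypothetical names. What matters is that the key is a total order, so
the result does not depend on the order in which goroutines happened
to enqueue their operations.

```go
package main

import (
	"fmt"
	"sort"
)

// pending is a hypothetical queued operation. The fields are assumptions
// picked to give every operation a unique, comparable key.
type pending struct {
	due    int64  // simulated time at which the operation is due
	origin string // name of the issuing node
	serial int64  // per-origin sequence number
}

// sortBatch orders a batch by (due, origin, serial). Because no two
// operations share the full key, the outcome is independent of the
// arrival order of the batch.
func sortBatch(batch []pending) {
	sort.Slice(batch, func(i, j int) bool {
		a, b := batch[i], batch[j]
		if a.due != b.due {
			return a.due < b.due
		}
		if a.origin != b.origin {
			return a.origin < b.origin
		}
		return a.serial < b.serial
	})
}

func main() {
	// The same four operations, arriving in two different orders.
	x := []pending{{20, "b", 1}, {10, "b", 1}, {10, "a", 2}, {10, "a", 1}}
	y := []pending{{10, "a", 2}, {20, "b", 1}, {10, "a", 1}, {10, "b", 1}}
	sortBatch(x)
	sortBatch(y)
	fmt.Println(x)
	fmt.Println(y)
}
```

Both lines print the operations in the same order, which is the
property the scheduler needs before it dispatches a batch.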

And voila: deterministic simulation testing (DST) of network
operations in Go.

Enjoy.

Jason

[1] https://www.youtube.com/watch?v=C1nZzQqcPZw&t=936s
"FoundationDB: from idea to Apple acquisition"
Dave Scherer, CTO of FoundationDB and Antithesis, 
really motivates why they invented DST. In short, it's
crazy difficult to test distributed systems well in any other way.
