Hi Roger, thanks for the compliment.

Yes, there is quite some overlap with the new "testing/synctest" package. 
I think the tests you can write with synctest can also be written with 
Gosim, as Gosim's scheduler does what synctest does: when all goroutines 
are blocked, synctest and Gosim both advance an internal clock, so tests 
that would take a long wall-clock time can run fast in both.
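
As a rough illustration of the clock-advancing idea, here is a minimal sketch of a virtual clock. It is not how synctest or Gosim are implemented; `virtualClock`, `After`, `RunAll`, and `simulateRetries` are hypothetical names made up for this example. Pending wake-ups sit in a list, and whenever nothing else can run, the clock jumps straight to the earliest deadline instead of sleeping.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// timer is a pending wake-up in simulated time.
type timer struct {
	at time.Duration // deadline, measured from the start of the simulation
	fn func()        // work to run once the deadline is reached
}

// virtualClock is a hypothetical fake clock; it never sleeps for real.
type virtualClock struct {
	now    time.Duration
	timers []timer
}

// After schedules fn to run d after the current simulated time.
func (c *virtualClock) After(d time.Duration, fn func()) {
	c.timers = append(c.timers, timer{at: c.now + d, fn: fn})
}

// RunAll repeatedly jumps the clock to the earliest pending deadline and
// runs that timer until no timers remain. It returns the final simulated time.
func (c *virtualClock) RunAll() time.Duration {
	for len(c.timers) > 0 {
		sort.SliceStable(c.timers, func(i, j int) bool {
			return c.timers[i].at < c.timers[j].at
		})
		next := c.timers[0]
		c.timers = c.timers[1:]
		c.now = next.at // nothing else is runnable, so skip ahead
		next.fn()
	}
	return c.now
}

// simulateRetries models three attempts spaced one simulated hour apart.
func simulateRetries() (attempts int, elapsed time.Duration) {
	c := &virtualClock{}
	var retry func()
	retry = func() {
		attempts++
		if attempts < 3 {
			c.After(time.Hour, retry)
		}
	}
	c.After(time.Hour, retry)
	elapsed = c.RunAll()
	return attempts, elapsed
}

func main() {
	start := time.Now()
	attempts, elapsed := simulateRetries()
	fmt.Println("attempts:", attempts)
	fmt.Println("simulated time:", elapsed)
	fmt.Println("finished in under a second of real time:", time.Since(start) < time.Second)
}
```

The hard part that this sketch skips is detecting that "nothing else can run" when real goroutines are involved, which is what synctest and Gosim's scheduler take care of.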

I think synctest is an interesting point in the design space. In Go tests 
you can use interfaces to mock the OS, the network, etc., but time and 
scheduling are impossible to mock because you don't know when goroutines are 
blocked. Synctest fixes that, and once you have synctest, you can test 
almost all the same scenarios as in Gosim _if_ you mock all interactions 
with the OS and avoid using any shared global state.
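
For readers less familiar with the interface-mocking style mentioned above, here is a small sketch. The `FileStore` interface, `osStore`, `memStore`, and `loadOrInit` are hypothetical names for this example and are not part of synctest or Gosim. The code under test only talks to the interface, so a test can swap the real file system for an in-memory fake.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// FileStore is a hypothetical interface that the code under test uses
// instead of calling the os package directly.
type FileStore interface {
	ReadFile(name string) ([]byte, error)
	WriteFile(name string, data []byte) error
}

// osStore is the production implementation backed by the real file system.
type osStore struct{}

func (osStore) ReadFile(name string) ([]byte, error) { return os.ReadFile(name) }
func (osStore) WriteFile(name string, data []byte) error {
	return os.WriteFile(name, data, 0o644)
}

// memStore is an in-memory fake for tests.
type memStore struct{ files map[string][]byte }

func newMemStore() *memStore { return &memStore{files: map[string][]byte{}} }

func (m *memStore) ReadFile(name string) ([]byte, error) {
	data, ok := m.files[name]
	if !ok {
		// Mirror the error shape of os.ReadFile so errors.Is keeps working.
		return nil, &fs.PathError{Op: "open", Path: name, Err: fs.ErrNotExist}
	}
	return append([]byte(nil), data...), nil
}

func (m *memStore) WriteFile(name string, data []byte) error {
	m.files[name] = append([]byte(nil), data...)
	return nil
}

// Compile-time checks that both types satisfy the interface.
var (
	_ FileStore = osStore{}
	_ FileStore = (*memStore)(nil)
)

// loadOrInit returns the stored value, writing "0" on first use.
func loadOrInit(s FileStore, name string) (string, error) {
	data, err := s.ReadFile(name)
	if errors.Is(err, fs.ErrNotExist) {
		data = []byte("0")
		err = s.WriteFile(name, data)
	}
	return string(data), err
}

func main() {
	s := newMemStore()
	v, err := loadOrInit(s, "counter")
	fmt.Println(v, err)
}
```

The cost is visible even in this toy: every dependency that touches the OS has to accept the interface, which is the "none of your code (or your dependencies) can use standard OS calls" constraint.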

The trade-off is where the complexity lives: with mocks and synctest you do 
not need significant changes to the runtime, but none of your code (or your 
dependencies) can use standard OS calls. With Gosim, the program under test 
does not need to change, but you rely on a more complicated mocking and 
rewriting mechanism. In practice, this means Gosim can test programs using 
Bolt (https://pkg.go.dev/go.etcd.io/bbolt, 
https://github.com/jellevandenhooff/gosim/blob/main/examples/bolt/bolt_test.go) 
and test how Bolt behaves when a machine restarts without having to change 
any of the code in Bolt.

You could perhaps reuse the underlying mocks (for a network that drops 
packets, etc.) between Gosim and synctest. However, Gosim currently 
integrates at the syscall layer, so the interface exposed is quite 
different than the high-level mocks you would need to replace os.File, 
net.Conn, etc. In an earlier version of Gosim I tried mocking those 
higher-level interfaces, but I found it quite difficult: the API surface is 
broad and not nearly as well-defined as POSIX. Simulating that API 
accurately is important for testing error handlers that match error types 
returned by a net.Conn.
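
To show why the error shapes matter, here is a small sketch using only the standard library. The handler `classify` and the helpers `faithfulTimeout`, `faithfulClosed`, and `sloppyTimeout` are hypothetical; the faithful variants only mimic the kind of wrapped error a real net.Conn returns. A handler that matches on error types behaves differently when a hand-written mock returns a plain error with the right message but the wrong type.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// classify is a hypothetical error handler of the kind found in network
// code: it decides what to do based on the type of the error, not its text.
func classify(err error) string {
	var netErr net.Error
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, net.ErrClosed):
		return "stop"
	case errors.As(err, &netErr) && netErr.Timeout():
		return "retry"
	default:
		return "fail"
	}
}

// faithfulTimeout mimics an expired read deadline: a *net.OpError
// wrapping os.ErrDeadlineExceeded.
func faithfulTimeout() error {
	return &net.OpError{Op: "read", Net: "tcp", Err: os.ErrDeadlineExceeded}
}

// faithfulClosed mimics a read on a closed connection.
func faithfulClosed() error {
	return &net.OpError{Op: "read", Net: "tcp", Err: net.ErrClosed}
}

// sloppyTimeout is what a quick hand-written mock might return: the
// message looks right, but the type carries no timeout information.
func sloppyTimeout() error {
	return errors.New("i/o timeout")
}

func main() {
	fmt.Println("faithful timeout:", classify(faithfulTimeout()))
	fmt.Println("faithful closed: ", classify(faithfulClosed()))
	fmt.Println("sloppy timeout:  ", classify(sloppyTimeout()))
}
```

With the sloppy mock the retry path is never exercised, so the test would pass or fail for reasons unrelated to how the code behaves against a real connection.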

Gosim also adds determinism (running the same test twice results in the 
same output), which is helpful if you are trying to debug rare failures. You 
can imagine future Antithesis-like tricks to test behavior: run with the same 
seed up to an interesting simulated time, and then change the seed. I think 
adding that to synctest would be quite difficult.
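
The seed trick can be sketched in a few lines. This is only a toy model and not Gosim's scheduler: `interleave` and `equalInts` are hypothetical names, and the "scheduler" just picks the next task from a seeded random source. Because every decision comes from that source, the same seed reproduces the same interleaving, and re-seeding at a chosen point keeps the history before it while exploring a different future.

```go
package main

import (
	"fmt"
	"math/rand"
)

// interleave simulates `tasks` tasks that each run `steps` steps and returns
// the order in which a toy scheduler picked them. All choices come from a
// seeded random source. If forkAt >= 0, the source is re-seeded with
// forkSeed just before decision number forkAt.
func interleave(seed int64, forkAt int, forkSeed int64, tasks, steps int) []int {
	rng := rand.New(rand.NewSource(seed))
	remaining := make([]int, tasks)
	for i := range remaining {
		remaining[i] = steps
	}
	var trace []int
	for {
		var runnable []int
		for id, left := range remaining {
			if left > 0 {
				runnable = append(runnable, id)
			}
		}
		if len(runnable) == 0 {
			return trace
		}
		if len(trace) == forkAt {
			rng = rand.New(rand.NewSource(forkSeed))
		}
		id := runnable[rng.Intn(len(runnable))]
		remaining[id]--
		trace = append(trace, id)
	}
}

// equalInts reports whether two traces are identical.
func equalInts(a, b []int) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	a := interleave(1, -1, 0, 3, 4)
	b := interleave(1, -1, 0, 3, 4)
	c := interleave(1, 6, 99, 3, 4)
	fmt.Println("same seed, same trace:", equalInts(a, b))
	fmt.Println("shared prefix before the fork:", equalInts(a[:6], c[:6]))
	fmt.Println(a)
	fmt.Println(c)
}
```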

This blog post 
(https://www.polarsignals.com/blog/posts/2024/05/28/mostly-dst-in-go) 
describes yet another approach: building Go with the faketime build tag 
(used on the Go Playground) and running it inside of Wasm. That gives 
deterministic execution and standard OS calls by interposing at the 
Wasm-syscall boundary, which means the program needs to build under Wasm.

Jelle
On Tuesday, 10 December 2024 at 14:53:22 UTC-8, roger peppe wrote:

Impressive stuff! Some potentially interesting overlap with the new 
"synctest" package. Do you have any thoughts on that?


On Tue, 10 Dec 2024 at 17:41, Jelle van den Hooff <je...@vandenhooff.name> 
wrote:

Hi golang-nuts,

I am excited to share Gosim: simulation testing for Go (
https://github.com/jellevandenhooff/gosim). Gosim is a project I have been 
working on for quite a while that aims to make testing distributed systems 
easier. It implements simulation testing as popularized by FoundationDB (
https://www.youtube.com/watch?v=4fFDFbi3toc).

Gosim runs mostly-standard Go code in its simulated environment. It 
supports standard packages like `os`, `net`, gRPC, protobuf, and more; the 
largest real-world program I have successfully run is etcd. Inside of the 
simulation, Gosim implements fake time, network, disks, and machines. Tests 
can manipulate the network to e.g. partition a host, or restart a machine, 
and verify that code still behaves as it should -- and all that without 
needing to manage real VMs or containers.

Gosim works by source-translating Go to replace all references to 
concurrency primitives, the operating system, and non-deterministic code 
with calls into its own runtime. So `go foo()` becomes 
`gosimruntime.Go(foo)`, etc. Then, Gosim implements (a subset of) the Linux 
system call interface to simulate disk and network. More details on the 
design are in 
https://github.com/jellevandenhooff/gosim/blob/main/docs/design.md. Gosim's 
system call implementations are (currently) in 
https://github.com/jellevandenhooff/gosim/blob/main/internal/simulation/os_linux.go.
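
To give a feel for what such a source translation involves, here is a toy sketch built on the standard go/ast packages. It is not Gosim's translator: `rewriteGo` is a hypothetical function, it only handles `go f()` statements with no arguments that sit directly in a block, and it does not add the import for `gosimruntime`, which a real tool would need to do.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

// rewriteGo parses src and replaces each argument-less `go f()` statement
// found directly in a block with the call `gosimruntime.Go(f)`. It returns
// the formatted source and the number of statements it rewrote. Calls with
// arguments are left alone, because wrapping them in a closure would change
// when the arguments are evaluated.
func rewriteGo(src string) (string, int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return "", 0, err
	}
	rewritten := 0
	ast.Inspect(file, func(n ast.Node) bool {
		block, ok := n.(*ast.BlockStmt)
		if !ok {
			return true
		}
		for i, stmt := range block.List {
			g, ok := stmt.(*ast.GoStmt)
			if !ok || len(g.Call.Args) != 0 {
				continue
			}
			// Reuse the original positions so the printer keeps the
			// new call on the line of the old statement.
			block.List[i] = &ast.ExprStmt{X: &ast.CallExpr{
				Fun: &ast.SelectorExpr{
					X:   &ast.Ident{NamePos: g.Pos(), Name: "gosimruntime"},
					Sel: &ast.Ident{NamePos: g.Pos(), Name: "Go"},
				},
				Lparen: g.Call.Lparen,
				Args:   []ast.Expr{g.Call.Fun},
				Rparen: g.Call.Rparen,
			}}
			rewritten++
		}
		return true
	})
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", 0, err
	}
	return buf.String(), rewritten, nil
}

func main() {
	src := "package p\n\nfunc worker() {}\n\nfunc start() {\n\tgo worker()\n}\n"
	out, n, err := rewriteGo(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("rewrote %d statement(s):\n%s", n, out)
}
```

Covering channels, select, the OS, and other sources of non-determinism takes far more than this, which is the "more complicated mocking and rewriting mechanism" side of the trade-off.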

To give you a taste of the kinds of tests you can write with Gosim, below is 
a snippet of a test running etcd (taken from 
https://github.com/jellevandenhooff/gosim/blob/main/examples/etcd/etcd_test.go).

The test creates several Gosim machines that have their own network stack, 
disk, global variables, and more, and lets them run and communicate. From 
the point of view of the code, each etcd instance runs on its own machine 
and is its own independent process. The simulation, however, runs all 
machines in the same Go process so that you can easily debug what happens, 
the test is reproducible, and overhead is low.

I have tried to make Gosim easy to use. To get started you can run a test 
by replacing `go test ...` with `gosim test`. If Gosim might be useful for 
you, I would be happy to chat and prioritize future features. Some things I 
would certainly like to add are support for running main() functions; 
simulating clock drift; support for running different versions of code; and 
built-in simulation of common cloud APIs like S3.

Gosim is experimental, so it will change and break, and it only runs Go 
code: it can test systems that are written in Go, but it will not work with 
external dependencies. I have some ideas on using e.g. Wazero to run SQLite 
or Postgres inside of the Go process, but those are, well, still ideas.

Jelle

// TestEtcd runs a 3 node etcd cluster, partitions the network between the
// nodes, and makes sure key-value puts and gets work.
func TestEtcd(t *testing.T) {
	gosim.SetSimulationTimeout(2 * time.Minute)

	// run machines:
	gosim.NewMachine(gosim.MachineConfig{
		Label: "etcd-1",
		Addr:  netip.MustParseAddr("10.0.0.1"),
		MainFunc: func() {
			runEtcdNode("etcd-1", "10.0.0.1")
		},
	})
	gosim.NewMachine(gosim.MachineConfig{
		Label: "etcd-2",
		Addr:  netip.MustParseAddr("10.0.0.2"),
		MainFunc: func() {
			time.Sleep(100 * time.Millisecond)
			runEtcdNode("etcd-2", "10.0.0.2")
		},
	})
	gosim.NewMachine(gosim.MachineConfig{
		Label: "etcd-3",
		Addr:  netip.MustParseAddr("10.0.0.3"),
		MainFunc: func() {
			time.Sleep(200 * time.Millisecond)
			runEtcdNode("etcd-3", "10.0.0.3")
		},
	})

	// mess with the network in the background
	go nemesis.Sequence(
		nemesis.Sleep{
			Duration: 10 * time.Second,
		},
		nemesis.PartitionMachines{
			Addresses: []string{
				"10.0.0.1",
				"10.0.0.2",
				"10.0.0.3",
			},
			Duration: 30 * time.Second,
		},
	).Run()

	// ...

 

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an 
email to golang-nuts...@googlegroups.com.
To view this discussion visit 
https://groups.google.com/d/msgid/golang-nuts/CAP%3DJquaBu1O5rN6aR6fMs03q4O92cPAc9DfGQZ9fck9zB2sEkw%40mail.gmail.com
 
<https://groups.google.com/d/msgid/golang-nuts/CAP%3DJquaBu1O5rN6aR6fMs03q4O92cPAc9DfGQZ9fck9zB2sEkw%40mail.gmail.com?utm_medium=email&utm_source=footer>
.

