Here is a simple test that demonstrates the dynamics: https://play.golang.org/p/6SZcxCEAfFp (it cannot be run in the playground, so the timings below are from a local run).

Notice that the run that uses an over-allocated number of goroutines takes roughly 5x longer in wall time than the properly sized one - this is due to scheduling overhead and contention on the underlying structures. So blindly creating goroutines does not achieve optimal performance for many workloads (even when the number of OS threads is capped).

rengels@rengels:~/gotest$ go run main_bench.go 
2.261805812s
1.311269725s
6.341378965s


-----Original Message-----
From: Robert Engels
Sent: Dec 30, 2019 9:43 AM
To: Jesper Louis Andersen
Cc: Brian Candler , golang-nuts
Subject: Re: [go-nuts] Simple worker pool in golang

Right, but the overhead is neither constant nor free. If you parallelize a CPU-bound task into 100 segments and you only have 10 cores, the contention on the internal locking structures (the scheduler, the locks inside channels) will be significant and the entire process will probably take far longer - I'm working on a simple test case to demonstrate this - so blindly spinning up goroutines will not be the best solution for some workloads.
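
For illustration, a minimal sketch of that kind of split, assuming a CPU-bound partial-sum over a slice (sumSegments and the sizes are made up for the example, not a measured benchmark). Asking for many more segments than runtime.NumCPU() only gives the scheduler more goroutines to juggle; it cannot add compute:

package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSegments splits data into `segments` contiguous pieces, sums each piece
// in its own goroutine, and then adds up the partial results.
func sumSegments(data []int, segments int) int {
	partial := make([]int, segments)
	per := (len(data) + segments - 1) / segments // ceiling division
	var wg sync.WaitGroup
	for i := 0; i < segments; i++ {
		lo := i * per
		if lo >= len(data) {
			break
		}
		hi := lo + per
		if hi > len(data) {
			hi = len(data)
		}
		wg.Add(1)
		go func(i, lo, hi int) {
			defer wg.Done()
			s := 0
			for _, v := range data[lo:hi] {
				s += v
			}
			partial[i] = s
		}(i, lo, hi)
	}
	wg.Wait()
	total := 0
	for _, s := range partial {
		total += s
	}
	return total
}

func main() {
	data := make([]int, 1<<23)
	for i := range data {
		data[i] = i
	}
	// Same answer either way; the second call just creates far more
	// goroutines than there are cores to run them.
	fmt.Println(sumSegments(data, runtime.NumCPU()))
	fmt.Println(sumSegments(data, 10*runtime.NumCPU()))
}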

On Dec 30, 2019, at 9:11 AM, Jesper Louis Andersen <jesper.louis.ander...@gmail.com> wrote:


On Mon, Dec 30, 2019 at 10:46 AM Brian Candler <b.cand...@pobox.com> wrote:
Which switching cost are you referring to? The switching cost between goroutines? This is minimal, as it takes place within a single thread. Or are you referring to cache invalidation issues? Or something else?


It is the usual dichotomy between concurrency and parallelism. If your problem is concurrent, then it is usually the case that the switching cost is minimal, and it is often far easier just to run thousands of goroutines on a much smaller set of cores. However, if your problem is parallel, you often have one task which is your goal, and you want it to finish as soon as possible[0]. Here, explicit control over the cores tends to be more efficient due to caches, memory bandwidth, latency hiding, etc. Processor affinity and pinning threads to processors are often important in this game. But unless you can gain a significant speedup by doing this explicitly, I wouldn't bother.
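
The standard library does not expose processor affinity directly; on Linux, one common sketch is to combine runtime.LockOSThread with SchedSetaffinity from golang.org/x/sys/unix. The pinToCPU helper below is illustrative only, and the whole approach is platform-specific:

package main

import (
	"fmt"
	"runtime"
	"sync"

	"golang.org/x/sys/unix" // Linux-only; thread affinity is not portable
)

// pinToCPU locks the calling goroutine onto its OS thread and binds that
// thread to a single CPU. pid 0 means "the calling thread".
func pinToCPU(cpu int) error {
	runtime.LockOSThread()
	var set unix.CPUSet
	set.Zero()
	set.Set(cpu)
	return unix.SchedSetaffinity(0, &set)
}

func main() {
	var wg sync.WaitGroup
	for cpu := 0; cpu < runtime.NumCPU(); cpu++ {
		wg.Add(1)
		go func(cpu int) {
			defer wg.Done()
			if err := pinToCPU(cpu); err != nil {
				fmt.Println("pin failed:", err)
				return
			}
			// ... run the CPU-bound segment for this core here ...
		}(cpu)
	}
	wg.Wait()
}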

[0] There is another, sometimes better, way to look at the problem. Rather than being able to solve a problem N times faster, you can also see it as being able to solve a problem N times larger in the same time. This has the advantage that communication is less of a problem. When the problem size gets small enough, the overhead of multiple processing cores gets too large.


--
J.
