It's not clear why, when you use "a set of per-thread caches", you "lose advantages of bump allocator". At any point in time, a thread executes a single goroutine. The points at which a goroutine gains or loses the execution context of a thread, and at which it is transferred from one thread to another, are known to the runtime. At those points a goroutine could cache (e.g. in a register) the current thread's bump allocation address and use it for very fast bump allocation while it runs.
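To make the idea concrete, here is a rough sketch of what I mean. It is only a user-level model, not how the Go runtime is actually implemented: threadCache, goroutineCtx, acquire, release and alloc are all names I made up for illustration, and the "register" is modelled as a plain struct field. The point it tries to show is that the per-thread state is only touched at the scheduling points, while the allocation fast path works on the goroutine's cached copy.

```go
package main

import "fmt"

// threadCache is a hypothetical per-thread bump region (not a real runtime type).
type threadCache struct {
	buf  []byte // memory owned by this thread's cache
	next int    // bump offset of the next free byte
}

// goroutineCtx is a hypothetical per-goroutine context. Its next/end fields
// stand in for values a real runtime might keep in registers.
type goroutineCtx struct {
	tc   *threadCache
	next int // cached copy of tc.next, valid only between acquire and release
	end  int // cached limit of the region
}

// acquire models the point where the goroutine starts running on a thread:
// the thread's bump offset is loaded into the goroutine's cached copy.
func (g *goroutineCtx) acquire(tc *threadCache) {
	g.tc = tc
	g.next = tc.next
	g.end = len(tc.buf)
}

// release models the point where the goroutine stops running on the thread:
// the cached offset is written back so the next goroutine continues from it.
func (g *goroutineCtx) release() {
	g.tc.next = g.next
	g.tc = nil
}

// alloc is the fast path: round up to 8 bytes, compare, add. It reads and
// writes only the cached fields. It returns nil when the region is exhausted
// (a real allocator would take a slow path here).
func (g *goroutineCtx) alloc(n int) []byte {
	if n < 0 {
		return nil
	}
	n = (n + 7) &^ 7
	if g.end-g.next < n {
		return nil
	}
	p := g.tc.buf[g.next : g.next+n : g.next+n]
	g.next += n
	return p
}

func main() {
	t1 := &threadCache{buf: make([]byte, 64)}
	t2 := &threadCache{buf: make([]byte, 64)}

	var g goroutineCtx
	g.acquire(t1) // scheduled onto thread 1
	g.alloc(10)   // rounded up to 16 bytes
	g.alloc(8)
	g.release() // descheduled, offset written back to t1

	g.acquire(t2) // migrated to thread 2
	g.alloc(1)    // rounded up to 8 bytes
	g.release()

	fmt.Println(t1.next, t2.next)
}
```

In this model a goroutine that migrates between threads simply writes its offset back on release and picks up the new thread's offset on acquire, so nothing in the fast path needs synchronization as long as only one goroutine runs on a thread at a time.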