One more bit of progress that shaved off more time. I followed the profiling guide at https://blog.golang.org/profiling-go-programs
and introduced a pair of global big.Int cache variables for the function that was consuming most of the time: one for the smaller intermediate results and one for the larger. It turns out that the z.make function listed below checks the capacity of z before doing anything; if the capacity is sufficient, it just returns the existing slice instead of allocating a new one. The problem I was having was that the intermediate results were of mixed sizes, so a new slice was being allocated once or twice per function call. With the cache variables in place, mallocgc takes only 13% of the total time instead of 46%. Thanks for all your suggestions.