I’d be tempted to just use C for this. That is, generate C code from a
register-level description of your N simulation cores and run that. That is
more or less what “cycle-based” Verilog simulators do (or used to do). The code
generator can also split the work M ways to run on M physical cores, and it can
generate synchronization code tailored to that split.

With linked lists you’re wasting half of the memory bandwidth and potentially
half of the cache on next-pointers. Your number of elements is not going to
change, so a linked list doesn’t buy you anything. An array is ideal from a
performance PoV.
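
A minimal Go sketch of what I mean by "an array", in case it helps. The
coreState fields and the stepAll body are placeholders I made up, not anything
from your simulator; the point is only that a fixed-size slice of structs is
walked sequentially with no pointer chasing.

```go
package main

import "fmt"

// coreState is a hypothetical per-simulated-core record; the fields are
// placeholders for whatever register-level state the real model needs.
type coreState struct {
	pc   uint64
	regs [8]uint64
}

// stepAll runs one placeholder worklet for every simulated core. Ranging over
// a slice touches memory in order, with no next-pointer load per element.
func stepAll(cores []coreState) {
	for i := range cores {
		c := &cores[i]
		c.regs[0] += c.pc
		c.pc += 4
	}
}

func main() {
	cores := make([]coreState, 2048) // element count is fixed, so allocate once
	stepAll(cores)
	stepAll(cores)
	fmt.Println(cores[0].pc, cores[0].regs[0])
}
```

Each executing core can then be handed a contiguous sub-slice of cores, which
also keeps its working set in its own cache lines.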

> On Jan 17, 2021, at 7:49 AM, Pete Wilson <peter.wil...@bsc.es> wrote:
> That’s exactly the plan.
> 
> The idea is to simulate perhaps the workload of a complete chiplet. That 
> might be (assuming no SIMD in the processors to keep the example light) 2K 
> cores. Each worklet is perhaps 25-50 nsec (worklet = work done for one core) 
> for each simulated core.
> 
> The simplest mechanism is probably that on finishing work, every worker sends 
> a message to main; when it’s got all the messages, it sends a message to each 
> of the workers. Nice and simple. But it seems as though a channel 
> communication is of the order of 100ns, so I’m eating 200nsec per phase 
> change in each worker
> 
> With 10 executing cores and 2K simulated cores, we get to do around 200 
> worklets per phase per executing core. At 25ns per worklet that’s 5 
> microseconds of work per worker, and losing 200nsec out of that will still 
> let the thing scale reasonably to some useful number of cores. 
> 
> But tools are more useful if they’re relatively broad-spectrum. If I want to 
> have 20 worklets per core per phase (running a simulation on a subset of the 
> system, to gain simulation speed), I now am using ~ 200ns out of 500 ns of 
> work, which is not a hugely scalable number at all. Probably it’d run slower 
> than a standard single core sequential implementation regardless of the 
> number of cores, so not a Big Win
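
The arithmetic in the two paragraphs above, written out as a small Go helper.
syncOverhead is a name I made up; the 25 ns worklet and 200 ns
synchronization figures are the ones quoted in the message, and 2K is rounded
to 2000 as it is there.

```go
package main

import "fmt"

// syncOverhead returns the useful work per phase on one executing core (in ns)
// and the synchronization cost expressed as a fraction of that work.
func syncOverhead(simCores, execCores int, workletNs, syncNs float64) (workNs, ratio float64) {
	workletsPerWorker := float64(simCores) / float64(execCores)
	workNs = workletsPerWorker * workletNs
	return workNs, syncNs / workNs
}

func main() {
	// Full chiplet: about 200 worklets per executing core per phase.
	// Subset run: 20 worklets per executing core per phase.
	for _, simCores := range []int{2000, 200} {
		work, ratio := syncOverhead(simCores, 10, 25, 200)
		fmt.Printf("%d simulated cores: %.0f ns of work per phase, sync is %.0f%% of that\n",
			simCores, work, ratio*100)
	}
}
```

That gives 5000 ns of work with 4% lost to synchronization in the first case,
and 500 ns of work with 40% lost in the second.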
> 
> Were the runtime.Unique() function to exist (a hypothetical scheduler call 
> that, for some number of goroutines and cores, allows a goroutine to declare 
> that it should be the sole workload for a core; limited to a fairly large 
> subset of available cores) I could spinloop on an atomic load, emulating 
> waitgroup/barrier behaviour without any scheduler involvement and with times 
> closer to the 10ns level (when worklet path lengths were well-balanced)
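
A spinning barrier on atomics can be sketched with what Go has today, with one
caveat: since nothing like runtime.Unique() exists, a goroutine cannot claim a
core for itself, so this sketch yields with runtime.Gosched() while it spins.
That yield is my assumption for safety, and it means the scheduler is still
involved. The spinBarrier type and runSpin helper are illustrative names, not
an existing API.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// spinBarrier is a generation-counting barrier for a fixed number of parties.
type spinBarrier struct {
	n     int32  // number of goroutines that must arrive
	count int32  // arrivals in the current generation
	gen   uint32 // bumped by the last arriver to release the others
}

func (b *spinBarrier) wait() {
	gen := atomic.LoadUint32(&b.gen)
	if atomic.AddInt32(&b.count, 1) == b.n {
		atomic.StoreInt32(&b.count, 0) // reset before releasing anyone
		atomic.AddUint32(&b.gen, 1)
		return
	}
	for atomic.LoadUint32(&b.gen) == gen {
		runtime.Gosched() // assumption: yield, as goroutines cannot pin a core
	}
}

// runSpin pushes `workers` goroutines through `phases` barrier rounds and
// reports whether every goroutine saw all of a phase's work after the barrier.
func runSpin(workers, phases int) bool {
	b := &spinBarrier{n: int32(workers)}
	var total, bad int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := 0; p < phases; p++ {
				atomic.AddInt64(&total, 1) // stand-in for one phase of worklets
				b.wait()
				if atomic.LoadInt64(&total) < int64(workers*(p+1)) {
					atomic.AddInt64(&bad, 1)
				}
			}
		}()
	}
	wg.Wait()
	return bad == 0 && total == int64(workers*phases)
}

func main() {
	fmt.Println(runSpin(4, 1000))
}
```

Whether this gets anywhere near the 10ns level depends on the goroutines
actually staying on separate cores, which the runtime does not guarantee.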
> 
> I’d also welcome the news that under useful circumstances channel 
> communication is only (say) 20ns. That’d simplify things beauteously. (All ns 
> measured on a ~3GHz Core i7 of 2018)
> 
> [Historical Note: When I were a young lad, I wrote quite a bit of stuff in 
> occam, so channel-y stuff is lapsed second nature - channels just work - and 
> all this building barrier stuff is terribly unnatural. So my instincts are 
> (were?) good, but platform performance doesn’t seem to want to play along]
> 
>> On Jan 17, 2021, at 9:21 AM, Robert Engels <reng...@ix.netcom.com> wrote:
>> 
>> If there is very little work to be done - then you have N “threads” do M 
>> partitioned work. If M is 10x N you’ve decimated the synchronization cost. 
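
One way to do the N-workers-over-M-items split is with contiguous ranges, so
each worker synchronizes once per phase rather than once per item. The chunk
helper below is a hypothetical name and just one reasonable way to divide the
range evenly.

```go
package main

import "fmt"

// chunk returns the half-open range [lo, hi) of the m work items assigned to
// worker i out of n. The ranges are contiguous, cover every item exactly once,
// and differ in size by at most one item.
func chunk(m, n, i int) (lo, hi int) {
	lo = i * m / n
	hi = (i + 1) * m / n
	return lo, hi
}

func main() {
	const simCores, workers = 2048, 10
	for i := 0; i < workers; i++ {
		lo, hi := chunk(simCores, workers, i)
		fmt.Printf("worker %d: items [%d, %d), %d worklets per phase\n", i, lo, hi, hi-lo)
	}
}
```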
> 
> 
> 
> WARNING / LEGAL TEXT: This message is intended only for the use of the 
> individual or entity to which it is addressed and may contain information 
> which is privileged, confidential, proprietary, or exempt from disclosure 
> under applicable law. If you are not the intended recipient or the person 
> responsible for delivering the message to the intended recipient, you are 
> strictly prohibited from disclosing, distributing, copying, or in any way 
> using this message. If you have received this communication in error, please 
> notify the sender and destroy and delete any copies you may have received. 
> 
> http://www.bsc.es/disclaimer 
