Hi,

I have a few questions for anyone familiar with gem5 + GCN3 simulations.
I followed the instructions at https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/gpu/DNNMark/ to build gem5 + GCN3.

I ran the DNNMark test test_fwd_softmax twice to check how latency affects the results. The original value of mem_req_latency and mem_resp_latency was 50; for the second run I set both to 40.

gem5/build/GCN3_X86/gpu-compute/GPU.py:

    mem_req_latency = Param.Int(40, "Latency for request from the cu to ruby. "\
                                "Represents the pipeline to reach the TCP "\
                                "and specified in GPU clock cycles")
    mem_resp_latency = Param.Int(40, "Latency for responses from ruby to the "\
                                 "cu. Represents the pipeline between the "\
                                 "TCP and cu as well as TCP data array "\
                                 "access. Specified in GPU clock cycles")

1. Why does stats.txt for the "40" run show a reduced number of instructions?

    system.cpu3.CUs2.numInstrExecuted    71838   (50)
    system.cpu3.CUs2.numInstrExecuted    69075   (40)

2. Does the GPU kernel perform optimizations on the instructions because of the shorter waiting time?

3. Do stats like the ones below make sense?

    Stat                                                    "50"             "40"
    system.cpu3.CUs2.ScheduleStage.dispNrdyStalls::Ready    347264           351634
    system.cpu3.CUs2.instCyclesLdsPerSimd::0                800              696
    system.cpu3.CUs2.tlbRequests                            104000           100000
    system.cpu3.CUs2.tlbCycles                              10101232000      9141288000
    system.cpu3.CUs2.numInstrExecuted                       71838            69075
    system.cpu3.CUs2.headTailLatency::mean                  68651.850962     64891.258333
    system.cpu3.CUs2.headTailLatency::stdev                 157090.173635    155057.054245

4. The runtime is the same in both runs. Is there a way to end the simulation when all instructions have completed instead of at a fixed time? That would give me another way to confirm that the run with latency = 40 ends sooner.
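To compare the two stats.txt files more systematically than by eye, a small script along these lines could list every stat whose value differs between the runs. This is only a sketch: it assumes the usual `name value # description` layout of stats.txt lines, and the function names `parse_stats`, `diff_stats` and `compare_files` are made up for illustration, not part of gem5.

```python
def parse_stats(text):
    """Parse gem5 stats.txt content into {stat_name: float}.

    Assumes each stat line looks like `name value [# description]`.
    If the file holds several dumps, later values overwrite earlier ones.
    """
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing description
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            stats[parts[0]] = float(parts[1])
        except ValueError:
            # Not a numeric stat line (e.g. the Begin/End marker lines).
            continue
    return stats


def diff_stats(base, new):
    """Return (name, base_value, new_value, relative_change) for each stat
    present in both runs whose value differs, sorted by name."""
    rows = []
    for name in sorted(base.keys() & new.keys()):
        a, b = base[name], new[name]
        if a == b:
            continue
        rel = (b - a) / a if a != 0 else float("inf")
        rows.append((name, a, b, rel))
    return rows


def compare_files(path_base, path_new):
    """Print the differing stats of two stats.txt files."""
    with open(path_base) as f:
        base = parse_stats(f.read())
    with open(path_new) as f:
        new = parse_stats(f.read())
    for name, a, b, rel in diff_stats(base, new):
        print(f"{name:60s} {a:>18g} {b:>18g} {rel:+.2%}")
```

With the two runs saved in separate output directories (the paths here are hypothetical), `compare_files("m5out_50/stats.txt", "m5out_40/stats.txt")` would print one line per differing stat, which also makes it easy to see that global counters such as simTicks did not change.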
---------- Begin Simulation Statistics ---------- "50"
simSeconds       0.126230        # Number of seconds simulated (Second)
simTicks         126229990500    # Number of ticks simulated (Tick)
finalTick        126229990500    # Number of ticks from beginning of simulation (restored from checkpoints and never reset) (Tick)
simFreq          1000000000000   # The number of ticks per simulated second ((Tick/Second))
hostSeconds      199.21          # Real time elapsed on the host (Second)
hostTickRate     633665289       # The number of ticks simulated per host second (ticks/s) ((Tick/Second))
hostMemory       3596200         # Number of bytes of host memory used (Byte)
simInsts         38011242        # Number of instructions simulated (Count)
simOps           72305276        # Number of ops (including micro ops) simulated (Count)
hostInstRate     190811          # Simulator instruction rate (inst/s) ((Count/Second))
hostOpRate       362962          # Simulator op (including micro ops) rate (op/s) ((Count/Second))

---------- Begin Simulation Statistics ---------- "40"
simSeconds       0.126230        # Number of seconds simulated (Second)
simTicks         126229990500    # Number of ticks simulated (Tick)
finalTick        126229990500    # Number of ticks from beginning of simulation (restored from checkpoints and never reset) (Tick)
simFreq          1000000000000   # The number of ticks per simulated second ((Tick/Second))
hostSeconds      199.32          # Real time elapsed on the host (Second)
hostTickRate     633294503       # The number of ticks simulated per host second (ticks/s) ((Tick/Second))
hostMemory       3598508         # Number of bytes of host memory used (Byte)
simInsts         38010420        # Number of instructions simulated (Count)
simOps           72303566        # Number of ops (including micro ops) simulated (Count)
hostInstRate     190696          # Simulator instruction rate (inst/s) ((Count/Second))
hostOpRate       362742          # Simulator op (including micro ops) rate (op/s) ((Count/Second))

Thanks,
David