Hi Jason,

We are testing the STREAM memory bandwidth benchmark (https://www.cs.virginia.edu/stream/) in gem5, but the results show that the CPU cannot fully utilize the DDR bandwidth: the achieved bandwidth is only about 1/10 of the peak bandwidth (peakBW in stats.txt). I ran the same STREAM binary natively on my x86 computer and got close to the peak bandwidth, so I believe the program itself is fine.
I have read the mailing list thread https://www.mail-archive.com/gem5-users@gem5.org/msg12965.html, and I think I have hit a similar problem. I tried the suggestions proposed by Andreas, including enabling the L1/L2 prefetchers and using the ARM detailed CPU. These changes improve the bandwidth, but only to a limited extent.

I have also run STREAM in FS mode with the x86 O3, Minor, and TimingSimple CPUs, and in SE mode with the Ruby option. All of these give similar results, with no essential difference, so I guess this is a general issue when simulating with gem5. Is this result expected, or is there something wrong with the system model?

Two of the experimental results are attached for reference.

1. X86 O3CPU, SE mode, without L2 prefetcher:

    ./build/X86/gem5.opt --outdir=m5out-stream configs/example/se.py \
        --cpu-type=O3CPU --caches --l1d_size=256kB --l1i_size=256kB \
        --l2cache --l2_size=8MB --mem-type=DDR3_1600_8x8 -c ../stream/stream

STREAM output:

    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:            1099.0     0.014559     0.014559     0.014559
    Scale:           1089.7     0.014683     0.014683     0.014683
    Add:             1213.0     0.019786     0.019786     0.019786
    Triad:           1222.1     0.019639     0.019639     0.019639
    -------------------------------------------------------------

stats.txt (DRAM related):

    system.mem_ctrls.dram.bytesRead      238807808    # Total bytes read (Byte)
    system.mem_ctrls.dram.bytesWritten   121179776    # Total bytes written (Byte)
    system.mem_ctrls.dram.avgRdBW        718.689026   # Average DRAM read bandwidth in MiBytes/s ((Byte/Second))
    system.mem_ctrls.dram.avgWrBW        364.688977   # Average DRAM write bandwidth in MiBytes/s ((Byte/Second))
    system.mem_ctrls.dram.peakBW         12800.00     # Theoretical peak bandwidth in MiByte/s ((Byte/Second))
    system.mem_ctrls.dram.busUtil        8.46         # Data bus utilization in percentage (Ratio)
    system.mem_ctrls.dram.busUtilRead    5.61         # Data bus utilization in percentage for reads (Ratio)
    system.mem_ctrls.dram.busUtilWrite   2.85         # Data bus utilization in percentage for writes (Ratio)
    system.mem_ctrls.dram.pageHitRate    40.57        # Row buffer hit rate, read and write combined (Ratio)

2. X86 O3CPU, SE mode, with L2 prefetcher:

    ./build/X86/gem5.opt --outdir=m5out-stream-l2hwp configs/example/se.py \
        --cpu-type=O3CPU --caches --l1d_size=256kB --l1i_size=256kB \
        --l2cache --l2_size=8MB --l2-hwp-type=StridePrefetcher \
        --mem-type=DDR3_1600_8x8 -c ../stream/stream

STREAM output:

    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:            1703.9     0.009390     0.009390     0.009390
    Scale:           1718.6     0.009310     0.009310     0.009310
    Add:             2087.3     0.011498     0.011498     0.011498
    Triad:           2227.2     0.010776     0.010776     0.010776
    -------------------------------------------------------------

stats.txt (DRAM related):

    system.mem_ctrls.dram.bytesRead      238811712    # Total bytes read (Byte)
    system.mem_ctrls.dram.bytesWritten   121179840    # Total bytes written (Byte)
    system.mem_ctrls.dram.avgRdBW        1014.129912  # Average DRAM read bandwidth in MiBytes/s ((Byte/Second))
    system.mem_ctrls.dram.avgWrBW        514.598298   # Average DRAM write bandwidth in MiBytes/s ((Byte/Second))
    system.mem_ctrls.dram.peakBW         12800.00     # Theoretical peak bandwidth in MiByte/s ((Byte/Second))
    system.mem_ctrls.dram.busUtil        11.94        # Data bus utilization in percentage (Ratio)
    system.mem_ctrls.dram.busUtilRead    7.92         # Data bus utilization in percentage for reads (Ratio)
    system.mem_ctrls.dram.busUtilWrite   4.02         # Data bus utilization in percentage for writes (Ratio)
    system.mem_ctrls.dram.pageHitRate    75.37        # Row buffer hit rate, read and write combined (Ratio)

STREAM compile options:

    gcc -O2 -static -DSTREAM_ARRAY_SIZE=1000000 -DNTIMES=2 stream.c -o stream

All experiments were performed on the latest stable version (141cc37c2d4b93959d4c249b8f7e6a8b2ef75338, v21.2.1).

Thank you very much!

Best Regards,
Zicong
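
P.S. As a sanity check on the numbers above, here is a small Python sketch that recomputes the peak bandwidth, the bus utilization, and the STREAM rates from the raw values. The helper names are my own (hypothetical). The 8-byte data bus for DDR3_1600_8x8 and the STREAM byte counts per element (16 for Copy/Scale, 24 for Add/Triad, with 8-byte doubles) are assumptions based on my reading of the memory config and stream.c, not values taken from stats.txt.

```python
# Sanity check of the reported figures. Helper names are hypothetical.

ARRAY_SIZE = 1_000_000  # from -DSTREAM_ARRAY_SIZE=1000000


def ddr_peak_bw(mega_transfers_per_s, bus_bytes):
    """Peak bandwidth in MB/s: transfer rate times data bus width."""
    return mega_transfers_per_s * bus_bytes


def bus_util(avg_rd_bw, avg_wr_bw, peak_bw):
    """Data bus utilization in percent, from read/write/peak bandwidth."""
    return 100.0 * (avg_rd_bw + avg_wr_bw) / peak_bw


def stream_rate(bytes_per_elem, n_elems, seconds):
    """STREAM-style rate in MB/s: counted bytes divided by kernel time."""
    return bytes_per_elem * n_elems / seconds / 1e6


if __name__ == "__main__":
    # Assumption: DDR3-1600 with a 64-bit (8-byte) data bus.
    print("peak BW        :", ddr_peak_bw(1600, 8))

    # Run 1 (no L2 prefetcher) and run 2 (stride prefetcher), from stats.txt.
    print("busUtil run 1  :", round(bus_util(718.689026, 364.688977, 12800.0), 2))
    print("busUtil run 2  :", round(bus_util(1014.129912, 514.598298, 12800.0), 2))

    # Assumption: Copy moves 16 bytes per element, Triad 24 bytes per element.
    print("Copy  run 1    :", round(stream_rate(16, ARRAY_SIZE, 0.014559), 1))
    print("Triad run 2    :", round(stream_rate(24, ARRAY_SIZE, 0.010776), 1))
```

Under these assumptions the recomputed values agree with peakBW, busUtil, and the STREAM "Best Rate" column, so the reported figures at least appear internally consistent.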