Hi, all. I am running some experiments on sparse matrix multiplication,
which consists mainly of floating-point multiplication and addition.
To keep the compiler from vectorizing the source code, I compile it with the
GCC flag "-march=armv8-a+nosimd+nosve". This should mean that the generated
assembly contains no SIMD instructions. In addition, I ran objdump on the ELF
file to confirm that there are no SIMD instructions.
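The objdump check described above could be scripted roughly as follows. This is only a minimal sketch: the function name find_vector_lines, the regular expression, and the three sample lines are hypothetical and made up for illustration, and the pattern is a heuristic that looks for Advanced SIMD or SVE register operands (such as v0.4s or z3.d) rather than decoding instructions.

```python
import re

# Heuristic (an assumption, not an exhaustive decoder): Advanced SIMD operands
# look like v0.4s or v2.s[1], SVE operands like z3.d. Scalar FP registers
# such as s0 or d1 are deliberately not matched.
VECTOR_OPERAND = re.compile(r"\b[vz](?:[12]?[0-9]|3[01])\.(?:\d+)?[bhsdq]\b")


def find_vector_lines(disassembly: str) -> list[str]:
    """Return the disassembly lines that mention a vector register operand."""
    return [line for line in disassembly.splitlines()
            if VECTOR_OPERAND.search(line)]


# Invented sample in the style of objdump output without raw instruction bytes.
SAMPLE = """\
  400580:\tfmadd\ts0, s1, s0, s2
  400584:\tldr\ts1, [x2, x3, lsl #2]
  400588:\tfadd\tv0.4s, v0.4s, v1.4s
"""

if __name__ == "__main__":
    for line in find_vector_lines(SAMPLE):
        print(line)
```

In practice the input would be the text produced by objdump -d on the ELF file instead of SAMPLE. An empty result only means that no operand matched the pattern.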
My question is:
I use the Rename debug flag to observe the rename stage. However, most
rename-stage stalls occur because the rename map has 0 free entries. More
specifically, numFreeVecEntries is always zero, for example:
1539442000: system.cpu.rename: [tid:0] Free IQ: 29, Free ROB: 97, Free LQ: 26, Free SQ: 48, FreeRM 0(64 192 0 14 625)
1539442000: system.cpu.rename: [tid:0] 0 instructions not yet in ROB
1539442000: system.cpu.rename: calcFreeLQEntries: free lqEntries: 26, loadsInProgress: 0, loads dispatchedToLQ: 0
1539442000: system.cpu.rename: [tid:0] Stall: RenameMap has 0 free entries.
1539442000: system.cpu.rename: [tid:0] Blocking.
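To follow these counters over a long trace, the "Free IQ ... FreeRM" line can be parsed with a small script. The sketch below is an assumption-laden illustration: the function name parse_free_entries and the dictionary keys are hypothetical, and the five values in parentheses are kept as an unlabeled tuple because the log line itself does not say which register class each position belongs to. In the line above, the number before the parentheses equals the smallest value inside them.

```python
import re

# Matches the "Free IQ ... FreeRM" line; \s+ tolerates a mail-wrapped line.
RENAME_LINE = re.compile(
    r"(?P<tick>\d+): system\.cpu\.rename: \[tid:(?P<tid>\d+)\]\s+"
    r"Free IQ: (?P<iq>\d+), Free ROB: (?P<rob>\d+), Free LQ: (?P<lq>\d+),\s+"
    r"Free SQ: (?P<sq>\d+), FreeRM (?P<rm_min>\d+)\((?P<rm_classes>[\d ]+)\)"
)


def parse_free_entries(line):
    """Return the free-entry counters of one trace line, or None if it is another kind of line."""
    m = RENAME_LINE.search(line)
    if m is None:
        return None
    return {
        "tick": int(m["tick"]),
        "tid": int(m["tid"]),
        "free_iq": int(m["iq"]),
        "free_rob": int(m["rob"]),
        "free_lq": int(m["lq"]),
        "free_sq": int(m["sq"]),
        "rm_min": int(m["rm_min"]),
        # Per-class free rename-map entries, in the order printed by the log.
        "rm_classes": tuple(int(x) for x in m["rm_classes"].split()),
    }


LINE = ("1539442000: system.cpu.rename: [tid:0] Free IQ: 29, Free ROB: 97, "
        "Free LQ: 26, Free SQ: 48, FreeRM 0(64 192 0 14 625)")

if __name__ == "__main__":
    print(parse_free_entries(LINE))
```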
However, I did not use any SIMD instructions, so I am confused about why the
vector entries in the rename stage are being used up. This seems to be the
bottleneck for rename-stage efficiency.
The command line is:
../../../build/ARM/gem5.opt ../../../configs/example/se.py \
--cpu-type=O3_ARM_v7a_3 --cpu-clock=1GHz \
--mem-type DDR3_1600_8x8 --mem-size 16GB \
--num-cpu 1 \
-c "/home/fugelin/Work/Research/gem5/gem5_git/se_files/spmv/bin/spmv_csr.elf" \
--caches --l2cache --l1i_size 64kB --l1d_size 64kB --l2_size 256kB \
--l1i_assoc 4 --l1d_assoc 4 --l2_assoc 8 --cacheline_size 128
The relevant output in the stats file is:
system.cpu.iq.FU_type_0::No_OpClass             1   0.03%    0.03%  # Type of FU issued
system.cpu.iq.FU_type_0::IntAlu              2036  52.38%   52.41%  # Type of FU issued
system.cpu.iq.FU_type_0::IntMult                0   0.00%   52.41%  # Type of FU issued
system.cpu.iq.FU_type_0::IntDiv                 0   0.00%   52.41%  # Type of FU issued
system.cpu.iq.FU_type_0::FloatAdd               0   0.00%   52.41%  # Type of FU issued
system.cpu.iq.FU_type_0::FloatCmp               0   0.00%   52.41%  # Type of FU issued
system.cpu.iq.FU_type_0::FloatCvt               0   0.00%   52.41%  # Type of FU issued
system.cpu.iq.FU_type_0::FloatMult              0   0.00%   52.41%  # Type of FU issued
system.cpu.iq.FU_type_0::FloatMultAcc         376   9.67%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::FloatDiv               0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::FloatMisc              0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::FloatSqrt              0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdAdd                0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdAddAcc             0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdAlu                0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdCmp                0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdCvt                0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdMisc               0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdMult               0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdMultAcc            0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdShift              0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdShiftAcc           0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdDiv                0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdSqrt               0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatAdd           0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatAlu           0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatCmp           0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatCvt           0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatDiv           0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatMisc          0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatMult          0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatMultAcc       0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatSqrt          0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdReduceAdd          0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdReduceAlu          0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdReduceCmp          0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatReduceAdd     0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdFloatReduceCmp     0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdAes                0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdAesMix             0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdSha1Hash           0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdSha1Hash2          0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdSha256Hash         0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdSha256Hash2        0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdShaSigma2          0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdShaSigma3          0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::SimdPredAlu            0   0.00%   62.08%  # Type of FU issued
system.cpu.iq.FU_type_0::MemRead             1361  35.01%   97.09%  # Type of FU issued
system.cpu.iq.FU_type_0::MemWrite             113   2.91%  100.00%  # Type of FU issued
system.cpu.iq.FU_type_0::FloatMemRead           0   0.00%  100.00%  # Type of FU issued
system.cpu.iq.FU_type_0::FloatMemWrite          0   0.00%  100.00%  # Type of FU issued
system.cpu.iq.FU_type_0::IprAccess              0   0.00%  100.00%  # Type of FU issued
system.cpu.iq.FU_type_0::InstPrefetch           0   0.00%  100.00%  # Type of FU issued
system.cpu.iq.FU_type_0::total               3887                   # Type of FU issued
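Since most of these rows are zero, a short script can reduce the listing to the op classes that were actually issued. This is a minimal sketch only: the helper names issued_by_class and nonzero_classes are hypothetical, and SAMPLE holds a handful of rows copied from the output above rather than the full file.

```python
import re

# One stat per line: the name suffix after "FU_type_0::" and the first integer column.
FU_LINE = re.compile(r"^system\.cpu\.iq\.FU_type_0::(\w+)\s+(\d+)")


def issued_by_class(stats_text):
    """Map each FU_type_0 op class (and 'total') to its issued count."""
    counts = {}
    for line in stats_text.splitlines():
        m = FU_LINE.match(line.strip())
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts


def nonzero_classes(counts):
    """Keep only the op classes with at least one issued instruction."""
    return {name: n for name, n in counts.items()
            if n > 0 and name != "total"}


# A few rows copied from the output above.
SAMPLE = """\
system.cpu.iq.FU_type_0::No_OpClass 1 0.03% 0.03% # Type of FU issued
system.cpu.iq.FU_type_0::IntAlu 2036 52.38% 52.41% # Type of FU issued
system.cpu.iq.FU_type_0::FloatMultAcc 376 9.67% 62.08% # Type of FU issued
system.cpu.iq.FU_type_0::SimdAdd 0 0.00% 62.08% # Type of FU issued
system.cpu.iq.FU_type_0::MemRead 1361 35.01% 97.09% # Type of FU issued
system.cpu.iq.FU_type_0::MemWrite 113 2.91% 100.00% # Type of FU issued
system.cpu.iq.FU_type_0::total 3887 # Type of FU issued
"""

if __name__ == "__main__":
    counts = issued_by_class(SAMPLE)
    for name, n in nonzero_classes(counts).items():
        print(name, n)
```

For the numbers above, the nonzero classes are No_OpClass, IntAlu, FloatMultAcc, MemRead and MemWrite, and their counts add up to the reported total of 3887.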
Gelin Fu
Xian Jiaotong University