Hi Gelin, I understand the confusion.
Even if there are no SIMD operations, you are still issuing FP operations, and those use vector registers, since the single and double precision scalar registers are elements of the SIMD vector registers.

Kind Regards

Giacomo

> -----Original Message-----
> From: Gelin Fu via gem5-users <gem5-users@gem5.org>
> Sent: 01 September 2021 04:17
> To: gem5-users@gem5.org
> Cc: Gelin Fu <20153...@cqu.edu.cn>
> Subject: [gem5-users] Problem with SIMD instructions execution in se mode
> when ISA is ARM
>
> Hi, all. I am trying to run some experiments on sparse matrix
> multiplication, which is mainly float multiplication and addition.
> To ensure the compiler doesn't vectorize the source code, I compile it
> with the GCC flag "-march=armv8-a+nosimd+nosve". This should mean that my
> assembly code doesn't have any SIMD instructions. Besides, I objdump the
> ELF file to confirm that there are no SIMD instructions.
>
> The questions are:
> I use the Rename debug flag to observe the rename stage. However, most
> rename stage stalls occur because the rename map has 0 free entries. More
> specifically, numFreeVecEntries is always zero, for example:
>
> 1539442000: system.cpu.rename: [tid:0] Free IQ: 29, Free ROB: 97, Free LQ: 26, Free SQ: 48, FreeRM 0(64 192 0 14 625)
> 1539442000: system.cpu.rename: [tid:0] 0 instructions not yet in ROB
> 1539442000: system.cpu.rename: calcFreeLQEntries: free lqEntries: 26, loadsInProgress: 0, loads dispatchedToLQ: 0
> 1539442000: system.cpu.rename: [tid:0] Stall: RenameMap has 0 free entries.
> 1539442000: system.cpu.rename: [tid:0] Blocking.
>
> However, I did not use any SIMD instructions, so I am confused about why
> the vector entries in the rename stage are being used. This seems to be
> the bottleneck of rename stage efficiency.
>
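To illustrate the aliasing behind the answer to this question, here is a minimal sketch in Python. It is a toy model of the AArch64 architectural view (Sn and Dn are the low 32 and 64 bits of the 128-bit Vn register), not gem5's implementation; the `VecReg` class and its method names are made up for illustration.

```python
import struct

VEC_BYTES = 16  # one 128-bit AArch64 SIMD&FP register (V0..V31)


class VecReg:
    """Hypothetical toy model: one vector register with scalar views."""

    def __init__(self):
        self.raw = bytearray(VEC_BYTES)

    def write_d(self, value):
        # A scalar double write fills the low 64 bits and zeroes the rest.
        self.raw[:] = struct.pack("<d", value) + bytes(VEC_BYTES - 8)

    def write_s(self, value):
        # A scalar single write fills the low 32 bits and zeroes the rest.
        self.raw[:] = struct.pack("<f", value) + bytes(VEC_BYTES - 4)

    def read_d(self):
        return struct.unpack_from("<d", self.raw, 0)[0]

    def read_s(self):
        return struct.unpack_from("<f", self.raw, 0)[0]


v0 = VecReg()
v0.write_d(1.5)             # like writing D0
v0_after_d = bytes(v0.raw)  # snapshot of the whole 128-bit register
v0.write_s(2.0)             # like writing S0: same storage as D0 and V0
print(v0_after_d.hex(), v0.read_s())
```

Because every scalar write updates the whole vector register, a scalar FP instruction still needs a vector destination register, which would explain why the vector rename entries are consumed even in code built with +nosimd.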
> The command lines are:
>
> ../../../build/ARM/gem5.opt ../../../configs/example/se.py \
>   --cpu-type=O3_ARM_v7a_3 --cpu-clock=1GHz \
>   --mem-type DDR3_1600_8x8 --mem-size 16GB \
>   --num-cpu 1 \
>   -c "/home/fugelin/Work/Research/gem5/gem5_git/se_files/spmv/bin/spmv_csr.elf" \
>   --caches --l2cache --l1i_size 64kB --l1d_size 64kB --l2_size 256kB --l1i_assoc 4 --l1d_assoc 4 --l2_assoc 8 --cacheline_size 128
>
> The output in stats are:
>
> system.cpu.iq.FU_type_0::No_OpClass             1   0.03%   0.03% # Type of FU issued
> system.cpu.iq.FU_type_0::IntAlu              2036  52.38%  52.41% # Type of FU issued
> system.cpu.iq.FU_type_0::IntMult                0   0.00%  52.41% # Type of FU issued
> system.cpu.iq.FU_type_0::IntDiv                 0   0.00%  52.41% # Type of FU issued
> system.cpu.iq.FU_type_0::FloatAdd               0   0.00%  52.41% # Type of FU issued
> system.cpu.iq.FU_type_0::FloatCmp               0   0.00%  52.41% # Type of FU issued
> system.cpu.iq.FU_type_0::FloatCvt               0   0.00%  52.41% # Type of FU issued
> system.cpu.iq.FU_type_0::FloatMult              0   0.00%  52.41% # Type of FU issued
> system.cpu.iq.FU_type_0::FloatMultAcc         376   9.67%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::FloatDiv               0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::FloatMisc              0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::FloatSqrt              0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdAdd                0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdAddAcc             0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdAlu                0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdCmp                0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdCvt                0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdMisc               0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdMult               0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdMultAcc            0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdShift              0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdShiftAcc           0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdDiv                0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdSqrt               0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatAdd           0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatAlu           0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatCmp           0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatCvt           0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatDiv           0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatMisc          0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatMult          0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatMultAcc       0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatSqrt          0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdReduceAdd          0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdReduceAlu          0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdReduceCmp          0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatReduceAdd     0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdFloatReduceCmp     0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdAes                0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdAesMix             0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdSha1Hash           0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdSha1Hash2          0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdSha256Hash         0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdSha256Hash2        0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdShaSigma2          0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdShaSigma3          0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::SimdPredAlu            0   0.00%  62.08% # Type of FU issued
> system.cpu.iq.FU_type_0::MemRead             1361  35.01%  97.09% # Type of FU issued
> system.cpu.iq.FU_type_0::MemWrite             113   2.91% 100.00% # Type of FU issued
> system.cpu.iq.FU_type_0::FloatMemRead           0   0.00% 100.00% # Type of FU issued
> system.cpu.iq.FU_type_0::FloatMemWrite          0   0.00% 100.00% # Type of FU issued
> system.cpu.iq.FU_type_0::IprAccess              0   0.00% 100.00% # Type of FU issued
> system.cpu.iq.FU_type_0::InstPrefetch           0   0.00% 100.00% # Type of FU issued
> system.cpu.iq.FU_type_0::total               3887                 # Type of FU issued
>
> Gelin Fu
> Xian Jiaotong University
> _______________________________________________
> gem5-users mailing list -- gem5-users@gem5.org
> To unsubscribe send an email to gem5-users-le...@gem5.org
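
The FU_type_0 stats quoted above already show the point made in the reply: FP operations were issued even though every Simd* class is zero. As a small sketch, the non-zero counts can be copied into Python and grouped by op class prefix. The `issued` dictionary and the `share` helper are illustrative names, not part of gem5.

```python
# Counts copied from the quoted FU_type_0 stats; zero-count classes are omitted.
issued = {
    "No_OpClass": 1,
    "IntAlu": 2036,
    "FloatMultAcc": 376,
    "MemRead": 1361,
    "MemWrite": 113,
}

total = sum(issued.values())


def share(prefix):
    """Return (count, percent of total) for op classes starting with prefix.

    FloatMem* classes are memory accesses, so they are excluded from the
    arithmetic "Float" group.
    """
    count = sum(
        n for name, n in issued.items()
        if name.startswith(prefix) and not name.startswith("FloatMem")
    )
    return count, 100.0 * count / total


fp_count, fp_pct = share("Float")
simd_count, simd_pct = share("Simd")

print(f"total issued: {total}")
print(f"scalar FP:    {fp_count} ({fp_pct:.2f}%)")
print(f"SIMD:         {simd_count} ({simd_pct:.2f}%)")
```

The scalar FP share matches the FloatMultAcc row in the stats, and those are the instructions that need vector rename entries according to the reply.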