Hello. I am trying to run a simple program with SVE instructions on gem5. However, the output with debug flag ExecALL suggests there is a issue with the decoder. Here is the test code: #define STREAM_ARRAY_SIZE 16 void main() { for (int j=0; j<STREAM_ARRAY_SIZE; j++) { A[j]=3; B[j]=2; } int x=add(A,B); printf("return %d \n",A[3]); // should print 6, does not in gem5 }
int add(int * restrict p, int * restrict q) { for (int i=0; i<STREAM_ARRAY_SIZE; i+=1) { *(p+i)=*(q+i)+4; } printf("dummy %d %d \n", *(p+3), *(q+3)); // should print 6 and 2, does not in gem5 return *(p+3); } I compiled it with gcc cross compiler for arm with following command: aarch64-linux-gnu-gcc-11 -O3 -static -mcpu=a64fx+sve2 -msve-vector-bits=512 -o test test.c Without the-mcpu=a64fx+sve2, SVE instructions are not generated. Here is the command I used: ./build/ARM/gem5.opt ./configs/deprecated/example/se.py --cpu-type=ArmO3CPU --caches --cacheline_size=64 --mem-size=8GB --arm-iset=aarch64 -c ./test I have also used "./configs/example/arm/starter_se.py", but the results are same. When I use --debug-flag=Execall, I see the following isssues: 1) 12589500: system.cpu: A0 T0 : 0x400524 @main+4 : ptrue p0, VL64 : SimdPredAlu : D=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] FetchSeq=14292 CPSeq=4962 flags=() The D=[] should not be all zeros. 2) 12591000: system.cpu: A0 T0 : 0x400550 @main+48 : st1 {z1}, p0/z, , [x19] : MemWrite : A=0x491040 FetchSeq=14305 CPSeq=4975 flags=(IsInteger|IsVector|IsStore) 12591000: system.cpu: A0 T0 : 0x400554 @main+52 : st1 {z0}, p0/z, , [x19, #1, mul vl] : MemWrite : A=0x491050 FetchSeq=14306 CPSeq=4976 flags=(IsInteger|IsVector|IsStore) The second A should be 0x491080, not 0x491050. I have run the same thing on RIKEN simulator, which was built on top of gem5 for Fujitsu A64FX. Here are the same instructions seen in RIKEN. 
1) 15322000: system.cpu A0 T0 : @main+4 : ptrue p0, VL64 : SimdPredAlu : D=0b[0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111] FetchSeq=18146 CPSeq=5254 flags=()

As you can see, my data arrays are 64 bytes each and the corresponding 64 bits in the predicate register are set to 1.

2) 15323000: system.cpu A0 T0 : @main+48 : st1 {z1}, p0/z, , [x19] : SveMemWrite : A=0x491040 FetchSeq=18159 CPSeq=5267 flags=(IsInteger|IsVector|IsMemRef|IsStore)
15323000: system.cpu A0 T0 : @main+52 : st1 {z0}, p0/z, , [x19, #1, mul vl] : SveMemWrite : A=0x491080 FetchSeq=18160 CPSeq=5268

The second address is calculated as 0x491080, which is the correct result for [x19, #1, mul vl] with vl=64: 0x491040 + 1 * 64 = 0x491080.

I tried to compare the files in src/arch/arm/ISA from RIKEN with those in current gem5. Since RIKEN is based on an old gem5, there are obvious syntax differences. Other than that, I have found two things:

1) In ArmISA.py in RIKEN, there is this:

    id_aa64pfr0_el1 = Param.UInt64(0x0000000100000022, "AArch64 Processor Feature Register 0")

I did not find anything similar in current gem5. I did find id_aa64pfr0_el1 in src/arch/arm/regs/misc.hh, but its value was not set anywhere.

2) In ArmISA.py in current gem5, there is a "FEAT_SVE" extension in class ArmDefaultSERelease. However, this is for Armv8.2, and I don't know how to specify this architecture on the command line.

What I am trying to find out is whether (a) I am missing a runtime flag that would enable proper SVE execution in gem5, (b) the problem comes from my compile-time flags, since I am setting -mcpu to a64fx (setting -march to armv8.2-a+sve or similar does not produce SVE instructions; it has to be -mcpu=a64fx+sve), or (c) this is a bug in current gem5 itself.

Any suggestions would be appreciated. Thank you.
_______________________________________________ gem5-users mailing list -- gem5-users@gem5.org To unsubscribe send an email to gem5-users-le...@gem5.org