Hello.
I am trying to run a simple program with SVE instructions on gem5. However, the 
output with the ExecAll debug flag suggests there is an issue with the decoder.
Here is the test code:
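#include <stdio.h>

#define STREAM_ARRAY_SIZE 16

/* The declarations of A and B were not shown in the post;
   global int arrays are assumed here. */
int A[STREAM_ARRAY_SIZE], B[STREAM_ARRAY_SIZE];

int add(int * restrict p, int * restrict q);

int main(void)
{
    for (int j = 0; j < STREAM_ARRAY_SIZE; j++) {
        A[j] = 3;
        B[j] = 2;
    }
    int x = add(A, B);
    printf("return %d \n", A[3]);  /* should print 6, does not in gem5 */
    return 0;
}

int add(int * restrict p, int * restrict q)
{
    for (int i = 0; i < STREAM_ARRAY_SIZE; i += 1) {
        *(p + i) = *(q + i) + 4;
    }
    /* should print 6 and 2, does not in gem5 */
    printf("dummy %d %d \n", *(p + 3), *(q + 3));
    return *(p + 3);
}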
I compiled it with the GCC cross compiler for Arm using the following command:
aarch64-linux-gnu-gcc-11 -O3 -static -mcpu=a64fx+sve2 -msve-vector-bits=512 \
    -o test test.c
Without -mcpu=a64fx+sve2, SVE instructions are not generated.
Here is the gem5 command I used:
./build/ARM/gem5.opt ./configs/deprecated/example/se.py --cpu-type=ArmO3CPU \
    --caches --cacheline_size=64 --mem-size=8GB --arm-iset=aarch64 -c ./test
I have also tried ./configs/example/arm/starter_se.py, but the results are the 
same.
When I use --debug-flags=ExecAll, I see the following issues:
1) 12589500: system.cpu: A0 T0 : 0x400524 @main+4    :   ptrue   p0, VL64       
  : SimdPredAlu
:  D=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]  FetchSeq=14292  CPSeq=4962  flags=()

The D=[...] value should not be all zeros; ptrue p0, VL64 should set 64 
predicate bits.

2)
12591000: system.cpu: A0 T0 : 0x400550 @main+48    :   st1   {z1}, p0/z, , [x19] : MemWrite :  A=0x491040  FetchSeq=14305  CPSeq=4975  flags=(IsInteger|IsVector|IsStore)
12591000: system.cpu: A0 T0 : 0x400554 @main+52    :   st1   {z0}, p0/z, , [x19, #1, mul vl] : MemWrite : A=0x491050  FetchSeq=14306  CPSeq=4976  flags=(IsInteger|IsVector|IsStore)

The second A should be 0x491080 (0x491040 + 1 * 64 bytes), not 0x491050.

I have run the same program on the RIKEN simulator, which was built on top of 
gem5 for the Fujitsu A64FX.
Here are the same instructions as seen in the RIKEN simulator:
1) 15322000: system.cpu A0 T0 : @main+4    :   ptrue   p0, VL64         : 
SimdPredAlu :  
D=0b[0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111]
  FetchSeq=18146  CPSeq=5254  flags=()
As you can see, my data arrays are 64 bytes each and the low 64 bits of the 
predicate register are set to 1.
2)
15323000: system.cpu A0 T0 : @main+48    :   st1   {z1}, p0/z, , [x19] : SveMemWrite :  A=0x491040  FetchSeq=18159  CPSeq=5267  flags=(IsInteger|IsVector|IsMemRef|IsStore)
15323000: system.cpu A0 T0 : @main+52    :   st1   {z0}, p0/z, , [x19, #1, mul vl] : SveMemWrite :   A=0x491080  FetchSeq=18160  CPSeq=5268

The second address is calculated as 0x491080, which is the correct result for 
[x19, #1, mul vl] with VL = 64 bytes.

I tried comparing the files in src/arch/arm/ISA between the RIKEN simulator and 
current gem5. Since the RIKEN simulator is based on an old gem5, there are 
obvious syntax differences. Other than that, I found two things:
1) In ArmISA.py in the RIKEN simulator, there is this:
     id_aa64pfr0_el1 = Param.UInt64(0x0000000100000022, "AArch64 Processor Feature Register 0")
I did not find anything similar in gem5. I did find id_aa64pfr0_el1 in 
ar/arm/reg/misch.hh, but its value was not set anywhere.
2) In ArmISA.py in current gem5, there is a "FEAT_SVE" extension in class 
ArmDefaultSERelease. However, this is for Armv8.2, and I don't know how to 
specify this architecture on the command line.

What I am trying to find out is which of these is the cause: am I missing a 
runtime flag that would enable proper SVE execution in gem5; is it a 
compile-time flag problem, since I am setting -mcpu to a64fx (setting -march to 
armv8.2-a+sve or similar does not produce SVE instructions; it has to be 
-mcpu=a64fx+sve); or is it a possible bug in the new gem5 itself? Any 
suggestions would be appreciated.
Thank you.

_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org