You are right, I created a PR to fix this: https://github.com/gem5/gem5/pull/764
Kind Regards

Giacomo

From: Nazmus Sakib <nsak...@nmsu.edu>
Date: Thursday, 11 January 2024 at 19:34
To: Giacomo Travaglini <giacomo.travagl...@arm.com>, The gem5 Users mailing list <gem5-users@gem5.org>
Cc: Jason Lowe-Power <jlowepo...@ucdavis.edu>
Subject: Re: ARM SVE ISA

Not compiling with -msve-vector-bits did the trick. It runs perfectly, whether I set cpu[0].isa[0].sve_vl_se to 4 or keep it at 1.

Thank you for the suggestions!

One last thing: starter_se.py does not seem to support --cpu-type=ArmO3CPU (or am I missing something)?

________________________________
From: Giacomo Travaglini <giacomo.travagl...@arm.com>
Sent: 11 January 2024 12:16
To: The gem5 Users mailing list <gem5-users@gem5.org>
Cc: Jason Lowe-Power <jlowepo...@ucdavis.edu>; Nazmus Sakib <nsak...@nmsu.edu>
Subject: Re: ARM SVE ISA

Hi Nazmus,

I can see from what you posted that you are compiling the test case with a 512-bit vector width. I believe you should amend the gem5 VL accordingly, by writing in the gem5 config:

    cpu.isa[0].sve_vl_se = 4

according to [1]. This should fix your problem.

Another solution, I believe, would be to compile without specifying the VL. Then it should be VL-agnostic code, I presume.

Anyway, I also recommend you use configs/example/arm/starter_se.py, as se.py is deprecated.

Kind Regards

Giacomo

[1]: https://github.com/gem5/gem5/blob/stable/src/arch/arm/ArmISA.py#L179

From: Nazmus Sakib via gem5-users <gem5-users@gem5.org>
Date: Thursday, 11 January 2024 at 17:54
To: gem5-users@gem5.org <gem5-users@gem5.org>
Cc: Jason Lowe-Power <jlowepo...@ucdavis.edu>, Nazmus Sakib <nsak...@nmsu.edu>
Subject: [gem5-users] ARM SVE ISA

Hello.

I am trying to run a simple program with SVE instructions on gem5. However, the output with debug flag ExecALL suggests there is a issue with the decoder. Here is the test code: #define STREAM_ARRAY_SIZE 16 void main() { for (int j=0; j<STREAM_ARRAY_SIZE; j++) { A[j]=3; B[j]=2; } int x=add(A,B); printf("return %d \n",A[3]); // should print 6, does not in gem5 } int add(int * restrict p, int * restrict q) { for (int i=0; i<STREAM_ARRAY_SIZE; i+=1) { *(p+i)=*(q+i)+4; } printf("dummy %d %d \n", *(p+3), *(q+3)); // should print 6 and 2, does not in gem5 return *(p+3); } I compiled it with gcc cross compiler for arm with following command: aarch64-linux-gnu-gcc-11 -O3 -static -mcpu=a64fx+sve2 -msve-vector-bits=512 -o test test.c Without the-mcpu=a64fx+sve2, SVE instructions are not generated. Here is the command I used: ./build/ARM/gem5.opt ./configs/deprecated/example/se.py --cpu-type=ArmO3CPU --caches --cacheline_size=64 --mem-size=8GB --arm-iset=aarch64 -c ./test I have also used "./configs/example/arm/starter_se.py", but the results are same. When I use --debug-flag=Execall, I see the following isssues: 1) 12589500: system.cpu: A0 T0 : 0x400524 @main+4 : ptrue p0, VL64 : SimdPredAlu : D=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] FetchSeq=14292 CPSeq=4962 flags=() The D=[] should not be all zeros. 
2) 12591000: system.cpu: A0 T0 : 0x400550 @main+48 : st1 {z1}, p0/z, , [x19] : MemWrite : A=0x491040 FetchSeq=14305 CPSeq=4975 flags=(IsInteger|IsVector|IsStore)
   12591000: system.cpu: A0 T0 : 0x400554 @main+52 : st1 {z0}, p0/z, , [x19, #1, mul vl] : MemWrite : A=0x491050 FetchSeq=14306 CPSeq=4976 flags=(IsInteger|IsVector|IsStore)

The second A should be 0x491080, not 0x491050.

I have run the same program on the RIKEN simulator, which was built on top of gem5 for the Fujitsu A64FX. Here are the same instructions as seen in RIKEN:

1) 15322000: system.cpu A0 T0 : @main+4 : ptrue p0, VL64 : SimdPredAlu : D=0b[0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111] FetchSeq=18146 CPSeq=5254 flags=()

As you can see, my data arrays are 64 bytes and the appropriate bits in the predicate register are set to 1.

2) 15323000: system.cpu A0 T0 : @main+48 : st1 {z1}, p0/z, , [x19] : SveMemWrite : A=0x491040 FetchSeq=18159 CPSeq=5267 flags=(IsInteger|IsVector|IsMemRef|IsStore)
   15323000: system.cpu A0 T0 : @main+52 : st1 {z0}, p0/z, , [x19, #1, mul vl] : SveMemWrite : A=0x491080 FetchSeq=18160 CPSeq=5268

The second address is calculated as 0x491080, which is the correct result for [x19, #1, mul vl], as vl=64.

I tried to compare the files in src/arch/arm/ISA from RIKEN with current gem5. Since RIKEN is based on an old gem5, there are obvious syntax differences. Other than that, I have found 2 things:

1) In ArmISA.py in RIKEN, there is this:

    id_aa64pfr0_el1 = Param.UInt64(0x0000000100000022, "AArch64 Processor Feature Register 0")

I did not find anything similar in gem5. I did find id_aa64pfr0_el1 in arch/arm/regs/misc.hh, but its value wasn't set anywhere.

2) In ArmISA.py in current gem5, there is this "FEAT_SVE" extension in class ArmDefaultSERelease. However, this is for Armv8.2, and I don't know how to specify this architecture on the command line.

What I am trying to find out is whether:

- I am missing any runtime flags that would enable the proper SVE instructions in gem5, or
- it is due to compile-time flags, since I am setting -mcpu to a64fx (setting -march to armv8.2-a+sve or similar does not produce SVE instructions, it has to be -mcpu=a64fx+sve), or
- it is a possible issue/bug in the new gem5 itself.

Any suggestions would be appreciated.

Thank you.

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
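
Editorial note: the arithmetic behind the two symptoms reported in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not gem5 code. It assumes that sve_vl_se counts 128-bit quadwords, that [x19, #1, mul vl] resolves to base + imm * (vector length in bytes), and that a fixed ptrue pattern such as VL64 sets no elements when the vector holds fewer than 64 elements (byte-sized elements are assumed here, based on the RIKEN trace). All function names are hypothetical.

```python
QUADWORD_BITS = 128  # assumption: sve_vl_se is expressed in 128-bit quadwords


def sve_vl_se_for(vector_bits: int) -> int:
    """Quadword count that matches a given -msve-vector-bits value."""
    if vector_bits <= 0 or vector_bits % QUADWORD_BITS != 0:
        raise ValueError("vector length must be a positive multiple of 128 bits")
    return vector_bits // QUADWORD_BITS


def vl_bytes(sve_vl_se: int) -> int:
    """Vector length in bytes for a given quadword count."""
    return sve_vl_se * QUADWORD_BITS // 8


def mul_vl_address(base: int, imm: int, sve_vl_se: int) -> int:
    """Effective address of [base, #imm, mul vl] for a full-vector store."""
    return base + imm * vl_bytes(sve_vl_se)


def ptrue_fixed_active(count: int, elem_bytes: int, sve_vl_se: int) -> int:
    """Active elements for a fixed ptrue pattern (e.g. VL64).

    Assumption: the pattern yields zero active elements when the vector
    cannot hold `count` elements of the given size.
    """
    available = vl_bytes(sve_vl_se) // elem_bytes
    return count if available >= count else 0


if __name__ == "__main__":
    print(sve_vl_se_for(512))                    # 4
    print(hex(mul_vl_address(0x491040, 1, 1)))   # 0x491050
    print(hex(mul_vl_address(0x491040, 1, 4)))   # 0x491080
    print(ptrue_fixed_active(64, 1, 1))          # 0
    print(ptrue_fixed_active(64, 1, 4))          # 64
```

Under these assumptions, a quadword count of 1 reproduces both the 0x491050 address and the all-zero predicate seen in the gem5 trace, while a count of 4 reproduces the 0x491080 address and the 64 set bits seen in the RIKEN trace. That is consistent with the mismatch between the 512-bit binary and the simulated vector length discussed in the replies.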
_______________________________________________ gem5-users mailing list -- gem5-users@gem5.org To unsubscribe send an email to gem5-users-le...@gem5.org