On Sun, Nov 10, 2024 at 03:34:54AM +0800, Yangyu Chen wrote:
> Hi Charlie,
> 
> I have tested this patchset with ghostwrite rebased to linux commit 
> da4373fbcf ("Merge tag 'thermal-6.12-rc7' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm") [1] on my D1 
> Nezha board, with defconfig + CONFIG_ERRATA_THEAD_GHOSTWRITE=n, I got this 
> message during boot:
> 
> [    0.027584] Kernel panic - not syncing: __kmem_cache_create_args: Failed 
> to create slab 'riscv_vector_ctx'. Error -22
> [    0.038057] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 
> 6.12.0-rc6-00310-gb276cf69df24-dirty #11
> [    0.047240] Hardware name: Allwinner D1 Nezha (DT)
> [    0.052007] Call Trace:
> [    0.054434] [<ffffffff80007172>] dump_backtrace+0x1c/0x24
> [    0.059806] [<ffffffff809f6834>] show_stack+0x2c/0x38
> [    0.064833] [<ffffffff80a040f0>] dump_stack_lvl+0x52/0x74
> [    0.070206] [<ffffffff80a04126>] dump_stack+0x14/0x1c
> [    0.075233] [<ffffffff809f6db6>] panic+0x10c/0x300
> [    0.080000] [<ffffffff8017b5a0>] __kmem_cache_create_args+0x24a/0x2b6
> [    0.086413] [<ffffffff80c04c68>] riscv_v_setup_ctx_cache+0x56/0x84
> [    0.092566] [<ffffffff80c04288>] arch_task_cache_init+0x10/0x1c
> [    0.098460] [<ffffffff80c07d02>] fork_init+0x68/0x1a8
> [    0.103486] [<ffffffff80c00ed2>] start_kernel+0x77e/0x822
> [    0.108870] ---[ end Kernel panic - not syncing: __kmem_cache_create_args: 
> Failed to create slab 'riscv_vector_ctx'. Error -22 ]---
> 
> [1] https://github.com/cyyself/linux/tree/xtheadvector_20241110
> 
> On 9/12/24 13:55, Charlie Jenkins wrote:
> > diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
> > index 682b3feee451..9775d6a9c8ee 100644
> > --- a/arch/riscv/kernel/vector.c
> > +++ b/arch/riscv/kernel/vector.c
> > @@ -33,7 +33,17 @@ int riscv_v_setup_vsize(void)
> >  {
> >     unsigned long this_vsize;
> > 
> > -   /* There are 32 vector registers with vlenb length. */
> > +   /*
> > +    * There are 32 vector registers with vlenb length.
> > +    *
> > +    * If the thead,vlenb property was provided by the firmware, use that
> > +    * instead of probing the CSRs.
> > +    */
> > +   if (thead_vlenb_of) {
> > +           this_vsize = thead_vlenb_of * 32;
> 
> Then I patched this line, replacing "this_vsize" with "riscv_v_vsize". The 
> kernel then boots normally and I can see "xtheadvector" in /proc/cpuinfo.
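
To illustrate the change described above, here is a minimal user-space
sketch, not the kernel code. In the quoted hunk the computed size only
reaches the local this_vsize before the early return, so the global is
never updated; presumably the slab for 'riscv_vector_ctx' is then
requested with size 0, which would match the -22 (-EINVAL) in the first
panic. The names sketch_vsize, setup_vsize_local_only() and
setup_vsize_global() are hypothetical stand-ins, and the CSR probing path
is left out.

```c
/* Stand-in for the kernel's global riscv_v_vsize; 0 means "never set". */
unsigned long sketch_vsize;

/* Mirrors the quoted hunk: the result only reaches a local variable. */
int setup_vsize_local_only(unsigned long vlenb_of)
{
	unsigned long this_vsize;

	if (vlenb_of) {
		this_vsize = vlenb_of * 32;	/* 32 vector registers */
		(void)this_vsize;		/* value is lost on return */
		return 0;
	}

	return -1;	/* CSR probing path omitted in this sketch */
}

/* Mirrors the workaround: store the size in the global instead. */
int setup_vsize_global(unsigned long vlenb_of)
{
	if (vlenb_of) {
		sketch_vsize = vlenb_of * 32;
		return 0;
	}

	return -1;	/* CSR probing path omitted in this sketch */
}
```

With a hypothetical vlenb of 16, the first function returns 0 but leaves
sketch_vsize at 0, while the second sets it to 512.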
> 
> However, when I try to run the "v_exec_initval_nolibc" test, the kernel 
> panics with these outputs:
> 
> [  978.788878] Oops - illegal instruction [#1]
> [  978.788897] Modules linked in:
> [  978.788908] CPU: 0 UID: 1000 PID: 461 Comm: v_exec_initval_ Not tainted 
> 6.12.0-rc6-00310-gb276cf69df24-dirty #12
> [  978.788924] Hardware name: Allwinner D1 Nezha (DT)
> [  978.788929] epc : do_trap_ecall_u+0x56/0x20a
> [  978.788956]  ra : _new_vmalloc_restore_context_a0+0xc2/0xce
> [  978.788974] epc : ffffffff80a04afe ra : ffffffff80a0e742 sp : 
> ffffffc6003fbeb0
> [  978.788983]  gp : ffffffff81717080 tp : ffffffd60723b300 t0 : 
> ffffffff81001268
> [  978.788991]  t1 : ffffffff80a04aa8 t2 : ffffffff810012a8 s0 : 
> ffffffc6003fbee0
> [  978.789000]  s1 : ffffffc6003fbee0 a0 : ffffffc6003fbee0 a1 : 
> 000000000000005d
> [  978.789007]  a2 : 0000000000000000 a3 : ffffffffffffffda a4 : 
> 0000000000000003
> [  978.789015]  a5 : 0000000000000000 a6 : 0000000002adb5fe a7 : 
> 000000000000005d
> [  978.789022]  s2 : 00000000000108a8 s3 : 0000000000000000 s4 : 
> 0000000000000008
> [  978.789030]  s5 : 0000003fb42ab780 s6 : 0000002adb5fe420 s7 : 
> 0000002adb5fb9e0
> [  978.789038]  s8 : 0000002adb5fe440 s9 : 0000002adb5fe420 s10: 
> 0000002adb572ad4
> [  978.789046]  s11: 0000002adb572ad0 t3 : 0000003fb43c5e3c t4 : 
> 622f7273752f3d5f
> [  978.789053]  t5 : 0000002adb5fd5a1 t6 : 0000000002adb5ff
> [  978.789060] status: 8000000201800100 badaddr: 000000005e0fb057 cause: 
> 0000000000000002
> [  978.789069] [<ffffffff80a04afe>] do_trap_ecall_u+0x56/0x20a
> [  978.789086] [<ffffffff80a0e742>] _new_vmalloc_restore_context_a0+0xc2/0xce
> [  978.789113] Code: a073 1007 006f 1a60 7057 0c30 57fd 17fe 77d7 0c30 (b057) 
> 5e0f
> [  978.789123] ---[ end trace 0000000000000000 ]---
> [  978.789131] Kernel panic - not syncing: Fatal exception in interrupt
> [  978.937158] ---[ end Kernel panic - not syncing: Fatal exception in 
> interrupt ]---
> 
> Is something wrong with my setup?

Thanks for reporting this! I just sent out a new version with the fix.
Something was wrong in __riscv_v_vstate_discard() and was triggering
this failure. I have tested that the new version passes the testcase.

https://lore.kernel.org/linux-riscv/20241113-xtheadvector-v11-0-236c22791...@rivosinc.com/T/#t

- Charlie

> 
> Thanks,
> Yangyu Chen
> 
> > +           return 0;
> > +   }
> > +
> >     riscv_v_enable();
> >     this_vsize = csr_read(CSR_VLENB) * 32;
> >     riscv_v_disable();
> > diff --git a/arch/riscv/kernel/vendor_extensions/thead.c 
> > b/arch/riscv/kernel/vendor_extensions/thead.c
> > index 0f27baf8d245..519dbf70710a 100644
> > --- a/arch/riscv/kernel/vendor_extensions/thead.c
> > +++ b/arch/riscv/kernel/vendor_extensions/thead.c
> > @@ -5,6 +5,7 @@
> >  #include <asm/vendor_extensions/thead.h>
> >  
> >  #include <linux/array_size.h>
> > +#include <linux/cpumask.h>
> >  #include <linux/types.h>
> >  
> >  /* All T-Head vendor extensions supported in Linux */
> > @@ -16,3 +17,13 @@ struct riscv_isa_vendor_ext_data_list 
> > riscv_isa_vendor_ext_list_thead = {
> >     .ext_data_count = ARRAY_SIZE(riscv_isa_vendor_ext_thead),
> >     .ext_data = riscv_isa_vendor_ext_thead,
> >  };
> > +
> > +void disable_xtheadvector(void)
> > +{
> > +   int cpu;
> > +
> > +   for_each_possible_cpu(cpu)
> > +           clear_bit(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR, 
> > riscv_isa_vendor_ext_list_thead.per_hart_isa_bitmap[cpu].isa);
> > +
> > +   clear_bit(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR, 
> > riscv_isa_vendor_ext_list_thead.all_harts_isa_bitmap.isa);
> > +}
> 
