Hello, I'm using an arm64 board with a Baikal-M CPU, which is uncommon
on the mass market (Baikal-M is built with 8x Cortex-A57 cores).

The single-processor kernel boots fine, but the MP kernel hangs
during boot.
I found another user of this processor; let's refer to him as "slaytor".
He uses a board from a different company, but with the same CPU, and he
had the same problem.

Together we pinpointed the problem to

cpu_boot_secondary(struct cpu_info *ci)

and compared the amd64 and arm64 versions of this function to find
the differences.
Carrying the amd64 approach over to arm64 unhangs the boot of the MP
kernel on this CPU (and maybe on some other arm64 CPUs).

I attach a patch for sys/arch/arm64/arm64/cpu.c.
The patch is based on the amd64 version. It makes it possible to break
out of the endless loop and then either drop into DDB or continue booting.
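
For readers who want to see the idea outside the kernel, here is a minimal
userland sketch of the bounded wait. It is only a model under stated
assumptions: struct fake_cpu, wait_for_running() and the flag values are
hypothetical stand-ins, not the kernel's definitions, and the loop body just
spins where the real code executes "wfe".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag values; the kernel's CPUF_* bits may differ. */
#define CPUF_GO		0x01u
#define CPUF_RUNNING	0x02u

/* Hypothetical, cut-down stand-in for struct cpu_info. */
struct fake_cpu {
	volatile unsigned int flags;
	int id;
};

/*
 * Wait until CPUF_RUNNING is set or max_spins iterations have passed.
 * Returns true if the CPU reported in, false if the wait gave up.
 */
bool
wait_for_running(struct fake_cpu *ci, int max_spins)
{
	int i;

	assert(ci != NULL);

	for (i = max_spins; (ci->flags & CPUF_RUNNING) == 0 && i > 0; i--) {
		/* The kernel executes "wfe" here; this model only spins. */
	}

	if ((ci->flags & CPUF_RUNNING) == 0) {
		printf("cpu %d failed to start\n", ci->id);
		return false;
	}
	return true;
}
```

The point of the bounded counter is that a secondary CPU that never sets
CPUF_RUNNING costs a fixed number of iterations instead of hanging the boot.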

We have both tested the patch for more than 6 months (using our boards as
daily drivers, arm64 build machines, etc.) and have had no problems with it.

Looking at the diff, and given that a similar "trick" has been implemented
successfully in at least the amd64 cpu.c for a long time, I do not expect
it to cause problems in the future.

Thanks in advance for any feedback.

-- 
Slava Voronzoff
Index: sys/arch/arm64/arm64/cpu.c
===================================================================
RCS file: /cvs/src/sys/arch/arm64/arm64/cpu.c,v
retrieving revision 1.98
diff -u -p -r1.98 cpu.c
--- sys/arch/arm64/arm64/cpu.c  10 Aug 2023 19:29:32 -0000      1.98
+++ sys/arch/arm64/arm64/cpu.c  14 Sep 2023 00:58:17 -0000
@@ -1096,6 +1096,8 @@ cpu_start_secondary(struct cpu_info *ci,
 void
 cpu_boot_secondary(struct cpu_info *ci)
 {
+       int i;
+
        atomic_setbits_int(&ci->ci_flags, CPUF_GO);
        __asm volatile("dsb sy; sev" ::: "memory");
 
@@ -1105,8 +1107,17 @@ cpu_boot_secondary(struct cpu_info *ci)
         */
        arm_send_ipi(ci, ARM_IPI_NOP);
 
-       while ((ci->ci_flags & CPUF_RUNNING) == 0)
+       for (i = 1000; (ci->ci_flags & CPUF_RUNNING) == 0 && i > 0; i--) {
                __asm volatile("wfe");
+       }
+       if ((ci->ci_flags & CPUF_RUNNING) == 0) {
+               printf("cpu %d failed to start\n", ci->ci_cpuid);
+#if defined(MPDEBUG) && defined(DDB)
+               printf("dropping into debugger; continue from here to resume boot\n");
+               db_enter();
+#endif
+       }
+
 }
 
 void
