Actually attaching the patch this time (**** gmail client)

On Sun, Feb 5, 2017 at 10:58 AM, Jason Harmening <jason.harmen...@gmail.com> wrote:
> Hmm, it's a good idea to consider the possibility of a barrier issue. It
> wouldn't be the first time we've had such a problem on a weakly-ordered
> architecture. That said, I don't see a problem in this case.
> smp_rendezvous_cpus() takes a spinlock and then issues
> atomic_store_rel_int() to ensure the rendezvous params are visible to
> other cpus. The latter corresponds to lwsync on powerpc, which AFAIK
> should be sufficient to ensure visibility of prior stores.
>
> For now I'm going with the simpler explanation that I made a bad
> assumption in the powerpc get_pcpu() and there is some context in which
> the read of sprg0 doesn't return a consistent pointer value. Unfortunately
> I don't see where that might be right now.
>
> On the mips side, Kurt/Alexander, can you test the attached patch? It
> contains a simple fix to ensure get_pcpu() returns the consistent per-cpu
> pointer.
>
> On Sat, Feb 4, 2017 at 1:34 PM, Svatopluk Kraus <onw...@gmail.com> wrote:
>
>> Probably not related. But when I took a short look at the patch to see
>> what could go wrong, I ran into the following comment in
>> _rm_wlock(): "Assumes rm->rm_writecpus update is visible on other CPUs
>> before rm_cleanIPI is called." There is no explicit barrier to ensure
>> it. However, there might be some barriers inside of
>> smp_rendezvous_cpus(). I have no idea what could happen if this
>> assumption is not met. Note that rm_cleanIPI() is affected by the
>> patch.
>>
>> On Sat, Feb 4, 2017 at 9:39 PM, Jason Harmening
>> <jason.harmen...@gmail.com> wrote:
>> > Can you post an example of such a panic? Only 2 MI pieces were changed,
>> > netisr and rmlock. I haven't seen problems on my own amd64/i386/arm
>> > testing of this, so a backtrace might help to narrow down the cause.
>> >
>> > On Sat, Feb 4, 2017 at 12:22 PM, Andreas Tobler <andre...@freebsd.org>
>> > wrote:
>> >>
>> >> On 04.02.17 20:54, Jason Harmening wrote:
>> >>>
>> >>> I suspect this broke rmlocks for mips because the rmlock implementation
>> >>> takes the address of the per-CPU pc_rm_queue when building tracker
>> >>> lists. That address may later be accessed from another CPU and will
>> >>> then translate to the wrong physical region if the address was taken
>> >>> relative to the globally-constant pcpup VA used on mips.
>> >>>
>> >>> Regardless, for mips get_pcpup() should be implemented as
>> >>> pcpu_find(curcpu), since returning an address that may mean something
>> >>> different depending on the CPU seems like a big POLA violation if
>> >>> nothing else.
>> >>>
>> >>> I'm more concerned about the report of powerpc breakage. For powerpc we
>> >>> simply take each pcpu pointer from the pc_allcpu list (which is the same
>> >>> value stored in the cpuid_to_pcpu array) and pass it through the ap_pcpu
>> >>> global to each AP's startup code, which then stores it in sprg0. It
>> >>> should be globally unique and won't have the variable-translation issues
>> >>> seen on mips. Andreas, are you certain this change was responsible for
>> >>> the breakage you saw, and was it the same sort of hang observed on mips?
>> >>
>> >> I'm really sure. r313036 booted fine and allowed me to execute heavy
>> >> compilation jobs, no problem. r313037, on the other hand, gave me
>> >> various patterns of panics: some during startup, but I also succeeded
>> >> in getting into multiuser, and then the panic happened during port
>> >> building.
>> >>
>> >> I have no deeper insight into where pcpu data is used. Justin
>> >> mentioned netisr?
>> >>
>> >> Andreas
Index: sys/amd64/include/pcpu.h
===================================================================
--- sys/amd64/include/pcpu.h	(revision 313193)
+++ sys/amd64/include/pcpu.h	(working copy)
@@ -78,6 +78,7 @@
 
 extern struct pcpu *pcpup;
 
+#define	get_pcpu()		(pcpup)
 #define	PCPU_GET(member)	(pcpup->pc_ ## member)
 #define	PCPU_ADD(member, val)	(pcpup->pc_ ## member += (val))
 #define	PCPU_INC(member)	PCPU_ADD(member, 1)
@@ -203,6 +204,15 @@
 	}								\
 }
 
+#define	get_pcpu() __extension__ ({					\
+	struct pcpu *__pc;						\
+									\
+	__asm __volatile("movq %%gs:%1,%0"				\
+	    : "=r" (__pc)						\
+	    : "m" (*(struct pcpu *)(__pcpu_offset(pc_prvspace))));	\
+	__pc;								\
+})
+
 #define	PCPU_GET(member)	__PCPU_GET(pc_ ## member)
 #define	PCPU_ADD(member, val)	__PCPU_ADD(pc_ ## member, val)
 #define	PCPU_INC(member)	__PCPU_INC(pc_ ## member)
Index: sys/kern/kern_rmlock.c
===================================================================
--- sys/kern/kern_rmlock.c	(revision 313193)
+++ sys/kern/kern_rmlock.c	(working copy)
@@ -156,7 +156,7 @@
 	 */
 	critical_enter();
 	td = curthread;
-	pc = pcpu_find(curcpu);
+	pc = get_pcpu();
 	for (queue = pc->pc_rm_queue.rmq_next; queue != &pc->pc_rm_queue;
 	    queue = queue->rmq_next) {
 		tracker = (struct rm_priotracker *)queue;
@@ -258,7 +258,7 @@
 	struct rmlock *rm = arg;
 	struct rm_priotracker *tracker;
 	struct rm_queue *queue;
-	pc = pcpu_find(curcpu);
+	pc = get_pcpu();
 
 	for (queue = pc->pc_rm_queue.rmq_next; queue != &pc->pc_rm_queue;
 	    queue = queue->rmq_next) {
@@ -355,7 +355,7 @@
 	struct pcpu *pc;
 
 	critical_enter();
-	pc = pcpu_find(curcpu);
+	pc = get_pcpu();
 
 	/* Check if we just need to do a proper critical_exit. */
 	if (!CPU_ISSET(pc->pc_cpuid, &rm->rm_writecpus)) {
@@ -416,7 +416,7 @@
 	}
 
 	critical_enter();
-	pc = pcpu_find(curcpu);
+	pc = get_pcpu();
 	CPU_CLR(pc->pc_cpuid, &rm->rm_writecpus);
 	rm_tracker_add(pc, tracker);
 	sched_pin();
@@ -641,7 +641,7 @@
 #ifdef INVARIANTS
 	if (!(rm->lock_object.lo_flags & LO_RECURSABLE) && !trylock) {
 		critical_enter();
-		KASSERT(rm_trackers_present(pcpu_find(curcpu), rm,
+		KASSERT(rm_trackers_present(get_pcpu(), rm,
 		    curthread) == 0,
 		    ("rm_rlock: recursed on non-recursive rmlock %s @ %s:%d\n",
 		    rm->lock_object.lo_name, file, line));
@@ -771,7 +771,7 @@
 	}
 
 	critical_enter();
-	count = rm_trackers_present(pcpu_find(curcpu), rm, curthread);
+	count = rm_trackers_present(get_pcpu(), rm, curthread);
 	critical_exit();
 
 	if (count == 0)
@@ -797,7 +797,7 @@
 		    rm->lock_object.lo_name, file, line);
 
 	critical_enter();
-	count = rm_trackers_present(pcpu_find(curcpu), rm, curthread);
+	count = rm_trackers_present(get_pcpu(), rm, curthread);
 	critical_exit();
 
 	if (count != 0)
Index: sys/mips/include/pcpu.h
===================================================================
--- sys/mips/include/pcpu.h	(revision 313193)
+++ sys/mips/include/pcpu.h	(working copy)
@@ -39,16 +39,17 @@
 	struct pmap	*pc_curpmap;		/* pmap of curthread */	\
 	u_int32_t	pc_next_asid;		/* next ASID to alloc */ \
 	u_int32_t	pc_asid_generation;	/* current ASID generation */ \
-	u_int		pc_pending_ipis;	/* IPIs pending to this CPU */
+	u_int		pc_pending_ipis;	/* IPIs pending to this CPU */ \
+	struct pcpu	*pc_self;		/* globally-unique self pointer */
 
 #ifdef	__mips_n64
 #define	PCPU_MD_MIPS64_FIELDS						\
 	PCPU_MD_COMMON_FIELDS						\
-	char		__pad[61]
+	char		__pad[53]
 #else
 #define	PCPU_MD_MIPS32_FIELDS						\
 	PCPU_MD_COMMON_FIELDS						\
-	char		__pad[193]
+	char		__pad[189]
 #endif
 
 #ifdef	__mips_n64
@@ -65,6 +66,13 @@
 extern struct pcpu *pcpup;
 #define	PCPUP	pcpup
 
+/*
+ * Since we use a wired TLB entry to map the same VA to a different
+ * physical page for each CPU, get_pcpu() must use the pc_self
+ * field to obtain a globally-unique pointer.
+ */
+#define	get_pcpu()	(PCPUP->pc_self)
+
 #define	PCPU_ADD(member, value)	(PCPUP->pc_ ## member += (value))
 #define	PCPU_GET(member)	(PCPUP->pc_ ## member)
 #define	PCPU_INC(member)	PCPU_ADD(member, 1)
Index: sys/mips/mips/machdep.c
===================================================================
--- sys/mips/mips/machdep.c	(revision 313193)
+++ sys/mips/mips/machdep.c	(working copy)
@@ -475,6 +475,7 @@
 
 	pcpu->pc_next_asid = 1;
 	pcpu->pc_asid_generation = 1;
+	pcpu->pc_self = pcpu;
 #ifdef SMP
 	if ((vm_offset_t)pcpup >= VM_MIN_KERNEL_ADDRESS &&
 	    (vm_offset_t)pcpup <= VM_MAX_KERNEL_ADDRESS) {
Index: sys/net/netisr.c
===================================================================
--- sys/net/netisr.c	(revision 313193)
+++ sys/net/netisr.c	(working copy)
@@ -1268,9 +1268,7 @@
 static void
 netisr_init(void *arg)
 {
-#ifdef EARLY_AP_STARTUP
 	struct pcpu *pc;
-#endif
 
 	NETISR_LOCK_INIT();
 	if (netisr_maxthreads == 0 || netisr_maxthreads < -1 )
@@ -1308,7 +1306,8 @@
 		netisr_start_swi(pc->pc_cpuid, pc);
 	}
 #else
-	netisr_start_swi(curcpu, pcpu_find(curcpu));
+	pc = get_pcpu();
+	netisr_start_swi(pc->pc_cpuid, pc);
 #endif
 }
 SYSINIT(netisr_init, SI_SUB_SOFTINTR, SI_ORDER_FIRST, netisr_init, NULL);
Index: sys/powerpc/include/cpufunc.h
===================================================================
--- sys/powerpc/include/cpufunc.h	(revision 313193)
+++ sys/powerpc/include/cpufunc.h	(working copy)
@@ -201,7 +201,7 @@
 }
 
 static __inline struct pcpu *
-powerpc_get_pcpup(void)
+get_pcpu(void)
 {
 	struct pcpu *ret;
 
Index: sys/powerpc/include/pcpu.h
===================================================================
--- sys/powerpc/include/pcpu.h	(revision 313193)
+++ sys/powerpc/include/pcpu.h	(working copy)
@@ -142,7 +142,7 @@
 
 #ifdef _KERNEL
 
-#define pcpup	((struct pcpu *) powerpc_get_pcpup())
+#define pcpup	(get_pcpu())
 
 static __inline __pure2 struct thread *
 __curthread(void)
Index: sys/sparc64/include/pcpu.h
===================================================================
--- sys/sparc64/include/pcpu.h	(revision 313193)
+++ sys/sparc64/include/pcpu.h	(working copy)
@@ -74,6 +74,7 @@
 register struct pcb *curpcb __asm__(__XSTRING(PCB_REG));
 register struct pcpu *pcpup __asm__(__XSTRING(PCPU_REG));
 
+#define	get_pcpu()	(pcpup)
 #define	PCPU_GET(member)	(pcpup->pc_ ## member)
 
 static __inline __pure2 struct thread *
_______________________________________________
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"