On Thu, Feb 15, 2024 at 04:38:07PM +0100, Claudio Jeker wrote:
> On Thu, Feb 15, 2024 at 03:30:39PM +0000, Miod Vallat wrote:
> > > > A lot of this points towards a register window error in__tfork() which
> > > > affects only the parent process.
> > > 
> > > I think the issue is that cpu_fork() copies the u_pcb from parent to child
> > > and with that the special user register windows that got spilled because
> > > of some TL>0 fault. Since __tfork() uses a new stack it makes no sense
> > > to keep those windows around and possibly fault them back from the new
> > > thread (onto the old stack).
> > 
> > That makes sense.
> > 
> > > The following diff seems to fix the above repoducer for me. Please test.
> > > I will need to add some comments etc there (also not sure if the bzero()
> > > calls are needed or if clearing pcb_nsaved is enough.
> > 
> > Actually, rather than bzero'ing pcb_rwsp and pcb_rw, it would be better
> > to only bcopy offsetof(struct pcb, pcb_rw) rather than sizeof(struct
> > pcb) on line 139, i.e. something like this:
> > 
> > Index: vm_machdep.c
> > ===================================================================
> > RCS file: /OpenBSD/src/sys/arch/sparc64/sparc64/vm_machdep.c,v
> > retrieving revision 1.42
> > diff -u -p -r1.42 vm_machdep.c
> > --- vm_machdep.c    25 Oct 2022 06:05:57 -0000      1.42
> > +++ vm_machdep.c    15 Feb 2024 15:29:37 -0000
> > @@ -100,6 +100,7 @@ cpu_fork(struct proc *p1, struct proc *p
> >     struct pcb *npcb = &p2->p_addr->u_pcb;
> >     struct trapframe *tf2;
> >     struct rwindow *rp;
> > +   size_t pcbsz;
> >     extern struct proc proc0;
> >  
> >     /*
> > @@ -136,7 +137,15 @@ cpu_fork(struct proc *p1, struct proc *p
> >  #else
> >     opcb->lastcall = NULL;
> >  #endif
> > -   bcopy((caddr_t)opcb, (caddr_t)npcb, sizeof(struct pcb));
> > +   /*
> > +    * If a new stack is provided, do not bother copying saved windows
> > +    * in the new pcb. Also, we'll reset pcb_nsaved accordingly below.
> > +    */
> > +   if (stack != NULL)
> > +           pcbsz = offsetof(struct pcb, pcb_rw);
> > +   else
> > +           pcbsz = sizeof(struct pcb);
> > +   bcopy((caddr_t)opcb, (caddr_t)npcb, pcbsz);
> >     if (p1->p_md.md_fpstate) {
> >             fpusave_proc(p1, 1);
> >             p2->p_md.md_fpstate = malloc(sizeof(struct fpstate),
> > @@ -162,6 +171,7 @@ cpu_fork(struct proc *p1, struct proc *p
> >      * with space reserved for the frame, and zero the frame pointer.
> >      */
> >     if (stack != NULL) {
> > +           npcb->pcb_nsaved = 0;
> >             tf2->tf_out[6] = (u_int64_t)(u_long)stack - (BIAS + CC64FSZ);
> >             tf2->tf_in[6] = 0;
> >     }
> 
> Looks good to me. Will throw that onto my testbox and see.

Has been running for the last few hours without any issue.
OK claudio@ on that diff.

-- 
:wq Claudio

Reply via email to