Hello,
On Thu, 28 Nov 2024 19:37:21 +0900, Benjamin Berg wrote: > > +#ifndef CONFIG_MMU > > + memset(&r, 0, sizeof(r)); > > + /* mark is_user=1 when the IP is from userspace code. */ > > + if (mc && (REGS_IP(mc->gregs) > uml_reserved > > + && REGS_IP(mc->gregs) < high_physmem)) > > + r.is_user = 1; > > + else > > +#endif > > + r.is_user = 0; > > Does this work if we load modules dynamically? > > I suppose one could map them into a separate memory area rather than > running them directly from the physical memory. > Otherwise we'll also get problem with the SECCOMP filter. currently, I thought modules use the separate area from execmem, but nommu allocator ignores this location info to map the memory; instead mixing up with area used by userspace programs. we may be able to come up with execmem_arch_setup() to fix this situation. so, no, this is_user detection doesn't work; modules also become is_user=1. MMU full allocator (normal UML and seccomp asl well ?) seems to be fine as long as using execmem. I will look into detail how we should handle. > > if (sig == SIGSEGV) { > > /* For segfaults, we want the data from the sigcontext. */ > > get_regs_from_mc(&r, mc); > > @@ -191,6 +199,7 @@ static void hard_handler(int sig, siginfo_t *si, void > > *p) > > ucontext_t *uc = p; > > mcontext_t *mc = &uc->uc_mcontext; > > unsigned long pending = 1UL << sig; > > + int is_segv = 0; > > > > do { > > int nested, bail; > > @@ -214,6 +223,7 @@ static void hard_handler(int sig, siginfo_t *si, void > > *p) > > > > while ((sig = ffs(pending)) != 0){ > > sig--; > > + is_segv = (sig == SIGSEGV) ? 1 : 0; > > pending &= ~(1 << sig); > > (*handlers[sig])(sig, (struct siginfo *)si, mc); > > } > > @@ -227,6 +237,12 @@ static void hard_handler(int sig, siginfo_t *si, void > > *p) > > if (!nested) > > pending = from_irq_stack(nested); > > } while (pending); > > + > > +#ifndef CONFIG_MMU > > + /* if there is SIGSEGV notified, let the userspace run w/ __noreturn */ > > + if (is_segv) > > + sigsegv_post_routine(); > > +#endif > > } > > I am confused, this doesn't feel quite correct to me. thanks for pointing this out. the above code, which I spot the working example under nommu, is indeed suspicious and doesn't look a right code. that signal handing (this patch) is immature, and need more work to understand existing code, nommu characteristic, etc. > So, for normal UML, I think we always do an rt_sigreturn. Which means, > we always go back to the corresponding *kernel* task. To schedule in > response to SIGALRM, we forward the signal to the userspace process. > I believe that means: > 1. We cannot schedule kernel threads (that seems like a bug) > 2. Scheduling for userspace happens once the signal is delivered. > Then userspace() saves the state and calls interrupt_end(). > > > Now, keep in mind that we are on the separate signal stack here. If we > jump anywhere directly, we abandon the old state information stored by > the host kernel into the mcontext. We can absolutely do that, but we > need to be careful to not forget anything. > > As such, I wonder whether nommu should: > 1. When entering from kernel, update "current->thread.switch_buf" > from the mcontext. > - If we need to schedule, push a stack frame that calls the scheduling > code and returns with the correct state. > 2. When entering from user, store the task registers from the > mcontext. At some point (here or earlier) ensure that the > "current->thread.switch_buf" is set up so that we can return to > userspace by restoring the task registers. > - To schedule, piggy back on 1. or add special code. > 3. Always do a UML_LONGJMP() back into the "current" task. thanks, the current code jumps in the signal handler and unblocking signals without returning the handler (and not calling rt_sigreturn at host either) upon SIGSEGV, which should not work as you mentioned. I will also investigate how I can handle. > That said, I am probably not having the full picture right now. > > Benjamin > > PS: On a further note, I think the current code to enter userspace > cannot handle single stepping. I suppose that is fine, but you should > probably set arch_has_single_step to 0 for nommu. I did almost zero tests with ptrace(2) (inside nommu UM) and might miss a lot of features that mmu-UM could. will also look into that. thanks, -- Hajime