On January 13, 2021 8:58:58 PM UTC, John Baldwin <j...@freebsd.org> wrote:
>On 1/13/21 3:42 AM, myfreeweb wrote:
>>
>>
>> On January 13, 2021 10:08:26 AM UTC, Emmanuel Vadot <m...@bidouilliste.com> wrote:
>>> On Tue, 12 Jan 2021 15:16:55 +0200
>>> Konstantin Belousov <kostik...@gmail.com> wrote:
>>>
>>>> On Tue, Jan 12, 2021 at 11:43:00AM +0000, Emmanuel Vadot wrote:
>>>>> The branch main has been updated by manu:
>>>>>
>>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=11d62b6f31ab4e99df6d0c6c23406b57eaa37f41
>>>>>
>>>>> commit 11d62b6f31ab4e99df6d0c6c23406b57eaa37f41
>>>>> Author:     Emmanuel Vadot <m...@freebsd.org>
>>>>> AuthorDate: 2021-01-12 11:02:38 +0000
>>>>> Commit:     Emmanuel Vadot <m...@freebsd.org>
>>>>> CommitDate: 2021-01-12 11:31:00 +0000
>>>>>
>>>>>     linuxkpi: add kernel_fpu_begin/kernel_fpu_end
>>>>>
>>>>>     With newer AMD GPUs (>=Navi,Renoir) there is FPU context usage in the
>>>>>     amdgpu driver.
>>>>>     The `kernel_fpu_begin/end` implementations in drm did not even allow
>>>>>     nested begin-end blocks.
>>>>
>>>> Does Linux allow more then one thread to execute kernel_fpu_begin ?
>>>
>>> I actually have no idea, adding Greg to cc.
>>
>> Looks like they save the context into the current thread state, so yes? (drm
>> doesn't need that)
>>
>> Also they seem to do something FPU_KERN_NOCTX like (??) because they disable
>> preemption inside these blocks.
>> (Where does our NOCTX actually store the state?)
>
>It doesn't store at all because threads aren't allowed to sleep in a critical
>section, so the thread will never give up the CPU while in the FPU section. If
>threads can voluntarily sleep (cv_wait*, *sleep(), etc.) while using
>kernel_fpu_begin(), then NOCTX won't work and we will need something else.

Hmm, but with no storage at all, how would it work from a syscall? The manpage
does mention a "usermode save area" – that is what I was asking about.

Linux kernel_fpu_begin starts with preempt_disable, so definitely no condvars
and the like. No idea about simple timed sleeps. But amdgpu doesn't seem to do
even that.

>However, the code snippet from the stackoverflow URL I posted earlier looks
>exactly like the NOCTX case where we flush the user FPU state to the thread
>if the FPU state is "dirty" and then load a clean initial state for use by
>the FPU. It would also seem to never save the kernel FPU state anywhere by
>counting on avoiding context switches. So, I think you probably should just
>make this use NOCTX.

NOCTX was the first thing I tried, and it didn't work, but probably just
because of the nesting. I haven't retried it with the nesting counter.

Testing a bunch of things would be easier if I had one of the GPUs that use
this code instead of having to ask someone else to test…
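
To make the "nesting counter" idea concrete, here is a minimal userland sketch,
not the committed linuxkpi code. The names stub_fpu_enter, stub_fpu_leave,
sketch_kernel_fpu_begin, sketch_kernel_fpu_end and fpu_nest_level are all
hypothetical. In the kernel the two stubs would presumably become
fpu_kern_enter(curthread, NULL, FPU_KERN_NOCTX) and
fpu_kern_leave(curthread, NULL), and the counter would have to live per thread
or per CPU rather than in a single global.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for the real kernel calls. They only count how
 * often they were invoked so the nesting logic can be checked in userland.
 */
static int stub_enter_calls;
static int stub_leave_calls;

static void stub_fpu_enter(void) { stub_enter_calls++; }
static void stub_fpu_leave(void) { stub_leave_calls++; }

/*
 * Nesting depth. Assumption: a single global is enough for this sketch;
 * real code would need per-thread or per-CPU storage.
 */
static unsigned int fpu_nest_level;

static void
sketch_kernel_fpu_begin(void)
{
	/* Only the outermost begin actually enters the FPU section. */
	if (fpu_nest_level++ == 0)
		stub_fpu_enter();
}

static void
sketch_kernel_fpu_end(void)
{
	/* An end without a matching begin is a caller bug. */
	assert(fpu_nest_level > 0);
	/* Only the outermost end actually leaves the FPU section. */
	if (--fpu_nest_level == 0)
		stub_fpu_leave();
}
```

With this shape, inner begin/end pairs only touch the counter, so the
underlying enter/leave pair runs once per outermost block no matter how deeply
the driver nests its calls.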