On Tue, 2023-09-26 at 13:16 +0100, Anton Ivanov wrote:
>
> For the time being it is mostly negative :)
Oh well :)

> 1. The performance after the mm patch is down. By 30-40% on my
> standard bench.

For the record, you mean the three-patch series whose thread we're
discussing this in?

Btw, Benjamin realized that MADV_DONTFORK is broken in UML, precisely
_because_ we fork/copy the whole mm process and then try to fix it up.
But we can only fix up things that actually have VMAs, and of course
there are no VMAs with VM_DONTCOPY (set by MADV_DONTFORK) in the new
mm after fork. (There's a small test case sketch at the end of this
mail.)

To fix this, really we should either

1. Start from scratch, without copying, which my other patch [1] did.

   [1] https://lore.kernel.org/all/20230922131638.2c57ec713d1c.Id11dff4b349e6a8f0136bb6bb09f6e01a80befbb@changeid/

   But of course that's more expensive, because we now have to
   page-fault everything in the new process, and page faults are
   expensive.

2. Compare the new mm and the old mm - which requires putting it into
   arch_dup_mmap() like these patches here, where I'm not sure I
   understand at all why they cause a perf regression - and remove
   the VMAs that are marked VM_DONTCOPY in the old one.

To be honest I don't really like _either_ of these approaches, nor the
current "fork the process" approach that UML takes. It's very magic,
and very much works around how Linux works.

Remember that basically the mm process contents should match the page
tables for the VMAs; but this is decidedly not true where fork() is
involved, because while the VMAs are copied, most of the page tables
are _not_ copied. Thus, after fork we don't take page faults in UML
that we would take in a normal system (which is good for performance),
and I believe also vice versa, which would then perhaps explain the
flush_tlb_page() in handle_page_fault() - honestly, I don't otherwise
have an explanation for it.

I think the better approach, for correctness and integration into the
kernel, would be to actually admit that UML is special because page
faults are so expensive, and

 * start with a fresh mm process every time,
 * have vma_needs_copy() return true, and
 * completely fill the mappings according to only the new mm's VMAs,
   in arch_dup_mmap() or perhaps later

(rough sketch of that last part at the end of this mail).

I don't know how that'd behave wrt. performance - it likely cannot be
better than with these patches - but at least it'd be more correct,
and more obviously correct too, for starters, because then the actual
mappings in the UML mm process would actually reflect the PTEs that
Linux knows about.

> 2. The preemption patches work fine on top (all 3 cases). The
> performance difference stays.

OK.

> 3. We do not have anything of value to add in terms of
> cond_resched() to the drivers :(
> Most drivers are fairly simplistic with no safe points to add this.

Yeah, not surprised by this.

> 6. Do we still need force_flush_all() in the arch_dup_mmap()? This
> works with a non-forced tlb flush using flush_tlb_mm(mm);

Maybe not - does it make a difference though? (See the tiny diff at
the end of this mail for what I assume you mean.)

> 7. In all cases, UML is doing something silly.
> The CPU usage while doing find -type f -exec cat {} > /dev/null,
> measured from outside, stays around 8-15% in non-preemptive and
> PREEMPT_VOLUNTARY; the UML takes a sabbatical for the remaining 85%
> instead of actually doing work. PREEMPT is slightly better at 60%,
> but still far from 100%. It just keeps going into idle and I cannot
> understand why.

Is it just waiting for I/O?
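As promised, the sketches. First, the MADV_DONTFORK point: a minimal
(untested) userspace test. The madvise() range must be gone in the
child, so on a correct kernel the write gets SIGSEGV, while on current
UML I'd expect the child to survive, because the fork()ed mm process
still has the pages:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int status;

	if (p == MAP_FAILED || madvise(p, 4096, MADV_DONTFORK))
		return 1;

	if (fork() == 0) {
		/* must die with SIGSEGV, the VMA was not inherited */
		*(volatile char *)p = 1;
		_exit(0);
	}

	wait(&status);
	printf("child %s\n", WIFSIGNALED(status) ?
	       "got a signal (correct)" : "survived (broken)");
	return 0;
}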
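And a rough sketch of what I mean by filling the mappings from only
the new mm's VMAs - just the shape of it; um_map_vma() is a
placeholder for whatever would actually issue the map calls into the
(fresh) mm process, and vma_needs_copy() returning true would ensure
the PTEs are fully populated by the time we get here:

int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
{
	struct vm_area_struct *vma;
	VMA_ITERATOR(vmi, mm, 0);

	/*
	 * The mm process was started from scratch, so there's nothing
	 * stale to fix up or compare against: establish exactly the
	 * mappings the new mm has. VM_DONTCOPY VMAs were never copied,
	 * so they simply don't show up here.
	 */
	for_each_vma(vmi, vma)
		um_map_vma(mm, vma);	/* placeholder, doesn't exist */

	return 0;
}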
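And for your point 6, I assume the experiment was basically this on
top of the series (modulo the exact arguments); if it really makes no
difference, the lighter flush seems fine to me:

	/* in arch_dup_mmap() */
-	force_flush_all(...);
+	flush_tlb_mm(mm);

johannes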