On 2011-09-26 20:20, Blue Swirl wrote:
> On Mon, Sep 26, 2011 at 5:33 PM, Jan Kiszka <jan.kis...@siemens.com> wrote:
>> On 2011-09-26 19:22, Blue Swirl wrote:
>>> On Mon, Sep 26, 2011 at 11:56 AM, Peter Maydell
>>> <peter.mayd...@linaro.org> wrote:
>>>> On 26 September 2011 12:43, Jan Kiszka <jan.kis...@siemens.com> wrote:
>>>>> On 2011-09-26 13:33, Peter Maydell wrote:
>>>>>> On 26 September 2011 11:51, Jan Kiszka <jan.kis...@siemens.com> wrote:
>>>>>>> This increases the overhead of frequently executed helpers. We need to
>>>>>>> move rule past QEMU_CFLAGS assignment to ensure that the required simple
>>>>>>> assignment picks up all bits. The signal workaround is moved just for
>>>>>>> the sake of consistency.
>>>>>>
>>>>>>> +# NOTE: Must be after the last QEMU_CFLAGS assignment
>>>>>>> +op_helper.o user-exec.o: QEMU_CFLAGS := $(subst
>>>>>>> -fstack-protector-all,,$(QEMU_CFLAGS)) $(HELPER_CFLAGS)
>>>>>>
>>>>>> Why also user-exec.o ?
>>>>>
>>>>> That's a good question. It doesn't look like it's deserving this.
>>>>>
>>>>>> Why not the other source files with helpers in?
>>>>>
>>>>> Name them and I add them.
>>>>
>>>> target-*/*helper.c ?
>>>>
>>>> But mostly I think what I'm trying to say is that this is making
>>>> a tradeoff between safety and speed, so it ought to come with a
>>>> rationale for why it is OK to remove the safety checks for these
>>>> files. Given that rationale you can identify other files that are
>>>> also safe/worthwhile to flip the flag for.
>>>
>>> I wouldn't remove -fstack-protector-all by default. Especially op code
>>> interfaces with the guest.
>>
>> I'd love to have some function attribute for this, because a stack
>> protector for rather simple arithmetic operations or something like
>> helper_cli/sti are pointlessly burned cycles.
>
> In order to avoid burning the cycles, there is a certain kernel module
> which gives almost native performance.
Well, even in 2011 there are cases remaining where VT-x/AMD-V is not
available to your favorite hypervisor.

>
>> Maybe we can introduce op_helper_simple.c.
>>
>>>
>>> For max performance version, I'd check if -fomit-frame-pointer and -O3
>>> makes sense. See also this article:
>>> https://www.debian-administration.org/article/672/Optimizing_code_via_compiler_flags
>>
>> We already run without frame pointers, -O3 might be worth exploring in
>> addition. Still, that won't take the protector overhead away.
>
> It would be interesting to have some benchmarks. I'd expect that most
> of the run time is spent within generated code, the next largest item
> should be the translator and any helpers should be marginal.

At least we have a rather static, long-running guest; not much is
recompiled once the system has settled.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
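For anyone following the quoted Makefile hunk: the reason the rule has to
sit after the last QEMU_CFLAGS assignment can be seen in a small standalone
sketch. It assumes GNU make. The flag values, the content of HELPER_CFLAGS
and the target other.o are placeholders picked for illustration and are not
taken from the QEMU tree.

```make
# Standalone illustration, not the QEMU Makefile.
# All flag values below are hypothetical placeholders.
all: op_helper.o other.o

QEMU_CFLAGS := -O2 -fstack-protector-all
HELPER_CFLAGS := -DHELPER_EXAMPLE
QEMU_CFLAGS += -Wall

# Simple (immediate) assignment: $(QEMU_CFLAGS) is expanded right here,
# so this target gets a snapshot of the flags collected so far, minus
# -fstack-protector-all.
op_helper.o: QEMU_CFLAGS := $(subst -fstack-protector-all,,$(QEMU_CFLAGS)) $(HELPER_CFLAGS)

# Appended after the target-specific rule: other.o sees it,
# op_helper.o does not.
QEMU_CFLAGS += -g

# Print the flags each target would be built with instead of compiling.
op_helper.o other.o: ; @echo "$@: $(QEMU_CFLAGS)"

.PHONY: all op_helper.o other.o
```

Running make on this file should show a flag list for op_helper.o that
lacks both -fstack-protector-all and the late -g, while other.o carries
both. That is the effect the NOTE in the patch guards against: anything
appended to QEMU_CFLAGS after the ":=" line is silently missing for the
helper objects.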