On Mon, Sep 26, 2011 at 6:25 PM, Jan Kiszka <jan.kis...@siemens.com> wrote: > On 2011-09-26 20:20, Blue Swirl wrote: >> On Mon, Sep 26, 2011 at 5:33 PM, Jan Kiszka <jan.kis...@siemens.com> wrote: >>> On 2011-09-26 19:22, Blue Swirl wrote: >>>> On Mon, Sep 26, 2011 at 11:56 AM, Peter Maydell >>>> <peter.mayd...@linaro.org> wrote: >>>>> On 26 September 2011 12:43, Jan Kiszka <jan.kis...@siemens.com> wrote: >>>>>> On 2011-09-26 13:33, Peter Maydell wrote: >>>>>>> On 26 September 2011 11:51, Jan Kiszka <jan.kis...@siemens.com> wrote: >>>>>>>> This increases the overhead of frequently executed helpers. We need to >>>>>>>> move rule past QEMU_CFLAGS assignment to ensure that the required >>>>>>>> simple >>>>>>>> assignment picks up all bits. The signal workaround is moved just for >>>>>>>> the sake of consistency. >>>>>>> >>>>>>>> +# NOTE: Must be after the last QEMU_CFLAGS assignment >>>>>>>> +op_helper.o user-exec.o: QEMU_CFLAGS := $(subst >>>>>>>> -fstack-protector-all,,$(QEMU_CFLAGS)) $(HELPER_CFLAGS) >>>>>>> >>>>>>> Why also user-exec.o ? >>>>>> >>>>>> That's a good question. It doesn't look like it's deserving this. >>>>>> >>>>>>> Why not the other source files with helpers in? >>>>>> >>>>>> Name them and I add them. >>>>> >>>>> target-*/*helper.c ? >>>>> >>>>> But mostly I think what I'm trying to say is that this is making >>>>> a tradeoff between safety and speed, so it ought to come with a >>>>> rationale for why it is OK to remove the safety checks for these >>>>> files. Given that rationale you can identify other files that are >>>>> also safe/worthwhile to flip the flag for. >>>> >>>> I wouldn't remove -fstack-protector-all by default. Especially op code >>>> interfaces with the guest. >>> >>> I'd love to have some function attribute for this, because a stack >>> protector for rather simple arithmetic operations or something like >>> helper_cli/sti are pointlessly burned cycles. >> >> In order to avoid burning the cycles, there is a certain kernel module >> which gives almost native performance. > > Well, even in 2011 there are cases remaining where VT-x/AMV-V is not > available to your favorite hypervisor. > >> >>> Maybe we can introduce op_helper_simple.c. >>> >>>> >>>> For max performance version, I'd check if -fomit-frame-pointer and -O3 >>>> makes sense. See also this article: >>>> https://www.debian-administration.org/article/672/Optimizing_code_via_compiler_flags >>> >>> We already run without frame pointers, -O3 might be worth exploring in >>> addition. Still, that won't take the protector overhead away. >> >> It would be interesting to have some benchmarks. I'd expect that most >> of the run time is spent within generated code, the next largest item >> should be the translator and any helpers should be marginal. > > At least we've a rather static, long-running guest, not much is > recompiled once the system has settled.
Optimizing the translator or improving TCG optimizer could also improve performance. By the way, -fomit-frame-pointer is not enabled globally for some reason, it should be the default for non-debug builds.