Re: [Qemu-devel] [PATCH v7 04/11] target-mips: improve exception handling

Leon Alrae Tue, 15 Sep 2015 09:49:39 -0700

On 28/08/2015 10:08, Pavel Dovgaluk wrote:
>> From: Aurelien Jarno [mailto:aurel...@aurel32.net]
>> On 2015-08-13 14:12, Leon Alrae wrote:
>>> On 10/07/2015 10:57, Pavel Dovgalyuk wrote:
>>>> @@ -2364,14 +2363,12 @@ static void gen_st_cond (DisasContext *ctx, uint32_t opc, int rt,
>>>>  #if defined(TARGET_MIPS64)
>>>>      case OPC_SCD:
>>>>      case R6_OPC_SCD:
>>>> -        save_cpu_state(ctx, 1);
>>>>          op_st_scd(t1, t0, rt, ctx);
>>>>          opn = "scd";
>>>>          break;
>>>>  #endif
>>>>      case OPC_SC:
>>>>      case R6_OPC_SC:
>>>> -        save_cpu_state(ctx, 1);
>>>>          op_st_sc(t1, t0, rt, ctx);
>>>>          opn = "sc";
>>>>          break;
>>>
>>> Wouldn't we be better off assuming that conditional stores in linux-user
>>> always take an exception (we generate fake EXCP_SC exception) and avoid
>>> retranslation? After applying these changes I observed a significant
>>> performance impact in multithreaded linux-user apps: for instance, the
>>> c11-atomic-exec test took just 2 seconds to finish before the change,
>>> whereas it now takes more than 30...
>>
>> This really shows the impact of retranslation and why we should avoid
>> it when not necessary. Coming back to the issue here, the fact that we
>> go through retranslation is actually due to the fact that
>> helper_raise_exception has been changed to go through retranslation.
>>
>> Given the code path between user-mode and softmmu is quite different,
>> we definitely need a different code path wrt exception and retranslation
>> for the two cases. That said if we want deterministic code execution
>> (the original purpose of this patch), I don't see how we can do without
>> forcing retranslation. Pavel, do you have an idea for that?
> 
> There is only one case when we can execute without retranslation -
> when the instruction is the last instruction in translation block.
> Then we can set up the PC and flags before this last instruction.
> If the exception happens, we can just break the execution.
> The drawback of this method is breaking translation blocks into
> the smaller parts.
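
To make the trade-off described above concrete, here is a minimal toy
model in C. It is only a sketch: ToyBlock, ToyCpu, raise_with_restore()
and raise_with_saved_pc() are hypothetical names invented for this
illustration and are not QEMU APIs. It assumes fixed 4-byte instructions
and models "retranslation" as a simple re-walk of the block, which is
much cheaper than what the real translator would have to redo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in QEMU. */
enum { INSN_SIZE = 4 };      /* assumed fixed-width guest instructions */

typedef struct {
    uint32_t start_pc;       /* guest PC of the first instruction */
    size_t   n_insns;        /* number of guest instructions in the block */
} ToyBlock;

typedef struct {
    uint32_t pc;             /* guest PC seen by the exception handler */
    size_t   work;           /* instructions re-walked to recover the PC */
} ToyCpu;

/*
 * Strategy A: the PC was not stored before the faulting instruction.
 * To recover it, re-walk the block from its start up to the faulting
 * instruction (a stand-in for retranslation).
 */
static void raise_with_restore(ToyCpu *cpu, const ToyBlock *tb,
                               size_t fault_idx)
{
    uint32_t pc = tb->start_pc;

    assert(fault_idx < tb->n_insns);
    for (size_t i = 0; i < fault_idx; i++) {
        pc += INSN_SIZE;
        cpu->work++;         /* cost grows with the position in the block */
    }
    cpu->pc = pc;
}

/*
 * Strategy B: the translator knows the instruction raises, so it emits
 * a store of the (translate-time constant) PC just before it and ends
 * the block there. Nothing has to be re-walked when the exception fires.
 */
static void raise_with_saved_pc(ToyCpu *cpu, const ToyBlock *tb,
                                size_t fault_idx)
{
    assert(fault_idx < tb->n_insns);
    cpu->pc = tb->start_pc + (uint32_t)fault_idx * INSN_SIZE;
}
```

Both strategies leave the same PC in the toy CPU state, so the handler
sees identical state either way; only the recovery cost differs. The
price of strategy B in this model is that the block has to end at the
faulting instruction, which matches the drawback mentioned above.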


c11-atomic-exec.4 test execution time in linux-user:

* no changes:
real    0m3.039s
user    0m2.976s
sys     0m1.908s

* tb_lock + patch:
real    1m1.167s
user    0m57.240s
sys     0m36.678s

* tb_lock + patch + SC-without-retranslation:
real    0m3.016s
user    0m2.988s
sys     0m1.848s

I had to add tb_lock() to cpu_restore_state() in the first place, otherwise
all of my multithreaded user mode tests crash QEMU with this patch.

SC-without-retranslation (the diff below) seems to improve the situation,
and if I understand correctly we retain deterministic code execution.
Therefore if there are no objections I'll apply this patch + SC correction
to mips-next.

Thanks,
Leon

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 99b99c5..006cb96 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -2060,7 +2060,7 @@ static inline void op_st_##insn(TCGv arg1, TCGv arg2, int rt, DisasContext *ctx)
     tcg_gen_movi_tl(t0, rt | ((almask << 3) & 0x20));                        \
     tcg_gen_st_tl(t0, cpu_env, offsetof(CPUMIPSState, llreg));               \
     tcg_gen_st_tl(arg1, cpu_env, offsetof(CPUMIPSState, llnewval));          \
-    gen_helper_0e0i(raise_exception, EXCP_SC);                               \
+    generate_exception_end(ctx, EXCP_SC);                                    \
     gen_set_label(l2);                                                       \
     tcg_gen_movi_tl(t0, 0);                                                  \
     gen_store_gpr(t0, rt);                                                   \
