On 2/7/20 12:43 PM, Taylor Simpson wrote:
>
>> -----Original Message-----
>> From: Richard Henderson <richard.hender...@linaro.org>
>>
>> But I encourage you to re-think your purely mechanical approach to the
>> hexagon port.  It seems to me that you should be doing much more during
>> the translation phase so that you can minimize the number of helpers that
>> you require.
>
> There are a couple of things we could do
> - Short term: Add #ifdef's to the generated code so that the helper isn't
>   compiled when there is a fWRAP_<tag> defined.  There are currently ~500
>   instructions where this is the case.

Definitely.

> - Long term: Integrate rev.ng's approach that uses flex/bison to parse the
>   semantics and generate TCG code.

There is perhaps an intermediate step that merely special-cases the load/store
insns.  With rare exceptions (hah!) these are the cases that will most often
raise an exception.  Moreover, they are the *only* cases that can raise an
exception without requiring a helper call anyway.

There are a number of cases that I can think of:

    {
      r6 = memb(r1)
      r7 = memb(r2)
    }

    qemu_ld   t0, r1, MO_UB, mmu_idx
    qemu_ld   t1, r2, MO_UB, mmu_idx
    mov       r6, t0
    mov       r7, t1

    {
      r6 = memb(r1)
      memb(r2) = r7
    }

    qemu_ld   t0, r1, MO_UB, mmu_idx
    qemu_st   r7, r2, MO_UB, mmu_idx
    mov       r6, t0

These being the "normal" case wherein the memops are unconditional, and can
simply use a temp for semantics.  Similarly for MEMOP, NV, or SYSTEM insns in
slot0.

    {
      r6 = memb(r1)
      if (p0) r7 = memb(r2)
    }

    qemu_ld   l0, r1, MO_UB, mmu_idx
    andi      t1, p0, 1
    brcondi   t1, 0, L1
    qemu_ld   r7, r2, MO_UB, mmu_idx
    L1:
    mov       r6, l0

For a conditional load in slot0, we can load directly into the final
destination register and skip the temporary.  Because TCG doesn't do global
register allocation, any temporary crossing a basic block boundary gets
flushed to stack.  So this avoids sending the r7 value through an unnecessary
round trip.

This works because (obviously) nothing can raise an exception after slot0, and
the only thing that comes after is the commit phase.

This can be extended to a conditional load in slot1, when we notice that the
insn in slot0 cannot raise an exception.
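
The slot0/slot1 rule for conditional loads can be written down as a small
translate-time predicate.  What follows is only a sketch under assumptions:
`SlotInsn` and `load_direct_to_dest` are hypothetical names, not anything in
QEMU or the hexagon port, and the model ignores everything except the
"can a later insn still raise?" question discussed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-slot summary; not QEMU's real data structures. */
typedef struct {
    bool is_load;
    bool is_conditional;   /* predicated, e.g. "if (p0) r7 = memb(r2)" */
    bool can_raise;        /* may this insn raise an exception? */
} SlotInsn;

/*
 * Decide whether the load in `slot` may write straight into its
 * destination register instead of going through a temporary.
 * insn[0] describes slot0, insn[1] describes slot1; slot1 is
 * processed before slot0.
 */
static bool load_direct_to_dest(const SlotInsn insn[2], int slot)
{
    assert(slot == 0 || slot == 1);

    if (!insn[slot].is_load || !insn[slot].is_conditional) {
        /* Unconditional memops are the "normal" case: use a temp. */
        return false;
    }
    if (slot == 0) {
        /* Nothing can raise an exception after slot0. */
        return true;
    }
    /* slot1: only safe when slot0 cannot raise afterwards. */
    return !insn[0].can_raise;
}
```

A translator could consult such a predicate once per packet and fall back to
the temp-plus-mov sequence whenever it returns false.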

    {
      memb(r1) = r3
      memb(r2) = r4
    }

    call      helper_probe_access, r1, MMU_DATA_STORE, 1
    call      helper_probe_access, r2, MMU_DATA_STORE, 1
    qemu_st   r3, r1, MO_UB, mmu_idx
    qemu_st   r4, r2, MO_UB, mmu_idx

    {
      memb(r1) = r3
      r4 = memb(r2)
    }

    call      helper_probe_access, r1, MMU_DATA_STORE, 1
    call      helper_probe_access, r2, MMU_DATA_LOAD, 1
    qemu_st   r3, r1, MO_UB, mmu_idx
    qemu_ld   r4, r2, MO_UB, mmu_idx

These cases with a store in slot1 are irritating, because I see that (1) all
exceptions must be recognized before anything commits, and (2) slot1
exceptions must have preference over slot0 exceptions.  But we can probe them
easily enough.

> - Long long term: A much more general approach will be to turn the C
>   semantics code into LLVM IR and generate TCG from the IR.

Why would you imagine this to be more interesting than using flex/bison?


r~
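
As a footnote to the store-in-slot1 cases: the probe-then-commit ordering can
be modelled in a few lines of plain C.  This is a toy sketch, not QEMU code.
`MemOp`, `probe_ok` and `run_packet` are hypothetical names, `probe_ok` is
only a stand-in for what a call to helper_probe_access would check, and the
16-byte "address space" is an assumption made to keep the example
self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEM_SIZE 16u   /* toy address space: addresses >= MEM_SIZE fault */

typedef struct {
    bool     is_store;
    uint32_t addr;
    uint8_t  val;      /* value to store (ignored for loads) */
    uint8_t *dest;     /* where a load puts its result (ignored for stores) */
} MemOp;

/* Toy stand-in for the probe: true if the access would succeed. */
static bool probe_ok(uint32_t addr)
{
    return addr < MEM_SIZE;
}

/*
 * ops[0] is the slot1 access, ops[1] the slot0 access.
 * Returns -1 on success, otherwise the number of the slot whose
 * exception is reported.  Nothing is committed if either probe fails.
 */
static int run_packet(uint8_t mem[MEM_SIZE], const MemOp ops[2])
{
    /* Phase 1: probe slot1 first so that its exception wins. */
    if (!probe_ok(ops[0].addr)) {
        return 1;
    }
    if (!probe_ok(ops[1].addr)) {
        return 0;
    }
    /* Phase 2: both accesses are known to succeed; perform them. */
    for (int i = 0; i < 2; i++) {
        if (ops[i].is_store) {
            mem[ops[i].addr] = ops[i].val;
        } else {
            *ops[i].dest = mem[ops[i].addr];
        }
    }
    return -1;
}
```

The point of the two phases is that a fault in slot0 is discovered before the
slot1 store has touched memory, which is what requirement (1) asks for, and
the probe order gives requirement (2).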