On Mon, Sep 12, 2011 at 8:13 AM, Paolo Bonzini <pbonz...@redhat.com> wrote:
> On 09/12/2011 10:01 AM, Richard Henderson wrote:
>>
>> > > After this patch set, only load and store op helpers remain in
>> > > op_helper.c. I have some patches for those but they need more
>> > > thought.
>> >
>> > Have you benchmarked it?
>>
>> Asking for a benchmark without full conversion is pointless.
>
> Agreed. But I would not push these patches without having tried them out on
> a prototype of a full conversion (i.e. with the load/store helpers
> converted, for which Blue Swirl said he has patches, and with the
> environment not pinned to AREG0 in TCG code).

The load/store helpers are tricky. Some Sparc64 helpers now need five
32/64-bit arguments, which may be a problem on some hosts. Changing
functions like tlb_fill() and do_unaligned_access() to use a passed
CPUState pointer instead of AREG0 needs global changes. The template
system for generating the load/store functions is interesting. Then
there are __ldb_mmu() and friends, called from TCG-generated code. It
would be highly desirable to limit the changes to the Sparc translator
only, but I don't think global changes can be avoided.

> So I hoped that he did have such a prototype, or alternatively that he
> benchmarked them and showed only minor degradations.

I don't see any slowdown. Maybe a real benchmark is needed.

Looking at the code, there are only minor differences. On an amd64
host, r14 is now available but the new code does not use it, so that
doesn't help. On i386 there are larger differences, but that is mostly
because ebp is normally used as the frame pointer. Using it for a
global register needs -fomit-frame-pointer. Disregarding the frame
pointer issues, the changes are minor.

For example i386 host, unpatched, op_helper.o:

00000dc0 <helper_udiv>:
     dc0:   83 ec 1c                sub    $0x1c,%esp
     dc3:   65 8b 0d 14 00 00 00    mov    %gs:0x14,%ecx
     dca:   89 4c 24 0c             mov    %ecx,0xc(%esp)
     dce:   31 c9                   xor    %ecx,%ecx
     dd0:   8b 44 24 20             mov    0x20(%esp),%eax
     dd4:   8b 54 24 24             mov    0x24(%esp),%edx
     dd8:   8b 4c 24 0c             mov    0xc(%esp),%ecx
     ddc:   65 33 0d 14 00 00 00    xor    %gs:0x14,%ecx
     de3:   75 0a                   jne    def <helper_udiv+0x2f>
     de5:   31 c9                   xor    %ecx,%ecx
     de7:   83 c4 1c                add    $0x1c,%esp
     dea:   e9 f1 fe ff ff          jmp    ce0 <helper_udiv_common>
     def:   e8 fc ff ff ff          call   df0 <helper_udiv+0x30>
     df4:   8d b6 00 00 00 00       lea    0x0(%esi),%esi
     dfa:   8d bf 00 00 00 00       lea    0x0(%edi),%edi

Patched, function in helper.o:

000002a0 <helper_udiv>:
     2a0:   55                      push   %ebp
     2a1:   89 e5                   mov    %esp,%ebp
     2a3:   53                      push   %ebx
     2a4:   83 ec 14                sub    $0x14,%esp
     2a7:   8b 45 08                mov    0x8(%ebp),%eax
     2aa:   65 8b 1d 14 00 00 00    mov    %gs:0x14,%ebx
     2b1:   89 5d f4                mov    %ebx,-0xc(%ebp)
     2b4:   31 db                   xor    %ebx,%ebx
     2b6:   8b 55 0c                mov    0xc(%ebp),%edx
     2b9:   8b 4d 10                mov    0x10(%ebp),%ecx
     2bc:   8b 5d f4                mov    -0xc(%ebp),%ebx
     2bf:   65 33 1d 14 00 00 00    xor    %gs:0x14,%ebx
     2c6:   75 11                   jne    2d9 <helper_udiv+0x39>
     2c8:   c7 45 08 00 00 00 00    movl   $0x0,0x8(%ebp)
     2cf:   83 c4 14                add    $0x14,%esp
     2d2:   5b                      pop    %ebx
     2d3:   5d                      pop    %ebp
     2d4:   e9 e7 fe ff ff          jmp    1c0 <helper_udiv_common>
     2d9:   e8 fc ff ff ff          call   2da <helper_udiv+0x3a>
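
For readers who want the source-level picture behind the two listings,
here is a rough C sketch of what they appear to correspond to: a thin
wrapper that tail-calls a common routine with a constant 0 as the last
argument, first with the CPU state reached through a global pointer and
then with it passed explicitly. This is inferred from the disassembly,
not copied from the patch. The names with _old/_new suffixes, udiv_core,
areg0_env, the one-field CPUState, and the meaning of the cc flag are
all assumptions made for illustration, and the arithmetic is a
simplified model of Sparc UDIV (Y:rs1 divided by rs2, saturated to 32
bits).

```c
#include <stdint.h>

/* Hypothetical stand-in for QEMU's CPUState; only one field is modeled. */
typedef struct CPUState {
    uint32_t y; /* Sparc Y register: upper 32 bits of the UDIV dividend */
} CPUState;

/* Old style: the state is reached through a global pointer. QEMU pins
 * that pointer to a host register (AREG0); a plain global is used here
 * so the sketch builds with any C compiler. */
static CPUState *areg0_env;

/* Shared arithmetic: divide the 64-bit value Y:a by b and saturate the
 * quotient to 32 bits. Real hardware traps on b == 0; this sketch
 * returns 0 instead. */
static uint32_t udiv_core(const CPUState *env, uint32_t a, uint32_t b)
{
    uint64_t dividend, quotient;

    if (b == 0) {
        return 0;
    }
    dividend = ((uint64_t)env->y << 32) | a;
    quotient = dividend / b;
    return quotient > UINT32_MAX ? UINT32_MAX : (uint32_t)quotient;
}

/* Unpatched shape: two operands plus a flag (assumed to select a
 * condition-code update, which is not modeled here). */
static uint32_t udiv_common_old(uint32_t a, uint32_t b, int cc)
{
    (void)cc;
    return udiv_core(areg0_env, a, b);
}

uint32_t helper_udiv_old(uint32_t a, uint32_t b)
{
    return udiv_common_old(a, b, 0);
}

/* Patched shape: the state pointer becomes an explicit first argument,
 * so the common routine now takes four arguments instead of three. */
static uint32_t udiv_common_new(CPUState *env, uint32_t a, uint32_t b, int cc)
{
    (void)cc;
    return udiv_core(env, a, b);
}

uint32_t helper_udiv_new(CPUState *env, uint32_t a, uint32_t b)
{
    return udiv_common_new(env, a, b, 0);
}
```

If the common routine uses a three-register calling convention, as the
use of eax, edx and ecx in both listings suggests, then the extra env
argument pushes the constant flag out of ecx and onto the stack. That
would be consistent with the xor %ecx,%ecx in the unpatched code turning
into movl $0x0,0x8(%ebp) in the patched code.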