On Sun, Apr 10, 2011 at 09:25:33PM +0200, Alexander Graf wrote:
>
> On 10.04.2011, at 21:23, Aurelien Jarno wrote:
>
> > On Tue, Apr 05, 2011 at 09:55:09AM +0200, Alexander Graf wrote:
> >>
> >> On 05.04.2011, at 06:54, Aurelien Jarno wrote:
> >>
> >>> On Mon, Apr 04, 2011 at 04:32:24PM +0200, Alexander Graf wrote:
> >>>> With the s390x target we use the deposit instruction to store 32bit values
> >>>> into 64bit registers without clobbering the upper 32 bits.
> >>>>
> >>>> This specific operation can be optimized slightly by using the ext operation
> >>>> instead of an explicit and in the deposit instruction. This patch adds that
> >>>> special case to the generic deposit implementation.
> >>>>
> >>>> Signed-off-by: Alexander Graf <ag...@suse.de>
> >>>> ---
> >>>> tcg/tcg-op.h | 6 +++++-
> >>>> 1 files changed, 5 insertions(+), 1 deletions(-)
> >>>
> >>> Have you really measuring a difference here? This should already be
> >>> handled, at least on x86, by this code:
> >>>
> >>> if (TCG_TARGET_REG_BITS == 64) {
> >>>     if (val == 0xffffffffu) {
> >>>         tcg_out_ext32u(s, r0, r0);
> >>>         return;
> >>>     }
> >>>     if (val == (uint32_t)val) {
> >>>         /* AND with no high bits set can use a 32-bit operation. */
> >>>         rexw = 0;
> >>>     }
> >>> }
> >>
> >> I've certainly looked at the -d op logs and seen that instead of creating
> >> a const tcg variable plus an AND there was now an extu opcode issued, yes.
> >> No idea why the case up there didn't trigger.
> >>
> >
> > The question there is looking at -d out_asm. They should be the same at
> > the end as the code I pasted above is from tcg/i386/tcg-target.c.
>
> Yes. I was trying to optimize for maximum op length. TCG defines a maximum
> number of tcg ops to be issued by each target instruction. Since s390 is very
> CISCy, there are instructions that translate into lots of microops, but are
> still faster than a C call (register save/restore mostly).
>
> Without this patch, there are some places where we hit that number :).

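For reference, the special case being discussed can be sketched in plain C as below. This is only an illustration of the arithmetic, not the code from tcg/tcg-op.h: the function names are made up for this example, and it models the ops on ordinary integers rather than on TCG values. It shows that for offset 0 and length 32, the AND with 0xffffffff inside a generic deposit is the same thing as a 32-bit zero-extension, which is why an ext op can replace the constant load plus AND.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a generic deposit: insert the low `len` bits of
 * `val` into `base` at bit offset `ofs`, keeping all other bits of `base`. */
static uint64_t deposit64_generic(uint64_t base, uint64_t val,
                                  unsigned ofs, unsigned len)
{
    uint64_t mask = (len == 64) ? ~0ULL : ((1ULL << len) - 1);

    /* Explicit AND with a constant mask, then shift into place. */
    return (base & ~(mask << ofs)) | ((val & mask) << ofs);
}

/* Hypothetical model of the special case ofs == 0, len == 32: the AND with
 * 0xffffffff becomes a zero-extension of the low 32 bits (an "ext32u"). */
static uint64_t deposit64_low32(uint64_t base, uint64_t val)
{
    uint64_t low = (uint32_t)val;   /* zero-extend instead of masking */

    return (base & 0xffffffff00000000ULL) | low;
}

/* Small self-check: both forms agree and the upper 32 bits are preserved. */
static void check_low32_equivalence(void)
{
    uint64_t base = 0x1122334455667788ULL;
    uint64_t val  = 0xaabbccddeeff0011ULL;

    assert(deposit64_low32(base, val) == deposit64_generic(base, val, 0, 32));
    assert((deposit64_low32(base, val) >> 32) == 0x11223344ULL);
}
```

Whether the ext form actually saves anything in the generated host code is what -d out_asm would show; the sketch only illustrates why it saves ops at the TCG level.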
Is it on 32-bit or 64-bit? If we reach this number, it's probably better
to either implement this instruction with a helper, or maybe increase
the maximum number of ops. What is this instruction?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net