On Sun, May 15, 2011 at 02:33:55PM +0300, Blue Swirl wrote: > On Sun, May 15, 2011 at 2:14 PM, Aurelien Jarno <aurel...@aurel32.net> wrote: > > On Sun, May 15, 2011 at 02:03:40PM +0300, Blue Swirl wrote: > >> On Sun, May 15, 2011 at 12:27 PM, Laurent Desnogues > >> <laurent.desnog...@gmail.com> wrote: > >> > On Sun, May 15, 2011 at 9:15 AM, Blue Swirl <blauwir...@gmail.com> wrote: > >> >> On Sun, May 15, 2011 at 1:04 AM, Aurelien Jarno <aurel...@aurel32.net> > >> >> wrote: > >> >>> On Sun, May 15, 2011 at 12:52:35AM +0300, Blue Swirl wrote: > >> >>>> On Sun, May 15, 2011 at 12:16 AM, Aurelien Jarno > >> >>>> <aurel...@aurel32.net> wrote: > >> >>>> > On Sat, May 14, 2011 at 10:35:20PM +0300, Blue Swirl wrote: > >> > [...] > >> >>>> > The env register is used very often (basically for every > >> >>>> > load/store, but > >> >>>> > also a lot of helpers), so it makes sense to reserve a register for > >> >>>> > it. > >> >>>> > > >> >>>> > For what I understand from your patch series, you prefer to pass > >> >>>> > this > >> >>>> > register explicitly to TCG functions. This basically means this TCG > >> >>>> > global will be loaded to host register as soon as it is used, but > >> >>>> > also > >> >>>> > regularly, as globals are saved back to their canonical location > >> >>>> > before > >> >>>> > an helper or a load/store. > >> >>>> > > >> >>>> > So it seems that this patch series will just allowing the "env > >> >>>> > register" > >> >>>> > to change over time, though it will not spare one more register for > >> >>>> > the > >> >>>> > TCG code, and it will emit longer TCG code to regularly reload the > >> >>>> > env > >> >>>> > global into a host register. > >> >>>> > >> >>>> But there will be one more register available in some cases. In other > >> >>> > >> >>> Inside the TCG code, it will basically happens very rarely, given > >> >>> load/store are really the most used instructions, and they need to load > >> >>> the env register. > >> >> > >> >> Not exactly, from a sample run with -d op_opt: > >> >> $ egrep -v -e '^$' -v -e 'OP after' -v -e ' end' -v -e 'Search PC' > >> >> /tmp/qemu.log | awk '{print $1}' | sort | uniq -c|sort -rn > >> >> 1673966 movi_i32 > >> >> 653931 ld_i32 > >> >> 607432 mov_i32 > >> >> 428684 st_i32 > >> >> 326878 movi_i64 > >> >> 308626 add_i32 > >> >> 283186 call > >> >> 256817 exit_tb > >> >> 207232 nopn > >> >> 189388 goto_tb > >> >> 122398 and_i32 > >> >> 117997 shr_i32 > >> >> 89107 qemu_ld32 > >> >> 82926 set_label > >> >> 82713 brcond_i32 > >> >> 67169 qemu_st32 > >> >> 55109 or_i32 > >> >> 46536 ext32u_i64 > >> >> 44288 xor_i32 > >> >> 38103 sub_i32 > >> >> 26361 shl_i32 > >> >> 23218 shl_i64 > >> >> 23218 qemu_st64 > >> >> 23218 or_i64 > >> >> 20474 shr_i64 > >> >> 20445 qemu_ld64 > >> >> 11161 qemu_ld8u > >> >> 10409 qemu_st8 > >> >> 5013 qemu_ld16u > >> >> 3795 qemu_st16 > >> >> 2776 qemu_ld8s > >> >> 1915 sar_i32 > >> >> 1414 qemu_ld16s > >> >> 839 not_i32 > >> >> 579 setcond_i32 > >> >> 213 br > >> >> 42 ext32s_i64 > >> >> 30 mul_i64 > >> > > >> > Unless I missed something, this doesn't show the usage of > >> > ld/st per TB, which is what Aurélien was looking for if I > >> > understood correctly. All I can say is that you had at > >> > most 256817 TB's and 234507 qemu_ld/st, so about one per > >> > TB. > >> > >> The question was ratio of loads/stores to other instructions. The > >> statistics are not per TB. There were about 174880 TBs. > >> > >> > Anyway I must be thick, because I fail to see how > >> > generated code could access guest CPU registers without a > >> > pointer to the CPU env :-) > >> > > >> > IIUC the SPARC translator uses ld_i32/st_i32 mainly for > >> > accessing the guest CPU registers, which due to register > >> > windows is held in a dedicated global temp. Is that > >> > correct? If so this is kind of hiding accesses to the > >> > CPU env; all other targets read/write registers by using > >> > CPU env (through the use global temps in most cases). > >> > > >> > So I think most (if not almost all) TB will need a pointer > >> > to CPU env, which is why I think Aurélien's proposal to > >> > keep a dedicated register that'd be loaded in the prologue > >> > is the only way to not degrade performance of the > >> > generated code (I'd add that this dedicated register > >> > should be the one defined by the ABI as holding the first > >> > parameter value, if that's possible; I'm afraid this is > >> > not necessarily a good idea). > >> > >> CPU env will be used, but the register could be made available for > >> other uses too. > >> > > > > What other uses? It is needed for almost every TB, so there is not other > > possible use. Let's take an example, a simple TB from mips on x86_64: > > > > | IN: start_kernel > > | 0x80510958: jal 0x8051d50c > > | 0x8051095c: nop > > | > > | OP after liveness analysis: > > | movi_i32 ra,$0x80510960 > > | movi_i32 PC,$0x8051d50c > > | exit_tb $0x0 > > | end > > > > | OUT: [size=28] > > | 0x40f22960: mov $0x80510960,%ebp > > | 0x40f22965: mov %ebp,0x7c(%r14) > > | 0x40f22969: mov $0x8051d50c,%ebp > > | 0x40f2296e: mov %ebp,0x80(%r14) > > | 0x40f22975: xor %eax,%eax > > | 0x40f22977: jmpq 0x115042e > > > > x86_64 uses r14 as TCG_AREG0. Despite the instructions being quite > > simple (only 2 movi_i32), the resulting code makes 2 access to env to > > save the two registers. Having to reload the env pointer each time to a > > register would clearly increase the size of this TB. > > I don't think TCG would be that simple, instead the pointer would be > loaded only once in this case. > > > Another question, imagine env is not in a register anymore. What kind of > > code do you plan to use to load the env pointer into a register? This is > > something we need to know to evaluate the cost of not having a fixed > > register for env withing the TCG code. > > The prologue could save it to the first temp dedicated for this. >
So basically you mean you want to keep a dedicated temp for env in the TCG generated code??? It seems to go in the opposite of your previous mails stating a full elimination of TCG_AREG0. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net