On 2/24/15 22:21, Chris Metcalf wrote: > On 2/24/2015 2:53 AM, Chen Gang S wrote: >> typedef struct CPUTLState { >> + uint64_t regs[TILEGX_R_COUNT]; /* Common used registers by outside */ >> + uint64_t zero; /* Zero register */ >> + uint64_t pc; /* Current pc */ >> CPU_COMMON >> } CPUTLState; > > I skimmed through this and my only comment is that I was surprised to see > "zero" as part of the state, since it's always 0. :-) No doubt there is > some reason that this makes sense. >
Originally, I did not add zero register, but after think of, for gen_st operation, tcg_gen_st*() only support from tcg_target_long to register, so I add zero register for "offsetof(CPUTLState, zero)". Welcome any better ways for it. >> +#define TILEGX_GEN_CHK_BEGIN(x) \ >> + if ((x) == TILEGX_R_ZERO) { >> +#define TILEGX_GEN_CHK_END(x) \ >> + return 0; \ >> + } \ >> + if ((x) >= TILEGX_R_COUNT) { \ >> + return -1; \ >> + } > > This macro pattern seems potentially a little confusing and I do wonder if > there is some way to avoid having to explicitly check the zero register every > time; for example, maybe you make it a legitimate part of the state and > declare that there are 64 registers, and then just always finish any > register-update phase by re-zeroing that register? It might yield a smaller > code footprint and probably be just as fast, as long as it was easy to know > where registers were updated. > Originally, I really used 64 registers, but after tried, I found that I still had to process TILEGX_R_ZERO specially, e.g. - When src is zero, we can use mov operation instead of add operation. - When dst is zero, in most cases, we can just skip the insn, but in some cases, it may cause exception in user mode (e.g st zero r0). Originally, I also tried a wrap function for zero register, but I found for different operands, when they meet zero register, they would need different operations: - For add/addi operation, it will change to mov/movi operation. - For mov operation, it will change to movi operation. - For shl3add, if srcb is zero register, it will change to shli operation. - For ld/st operation, it still be ld/st operation, but they need "offsetof(CPUTLState, zero)", and in some cases, it should be cause exception. So after think of, for me, I still prefer to use 56 registers with individual zero register, and use macros for it. > Also, note that you should model accesses to registers 56..62 as causing an > interrupt to be raised, rather than returning -1. But you might choose to > just put this on your TODO list and continue making forward progress for the > time being. But a FIXME comment here would be appropriate. > Yeah, thanks. I shall add it when I send patch v2 for it. Also I forgot to use "offsetof(CPUTLState, zero)" for ld zero register case, and still "return 0" for "st zero r1" or "ld r1 zero". I shall fix it in the patch v2. >> + case 0x3800000000000000ULL: > > There are a lot of constants defined in the tile <arch/opcode.h> header, and > presumably you could synthesize these constant values from them. Perhaps it > would make sense to import that header into qemu and then use symbolic values > for all of this kind of thing? > OK, thanks. originally I wanted to reuse them, but after think of, I prefer the 64-bit immediate number instead of. - The decoding function may be very long, but it is still a simple function, I guess, it is still simple enough for readers. - 64-bit immediate number has better performance under 64-bit machine (e.g x86-64). But I guess, there are still quite a few of inline functions (e.g. get src/dst) in arch/opcode_tilegx.h may be reused by me in the future. :-) > I can't really comment on most of the rest of the code, and I only skimmed it > quickly, but generally: good work getting as far as you have! > Thank you for your work and your encouragement. Thanks. -- Chen Gang Open, share, and attitude like air, water, and life which God blessed