On 01/14/2010 08:10 AM, Aurelien Jarno wrote:
On Tue, Jan 05, 2010 at 04:31:11PM -0800, Richard Henderson wrote:
A while ago Laurent pointed out that the setcc opcode emitted by
the setcond patch had unnecessary REX prefixes.
The existing P_REXB internal opcode flag unconditionally emits
the REX prefix. Technically it's not needed if the register in
question is %al, %bl, %cl, %dl.
Eliding the prefix requires splitting the P_REXB flag into two,
in order to indicate whether the byte register in question is
in the REG or the R/M field. Within TCG, the byte register is
in the REG field only for stores.
Signed-off-by: Richard Henderson<r...@twiddle.net>
---
tcg/x86_64/tcg-target.c | 46 ++++++++++++++++++++++++++++++----------------
1 files changed, 30 insertions(+), 16 deletions(-)
diff --git a/tcg/x86_64/tcg-target.c b/tcg/x86_64/tcg-target.c
index f584c94..8b6b68c 100644
--- a/tcg/x86_64/tcg-target.c
+++ b/tcg/x86_64/tcg-target.c
@@ -217,9 +217,10 @@ static inline int tcg_target_const_match(tcg_target_long
val,
#define JCC_JLE 0xe
#define JCC_JG 0xf
-#define P_EXT 0x100 /* 0x0f opcode prefix */
-#define P_REXW 0x200 /* set rex.w = 1 */
-#define P_REXB 0x400 /* force rex use for byte registers */
+#define P_EXT 0x100 /* 0x0f opcode prefix */
+#define P_REXW 0x200 /* set rex.w = 1 */
+#define P_REXB_R 0x400 /* REG field as byte register */
+#define P_REXB_RM 0x800 /* R/M field as byte register */
static const uint8_t tcg_cond_to_jcc[10] = {
[TCG_COND_EQ] = JCC_JE,
@@ -234,17 +235,30 @@ static const uint8_t tcg_cond_to_jcc[10] = {
[TCG_COND_GTU] = JCC_JA,
};
-static inline void tcg_out_opc(TCGContext *s, int opc, int r, int rm, int x)
+static void tcg_out_opc(TCGContext *s, int opc, int r, int rm, int x)
{
- int rex;
- rex = ((opc>> 6)& 0x8) | ((r>> 1)& 0x4) |
- ((x>> 2)& 2) | ((rm>> 3)& 1);
- if (rex || (opc& P_REXB)) {
+ int rex = 0;
+
+ rex |= (opc& P_REXW)>> 6; /* REX.W */
+ rex |= (r& 8)>> 1; /* REX.R */
+ rex |= (x& 8)>> 2; /* REX.X */
+ rex |= (rm& 8)>> 3; /* REX.B */
+
+ /* P_REXB_{R,RM} indicates that the given register is the low byte.
+ For %[abcd]l we need no REX prefix, but for %{si,di,bp,sp}l we do,
+ as otherwise the encoding indicates %[abcd]h. Note that the values
+ that are ORed in merely indicate that the REX byte must be present;
+ those bits get discarded in output. */
+ rex |= (r>= 4 ? opc& P_REXB_R : 0);
+ rex |= (rm>= 4 ? opc& P_REXB_RM : 0);
+
+ if (rex) {
tcg_out8(s, rex | 0x40);
}
With the above change, rex can be > 0xff. Not sure it's not a good idea
to not have an explicit cast when calling tcg_out8(), even if it
technically works.
Same as below.
- if (opc& P_EXT)
+ if (opc& P_EXT) {
tcg_out8(s, 0x0f);
- tcg_out8(s, opc& 0xff);
+ }
+ tcg_out8(s, opc);
What's the reason for removing the '& 0xff' part? tcg_out8() takes an uint8_t.
Yes, and the uint8_t truncates the value just fine. Is there any
particular reason you want to clutter the code with a duplicate
truncation? It might have been reasonable if the function name didn't
quite clearly indicate that a single byte was going to be output...
r~