On 2/16/25 06:58, Richard Henderson wrote:
On 2/14/25 18:11, Michael Clark wrote:
the intent of this patch is more conventional nomenclature
but the constant pool data code is also simplified a little.

- merge new_pool_{alloc,insert} -> new_pool_data.
- rename TCGLabelPoolData -> TCGData.
- rename pool_labels -> pool_data.
- rename macro TCG_TARGET_NEED_POOL_DATA.
- move TCGData struct definition into tcg.h.
- comment translation block epilogue members.

You can see from this list that this should be multiple patches.

possibly. possibly not. I don't know how much value it adds to split it up as one can see a logical change together in the message and diff. "label" to "data" essentially. splitting it splits the logical name change across commits and imho would be harder as a reader of history.

splitting the name change and the merging of the two functions seems a little complex to coordinate as I would need some intermediate names. they seem to be part of the same change. if I changed it to varargs without naming consistent with my thinking it makes less sense to me.

TCGLabelPoolData is ambiguous and asks for potential confusion
with the unrelated TCGLabel type. there is no label in the sense
of TCGLabel.

Fair.

the label member is merely a pointer to the instruction text to
be updated with the relative address of the constant, the primary
data is the constant data pool at the end of translation blocks.
this relates more closely to .data sections in offline codegen
if we were to imagine a translation block has .text and .data.

No, it doesn't.  It relates most closely to data emitted within .text, accessed via pc-relative instructions with limited offsets.

This isn't a thing you'd have ever seen on x86 or x86_64, but it is quite common for arm32 (12-bit offsets), sh4 (8-bit offsets), m68k (16- bit offsets) and such.  Because the offsets are so small, they could even be placed *within* functions not just between them.

yes I am aware of what it is. sometimes referred to as constant islands. but you could also consider it as switching between .text and .data or maybe interleaved .sdata or "small data". and yes one needs also to be careful about pads. one could try to put host instruction text into them and use them as a piece of an emulator break out if you can find out how to modify the stack pointer to jump to these "constants" in RX pages.

I can revise the text. I see it as a reified text constant as the primary data with a label (pointer not object) and a relocation embedded inside the constant data object for performance. in a more conventional code generator one would create a TCGLabel and TCGReloc. and yes you are right. it's not really .data section. I am just curious about reconciling concepts between binary translators and more traditional compilers. but ideally without a loss in performance.

I actually have a VM in mind that has a constant stream with it's own counter that branches called IB (immediate base). IB is set in call procedure and we pack a vector into the link register with the relative offset of the program counter and immediate base register (i32,i32) gives call +/-2GiB reach. link register no longer has absolute address. and there is a branch instruction for the constant stream. return needs two immediate offsets displaced from the text and constant entry points to compute the return address.

I have a sketch for a design that does this. and constants don't come over the same memory port as data as it has dedicated fetch bandwidth with a constant prefetcher. I could build a front-end and back-end for it when I get time. a CPU with some GPU concepts but no fixed function rasterizer. also relocs are easier because a RISC using this gets more conventional aligned PC-relative relocs of whole words. so the linker would be a little faster. constants are all bonded register slot sized in the instruction form instead of a special immediate form. it needs extra bypassing for "large immediate" but the latency can be covered by adding pipeline stages. constant branch predictor in parallel up front.

thus TCGData is more succinct and more reflective of what the
structure contains; data emitted in the constant data pool at
the end of translation blocks. also, pool_labels is renamed to
pool_data as the primary contents of the list is constant data.

I guess.  TCGData is perhaps too short, but we can certainly avoid the confusion of "labels".

TCGPoolData would be okay and I thought about that. the reason against was that it competed with TCGPool. but in some senses that might be the right name because it is a data constant in the pool. but we embed the relocation and label (as pointer) without a full TCGLabel and TCGReloc.

finally, new_pool_alloc and new_pool_insert are merged into a
single function named new_pool_data, which moves nlongs to the
end of the parameter list with varargs to allocate, copy, and
insert constant data items to simplify new_pool_label et al.
a successive step would be to collapse callers into calling
new_pool_data and remove a layer of indirection.

Why?  varargs generally produces horrible code.
The split between alloc and insert was intentional to avoid this.

it's pretty good code on SysV because it goes via registers except for perhaps new_pool_l8 which will spill to stack and get copied unless the inliner can eliminate the copies. maybe windows has bad varargs. but I like the style better than deeper layers of wrapper functions. they should fix the compiler so that it produces better code.

btw I didn't run through a benchmark suite to check for performance regressions because I don't have one in the rig. it was more of a code comprehension comment expressed as a patch. it could be merged if it adds to the code comprehension without affecting performance.

diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index a9ca493d20f6..448c2330ef0f 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -72,6 +72,6 @@ typedef enum {
  } TCGReg;
  #define HAVE_TCG_QEMU_TB_EXEC
-#define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_NEED_POOL_DATA

Oops, this should have been removed with a417ef83.

I will refresh.

Reply via email to