On 2/16/25 00:00, Michael Clark wrote:
On 2/16/25 06:58, Richard Henderson wrote:

the label member is merely a pointer to the instruction text to
be updated with the relative address of the constant, the primary
data is the constant data pool at the end of translation blocks.
this relates more closely to .data sections in offline codegen
if we were to imagine a translation block has .text and .data.

No, it doesn't.  It relates most closely to data emitted within .text, accessed via pc-relative instructions with limited offsets.

This isn't a thing you'd have ever seen on x86 or x86_64, but it is quite common for arm32 (12-bit offsets), sh4 (8-bit offsets), m68k (16-bit offsets) and such.  Because the offsets are so small, the data could even be placed *within* functions, not just between them.

I mentioned before I like the idea and have thought about architectures with constant streams and constant branch units.

say for argument's sake we considered it 'TCData' with embedded label and reloc (the purpose is the constant after all, just it is not a TCGTemp, it's an explicitly reified constant in the codegen emitters). wondering if we could add a "disposition" field to control placement. TCG_DISP_TEXT_TB, TCG_DISP_DATA, etc. this way you could ask the code generator to do something more conventional while still supporting the short relative constant islands. "disposition" might be better than section as a name. also a DATA section could be mmap R without X perms to lessen the risk of injecting code as constants.
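
A rough sketch of what the "disposition" idea above might look like. Only the names TCData, TCG_DISP_TEXT_TB and TCG_DISP_DATA come from the proposal; the struct fields and the tc_data_prot() helper are hypothetical, made up for illustration, and are not part of QEMU.

```c
#include <stdint.h>
#include <sys/mman.h>

/* Placement choices named in the proposal. */
typedef enum {
    TCG_DISP_TEXT_TB,   /* constant island at the end of the TB */
    TCG_DISP_DATA,      /* separate data region */
} TCGDisposition;

/* Hypothetical layout: every field here is an assumption. */
typedef struct TCData {
    void *label;            /* instruction to patch with the relative address */
    int rtype;              /* relocation type used for the patch */
    TCGDisposition disp;    /* where the constant should be placed */
    uint64_t value;         /* the constant itself */
} TCData;

/*
 * Hypothetical helper: mmap protection a region of the given
 * disposition would need.  Data placed with the code shares the
 * executable mapping; a separate data region could be read-only.
 */
static int tc_data_prot(TCGDisposition disp)
{
    switch (disp) {
    case TCG_DISP_DATA:
        return PROT_READ;
    case TCG_DISP_TEXT_TB:
    default:
        return PROT_READ | PROT_EXEC;
    }
}
```

This only shows the shape of the interface. It says nothing about whether a separate region would be reachable from the generated code.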

I don't think there's any point to doing anything differently than we currently do: place the data at the end of the TB.

(1) The architectures that we host and use the constant pool currently have
    relatively large displacements: aarch64 (21 bit), x86_64 (32 bit),
    ppc (16 or 34 bit (power10 only)), riscv (32 bit), s390x (34 bit).

(2) The size of a TB pretty generally maxes out at 3-4k, but is firmly capped
    at 64k by uint16_t TranslationBlock.jmp_reset_offset.

(3) The 16 and 21-bit offsets are not large enough to stretch to a read-only
    mapping.

(4) Memory management of TranslationBlocks becomes *much* more complicated.
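
To make the reach argument in (1)-(3) concrete, here is a minimal sketch of a range check for a pc-relative displacement. It assumes the displacement field is signed, so an N-bit field covers -2^(N-1) up to 2^(N-1)-1 bytes. The helper names disp_fits() and pool_reachable() are hypothetical and not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: does disp fit in a signed pc-relative
 * field that is 'bits' wide?
 */
static bool disp_fits(int64_t disp, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    int64_t lim = (int64_t)1 << (bits - 1);
    return disp >= -lim && disp < lim;
}

/*
 * Hypothetical helper: can an instruction at insn_off reach a pool
 * entry at pool_off?  Both offsets are in bytes from the start of
 * the same code region.
 */
static bool pool_reachable(uint32_t insn_off, uint32_t pool_off,
                           unsigned bits)
{
    return disp_fits((int64_t)pool_off - (int64_t)insn_off, bits);
}
```

With a pool at the end of a typical 3-4k TB, pool_reachable(0, 4096, 16) holds. A signed 21-bit field stops just short of 1 MiB, so disp_fits(2 * 1024 * 1024, 21) is false, which is the kind of distance a separate mapping could easily exceed.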

TCGConstant is another alternative I would consider as okay. distinct from TCGTemp of type TEMP_CONST which is heavier weight. it makes one wonder about reification of large implicit constants as opposed to the explicitly emitted ones we are talking about here.

TCGConstant isn't bad, but I think I prefer TCGPoolData as mooted before.

i'm looking at a TCG source-compatible code generator as an option so I may experiment locally. it is a private interface at the moment anyhow. that just seemed inconsistent as most structure definitions are in the header. but I understand it is a private interface.

The organization of tcg.h is from antiquity. I am actively trying to reduce the size of the exported API.


r~
