On 2/16/25 00:00, Michael Clark wrote:
On 2/16/25 06:58, Richard Henderson wrote:
the label member is merely a pointer to the instruction text to
be updated with the relative address of the constant, the primary
data is the constant data pool at the end of translation blocks.
this relates more closely to .data sections in offline codegen
if we were to imagine a translation block has .text and .data.
No, it doesn't. It relates most closely to data emitted within .text, accessed via
pc-relative instructions with limited offsets.
This isn't a thing you'd have ever seen on x86 or x86_64, but it is quite common for
arm32 (12-bit offsets), sh4 (8-bit offsets), m68k (16-bit offsets) and such. Because
the offsets are so small, they could even be placed *within* functions, not just between
them.
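
As a rough illustration of data emitted after the code and reached through a pc-relative displacement, here is a minimal sketch. It is not QEMU's implementation: PoolEntry and pool_finalize are hypothetical names, and it assumes an x86_64-style 32-bit displacement measured from the end of the instruction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record for one pooled constant (not a QEMU type). */
typedef struct PoolEntry {
    size_t patch_off;  /* offset of the 4-byte displacement field in buf */
    size_t insn_end;   /* offset just past the instruction using it */
    uint64_t value;    /* the constant to place in the pool */
} PoolEntry;

/*
 * Append the constant after the code, 8-byte aligned, and patch the
 * displacement so the instruction can load it pc-relative.
 * Returns the new end offset of the buffer contents.
 */
static size_t pool_finalize(uint8_t *buf, size_t tail, const PoolEntry *e)
{
    size_t pool_off = (tail + 7) & ~(size_t)7;
    int32_t disp = (int32_t)(pool_off - e->insn_end);

    memset(buf + tail, 0, pool_off - tail);               /* alignment padding */
    memcpy(buf + pool_off, &e->value, sizeof(e->value));  /* the constant */
    memcpy(buf + e->patch_off, &disp, sizeof(disp));      /* the relocation */
    return pool_off + sizeof(e->value);
}
```

The caller is assumed to have sized buf for the code plus the aligned pool; a host with a narrower displacement field would also need a range check before patching.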
I mentioned before I like the idea and have thought about architectures with constant
streams and constant branch units.
say for argument's sake we considered it 'TCData' with embedded label and reloc (the
purpose is the constant after all, just it is not a TCGTemp, it's an explicitly
reified constant in the codegen emitters). wondering if we could add a "disposition" field
to control placement. TCG_DISP_TEXT_TB, TCG_DISP_DATA, etc. this way you could ask the
code generator to do something more conventional while still supporting the short relative
constant islands. "disposition" might be better than section as a name. also a DATA
section could be mmap R without X perms to lessen the risk of injecting code as constants.
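
To make the proposal concrete, a minimal sketch of what such a field might look like follows. Only the names TCData, TCG_DISP_TEXT_TB and TCG_DISP_DATA come from the text above; the enum type name, the struct layout and the tcdata_prot helper are assumptions made for illustration, and none of this exists in TCG.

```c
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical placement choices for a pooled constant. */
typedef enum {
    TCG_DISP_TEXT_TB,  /* constant island at the end of the TB */
    TCG_DISP_DATA,     /* separate data area, mapped without X */
} TCGDisposition;

/* Hypothetical layout: label, reloc info, the constant, and its placement. */
typedef struct TCData {
    void *label;          /* instruction text to patch */
    int rtype;            /* relocation type */
    intptr_t addend;      /* relocation addend */
    uint64_t value;       /* the constant itself */
    TCGDisposition disp;  /* where the constant should live */
} TCData;

/* Page protection a placement would ask for, once the data is written. */
static int tcdata_prot(TCGDisposition disp)
{
    return disp == TCG_DISP_DATA ? PROT_READ : (PROT_READ | PROT_EXEC);
}
```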
I don't think there's any point to doing anything differently than we currently do: place
the data at the end of the TB.
(1) The architectures that we host and use the constant pool currently have
relatively large displacements: aarch64 (21 bit), x86_64 (32 bit),
ppc (16 or 34 bit (power10 only)), riscv (32 bit), s390x (34 bit).
(2) The size of a TB pretty generally maxes out at 3-4k, but is firmly capped
at 64k by uint16_t TranslationBlock.jmp_reset_offset.
(3) The 16 and 21-bit offsets are not large enough to stretch to a read-only
mapping.
(4) Memory management of TranslationBlocks becomes *much* more complicated.
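
The reach arithmetic behind points (1) to (3) can be sketched as below. This is an illustration only: it treats each width as a plain signed byte displacement and ignores any per-instruction scaling or base-register tricks. The helper names, and the 16 MiB distance in the usage note, are assumptions rather than figures from QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Forward reach in bytes of a signed displacement field; bits in 1..63. */
static int64_t disp_reach(unsigned bits)
{
    return (int64_t)1 << (bits - 1);
}

/* True if disp is representable in a signed field of the given width. */
static bool disp_fits_signed(int64_t disp, unsigned bits)
{
    int64_t limit = disp_reach(bits);
    return disp >= -limit && disp < limit;
}
```

With these helpers, a 4 KiB distance (a typical TB) fits in all of the widths listed, while a separate mapping assumed to sit 16 MiB away is out of range for the 16- and 21-bit forms and in range for the 32- and 34-bit ones.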
TCGConstant is another alternative I would consider as okay. distinct from TCGTemp of type
TEMP_CONST which is heavier weight. it makes one wonder about reification of large
implicit constants as opposed to the explicitly emitted ones we are talking about here.
TCGConstant isn't bad, but I think I prefer TCGPoolData as mooted before.
i'm looking at a TCG source-compatible code generator as an option so I may experiment
locally. it is a private interface at the moment anyhow. that just seemed inconsistent as
most structure definitions are in the header. but I understand it is a private interface.
The organization of tcg.h is from antiquity. I am actively trying to reduce the size of
the exported API.
r~