erichkeane added a comment. In D112349#3107460 <https://reviews.llvm.org/D112349#3107460>, @ibookstein wrote:
> I'll first explain my thought process about the representation of aliases and > ifuncs in the IR, and why I think both aliasees and resolvers must always be > defined; I hope I'm not completely off track and would love it if @MaskRay > could weigh in as to whether I make sense. > Let's start at the level of the object file; My understanding is that aliases > apply to ELF and MachO, and ifuncs apply only to ELF. I'm not at all > acquainted with MachO, but on the ELF side, my understanding is that: > > 1. Aliases are simply lowered to additional symbols with the same st_value as > their aliasee. As long as the value of a symbol has to be concrete/numeric > and cannot express a way to refer to another symbol, for aliases to make > sense and have the correct semantics at this level, their aliasee must be > defined at the IR level. Otherwise all you're left with at the object file > level is an undefined symbol with no way to express that it 'wants to' alias > an external symbol with some specified name. In other words, symbols are > either undefined (st_shndx == 0, st_value meaningless) or defined (st_shndx > != 0, st_value meaningful and holds a section offset). If we were to allow > aliases to have undefined aliasees, they would decay to simple undefined > symbols and lose their aliasee information. > 2. IFuncs are lowered to specially typed symbols whose st_value is the > resolver. In much the same way as aliases, for this to actually have any > meaning, the resolver must be defined (because you have no way to specify > "the value is in another castle named 'XYZ'", only "defined at offset X" or > "undefined"). When we allow ifuncs to have undefined resolvers, they decay to > simple undefined symbols with the additional wart of having a special symbol > type, but the desired resolver name is lost. Concretely, as long as the > linker doesn't throw a fuss at said wart, for the references against that > symbol from within the object file this will behave like a simple undefined > external function. Because in your implementation one TU will have a > cpu_dispatch and therefore a defined resolver, it will 'win' and > intra-EXE/intra-DSO references against the ifunc will indeed be bound against > the return value of the resolver. If no translation unit in the EXE/DSO had > an ifunc with the same name and a defined resolver, you'd end up with a > peculiar undefined symbol of type ifunc in the EXE/DSO (same as the .o). > > It is my conclusion therefore that ifuncs with undefined resolvers behave > exactly like function declarations (and lose the name of the resolver), as > long as the linker is willing to accept such weird symbols. > Therefore, at the IR level, they're representational slack at best, and don't > do what you want (possibly binding against a differently-named resolver) at > worst, so they should not be allowed. > >> My understanding is the frontend's semantic rules are/were different from >> the IR rules, which is why we implemented it that way. > > As I understand it, features like aliases and ifuncs consist mostly of > vertical plumbing to expose low-level object-file semantics, and their design > must be informed by them. > >> I'm not sure what you mean here? Are you suggesting that an undefined >> resolver should instead just implement an undefined 'function' for the >> .ifunc? This doesn't seem right? Wouldn't that cause ODR issues? > > As I understand it, making the symbol your current implementation calls > "x.ifunc" a function **declaration** which gets upgraded to an ifunc with a > defined resolver on encountering cpu_dispatch would yield the correct > behavior. > >> I guess I still don't understand what the practical limitation that requires >> ifuncs to have a defined resolver? The resolver is just a normal function, >> so it seems to me that allowing them to have normal linking rules makes >> sense? I personally think this is the least obtrusive change; this patch is >> adding a limitation that didn't exist previously unnecessarily. > > I //think// I've addressed this in the wall of text above > >> It just seems odd I guess to name a function .ifunc, and not have it be an >> ifunc? What does our linker think about that? > > Ah, the name is just a name :) > As far as the linker is concerned, it encounters an object file with an > undefined symbol (of type STT_NOTYPE) and an object file with a defined > symbol with the same name, of type STT_GNU_IFUNC. It will bind references in > the former against the definition in the latter. > Here's my trying it out: > > itay@CTHULHU ~/tmp/ifuncdecl/tu> cat main.c > int foo(void); > int main() { return foo(); } > > itay@CTHULHU ~/tmp/ifuncdecl/tu> cat foo.c > static int foo_impl(void) { return 42; } > static void *foo_resolver(void) { return &foo_impl; } > int foo(void) __attribute__((ifunc("foo_resolver"))); > > itay@CTHULHU ~/tmp/ifuncdecl/tu> clang-14 -c main.c -o main.c.o > itay@CTHULHU ~/tmp/ifuncdecl/tu> clang-14 -c foo.c -o foo.c.o > itay@CTHULHU ~/tmp/ifuncdecl/tu> clang-14 main.c.o foo.c.o -o main > itay@CTHULHU ~/tmp/ifuncdecl/tu> ./main > itay@CTHULHU ~/tmp/ifuncdecl/tu [42]> > itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU > main.c.o --symbols | grep foo > 4: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND foo > itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU > foo.c.o --symbols | grep foo > 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo.c > 3: 0000000000000000 16 FUNC LOCAL DEFAULT 2 foo_resolver > 4: 0000000000000010 11 FUNC LOCAL DEFAULT 2 foo_impl > 5: 0000000000000000 16 IFUNC GLOBAL DEFAULT 2 foo > itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU > main --symbols | grep foo > 35: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo.c > 36: 0000000000401150 16 FUNC LOCAL DEFAULT 13 foo_resolver > 37: 0000000000401160 11 FUNC LOCAL DEFAULT 13 foo_impl > 56: 0000000000401150 16 IFUNC GLOBAL DEFAULT 13 foo > itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU > main --relocations > > Relocation section '.rela.dyn' at offset 0x3d0 contains 2 entries: > Offset Info Type Symbol's Value > Symbol's Name + Addend > 0000000000403ff0 0000000100000006 R_X86_64_GLOB_DAT 0000000000000000 > __libc_start_main@GLIBC_2.2.5 + 0 > 0000000000403ff8 0000000200000006 R_X86_64_GLOB_DAT 0000000000000000 > __gmon_start__ + 0 > > Relocation section '.rela.plt' at offset 0x400 contains 1 entries: > Offset Info Type Symbol's Value > Symbol's Name + Addend > 0000000000404018 0000000000000025 R_X86_64_IRELATIVE 401150 > > > >> Thats correct, these aren't 'aliases' or 'ifuncs' as far as the CFE is >> concerned; they are multiversioned functions. That 'Aliases' and 'ifunc' >> list in the CFE are the AST-constructs of those, not the IR constructs, so >> there is no reason to put the multiversioned thinks in that list, since they >> are implementation details. Emitting an error "invalid alias!"/etc for > > I see, makes sense, thanks for the explanation. From my perspective, an ifunc is just a linkable entity, as is a resolver. If the linker can merge symbols for the resolver (and an ifunc points to one), it seems to me to make a ton of sense to allow the resolver to be defined in another TU? I guess I feel the same way with an alias, why can't I just alias to a function in a different TU, so long as this is a linkable entity? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112349/new/ https://reviews.llvm.org/D112349 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits