https://sourceware.org/bugzilla/show_bug.cgi?id=22903
Bug ID: 22903
Summary: [AArch64] Insufficient veneer stub alignment
Product: binutils
Version: 2.31 (HEAD)
Status: UNCONFIRMED
Severity: enhancement
Priority: P2
Component: ld
Assignee: unassigned at sourceware dot org
Reporter: pexu at sourceware dot mail.kapsi.fi
Target Milestone: ---

Hi.

It is not currently possible to specify an alignment requirement for the generated veneer stubs (i.e. far calls for -fpic, -fpie etc. builds). Currently, the alignment for the stubs is 4 bytes. While this works just fine for the majority of systems, it works only because many requisite deeds have been done beforehand (with a hint of luck, too).

The problematic veneer template (aarch64_long_branch_stub in bfd/elfnn-aarch64.c) uses LDR to load the far address. The address itself is stored after the veneer code block, which does the address loading (via LDR/ADD) and branching. The template looks like this:

        ldr ip0, 1f        # <-- ip0, i.e. X16, i.e. 64-bit register
        adr ip1, #0
        add ip0, ip0, ip1
        br  ip0
    1:  .xword <address>

While the address is 8-byte aligned within the stub itself, it will be misaligned unless the veneer lands on an 8-byte (or more) aligned address.

The ARMv8-A ARM clearly states that unless an address is aligned to the size of the data element being accessed (i.e. N-bit accesses must be N-bit aligned), either an Alignment fault is generated or an unaligned access is performed. It is possible to disable the alignment check, and thus perform an unaligned access, via the system register bit SCTLR_ELx.A (as is the case for Linux, for example).

However, there is a small catch-22 that is well buried in the small details of the ARM. If stage 1 address translation is disabled (e.g. MMU disabled), the Device-nGnRnE memory type is assigned to all data accesses (or the address simply happens to be some type of Device memory, which is nothing unusual with SoCs). Unlike the Normal memory type, all accesses to any type of Device memory *must* be aligned, period.
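
To make the arithmetic concrete, here is a small Python sketch of where the literal ends up for a given stub address. It is not binutils code: the function names are made up for illustration, and the only facts it relies on are the ones in the template quoted above (four 4-byte instructions followed by an 8-byte .xword).

```python
# Sketch only: the constants mirror the stub template quoted above.
# The function names are hypothetical and not part of binutils.

INSN_SIZE = 4            # every A64 instruction is 4 bytes
STUB_INSNS = 4           # ldr, adr, add, br
LITERAL_SIZE = 8         # .xword <address>
LITERAL_OFFSET = INSN_SIZE * STUB_INSNS  # literal sits 16 bytes into the stub


def literal_address(stub_addr: int) -> int:
    """Address of the .xword literal for a stub placed at stub_addr."""
    return stub_addr + LITERAL_OFFSET


def literal_is_aligned(stub_addr: int) -> bool:
    """True if the 64-bit LDR from the literal is naturally aligned."""
    return literal_address(stub_addr) % LITERAL_SIZE == 0


def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


if __name__ == "__main__":
    # Stub addresses that a 4-byte stub alignment permits.
    for addr in (0x1000, 0x1004, 0x1008, 0x100C):
        print(hex(addr), hex(literal_address(addr)), literal_is_aligned(addr))
```

Since the literal offset (16) is itself a multiple of 8, the literal is aligned exactly when the stub is. A stub at 0x1004 puts the literal at 0x1014, which is not a multiple of 8, whereas rounding the stub address up to 0x1008 with align_up(0x1004, 8) would put it at 0x1018.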
So, if the code has to deal with a large memory area and is not able to use the MMU (say, it is not available or is still being set up), and thus no address translation is enabled, or if it uses the Device memory type for whatever reason, LD's current approach will generate code that is highly prone to intermittent failures. These could be difficult to track down (without proper JTAG tools), because no matter how well the user does their job, the generated code is the source of the failure.

Also, it should be understood that it would be overkill, and a highly complex task, to try to recover from this sort of exception (one must interpret the bytecode, then perform aligned access(es), maybe patch the bytecode etc.), while the proper thing to do is simply not to perform any unaligned accesses when such accesses are not possible.

Obviously, one can always just generate the long branches by hand, or maybe use static linking where possible, so this is not a roadblock by any means. As the subject is rather undocumented and there is apparently a patch readily available, this should be fixed. Perhaps there is no need to change the default alignment (without further study), but it should nevertheless be possible to change the alignment.

I hope I provided enough background information for this rare, but indeed curious case!