Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Thu, Jul 25, 2013 at 05:06:55AM +0200, Jakub Jelinek wrote: > On Wed, Jul 24, 2013 at 07:36:31PM +0200, Richard Biener wrote: > > >Make them callee saved means we need to change ld.so to > > >preserve them and we need to change unwind library to > > >support them. It is certainly doable. > > > > IMHO it was a mistake to not have any callee saved xmm register in the > > original abi - we should fix this at this opportunity. Loops with > > function calls are not that uncommon. > > I've raised that earlier already. One issue with that beyond having to > teach unwinders about this (dynamic linker if you mean only for the lazy PLT > resolving is only a matter of whether the dynamic linker itself has been > built with a compiler that would clobber those registers anywhere) is that > as history shows, the vector registers keep growing over time. > So if we reserve now either 8 or all 16 zmm16 to zmm31 registers as call > saved, do we save them as 512 bit registers, or say 1024 bit already? We shouldn't save them all as we would often need to unnecessarily save registers in leaf functions. I am fine with 8. In practice 4 should be enough for most use cases. > If just 512 bit, then when next time the vector registers grow in size (will > they?), would we have just low parts of the 1024 bits registers call saved > and upper half call clobbered (I guess that is the case for M$Win 64-bit ABI > now, just with 128 bit vs. more). > I do not think that 1024 bit registers will come in the next ten years. If they came, then call clobbered is better. Full 1024 bits would be used rarely, given that in most cases we would use them just to store 64-bit doubles. > But yeah, it would be nice to have some call saved ones. > > Jakub
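A small C illustration of the "loops with function calls" cost Richard Biener refers to (a sketch; the function names are made up for the example):

    #include <stddef.h>

    double bar(double);   /* opaque external call inside the loop */

    /* With the current psABI every xmm/ymm/zmm register is call-clobbered,
       so a compiler that keeps 'sum' (or a vectorized partial sum) in a
       vector register must spill it to the stack before calling bar() and
       reload it afterwards, on every iteration.  A call-saved vector
       register would let the value stay in a register across the call. */
    double foo(const double *a, size_t n)
    {
        double sum = 0.0;
        for (size_t i = 0; i < n; i++)
            sum += bar(a[i]);
        return sum;
    }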
Re: fatal error: gnu/stubs-32.h: No such file
On 07/24/2013 11:51 PM, David Starner wrote: > On Wed, Jul 24, 2013 at 8:50 AM, Andrew Haley wrote: >> Not at all: we're just disagreeing about what a real system with >> a real workload looks like. > > No, we aren't. We're disagreeing about whether it's acceptable to > enable a feature by default that breaks the compiler build half way > through with an obscure error message. No we aren't. I want that error message fixed too. A configure-time warning would be good. > Real systems need features that aren't enabled by default sometimes. I *totally* agree. >> It's a stupid thing to say anyway, because who is to say their >> system is more real than mine or yours? > > By that logic, you've already said that any system needing GNAT is > less real than others, because it's not enabled by default. Absolutely not: you're the one making claims about "real systems and real workloads". I made no such claims. Andrew.
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
2013/7/25 Ian Lance Taylor : > On Wed, Jul 24, 2013 at 4:36 PM, Roland McGrath wrote: >> >> Will an MPX-using binary require an MPX-supporting dynamic linker to run >> correctly? >> >> * An old dynamic linker won't clobber %bndN directly, so that's not a >> problem. > > These are my answers and likely incorrect. Hi, I want to add some comments to your answers. > > It will clobber the registers indirectly, though, as soon as it > executes a branching instruction. The effect will be that calls from > bnd-checked code to bnd-checked code through the dynamic linker will > not succeed. I would not say that the call will fail. Some bound info will just be lost. MPX binaries should still work correctly with an old dynamic linker. The problem here is that when you decrease the level of MPX support (use a legacy dynamic linker and legacy libraries) you decrease the quality of bound violation detection. BTW, if the new PLT section is used, then the table fixup after the first call will lead to correct bounds transfer in subsequent calls. > > I have not yet seen the changes this will require to the ABI, but I'm > making the natural assumptions: the first four pointer arguments to a > function will be associated with a pair of bound registers, and > similarly for a returned pointer. I don't know what the proposal is > for struct parameters and return values. The general idea is to use bound registers for pointers passed in registers. It does not matter whether this pointer is part of a structure. BND0 is used to return bounds for the returned pointer. Of course, there are some more details (e.g. when more than 4 pointers are passed in registers or when a vararg call is made). > > >> * Does having the bounds registers set have any effect on regular/legacy >> code, or only when bndc[lun] instructions are used? > > As far as I can tell, only when the bndXX instructions are used, > though I'd be happy to hear otherwise. As usual, new registers affect context save/restore instructions. > > >> If it doesn't affect normal instructions, then I don't entirely >> understand why it would matter to clear %bnd* when entering or leaving >> legacy code. Is it solely for the case of legacy code returning a >> pointer value, so that the new code would expect the new ABI wherein >> %bnd0 has been set to correspond to the pointer returned in %rax? > > There is no problem with clearing the bnd registers when calling in or > out of legacy code. The issue is avoiding clearing the pointers when > calling from bnd-enabled code to bnd-enabled code. When legacy code returns a pointer we need to clear at least BND0 to avoid wrong bounds for the returned pointer. We may also have a call sequence MPX code -> legacy code -> MPX code. In such a case we have to clear all bound registers before calling MPX code from legacy code. Otherwise the nested MPX code gets wrong bounds. Thanks, Ilya > > >> * What's the effect of entering the dynamic linker via "bnd jmp" >> (i.e. new MPX-using binary with new PLT, old dynamic linker)? The old >> dynamic linker will leave %bndN et al exactly as they are, until its >> first unadorned branching instruction implicitly clears them. So the >> only problem would be if the work _dl_runtime_{resolve,profile} does >> before its first branch/call were affected by the %bndN state. > > "It's not a problem." > >> In a related vein, what's the effect of entering some legacy code via >> "bnd jmp" (i.e. new binary using PLT call into legacy DSO)? >> >> * If the state of %bndN et al does not affect legacy code directly, then >> it's not a problem. 
The legacy code will eventually use an unadorned >> branch instruction, and that will implicitly clear %bnd*. (Even if >> it's a leaf function that's entirely branch-free, its return will >> count as such an unadorned branch instruction.) > > Yes. > >> * If that's not the case, > > It is the case. > >> I can't tell if you are proposing that a single object might contain >> both 16-byte and 32-byte PLT slots next to each other in the same .plt >> section. That seems like a bad idea. I can think of two things off >> hand that expect PLT entries to be of uniform size, and there may well >> be more. >> >> * The foo@plt pseudo-symbols that e.g. objdump will display are based on >> the BFD backend knowing the size of PLT entries. Arguably this ought >> to look at sh_entsize of .plt instead of using baked-in knowledge, but >> it doesn't. > > This seems fixable. Of course, we could also keep the PLT the same > length by changing it. The current PLT entries are > > jmpq *GOT(sym) > pushq offset > jmpq plt0 > > The linker or dynamic linker initializes *GOT(sym) to point to the > second instruction in this sequence. So we can keep the PLT at 16 > bytes by simply changing it to jump somewhere else. > > bnd jmpq *GOT(sym) > .skip 9 > > We have the linker or dynamic linker fill in *GOT(sym) to point to the > second PLT table. When the dynamic linker i
Re: Intel® Memory Protection Extensions support in the GCC
On 07/24/2013 05:58 PM, Zamyatin, Igor wrote: Hi All! This is to let you know that enabling of Intel® MPX technology (see details in http://download-software.intel.com/sites/default/files/319433-015.pdf) in GCC has been started. (Corresponding changes in binutils are here - http://sourceware.org/ml/binutils/2013-07/msg00233.html) Thanks, this is interesting. Can userspace update the translation tables for bounds? Are the bounds stored in Bound Table Entries relative to the starting linear address of pointer (LAp) or absolute? The former would allow sharing bound table pages for different pages having memory objects of the same size (which happens with some malloc implementations). -- Florian Weimer / Red Hat Product Security Team
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Wed, Jul 24, 2013 at 9:52 PM, Ondřej Bílka wrote: > On Wed, Jul 24, 2013 at 08:25:14AM -1000, Richard Henderson wrote: >> On 07/24/2013 05:23 AM, Richard Biener wrote: >> > "H.J. Lu" wrote: >> > >> >> Hi, >> >> >> >> Here is a patch to extend x86-64 psABI to support AVX-512: >> > >> > Afaik avx 512 doubles the amount of xmm registers. Can we get them callee >> > saved please? >> >> Having them callee saved pre-supposes that one knows the width of the >> register. >> >> There's room in the instruction set for avx1024. Does anyone believe that is >> not going to appear in the next few years? >> > It would be mistake for intel to focus on avx1024. You hit diminishing > returns and only few workloads would utilize loading 128 bytes at once. > Problem with vectorization is that it becomes memory bound so you will > not got much because performance is dominated by cache throughput. > > You would get bigger speedup from more effective pipelining, more > fusion... ISTR that one of the main reason "long" vector ISA's did so well on some workloads was not that the vector length was big, per se, but rather that the scatter/gather instructions these ISA's typically have allowed them to extract much more parallelism from the memory subsystem. The typical example being sparse matrix style problems, but I suppose other types of problems with indirect accesses could benefit as well. Deeper OoO buffers would in principle allow the same memory level parallelism extraction, but those apparently have quite steep power and silicon area cost scaling (O(n**2) or maybe even O(n**3)), making really deep buffers impractical. And, IIRC scatter/gather instructions are featured as of some recent-ish AVX-something version. That being said, maybe current cache-based memory subsystems are different enough from the vector supercomputers of yore that the above doesn't hold to the same extent anymore.. -- Janne Blomqvist
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Thu, Jul 25, 2013 at 03:17:43PM +0300, Janne Blomqvist wrote: > On Wed, Jul 24, 2013 at 9:52 PM, Ondřej Bílka wrote: > > On Wed, Jul 24, 2013 at 08:25:14AM -1000, Richard Henderson wrote: > >> On 07/24/2013 05:23 AM, Richard Biener wrote: > >> > "H.J. Lu" wrote: > >> > > >> >> Hi, > >> >> > >> >> Here is a patch to extend x86-64 psABI to support AVX-512: > >> > > >> > Afaik avx 512 doubles the amount of xmm registers. Can we get them > >> > callee saved please? > >> > >> Having them callee saved pre-supposes that one knows the width of the > >> register. > >> > >> There's room in the instruction set for avx1024. Does anyone believe that > >> is > >> not going to appear in the next few years? > >> > > It would be mistake for intel to focus on avx1024. You hit diminishing > > returns and only few workloads would utilize loading 128 bytes at once. > > Problem with vectorization is that it becomes memory bound so you will > > not got much because performance is dominated by cache throughput. > > > > You would get bigger speedup from more effective pipelining, more > > fusion... > > ISTR that one of the main reason "long" vector ISA's did so well on > some workloads was not that the vector length was big, per se, but > rather that the scatter/gather instructions these ISA's typically have > allowed them to extract much more parallelism from the memory > subsystem. The typical example being sparse matrix style problems, but > I suppose other types of problems with indirect accesses could benefit > as well. Deeper OoO buffers would in principle allow the same memory > level parallelism extraction, but those apparently have quite steep > power and silicon area cost scaling (O(n**2) or maybe even O(n**3)), > making really deep buffers impractical. > > And, IIRC scatter/gather instructions are featured as of some > recent-ish AVX-something version. That being said, maybe current > cache-based memory subsystems are different enough from the vector > supercomputers of yore that the above doesn't hold to the same extent > anymore.. > Also this depends on how many details Intel got right. One example is the pmovmsk instruction. It is trivial to implement in silicon and gives an advantage over other architectures. When a problem is 'find elements in an array that satisfy some expression', then without pmovmsk or an equivalent, finding what changed is relatively expensive. One problem is that, depending on the profile, you may spend the majority of time on small sizes. So you need to have effective branches for these sizes (gcc does not handle that well yet). Then you get the problem that it increases icache pressure. Another problem is that you often could benefit from vector instructions if you could read/write more memory. Reading can be done inexpensively by checking whether it crosses a page; writing data is a problem, so we take a suboptimal path and write only the data that changed. This could also be solved technologically if a masked move instruction performed only the memory accesses that changed and thus avoided possible race conditions in the unchanged parts. > > -- > Janne Blomqvist
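For readers unfamiliar with the pmovmsk trick Ondřej mentions, here is a minimal sketch using SSE2 intrinsics (the helper name is made up): the per-byte compare result is collapsed into a scalar bitmask, so a single bit scan answers "which element matched".

    #include <emmintrin.h>

    /* Return the index of the first byte equal to 'c' within a 16-byte block,
       or -1 if none match.  PCMPEQB produces 0xFF per matching byte and
       PMOVMSKB packs the sign bits into an ordinary integer, so the
       "find elements that satisfy some expression" question becomes one
       __builtin_ctz on a scalar mask. */
    static int first_match16(const unsigned char *p, unsigned char c)
    {
        __m128i needle = _mm_set1_epi8((char) c);
        __m128i data   = _mm_loadu_si128((const __m128i *) p);
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(data, needle));
        return mask ? __builtin_ctz(mask) : -1;
    }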
Re: Intel(R) Memory Protection Extensions support in the GCC
2013/7/25 Florian Weimer : > On 07/24/2013 05:58 PM, Zamyatin, Igor wrote: >> >> Hi All! >> >> This is to let you know that enabling of Intel® MPX technology (see >> details in >> http://download-software.intel.com/sites/default/files/319433-015.pdf) in >> GCC has been started. (Corresponding changes in binutils are here - >> http://sourceware.org/ml/binutils/2013-07/msg00233.html) > > > Thanks, this is interesting. > > Can userspace update the translation tables for bounds? Are the bounds > stored in Bound Table Entries relative to the starting linear address of > pointer (LAp) or absolute? The former would allow sharing bound table pages > for different pages having memory objects of the same size (which happens > with some malloc implementations). Hi Florian, Do you mean 'Bounds Directory' when you say 'translation tables'? If yes, then you should be able to access it by getting its address from the BNDCFGU register. It is not clear how Bound Tables may be shared. Bound Tables are used to hold bounds for pointers stored in memory, not for objects allocated in memory. Thanks, Ilya > > > -- > Florian Weimer / Red Hat Product Security Team
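A rough C-level sketch of the distinction Ilya draws, based on the cited spec (the variable names are illustrative, and the BNDSTX/BNDLDX pairing is described only conceptually): the bound tables are keyed by the address at which a pointer value is stored, not by the object it points to.

    #include <stdlib.h>

    char *slots[64];   /* some table of pointers living in memory */

    void example(void)
    {
        char *p = malloc(100);   /* bounds [p, p+99] travel in a bound register */

        /* Storing the pointer spills its bounds: conceptually a BNDSTX keyed
           by the address &slots[3], i.e. where the pointer value is stored. */
        slots[3] = p;

        /* Loading it back reloads the bounds via BNDLDX, again keyed by
           &slots[3].  Nothing in the bound table describes the 100-byte
           object itself; it only describes this stored copy of the pointer. */
        char *q = slots[3];
        (void) q;
        free(p);
    }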
RE: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
Hi, This got lost in our site-consolidation efforts. We are working to make it active again. Will update the community soon. Regards Ganesh From: Joseph Myers [jos...@codesourcery.com] Sent: Tuesday, July 23, 2013 2:57 PM To: H.J. Lu Cc: GNU C Library; GCC Development; Binutils; Girkar, Milind; Kreitzer, David L; Gopalasubramanian, Ganesh Subject: Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512 On Tue, 23 Jul 2013, H.J. Lu wrote: > Here is a patch to extend x86-64 psABI to support AVX-512: I have no comments on this patch for now - but where is the version control repository we should use for the ABI source code, since x86-64.org has been down for some time? (I've also CC:ed the last person from AMD to post to gcc-patches, in the hope that they have the right contacts to get x86-64.org - website, mailing lists, version control - brought back up again.) -- Joseph S. Myers jos...@codesourcery.com
Re: Intel(R) Memory Protection Extensions support in the GCC
On 07/25/2013 03:50 PM, Ilya Enkovich wrote: Do you mean 'Bounds Directory' when you say 'translation tables'? If yes, then you should be able to access it by getting its address from the BNDCFGU register. Good to know. It is not clear how Bound Tables may be shared. Bound Tables are used to hold bounds for pointers stored in memory, not for objects allocated in memory. Oh. I think I misread the specification then. Obviously, this supports more precise checking, covering pointer provenance and intra-object overflow checks. I'm worried that this adds quite a bit of memory overhead, but I guess I'll have to wait and see. -- Florian Weimer / Red Hat Product Security Team
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On Thu, Jul 25, 2013 at 4:08 AM, Ilya Enkovich wrote: > 2013/7/25 Ian Lance Taylor : >> On Wed, Jul 24, 2013 at 4:36 PM, Roland McGrath wrote: >>> >>> Will an MPX-using binary require an MPX-supporting dynamic linker to run >>> correctly? >>> >>> * An old dynamic linker won't clobber %bndN directly, so that's not a >>> problem. >> >> These are my answers and likely incorrect. > > Hi, > > I want to add some comments to your answers. > >> >> It will clobber the registers indirectly, though, as soon as it >> executes a branching instruction. The effect will be that calls from >> bnd-checked code to bnd-checked code through the dynamic linker will >> not succeed. > > I would not say that the call will fail. Some bound info will just be > lost. MPX binaries should still work correctly with an old dynamic > linker. The problem here is that when you decrease the level of MPX > support (use a legacy dynamic linker and legacy libraries) you decrease > the quality of bound violation detection. BTW, if the new PLT section is used > then the table fixup after the first call will lead to correct bounds > transfer in subsequent calls. To make it clear, the sequence is: MPX code -> PLT -> ld.so -> PLT -> MPX library. If ld.so doesn't preserve bound registers, they will be cleared, which means the lower bound is 0 and the upper bound is -1 (MAX) when the MPX library is reached. The MPX library will work correctly, but without MPX protections on pointers passed in registers. -- H.J.
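To illustrate what H.J. describes (a hedged sketch, not from the thread): with the INIT bounds of lower 0 / upper -1, every access passes the check, so the code keeps running but the out-of-bounds store below is no longer caught.

    #include <stdlib.h>

    /* Suppose callee() lives in an MPX-built DSO and is reached through the
       PLT.  With an MPX-aware ld.so the bounds [buf, buf+15] arrive in a
       bound register and the store below raises a #BR fault.  With an old
       ld.so the bound register has been reset to INIT (lower 0, upper -1,
       i.e. MAX), so the same store passes the bndcl/bndcu checks unnoticed. */
    void callee(char *buf)
    {
        buf[32] = 'x';   /* out of bounds for the 16-byte allocation */
    }

    int main(void)
    {
        char *buf = malloc(16);
        callee(buf);
        free(buf);
        return 0;
    }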
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
On Thu, Jul 25, 2013 at 08:55:38AM +0200, Ondřej Bílka wrote: > On Thu, Jul 25, 2013 at 05:06:55AM +0200, Jakub Jelinek wrote: > > On Wed, Jul 24, 2013 at 07:36:31PM +0200, Richard Biener wrote: > > > >Make them callee saved means we need to change ld.so to > > > >preserve them and we need to change unwind library to > > > >support them. It is certainly doable. > > > > > > IMHO it was a mistake to not have any callee saved xmm register in the > > > original abi - we should fix this at this opportunity. Loops with > > > function calls are not that uncommon. > > > > I've raised that earlier already. One issue with that beyond having to > > teach unwinders about this (dynamic linker if you mean only for the lazy PLT > > resolving is only a matter of whether the dynamic linker itself has been > > built with a compiler that would clobber those registers anywhere) is that > > as history shows, the vector registers keep growing over time. > > So if we reserve now either 8 or all 16 zmm16 to zmm31 registers as call > > saved, do we save them as 512 bit registers, or say 1024 bit already? > > We shouldn't save them all as we would often need to unnecessarily save > registers in leaf functions. I am fine with 8. In practice 4 should be > enough for most use cases. You can't add call-saved registers without breaking the ABI, because they need to be saved in the jmp_buf, which does not have space for them. Also, unless you add them at the same time the registers are added to the machine (so there's no existing code using those registers), you'll have ABI problems like this: a function using the new call-saved registers calls qsort, which calls application code, which assumes the registers are call-clobbered and clobbers them; after qsort returns, the original caller's state is gone. Adding call-saved registers to an existing psABI is just fundamentally misguided. Rich
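A concrete sketch of the qsort scenario Rich describes. This is hypothetical: it assumes zmm16-zmm31 were retroactively declared call-saved while already-built code still treats them as scratch.

    #include <stdlib.h>
    #include <immintrin.h>

    static int cmp(const void *a, const void *b)
    {
        /* Imagine this callback was compiled against the original psABI: it
           may freely clobber zmm16-zmm31 (e.g. by vectorizing something
           here), because under that ABI they were call-clobbered. */
        return (*(const int *) a > *(const int *) b)
             - (*(const int *) a < *(const int *) b);
    }

    void new_abi_caller(double *data /* >= 8 elements */, int *keys, size_t n)
    {
        __m512d acc = _mm512_loadu_pd(data);   /* kept live across the call,
                                                  in a "call-saved" zmm reg   */
        qsort(keys, n, sizeof *keys, cmp);     /* qsort itself preserves the
                                                  register, but the old-ABI
                                                  callback it invokes does not */
        _mm512_storeu_pd(data, acc);           /* acc may silently be garbage */
    }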
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On Wed, Jul 24, 2013 at 4:36 PM, Roland McGrath wrote: > I've read through the MPX spec once, but most of it is still not very > clear to me. So please correct any misconceptions. (HJ, if you answer > any or all of these questions in your usual style with just, "It's not a > problem," I will find you and I will kill you. Explain!) > > Will an MPX-using binary require an MPX-supporting dynamic linker to run > correctly? Yes. But you may lose MPX protection in the MPX library, since bound registers are cleared on the first call with lazy binding: MPX code -> PLT -> ld.so -> PLT -> MPX library > > Those are the background questions to help me understand better. > Now, to your specific questions. > > Now, assuming we are talking about a uniform PLT in each object, there > is the question of whether to use a new PLT layout everywhere, or only > when linking an object with some input files that use MPX. I am proposing the uniform PLT in each object. That was my first question. > * My initial reaction was to say that we should just change it > unconditionally to keep things simple: use new linker, get new format, > end of story. Simplicity is good. This is my thinking also. > * But, doubling the size of PLT entries means more i-cache pressure. If > cache lines are 64 bytes, then today you fit four entries into a cache > line. Assuming PLT entries are more used than unused, this is a good > thing. Reducing that to two entries per cache line means twice as > many i-cache misses if you hit a given PLT frequently (with even > distribution of which entries you actually use--at any rate, it's > "more" even if it's not "twice as many"). Perhaps this is enough cost > in real-world situations to be worried about. I really don't know. > > * As I mentioned before, there are things floating around that think > they know the size of PLT entries. Realistically, there will be > plenty of people using new tools to build binaries but not using MPX > at all, and these people will give those binaries to people who have > old tools. In the case of someone running an old objdump on a new > binary, they would see bogus foo@plt pseudo-symbols and be misled and > confused. Not to mention the unknown unknowns, i.e. other things that > "know" the size of PLT entries that we don't know about or haven't > thought of here. It's just basic conservatism not to perturb things > for these people who don't care about or need anything related to MPX > at all. We can investigate whether the old objdump can deal with the PLT entry size change. > How a relocatable object is marked so that the linker knows whether its > code is MPX-compatible at link time and how a DSO/executable is marked > so that the dynamic linker knows at runtime are two separate subjects. > > For relocatable objects, I don't think there is really any precedent for > using ELF notes to tell the linker things. It seems much nicer if the We have been using the .note.GNU-stack section at link-time for a long time. > linker continues to treat notes completely normally, i.e. appending > input files' same-named note sections together like with any other named > section rather than magically recognizing and swallowing certain notes. > OTOH, the SHT_GNU_ATTRIBUTES mechanism exists for exactly this sort of > purpose and is used on other machines for very similar sorts of issues. 
> There is both precedent and existing code in binutils to have the linker > merge attribute sections from many input files together in a fashion > aware of the semantics of those sections, and to have those attributes > affect the linker's behavior in machine-specific ways. I think you have > to make a very strong case to use anything other than SHT_GNU_ATTRIBUTES > for this sort of purpose in relocatable objects. > > For linked objects, there a couple of obvious choices. They all require > that the linker have special knowledge to create the markings. One > option is a note. We use .note.ABI-tag for a similar purpose in libc, > but I don't know of any precedent for the linker synthesizing notes. > The most obvious choice is e_flags bits. That's what other machines use > to mark ABI variants. There are no bits assigned for x86 yet. There > are obvious limitations to using e_flags, in that it's part of the > universal ELF psABI rather than something with vendor extensibility > built in like notes have, and in that there are only 32 bits available > to assign rather than being a wholly open-ended format like notes. But > using e_flags is certainly simpler to synthesize in the linker and > simpler to recognize in the dynamic linker than a note format. I think > you have to make at least a reasonable (objective) case to use a note > rather than e_flags, though I'm certainly not firmly against a note. My main concerns are e_flags isn't very extensible and the old tools may not be able to handle it properly. A note section is backward compatible. Given that MP
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On Wed, Jul 24, 2013 at 5:23 PM, Ian Lance Taylor wrote: >> * The foo@plt pseudo-symbols that e.g. objdump will display are based on >> the BFD backend knowing the size of PLT entries. Arguably this ought >> to look at sh_entsize of .plt instead of using baked-in knowledge, but >> it doesn't. > > This seems fixable. Of course, we could also keep the PLT the same > length by changing it. The current PLT entries are > > jmpq *GOT(sym) > pushq offset > jmpq plt0 > > The linker or dynamic linker initializes *GOT(sym) to point to the > second instruction in this sequence. So we can keep the PLT at 16 > bytes by simply changing it to jump somewhere else. > > bnd jmpq *GOT(sym) > .skip 9 > > We have the linker or dynamic linker fill in *GOT(sym) to point to the > second PLT table. When the dynamic linker is involved, we use another > DT tag to point to the second PLT. The offsets are consistent: there > is one entry in each PLT table, so the dynamic linker can compute the > right value. Then in the second PLT we have the sequence > > pushq offset > bnd jmpq plt0 > > That gives the dynamic linker the offset that it needs to update > *GOT(sym) to point to the runtime symbol value. So we get slightly > worse instruction cache handling the first time a function is called, > but after that we are the same as before. And PLT entries are the > same size as always so everything is simpler. > > The special DT tag will tell the dynamic linker to apply the special > processing. No attribute is needed to change behaviour. The issue > then is: a program linked in this way will not work with an old > dynamic linker, because the old dynamic linker will not initialize > GOT(sym) to the right value. That is a problem for any scheme, so I > think that is OK. But if that is a concern, we could actually handle > by generating two PLTs. One conventional PLT, and another as I just > outlined. The linker branches to the new PLT, and initializes > GOT(sym) to point to the old PLT. The dynamic linker spots this > because it recognizes the new DT tags, and cunningly rewrites the GOT > to point to the new PLT. Cost is an extra jump the first time a > function is called when using the old dynamic linker. > I don't like the complexity. I believe extending the PLT entry to 32 bytes works with the old ld.so. If we are willing to have mixed PLT entries, we merge 2 16-byte PLT entries into one super 32-byte PLT entry so that we can have:

  jmpq *name@GOTPCREL(%rip)
  pushq $index
  jmpq PLT0

  bnd jmpq *name@GOTPCREL(%rip)
  pushq $index
  bnd jmpq PLT0
  nop paddings

  jmpq *name@GOTPCREL(%rip)
  pushq $index
  jmpq PLT0

We can also have new link-time relocations for branches with a BND prefix and only create the super PLT entries when needed. Of course, unwind info may be incorrect for both approaches if we don't find a way to fix it. -- H.J.
gcc-4.8-20130725 is now available
Snapshot gcc-4.8-20130725 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20130725/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.8 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 201255 You'll find: gcc-4.8-20130725.tar.bz2 Complete GCC MD5=e21f259bc4c44e61e19a780ad5badfeb SHA1=d6f611012ae432b0a7c4c1ab6472d854ed2ba5cc Diffs from 4.8-20130718 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.8 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
INCOMING_RETURN_ADDR_RTX
I am getting a crash with my backend when running arbitrary code with -g. Apparently this is because the compiler aborts at dwarf2cfi.c:2852 (GCC 4.8.1-release), because initial_return_save (INCOMING_RETURN_ADDR_RTX); INCOMING_RETURN_ADDR_RTX is undefined. The documentation states "You only need to define this macro if you want to support call frame debugging information like that provided by DWARF 2.". We can't support frame debugging right now (at least I think we can't), I need to investigate that. In any case the documentation sounds more like you don't need to define this macro for your target. In order to disable this feature, do I also need to disable some frame unwind info macros? Thanks, Regards, Hendrik Greving
Re: INCOMING_RETURN_ADDR_RTX
I am reaching this code like this:

  (gdb) p targetm.debug_unwind_info ()
  $1 = UI_DWARF2
  (gdb) p targetm_common.except_unwind_info (&global_options)
  $2 = UI_SJLJ

On Thu, Jul 25, 2013 at 3:57 PM, Hendrik Greving wrote: > I am getting a crash with my backend when running arbitrary code with > -g. Apparently this is because the compiler aborts at dwarf2cfi.c:2852 > (GCC 4.8.1-release), because > > initial_return_save (INCOMING_RETURN_ADDR_RTX); > > INCOMING_RETURN_ADDR_RTX is undefined. > > The documentation states "You only need to define this macro if you > want to support call frame debugging information like that provided by > DWARF 2.". > > We can't support frame debugging right now (at least I think we > can't), I need to investigate that. In any case the documentation > sounds more like you don't need to define this macro for your > target. In order to disable this feature, do I also need to disable > some frame unwind info macros? > > Thanks, > Regards, > Hendrik Greving
Re: INCOMING_RETURN_ADDR_RTX
I found this email thread http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48459 It sounds like I should define TARGET_DEBUG_UNWIND_INFO and return UI_NONE for now? On Thu, Jul 25, 2013 at 3:57 PM, Hendrik Greving wrote: > I am getting a crash with my backend when running arbitrary code with > -g. Apparently this is because the compiler aborts at dwarf2cfi.c:2852 > (GCC 4.8.1-release), because > > initial_return_save (INCOMING_RETURN_ADDR_RTX); > > INCOMING_RETURN_ADDR_RTX is undefined. > > The documentation states "You only need to define this macro if you > want to support call frame debugging information like that provided by > DWARF 2.". > > We can't support frame debugging right now (at least I think we > can't), I need to investigate that. In any case the documentation > sounds more like you don't need to define this macro for your > target. In order to disable this feature, do I also need to disable > some frame unwind info macros? > > Thanks, > Regards, > Hendrik Greving
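For reference, a minimal sketch of what that would look like in a backend, assuming GCC 4.8-era target hook conventions ("mytarget" is a placeholder name):

    /* In mytarget.c, alongside the other target hook definitions.  The usual
       backend includes (config.h, system.h, coretypes.h, tm.h, target.h,
       target-def.h) are assumed to be in place already. */

    static enum unwind_info_type
    mytarget_debug_unwind_info (void)
    {
      return UI_NONE;   /* emit no DWARF CFI-based debug unwind info */
    }

    #undef TARGET_DEBUG_UNWIND_INFO
    #define TARGET_DEBUG_UNWIND_INFO mytarget_debug_unwind_info

    /* ... eventually followed by:  struct gcc_target targetm = TARGET_INITIALIZER; */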
Re: fatal error: gnu/stubs-32.h: No such file
On Thu, Jul 25, 2013 at 1:17 AM, Andrew Haley wrote: > On 07/24/2013 11:51 PM, David Starner wrote: >> On Wed, Jul 24, 2013 at 8:50 AM, Andrew Haley wrote: >>> Not at all: we're just disagreeing about what a real system with >>> a real workload looks like. >> >> No, we aren't. We're disagreeing about whether it's acceptable to >> enable a feature by default that breaks the compiler build half way >> through with an obscure error message. > > No we aren't. I want that error message fixed too. A configure- > time warning would be good. The obscurity of the error message is only part of the problem; the fact that it errors out halfway through a multi-hour build is also an issue. The question is, if it can't detect at compile time that this will fail, should GCC disable multilibs? -- Kie ekzistas vivo, ekzistas espero.
DejaGnu and toolchain testing
I was interested to watch the video of the DejaGnu BOF at the Cauldron. A few issues with DejaGnu for toolchain testing that I've noted but I don't think were covered there include:

* DejaGnu has a lot of hardcoded logic to try to find various files in a toolchain build directory. A lot of it is actually for very old toolchain versions (using GCC version 2 or older, for example). The first issue with this is that it doesn't belong in DejaGnu: the toolchain should be free to rearrange its build directories without needing changes to DejaGnu itself (which in practice means there's lots of such logic in the toolchain's own testsuites *as well*, duplicating the DejaGnu code to a greater or lesser extent). The second issue is that "make install" already knows where to find files in the build directory, and it would be better to move towards build-tree testing installing the toolchain in a staging directory and running tools from there, rather than needing any logic in the testsuites at all to enable bits of uninstalled tools to find other bits of uninstalled tools. (There might still be a few bits like setting LD_LIBRARY_PATH required. But the compiler command lines would be much simpler and much closer to how users actually use the compiler in practice.)

* Similarly, DejaGnu has hardcoded prune_warnings - and again GCC adds lots of its own prunes; it's not clear hardcoding this in DejaGnu is a particularly good idea either.

* Another piece of unfortunate hardcoding in DejaGnu is how remote-host testing uses "-o a.out" when running tools on the remote host - such a difference from how they are run on a local host results in lots of issues where a tool cares about the output file name in some way (e.g. to generate other output files).

* A key feature of QMTest that I like but I don't think got mentioned is that you can *statically enumerate the set of tests* without running them. That is, a testsuite has a well-defined set of tests, and that set does not depend on what the results of the tests are - whereas it's very easy and common for a DejaGnu test to have test names (the text after PASS: or FAIL: ) depending on whether the test passed or failed, or how the test passed or failed (no doubt the testsuite authors had reasons for doing this, but it conflicts with any automatic comparison of results). The QMTest model isn't wonderfully well-matched to toolchain testing - in toolchain testing, you can typically do a single indivisible test execution (e.g. compiling a file), which produces results for a large number of test assertions (tests for warnings on particular lines of that file), and QMTest expects one indivisible test execution to produce one result. But a model where a test can contain multiple assertions, and both tests and their assertions can be statically enumerated independent of their result, and where the results can be annotated by the testsuite (to deal with the purposes for which testsuites stick extra text on the PASS/FAIL line) certainly seems better than one that makes it likely the set of test assertions will vary in unpredictable ways.

* People in the BOF seemed happy with expect. I think expect has caused quite a few problems for toolchain testing. In particular, there are or have been too many places where expect likes to throw away input whose size exceeds some arbitrary limit and you need to hack around those by increasing the limits in some way. 
GCC tests can generate and test for very large numbers of diagnostics from a single test, and some binutils tests can generate megabytes of output from a tool (that are then matched against regular expressions etc.). -- Joseph S. Myers jos...@codesourcery.com