Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.
On Wed, Jan 22, 2014 at 1:20 PM, Bingfeng Mei wrote:
> Hi,
> I noticed a regression of 4.8 against the ancient 4.5 in vectorization
> on our port. After a bit of investigation, I found the following code,
> which prefers the even|odd version over the lo|hi one. That is obviously
> the right choice for AltiVec and maybe some other targets, but the
> even|odd versions (which expand to a series of instructions) are less
> efficient on our target than the lo|hi ones. Shouldn't there be a
> target-specific hook to make the choice instead of hard-coding it here,
> or some cost-estimation technique to compare the two alternatives?

Hmm, what's the reason for a target to support both? I think the idea was
that a target only supports either one (the more efficient case).

Richard.

>   /* The result of a vectorized widening operation usually requires
>      two vectors (because the widened results do not fit into one vector).
>      The generated vector results would normally be expected to be
>      generated in the same order as in the original scalar computation,
>      i.e. if 8 results are generated in each vector iteration, they are
>      to be organized as follows:
>        vect1: [res1,res2,res3,res4],
>        vect2: [res5,res6,res7,res8].
>
>      However, in the special case that the result of the widening
>      operation is used in a reduction computation only, the order doesn't
>      matter (because when vectorizing a reduction we change the order of
>      the computation).  Some targets can take advantage of this and
>      generate more efficient code.  For example, targets like Altivec,
>      that support widen_mult using a sequence of {mult_even,mult_odd}
>      generate the following vectors:
>        vect1: [res1,res3,res5,res7],
>        vect2: [res2,res4,res6,res8].
>
>      When vectorizing outer-loops, we execute the inner-loop sequentially
>      (each vectorized inner-loop iteration contributes to VF outer-loop
>      iterations in parallel).  We therefore don't allow to change the
>      order of the computation in the inner-loop during outer-loop
>      vectorization.  */
>   /* TODO: Another case in which order doesn't *really* matter is when we
>      widen and then contract again, e.g. (short)((int)x * y >> 8).
>      Normally, pack_trunc performs an even/odd permute, whereas the
>      repack from an even/odd expansion would be an interleave, which
>      would be significantly simpler for e.g. AVX2.  */
>   /* In any case, in order to avoid duplicating the code below, recurse
>      on VEC_WIDEN_MULT_EVEN_EXPR.  If it succeeds, all the return values
>      are properly set up for the caller.  If we fail, we'll continue with
>      a VEC_WIDEN_MULT_LO/HI_EXPR check.  */
>   if (vect_loop
>       && STMT_VINFO_RELEVANT (stmt_info) == vect_used_by_reduction
>       && !nested_in_vect_loop_p (vect_loop, stmt)
>       && supportable_widening_operation (VEC_WIDEN_MULT_EVEN_EXPR,
>                                          stmt, vectype_out, vectype_in,
>                                          code1, code2, multi_step_cvt,
>                                          interm_types))
>     return true;
>
> Thanks,
> Bingfeng Mei
RE: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.
Thanks, Richard. It is not very clear from the documentation:

"Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2)
are vectors with N signed/unsigned elements of size S. Multiply the
high/low or even/odd elements of the two vectors, and put the N/2 products
of size 2*S in the output vector (operand 0)."

So I thought that implementing both could help the vectorizer optimize
more loops. Maybe we should improve the documentation.

Bingfeng

-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: 28 January 2014 11:02
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.

On Wed, Jan 22, 2014 at 1:20 PM, Bingfeng Mei wrote:
> Hi,
> I noticed a regression of 4.8 against the ancient 4.5 in vectorization
> on our port. After a bit of investigation, I found the following code
> that prefers the even|odd version over the lo|hi one. [...]

Hmm, what's the reason for a target to support both? I think the idea was
that a target only supports either one (the more efficient case).

Richard.

> [quoted supportable_widening_operation comment and code snipped]
>
> Thanks,
> Bingfeng Mei
Re: Suspected bugs in ptr_difference_const & split_address_to_core_and_offset
On Fri, Jan 24, 2014 at 6:47 PM, Bingfeng Mei wrote:
> Hi,
> I ran into an issue in our port which I suspect is due to bugs in
> ptr_difference_const and split_address_to_core_and_offset. Basically,
> ptr_difference_const, called by the ivopts pass, tries to evaluate
> whether e1 and e2 differ by a constant. In this example,
>
>   e1 is (addr_expr (mem_ref (ssa_name1, 8)))
>   e2 is just ssa_name1.
>
> It is obvious to me that ptr_difference_const should return true, but
> it calls split_address_to_core_and_offset to split e1 into a base
> pointer and an offset. split_address_to_core_and_offset in turn calls
> get_inner_reference to do it. get_inner_reference cannot handle
> (mem_ref (ssa_name1, 8)); it just returns the same expression back.
>
> In get_inner_reference:
>
>     case MEM_REF:
>       /* Hand back the decl for MEM[&decl, off].  */
>       if (TREE_CODE (TREE_OPERAND (exp, 0)) == ADDR_EXPR)
>         {
>           tree off = TREE_OPERAND (exp, 1);
>           if (!integer_zerop (off))
>             {
>               double_int boff, coff = mem_ref_offset (exp);
>               boff = coff.alshift (BITS_PER_UNIT == 8
>                                    ? 3 : exact_log2 (BITS_PER_UNIT),
>                                    HOST_BITS_PER_DOUBLE_INT);
>               bit_offset += boff;
>             }
>           exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0);
>         }
>       goto done;
>
> Then in ptr_difference_const we get core1 as (mem_ref (ssa_name1, 8))
> and toffset1 is empty, so ptr_difference_const returns false.

That's because get_inner_reference is the wrong tool to ask for a base
_address_ IMHO. In theory get_inner_reference could return MEM[ptr, 0], of
course, but that requires building a new tree, which isn't the suitable
thing to do here. What you want is a
get_base_address_and_constant_offset_part. It may be as simple as wrapping
get_inner_reference to perform the final step and adjust the kind of tree
it is supposed to return.

> There is another possible bug in ptr_difference_const. If only one of
> toffset1 and toffset2 is set, why does it return false? The comment
> doesn't make sense to me. In this example, toffset1 should be 8 and
> toffset2 should be empty.

No, I think in this example bitpos should have the 8, not toffset; the
toffsets are supposed to be the non-constant parts. I think the fix
belongs in split_address_to_core_and_offset, handling MEM[X, CST],
avoiding the build_fold_addr_expr_loc and adjusting pbitpos for CST.

Richard.

> No way it should return false.
>
>       else if (toffset1 || toffset2)
>         {
>           /* If only one of the offsets is non-constant, the difference
>              cannot be a constant.  */
>           return false;
>         }
>
> Any comment? I would like to submit a patch for it. The problem is I
> don't have a reproducible example on x86 or other public targets. I ran
> the x86-64 tests and didn't hit a single case that meets this condition:
>   e1 is (addr_expr (mem_ref (ssa_name1, 8)))
>   e2 is just ssa_name1.
> Not sure about other targets though.
>
> Thanks,
> Bingfeng
Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.
On Tue, Jan 28, 2014 at 12:08 PM, Bingfeng Mei wrote:
> Thanks, Richard. It is not very clear from the documentation.
>
> "Signed/Unsigned widening multiplication. The two inputs (operands 1
> and 2) are vectors with N signed/unsigned elements of size S. Multiply
> the high/low or even/odd elements of the two vectors, and put the N/2
> products of size 2*S in the output vector (operand 0)."
>
> So I thought that implementing both could help the vectorizer optimize
> more loops. Maybe we should improve the documentation.

Maybe. But my answer was from the top of my head - so better double-check
in the vectorizer sources.

Richard.

> Bingfeng
>
> [rest of quoted thread snipped]
RE: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.
I checked the vectorization code; it seems the only relevant place where
vec_widen_mult_even/odd and vec_widen_mult_lo/hi are generated is
supportable_widening_operation. One of the two pairs is selected, with
priority given to vec_widen_mult_even/odd if it is a reduction loop.
However, the lo/hi pair seems to have wider usage than the even/odd pair
(non-loop? non-reduction?). Maybe that's why AltiVec and x86 still
implement both pairs. Is the following patch OK?

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 207183)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2014-01-28  Bingfeng Mei
+
+	* doc/md.texi: Mention that a target shouldn't implement the
+	vec_widen_(s|u)mul_even/odd pair if it is less efficient
+	than the hi/lo pair.
+
 2014-01-28  Richard Biener
 
 	Revert
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 207183)
+++ gcc/doc/md.texi	(working copy)
@@ -4918,7 +4918,8 @@ the output vector (operand 0).
 Signed/Unsigned widening multiplication.  The two inputs (operands 1 and 2)
 are vectors with N signed/unsigned elements of size S@.  Multiply the high/low
 or even/odd elements of the two vectors, and put the N/2 products of size 2*S
-in the output vector (operand 0).
+in the output vector (operand 0).  A target shouldn't implement the even/odd
+pattern pair if it is less efficient than the lo/hi one.
 
 @cindex @code{vec_widen_ushiftl_hi_@var{m}} instruction pattern
 @cindex @code{vec_widen_ushiftl_lo_@var{m}} instruction pattern

-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: 28 January 2014 12:56
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.

On Tue, Jan 28, 2014 at 12:08 PM, Bingfeng Mei wrote:
> Thanks, Richard. It is not very clear from the documentation.
>
> So I thought that implementing both could help the vectorizer optimize
> more loops. Maybe we should improve the documentation.

Maybe. But my answer was from the top of my head - so better double-check
in the vectorizer sources.

Richard.

> [rest of quoted thread snipped]
Re: Generating minimum libstdc++ symbols for a new platform
On Sun, Jan 26, 2014 at 11:12 AM, Richard Sandiford wrote:
> [adding libstdc++@]
>
> Bill Schmidt writes:
>> It was recently pointed out to me that our new powerpc64le-linux-gnu
>> target does not yet have a corresponding directory in
>> libstdc++-v3/config/abi/post/ to hold a baseline_symbols.txt for the
>> platform. I've been looking around and haven't found any documentation
>> for how the minimum baseline symbols file should be generated. Can
>> someone please enlighten me about the process?
>
> Yeah, I'd like to know this too. abi_check has been failing for
> mips*-linux-gnu for a long while but I was never sure what to do about
> it.

The libstdc++ testsuite Makefile has a "new-abi-baseline" target:

# Use 'new-abi-baseline' to create an initial symbol file.  Then run
# 'check-abi' to test for changes against that file.

- David
Re: [gomp4] Questions about "declare target" and "target update" pragmas
2014/1/22 Jakub Jelinek:
> This can print 3 (if doing host fallback, or if the device shares
> address space with the host), or 2 (otherwise). It shouldn't ever print
> 1, and yes, the target update is then well defined. All variables from
> omp declare target are allocated on the device sometime before the
> first target data/target update/target region; given that they will be
> allocated in the data section of the target DSO, they actually just
> need to be registered with the mapping data structure when the DSO is
> loaded.
>
> No, the target DSO initialization should use the tables we've talked
> about to initialize the mapping.
>
> Jakub

Yes, when G is a global variable marked with 'declare target', everything
works fine. But this testcase crashes at runtime in GOMP_target_update:

int main ()
{
  int G = 2;
  #pragma omp target update to(G)
  G = 3;
  int x = 0;
  #pragma omp target
  {
    x = G;
  }
  printf ("%d\n", x);
}

Is it right that such usage of 'target update' is not allowed by the
OpenMP specification?

-- Ilya
Re: [Testsuite] getpid in gcc.c-torture/execute/pr58419.c
On 01/27/2014 10:51 PM, Senthil Kumar Selvaraj wrote:
> This is on trunk - I was under the impression that it is always trunk,
> unless otherwise stated?
>
> getpid doesn't really make sense for bare-metal targets, I would think?
>
> Regards
> Senthil
>
> On Mon, Jan 27, 2014 at 01:04:48PM +, Umesh Kalappa wrote:
>> Senthil,
>> Please do let us know the gcc version; I couldn't locate the file
>> pr58419.c in the GCC 4.8.1 source.
>>
>> To go with the problem below, you can attribute the getpid() function
>> as weak (http://www.embedded-bits.co.uk/2008/gcc-weak-symbols/).
>>
>> ~Umesh
>>
>> -----Original Message-----
>> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf
>> Of Senthil Kumar Selvaraj
>> Sent: 27 January 2014 15:18
>> To: gcc@gcc.gnu.org
>> Subject: [Testsuite] getpid in gcc.c-torture/execute/pr58419.c
>>
>> All,
>>
>> gcc.c-torture/execute/pr58419.c has a call to getpid, and this causes
>> a linker error on the AVR (embedded) target. Is the call intentional,
>> and if yes, how should this be fixed for targets that don't support
>> an OS?
>>
>> Regards
>> Senthil

The testcase in the PR had a call to printf. The modified test doesn't
call printf, and doesn't declare getpid. I assume that getpid is called
simply to have an external call, but there must be something that could
be used instead that is available for any target.

Jeff, you added this test; would you please take a look at it?

Janis
Re: [Testsuite] getpid in gcc.c-torture/execute/pr58419.c
On 01/28/14 11:14, Janis Johnson wrote:
> The testcase in the PR had a call to printf. The modified test doesn't
> call printf, and doesn't declare getpid. I assume that getpid is called
> simply to have an external call, but there must be something that could
> be used instead that is available for any target.
>
> Jeff, you added this test, would you please take a look at it?

It was a testcase from Zhendong that was modified to fit within our
testing framework. I think we could replace the call to getpid with a
call to a function which has a noinline attribute.

Jeff
Re: proposal to make SIZE_TYPE more flexible
Ping? Or do I need to repost on the patches list? http://gcc.gnu.org/ml/gcc/2014-01/msg00130.html
Re: [gomp4] Questions about "declare target" and "target update" pragmas
On Tue, Jan 28, 2014 at 09:54:09PM +0400, Ilya Verbin wrote:
> Yes, when G is a global variable marked with 'declare target',
> everything works fine. But this testcase crashes at runtime in
> GOMP_target_update:
>
> int main ()
> {
>   int G = 2;
>   #pragma omp target update to(G)
>   G = 3;
>   int x = 0;
>   #pragma omp target
>   {
>     x = G;
>   }
>   printf ("%d\n", x);
> }
>
> Is it right, that such usage of 'target update' is not allowed by omp
> specification?

Yes, this testcase is invalid. "If the corresponding list item is not
present in the device data environment, the behavior is unspecified."
Perhaps we shouldn't crash but do nothing, or complain to stderr, as QoI.

Jakub
Re: proposal to make SIZE_TYPE more flexible
On Tue, 28 Jan 2014, DJ Delorie wrote: > Ping? Or do I need to repost on the patches list? Repost on the patches list (with self-contained write-up, rationale for choices made, etc.) at the start of stage 1 for 4.10/5.0, I suggest (this clearly isn't stage 3 material). -- Joseph S. Myers jos...@codesourcery.com
Re: proposal to make SIZE_TYPE more flexible
> Repost on the patches list (with self-contained write-up, rationale for > choices made, etc.) at the start of stage 1 for 4.10/5.0, Ok. > I suggest (this clearly isn't stage 3 material). Yup. Would be nice to back port it to 4.9 later, but... understood.
Re: Hurd: /lib/ld.so vs. /lib/ld.so.1 (was: Policy: Require new dynamic loader names for entirely new ABIs?)
> For x86 Hurd, GCC has been specifying »-dynamic-linker /lib/ld.so« in
> LINK_SPEC since forever (1995, or earlier), and Debian glibc has had
> the following since forever (2002, or earlier):
>
> ifeq ($(DEB_HOST_GNU_SYSTEM),gnu)
> # Why doesn't the glibc makefile install this?
> 	ln -sf ld.so.1 $(tmpdir)/$@/lib/ld.so
> endif
>
> Roland, do you have any recollection of this? Assuming that
> /lib/ld.so.1 is the "official" name, I suppose GCC should be changed,
> and then Debian could drop the symbolic link after a transition period
> (full archive rebuilt, etc.).

The original rationale was that these are two different strings indicating
two different kinds of compatibility, and that there would be symlinks
from both of these names to the same implementation at any given moment.

That thinking goes like this:

/lib/ld.so is the PT_INTERP string. That identifies the ELF dynamic
linker layer of the ABI. That doesn't need to change as long as programs'
use of ELF headers, symbol tables, and so forth does not need new or
different behavior.

ld.so.1 is the SONAME string. That identifies the DSO that happens also
to be the dynamic linker, and its particular DSO ABI, just as the SONAME
of any other DSO identifies that DSO's ABI. That needs to change any time
the set of symbols exported by the dynamic linker changes incompatibly.
(All this rationale pre-dates the symbol versioning features by a few
years.)

Those notions remain sound as to what each string means. But practical
reality and years of experience lead in different directions. It has
become very clear that by far the easiest way to retain compatibility for
a variety of binary vintages and flavors sharing a filesystem is to have
their loading paths diverge as early as possible. PT_INTERP is the
earliest step in the process that is really under userland control, so
that's the spot.

It's trivial to make multiple names resolve to the same actual thing
(with symlinks) when one thing can serve multiple flavors/vintages, but
it's very difficult to go in the other direction. That says that if the
canonical PT_INTERP string for Hurd is going to change, it might as well
change in the direction of being much more specific, e.g.
/lib/ld-gnu-i386.so.1 or something like that.

Thanks,
Roland
gimple_build_call for a class constructor
I am building a GCC plugin and am trying to create a call to a constructor
for a global variable. The class is declared in a .cpp file, and I have a
global instance of the class declared in the file as well. The class
declaration for the global instance I am trying to create follows:

--
namespace LocalTestNamespace
{
  class CTestClass
  {
  public:
    CTestClass ()
    {
      std::cout << "Test Class Initialized." << std::endl;
    }
  };
}

// g++ parser generates the initialization statement for this global
LocalTestNamespace::CTestClass sourceCodeGlobalTestClass;
--

In my plugin, I create a global variable of type 'CTestClass' and then
attempt to invoke the constructor for it in the
'__static_initialization_and_destruction_0' function. Below is a snippet
of the code that creates the gimple statement and inserts it into the
initialization function. The plugin runs just before the control flow
graph generation pass.

--
// globalDeclaration points to the VAR_DECL I created
tree addr_var_decl = build_fold_addr_expr (globalDeclaration);
// declType is the tree for CTestClass
tree constructor = CLASSTYPE_CONSTRUCTORS (declType);
gimple initializationStatement
  = gimple_build_call (OVL_CURRENT (constructor), 1, addr_var_decl);
// the debug output of the statement looks fine
debug_gimple_stmt (initializationStatement);
// insertionPoint is just before the goto following the calls to global initializers
gsi_insert_before (&insertionPoint, initializationStatement, GSI_SAME_STMT);
--

When I run this code, the statement gets inserted but the assembler fails.
Looking at the assembly output reveals the following at the end of the
initializer:

--
	movl	$sourceCodeGlobalTestClass, %edi	// the global in the source code
	call	_ZN18LocalTestNamespace10CTestClassC1Ev	// constructor call created by the g++ parser
	movl	$testCTestClassVar, %edi	// the global I created in the plugin
	call	_ZN18LocalTestNamespace10CTestClassC1EOS0_ *INTERNAL*	// call generated by the snippet above; triggers the assembler error
--

Using c++filt, the names demangle as:

_ZN18LocalTestNamespace10CTestClassC1Ev
  => LocalTestNamespace::CTestClass::CTestClass()
_ZN18LocalTestNamespace10CTestClassC1EOS0_
  => LocalTestNamespace::CTestClass::CTestClass(LocalTestNamespace::CTestClass&&)

Clearly the call I am building is incorrect, and I have tried numerous
variations with the same results. If I manually edit the assembly output
file, change the 'C1EOS0_' suffix to 'C1Ev', and strip out the
'*INTERNAL*', I can run the assembler on the modified file and generate an
executable that works perfectly.

I have searched for examples of using gimple_build_call() to generate
calls to C++ class constructors but haven't tripped over any examples. I
would greatly appreciate any suggestions on how to generate the
appropriate constructor call.

Thanks,
Stephan
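[A possible direction, sketched against GCC 4.8-era C++ front-end internals (cp-tree.h macros); this is not self-contained compilable code, and the macro availability in a plugin context is an assumption. The demangled names suggest OVL_CURRENT simply picked the first overload in the set, which here is the move constructor (...C1EOS0_); and the *INTERNAL* marker suggests the chosen decl is the maybe-in-charge constructor, which is never emitted. One could instead walk the overload set for the default constructor (only the implicit 'this' parameter) and then use its complete-object clone, the ...C1Ev symbol that the parser's own initialization call uses.]

```c
/* Sketch only: select the default constructor's complete-object clone
   instead of the first overload returned by OVL_CURRENT.  */
tree ctor = NULL_TREE;
for (tree ovl = CLASSTYPE_CONSTRUCTORS (declType); ovl; ovl = OVL_NEXT (ovl))
  {
    tree fn = OVL_CURRENT (ovl);
    tree parms = DECL_ARGUMENTS (fn);   /* first parameter is 'this' */
    if (parms && TREE_CHAIN (parms) == NULL_TREE)
      {
        /* fn itself is the maybe-in-charge ctor (the "*INTERNAL*"
           symbol); find its complete-object clone (the C1 symbol).  */
        tree clone;
        FOR_EACH_CLONE (clone, fn)
          if (DECL_COMPLETE_CONSTRUCTOR_P (clone))
            ctor = clone;
        break;
      }
  }
if (ctor != NULL_TREE)
  initializationStatement = gimple_build_call (ctor, 1, addr_var_decl);
```

Whether the clone walk is needed in this exact form is untested here, but it matches the symptom: the symbol you need differs from the one being emitted only in being the C1 clone of the right (default) overload.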