Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.
On Wed, Jan 22, 2014 at 1:20 PM, Bingfeng Mei wrote:
> Hi,
> I noticed a regression of 4.8 against the ancient 4.5 in vectorization
> on our port. After a bit of investigation, I found the following code,
> which prefers the even|odd version over the lo|hi one. That is obviously
> the right choice for AltiVec and maybe some other targets, but the
> even|odd versions (which expand to a series of instructions) are less
> efficient on our target than the lo|hi ones. Shouldn't there be a
> target-specific hook to make the choice instead of hard-coding it here,
> or some cost-estimation technique to compare the two alternatives?

Hmm, what's the reason for a target to support both? I think the idea was
that a target only supports either one (the more efficient case).

Richard.

>   /* The result of a vectorized widening operation usually requires
>      two vectors (because the widened results do not fit into one vector).
>      The generated vector results would normally be expected to be
>      generated in the same order as in the original scalar computation,
>      i.e. if 8 results are generated in each vector iteration, they are
>      to be organized as follows:
>        vect1: [res1,res2,res3,res4],
>        vect2: [res5,res6,res7,res8].
>
>      However, in the special case that the result of the widening
>      operation is used in a reduction computation only, the order doesn't
>      matter (because when vectorizing a reduction we change the order of
>      the computation).  Some targets can take advantage of this and
>      generate more efficient code.  For example, targets like Altivec,
>      that support widen_mult using a sequence of {mult_even,mult_odd}
>      generate the following vectors:
>        vect1: [res1,res3,res5,res7],
>        vect2: [res2,res4,res6,res8].
>
>      When vectorizing outer-loops, we execute the inner-loop sequentially
>      (each vectorized inner-loop iteration contributes to VF outer-loop
>      iterations in parallel).  We therefore don't allow to change the
>      order of the computation in the inner-loop during outer-loop
>      vectorization.  */
>   /* TODO: Another case in which order doesn't *really* matter is when we
>      widen and then contract again, e.g. (short)((int)x * y >> 8).
>      Normally, pack_trunc performs an even/odd permute, whereas the
>      repack from an even/odd expansion would be an interleave, which
>      would be significantly simpler for e.g. AVX2.  */
>   /* In any case, in order to avoid duplicating the code below, recurse
>      on VEC_WIDEN_MULT_EVEN_EXPR.  If it succeeds, all the return values
>      are properly set up for the caller.  If we fail, we'll continue with
>      a VEC_WIDEN_MULT_LO/HI_EXPR check.  */
>   if (vect_loop
>       && STMT_VINFO_RELEVANT (stmt_info) == vect_used_by_reduction
>       && !nested_in_vect_loop_p (vect_loop, stmt)
>       && supportable_widening_operation (VEC_WIDEN_MULT_EVEN_EXPR,
>                                          stmt, vectype_out, vectype_in,
>                                          code1, code2, multi_step_cvt,
>                                          interm_types))
>     return true;
>
> Thanks,
> Bingfeng Mei
RE: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.
Thanks, Richard. It is not very clear from the documentation:

"Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2)
are vectors with N signed/unsigned elements of size S. Multiply the
high/low or even/odd elements of the two vectors, and put the N/2 products
of size 2*S in the output vector (operand 0)."

So I thought that implementing both could help the vectorizer optimize
more loops. Maybe we should improve the documentation.

Bingfeng

-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: 28 January 2014 11:02
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.

On Wed, Jan 22, 2014 at 1:20 PM, Bingfeng Mei wrote:
> Hi,
> I noticed a regression of 4.8 against the ancient 4.5 in vectorization
> on our port. After a bit of investigation, I found the following code
> that prefers the even|odd version over the lo|hi one. [...]

Hmm, what's the reason for a target to support both? I think the idea was
that a target only supports either one (the more efficient case).

Richard.

> [quoted supportable_widening_operation comment and code snipped]
>
> Thanks,
> Bingfeng Mei
Re: Suspected bugs in ptr_difference_const & split_address_to_core_and_offset
On Fri, Jan 24, 2014 at 6:47 PM, Bingfeng Mei wrote:
> Hi,
> I ran into an issue in our port which I suspect is due to bugs in
> ptr_difference_const and split_address_to_core_and_offset. Basically,
> ptr_difference_const, called by the ivopts pass, tries to evaluate
> whether e1 and e2 differ by a constant. In this example,
>
>   e1 is (addr_expr (mem_ref (ssa_name1, 8)))
>   e2 is just ssa_name1.
>
> It is obvious to me that ptr_difference_const should return true, but
> it calls split_address_to_core_and_offset to split e1 into a base
> pointer and an offset. split_address_to_core_and_offset in turn calls
> get_inner_reference to do it. get_inner_reference cannot handle
> (mem_ref (ssa_name1, 8)); it just returns the same expression back.
>
> In get_inner_reference:
>
>     case MEM_REF:
>       /* Hand back the decl for MEM[&decl, off].  */
>       if (TREE_CODE (TREE_OPERAND (exp, 0)) == ADDR_EXPR)
>         {
>           tree off = TREE_OPERAND (exp, 1);
>           if (!integer_zerop (off))
>             {
>               double_int boff, coff = mem_ref_offset (exp);
>               boff = coff.alshift (BITS_PER_UNIT == 8
>                                    ? 3 : exact_log2 (BITS_PER_UNIT),
>                                    HOST_BITS_PER_DOUBLE_INT);
>               bit_offset += boff;
>             }
>           exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0);
>         }
>       goto done;
>
> Then in ptr_difference_const we get core1 as (mem_ref (ssa_name1, 8))
> and toffset1 is empty, so ptr_difference_const returns false.

That's because get_inner_reference is the wrong tool to ask for a base
_address_ IMHO. In theory get_inner_reference could return MEM[ptr, 0], of
course, but that requires building a new tree, which isn't the suitable
thing to do here. What you want is a
get_base_address_and_constant_offset_part. It may be as simple as wrapping
get_inner_reference to perform the final step and adjust the kind of tree
it is supposed to return.

> There is another possible bug in ptr_difference_const. If only one of
> toffset1 and toffset2 is set, why does it return false? The comment
> doesn't make sense to me. In this example, toffset1 should be 8 and
> toffset2 should be empty.

No, I think in this example bitpos should have the 8, not toffset; the
toffsets are supposed to be the non-constant parts. I think the fix
belongs in split_address_to_core_and_offset, handling MEM[X, CST],
avoiding the build_fold_addr_expr_loc and adjusting pbitpos for CST.

Richard.

> No way it should return false.
>
>       else if (toffset1 || toffset2)
>         {
>           /* If only one of the offsets is non-constant, the difference
>              cannot be a constant.  */
>           return false;
>         }
>
> Any comment? I would like to submit a patch for it. The problem is I
> don't have a reproducible example on x86 or other public targets. I ran
> the x86-64 tests and didn't hit a single case that meets this condition:
>   e1 is (addr_expr (mem_ref (ssa_name1, 8)))
>   e2 is just ssa_name1.
> Not sure about other targets though.
>
> Thanks,
> Bingfeng
Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.
On Tue, Jan 28, 2014 at 12:08 PM, Bingfeng Mei wrote:
> Thanks, Richard. It is not very clear from the documentation.
>
> "Signed/Unsigned widening multiplication. The two inputs (operands 1
> and 2) are vectors with N signed/unsigned elements of size S. Multiply
> the high/low or even/odd elements of the two vectors, and put the N/2
> products of size 2*S in the output vector (operand 0)."
>
> So I thought that implementing both could help the vectorizer optimize
> more loops. Maybe we should improve the documentation.

Maybe. But my answer was from the top of my head - so better double-check
in the vectorizer sources.

Richard.

> Bingfeng
>
> [rest of quoted thread snipped]
RE: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.
I checked the vectorization code; it seems the only relevant place where
vec_widen_mult_even/odd and vec_widen_mult_lo/hi are generated is
supportable_widening_operation. One of the two pairs is selected, with
priority given to vec_widen_mult_even/odd if it is a reduction loop.
However, the lo/hi pair seems to have wider usage than the even/odd pair
(non-loop? non-reduction?). Maybe that's why AltiVec and x86 still
implement both pairs. Is the following patch OK?

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 207183)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2014-01-28  Bingfeng Mei
+
+	* doc/md.texi: Mention that a target shouldn't implement the
+	vec_widen_(s|u)mul_even/odd pair if it is less efficient
+	than the hi/lo pair.
+
 2014-01-28  Richard Biener
 
 	Revert
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 207183)
+++ gcc/doc/md.texi	(working copy)
@@ -4918,7 +4918,8 @@ the output vector (operand 0).
 Signed/Unsigned widening multiplication.  The two inputs (operands 1 and 2)
 are vectors with N signed/unsigned elements of size S@.  Multiply the high/low
 or even/odd elements of the two vectors, and put the N/2 products of size 2*S
-in the output vector (operand 0).
+in the output vector (operand 0).  A target shouldn't implement the even/odd
+pattern pair if it is less efficient than the lo/hi one.
 
 @cindex @code{vec_widen_ushiftl_hi_@var{m}} instruction pattern
 @cindex @code{vec_widen_ushiftl_lo_@var{m}} instruction pattern

-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: 28 January 2014 12:56
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.

On Tue, Jan 28, 2014 at 12:08 PM, Bingfeng Mei wrote:
> Thanks, Richard. It is not very clear from the documentation.
>
> So I thought that implementing both could help the vectorizer optimize
> more loops. Maybe we should improve the documentation.

Maybe. But my answer was from the top of my head - so better double-check
in the vectorizer sources.

Richard.

> [rest of quoted thread snipped]
Re: Generating minimum libstdc++ symbols for a new platform
On Sun, Jan 26, 2014 at 11:12 AM, Richard Sandiford wrote:
> [adding libstdc++@]
>
> Bill Schmidt writes:
>> It was recently pointed out to me that our new powerpc64le-linux-gnu
>> target does not yet have a corresponding directory in
>> libstdc++-v3/config/abi/post/ to hold a baseline_symbols.txt for the
>> platform. I've been looking around and haven't found any documentation
>> for how the minimum baseline symbols file should be generated. Can
>> someone please enlighten me about the process?
>
> Yeah, I'd like to know this too. abi_check has been failing for
> mips*-linux-gnu for a long while but I was never sure what to do about
> it.

The libstdc++ testsuite Makefile has a "new-abi-baseline" target:

# Use 'new-abi-baseline' to create an initial symbol file.  Then run
# 'check-abi' to test for changes against that file.

- David
Re: [gomp4] Questions about "declare target" and "target update" pragmas
2014/1/22 Jakub Jelinek:
> This can print 3 (if doing host fallback, or if the device shares
> address space with the host), or 2 (otherwise). It shouldn't ever print
> 1, and yes, the target update is then well defined. All variables from
> omp declare target are allocated on the device sometime before the
> first target data/target update/target region; given that they will be
> allocated in the data section of the target DSO, they actually just
> need to be registered with the mapping data structure when the DSO is
> loaded.
>
> No, the target DSO initialization should use the tables we've talked
> about to initialize the mapping.
>
> Jakub

Yes, when G is a global variable marked with 'declare target', everything
works fine. But this testcase crashes at runtime in GOMP_target_update:

int main ()
{
  int G = 2;
  #pragma omp target update to(G)
  G = 3;
  int x = 0;
  #pragma omp target
  {
    x = G;
  }
  printf ("%d\n", x);
}

Is it right that such usage of 'target update' is not allowed by the
OpenMP specification?

-- Ilya
Re: [Testsuite] getpid in gcc.c-torture/execute/pr58419.c
On 01/27/2014 10:51 PM, Senthil Kumar Selvaraj wrote:
> This is on trunk - I was under the impression that it is always trunk,
> unless otherwise stated?
>
> getpid doesn't really make sense for bare-metal targets, I would think?
>
> Regards
> Senthil
>
> On Mon, Jan 27, 2014 at 01:04:48PM +, Umesh Kalappa wrote:
>> Senthil,
>> Please do let us know the gcc version; I couldn't locate the file
>> pr58419.c in the GCC 4.8.1 source.
>>
>> To go with the problem below, you can attribute the getpid() function
>> as weak (http://www.embedded-bits.co.uk/2008/gcc-weak-symbols/).
>>
>> ~Umesh
>>
>> -----Original Message-----
>> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf
>> Of Senthil Kumar Selvaraj
>> Sent: 27 January 2014 15:18
>> To: gcc@gcc.gnu.org
>> Subject: [Testsuite] getpid in gcc.c-torture/execute/pr58419.c
>>
>> All,
>>
>> gcc.c-torture/execute/pr58419.c has a call to getpid, and this causes
>> a linker error on the AVR (embedded) target. Is the call intentional,
>> and if yes, how should this be fixed for targets that don't support
>> an OS?
>>
>> Regards
>> Senthil

The testcase in the PR had a call to printf. The modified test doesn't
call printf, and doesn't declare getpid. I assume that getpid is called
simply to have an external call, but there must be something that could
be used instead that is available for any target.

Jeff, you added this test; would you please take a look at it?

Janis
Re: [Testsuite] getpid in gcc.c-torture/execute/pr58419.c
On 01/28/14 11:14, Janis Johnson wrote:
> The testcase in the PR had a call to printf. The modified test doesn't
> call printf, and doesn't declare getpid. I assume that getpid is called
> simply to have an external call, but there must be something that could
> be used instead that is available for any target.
>
> Jeff, you added this test, would you please take a look at it?

It was a testcase from Zhendong that was modified to fit within our
testing framework. I think we could replace the call to getpid with a
call to a function which has a noinline attribute.

Jeff
Re: proposal to make SIZE_TYPE more flexible
Ping? Or do I need to repost on the patches list? http://gcc.gnu.org/ml/gcc/2014-01/msg00130.html
Re: [gomp4] Questions about "declare target" and "target update" pragmas
On Tue, Jan 28, 2014 at 09:54:09PM +0400, Ilya Verbin wrote:
> Yes, when G is a global variable marked with 'declare target',
> everything works fine. But this testcase crashes at runtime in
> GOMP_target_update:
>
> int main ()
> {
>   int G = 2;
>   #pragma omp target update to(G)
>   G = 3;
>   int x = 0;
>   #pragma omp target
>   {
>     x = G;
>   }
>   printf ("%d\n", x);
> }
>
> Is it right, that such usage of 'target update' is not allowed by omp
> specification?

Yes, this testcase is invalid. "If the corresponding list item is not
present in the device data environment, the behavior is unspecified."
Perhaps we shouldn't crash but do nothing, or complain to stderr, as QoI.

Jakub
Re: proposal to make SIZE_TYPE more flexible
On Tue, 28 Jan 2014, DJ Delorie wrote: > Ping? Or do I need to repost on the patches list? Repost on the patches list (with self-contained write-up, rationale for choices made, etc.) at the start of stage 1 for 4.10/5.0, I suggest (this clearly isn't stage 3 material). -- Joseph S. Myers jos...@codesourcery.com
Re: proposal to make SIZE_TYPE more flexible
> Repost on the patches list (with self-contained write-up, rationale for > choices made, etc.) at the start of stage 1 for 4.10/5.0, Ok. > I suggest (this clearly isn't stage 3 material). Yup. Would be nice to back port it to 4.9 later, but... understood.
Re: Hurd: /lib/ld.so vs. /lib/ld.so.1 (was: Policy: Require new dynamic loader names for entirely new ABIs?)
> For x86 Hurd, GCC has been specifying »-dynamic-linker /lib/ld.so« in
> LINK_SPEC since forever (1995, or earlier), and Debian glibc has had
> the following since forever (2002, or earlier):
>
> ifeq ($(DEB_HOST_GNU_SYSTEM),gnu)
> # Why doesn't the glibc makefile install this?
> 	ln -sf ld.so.1 $(tmpdir)/$@/lib/ld.so
> endif
>
> Roland, do you have any recollection of this? Assuming that
> /lib/ld.so.1 is the "official" name, I suppose GCC should be changed,
> and then Debian could drop the symbolic link after a transition period
> (full archive rebuilt, etc.).

The original rationale was that these are two different strings indicating
two different kinds of compatibility, and that there would be symlinks
from both of these names to the same implementation at any given moment.

That thinking goes like this:

/lib/ld.so is the PT_INTERP string. That identifies the ELF dynamic
linker layer of the ABI. That doesn't need to change as long as programs'
use of ELF headers, symbol tables, and so forth does not need new or
different behavior.

ld.so.1 is the SONAME string. That identifies the DSO that happens also
to be the dynamic linker, and its particular DSO ABI, just as the SONAME
of any other DSO identifies that DSO's ABI. That needs to change any time
the set of symbols exported by the dynamic linker changes incompatibly.
(All this rationale pre-dates the symbol versioning features by a few
years.)

Those notions remain sound as to what each string means. But practical
reality and years of experience lead in different directions. It has
become very clear that by far the easiest way to retain compatibility for
a variety of binary vintages and flavors sharing a filesystem is to have
their loading paths diverge as early as possible. PT_INTERP is the
earliest step in the process that is really under userland control, so
that's the spot.

It's trivial to make multiple names resolve to the same actual thing
(with symlinks) when one thing can serve multiple flavors/vintages, but
it's very difficult to go in the other direction. That says that if the
canonical PT_INTERP string for Hurd is going to change, it might as well
change in the direction of being much more specific, e.g.
/lib/ld-gnu-i386.so.1 or something like that.

Thanks,
Roland
gimple_build_call for a class constructor
I am building a GCC plugin and am trying to create a call to a constructor
for a global variable. The class is declared in a .cpp file, and I have a
global instance of the class declared in the file as well. The class
declaration for the global instance I am trying to create follows:

--
namespace LocalTestNamespace
{
  class CTestClass
  {
  public:
    CTestClass ()
    {
      std::cout << "Test Class Initialized." << std::endl;
    }
  };
}

// g++ parser generates the initialization statement for this global
LocalTestNamespace::CTestClass sourceCodeGlobalTestClass;
--

In my plugin, I create a global variable of type 'CTestClass' and then
attempt to invoke the constructor for it in the
'__static_initialization_and_destruction_0' function. Below is a snippet
of the code that creates the gimple statement and inserts it into the
initialization function. The plugin runs just before the control flow
graph generation pass.

--
// globalDeclaration points to the VAR_DECL I created
tree addr_var_decl = build_fold_addr_expr (globalDeclaration);
// declType is the tree for CTestClass
tree constructor = CLASSTYPE_CONSTRUCTORS (declType);
gimple initializationStatement
  = gimple_build_call (OVL_CURRENT (constructor), 1, addr_var_decl);
// the debug output of the statement looks fine
debug_gimple_stmt (initializationStatement);
// insertionPoint is just before the goto following the calls to global initializers
gsi_insert_before (&insertionPoint, initializationStatement, GSI_SAME_STMT);
--

When I run this code, the statement gets inserted but the assembler fails.
Looking at the assembly output reveals the following at the end of the
initializer:

--
	movl	$sourceCodeGlobalTestClass, %edi	// the global in the source code
	call	_ZN18LocalTestNamespace10CTestClassC1Ev	// constructor call created by the g++ parser
	movl	$testCTestClassVar, %edi	// the global I created in the plugin
	call	_ZN18LocalTestNamespace10CTestClassC1EOS0_ *INTERNAL*	// call generated by the snippet above; triggers the assembler error
--

Using c++filt, the names demangle as:

_ZN18LocalTestNamespace10CTestClassC1Ev
  => LocalTestNamespace::CTestClass::CTestClass()
_ZN18LocalTestNamespace10CTestClassC1EOS0_
  => LocalTestNamespace::CTestClass::CTestClass(LocalTestNamespace::CTestClass&&)

Clearly the call I am building is incorrect, and I have tried numerous
variations with the same results. If I manually edit the assembly output
file, change the 'C1EOS0_' suffix to 'C1Ev', and strip out the
'*INTERNAL*', I can run the assembler on the modified file and generate an
executable that works perfectly.

I have searched for examples of using gimple_build_call() to generate
calls to C++ class constructors but haven't tripped over any examples. I
would greatly appreciate any suggestions on how to generate the
appropriate constructor call.

Thanks,
Stephan
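[A possible direction, sketched against GCC 4.8-era C++ front-end internals (cp-tree.h macros); this is not self-contained compilable code, and the macro availability in a plugin context is an assumption. The demangled names suggest OVL_CURRENT simply picked the first overload in the set, which here is the move constructor (...C1EOS0_); and the *INTERNAL* marker suggests the chosen decl is the maybe-in-charge constructor, which is never emitted. One could instead walk the overload set for the default constructor (only the implicit 'this' parameter) and then use its complete-object clone, the ...C1Ev symbol that the parser's own initialization call uses.]

```c
/* Sketch only: select the default constructor's complete-object clone
   instead of the first overload returned by OVL_CURRENT.  */
tree ctor = NULL_TREE;
for (tree ovl = CLASSTYPE_CONSTRUCTORS (declType); ovl; ovl = OVL_NEXT (ovl))
  {
    tree fn = OVL_CURRENT (ovl);
    tree parms = DECL_ARGUMENTS (fn);   /* first parameter is 'this' */
    if (parms && TREE_CHAIN (parms) == NULL_TREE)
      {
        /* fn itself is the maybe-in-charge ctor (the "*INTERNAL*"
           symbol); find its complete-object clone (the C1 symbol).  */
        tree clone;
        FOR_EACH_CLONE (clone, fn)
          if (DECL_COMPLETE_CONSTRUCTOR_P (clone))
            ctor = clone;
        break;
      }
  }
if (ctor != NULL_TREE)
  initializationStatement = gimple_build_call (ctor, 1, addr_var_decl);
```

Whether the clone walk is needed in this exact form is untested here, but it matches the symptom: the symbol you need differs from the one being emitted only in being the C1 clone of the right (default) overload.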