Re: [Patch, Fortran] FINAL (prep patches 4/5): Support noncontiguous arrays in the finalization wrapper function

2013-01-05 Thread Paul Richard Thomas
Dear Tobias,

I think that the patch would be much less opaque if repeated operation
to produce code were turned into functions, as I did for the defined
assignment patch; eg resolve.c(build_assignment, add_code_to_chain)?
What you have done is OK but it is HEAVY, as is much of the content of
class.c.  However, unless your intestinal fortitude is of a very high
order, I suggest that this be left as a latter clean up operation :-)

A nit:
+ do idx2 = 1, rank
+   offset = offset + mod (idx, sizes(idx2)) / size(idx2-1) * strides(idx2)
+ end do

s/size(idx2-1)/sizes(idx2-1)/

Apart from that, the patch is OK for trunk.

Thanks


Paul

On 31 December 2012 15:11, Tobias Burnus  wrote:
> Dear all,
>
> this lengthy patch supports noncontiguous arrays in the finalization
> wrapper. That encompasses bother the scalarizer (used for finalizing the
> components and for an ELEMENTAL FINAL subroutine) and calling array FINAL
> subroutines. For the latter, the subroutine is directly called if possible.
> Namely, when the element size of the actual type is the same as the one of
> the declared type - and the the FINAL subroutine is either assumed-shape
> without the contiguous attribute or the actual argument is contiguous.
> Otherwise, the code packs the array.
>
> The code is written such that it works for any array rank. I explicitly
> avoided using GFC_MAX_DIMENSIONS to allow for more ranks without breaking
> the ABI.
>
> The code consists of two new blocks of code. The new function
> "finalization_get_offset" which generates the code to translate from an
> element index to the byte offset - and in generate_finalization_wrapper to
> fill the array "strides" and "sizes", where the latter contains the
> multiplied up size, i.e. sizes(0) == 1, sizes(1) = size(array,dim=1),
> sizes(2) = sizes(1)*size(array,dim=2) etc.
>
> Note: Without patch 5/5, this code is never executed.
>
> Build and regtested on x86-64-gnu-linux - and tested (with the not submitted
> patch for invoking the finalizer).
> OK for the trunk?
>
> Tobias



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy


Re: [wwwdocs] SH 4.8 changes - document thread pointer built-ins

2013-01-05 Thread Oleg Endo
Hi,

On Wed, 2013-01-02 at 19:13 -1000, Gerald Pfeifer wrote:
> Hi Oleg,
> 
> On Wed, 17 Oct 2012, Oleg Endo wrote:
> >> +Added support for the built-in functions
> >> +__builtin_thread_pointer and
> >> +__builtin_set_thread_pointer.  This assumes that
> >> +GBR is used to hold the thread pointer of the current 
> >> thread,
> >> +which has been the case since a while already. 
> >> 
> >> "since a while" -> "for a while", and I made that change.
> >> That said, why is this important, and is there a fixed date or version?
> > It might be important for some embedded systems software that does not
> > use the GBR for storing the thread pointer, but for something else (like
> > a pointer to some global table of frequently used stuff or something
> > like that).  I just thought it might be better to mention this.  But
> > you're right, the last "for a while" part sounds strange, and should
> > probably just be removed, reducing it to "This assumes that
> > GBR is used to hold the thread pointer of the current
> > thread."
> 
> That sounds good.  I noticed this has not been changed yet, so I
> assume you were probably waiting for my response?  Will you be
> making this change?

I also assume I was waiting for your response ;)
(it's been a while, can't remember exactly).

I'll send a patch with the change.  Thanks for reminding me.

Cheers,
Oleg




Re: RFA: Fix ICE on PARALLEL returns when expand builtins

2013-01-05 Thread Richard Biener
On Fri, Jan 4, 2013 at 7:57 PM, Eric Botcazou  wrote:
>> Hmm, how would anyone else get at the "padding" info dealing with the
>> parallel?  This looks broken if we need access to the type :/
>
> Yes, you need to access the type for return values, but that's not really new.

Ick :/

>> Eric, any comments?
>
> Richard's patch seems fine to me, modulo the name of the routine; I'd rather
> use maybe_emit_group_store or something along these lines.

Ok, fine with me - you're clearly more experienced with this code.

Thanks,
Richard.

> --
> Eric Botcazou


Re: [Patch, libfortran] Improve performance of byte swapped IO

2013-01-05 Thread Richard Biener
On Fri, Jan 4, 2013 at 11:35 PM, Andreas Schwab  wrote:
> Janne Blomqvist  writes:
>
>> diff --git a/libgfortran/io/file_pos.c b/libgfortran/io/file_pos.c
>> index c8ecc3a..bf2250a 100644
>> --- a/libgfortran/io/file_pos.c
>> +++ b/libgfortran/io/file_pos.c
>> @@ -140,15 +140,21 @@ unformatted_backspace (st_parameter_filepos *fpp, 
>> gfc_unit *u)
>>   }
>>else
>>   {
>> +   uint32_t u32;
>> +   uint64_t u64;
>> switch (length)
>>   {
>>   case sizeof(GFC_INTEGER_4):
>> -   reverse_memcpy (&m4, p, sizeof (m4));
>> +   memcpy (&u32, p, sizeof (u32));
>> +   u32 = __builtin_bswap32 (u32);
>> +   m4 = *(GFC_INTEGER_4*)&u32;
>
> Isn't that an aliasing violation?

It looks like one.  Why not simply do

   m4 = (GFC_INTEGER_4) u32;

?  I suppose GFC_INTEGER_4 is always the same size as uint32_t but signed?

Richard.

>
> Andreas.
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."


Re: Control dependence vs. builtin_unreachable

2013-01-05 Thread Steven Bosscher
On Thu, Jan 3, 2013 at 9:51 PM, Jeff Law wrote:
> On 01/03/2013 12:01 PM, Steven Bosscher wrote:
>>
>> Hello,
>>
>> Consider the following test case:
>>
>> void bar (void);
>> int foo (int b, int c, int d)
>> {
>>int r = 0;
>>if (b)
>>  res = b * 2 + 4;
>>if (c)
>>  {
>>if (d)
>>  r = res;
>>else
>>  __builtin_unreachable ();
>>  }
>>return r;
>> }
>>
>> This is typical for code in GCC itself in places where
>> gcc_unreachable() is used.
>
>
>>
>> The corresponding CFG looks like this:
>>
>>  +-+
>>  | bb0 |
>>  +-+
>>|
>>|
>>v
>>  +-+
>>  | bb2 | -+
>>  +-+  |
>>|  |
>>|  |
>>v  |
>>  +-+  |
>>  | bb3 |  |
>>  +-+  |
>>|  |
>>|  |
>>v  |
>> +-+ +-+  |
>> | bb8 | <-- | bb4 | <+
>> +-+ +-+
>>|   |
>>|   |
>>|   v
>>| +-+ +-+
>>| | bb5 | --> | bb7 |
>>| +-+ +-+
>>|   |
>>|   |
>>|   v
>>| +-+
>>| | bb6 |
>>| +-+
>>|   |
>>|   |
>>|   v
>>| +-+
>>+---> | bb9 |
>>  +-+
>>|
>>|
>>v
>>  +-+
>>  | bb1 |
>>  +-+
>
> Presumably BB7 was created in response to the builtin_unreachable?

Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up
at the end of the GIMPLE passes pipeline, in the fold-all-builtins
pass (most __builtin_unreachable calls are, but not all).


> One
> could argue that an empty dead-end basic block should just be removed and
> the CFG appropriately simplified.

The problem with this, is that __builtin_unreachable actually exposes
optimization opportunities: more const/copy props of "implicit sets"
in the predicate guarding the __builtin_unreachable call, more
optimistic value numbering, etc. It also helps improve maybe-unused
warnings accuracy. So simply removing these "dead ends" in the CFG is
probably not a good idea.


> You might want to look  at a discussion from Oct/Nov 2011 "New pass to
> delete unexecutable paths in the CFG" which touches on some of this stuff.

That's a really interesting discussion! I must have missed it at the time :-)


> It's not 100% the same, but the concept of eliminating edges from the CFG
> which we can never traverse in a conforming program applies to both your
> example and the stuff I was playing with.

I think there is one important difference: In the thread you referred
to, you're removing paths in the CFG that are implicitly not
executable (for some languages anyway), whereas a
__builtin_unreachable call is an explicit marker for "this can never
happen". I think this difference is important:

* The explicit marker may have been put there on purpose (e.g. to get
rid of a false-positive warning). The compiler should respect that. An
implicit unreachable path can be optimized away without regard for the
user's intent.

* The explicit marker should not inhibit optimizations. For an
implicit unreachable path the compiler should be conservative. But for
a __builtin_unreachable call that is the only statement in a basic
block, the compiler should be allowed to work as if the block really
is never reached.

The attached patch implements these ideas. During a tree-CFG cleanup,
basic blocks containing only a __builtin_unreachable call are marked
with a new flag BB_NEVER_REACHED. The flag is used in post-dominance:
A marked block is not considered in the computations of the immediate
post-dominator of the predecessor blocks. The result is a more
optimistic post-dominance tree: Without the patch all predecessors of
these BB_NEVER_REACHED blocks have the EXIT block as their
post-dominator, but with the patch this non-executable path in the CFG
is ignored and the post-dominators are those that would result if the
BB_NEVER_REACHED blocks are not there at all (the BB_NEVER_REACHED
blocks themselves are only post-dominated by the EXIT block).

I've also added a control dependence calculation function. It's not
currently used, but it shows how the BB_NEVER_REACHED flag is used in
this case to avoid the "false" control dependences that I showed in
the graphs in http://gcc.gnu.org/ml/gcc/2013-01/msg00021.html.

Bootstrapped&tested on powerpc64-unknown-linux-gnu.
What do you think of this approach?

Ciao!
Steven
* cfg-flags.def (BB_NEVER_REACHED): New flag on basic_block.
(BB_SUPERBLOCK): Add FIXME, document that this flag should be removed.
* tree-cfgcleanup.c (is_only_builtin_unreachable_block): New function.
(mark

[Patch, fortran] PR fortran/42769 typebound call resolved incorrectly.

2013-01-05 Thread Mikael Morin

Hello,

here is a fix for PR42769, where a type bound call was resolved 
statically into a normal procedure call, but using the local procedure 
(with the same name) instead of the original (and correct) one.
It turned out to not be OOP specific, and be a pure module loading 
problem in the way we associate a symbol number in the module file with 
the corresponding symbol pointer once module is loaded.


The first module.c hunk fixes comment #14 in the PR. Here, p == NULL, 
which means that the user didn't ask to load the symbol (USE,ONLY:...), 
so we can just continue to the next one.  However, as an optimization, 
we try to find the symbol in the current name space and associate the 
pointer info (mapping module's integer <-> symbol pointer) with it if 
found to avoid reloading it from the module file if needed.
The problem was there was no check that the symbol is really the same, 
not another symbol with the same name.  Fixed by checking symbol's name 
and module name too.  The original code was only reusing the symtree; I 
took the opportunity to reuse the symbol as well, and the test suite 
didn't complain :-).


The second module.c hunk fixes comment #24 in the PR.  Here, p != NULL: 
the symbol will have to be loaded.  Before creating a new symbol and a 
new symtree and inserting them in the current name space, we check 
whether a symtree of the same name exists in the current name space. If 
it exists and is not ambiguous, it is either the same one as the one 
from the module, or both are generic so we can reuse the existing symbol 
in both cases.  However, if the symtree in the name space is ambiguous 
we shall not reuse it as it is different than the one to be loaded. 
Thus fixed by not reusing it in that case, and letting a unique non 
ambiguous name being automatically generated later.


The resolve.c hunk fixes comment #34 in the PR.  This one is not a 
module problem, but let's fix all cases in one go.  Resolve_call reloads 
the call's procedure symbol to take into account contained procedure 
which were not known at parse time.  However, it uses the symbol's name 
(original name) for lookup instead of the symtree's (local name).  The 
resolve.c hunk changes that.


The tests are adapted from comment 14, 24, 34 from pr42769, and the 
original test cases from pr45836 and pr45900.

Regression tested on x86_64-unknown-linux-gnu. OK for trunk?
4.7?
4.6?

Mikael





2013-01-05  Mikael Morin  

PR fortran/42769
PR fortran/45836
PR fortran/45900
* module.c (read_module): Don't reuse local symtree if the associated
symbol isn't exactly the one wanted.  Don't reuse local symtree if it is
ambiguous.
* resolve.c (resolve_call): Use symtree's name instead of symbol's to
lookup the symtree.


diff --git a/module.c b/module.c
index cde5739..e63e510 100644
--- a/module.c
+++ b/module.c
@@ -4656,8 +4656,14 @@ read_module (void)
  if (p == NULL)
{
  st = gfc_find_symtree (gfc_current_ns->sym_root, name);
- if (st != NULL)
-   info->u.rsym.symtree = st;
+ if (st != NULL
+ && strcmp (st->n.sym->name, info->u.rsym.true_name) == 0
+ && st->n.sym->module != NULL
+ && strcmp (st->n.sym->module, info->u.rsym.module) == 0)
+   {
+ info->u.rsym.symtree = st;
+ info->u.rsym.sym = st->n.sym;
+   }
  continue;
}
 
@@ -4678,7 +4684,8 @@ read_module (void)
  /* Check for ambiguous symbols.  */
  if (check_for_ambiguous (st->n.sym, info))
st->ambiguous = 1;
- info->u.rsym.symtree = st;
+ else
+   info->u.rsym.symtree = st;
}
  else
{
diff --git a/resolve.c b/resolve.c
index d4d5eb9..17ee716 100644
--- a/resolve.c
+++ b/resolve.c
@@ -3778,7 +3778,7 @@ resolve_call (gfc_code *c)
   if (csym && gfc_current_ns->parent && csym->ns != gfc_current_ns)
 {
   gfc_symtree *st;
-  gfc_find_sym_tree (csym->name, gfc_current_ns, 1, &st);
+  gfc_find_sym_tree (c->symtree->name, gfc_current_ns, 1, &st);
   sym = st ? st->n.sym : NULL;
   if (sym && csym != sym
  && sym->ns == gfc_current_ns


2013-01-05  Mikael Morin  

PR fortran/42769
PR fortran/45836
PR fortran/45900
* gfortran.dg/use_23.f90: New test.
* gfortran.dg/use_24.f90: New test.
* gfortran.dg/use_25.f90: New test.
* gfortran.dg/use_26.f90: New test.
* gfortran.dg/use_27.f90: New test.



! { dg-do compile }
!
! PR fortran/42769
! This test used to ICE in resolve_typebound_procedure because T1's GET
! procedure was wrongly associated to MOD2's MY_GET (instead of the original
! MOD1's MY_GET) in MOD3's SUB.
!
! Original testcase by Salvator Filippone 
! Reduced by Janus Weil 

module mod1
  type :: t1
  contains
p

Re: [Patch, libfortran] Improve performance of byte swapped IO

2013-01-05 Thread Janne Blomqvist
On Sat, Jan 5, 2013 at 5:35 PM, Richard Biener
 wrote:
> On Fri, Jan 4, 2013 at 11:35 PM, Andreas Schwab  wrote:
>> Janne Blomqvist  writes:
>>
>>> diff --git a/libgfortran/io/file_pos.c b/libgfortran/io/file_pos.c
>>> index c8ecc3a..bf2250a 100644
>>> --- a/libgfortran/io/file_pos.c
>>> +++ b/libgfortran/io/file_pos.c
>>> @@ -140,15 +140,21 @@ unformatted_backspace (st_parameter_filepos *fpp, 
>>> gfc_unit *u)
>>>   }
>>>else
>>>   {
>>> +   uint32_t u32;
>>> +   uint64_t u64;
>>> switch (length)
>>>   {
>>>   case sizeof(GFC_INTEGER_4):
>>> -   reverse_memcpy (&m4, p, sizeof (m4));
>>> +   memcpy (&u32, p, sizeof (u32));
>>> +   u32 = __builtin_bswap32 (u32);
>>> +   m4 = *(GFC_INTEGER_4*)&u32;
>>
>> Isn't that an aliasing violation?
>
> It looks like one.  Why not simply do
>
>m4 = (GFC_INTEGER_4) u32;
>
> ?  I suppose GFC_INTEGER_4 is always the same size as uint32_t but signed?

Yes, GFC_INTEGER_4 is a typedef for int32_t. As for why I didn't do
the above, C99 6.3.1.3(3) says that if the unsigned value is outside
the range of the signed variable, the result is
implementation-defined. Though I suppose the sensible
"implementation-defined behavior" in this case on a two's complement
target is to just do a bitwise copy.

Anyway, to be really safe one could use memcpy instead; the compiler
optimizes small fixed size memcpy's just fine. Updated patch attached.


-- 
Janne Blomqvist


bswap2.diff
Description: Binary data


Re: Control dependence vs. builtin_unreachable

2013-01-05 Thread Steven Bosscher
On Sat, Jan 5, 2013 at 9:10 PM, Steven Bosscher wrote:
> Bootstrapped&tested on powerpc64-unknown-linux-gnu.

And to be clear, bootstrapped with this patch on top:
Index: system.h
===
--- system.h(revision 194924)
+++ system.h(working copy)
@@ -698,7 +698,7 @@

 /* Use gcc_unreachable() to mark unreachable locations (like an
unreachable default case of a switch.  Do not use gcc_assert(0).  */
-#if (GCC_VERSION >= 4005) && !ENABLE_ASSERT_CHECKING
+#if (GCC_VERSION >= 4005) //&& !ENABLE_ASSERT_CHECKING
 #define gcc_unreachable() __builtin_unreachable()
 #else
 #define gcc_unreachable() (fancy_abort (__FILE__, __LINE__, __FUNCTION__))

Otherwise, __builtin_unreachable would be almost unused and
bootstrap+test wouldn't prove much :-)

Ciao!
Steven


Re: [PING^4] PR 54805: __gthread_tsd* in vxlib-tls.c

2013-01-05 Thread rbmj

On 06-Dec-12 10:14, rbmj wrote:

On 26-Nov-12 13:27, rbmj wrote:

On 11/13/2012 10:22 PM, rbmj wrote:

On 11/5/2012 12:57 PM, rbmj wrote:

This removes warnings about implicit declarations and fixes one of the
function calls in vxlib-tls.c for vxworks targets.

I got the old prototypes from
http://gcc.gnu.org/ml/gcc-patches/2005-08/msg01314.html

See bug for further details.

Someone please comment or commit :)



Ping: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00406.html

Robert Mason




Ping^2


Ping^3?



Ping^4?


Re: [cxx-conversion] Change uses of htab_t in gcc/config to hash_table.

2013-01-05 Thread Xinliang David Li
I noticed that the traverse and traverse_noresize method takes
Argument as the first template parameter. It should be moved to be the
second after the callback. In most of the cases, the type of the
Argument can be deduced from the callsite, so that the user only need
to specify the callback:

ht->traverse_noresize(&arg);

In the current way, user has to do this:

ht->traverse_noresize (&arg);

which is not as friendly.

David


On Tue, Dec 18, 2012 at 8:02 PM, Lawrence Crowl  wrote:
> Update various config htab_t uses to hash_table.
>
> Modify types and calls to match.
>
> * config/arm/arm.c'arm_libcall_uses_aapcs_base::libcall_htab
>
> Fold libcall_eq and libcall_hash into new struct libcall_hasher.
>
> * config/ia64/ia64.c'bundle_state_table
>
> Fold bundle_state_hash and bundle_state_eq_p into new struct
> bundle_state_hasher.
>
> * config/mips/mips.c'mips_offset_table
>
> Fold mips_lo_sum_offset_hash and mips_lo_sum_offset_eq into new
> struct mips_lo_sum_offset_hasher.
>
> In mips_reorg_process_insns, change call to for_each_rtx to pass
> a pointer to the hash_table rather than a htab_t.  This change
> requires then dereferencing that pointer in mips_record_lo_sum to
> obtain the hash_table.
>
> * config/sol2.c'solaris_comdat_htab
>
> Fold comdat_hash and comdat_eq into new struct comdat_entry_hasher.
>
> * config/i386/winnt.c'i386_pe_section_type_flags::htab
>
> * config/i386/winnt.c'i386_find_on_wrapper_list::wrappers
>
> Fold wrapper_strcmp into new struct wrapped_symbol_hasher.
>
>
> Tested on x86-64.
> Tested with contrib/config-list.mk.
>
>
> Okay for branch?
>
>
> Index: gcc/config/arm/arm.c
> ===
> --- gcc/config/arm/arm.c(revision 194511)
> +++ gcc/config/arm/arm.c(working copy)
> @@ -25,6 +25,7 @@
>  #include "config.h"
>  #include "system.h"
>  #include "coretypes.h"
> +#include "hash-table.h"
>  #include "tm.h"
>  #include "rtl.h"
>  #include "tree.h"
> @@ -3716,36 +3717,48 @@ arm_function_value(const_tree type, cons
>return arm_libcall_value_1 (mode);
>  }
>
> -static int
> -libcall_eq (const void *p1, const void *p2)
> +/* libcall hashtable helpers.  */
> +
> +struct libcall_hasher : typed_noop_remove 
>  {
> -  return rtx_equal_p ((const_rtx) p1, (const_rtx) p2);
> +  typedef rtx_def value_type;
> +  typedef rtx_def compare_type;
> +  static inline hashval_t hash (const value_type *);
> +  static inline bool equal (const value_type *, const compare_type *);
> +  static inline void remove (value_type *);
> +};
> +
> +inline bool
> +libcall_hasher::equal (const value_type *p1, const compare_type *p2)
> +{
> +  return rtx_equal_p (p1, p2);
>  }
>
> -static hashval_t
> -libcall_hash (const void *p1)
> +inline hashval_t
> +libcall_hasher::hash (const value_type *p1)
>  {
> -  return hash_rtx ((const_rtx) p1, VOIDmode, NULL, NULL, FALSE);
> +  return hash_rtx (p1, VOIDmode, NULL, NULL, FALSE);
>  }
>
> +typedef hash_table  libcall_table_type;
> +
>  static void
> -add_libcall (htab_t htab, rtx libcall)
> +add_libcall (libcall_table_type htab, rtx libcall)
>  {
> -  *htab_find_slot (htab, libcall, INSERT) = libcall;
> +  *htab.find_slot (libcall, INSERT) = libcall;
>  }
>
>  static bool
>  arm_libcall_uses_aapcs_base (const_rtx libcall)
>  {
>static bool init_done = false;
> -  static htab_t libcall_htab;
> +  static libcall_table_type libcall_htab;
>
>if (!init_done)
>  {
>init_done = true;
>
> -  libcall_htab = htab_create (31, libcall_hash, libcall_eq,
> - NULL);
> +  libcall_htab.create (31);
>add_libcall (libcall_htab,
>convert_optab_libfunc (sfloat_optab, SFmode, SImode));
>add_libcall (libcall_htab,
> @@ -3804,7 +3817,7 @@ arm_libcall_uses_aapcs_base (const_rtx l
> DFmode));
>  }
>
> -  return libcall && htab_find (libcall_htab, libcall) != NULL;
> +  return libcall && libcall_htab.find (libcall) != NULL;
>  }
>
>  static rtx
> Index: gcc/config/arm/t-arm
> ===
> --- gcc/config/arm/t-arm(revision 194511)
> +++ gcc/config/arm/t-arm(working copy)
> @@ -73,8 +73,8 @@ $(srcdir)/config/arm/arm-tables.opt: $(s
> $(SHELL) $(srcdir)/config/arm/genopt.sh $(srcdir)/config/arm > \
> $(srcdir)/config/arm/arm-tables.opt
>
> -arm.o: $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
> -  $(RTL_H) $(TREE_H) $(OBSTACK_H) $(REGS_H) hard-reg-set.h \
> +arm.o: $(srcdir)/config/arm/arm.c $(CONFIG_H) $(SYSTEM_H) coretypes.h 
> $(TM_H) \
> +  $(RTL_H) $(TREE_H) $(HASH_TABLE_H) $(OBSTACK_H) $(REGS_H) hard-reg-set.h \
>insn-config.h conditions.h output.h dumpfile.h \
>$(INSN_ATTR_H) $(FLAGS_H) reload.h $(FUNCTION_H) \
>$(EXPR_H) $(OPTABS_H) $(RECOG_H) $(CGRAPH_H) \
> Index: gcc/config/i386/winnt.c
> ==