Re: [Patch, Fortran] FINAL (prep patches 4/5): Support noncontiguous arrays in the finalization wrapper function
Dear Tobias, I think that the patch would be much less opaque if repeated operation to produce code were turned into functions, as I did for the defined assignment patch; eg resolve.c(build_assignment, add_code_to_chain)? What you have done is OK but it is HEAVY, as is much of the content of class.c. However, unless your intestinal fortitude is of a very high order, I suggest that this be left as a latter clean up operation :-) A nit: + do idx2 = 1, rank + offset = offset + mod (idx, sizes(idx2)) / size(idx2-1) * strides(idx2) + end do s/size(idx2-1)/sizes(idx2-1)/ Apart from that, the patch is OK for trunk. Thanks Paul On 31 December 2012 15:11, Tobias Burnus wrote: > Dear all, > > this lengthy patch supports noncontiguous arrays in the finalization > wrapper. That encompasses bother the scalarizer (used for finalizing the > components and for an ELEMENTAL FINAL subroutine) and calling array FINAL > subroutines. For the latter, the subroutine is directly called if possible. > Namely, when the element size of the actual type is the same as the one of > the declared type - and the the FINAL subroutine is either assumed-shape > without the contiguous attribute or the actual argument is contiguous. > Otherwise, the code packs the array. > > The code is written such that it works for any array rank. I explicitly > avoided using GFC_MAX_DIMENSIONS to allow for more ranks without breaking > the ABI. > > The code consists of two new blocks of code. The new function > "finalization_get_offset" which generates the code to translate from an > element index to the byte offset - and in generate_finalization_wrapper to > fill the array "strides" and "sizes", where the latter contains the > multiplied up size, i.e. sizes(0) == 1, sizes(1) = size(array,dim=1), > sizes(2) = sizes(1)*size(array,dim=2) etc. > > Note: Without patch 5/5, this code is never executed. > > Build and regtested on x86-64-gnu-linux - and tested (with the not submitted > patch for invoking the finalizer). > OK for the trunk? > > Tobias -- The knack of flying is learning how to throw yourself at the ground and miss. --Hitchhikers Guide to the Galaxy
Re: [wwwdocs] SH 4.8 changes - document thread pointer built-ins
Hi, On Wed, 2013-01-02 at 19:13 -1000, Gerald Pfeifer wrote: > Hi Oleg, > > On Wed, 17 Oct 2012, Oleg Endo wrote: > >> +Added support for the built-in functions > >> +__builtin_thread_pointer and > >> +__builtin_set_thread_pointer. This assumes that > >> +GBR is used to hold the thread pointer of the current > >> thread, > >> +which has been the case since a while already. > >> > >> "since a while" -> "for a while", and I made that change. > >> That said, why is this important, and is there a fixed date or version? > > It might be important for some embedded systems software that does not > > use the GBR for storing the thread pointer, but for something else (like > > a pointer to some global table of frequently used stuff or something > > like that). I just thought it might be better to mention this. But > > you're right, the last "for a while" part sounds strange, and should > > probably just be removed, reducing it to "This assumes that > > GBR is used to hold the thread pointer of the current > > thread." > > That sounds good. I noticed this has not been changed yet, so I > assume you were probably waiting for my response? Will you be > making this change? I also assume I was waiting for your response ;) (it's been a while, can't remember exactly). I'll send a patch with the change. Thanks for reminding me. Cheers, Oleg
Re: RFA: Fix ICE on PARALLEL returns when expand builtins
On Fri, Jan 4, 2013 at 7:57 PM, Eric Botcazou wrote: >> Hmm, how would anyone else get at the "padding" info dealing with the >> parallel? This looks broken if we need access to the type :/ > > Yes, you need to access the type for return values, but that's not really new. Ick :/ >> Eric, any comments? > > Richard's patch seems fine to me, modulo the name of the routine; I'd rather > use maybe_emit_group_store or something along these lines. Ok, fine with me - you're clearly more experienced with this code. Thanks, Richard. > -- > Eric Botcazou
Re: [Patch, libfortran] Improve performance of byte swapped IO
On Fri, Jan 4, 2013 at 11:35 PM, Andreas Schwab wrote: > Janne Blomqvist writes: > >> diff --git a/libgfortran/io/file_pos.c b/libgfortran/io/file_pos.c >> index c8ecc3a..bf2250a 100644 >> --- a/libgfortran/io/file_pos.c >> +++ b/libgfortran/io/file_pos.c >> @@ -140,15 +140,21 @@ unformatted_backspace (st_parameter_filepos *fpp, >> gfc_unit *u) >> } >>else >> { >> + uint32_t u32; >> + uint64_t u64; >> switch (length) >> { >> case sizeof(GFC_INTEGER_4): >> - reverse_memcpy (&m4, p, sizeof (m4)); >> + memcpy (&u32, p, sizeof (u32)); >> + u32 = __builtin_bswap32 (u32); >> + m4 = *(GFC_INTEGER_4*)&u32; > > Isn't that an aliasing violation? It looks like one. Why not simply do m4 = (GFC_INTEGER_4) u32; ? I suppose GFC_INTEGER_4 is always the same size as uint32_t but signed? Richard. > > Andreas. > > -- > Andreas Schwab, sch...@linux-m68k.org > GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 > "And now for something completely different."
Re: Control dependence vs. builtin_unreachable
On Thu, Jan 3, 2013 at 9:51 PM, Jeff Law wrote: > On 01/03/2013 12:01 PM, Steven Bosscher wrote: >> >> Hello, >> >> Consider the following test case: >> >> void bar (void); >> int foo (int b, int c, int d) >> { >>int r = 0; >>if (b) >> res = b * 2 + 4; >>if (c) >> { >>if (d) >> r = res; >>else >> __builtin_unreachable (); >> } >>return r; >> } >> >> This is typical for code in GCC itself in places where >> gcc_unreachable() is used. > > >> >> The corresponding CFG looks like this: >> >> +-+ >> | bb0 | >> +-+ >>| >>| >>v >> +-+ >> | bb2 | -+ >> +-+ | >>| | >>| | >>v | >> +-+ | >> | bb3 | | >> +-+ | >>| | >>| | >>v | >> +-+ +-+ | >> | bb8 | <-- | bb4 | <+ >> +-+ +-+ >>| | >>| | >>| v >>| +-+ +-+ >>| | bb5 | --> | bb7 | >>| +-+ +-+ >>| | >>| | >>| v >>| +-+ >>| | bb6 | >>| +-+ >>| | >>| | >>| v >>| +-+ >>+---> | bb9 | >> +-+ >>| >>| >>v >> +-+ >> | bb1 | >> +-+ > > Presumably BB7 was created in response to the builtin_unreachable? Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up at the end of the GIMPLE passes pipeline, in the fold-all-builtins pass (most __builtin_unreachable calls are, but not all). > One > could argue that an empty dead-end basic block should just be removed and > the CFG appropriately simplified. The problem with this, is that __builtin_unreachable actually exposes optimization opportunities: more const/copy props of "implicit sets" in the predicate guarding the __builtin_unreachable call, more optimistic value numbering, etc. It also helps improve maybe-unused warnings accuracy. So simply removing these "dead ends" in the CFG is probably not a good idea. > You might want to look at a discussion from Oct/Nov 2011 "New pass to > delete unexecutable paths in the CFG" which touches on some of this stuff. That's a really interesting discussion! I must have missed it at the time :-) > It's not 100% the same, but the concept of eliminating edges from the CFG > which we can never traverse in a conforming program applies to both your > example and the stuff I was playing with. I think there is one important difference: In the thread you referred to, you're removing paths in the CFG that are implicitly not executable (for some languages anyway), whereas a __builtin_unreachable call is an explicit marker for "this can never happen". I think this difference is important: * The explicit marker may have been put there on purpose (e.g. to get rid of a false-positive warning). The compiler should respect that. An implicit unreachable path can be optimized away without regard for the user's intent. * The explicit marker should not inhibit optimizations. For an implicit unreachable path the compiler should be conservative. But for a __builtin_unreachable call that is the only statement in a basic block, the compiler should be allowed to work as if the block really is never reached. The attached patch implements these ideas. During a tree-CFG cleanup, basic blocks containing only a __builtin_unreachable call are marked with a new flag BB_NEVER_REACHED. The flag is used in post-dominance: A marked block is not considered in the computations of the immediate post-dominator of the predecessor blocks. The result is a more optimistic post-dominance tree: Without the patch all predecessors of these BB_NEVER_REACHED blocks have the EXIT block as their post-dominator, but with the patch this non-executable path in the CFG is ignored and the post-dominators are those that would result if the BB_NEVER_REACHED blocks are not there at all (the BB_NEVER_REACHED blocks themselves are only post-dominated by the EXIT block). I've also added a control dependence calculation function. It's not currently used, but it shows how the BB_NEVER_REACHED flag is used in this case to avoid the "false" control dependences that I showed in the graphs in http://gcc.gnu.org/ml/gcc/2013-01/msg00021.html. Bootstrapped&tested on powerpc64-unknown-linux-gnu. What do you think of this approach? Ciao! Steven * cfg-flags.def (BB_NEVER_REACHED): New flag on basic_block. (BB_SUPERBLOCK): Add FIXME, document that this flag should be removed. * tree-cfgcleanup.c (is_only_builtin_unreachable_block): New function. (mark
[Patch, fortran] PR fortran/42769 typebound call resolved incorrectly.
Hello, here is a fix for PR42769, where a type bound call was resolved statically into a normal procedure call, but using the local procedure (with the same name) instead of the original (and correct) one. It turned out to not be OOP specific, and be a pure module loading problem in the way we associate a symbol number in the module file with the corresponding symbol pointer once module is loaded. The first module.c hunk fixes comment #14 in the PR. Here, p == NULL, which means that the user didn't ask to load the symbol (USE,ONLY:...), so we can just continue to the next one. However, as an optimization, we try to find the symbol in the current name space and associate the pointer info (mapping module's integer <-> symbol pointer) with it if found to avoid reloading it from the module file if needed. The problem was there was no check that the symbol is really the same, not another symbol with the same name. Fixed by checking symbol's name and module name too. The original code was only reusing the symtree; I took the opportunity to reuse the symbol as well, and the test suite didn't complain :-). The second module.c hunk fixes comment #24 in the PR. Here, p != NULL: the symbol will have to be loaded. Before creating a new symbol and a new symtree and inserting them in the current name space, we check whether a symtree of the same name exists in the current name space. If it exists and is not ambiguous, it is either the same one as the one from the module, or both are generic so we can reuse the existing symbol in both cases. However, if the symtree in the name space is ambiguous we shall not reuse it as it is different than the one to be loaded. Thus fixed by not reusing it in that case, and letting a unique non ambiguous name being automatically generated later. The resolve.c hunk fixes comment #34 in the PR. This one is not a module problem, but let's fix all cases in one go. Resolve_call reloads the call's procedure symbol to take into account contained procedure which were not known at parse time. However, it uses the symbol's name (original name) for lookup instead of the symtree's (local name). The resolve.c hunk changes that. The tests are adapted from comment 14, 24, 34 from pr42769, and the original test cases from pr45836 and pr45900. Regression tested on x86_64-unknown-linux-gnu. OK for trunk? 4.7? 4.6? Mikael 2013-01-05 Mikael Morin PR fortran/42769 PR fortran/45836 PR fortran/45900 * module.c (read_module): Don't reuse local symtree if the associated symbol isn't exactly the one wanted. Don't reuse local symtree if it is ambiguous. * resolve.c (resolve_call): Use symtree's name instead of symbol's to lookup the symtree. diff --git a/module.c b/module.c index cde5739..e63e510 100644 --- a/module.c +++ b/module.c @@ -4656,8 +4656,14 @@ read_module (void) if (p == NULL) { st = gfc_find_symtree (gfc_current_ns->sym_root, name); - if (st != NULL) - info->u.rsym.symtree = st; + if (st != NULL + && strcmp (st->n.sym->name, info->u.rsym.true_name) == 0 + && st->n.sym->module != NULL + && strcmp (st->n.sym->module, info->u.rsym.module) == 0) + { + info->u.rsym.symtree = st; + info->u.rsym.sym = st->n.sym; + } continue; } @@ -4678,7 +4684,8 @@ read_module (void) /* Check for ambiguous symbols. */ if (check_for_ambiguous (st->n.sym, info)) st->ambiguous = 1; - info->u.rsym.symtree = st; + else + info->u.rsym.symtree = st; } else { diff --git a/resolve.c b/resolve.c index d4d5eb9..17ee716 100644 --- a/resolve.c +++ b/resolve.c @@ -3778,7 +3778,7 @@ resolve_call (gfc_code *c) if (csym && gfc_current_ns->parent && csym->ns != gfc_current_ns) { gfc_symtree *st; - gfc_find_sym_tree (csym->name, gfc_current_ns, 1, &st); + gfc_find_sym_tree (c->symtree->name, gfc_current_ns, 1, &st); sym = st ? st->n.sym : NULL; if (sym && csym != sym && sym->ns == gfc_current_ns 2013-01-05 Mikael Morin PR fortran/42769 PR fortran/45836 PR fortran/45900 * gfortran.dg/use_23.f90: New test. * gfortran.dg/use_24.f90: New test. * gfortran.dg/use_25.f90: New test. * gfortran.dg/use_26.f90: New test. * gfortran.dg/use_27.f90: New test. ! { dg-do compile } ! ! PR fortran/42769 ! This test used to ICE in resolve_typebound_procedure because T1's GET ! procedure was wrongly associated to MOD2's MY_GET (instead of the original ! MOD1's MY_GET) in MOD3's SUB. ! ! Original testcase by Salvator Filippone ! Reduced by Janus Weil module mod1 type :: t1 contains p
Re: [Patch, libfortran] Improve performance of byte swapped IO
On Sat, Jan 5, 2013 at 5:35 PM, Richard Biener wrote: > On Fri, Jan 4, 2013 at 11:35 PM, Andreas Schwab wrote: >> Janne Blomqvist writes: >> >>> diff --git a/libgfortran/io/file_pos.c b/libgfortran/io/file_pos.c >>> index c8ecc3a..bf2250a 100644 >>> --- a/libgfortran/io/file_pos.c >>> +++ b/libgfortran/io/file_pos.c >>> @@ -140,15 +140,21 @@ unformatted_backspace (st_parameter_filepos *fpp, >>> gfc_unit *u) >>> } >>>else >>> { >>> + uint32_t u32; >>> + uint64_t u64; >>> switch (length) >>> { >>> case sizeof(GFC_INTEGER_4): >>> - reverse_memcpy (&m4, p, sizeof (m4)); >>> + memcpy (&u32, p, sizeof (u32)); >>> + u32 = __builtin_bswap32 (u32); >>> + m4 = *(GFC_INTEGER_4*)&u32; >> >> Isn't that an aliasing violation? > > It looks like one. Why not simply do > >m4 = (GFC_INTEGER_4) u32; > > ? I suppose GFC_INTEGER_4 is always the same size as uint32_t but signed? Yes, GFC_INTEGER_4 is a typedef for int32_t. As for why I didn't do the above, C99 6.3.1.3(3) says that if the unsigned value is outside the range of the signed variable, the result is implementation-defined. Though I suppose the sensible "implementation-defined behavior" in this case on a two's complement target is to just do a bitwise copy. Anyway, to be really safe one could use memcpy instead; the compiler optimizes small fixed size memcpy's just fine. Updated patch attached. -- Janne Blomqvist bswap2.diff Description: Binary data
Re: Control dependence vs. builtin_unreachable
On Sat, Jan 5, 2013 at 9:10 PM, Steven Bosscher wrote: > Bootstrapped&tested on powerpc64-unknown-linux-gnu. And to be clear, bootstrapped with this patch on top: Index: system.h === --- system.h(revision 194924) +++ system.h(working copy) @@ -698,7 +698,7 @@ /* Use gcc_unreachable() to mark unreachable locations (like an unreachable default case of a switch. Do not use gcc_assert(0). */ -#if (GCC_VERSION >= 4005) && !ENABLE_ASSERT_CHECKING +#if (GCC_VERSION >= 4005) //&& !ENABLE_ASSERT_CHECKING #define gcc_unreachable() __builtin_unreachable() #else #define gcc_unreachable() (fancy_abort (__FILE__, __LINE__, __FUNCTION__)) Otherwise, __builtin_unreachable would be almost unused and bootstrap+test wouldn't prove much :-) Ciao! Steven
Re: [PING^4] PR 54805: __gthread_tsd* in vxlib-tls.c
On 06-Dec-12 10:14, rbmj wrote: On 26-Nov-12 13:27, rbmj wrote: On 11/13/2012 10:22 PM, rbmj wrote: On 11/5/2012 12:57 PM, rbmj wrote: This removes warnings about implicit declarations and fixes one of the function calls in vxlib-tls.c for vxworks targets. I got the old prototypes from http://gcc.gnu.org/ml/gcc-patches/2005-08/msg01314.html See bug for further details. Someone please comment or commit :) Ping: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00406.html Robert Mason Ping^2 Ping^3? Ping^4?
Re: [cxx-conversion] Change uses of htab_t in gcc/config to hash_table.
I noticed that the traverse and traverse_noresize method takes Argument as the first template parameter. It should be moved to be the second after the callback. In most of the cases, the type of the Argument can be deduced from the callsite, so that the user only need to specify the callback: ht->traverse_noresize(&arg); In the current way, user has to do this: ht->traverse_noresize (&arg); which is not as friendly. David On Tue, Dec 18, 2012 at 8:02 PM, Lawrence Crowl wrote: > Update various config htab_t uses to hash_table. > > Modify types and calls to match. > > * config/arm/arm.c'arm_libcall_uses_aapcs_base::libcall_htab > > Fold libcall_eq and libcall_hash into new struct libcall_hasher. > > * config/ia64/ia64.c'bundle_state_table > > Fold bundle_state_hash and bundle_state_eq_p into new struct > bundle_state_hasher. > > * config/mips/mips.c'mips_offset_table > > Fold mips_lo_sum_offset_hash and mips_lo_sum_offset_eq into new > struct mips_lo_sum_offset_hasher. > > In mips_reorg_process_insns, change call to for_each_rtx to pass > a pointer to the hash_table rather than a htab_t. This change > requires then dereferencing that pointer in mips_record_lo_sum to > obtain the hash_table. > > * config/sol2.c'solaris_comdat_htab > > Fold comdat_hash and comdat_eq into new struct comdat_entry_hasher. > > * config/i386/winnt.c'i386_pe_section_type_flags::htab > > * config/i386/winnt.c'i386_find_on_wrapper_list::wrappers > > Fold wrapper_strcmp into new struct wrapped_symbol_hasher. > > > Tested on x86-64. > Tested with contrib/config-list.mk. > > > Okay for branch? > > > Index: gcc/config/arm/arm.c > === > --- gcc/config/arm/arm.c(revision 194511) > +++ gcc/config/arm/arm.c(working copy) > @@ -25,6 +25,7 @@ > #include "config.h" > #include "system.h" > #include "coretypes.h" > +#include "hash-table.h" > #include "tm.h" > #include "rtl.h" > #include "tree.h" > @@ -3716,36 +3717,48 @@ arm_function_value(const_tree type, cons >return arm_libcall_value_1 (mode); > } > > -static int > -libcall_eq (const void *p1, const void *p2) > +/* libcall hashtable helpers. */ > + > +struct libcall_hasher : typed_noop_remove > { > - return rtx_equal_p ((const_rtx) p1, (const_rtx) p2); > + typedef rtx_def value_type; > + typedef rtx_def compare_type; > + static inline hashval_t hash (const value_type *); > + static inline bool equal (const value_type *, const compare_type *); > + static inline void remove (value_type *); > +}; > + > +inline bool > +libcall_hasher::equal (const value_type *p1, const compare_type *p2) > +{ > + return rtx_equal_p (p1, p2); > } > > -static hashval_t > -libcall_hash (const void *p1) > +inline hashval_t > +libcall_hasher::hash (const value_type *p1) > { > - return hash_rtx ((const_rtx) p1, VOIDmode, NULL, NULL, FALSE); > + return hash_rtx (p1, VOIDmode, NULL, NULL, FALSE); > } > > +typedef hash_table libcall_table_type; > + > static void > -add_libcall (htab_t htab, rtx libcall) > +add_libcall (libcall_table_type htab, rtx libcall) > { > - *htab_find_slot (htab, libcall, INSERT) = libcall; > + *htab.find_slot (libcall, INSERT) = libcall; > } > > static bool > arm_libcall_uses_aapcs_base (const_rtx libcall) > { >static bool init_done = false; > - static htab_t libcall_htab; > + static libcall_table_type libcall_htab; > >if (!init_done) > { >init_done = true; > > - libcall_htab = htab_create (31, libcall_hash, libcall_eq, > - NULL); > + libcall_htab.create (31); >add_libcall (libcall_htab, >convert_optab_libfunc (sfloat_optab, SFmode, SImode)); >add_libcall (libcall_htab, > @@ -3804,7 +3817,7 @@ arm_libcall_uses_aapcs_base (const_rtx l > DFmode)); > } > > - return libcall && htab_find (libcall_htab, libcall) != NULL; > + return libcall && libcall_htab.find (libcall) != NULL; > } > > static rtx > Index: gcc/config/arm/t-arm > === > --- gcc/config/arm/t-arm(revision 194511) > +++ gcc/config/arm/t-arm(working copy) > @@ -73,8 +73,8 @@ $(srcdir)/config/arm/arm-tables.opt: $(s > $(SHELL) $(srcdir)/config/arm/genopt.sh $(srcdir)/config/arm > \ > $(srcdir)/config/arm/arm-tables.opt > > -arm.o: $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ > - $(RTL_H) $(TREE_H) $(OBSTACK_H) $(REGS_H) hard-reg-set.h \ > +arm.o: $(srcdir)/config/arm/arm.c $(CONFIG_H) $(SYSTEM_H) coretypes.h > $(TM_H) \ > + $(RTL_H) $(TREE_H) $(HASH_TABLE_H) $(OBSTACK_H) $(REGS_H) hard-reg-set.h \ >insn-config.h conditions.h output.h dumpfile.h \ >$(INSN_ATTR_H) $(FLAGS_H) reload.h $(FUNCTION_H) \ >$(EXPR_H) $(OPTABS_H) $(RECOG_H) $(CGRAPH_H) \ > Index: gcc/config/i386/winnt.c > ==