RE: Integer promotion for register based arguments

2012-07-26 Thread Jon Beniston
Hi Eric,

> > I guess my question is what would I need to change to make it work
> > like the ARM port? I can't see how this is being controlled.
> 
> Try TARGET_PROMOTE_PROTOTYPES.

Thanks, it does turn out to be this, but I was confused by the
documentation. If the hook returns true, I see sign extension performed in
the callee; if it returns false, no sign extension is performed in the
callee.

The documentation for this comes under the "Passing Function Arguments on
the Stack" section, which says:

"This target hook returns true if an argument declared in a prototype as an
integral type smaller than int should actually be passed as an int. In
addition to avoiding errors in certain cases of mismatch, it also makes for
better code on certain machines."

I would have thought if the args smaller than an int are actually passed as
an int, that would have meant the promotion had already taken place and so
wasn't needed in the callee. It could also be said that it makes worse code
on other machines :)
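
To make the two conventions concrete, here is a small C model. It is only a sketch: the helper names and the idea of a 32-bit "register" carrying a signed 8-bit argument are assumptions for illustration, not GCC internals, and it says nothing about which convention a given port actually uses. It shows that a callee-side extension is redundant when the caller has already promoted, and necessary when the caller passed the byte raw.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 32-bit argument register carrying an 8-bit value. */

/* Convention 1: the caller sign-extends before the call. */
static uint32_t caller_promotes(int8_t arg)
{
    return (uint32_t)(int32_t)arg;
}

/* Convention 2: the caller writes only the low byte; the upper 24 bits
   keep whatever the register held before (passed in as `stale`). */
static uint32_t caller_passes_raw(int8_t arg, uint32_t stale)
{
    return (stale & 0xFFFFFF00u) | (uint8_t)arg;
}

/* Callee-side sign extension of the low byte, ignoring the upper bits. */
static int32_t callee_extends(uint32_t reg)
{
    int32_t low = (int32_t)(reg & 0xFFu);
    return low >= 0x80 ? low - 0x100 : low;
}

/* Both conventions yield the right value once the callee extends. */
static void check_promotion_model(void)
{
    assert(callee_extends(caller_promotes(-5)) == -5);
    assert(callee_extends(caller_passes_raw(-5, 0x12345600u)) == -5);
}
```

Calling check_promotion_model() from a test program exercises both paths. Under convention 1 the callee's extension is wasted work, which is the "worse code on other machines" case.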

Thanks,
Jon




Re: Integer promotion for register based arguments

2012-07-26 Thread Andrew Haley
On 07/26/2012 09:03 AM, Jon Beniston wrote:
> Hi Eric,
> 
>>> I guess my question is what would I need to change to make it work
>>> like the ARM port? I can't see how this is being controlled.
>>
>> Try TARGET_PROMOTE_PROTOTYPES.
> 
> Thanks, it does turn out to be this, but I was confused by the
> documentation. If the hook returns true, I see sign extension performed in
> the callee; if it returns false, no sign extension is performed in the
> callee.
> 
> The documentation for this comes under the "Passing Function Arguments on
> the Stack" section, which says:
> 
> "This target hook returns true if an argument declared in a prototype as an
> integral type smaller than int should actually be passed as an int. In
> addition to avoiding errors in certain cases of mismatch, it also makes for
> better code on certain machines."
> 
> I would have thought if the args smaller than an int are actually passed as
> an int, that would have meant the promotion had already taken place and so
> wasn't needed in the callee. It could also be said that it makes worse code
> on other machines :)

Indeed.  If that's true, then either the doc or the code is wrong, or at
least highly misleading.

Andrew.




Re: Optimize attribute and inlining

2012-07-26 Thread Richard Guenther
On Wed, Jul 25, 2012 at 8:25 PM, David Brown  wrote:
> On 25/07/12 17:30, Richard Guenther wrote:
>>
>> On Wed, Jul 25, 2012 at 4:07 PM, Selvaraj, Senthil_Kumar
>>   wrote:
>>>
>>> Declaring a function with __attribute__((optimize("O0"))) turns off
>>> inlining for the translation unit (at least) containing the function
>>> (see output at the end). Is this expected behavior?
>>
>>
>> Not really.  The optimize attribute processing should only affect
>> flags it saves.  -f[no-]inline is not meaningful per function and we
>> have the noinline attribute for more proper handling.
>>
>> That said, I consider the optimize attribute code seriously broken
>> and unmaintained (but sometimes useful for debugging - and only
>> that).
>>
>
> That's a pity.  It's understandable - changing optimisation levels on
> different functions is always going to be problematic, since inter-function
> optimisations (like inlining) are going to be difficult to define.  But
> sometimes it could be nice to use specific optimisations in specific places,
> such as loop unrolling in a critical function while other code is to be
> optimised for code size.  Does "#pragma GCC optimize" work more reliably?

No, it uses the same mechanism internally.

Richard.


Re: Optimize attribute and inlining

2012-07-26 Thread David Brown

On 26/07/2012 11:12, Richard Guenther wrote:
> On Wed, Jul 25, 2012 at 8:25 PM, David Brown  wrote:
>> On 25/07/12 17:30, Richard Guenther wrote:
>>>
>>> On Wed, Jul 25, 2012 at 4:07 PM, Selvaraj, Senthil_Kumar
>>>   wrote:
>>>>
>>>> Declaring a function with __attribute__((optimize("O0"))) turns off
>>>> inlining for the translation unit (at least) containing the function
>>>> (see output at the end). Is this expected behavior?
>>>
>>> Not really.  The optimize attribute processing should only affect
>>> flags it saves.  -f[no-]inline is not meaningful per function and we
>>> have the noinline attribute for more proper handling.
>>>
>>> That said, I consider the optimize attribute code seriously broken
>>> and unmaintained (but sometimes useful for debugging - and only
>>> that).
>>
>> That's a pity.  It's understandable - changing optimisation levels on
>> different functions is always going to be problematic, since inter-function
>> optimisations (like inlining) are going to be difficult to define.  But
>> sometimes it could be nice to use specific optimisations in specific places,
>> such as loop unrolling in a critical function while other code is to be
>> optimised for code size.  Does "#pragma GCC optimize" work more reliably?
>
> No, it uses the same mechanism internally.
>
> Richard.



Is it reliable to use "#pragma GCC optimize" options at the start of the
file, as an alternative to specifying them in the command line?  For
example, a Makefile might specify "-Os" as standard options for all C
files, but one particular file might have "#pragma GCC optimize 3" at
the start.  If the line is at the start of the file, before any
#includes or code, then there would be no mixing of optimisation levels.
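
As an illustration of the layout being asked about, here is a minimal sketch with the pragma as the first line of the file and a per-function optimize attribute further down. The function names are made up for the example. It only shows the syntax GCC accepts; it does not demonstrate that the requested levels are honoured, which is exactly what is in doubt in this thread.

```c
#pragma GCC optimize ("O3")   /* placed before any #include or code */

#include <assert.h>
#include <stddef.h>

/* Compiled under the file-level request above. */
static long sum_fast(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Per-function override via the optimize attribute. */
__attribute__((optimize("O0")))
static long sum_plain(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Whatever the optimisation level, both must compute the same sum. */
static void check_sums(void)
{
    static const int data[] = {1, 2, 3, 4};
    assert(sum_fast(data, 4) == 10);
    assert(sum_plain(data, 4) == 10);
}
```

Comparing the assembly from "gcc -Os -S" on this file against "gcc -O3 -S" on the same file without the pragma would be one way to see whether the two really match.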


And if these options are so broken, should they be marked as such in the 
manual?


Thanks for your time here.

David



Re: Optimize attribute and inlining

2012-07-26 Thread Richard Guenther
On Thu, Jul 26, 2012 at 1:21 PM, David Brown  wrote:
> On 26/07/2012 11:12, Richard Guenther wrote:
>>
>> On Wed, Jul 25, 2012 at 8:25 PM, David Brown 
>> wrote:
>>>
>>> On 25/07/12 17:30, Richard Guenther wrote:


>>>> On Wed, Jul 25, 2012 at 4:07 PM, Selvaraj, Senthil_Kumar
>>>>   wrote:
>>>>>
>>>>> Declaring a function with __attribute__((optimize("O0"))) turns off
>>>>> inlining for the translation unit (at least) containing the function
>>>>> (see output at the end). Is this expected behavior?
>>>>
>>>> Not really.  The optimize attribute processing should only affect
>>>> flags it saves.  -f[no-]inline is not meaningful per function and we
>>>> have the noinline attribute for more proper handling.
>>>>
>>>> That said, I consider the optimize attribute code seriously broken
>>>> and unmaintained (but sometimes useful for debugging - and only
>>>> that).

>>>
>>> That's a pity.  It's understandable - changing optimisation levels on
>>> different functions is always going to be problematic, since
>>> inter-function
>>> optimisations (like inlining) are going to be difficult to define.  But
>>> sometimes it could be nice to use specific optimisations in specific
>>> places,
>>> such as loop unrolling in a critical function while other code is to be
>>> optimised for code size.  Does "#pragma GCC optimize" work more reliably?
>>
>>
>> No, it uses the same mechanism internally.
>>
>> Richard.
>>
>
> Is it reliable to use "#pragma GCC optimize" options at the start of the
> file, as an alternative to specifying them in the command line?  For
> example, a Makefile might specify "-Os" as standard options for all C files,
> but one particular file might have "#pragma GCC optimize 3" at the start.
> If the line is at the start of the file, before any #includes or code, then
> there would be no mixing of optimisation levels.

Behavior will not be the same for

gcc -include foo.h x.c

when you change x.c in the proposed way compared to

gcc -include foo.h x.c -O3

> And if these options are so broken, should they be marked as such in the
> manual?

Probably yes.

> Thanks for your time here.
>
> David
>


memset and host char requirement

2012-07-26 Thread Paulo J. Matos

Hi,

My target has 16bit chars.
What I am seeing is that in a memset call, the call is not inlined by 
GCC whenever fill value is bigger than host char.


This seems to be due to the code (GCC 4.6.5) in target_char_cast 
(builtins.c), called from expand_builtin_memset_args:


static int
target_char_cast (tree cst, char *p)
{
  unsigned HOST_WIDE_INT val, hostval;

  if (TREE_CODE (cst) != INTEGER_CST
      || CHAR_TYPE_SIZE > HOST_BITS_PER_WIDE_INT)
    return 1;

  val = TREE_INT_CST_LOW (cst);
  if (CHAR_TYPE_SIZE < HOST_BITS_PER_WIDE_INT)
    val &= (((unsigned HOST_WIDE_INT) 1) << CHAR_TYPE_SIZE) - 1;

  hostval = val;
  if (HOST_BITS_PER_CHAR < HOST_BITS_PER_WIDE_INT)
    hostval &= (((unsigned HOST_WIDE_INT) 1) << HOST_BITS_PER_CHAR) - 1;

  if (val != hostval)
    return 1;

  *p = hostval;
  return 0;
}


This requires the tree cst variable to fit in target char (which makes 
sense) and in host char (which doesn't make sense).


Why would the fill value in a memset call be required to fit in a host 
char?
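
To show the effect in isolation, here is a self-contained model of the same check. It is a sketch under stated assumptions: char_cast_model is a hypothetical stand-in, the two widths are passed as parameters instead of coming from CHAR_TYPE_SIZE and HOST_BITS_PER_CHAR, and a 64-bit integer plays the role of HOST_WIDE_INT. With a 16-bit target char and an 8-bit host char, any fill value above 0xFF is rejected even though it fits the target char.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for target_char_cast: the widths are parameters
   here, whereas GCC takes them from CHAR_TYPE_SIZE and HOST_BITS_PER_CHAR.
   Returns 0 and stores the value on success, 1 on rejection. */
static int char_cast_model(uint64_t cst, unsigned target_char_bits,
                           unsigned host_char_bits, unsigned char *p)
{
    uint64_t val = cst, hostval;

    if (target_char_bits > 64)
        return 1;
    if (target_char_bits < 64)
        val &= (UINT64_C(1) << target_char_bits) - 1;   /* truncate to target char */

    hostval = val;
    if (host_char_bits < 64)
        hostval &= (UINT64_C(1) << host_char_bits) - 1; /* truncate to host char */

    if (val != hostval)   /* value needs more bits than a host char has */
        return 1;

    *p = (unsigned char)hostval;
    return 0;
}

/* 0x1234 fits a 16-bit target char but is rejected with an 8-bit host char. */
static void check_char_cast_model(void)
{
    unsigned char c = 0;
    assert(char_cast_model(0x41, 16, 8, &c) == 0 && c == 0x41);
    assert(char_cast_model(0x1234, 16, 8, &c) == 1);
    assert(char_cast_model(0x1234, 8, 8, &c) == 0 && c == 0x34);
}
```

The last assertion covers the usual 8-bit case, where the value is first truncated to the target char and the host check then passes trivially.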


Cheers,

--
PMatos




Re: memset and host char requirement

2012-07-26 Thread Richard Guenther
On Thu, Jul 26, 2012 at 2:11 PM, Paulo J. Matos  wrote:
> Hi,
>
> My target has 16bit chars.
> What I am seeing is that in a memset call, the call is not inlined by GCC
> whenever fill value is bigger than host char.
>
> This seems to be due to the code (GCC 4.6.5) in target_char_cast
> (builtins.c), called from expand_builtin_memset_args:
>
> static int
> target_char_cast (tree cst, char *p)
> {
>   unsigned HOST_WIDE_INT val, hostval;
>
>   if (TREE_CODE (cst) != INTEGER_CST
>       || CHAR_TYPE_SIZE > HOST_BITS_PER_WIDE_INT)
>     return 1;
>
>   val = TREE_INT_CST_LOW (cst);
>   if (CHAR_TYPE_SIZE < HOST_BITS_PER_WIDE_INT)
>     val &= (((unsigned HOST_WIDE_INT) 1) << CHAR_TYPE_SIZE) - 1;
>
>   hostval = val;
>   if (HOST_BITS_PER_CHAR < HOST_BITS_PER_WIDE_INT)
>     hostval &= (((unsigned HOST_WIDE_INT) 1) << HOST_BITS_PER_CHAR) - 1;
>
>   if (val != hostval)
>     return 1;
>
>   *p = hostval;
>   return 0;
> }
>
>
> This requires the tree cst variable to fit in target char (which makes
> sense) and in host char (which doesn't make sense).
>
> Why would the fill value in a memset call be required to fit in a host char?

Obviously because of the implementation detail of its caller.

Richard.

> Cheers,
>
> --
> PMatos
>
>


Re: memset and host char requirement

2012-07-26 Thread Paulo J. Matos

On 26/07/12 13:27, Richard Guenther wrote:
>> Why would the fill value in a memset call be required to fit in a host char?
>
> Obviously because of the implementation detail of its caller.
>
> Richard.



Richard, I am sorry if I was not clear. I understand that this is
required because the caller uses a pointer to a char to pass the value
of the fill, therefore the fill must fit in a host char. But is there
any reason to introduce this constraint?


Wouldn't it be more flexible, albeit probably more complex, to pass the 
value through a TARGET char or a HOST type as big as a TARGET char?


Cheers,
--
PMatos





Re: memset and host char requirement

2012-07-26 Thread Andrew Haley
On 07/26/2012 01:32 PM, Paulo J. Matos wrote:
> On 26/07/12 13:27, Richard Guenther wrote:
>>> Why would the fill value in a memset call be required to fit in a host char?
>>
>> Obviously because of the implementation detail of its caller.
> 
> Richard, I am sorry if I was not clear. I understand that this is
> required because the caller uses a pointer to a char to pass the value
> of the fill, therefore the fill must fit in a host char. But is there
> any reason to introduce this constraint?
> 
> Wouldn't it be more flexible, albeit probably more complex, to pass the 
> value through a TARGET char or a HOST type as big as a TARGET char?

Probably.  I suspect a patch would be welcome.

Andrew.



Re: memset and host char requirement

2012-07-26 Thread Richard Guenther
On Thu, Jul 26, 2012 at 2:32 PM, Paulo J. Matos  wrote:
> On 26/07/12 13:27, Richard Guenther wrote:
>>>
>>> Why would the fill value in a memset call be required to fit in a host
>>> char?
>>
>>
>> Obviously because of the implementation detail of its caller.
>>
>> Richard.
>>
>
> Richard, I am sorry if I was not clear. I understand that this is
> required because the caller uses a pointer to a char to pass the value of
> the fill, therefore the fill must fit in a host char. But is there any
> reason to introduce this constraint?

Simplicity and testing matrix of the one who wrote this code.  Patches
welcome I guess.

Richard.

> Wouldn't it be more flexible, albeit probably more complex, to pass the
> value through a TARGET char or a HOST type as big as a TARGET char?
>
> Cheers,
> --
> PMatos
>
>
>


Re: Optimize attribute and inlining

2012-07-26 Thread David Brown

On 26/07/2012 14:04, Richard Guenther wrote:
> On Thu, Jul 26, 2012 at 1:21 PM, David Brown  wrote:
>> On 26/07/2012 11:12, Richard Guenther wrote:
>>> On Wed, Jul 25, 2012 at 8:25 PM, David Brown  wrote:
>>>> On 25/07/12 17:30, Richard Guenther wrote:
>>>>> On Wed, Jul 25, 2012 at 4:07 PM, Selvaraj, Senthil_Kumar
>>>>>   wrote:
>>>>>> Declaring a function with __attribute__((optimize("O0"))) turns off
>>>>>> inlining for the translation unit (at least) containing the function
>>>>>> (see output at the end). Is this expected behavior?
>>>>>
>>>>> Not really.  The optimize attribute processing should only affect
>>>>> flags it saves.  -f[no-]inline is not meaningful per function and we
>>>>> have the noinline attribute for more proper handling.
>>>>>
>>>>> That said, I consider the optimize attribute code seriously broken
>>>>> and unmaintained (but sometimes useful for debugging - and only
>>>>> that).
>>>>
>>>> That's a pity.  It's understandable - changing optimisation levels on
>>>> different functions is always going to be problematic, since
>>>> inter-function optimisations (like inlining) are going to be difficult
>>>> to define.  But sometimes it could be nice to use specific
>>>> optimisations in specific places, such as loop unrolling in a critical
>>>> function while other code is to be optimised for code size.  Does
>>>> "#pragma GCC optimize" work more reliably?
>>>
>>> No, it uses the same mechanism internally.
>>>
>>> Richard.
>>
>> Is it reliable to use "#pragma GCC optimize" options at the start of the
>> file, as an alternative to specifying them in the command line?  For
>> example, a Makefile might specify "-Os" as standard options for all C
>> files, but one particular file might have "#pragma GCC optimize 3" at the
>> start.  If the line is at the start of the file, before any #includes or
>> code, then there would be no mixing of optimisation levels.
>
> Behavior will not be the same for
>
> gcc -include foo.h x.c
>
> when you change x.c in the proposed way compared to
>
> gcc -include foo.h x.c -O3


Fair enough - but will "gcc x.c -Os" with "#pragma GCC optimize 3" as
the first line of "x.c" do the same thing as "gcc x.c -O3"?


I realise that we can't get everything here.  C in general, and gcc in 
particular, is extremely flexible - there are always going to be ways to 
cause trouble.


But while it is always best to write source code that works correctly 
regardless of the optimisation levels, occasionally there is code with 
special requirements.  It might be code that should be optimised 
particularly aggressively, or it might be code that should not be 
optimised, or have other options (such as older code that breaks strict 
aliasing rules).  I think such requirements are best put in the C file 
(as pragmas or attributes) rather than in a Makefile - it binds them 
tighter to the code in question.


If the pragmas work as long as they come before any other code, then 
that gives them a useful, though limited, purpose.





>> And if these options are so broken, should they be marked as such in the
>> manual?
>
> Probably yes.


Or they could be fixed :-)

Could the pragmas be changed (or replaced) with ones explicitly designed
to override command-line options, regardless of where they appear in the
file?  Of course, you are still going to get conflicts when someone
specifies more than one C file on the command line, each of which
contains such overrides - I think the reasonable thing to do in that case
is just to exit with an error message.





Thanks for your time here.

David





build6_stat removed?

2012-07-26 Thread Iyer, Balaji V
Hello Everyone,
I have a question regarding build6_stat. I saw that in the 7/25 merge
someone removed this function. Why was it removed? I am currently using it
in my Cilk Plus branch. What is a workaround for this? Am I allowed to put
this function back in?

Thanks,

Balaji V. Iyer.


Re: _darwin10_Unwind_FindEnclosingFunction

2012-07-26 Thread Jack Howarth
On Mon, Jul 23, 2012 at 12:23:59PM +0100, Bryce McKinlay wrote:
> libgcc_s and libgcj contain a hack which renames
> _Unwind_FindEnclosingFunction to
> _darwin10_Unwind_FindEnclosingFunction on darwin targets. It appears
> this was introduced to work around an issue in OS X 10.6 where the
> _Unwind_FindEnclosingFunction was implemented as a stub which called
> abort(). see: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41991
> 
> This has since been fixed in OS X 10.7+, and the system's
> _Unwind_FindEnclosingFunction now works.
> 
> In Mac OS X 10.7 and 10.8, libgcc_s is installed as a symlink to libSystem:
> 
> $ ls -l /usr/lib/libgcc_s.1.dylib
> lrwxr-xr-x 1 root wheel 17 Jun 19 13:16 /usr/lib/libgcc_s.1.dylib ->
> libSystem.B.dylib
> 
> Unfortunately this means that libgcj does not work on a standard Mac
> OS X installation, because dyld will see the symlink and resolve
> libgcc_s to libSystem before it checks anywhere else:
> 
> $ gcj Hello.class --main=Hello
> $ ./a.out
> dyld: _dyld_bind_fully_image_containing_address() error
> dyld: Symbol not found: __darwin10_Unwind_FindEnclosingFunction
>   Referenced from: /usr/local/lib/libgcj.13.dylib
>   Expected in: /usr/lib/libSystem.B.dylib
>  in /usr/local/lib/libgcj.13.dylib
> Trace/BPT trap: 5

The following works fine here using a gcc trunk built on 10.8...

howarth% gcc-fsf-4.8 -v
Using built-in specs.
COLLECT_GCC=gcc-fsf-4.8
COLLECT_LTO_WRAPPER=/sw/lib/gcc4.8/libexec/gcc/x86_64-apple-darwin12.0.0/4.8.0/lto-wrapper
Target: x86_64-apple-darwin12.0.0
Configured with: ../gcc-4.8-20120725/configure --prefix=/sw 
--prefix=/sw/lib/gcc4.8 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.8/info 
--enable-languages=c,c++,fortran,lto,objc,obj-c++,java --with-gmp=/sw 
--with-libiconv-prefix=/sw --with-isl=/sw --with-cloog=/sw --with-mpc=/sw 
--with-system-zlib --enable-checking=yes --x-includes=/usr/X11R6/include 
--x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.8
Thread model: posix
gcc version 4.8.0 20120725 (experimental) (GCC) 

howarth% cat testme.java
public class testme {
  public static void main(String args[]){
    System.out.println("Hello");
  }
}

howarth% gcj-fsf-4.8 --main=testme -O testme.java
howarth% ./a.out
Hello

> 
> This can be worked around by adjusting the system library path, or
> forcing libgcc_s to be loaded with DYLD_INSERT_LIBRARIES, but libgcj
> should work out-of-the-box without having to hack the dyld
> configuration - so clearly we should not be renaming
> _Unwind_FindEnclosingFunction on OS X 10.7+ configurations.
> 
> But I'm not convinced that this solution was ever really right to
> begin with. Even on a 10.6 system, things ought to work so long as you
> ensure libgcc_s is loaded before libSystem. Shouldn't the
> _darwin10_Unwind_FindEnclosingFunction rename be removed entirely?

If I recall correctly, Apple added some magic to make sure that the symbols
previously in libgcc_s would always be resolved from libSystem. I can dig
out the email traffic on that with the darwin linker developer later. Since
10.6 would remain broken, if 10.7/10.8 no longer needs the hack we would
still have to use that hack when targeting darwin10. Iain Sandoe made the 
last adjustment to this hack...

2010-08-17  Iain Sandoe  

* include/posix.h: Make substitution of
_darwin10_Unwind_FindEnclosingFunction conditional on
OSX >= 10.6 (Darwin10).

Perhaps we can expand it to OSX >= 10.6 and < 10.7? Note that the unwinder
situation is pretty ugly because from 10.6 onwards we are using a
compatibility unwinder and not really the unwinder from libgcc in FSF gcc.
Only a single unwinder can be used and it should always be the system
unwinder. The compatibility unwinder doesn't use FDEs and over-aggressively
sets aborts in functions like _Unwind_FindEnclosingFunction. Apple may well
have removed the aborts for those calls (although they are probably still
effectively no-ops).
   Jack

> 
> Bryce


Re: memset and host char requirement

2012-07-26 Thread Joseph S. Myers
On Thu, 26 Jul 2012, Paulo J. Matos wrote:

> My target has 16bit chars.

As I explained before, support for such targets is extremely limited and 
bitrotten (this applies whether it is BITS_PER_UNIT, CHAR_TYPE_SIZE or 
both that are not 8) and a large amount of work, and global GCC expertise 
and vision for what internal interfaces should look like in the context of 
such bytes, will be required to remove assumptions about target bytes 
being 8 bits before such a port can work.  It would not surprise me if a 
series of more (possibly much more) than 100 large patches to the 
target-independent compiler is needed to make such targets work properly.

http://gcc.gnu.org/ml/gcc/2010-03/msg00445.html

It's not particularly useful to raise questions about "why" something is
broken in such a case, as it's simply generally not been a design 
consideration at all; rather, propose a clear definition of a relevant 
interface that is meaningful for such a case, with a complete patch making 
all the code follow that definition.  And repeat 100 (possibly many more) 
times until all interfaces are properly defined for general target byte 
sizes.  And keep a careful watch on all patches going through gcc-patches 
for anything hardcoding new assumptions about target byte size.

-- 
Joseph S. Myers
jos...@codesourcery.com


Double word left shift optimisation

2012-07-26 Thread Jon Beniston
Hi,

I'd like to try to optimise double word left shifts of sign/zero extended
operands if a widening multiply instruction is available. For the following
code:

long long f(long a, long b)
{   
  return (long long)a << b;
}

ARM, MIPS etc expand to a fairly long sequence like:

nor     $3,$0,$5
sra     $2,$4,31
srl     $7,$4,1
srl     $7,$7,$3
sll     $2,$2,$5
andi    $6,$5,0x20
sll     $3,$4,$5
or      $2,$7,$2
movn    $2,$3,$6
movn    $3,$0,$6

I'd like to optimise this to something like:

 (long long) a * (1 << b)

Which should just be 3 or so instructions. I don't think this can be
sensibly done in the target backend as the generated pattern is too
complicated to match and am not familiar with the middle end. Any
suggestions as to where and how this should be best implemented?
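
As a sanity check on the proposed rewrite, here is a small C sketch comparing the two forms. It uses fixed-width types in place of long and long long and made-up helper names, so treat it as an assumption-laden model, not compiler code. The forms agree for shift counts 0 to 30. At 31, 1 << b no longer fits a signed 32-bit operand, so a signed widening multiply gives a different result; counts of 32 and above are still valid for the 64-bit shift but cannot be expressed as a 32-bit 1 << b at all.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: widen first, then shift in 64 bits.  The shift is done
   on the unsigned representation; converting back to int64_t relies on
   the two's complement wraparound that GCC documents. */
static int64_t shift_wide(int32_t a, unsigned b)
{
    return (int64_t)((uint64_t)(int64_t)a << b);
}

/* Proposed form: build 1 << b in 32 bits, then one signed widening
   multiply.  Only meaningful for b in 0..30; b must stay below 32. */
static int64_t mul_wide(int32_t a, unsigned b)
{
    int32_t pow2 = (int32_t)(UINT32_C(1) << b);
    return (int64_t)a * pow2;
}

/* The two forms agree for every shift count from 0 to 30. */
static void check_forms_agree(int32_t a)
{
    for (unsigned b = 0; b <= 30; b++)
        assert(shift_wide(a, b) == mul_wide(a, b));
}
```

So any pattern doing this rewrite would presumably need either a known-small shift count or a fallback for counts of 31 and above.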

Thanks,
Jon




HTTP header doesn't specify utf-8

2012-07-26 Thread Dhruv Matani
Hello,

This page: http://gcc.gnu.org/onlinedocs/libstdc++/faq.html (and other
pages) don't include a utf-8 charset in the content-type http header,
which is causing the page to be rendered incorrectly in firefox. Is it
possible to fix that? Even though the html header contains the utf-8
line, firefox is rendering it incorrectly. In fact, even IE renders it
wrong (scroll to the bottom of the page and see the funny accented
characters).

Expected header: Content-Type: text/xml; charset=UTF-8

Got:
$ curl -I "http://gcc.gnu.org/onlinedocs/libstdc++/faq.html"
HTTP/1.1 200 OK
Date: Thu, 26 Jul 2012 16:47:48 GMT
Server: Apache/2.0.52 (Red Hat)
Last-Modified: Wed, 18 Jan 2012 00:55:03 GMT
ETag: "1245db-f939-e5d9a7c0"
Accept-Ranges: bytes
Content-Length: 63801
Vary: Accept-Encoding
Content-Type: text/html
X-Pad: avoid browser bug


-- 
   -Dhruv Matani.
http://dhruvbird.com/

"What's the simplest thing that could possibly work?"
-- Ward Cunningham


Re: HTTP header doesn't specify utf-8

2012-07-26 Thread Jonathan Wakely
On 26 July 2012 17:54, Dhruv Matani wrote:
> Hello,
>
> This page: http://gcc.gnu.org/onlinedocs/libstdc++/faq.html (and other
> pages) don't include a utf-8 charset in the content-type http header,

See http://gcc.gnu.org/ml/gcc/2012-04/msg00597.html and
http://gcc.gnu.org/ml/gcc/2012-06/msg00125.html


Re: HTTP header doesn't specify utf-8

2012-07-26 Thread Dhruv Matani
Thanks!

http://gcc.gnu.org/ml/gcc/2012-06/msg00128.html also mentions
http header munging as the preferred method.

http://gcc.gnu.org/ml/gcc/2012-04/msg00597.html shows why ff & ie are right.

Thanks again!
-Dhruv.

On Thu, Jul 26, 2012 at 10:19 AM, Jonathan Wakely  wrote:
> On 26 July 2012 17:54, Dhruv Matani wrote:
>> Hello,
>>
>> This page: http://gcc.gnu.org/onlinedocs/libstdc++/faq.html (and other
>> pages) don't include a utf-8 charset in the content-type http header,
>
> See http://gcc.gnu.org/ml/gcc/2012-04/msg00597.html and
> http://gcc.gnu.org/ml/gcc/2012-06/msg00125.html



-- 
   -Dhruv Matani.
http://dhruvbird.com/

"What's the simplest thing that could possibly work?"
-- Ward Cunningham


Re: Double word left shift optimisation

2012-07-26 Thread Ian Lance Taylor
On Thu, Jul 26, 2012 at 8:57 AM, Jon Beniston  wrote:
>
> I'd like to try to optimise double word left shifts of sign/zero extended
> operands if a widening multiply instruction is available. For the following
> code:
>
> long long f(long a, long b)
> {
>   return (long long)a << b;
> }
>
> ARM, MIPS etc expand to a fairly long sequence like:
>
> nor     $3,$0,$5
> sra     $2,$4,31
> srl     $7,$4,1
> srl     $7,$7,$3
> sll     $2,$2,$5
> andi    $6,$5,0x20
> sll     $3,$4,$5
> or      $2,$7,$2
> movn    $2,$3,$6
> movn    $3,$0,$6
>
> I'd like to optimise this to something like:
>
>  (long long) a * (1 << b)
>
> Which should just be 3 or so instructions. I don't think this can be
> sensibly done in the target backend as the generated pattern is too
> complicated to match and am not familiar with the middle end. Any
> suggestions as to where and how this should be best implemented?

It seems to me that you could just add an ashldi3 pattern.

Ian


Function return type depends on its recursive invocation

2012-07-26 Thread Simone Pellegrini

Hello,
I was experimenting with trailing return types together with
decltype to define a function whose return type depends on the recursive 
invocation of the function itself.


The basic idea is to be able to define a meta-function which takes a 
variable number of arguments and arranges the elements into a tree, for 
example:


makeTree(int,bool,std::string) => std::pair<int, std::pair<bool, std::string>>


Because the return type depends on the number and type of the function 
parameters, I defined the function in the following way:


template <class Arg1, class Arg2, class Arg3, class... Args>
auto makeTree(const Arg1& arg1, const Arg2& arg2, const Arg3& arg3,
              const Args&... args) ->
    std::pair<Arg1, decltype(makeTree(arg2, arg3, args...))>
{
    return {arg1, makeTree(arg2,arg3,args...)};
}

// termination case which takes 2 arguments
template <class LhsTy, class RhsTy>
std::pair<LhsTy, RhsTy> makeTree(const LhsTy& lhs, const RhsTy& rhs) {
    return {lhs, rhs};
}

I am using GCC 4.7.1. The code compiles when I invoke the makeTree with 
2 and 3 arguments, however it fails when I go beyond 3 with the 
following error message:


test.cpp: In function ‘int main(int, char**)’:
test.cpp:26:22: error: no matching function for call to ‘makeTree(int, 
int, int, int)’

test.cpp:26:22: note: candidates are:
test.cpp:8:24: note: template std::pair<_T1, 
_T2> makeTree(const LhsTy&, const RhsTy&)

test.cpp:8:24: note: template argument deduction/substitution failed:
test.cpp:26:22: note: candidate expects 2 arguments, 4 provided
test.cpp:13:6: note: template... Args> std::pair...))> makeTree(const Arg1&, const Arg2&, const Arg3&, const Args& ...)

test.cpp:13:6: note: template argument deduction/substitution failed:
test.cpp: In substitution of ‘templateArg3, class ... Args> std::pairargs ...))> makeTree(const Arg1&, const Arg2&, const Arg3&, const Args& 
...) [with Arg1 = int; Arg2 = int; Arg3 = int; Args = {int}]’:

test.cpp:26:22: required from here
test.cpp:13:6: error: no matching function for call to ‘makeTree(const 
int&, const int&, const int&)’

test.cpp:13:6: note: candidate is:
test.cpp:8:24: note: template std::pair<_T1, 
_T2> makeTree(const LhsTy&, const RhsTy&)

test.cpp:8:24: note: template argument deduction/substitution failed:
test.cpp:13:6: note: candidate expects 2 arguments, 3 provided


I wonder if this is a limitation of GCC or this kind of usage of 
decltype is not compliant with the C++11 standard.


More details are available on the following link: 
http://cpplove.blogspot.co.at/2012/07/decltype-insanity-aka-when-return-type.html


thanks for your time,

Simone Pellegrini



Re: Double word left shift optimisation

2012-07-26 Thread Oleg Endo
On Thu, 2012-07-26 at 10:51 -0700, Ian Lance Taylor wrote:
> On Thu, Jul 26, 2012 at 8:57 AM, Jon Beniston  
> wrote:
> >
> > I'd like to try to optimise double word left shifts of sign/zero extended
> > operands if a widening multiply instruction is available. For the following
> > code:
> >
> > long long f(long a, long b)
> > {
> >   return (long long)a << b;
> > }
> >
> > ARM, MIPS etc expand to a fairly long sequence like:
> >
> > nor     $3,$0,$5
> > sra     $2,$4,31
> > srl     $7,$4,1
> > srl     $7,$7,$3
> > sll     $2,$2,$5
> > andi    $6,$5,0x20
> > sll     $3,$4,$5
> > or      $2,$7,$2
> > movn    $2,$3,$6
> > movn    $3,$0,$6
> >
> > I'd like to optimise this to something like:
> >
> >  (long long) a * (1 << b)
> >
> > Which should just be 3 or so instructions. I don't think this can be
> > sensibly done in the target backend as the generated pattern is too
> > complicated to match and am not familiar with the middle end. Any
> > suggestions as to where and how this should be best implemented?
> 
> It seems to me that you could just add an ashldi3 pattern.
> 

This is interesting.  I've quickly tried it out on the SH port.  It can
be accomplished with the combine pass, although there are a few things
that should be taken care of:
- an "extendsidi2" pattern is required (so that the extension is not
  performed before expand)
- an "ashldi3" pattern that accepts "reg:DI << reg:DI"
- maybe some adjustments to the costs calculations
  (wasn't required in my case)

With those in place, combine will try to match the following pattern

(define_insn_and_split "*"
  [(set (match_operand:DI 0 "arith_reg_dest" "=r")
        (ashift:DI (sign_extend:DI (match_operand:SI 1 "arith_reg_operand" "r"))
                   (sign_extend:DI (match_operand:SI 2 "arith_reg_operand" "r"))))]
  "TARGET_SH2"
  "#"
  "&& can_create_pseudo_p ()"
  [(const_int 0)]
{
  rtx tmp = gen_reg_rtx (SImode);
  emit_move_insn (tmp, const1_rtx);
  emit_insn (gen_ashlsi3 (tmp, tmp, operands[2]));
  emit_insn (gen_mulsidi3 (operands[0], tmp, operands[1]));
  DONE;
})

which eventually results in the expected output

mov     #1,r1       ! 24  movsi_i/3    [length = 2]
shld    r5,r1       ! 25  ashlsi3_d    [length = 2]
dmuls.l r4,r1       ! 27  mulsidi3_i   [length = 2]
sts     macl,r0     ! 28  movsi_i/5    [length = 2]
rts                 ! 35  *return_i    [length = 2]
sts     mach,r1     ! 29  movsi_i/5    [length = 2]

One potential pitfall might be the handling of a real "reg:DI << reg:DI"
if there are no patterns already there that handle it (as it is the case
for the SH port).  If I observed correctly, the "ashldi3" expander must
not FAIL for a "reg:DI << reg:DI" (to do a lib call), or else combine
would not arrive at the pattern above.

Hope this helps.

Cheers,
Oleg




Re: Function return type depends on its recursive invocation

2012-07-26 Thread Jonathan Wakely
On 26 July 2012 20:36, Simone Pellegrini wrote:
> Hello,
> I was experimenting with trailing return types together with
> decltype to define a function whose return type depends on the recursive
> invocation of the function itself.

This question is not appropriate for this mailing list, which is for
discussing development of GCC, not questions about using GCC or
questions about C++.

Your question would be more appropriate for the gcc-help list or
better yet, on a more general forum for C++ questions, such as
Stackoverflow, where you'll find your question answered already e.g.
http://stackoverflow.com/questions/11596898/variadic-template-and-inferred-return-type-in-concat/11597196#11597196


Re: HTTP header doesn't specify utf-8

2012-07-26 Thread Benjamin De Kosnik

ouch. I had forgotten about this, which is now PR 54102. 

-benjamin