[Bug target/25671] test_bit() compilation does not expand to "bt" instruction

2006-04-11 Thread avi at argo dot co dot il


--- Comment #4 from avi at argo dot co dot il  2006-04-11 15:36 ---
Benchmark results, 32 bit code, various methods

On an Athlon 64:

   bts reg, (reg):  1 cycle
   bts reg, (mem):  3 cycles
   C code (reg):    1 cycle
   C code (mem):    5 cycles

On a Xeon:

   bts reg, (reg):  6 cycles
   bts reg, (mem): 15 cycles
   C code (reg):    1 cycle
   C code (mem):    5 cycles

Looks like a very small win on Athlon 64 when modifying memory (3 cycles for
bts vs. 5 for the C code). On the Xeon, bts is much slower in both cases.
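
For reference, the "C code" rows presumably measure something like the generic
read-modify-write below. The attached benchmark is not reproduced in this
thread, so this is only a sketch of the idiom: set_bit() and
set_bit_example() are hypothetical names, and the word type is assumed to be
unsigned long.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical generic set_bit(): select the word, then OR in a one-bit mask.
// The mask is built from 1UL so the shift is done at the width of the word.
inline void set_bit(unsigned long *words, std::size_t bit)
{
    const std::size_t wsize = sizeof(*words) * CHAR_BIT;
    words[bit / wsize] |= 1UL << (bit % wsize);
}

// Small usage example: set bit 0 of the first word and bit 3 of the second.
inline void set_bit_example()
{
    unsigned long words[2] = {0, 0};
    const std::size_t wsize = sizeof(words[0]) * CHAR_BIT;
    set_bit(words, 0);
    set_bit(words, wsize + 3);
    assert(words[0] == 1UL);
    assert(words[1] == 8UL);
}
```

The (mem) rows correspond to the case where words[] really lives in memory,
so the OR becomes a load, a modify and a store.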


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25671




[Bug target/25671] test_bit() compilation does not expand to "bt" instruction

2006-04-11 Thread avi at argo dot co dot il


--- Comment #5 from avi at argo dot co dot il  2006-04-11 15:38 ---
Created an attachment (id=11243)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11243&action=view)
benchmark for various set_bit() implementations


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25671



[Bug target/25671] test_bit() compilation does not expand to "bt" instruction

2006-04-11 Thread avi at argo dot co dot il


--- Comment #6 from avi at argo dot co dot il  2006-04-11 15:39 ---
oops, the benchmark was for bts. will do again for bt.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25671



[Bug target/25671] test_bit() compilation does not expand to "bt" instruction

2006-04-11 Thread avi at argo dot co dot il


--- Comment #7 from avi at argo dot co dot il  2006-04-11 15:53 ---
Created an attachment (id=11244)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11244&action=view)
bt instruction benchmark

Redid the test for test_bit(), this time always forcing a memory access:

Athlon 64:

 bt:  3 cycles
 generic: 3 cycles

Xeon:

 bt: 10 cycles
 generic: 4 cycles

So, bt might be usable for -Os, but it is likely not worth the effort.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25671



[Bug c++/27312] New: excessive stack use for automatic object on stack

2006-04-25 Thread avi at argo dot co dot il
compiling the following

--start-code
struct X {
    void g();
};

void g();

void f()
{
    X x;

    x.g();
    g();
}
--end-code-

yields (with -O2)

        subl    $24, %esp

in the prologue. without the empty class only 12 bytes are subtracted,
presumably to preserve stack alignment.

this is wasteful of stack space.


-- 
           Summary: excessive stack use for automatic object on stack
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: avi at argo dot co dot il
 GCC build triplet: x86_64-redhat-linux
  GCC host triplet: x86_64-redhat-linux
GCC target triplet: i386-redhat-linux (with -m32)


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27312



[Bug c++/27312] excessive stack use for automatic object on stack

2006-04-25 Thread avi at argo dot co dot il


--- Comment #2 from avi at argo dot co dot il  2006-04-25 15:57 ---
But why 24? gcc could place the object in any of the 12 bytes needed for stack
alignment.

I don't see any reason why the empty object needs to be aligned to more than a
byte boundary.

What am I missing?
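
One way to check the size and alignment claim is to ask the compiler directly.
The sketch below assumes C++11 (for alignof) and reuses X from the report. The
helper names are hypothetical. On the usual GCC ABIs both values should come
out as 1, but the language itself only guarantees a nonzero size.

```cpp
#include <cassert>
#include <cstddef>

struct X {
    void g();   // declared only, as in the report; never called here
};

// Hypothetical helpers that report what the compiler thinks of X.
inline std::size_t size_of_X()  { return sizeof(X); }
inline std::size_t align_of_X() { return alignof(X); }

inline void check_empty_class()
{
    // Every complete object type has a nonzero size, even an empty class.
    assert(size_of_X() >= 1);
    // sizeof is a multiple of the alignment, so alignment cannot exceed size.
    assert(align_of_X() <= size_of_X());
}
```

If both helpers return 1, nothing in the type itself asks for more than
byte alignment, which is the point of the question above.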


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27312



[Bug target/25671] New: test_bit() compilation does not expand to "bt" instruction

2006-01-04 Thread avi at argo dot co dot il
the code

int test_bit(unsigned long *words, int bit)
{
    int wsize = (sizeof *words) * 8;
    return (words[bit / wsize] & (1UL << (bit % wsize))) != 0;
}

can compile to

xor %rax, %rax
bt  %rsi, (%rdi)
setc %al

but instead compiles to a much longer sequence, using many more registers,
which is probably slower as well. If gcc recognized this common idiom (like it
recognizes bit rotate sequences), smaller and more optimal code would be
generated (especially if the result of the test is in an if statement - it
could boil down to a bt; jc sequence).
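
To make the if-statement case concrete, here is a minimal sketch of the idiom
used as a branch condition. It is written with an unsigned index and a
shift-then-mask form, which is equivalent to the function above for
non-negative bit numbers. count_set() and test_bit_example() are hypothetical
names, not code from the report.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Generic bit test: pick the word, shift the wanted bit down to position 0.
inline bool test_bit(const unsigned long *words, std::size_t bit)
{
    const std::size_t wsize = sizeof(*words) * CHAR_BIT;
    return (words[bit / wsize] >> (bit % wsize)) & 1UL;
}

// Hypothetical caller: the test result is only ever used to steer a branch,
// which is where a bt; jc pair would be the shortest encoding.
inline std::size_t count_set(const unsigned long *words, std::size_t nbits)
{
    std::size_t n = 0;
    for (std::size_t bit = 0; bit < nbits; ++bit)
        if (test_bit(words, bit))
            ++n;
    return n;
}

inline void test_bit_example()
{
    const unsigned long words[2] = {0x5UL, 0x1UL};   // bits 0 and 2, then bit 0
    const std::size_t wsize = sizeof(words[0]) * CHAR_BIT;
    assert(test_bit(words, 2));
    assert(!test_bit(words, 1));
    assert(count_set(words, 2 * wsize) == 3);
}
```

Whether a given compiler version turns the branch into bt; jc is exactly what
this report is about, so the sketch makes no claim about the generated code.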


-- 
           Summary: test_bit() compilation does not expand to "bt" instruction
           Product: gcc
           Version: 4.0.2
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: avi at argo dot co dot il
 GCC build triplet: x86_64-redhat-linux
  GCC host triplet: x86_64-redhat-linux
GCC target triplet: x86_64-redhat-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25671




[Bug inline-asm/29357] New: inline asm %c0 template form not documented

2006-10-05 Thread avi at argo dot co dot il
the form %c0, as in

asm ( "movl $42, %c0(%1)" : : "i"(offsetof(...)), "r"(...) : "memory" );

is not documented.
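
A filled-in version of the one-liner above might look like the sketch below.
The Packet struct and store_42() are hypothetical stand-ins for the elided
arguments. The %c0 form prints the constant operand without the '$' that
normally marks an immediate, so it can serve as a displacement in an address.
The asm is guarded so that non-x86 targets fall back to a plain store.

```cpp
#include <cassert>
#include <cstddef>

struct Packet {      // hypothetical layout, for illustration only
    int header;
    int value;
};

inline void store_42(Packet *p)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    // Operand 0 is the compile-time offset of Packet::value. Written as %c0
    // it is emitted bare, e.g. "movl $42, 4(%rax)" instead of "$4(%rax)".
    asm volatile("movl $42, %c0(%1)"
                 :
                 : "i"(offsetof(Packet, value)), "r"(p)
                 : "memory");
#else
    p->value = 42;   // portable fallback on other targets
#endif
}

inline void store_42_example()
{
    Packet p = {1, 0};
    store_42(&p);
    assert(p.header == 1);
    assert(p.value == 42);
}
```

With a plain %0 the assembler would see an immediate where it expects a
displacement and reject the instruction.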


-- 
           Summary: inline asm %c0 template form not documented
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Severity: trivial
          Priority: P3
         Component: inline-asm
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: avi at argo dot co dot il
 GCC build triplet: N/A
  GCC host triplet: N/A
GCC target triplet: N/A


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29357



[Bug inline-asm/29357] inline asm %c0 template form not documented

2006-10-05 Thread avi at argo dot co dot il


--- Comment #1 from avi at argo dot co dot il  2006-10-05 16:05 ---
Created an attachment (id=12384)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12384&action=view)
proposed documentation patch

I don't have a copyright assignment, but most of this is copied verbatim from
the internals documentation.  I added three lines.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29357



[Bug c/41483] New: gcc fails to elide indirect function call through immutable static variable

2009-09-27 Thread avi at argo dot co dot il
In the following code:

void h(void);

static void g()
{
    h();
}

static void (*f)(void) = g;

void k(void)
{
    f();
}

It is trivial to see that 'f' cannot change and thus the statement 'f();' can
be compiled as a direct call or jump.  gcc however emits an indirect jump on
x86_64:

   0:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 7
                        3: R_X86_64_PC32        .rodata-0x4
   7:   ff e0                   jmpq   *%rax

Changing the initialization to 'f = h' produces the desired results:

 :
   0:   e9 00 00 00 00          jmpq   5
                        1: R_X86_64_PC32        h-0x4

Both compiled with -O3 (though expected to work with -O2)


-- 
           Summary: gcc fails to elide indirect function call through immutable static variable
           Product: gcc
           Version: 4.4.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: avi at argo dot co dot il
 GCC build triplet: x86_64-pc-linux
  GCC host triplet: x86_64-pc-linux
GCC target triplet: x86_64-pc-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41483



[Bug c/41483] gcc fails to elide indirect function call through immutable static variable

2009-09-27 Thread avi at argo dot co dot il


--- Comment #3 from avi at argo dot co dot il  2009-09-28 05:51 ---
Of course, sorry about the noise.  Marking as invalid.


-- 

avi at argo dot co dot il changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41483



[Bug c++/23477] New: default-initializing array new expression uses memcpy() instead of memset(), bloats executable

2005-08-19 Thread avi at argo dot co dot il
the program 
 
int main()
{
    new int[1000]();
}
 
generates a 40MB executable. it compiles into a memcpy() of 40MB of zeros into 
the newly-allocated array. 
 
tested at -O0 and -O3. 
 
gcc version 4.0.1 20050727 (Red Hat 4.0.1-5)

-- 
           Summary: default-initializing array new expression uses memcpy() instead of memset(), bloats executable
           Product: gcc
           Version: 4.0.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: avi at argo dot co dot il
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i386-redhat-linux
  GCC host triplet: i386-redhat-linux
GCC target triplet: i386-redhat-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23477


[Bug c++/23480] New: default-initializing variable size array new expression does not work

2005-08-19 Thread avi at argo dot co dot il
the following function 
 
  int* f(int n) { return new int[n](); } 
 
translates to 
 
_Z1fi:
.LFB2:
        pushl   %ebp
.LCFI0:
        movl    %esp, %ebp
.LCFI1:
        sall    $2, 8(%ebp)
        leave
.LCFI2:
        jmp     _Znaj
 
which does not default-initialize the array.
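
The expected behaviour can be written down as a small check. This is a sketch,
not a reliable reproducer: make_zeroed(), all_zero() and check_value_init()
are hypothetical names, the size parameter is std::size_t rather than the
report's int, and freshly allocated memory is often zero anyway, so a passing
run does not prove the initialization was emitted.

```cpp
#include <cassert>
#include <cstddef>

// Same new-expression as in the report: the trailing () requests
// value-initialization, which for int means every element is set to zero.
inline int *make_zeroed(std::size_t n)
{
    return new int[n]();
}

// Returns true only if all n elements are zero.
inline bool all_zero(const int *p, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        if (p[i] != 0)
            return false;
    return true;
}

inline void check_value_init()
{
    int *p = make_zeroed(1000);
    assert(all_zero(p, 1000));
    delete[] p;
}
```

Inspecting the generated assembly, as done above, remains the more
trustworthy test: a correct translation has to contain some zeroing loop or
memset() call after the call to operator new[].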

-- 
           Summary: default-initializing variable size array new expression does not work
           Product: gcc
           Version: 4.0.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: avi at argo dot co dot il
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i386-redhat-linux
  GCC host triplet: i386-redhat-linux
GCC target triplet: i386-redhat-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23480