[Bug c/23605] New: memset() Optimization on x86-32 bit

2005-08-28 Thread kevin at planetsaphire dot com
I have a bit of a disagreement with the optimization toward memset()
calls.  In one of my libraries, libteklti, I have a function named
ucharempty(), which frees a uchar_t (unique character structure) from
memory.  If the user elects to have the memory erased prior to calling
free(), memset() is supposed to reset the memory about to be freed.

In gcc 4.0.1, I have noticed that the optimization for memset()
calls has a few extra instructions in the generated assembly that do not
need to be inserted.

Per Ian Lance Taylor's request, I am going to attach to this bug only
the source code and .i files as a tarball.

If you look closely, you can see that %edi can be automatically loaded
directly without problems, and that (%eax) can be directly loaded into
(%esp).


The following disassembly output is an example for the line:

0x08049172 : mov    0x8(%ebp),%eax
0x08049175 : mov    0x4(%eax),%edx
0x08049178 : mov    0x8(%ebp),%eax
0x0804917b : mov    (%eax),%ebx
0x0804917d : mov    %ebx,%edi
0x0804917f : cld
0x08049180 : mov    %edx,%ecx
0x08049182 : mov    $0x0,%al
0x08049184 : repz stos %al,%es:(%edi)
0x08049186 : mov    %ebx,%eax
0x08049188 : mov    %eax,(%esp)
0x0804918b : call   0x8048c08

associated C code:

#ifdef TEKLTI_ENFORCE_PRIVACY
    free(
        memset(
            uchrtofree->uchar_t_ascii,
            '\0',
            sizeof(char) * uchrtofree->uchar_t_asciilen
        )
    );
#else /* not USE_386_ASM_ENFORCE_PRIVACY */
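
For readers without the tarball, the wipe-then-free pattern in the fragment above can be sketched as a small standalone unit. This is only an illustration under stated assumptions: the type uchar_sketch_t and the functions uchar_sketch_new(), uchar_sketch_wipe() and uchar_sketch_empty() are hypothetical names, not libteklti's API. Only the two field names come from the fragment. The NULL check and the second free() of the structure itself are inferred from the disassembly, and the error message text is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for libteklti's uchar_t; only the two fields
 * that appear in the report are modelled here. */
typedef struct {
    char  *uchar_t_ascii;     /* heap buffer holding the characters */
    size_t uchar_t_asciilen;  /* number of bytes in that buffer */
} uchar_sketch_t;

/* Hypothetical helper: allocate a structure whose buffer holds len
 * copies of fill. Returns NULL if either allocation fails. */
static uchar_sketch_t *uchar_sketch_new(size_t len, char fill)
{
    uchar_sketch_t *u = malloc(sizeof *u);
    if (u == NULL)
        return NULL;
    u->uchar_t_ascii = malloc(len ? len : 1);
    if (u->uchar_t_ascii == NULL) {
        free(u);
        return NULL;
    }
    memset(u->uchar_t_ascii, fill, len);
    u->uchar_t_asciilen = len;
    return u;
}

/* Zero the character buffer. memset() returns its first argument, so
 * the result can be handed straight to free(), as in the report. */
static char *uchar_sketch_wipe(uchar_sketch_t *u)
{
    assert(u != NULL);
    return memset(u->uchar_t_ascii, '\0',
                  sizeof(char) * u->uchar_t_asciilen);
}

/* Wipe and release the buffer, then release the structure.
 * Returns 0 on success and -1 if u is NULL. */
static int uchar_sketch_empty(uchar_sketch_t *u)
{
    if (u == NULL) {
        fputs("uchar_sketch_empty: NULL argument\n", stderr);
        return -1;
    }
    free(uchar_sketch_wipe(u));
    free(u);
    return 0;
}
```

Splitting the wipe into its own function is a choice made for this sketch so that the zeroing can be checked before the memory is released; the original performs both steps in a single expression.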



For reference:


Dump of assembler code for function ucharempty, for comparison:

From uchar.c:
0x0804913d : push   %ebp
0x0804913e : mov    %esp,%ebp
0x08049140 : push   %edi
0x08049141 : push   %ebx
0x08049142 : sub    $0x10,%esp
0x08049145 : cmpl   $0x0,0x8(%ebp)
0x08049149 : jne    0x8049172
0x0804914b : mov    0x804c1cc,%eax
0x08049150 : mov    %eax,0xc(%esp)
0x08049154 : movl   $0x35,0x8(%esp)
0x0804915c : movl   $0x1,0x4(%esp)
0x08049164 : movl   $0x804b120,(%esp)
0x0804916b : call   0x8048c58
0x08049170 : jmp    0x804919b
0x08049172 : mov    0x8(%ebp),%eax
0x08049175 : mov    0x4(%eax),%edx
0x08049178 : mov    0x8(%ebp),%eax
0x0804917b : mov    (%eax),%ebx
0x0804917d : mov    %ebx,%edi
0x0804917f : cld
0x08049180 <ucharempty+67>: mov    %edx,%ecx
0x08049182 : mov    $0x0,%al
0x08049184 : repz stos %al,%es:(%edi)
0x08049186 : mov    %ebx,%eax
0x08049188 : mov    %eax,(%esp)
0x0804918b : call   0x8048c08
0x08049190 : mov    0x8(%ebp),%eax
0x08049193 : mov    %eax,(%esp)
0x08049196 : call   0x8048c08
0x0804919b : add    $0x10,%esp
0x0804919e : pop    %ebx
0x0804919f : pop    %edi
0x080491a0 : pop    %ebp
0x080491a1 : ret
End of assembler dump.


From uchar.386.c:

Dump of assembler code for function ucharempty:
0x08048fde : push   %ebp
0x08048fdf : mov    %esp,%ebp
0x08048fe1 : mov    0x8(%ebp),%ecx
0x08048fe4 : push   %edi
0x08048fe5 : mov    (%ecx),%edi
0x08048fe7 : push   %edi
0x08048fe8 : mov    0x4(%ecx),%ecx
0x08048feb : mov    $0x0,%al
0x08048fed : repz stos %al,%es:(%edi)
0x08048fef : call   0x8048be0
0x08048ff4 : pop    %edi
0x08048ff5 : pop    %edi
0x08048ff6 : mov    %ebp,%esp
0x08048ff8 : pop    %ebp
0x08048ff9 : jmp    0x8048be0
End of assembler dump.

-- 
           Summary: memset() Optimization on x86-32 bit
           Product: gcc
           Version: 4.0.1
            Status: UNCONFIRMED
          Severity: minor
          Priority: P2
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: kevin at planetsaphire dot com
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: -g -O0
  GCC host triplet: Kernel 2.6.12
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23605


[Bug target/23605] memset() Optimization on x86-32 bit

2005-08-28 Thread kevin at planetsaphire dot com

--- Additional Comments From kevin at planetsaphire dot com  2005-08-28 20:18 ---
Created an attachment (id=9604)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9604&action=view)
Testcases and .i Files of uchar.*

This attachment contains only the source files of my project, as well as the
.i files of uchar.*



[Bug target/23605] memset() Optimization on x86-32 bit

2005-08-28 Thread kevin at planetsaphire dot com

--- Additional Comments From kevin at planetsaphire dot com  2005-08-28 20:26 ---
(In reply to comment #1)
> Are you compiling your source at -O0 or GCC at -O0?  If the former, then this
> is most likely not a bug.

-O2 does not do any optimization at all, and -O0 optimizes the code to a certain
extent.  The testcase I submitted was compiled with the -O0 flag.





[Bug target/23605] memset() Optimization on x86-32 bit

2005-08-28 Thread kevin at planetsaphire dot com

--- Additional Comments From kevin at planetsaphire dot com  2005-08-28 20:34 ---
(In reply to comment #4)
> You are compiling at -O0 so this is not a bug and we don't care that much
> about code generation at -O0.

So you're invalidating this bug because -O0 optimizes this and -O2 does not?  I
think this is clearly a bug, and so does Ian Lance Taylor per his e-mail 
earlier.


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |




[Bug target/23605] memset() Optimization on x86-32 bit

2005-08-28 Thread kevin at planetsaphire dot com

--- Additional Comments From kevin at planetsaphire dot com  2005-08-28 21:58 ---
(In reply to comment #6)
> inlining memset is not an optimization as most OS's memsets are better than
> the inlined version, using sse registers, etc.

I have finished reviewing the glibc memset.* source files for the 32-bit
Intel platforms, simply because everyone using Linux is using glibc as the
"libc".  I find that neither SSE nor even MMX is used in the 32-bit
implementations of memset.

I think it is best to change this bug into an enhancement for the next available
GCC branch.  There are a few reasons for this change:

1. Fedora Core 3 (the distro installed on my computer) does not install the i686
binaries of glibc during install; rather, it installs the i386 version.

2. You are right about systems having better memset()s, though considering the
widespread use of glibc, most implementations do not utilize SSE, MMX, etc. 
Maybe the memset() optimization can be turned on by the use of a new flag? 
After all, the i386 build of glibc does not use instructions that are
available only on i686.  In addition, there may be a few circumstances
where the user may not want to use the i686 build, such as debugging, apps that
require the i386 build (perhaps to get around a few glibc bugs), and hardware
processor issues with other functions in glibc.

I hope the GCC staff and the steering committee seriously consider this possible
enhancement.  The optimization would allow the user to get around
slower code in certain situations when it comes to using memset().

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|minor                       |enhancement
             Status|RESOLVED                    |UNCONFIRMED
           Priority|P2                          |P1
         Resolution|INVALID                     |




[Bug target/23605] memset() Optimization on x86-32 bit

2005-08-28 Thread kevin at planetsaphire dot com

--- Additional Comments From kevin at planetsaphire dot com  2005-08-29 00:16 ---
err... I meant "get rid of the push/pop instructions for ebx" because ebx
wouldn't be used (probably taken care of automatically anyway)



[Bug target/23605] memset() Optimization on x86-32 bit

2005-08-28 Thread kevin at planetsaphire dot com

--- Additional Comments From kevin at planetsaphire dot com  2005-08-29 00:36 ---
Also, is setting %eax to $0 once per memset good enough?  I don't think the
"stos" instruction would reset %eax...  the resulting assembly code in
tektester.386.s is the same in -O3 and -O2...
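
On the %eax question, a small C model of "rep stos %al,%es:(%edi)" may help. It is only a sketch: stos_regs_t and rep_stosb_model() are hypothetical names, and the model assumes the direction flag is clear (as after cld) and ignores segmentation. The instruction writes AL to memory and updates EDI and ECX, but it does not write to AL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the three registers that
 * "rep stos %al,%es:(%edi)" reads or writes. */
typedef struct {
    uint8_t  al;   /* byte value to store; only read by the instruction */
    uint8_t *edi;  /* destination pointer; advanced after each store */
    size_t   ecx;  /* repeat count; decremented down to zero */
} stos_regs_t;

/* Model of rep stosb with the direction flag clear: store AL at EDI,
 * step EDI forward by one byte, decrement ECX, repeat until ECX is 0. */
static void rep_stosb_model(stos_regs_t *r)
{
    assert(r != NULL);
    while (r->ecx != 0) {
        *r->edi = r->al;  /* AL is read here and never written */
        r->edi++;
        r->ecx--;
    }
}
```

So one load of %al covers a single rep stos. A call between two memsets is a different matter: under the usual i386 calling convention %eax is caller-saved and holds the return value, so a call such as free() may change it and the next memset would need its own load.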

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |

