MSVC hook function prologue

2009-09-02 Thread Stefan Dösinger
Hello,
After a rather long break due to other work I tried to revive my work on 
support for the function prologue used in Win32 API functions on Windows - a 
function prologue that some apps running in Wine expect.

This thread from January explains what I am trying to do:
http://gcc.gnu.org/ml/gcc/2009-01/msg00089.html

Essentially I want a function attribute that makes the function start with
this sequence, regardless of its parameters, body, or other attributes:
8b ff  mov    %edi,%edi
55     push   %ebp
8b ec  mov    %esp,%ebp
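
To make the requirement concrete, here is a minimal sketch in C of the byte comparison a hooking application might perform before patching a function. The helper name has_hook_prologue is hypothetical and not part of gcc or Wine; real applications may well check the bytes differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The five bytes quoted above:
   mov %edi,%edi; push %ebp; mov %esp,%ebp.  */
static const unsigned char hook_prologue[5] = { 0x8b, 0xff, 0x55, 0x8b, 0xec };

/* Return 1 if the LEN bytes at CODE begin with the hook prologue, else 0.
   Hypothetical helper for illustration only.  */
static int
has_hook_prologue (const unsigned char *code, size_t len)
{
  assert (code != NULL);
  if (len < sizeof hook_prologue)
    return 0;
  return memcmp (code, hook_prologue, sizeof hook_prologue) == 0;
}
```

A function whose first bytes are 89 ff 55 89 e5 executes the same instructions but would fail this check, which is why the exact encoding matters.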

I have attached the latest version of my patch for comments. It is mainly 
rebased against gcc changes that were made in the meantime. I also improved 
the REG_FRAME_RELATED_EXPR notes a bit, and only set it if the movs and pops 
are used for the frame pointer setup.

I also now know that I don't (or cannot) care about 64 bit right now. The 
Windows apps currently do Windows API function hooking only in 32 bit, and 
there is no emerging scheme yet for hooking Win64 functions in the same way.

Currently I still have these problems:
*) There is apparently some plugin framework in the works. Can this 
functionality be implemented as a plugin?

*) The way I read the msvc_prologue attribute seems wrong to me. I could read 
it directly in ix86_expand_prologue, but I got lost in the different trees 
gcc uses. I'm yet again trying to find this in the code and in the docs.

*) The code generated if no frame pointer is needed isn't pretty, but Wine 
will always need a frame pointer, so any optimization in that area won't get 
much test exposure.

*) The stack alignment code + msvc_prologue is used by Wine on osx though. 
Currently I pop %ebp after the 5 byte prologue, and the normal code recreates 
the frame pointer afterwards. My understanding is that I can avoid this by 
keeping the original frame pointer, but adjusting a lot of offsets after the 
alignment to find the function parameters and align the stack properly on 
calls. However, this is currently above my head.

*) What other changes are needed to get a functionality like this into 
mainline?

Thank you,
Stefan Dösinger
Index: gcc/configure.ac
===
--- gcc/configure.ac	(revision 151348)
+++ gcc/configure.ac	(working copy)
@@ -3035,6 +3035,12 @@
   [AC_DEFINE(HAVE_AS_IX86_SAHF, 1,
 [Define if your assembler supports the sahf mnemonic.])])
 
+gcc_GAS_CHECK_FEATURE([swap suffix],
+  gcc_cv_as_ix86_swap,,,
+  [movl.s %esp, %ebp],,
+  [AC_DEFINE(HAVE_AS_IX86_SWAP, 1,
+[Define if your assembler supports the swap suffix.])])
+
 gcc_GAS_CHECK_FEATURE([different section symbol subtraction],
   gcc_cv_as_ix86_diff_sect_delta,,,
   [.section .rodata
Index: gcc/config/i386/i386.h
===
--- gcc/config/i386/i386.h	(revision 151348)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2388,6 +2388,9 @@
  to be used. MS_ABI means ms abi. Otherwise SYSV_ABI means sysv abi.  */
   enum calling_abi call_abi;
   struct machine_cfa_state cfa;
+  /* This value is used for i386 targets and specifies if the function
+   * should start with the hooking-friendly Win32 function prologue   */
+  int msvc_prologue;
 };
 #endif
 
Index: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md	(revision 151348)
+++ gcc/config/i386/i386.md	(working copy)
@@ -237,6 +237,7 @@
(UNSPECV_RDTSC		18)
(UNSPECV_RDTSCP		19)
(UNSPECV_RDPMC		20)
+   (UNSPECV_VSWAPMOV	21)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -15747,6 +15748,16 @@
(set_attr "length_immediate" "0")
(set_attr "modrm" "0")])
 
+(define_insn "vswapmov"
+[(unspec_volatile [(match_operand 0 "register_operand" "0")
+   (match_operand 1 "register_operand" "1")]
+ UNSPECV_VSWAPMOV )]
+  ""
+  "movl.s\t%1,%0"
+  [(set_attr "length" "2")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 ;; Pad to 16-byte boundary, max skip in op0.  Used to avoid
 ;; branch prediction penalty for the third jump in a 16-byte
 ;; block on K8.
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c	(revision 151348)
+++ gcc/config/i386/i386.c	(working copy)
@@ -4777,6 +4777,19 @@
   return ix86_abi;
 }
 
+static int
+ix86_function_msvc_prologue (const_tree fntype)
+{
+  if (!TARGET_64BIT && fntype != NULL)
+{
+  if(lookup_attribute ("msvc_prologue", TYPE_ATTRIBUTES (fntype)))
+{
+  return 1;
+}
+}
+  return 0;
+}
+
 static enum calling_abi
 ix86_function_abi (const_tree

Re: MSVC hook function prologue

2009-09-03 Thread Stefan Dösinger
On Thursday 03 September 2009 00:04:43, Paolo Bonzini wrote:

>> *) The stack alignment code + msvc_prologue is used by Wine on osx though.
>> ...
> I don't think this would prevent the patch from getting in.
Ok, I'll read the patch contribution guidelines again and hope for the best. 
Ideally I'd like gcc to generate efficient code for this, since we use this 
on OSX, but I can live with the current situation for now - very few of the 
functions we have to make hookable are performance critical.

> > *) What other changes are needed to get a functionality like this into
> > mainline?
>
> I think right now I'd make only two cosmetic adjustments:
I'll look into them - the second one makes sense to me; the first one I still 
have to look at.

> > as otherwise for 64-bit target warning would be shown always?

> I don't know, I was just reworking Stefan's patch.  He didn't include 
> function names (-p) in the patch so I don't know what function this is 
> part of.
It was ix86_handle_abi_attribute. I'm usually using git, and don't like cvs 
and svn too much. It seems svn diff doesn't support a -p option here. Maybe 
I'll just switch to git-svn.

Thanks for your help!


Re: MSVC hook function prologue

2009-09-04 Thread Stefan Dösinger
On Thursday 03 September 2009 00:04:43, Paolo Bonzini wrote:
> (define_insn "vswapmov"
> [(set (match_operand 0 "register_operand" "0")
>(match_operand 1 "register_operand" "1")
>   (unspec_volatile [] UNSPECV_VSWAPMOV)]
I ran into a problem with this: build/genattrtab doesn't like the empty 
operand list for the unspec_volatile. So after looking at some other insns I 
added a const_int 0. There was also a parenthesis missing, which I added 
after the "register_operand" "1"), to close the "set".

(define_insn "vswapmov"
  [(set (match_operand 0 "register_operand" "0")
(match_operand 1 "register_operand" "1"))
   (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
  ""
  "movl.s\t%1,%0"
  [(set_attr "length" "2")
   (set_attr "length_immediate" "0")
   (set_attr "modrm" "0")])

This however leads to the following errors and warnings:
build/genrecog ../.././gcc/config/i386/i386.md \
  insn-conditions.md > tmp-recog.c
../.././gcc/config/i386/i386.md:15751: operand 0 missing output reload
../.././gcc/config/i386/i386.md:15751: warning: operand 0 missing mode?
../.././gcc/config/i386/i386.md:15751: warning: operand 1 missing mode?

I guess the error isn't about the const_int 0, but about operand 0. Any ideas?


Re: MSVC hook function prologue

2009-09-04 Thread Stefan Dösinger
On Friday 04 September 2009 13:47:20, Paolo Bonzini wrote:
> > I guess the error isn't about the const_int 0, but about operand 0. Any
> > ideas?
>
> Yes, you need this:
>
>  [(set (match_operand:SI 0 "register_operand" "=r")
>(match_operand:SI 1 "register_operand" "r"))
>   (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
That works, thanks!

I just found the "=r" and "r" stuff myself almost at the same time your mail 
arrived. But what does the "SI" do? I haven't found 

Now I went a step further, and implemented the suggestion from amylaar in this 
mail:
http://gcc.gnu.org/ml/gcc/2009-01/msg00174.html

> If you make it a parallel where the actual operation is paired with an
> empty unspec, no REG_FRAME_RELATED_EXPR is needed.  If the actual operation
> is hidden in the RTL, however, you have to add it in a
> REG_FRAME_RELATED_EXPR.
> The latter alternative is more complicated.  However, there is a benefit to
> choosing this: in the stack realign or !frame_pointer_needed cases, the
> (early) move of esp to ebp is not really supposed to establish a frame
> pointer, and thus you then don't want any cfi information emitted for it.
> Thus, you can then simply leave out the REG_FRAME_RELATED_EXPR note.

Now the definition looks like this:
(define_insn "vswapmov"
  [(parallel
[(set (match_operand:SI 0 "register_operand" "=r")
  (match_operand:SI 1 "register_operand" "r"))
 (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
   )]
  ""
  "movl.s\t%1,%0"
  [(set_attr "length" "2")
   (set_attr "length_immediate" "0")
   (set_attr "modrm" "0")])

I am still compiling, so I don't know if it works yet.

I attached the current state of the whole patch. I added the attribute to the 
documentation, and generated the patch with function names this time.
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 151419)
+++ gcc/doc/extend.texi	(working copy)
@@ -2672,6 +2672,14 @@ when targeting Windows.  On all other systems, the
 
 Note, This feature is currently sorried out for Windows targets trying to
 
+@item msvc_prologue
+@cindex @code{msvc_prologue} attribute
+
+On 32 bit x86-*-* targets, you can use this function attribute to make
+gcc generate the "hot-patching" function prologue used in Win32 API
+functions in Microsoft Windows XP Service Pack 2 and newer. This requires
+support for the swap suffix in the assembler.
+
 @item naked
 @cindex function without a prologue/epilogue code
 Use this attribute on the ARM, AVR, IP2K and SPU ports to indicate that
Index: gcc/configure.ac
===
--- gcc/configure.ac	(revision 151419)
+++ gcc/configure.ac	(working copy)
@@ -3035,6 +3035,12 @@ foo:	nop
   [AC_DEFINE(HAVE_AS_IX86_SAHF, 1,
 [Define if your assembler supports the sahf mnemonic.])])
 
+gcc_GAS_CHECK_FEATURE([swap suffix],
+  gcc_cv_as_ix86_swap,,,
+  [movl.s %esp, %ebp],,
+  [AC_DEFINE(HAVE_AS_IX86_SWAP, 1,
+[Define if your assembler supports the swap suffix.])])
+
 gcc_GAS_CHECK_FEATURE([different section symbol subtraction],
   gcc_cv_as_ix86_diff_sect_delta,,,
   [.section .rodata
Index: gcc/config/i386/i386.h
===
--- gcc/config/i386/i386.h	(revision 151419)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2388,6 +2388,9 @@ struct GTY(()) machine_function {
  to be used. MS_ABI means ms abi. Otherwise SYSV_ABI means sysv abi.  */
   enum calling_abi call_abi;
   struct machine_cfa_state cfa;
+  /* This value is used for i386 targets and specifies if the function
+   * should start with the hooking-friendly Win32 function prologue   */
+  int msvc_prologue;
 };
 #endif
 
Index: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md	(revision 151419)
+++ gcc/config/i386/i386.md	(working copy)
@@ -237,6 +237,7 @@
(UNSPECV_RDTSC		18)
(UNSPECV_RDTSCP		19)
(UNSPECV_RDPMC		20)
+   (UNSPECV_VSWAPMOV	21)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -15747,6 +15748,18 @@
(set_attr "length_immediate" "0")
(set_attr "modrm" "0")])
 
+(define_insn "vswapmov"
+  [(parallel
+[(set (match_operand:SI 0 "register_operand" "=r")
+  (match_operand:SI 1 "register_operand" "r"))
+ (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
+   )]
+  ""
+  "movl.s\t%1,%0"
+  [(set_attr "length" "2")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 ;; Pad to 16-byte boundary, max skip in op0.  Used to avoid
 ;; branch prediction penalty for the third jump in a 16-byte
 ;; block on K8.
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c	(revision 151419)
+++ gcc/config/i386/i386.c	(working copy)
@@ -4777,6 +4777,19 @@ ix86_function_type_ab

Re: MSVC hook function prologue

2009-09-04 Thread Stefan Dösinger
On Friday 04 September 2009 14:23:39, Paolo Bonzini wrote:
> The parallel is implicit in define_insn, so it is not different.  It
> does not do any harm I guess, but it looks "weird" to a more familiar
> eye. :-)
Ok, I removed it again :-)

> +#ifdef HAVE_AS_IX86_SWAP
> +  { "msvc_prologue", 0, 0, false, true, true, ix86_handle_abi_attribute },
> +#endif
>
> it's better to always provide the attribute, and call "sorry" in
> ix86_function_msvc_prologue if you don't have the .s suffix.
Fixed!

> Another two nits since I've found a more serious one: :-)
>
> 1) do not remove spurious lines.
Oops. Forgot to read the diff...

> 2) extra long line, go to new line *before* ? and colon:
>
> +  if (TARGET_64BIT ? is_attribute_p ("msvc_prologue", name) :
> !is_attribute_p ("msvc_prologue", name))
Fixed!

I attached another version of the patch - I restarted the compile, so I still 
don't know if it fully works.
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 151419)
+++ gcc/doc/extend.texi	(working copy)
@@ -2672,6 +2672,14 @@ when targeting Windows.  On all other systems, the
 
 Note, This feature is currently sorried out for Windows targets trying to
 
+@item msvc_prologue
+@cindex @code{msvc_prologue} attribute
+
+On 32 bit x86-*-* targets, you can use this function attribute to make
+gcc generate the "hot-patching" function prologue used in Win32 API
+functions in Microsoft Windows XP Service Pack 2 and newer. This requires
+support for the swap suffix in the assembler. (GNU Binutils 2.19.51 or later)
+
 @item naked
 @cindex function without a prologue/epilogue code
 Use this attribute on the ARM, AVR, IP2K and SPU ports to indicate that
Index: gcc/configure.ac
===
--- gcc/configure.ac	(revision 151419)
+++ gcc/configure.ac	(working copy)
@@ -3035,6 +3035,12 @@ foo:	nop
   [AC_DEFINE(HAVE_AS_IX86_SAHF, 1,
 [Define if your assembler supports the sahf mnemonic.])])
 
+gcc_GAS_CHECK_FEATURE([swap suffix],
+  gcc_cv_as_ix86_swap,,,
+  [movl.s %esp, %ebp],,
+  [AC_DEFINE(HAVE_AS_IX86_SWAP, 1,
+[Define if your assembler supports the swap suffix.])])
+
 gcc_GAS_CHECK_FEATURE([different section symbol subtraction],
   gcc_cv_as_ix86_diff_sect_delta,,,
   [.section .rodata
Index: gcc/config/i386/i386.h
===
--- gcc/config/i386/i386.h	(revision 151419)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2388,6 +2388,9 @@ struct GTY(()) machine_function {
  to be used. MS_ABI means ms abi. Otherwise SYSV_ABI means sysv abi.  */
   enum calling_abi call_abi;
   struct machine_cfa_state cfa;
+  /* This value is used for i386 targets and specifies if the function
+   * should start with the hooking-friendly Win32 function prologue   */
+  int msvc_prologue;
 };
 #endif
 
Index: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md	(revision 151419)
+++ gcc/config/i386/i386.md	(working copy)
@@ -237,6 +237,7 @@
(UNSPECV_RDTSC		18)
(UNSPECV_RDTSCP		19)
(UNSPECV_RDPMC		20)
+   (UNSPECV_VSWAPMOV	21)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -15747,6 +15748,16 @@
(set_attr "length_immediate" "0")
(set_attr "modrm" "0")])
 
+(define_insn "vswapmov"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(match_operand:SI 1 "register_operand" "r"))
+   (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
+  ""
+  "movl.s\t%1,%0"
+  [(set_attr "length" "2")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 ;; Pad to 16-byte boundary, max skip in op0.  Used to avoid
 ;; branch prediction penalty for the third jump in a 16-byte
 ;; block on K8.
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c	(revision 151419)
+++ gcc/config/i386/i386.c	(working copy)
@@ -4777,6 +4777,24 @@ ix86_function_type_abi (const_tree fntype)
   return ix86_abi;
 }
 
+static int
+ix86_function_msvc_prologue (const_tree fntype)
+{
+  if (!TARGET_64BIT && fntype != NULL)
+{
+  if(lookup_attribute ("msvc_prologue", TYPE_ATTRIBUTES (fntype)))
+{
+#ifdef HAVE_AS_IX86_SWAP
+  return 1;
+#else
+  sorry ("msvc_prologue needs swap suffix support in as");
+  return 0;
+#endif
+}
+}
+  return 0;
+}
+
 static enum calling_abi
 ix86_function_abi (const_tree fndecl)
 {
@@ -4808,6 +4826,11 @@ ix86_call_abi_override (const_tree fndecl)
 cfun->machine->call_abi = ix86_abi;
   else
 cfun->machine->call_abi = ix86_function_type_abi (TREE_TYPE (fndecl));
+
+  if (fndecl == NULL_TREE)
+cfun->machine->msvc_prologue = 0;
+  else
+cfun->machine->msvc_prologue = ix86_function_msvc_prologue (TREE_TYPE (fndecl));
 }
 
 /* MS and SYSV ABI have different

Re: MSVC hook function prologue

2009-09-04 Thread Stefan Dösinger
On Friday 04 September 2009 14:49:42, Stefan Dösinger wrote:
> I attached another version of the patch - I restarted the compile, so I
> still don't know if it fully works.
Seems to be working - gcc compiles fine, my test function has the right 
starting bytes. Wine compiles and runs, and Steam successfully hooks the 
functions.

I'll add a test case for this feature as well, and submit it next week. Before 
I finally submit it I want to make sure Alexandre hasn't changed his mind 
about all this.


CVS/SVN binutils and gcc on MacOS X?

2009-09-04 Thread Stefan Dösinger
Hi,
I tried to install binutils from CVS and the gcc SVN code on my mac to test my 
msvc_prologue work there, but I ran into an interesting problem:

When using the SVN gcc with my own as, I cannot compile any files:
Assembler messages:
Fatal error: Invalid listing option `r'

This can happen when compiling something manually, or during gcc bootstrap. 
The build type is i386-apple-darwin9.8.0.

Using my own gcc with the system assembler works. Using the system gcc with my 
as build works as well. But my own as and gcc together fail. Google wasn't 
much help unfortunately.

Is this a known problem? Am I doing something wrong here? Do I need any 
special Darwin patches for as or gcc?

Thanks,
Stefan


Re: CVS/SVN binutils and gcc on MacOS X?

2009-09-04 Thread Stefan Dösinger
On Friday 04 September 2009 23:35:55, Andreas Tobler wrote:
> No, you don't do anything wrong. It is simply not supported, the
> binutils from gnu.
>
> You can rely on gcc being able to work with the MacOS-X 'binutils' aka:
> ld, as. But don't try to build it for yourself. It's somehow like sparc
> solaris, rely on the systems as/ld.
Drat, that's what I was afraid of :-(

Unfortunately I need support for the swap suffix in as, so using the system 
binaries is not an option. Is the only thing I can do to find the source of 
the as version, backport the swap suffix and hope for the best?


Re: CVS/SVN binutils and gcc on MacOS X?

2009-09-04 Thread Stefan Dösinger
On Friday 04 September 2009 23:50:11, Stefan Dösinger wrote:
> Unfortunately I need support for the swap suffix in as, so using the system
> binaries is not an option. Is the only thing I can do to find the source of
> the as version, backport the swap suffix and hope for the best?
Ok, I grabbed the cctools source and hacked in .s suffix support. Now the 
bootstrap process got beyond the initial configure run. Let's see what happens 
once the compilation is finished...



Re: MSVC hook function prologue

2009-09-05 Thread Stefan Dösinger

> Are there non-Microsoft DLLs that expect to be hooked this way?  If
> so, I think the patch is interesting for gcc independent of whether it
> is useful for Wine.
I haven't seen any so far. It's certainly possible that some server apps
have the 2 byte nop at the beginning of functions for a similar
hot-patching mechanism, but beyond that I don't think any app needs this.




Re: MSVC hook function prologue

2009-09-06 Thread Stefan Dösinger
On Saturday 05 September 2009 17:08:19, Ross Ridge wrote:
> If this patch is essentially only for one application, maybe the idea
> of implementing a more generally useful naked attribute would be the
> way to go.  I implemented a naked attribute in my private sources to
> do something similar, although supporting hookable prologues was just
> a small part of its more general use in supporting an assembler based API.
We don't really like the naked attribute, because it makes maintaining a C 
function that uses it a pain. Alexandre once said that he would reject any 
solution for the hook problem that is based on the naked attribute. This 
especially becomes a pain when the function has to do stack realignment, like 
all our Win32 functions on OSX.

But yeah, this functionality will probably be used only by Wine, since Linux 
and OSX offer more comfortable hooking mechanisms than opcode replacements. 
Although you never know, perhaps someone else finds a use for this I did not 
anticipate.


Re: MSVC hook function prologue

2009-09-08 Thread Stefan Dösinger
Ok, Alexandre hasn't changed his opinion, the function attribute is ok with him.

I attached another version of the patch, this time adding some testcases. Two 
more questions though:

*) How can I skip the tests if msvc_prologue is not available because as 
doesn't support the swap suffix? I think it would be wrong if the tests 
failed in that case because the compiler says sorry().
I am currently scanning the other tests for hints, haven't found any yet.

*) Is the way I added the new files in the diff ok? (diff -u /dev/null 
newfile). Or is there some more SVN-ish way?

On Sunday 06 September 2009 11:36:23, Andreas Schwab wrote:
> There are no x86-*-* targets, they must match i[34567]86-*-*.
Fixed that issue as well.
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 151512)
+++ gcc/doc/extend.texi	(working copy)
@@ -2672,6 +2672,14 @@ when targeting Windows.  On all other systems, the
 
 Note, This feature is currently sorried out for Windows targets trying to
 
+@item msvc_prologue
+@cindex @code{msvc_prologue} attribute
+
+On 32 bit i[34567]86-*-* targets, you can use this function attribute to make
+gcc generate the "hot-patching" function prologue used in Win32 API
+functions in Microsoft Windows XP Service Pack 2 and newer. This requires
+support for the swap suffix in the assembler. (GNU Binutils 2.19.51 or later)
+
 @item naked
 @cindex function without a prologue/epilogue code
 Use this attribute on the ARM, AVR, IP2K and SPU ports to indicate that
Index: gcc/configure.ac
===
--- gcc/configure.ac	(revision 151512)
+++ gcc/configure.ac	(working copy)
@@ -3035,6 +3035,12 @@ foo:	nop
   [AC_DEFINE(HAVE_AS_IX86_SAHF, 1,
 [Define if your assembler supports the sahf mnemonic.])])
 
+gcc_GAS_CHECK_FEATURE([swap suffix],
+  gcc_cv_as_ix86_swap,,,
+  [movl.s %esp, %ebp],,
+  [AC_DEFINE(HAVE_AS_IX86_SWAP, 1,
+[Define if your assembler supports the swap suffix.])])
+
 gcc_GAS_CHECK_FEATURE([different section symbol subtraction],
   gcc_cv_as_ix86_diff_sect_delta,,,
   [.section .rodata
Index: gcc/config/i386/i386.h
===
--- gcc/config/i386/i386.h	(revision 151512)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2388,6 +2388,9 @@ struct GTY(()) machine_function {
  to be used. MS_ABI means ms abi. Otherwise SYSV_ABI means sysv abi.  */
   enum calling_abi call_abi;
   struct machine_cfa_state cfa;
+  /* This value is used for i386 targets and specifies if the function
+   * should start with the hooking-friendly Win32 function prologue   */
+  int msvc_prologue;
 };
 #endif
 
Index: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md	(revision 151512)
+++ gcc/config/i386/i386.md	(working copy)
@@ -237,6 +237,7 @@
(UNSPECV_RDTSC		18)
(UNSPECV_RDTSCP		19)
(UNSPECV_RDPMC		20)
+   (UNSPECV_VSWAPMOV	21)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -15747,6 +15748,16 @@
(set_attr "length_immediate" "0")
(set_attr "modrm" "0")])
 
+(define_insn "vswapmov"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(match_operand:SI 1 "register_operand" "r"))
+   (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
+  ""
+  "movl.s\t%1,%0"
+  [(set_attr "length" "2")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 ;; Pad to 16-byte boundary, max skip in op0.  Used to avoid
 ;; branch prediction penalty for the third jump in a 16-byte
 ;; block on K8.
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c	(revision 151512)
+++ gcc/config/i386/i386.c	(working copy)
@@ -4777,6 +4777,24 @@ ix86_function_type_abi (const_tree fntype)
   return ix86_abi;
 }
 
+static int
+ix86_function_msvc_prologue (const_tree fntype)
+{
+  if (!TARGET_64BIT && fntype != NULL)
+{
+  if(lookup_attribute ("msvc_prologue", TYPE_ATTRIBUTES (fntype)))
+{
+#ifdef HAVE_AS_IX86_SWAP
+  return 1;
+#else
+  sorry ("msvc_prologue needs swap suffix support in as");
+  return 0;
+#endif
+}
+}
+  return 0;
+}
+
 static enum calling_abi
 ix86_function_abi (const_tree fndecl)
 {
@@ -4808,6 +4826,11 @@ ix86_call_abi_override (const_tree fndecl)
 cfun->machine->call_abi = ix86_abi;
   else
 cfun->machine->call_abi = ix86_function_type_abi (TREE_TYPE (fndecl));
+
+  if (fndecl == NULL_TREE)
+cfun->machine->msvc_prologue = 0;
+  else
+cfun->machine->msvc_prologue = ix86_function_msvc_prologue (TREE_TYPE (fndecl));
 }
 
 /* MS and SYSV ABI have different set of call used registers.  Avoid expensive
@@ -8316,6 +8339,7 @@ ix86_expand_prologue (void)
   bool pic_reg_used;
   struct ix86_frame frame;
   HOST_WIDE_INT allocate;
+  int gen_

Re: How can I download gcc?

2009-09-25 Thread Stefan Dösinger


On 25.09.2009 at 07:38, gerhard gangl wrote:

> Hello gcc team, I am looking for gcc to download.
> Who can help me?
>
> Kind regards and thanks in advance, _gerhard_ gangl, reichsstr. 77, 8045
> graz
Almost all Linux distributions ship gcc; you only need to install it with
the package manager. Otherwise: http://gcc.gnu.org/install/binaries.html

PS: The usual language on this mailing list is English; not everyone here
understands German.




Feature request concerning opcodes in the function prolog

2009-01-07 Thread Stefan Dösinger
Hello,

I am working on the Wine project (www.winehq.org), which (obviously) uses gcc
to compile its own code. There are some Windows applications like
Steam (www.steampowered.com) and others which try to hook Win32 API functions
by patching the first 5 bytes of the Windows API functions exported by our
DLLs with a jump. Unfortunately these applications are rather picky
regarding the instructions they expect in the prolog. I'm trying to add some
function attributes to give Wine more control over that, but I am afraid
that I need some help.
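
As an illustration of why exactly 5 bytes are involved, the following C sketch builds a 5-byte near jump (opcode e9 followed by a 32-bit relative displacement). That the hooking applications use this particular jump form is an assumption, and make_jmp_patch is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Fill OUT with "jmp rel32" as it would be written at FUNC_ADDR in order
   to redirect execution to HOOK_ADDR.  Hypothetical helper; addresses are
   plain 32-bit values, nothing is written to real code.  */
static void
make_jmp_patch (uint32_t func_addr, uint32_t hook_addr, unsigned char out[5])
{
  /* The displacement is relative to the end of the 5-byte jump.  */
  uint32_t rel = hook_addr - (func_addr + 5u);

  assert (out != 0);
  out[0] = 0xe9;
  out[1] = (unsigned char) (rel & 0xff);          /* little-endian rel32 */
  out[2] = (unsigned char) ((rel >> 8) & 0xff);
  out[3] = (unsigned char) ((rel >> 16) & 0xff);
  out[4] = (unsigned char) ((rel >> 24) & 0xff);
}
```

The patch overwrites whatever instructions occupy those 5 bytes, so the application has to know what they were in order to replay them before continuing in the original function.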

This is what the average Win32 API prolog looks like:

8b ff  mov    %edi,%edi
55     push   %ebp
8b ec  mov    %esp,%ebp

There are two differences to the code gcc generates that hurt us:

8b ff  mov    %edi,%edi
This instruction was added by Microsoft in XP Service Pack 2 to make hooking
the Win32 API functions easier. Microsoft uses it themselves for a feature
they call "hotpatching", to avoid reboots after updates. Consequently many
other applications try to use it. I think Microsoft generates it using the
__naked__ function attribute by writing the function prolog and epilog by
hand, but I saw some mails on this list which make it pretty clear that this
will never ever be supported in gcc, and we don't like it ourselves and
would prefer some other solution.

The attached file gcc.diff contains a kinda helpless attempt to generate
this instruction. My first problem is that any optimization optimizes it
away because it is a NOP. Is there any way to prevent the optimizer from
removing it?

The second problem is that I haven't found yet how to read my attribute
"ms_hook" in that code. The different trees still confuse me.

Finally, is there a nicer way to specify the %edi register? Hardcoding the
value 5 doesn't look pretty.


8b ec  mov    %esp,%ebp
The problem here is the "8b ec". Binutils generates "89 e5". It’s the same
instruction, but the Microsoft version has the direction inversion flag set.
Unfortunately the Windows apps only know the 8b ec version because that is
all they find on Windows.
(This flag is also set on the mov %edi, %edi above - gcc currently generates
89 ff)

I talked to the binutils maintainers and they helped me by adding a new
instruction suffix .s:
movl    %esp, %ebp   => 89 e5
movl.s  %esp, %ebp   => 8b ec
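
The two encodings can be derived from the register numbers. The sketch below, in C, is only an illustration of the i386 encoding rules for a register-to-register mov; encode_mov_rr and the enum are hypothetical names, not binutils or gcc code.

```c
#include <assert.h>

/* i386 register numbers as used in the ModRM byte.  */
enum reg32 { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI };

/* Encode AT&T "movl %SRC, %DST" for two 32-bit registers.
   SWAP == 0: opcode 0x89 (MOV r/m32, r32), source in the reg field.
   SWAP != 0: opcode 0x8b (MOV r32, r/m32), destination in the reg field,
   which is the form the .s suffix asks the assembler for.  */
static void
encode_mov_rr (enum reg32 src, enum reg32 dst, int swap, unsigned char out[2])
{
  assert (src <= EDI && dst <= EDI);
  if (swap)
    {
      out[0] = 0x8b;
      out[1] = (unsigned char) (0xc0 | (dst << 3) | src);
    }
  else
    {
      out[0] = 0x89;
      out[1] = (unsigned char) (0xc0 | (src << 3) | dst);
    }
}
```

For %esp to %ebp this gives 89 e5 without the swap and 8b ec with it; for %edi to %edi only the opcode byte differs (89 ff versus 8b ff).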

Now I need gcc to set this suffix, but here I am pretty lost. I haven't
found out yet how to add this to the gcc code. The other issue is
compatibility with old binutils versions, since so far this feature is in
svn only and no release has it. Does gcc attempt to detect binutils
features, or does it just assume that everything it needs is there?

Thanks for your help,
Stefan Dösinger



gcc.diff
Description: Binary data


RE: Feature request concerning opcodes in the function prolog

2009-01-07 Thread Stefan Dösinger


> -Original Message-
> From: H.J. Lu [mailto:hjl.to...@gmail.com]
Nice to see a familiar face, or better, mail address :-)

> You need to check assembler features with autoconf before using them.
> See HAVE_AS_IX86_SAHF as example.
Thanks!

Does that look ok? It seems to detect the support correctly for me:

gcc_GAS_CHECK_FEATURE([swap suffix],
  gcc_cv_as_ix86_swap,,,
  [movl.s %esp, %ebp],,
  [AC_DEFINE(HAVE_AS_IX86_SWAP, 1,
  [Define if your assembler supports the swap suffix.])])




RE: Feature request concerning opcodes in the function prolog

2009-01-07 Thread Stefan Dösinger
> You can make a new instruction pattern with an UNSPEC_VOLATILE pattern.
> For a quick prototype you could also use an assembler prologue, although
> if you need not experiment with different insn sequences, this will likely
> be more work in the long run if/when assembler prologues are eventually
> discontinued.
Thanks for the hint - I'll see what I can find. Do you have any hints for
good examples in the existing code?

> > Now I need gcc to set this suffix, but here I am pretty lost. I haven't
> > found out yet how to add this to the gcc code.
>
> This becomes a non-issue when you define your own UNSPEC_VOLATILE pattern
> or assembler prologue.
Ideally I'd like to set this suffix on all instructions in a function (gas
ignores the suffix on instructions that don't support it). I'll probably
survive if it is only set on the mov %edi, %edi and mov %esp, %ebp.
Unfortunately, not all functions that apps try to hook start with the
mov %edi, %edi. This means that 2 bytes remain problematic, and on those
there's often a xor %eax, %eax (which is either 31 c0 or 33 c0) or a sub
modifying the stack pointer. So the problem potentially goes beyond the
prologue.
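
The 31 c0 / 33 c0 pair follows the same pattern as 89 / 8b: the two opcodes differ only in bit 1, the direction bit. As a rough sketch in C, and assuming a two-byte instruction with register-direct operands whose opcode has such a direction bit, one form can be turned into the other like this (swap_direction is a hypothetical helper):

```c
#include <assert.h>

/* Convert a two-byte register-to-register instruction between its two
   equivalent encodings: toggle the direction bit in the opcode and
   exchange the reg and r/m fields of the ModRM byte.  Only meaningful
   for opcodes that have a direction bit (e.g. mov 89/8b, xor 31/33).  */
static void
swap_direction (unsigned char insn[2])
{
  unsigned char modrm = insn[1];
  unsigned char reg = (modrm >> 3) & 7;
  unsigned char rm = modrm & 7;

  assert ((modrm & 0xc0) == 0xc0);   /* register-direct operands only */
  insn[0] ^= 0x02;
  insn[1] = (unsigned char) (0xc0 | (rm << 3) | reg);
}
```

Applying the function twice restores the original bytes, which reflects that both encodings describe the same instruction.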




RE: Feature request concerning opcodes in the function prolog

2009-01-07 Thread Stefan Dösinger
> An example of an unspec_volatile instruction pattern in config/i386/i386.md
> is "cld".
I ran across that, your hints should give me some information to chew on for
the next hours. Currently I am compiling with this code to see what happens:

(define_insn "movnop"
[(unspec_volatile [(const_int 0)] UNSPECV_MOVNOP)]
  ""
  "movl.s\t%%edi,%%edi"
  [(set_attr "length" "2")
   (set_attr "length_immediate" "0")
   (set_attr "modrm" "0")])

and then a gen_movnop(/* better name anyone? */) to use it. Let's see what
happens.
 
> But since you have to have a new gas anyway, wouldn't it be simpler to have
> a new option for gas to instruct it to choose the opcodes that are expected
> by the win32 applications?
This was my first idea, but Alexandre Julliard (the Wine maintainer) disliked
it and preferred a function attribute to turn it on per-function. However,
from looking at the gcc code it seems that this is the best option to
generate Win32-friendly code everywhere. I'll look at your other suggestions
and talk to Alexandre again.




RE: Feature request concerning opcodes in the function prolog

2009-01-08 Thread Stefan Dösinger
> >> But since you have to have a new gas anyway, wouldn't it be simpler to
> >> have a new option for gas to instruct it to choose the opcodes that are
> >> expected by the win32 applications?
> > This was my first idea, but Alexandre Julliard (the Wine maintainer)
> > disliked it and preferred a function attribute to turn it on per-function.
I talked to Alexandre again, and his main concern wasn't so much the global
flag, but that the existence of the push %ebp; mov %esp, %ebp was still up
to the feelings of the compiler and the moon phase.

So what he wants is something like a msvc_prolog attribute that makes sure
that the function starts with the following instructions and bytecode
sequence, no matter what -fomit-frame-pointer and friends say:

8b ff    mov.s %edi, %edi
55       push %ebp
8b ec    mov.s %esp, %ebp

So we basically need the msvc_prolog to add the "mov.s %edi, %edi" and force
the frame pointer on, and make sure that this whole code is right at the
beginning of the function (this potentially conflicts with the stack
alignment LEA).

An alternative would be an ability to add custom assembler code to the start
of each function, similarly to the __naked__ attribute, but probably with
the constraint that the asm code is in total a nop, so the compiler still
generates its own prolog. However, I have no idea how that could be
implemented and fit into the C syntax, and it wouldn't be too nice wrt
performance

What will not really work is writing an __ASM_GLOBAL_FUNC that has the
wrapper code and then calls or jmps to the real function. First of all that
looks pretty ugly, and some windows software(copy protection mostly) doesn't
like CALLs or JMPs(yes, these DRM systems often conflict with other DRM
systems that install hooks or even rootkits)

I'll try to come up with some proof of concept code later today.




RE: Feature request concerning opcodes in the function prolog

2009-01-08 Thread Stefan Dösinger

> You don't need to force the frame pointer on, it is sufficient to say that
> ebp needs restoring at the end of the function no matter if it looks otherwise
> used or not - and you have to take the frame size impact of the saved ebp into
> account.
How does this fit together with the stack realignment code? This is
something I am not sure about yet.

I am considering something like this pseudo code (not done with real coding
yet):

int setup_frame_pointer = frame_pointer_needed;

if(msvc_prolog)
{
    emit_insn(gen_movnop(%di)); /* mov %edi, %edi */
    emit_insn(gen_msvc_fp(%sp, %bp)); /* push %ebp; mov %esp, %ebp */

    if(frame_pointer_needed && !(crtl->drap_reg &&
                                 crtl->stack_realign_needed))
    {
        /* The msvc prologue already set up the frame pointer */
        setup_frame_pointer = 0;
    }
    else
    {
        emit_insn(gen_revert_msvc_fp(%bp)); /* pop %ebp */
    }
}

/* ... */
/* stack realignment code here */
/* ... */

if(setup_frame_pointer /* used to be frame_pointer_needed */) {
    /* Normal fp setup code */
}

Basically the idea is that if the frame pointer is set up there's this code
mov.s %edi, %edi
push %ebp
mov.s %esp, %ebp
; frame pointer already set up, continue with normal code. Nice and pretty

If the frame pointer is not needed:
mov.s %edi, %edi
push %ebp
mov.s %esp, %ebp
pop %ebp
; Continue normally here. I think that case can't be improved too much,
since the msvc_prolog stuff modifies the base pointer.

Now my problem: If the frame pointer is needed, and the stack realignment is
needed:
mov.s %edi, %edi
push %ebp
mov.s %esp, %ebp
pop %ebp
leal ...
push %ebp
mov %esp, %ebp

This doesn't look pretty, really.

> Moreover, if your prologue begins with an unspec_volatile that emits the
> three-instruction sequence you want, the optimizers should leave it there
> at the start of the function.
> Although it is probably easiest to get debug and unwind information right
> if you make this three separate unspec_volatile patterns, with their
> respective REG_FRAME_RELATED_EXPR notes where applicable.
> I.e. the push ebp saves ebp and changes the stack.
> The mov.s esp,ebp needs a REG_FRAME_RELATED_EXPR note only if you are
> actually using a frame pointer.
The REG_FRAME_RELATED_EXPR is set with this, right?
RTX_FRAME_RELATED_P (insn) = 1;

I haven't yet figured out what it does exactly.

Another issue is 64 bit. I guess if I use the hard_frame_pointer_rtx,
stack_pointer_rtx and similar register identifiers and pass them to the
gen_ functions I should get proper code for both 32 and 64 bit, right?
Besides that I still have to look if Windows has such a hooking-friendly
function prologue on Win64 too, or if that is just a 32 bit issue.




RE: Feature request concerning opcodes in the function prolog

2009-01-08 Thread Stefan Dösinger
> If ebp needs to be saved because it contains a user variable, it is better
> not to pop it in the prologue - pop it in the epilogue instead, and you don't
> need to have another save/restore.
Sounds reasonable. Is there any flag I can set to make the epilogue pop ebp?
 
> This can be done with much shorter assembly, at the cost of a bit more
> logic in your prologue / epilogue expanders:
> 
> mov.s %edi, %edi
> push %ebp
> mov.s %esp, %ebp
> leal ... (adjust value to account for the ebp stack slot)
Yes, but then ebp contains the value of esp before it was realigned.
Couldn't that cause problems? Of course I guess I could just copy %esp into
%ebp again after the leal. The other concern is that now %ebp is in the
unaligned address on the stack, while before that it was aligned.




RE: Feature request concerning opcodes in the function prolog

2009-01-12 Thread Stefan Dösinger
Here's some code attached that actually works, but is far from perfect.

The 'msvc_prologue' attribute is limited to 32 bit. None of the applications
that try to place hooks are ported to Win64 yet, so it is impossible to tell
what they'll need. Besides, maybe I am lucky and when they appear I can
tell their authors to think about Wine.

The first problem I (still) have is passing the msvc_prologue attribute
around. I abused some other code to do that. How does the attribute handling
work, and what would the preferred way be?

The 2nd thing I have to figure out is how and when I have to set
REG_FRAME_RELATED_EXPR.

The msvc_prologue + frame_pointer_needed + !stack_realignment_needed case
produces the best possible code. The fp setup from msvc_prologue is used for
its purpose.

The msvc_prologue + !frame_pointer_needed case could be optimized, as you
said. However, that changes all the stack frame offsets and sizes, and I do
not quite understand all the code I have to modify for that. I think this
should be a separate patch, although arguably be ready before msvc_prologue
is added. I personally don't care too much about this combination of
parameters(Wine won't need/use it), so this optimization would get lost
otherwise.

With stack_realignment_needed frame_pointer_needed is on as well, and this
code is created(copypasted together by hand, somehow the stack alignment
attribute doesn't do anything on my Linux box)

movl.s %edi, %edi
pushl  %ebp
movl.s %esp, %ebp
pop    %ebp
lea    0x4(%esp),%ecx
and    $0xfffffff0,%esp
pushl  -0x4(%ecx)
push   %ebp
mov    %esp,%ebp

If we try to get rid of the pop-push-mov, the following things change:

*) The value of %ebp
*) The location of the pushed ebp on the stack
*) The alignment of %esp after the whole procedure (it's alignment+4 before,
and the +4 is lost afterwards)

Now the question is, what parts are important for the stack realignment
attribute? I think we can't correct point 2. If we correct (1) and (3) this
has to be done with less than 3-4 extra instructions if it should be worth
the effort. I don't see any code right now that would accomplish that, but
maybe I'm missing something.



msvc.diff
Description: Binary data


RE: Feature request concerning opcodes in the function prolog

2009-01-12 Thread Stefan Dösinger
> Have you thought about making .s an assembler command-line flag, so that
> this flag could be passed automatically by the compiler under mingw?
Yes.

For my purposes it is not really suitable, because we have to make sure that
the push %ebp and mov %esp, %ebp are there, no matter what the compiler
arguments are(-fomit-frame-pointer). So just adding the mov %edi, %edi isn't
enough, and while I'm at it I can add the .s to the insns anyway. (see the
archives for more details)

However, an assembler command line flag and gcc setting it for msvc_prologue
aren't mutually exclusive, so if mingw needs it it could be done anyway.