DLL function importing "bugs"?

2017-01-16 Thread karol82
Hi,
I found some small bugs when a function is imported from a DLL.

Test library:

#include <windows.h>

__declspec(dllexport)
bool test() {
    return true;
}

extern "C"
BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved) {
    return TRUE;
}

Build with:
  g++ -shared -o dlltest.dll dlltest.cpp -Wl,--out-implib,libdlltest.a -O3 \
  -flto -s
And small test program:

#ifdef USE_DLL_IMPORT
__declspec(dllimport)
#endif
bool test();

int main() {
    return test() ? 0 : -1;
}

Build with:
g++ apptest.cpp -o apptest1.exe -O3 -flto -ldlltest -L. -s
g++ apptest.cpp -o apptest2.exe -O3 -flto dlltest.dll -s
g++ apptest.cpp -o apptest3.exe -O3 -flto -ldlltest -L. -DUSE_DLL_IMPORT -s

And now, the generated code for calling the `test()` function:

For apptest1, apptest2:
  in main: call _Z4testv
  in _Z4testv: jmp cs:__imp__Z4testv
For apptest3:
  in main: call cs:_Z4testv
  not-referenced: jmp cs:_Z4testv

Another test, for builtin functions whose full versions live in MSVCRT.dll:

__declspec(dllimport)
void * memcpy(void * destination, const void * source, size_t num);

void * memcpy_wrapper(size_t num, void * destination, const void * source) {
    return memcpy(destination, source, num);
}

In this case memcpy_wrapper calls memcpy from MSVCRT.dll through a wrapper
(as apptest1 and apptest2 do above). I don't really know whether this is a gcc
or a mingw problem, but I would expect the following:

1. LTO should figure out that `test()` is really a DLL function and not use
   the wrapper.
2. The optimizer should notice that the wrapper isn't referenced and remove it.
3. Wrappers shouldn't be used for builtin functions.
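
The two call shapes in the listing above can be modelled in portable C++, without a real DLL. This is only a rough sketch: every name in it is a hypothetical stand-in, and the function pointer merely plays the role of the import-table slot (`__imp__Z4testv`) that the loader would fill in.

```cpp
// Portable model of the two call shapes; no real DLL is involved.
// All names are hypothetical stand-ins, not linker-generated symbols.

// Stands in for the function body that lives inside the DLL.
static bool test_in_dll() { return true; }

// Stands in for the import-table slot that the loader fills with the
// function's real address. volatile keeps the load from being folded away.
static bool (*volatile imp_test)() = test_in_dll;

// Stands in for the local stub: its only job is to jump through the slot.
static bool test_stub() { return imp_test(); }

// Shape seen in apptest1/apptest2: the caller calls the stub, and the stub
// jumps through the import slot (a direct call plus an indirect jump).
bool call_via_thunk() { return test_stub(); }

// Shape seen in apptest3 (dllimport): the caller goes through the slot itself.
bool call_via_import_slot() { return imp_test(); }
```

Both paths return the same value; the only difference is the extra hop through the stub, which is what points 1 and 2 ask the toolchain to eliminate.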

I'm not experienced enough to even build gcc myself, so I have no idea how to
fix this. I hope that someone smarter will (or will tell me if it is
impossible).

I'm using: g++.exe (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 6.3.0

Sorry for my English, but I'm still learning :)
Regards, Karol Rudnik.


Re: oddities in the moxie gcc backend

2017-01-16 Thread Anthony Green
Mikael Pettersson  writes:

> I have a toy backend based on the moxie backend as a template.  During its
> development I found some oddities in the moxie backend that may be bugs.
>
> 1. The REGNO_OK_FOR_INDEX_P(NUM) macro in moxie.h is:
>
> #define REGNO_OK_FOR_INDEX_P(NUM) MOXIE_FP
>
> Since MOXIE_FP is 0, this returns false for every register.  Should the body
> be a literal 0, or some comparison between NUM and MOXIE_FP?

Great catch!  You are probably right.
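
A small stand-alone sketch of item 1, assuming only what the quoted text states (MOXIE_FP is 0 and the macro body ignores NUM). The `_CMP` variant and the helper function are hypothetical, written just to show the difference; which fix moxie.h should take is not settled in this thread.

```cpp
// Hypothetical stand-ins; the real definitions live in moxie.h.
#define MOXIE_FP 0

// Shape quoted above: NUM is ignored, so the result is always MOXIE_FP, i.e. 0.
#define REGNO_OK_FOR_INDEX_P(NUM) MOXIE_FP

// One of the two alternatives raised in the question: compare NUM with MOXIE_FP.
#define REGNO_OK_FOR_INDEX_P_CMP(NUM) ((NUM) == MOXIE_FP)

// Counts how many register numbers in [0, nregs) the quoted macro accepts.
int count_index_regs(int nregs) {
    int n = 0;
    for (int r = 0; r < nregs; ++r)
        if (REGNO_OK_FOR_INDEX_P(r))
            ++n;
    return n;
}
```

With the quoted definition the count is zero for any register range, which matches the observation that the macro is false for every register.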

> 2. I see no actual use of MOXIE_PC or the SPECIAL_REGS register class.  Could
> they be deleted (with adjustments for decrementing MOXIE_CC)?

Yes, probably.

> 3. moxie_compute_frame () doesn't take !fixed_regs[regno] into account, which
> the related loops in moxie_expand_prologue () and moxie_expand_epilogue ()
> do.  Bug?

Hmm.. looks like it.
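
To make the mismatch in item 3 concrete, here is a toy model rather than the actual moxie.c code. The three-register tables, the 4-byte slot size, and the exact liveness test are all assumptions made for illustration; only the presence or absence of the fixed_regs check comes from the quoted text.

```cpp
// Toy model, not moxie.c: three registers, all names and values hypothetical.
const int NREGS = 3;
const bool ever_live[NREGS]  = {true, true, false};
const bool call_used[NREGS]  = {false, false, false};
const bool fixed_regs[NREGS] = {false, true, false};  // register 1 is fixed

// Mirrors the frame computation as described: no fixed_regs test.
int bytes_reserved() {
    int bytes = 0;
    for (int r = 0; r < NREGS; ++r)
        if (ever_live[r] && !call_used[r])
            bytes += 4;  // assumed 4-byte save slot
    return bytes;
}

// Mirrors the prologue/epilogue loops as described: fixed registers are skipped.
int bytes_saved() {
    int bytes = 0;
    for (int r = 0; r < NREGS; ++r)
        if (!fixed_regs[r] && ever_live[r] && !call_used[r])
            bytes += 4;
    return bytes;
}
```

In this model the frame reserves 8 bytes while the prologue stores only 4, so the two disagree as soon as a fixed register is marked live.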

> There are also some minor nits:
>
> 4. The comment above `size_for_adjusting_sp' states it's used in
> expand_epilogue(), which it isn't.
>
> 5. The "Compute this since .." comment in moxie_initial_elimination_offset ()
> should probably refer to callee_saved_reg_size not local_vars_size, to match
> the code.
>
> 6. There are two identical definitions of TRULY_NOOP_TRUNCATION(op,ip) in
> moxie.h.  The first one looks misplaced and should probably be deleted.

Thanks for all of this feedback.   I'm going to test some patches.

AG


On Sun, Jan 15, 2017 at 9:02 AM, Mikael Pettersson  wrote:
> I have a toy backend based on the moxie backend as a template.  During its
> development I found some oddities in the moxie backend that may be bugs.
>
> 1. The REGNO_OK_FOR_INDEX_P(NUM) macro in moxie.h is:
>
> #define REGNO_OK_FOR_INDEX_P(NUM) MOXIE_FP
>
> Since MOXIE_FP is 0, this returns false for every register.  Should the body
> be a literal 0, or some comparison between NUM and MOXIE_FP?
>
> 2. I see no actual use of MOXIE_PC or the SPECIAL_REGS register class.  Could
> they be deleted (with adjustments for decrementing MOXIE_CC)?
>
> 3. moxie_compute_frame () doesn't take !fixed_regs[regno] into account, which
> the related loops in moxie_expand_prologue () and moxie_expand_epilogue ()
> do.  Bug?
>
> There are also some minor nits:
>
> 4. The comment above `size_for_adjusting_sp' states it's used in
> expand_epilogue(), which it isn't.
>
> 5. The "Compute this since .." comment in moxie_initial_elimination_offset ()
> should probably refer to callee_saved_reg_size not local_vars_size, to match
> the code.
>
> 6. There are two identical definitions of TRULY_NOOP_TRUNCATION(op,ip) in
> moxie.h.  The first one looks misplaced and should probably be deleted.
>
>
> /Mikael


Re: Throwing exceptions from a .so linked with -static-lib* ?

2017-01-16 Thread Paul Smith
On Thu, 2017-01-12 at 21:49 +, Yuri Gribov wrote:
> Note that the documentation for -static-libgcc explicitly mentions that
>    There are several situations in which an application should
>    use the shared libgcc instead of the static version.  The most
>    common of these is when the application wishes to throw and
>    catch exceptions across different shared libraries.  In that case,
>    each of the libraries as well as the application itself
>    should use the shared libgcc.
> Removing -static-libgcc fixes the problem with your repro case.

I could have sworn I tried all different combinations of the different
static flags, but sure enough if I don't add -static-libgcc on either
the .so or the executable things work OK.

Thanks!


[RFC] Further LRA subreg handling issues

2017-01-16 Thread Matthew Fortune
Hi Vladimir,

I'm working on PR target/78660 which is looking like a latent LRA bug.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78660

I believe the problem is in the same area as a bug that was fixed in 2015:

https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02165.html

Eric pointed out that the new issue relates to something reload
specifically dealt with in reload1.c:eliminate_regs_1:

  if (MEM_P (new_rtx)
      && ((x_size < new_size
           /* On RISC machines, combine can create rtl of the form
              (set (subreg:m1 (reg:m2 R) 0) ...)
              where m1 < m2, and expects something interesting to
              happen to the entire word.  Moreover, it will use the
              (reg:m2 R) later, expecting all bits to be preserved.
              So if the number of words is the same, preserve the
              subreg so that push_reload can see it.  */
           && !(WORD_REGISTER_OPERATIONS
                && (x_size - 1) / UNITS_PER_WORD
                   == (new_size -1 ) / UNITS_PER_WORD))
          || x_size == new_size)
      )
    return adjust_address_nv (new_rtx, GET_MODE (x), SUBREG_BYTE (x));
  else
    return gen_rtx_SUBREG (GET_MODE (x), new_rtx, SUBREG_BYTE (x));
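
The size test in the reload excerpt above can be pulled out into a plain predicate to see which cases keep the SUBREG. This is a sketch, not GCC code: the function name and parameters are hypothetical, sizes are in bytes and assumed to be at least 1, and the MEM_P check is taken as already true.

```cpp
// Returns true when the excerpt would rewrite the operand as a MEM via
// adjust_address_nv, false when it would keep the SUBREG.
// Hypothetical helper; x_size and new_size must be >= 1.
bool rewrite_as_mem(unsigned x_size, unsigned new_size,
                    bool word_register_operations, unsigned units_per_word) {
    // Same number of words occupied by the outer and inner modes.
    bool same_word_count =
        (x_size - 1) / units_per_word == (new_size - 1) / units_per_word;
    return (x_size < new_size
            && !(word_register_operations && same_word_count))
           || x_size == new_size;
}
```

For example, with 4-byte words a 1-byte subreg of a 4-byte value keeps the SUBREG on a WORD_REGISTER_OPERATIONS target, whereas a 4-byte subreg of an 8-byte value is rewritten as a MEM because the word counts differ.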

However the code in lra-constraints.c:curr_insn_transform does not appear
to make any attempt to handle a special case for WORD_REGISTER_OPERATIONS.
I tried the following patch to account for this, which 'works' but I'm not
at all sure what the conditions should be (the comment from reload will
need adapting and including as well):

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 260591a..ac8d116 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4086,7 +4086,9 @@ curr_insn_transform (bool check_only_p)
  && (goal_alt[i] == NO_REGS
  || (simplify_subreg_regno
  (ira_class_hard_regs[goal_alt[i]][0],
-  GET_MODE (reg), byte, mode) >= 0)
+  GET_MODE (reg), byte, mode) >= 0)))
+ || (GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (reg))
+ && WORD_REGISTER_OPERATIONS)))
{
  if (type == OP_OUT)
type = OP_INOUT;

I think at the very least the issue Richard pointed out in the previous
fix must be dealt with, as I believe the new testcase triggers exactly what
he described:

Richard Sandiford wrote:
> So IMO the patch is too broad.  I think it should only use INOUT reloads
> for !strict_low if the inner mode is wider than a word and the outer mode
> is strictly narrower than the inner mode.  That's on top of Vlad's
> comment about only converting OP_OUTs, of course.

And here is my attempt at dealing with that:

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index ac8d116..8a0f40f 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4090,7 +4090,17 @@ curr_insn_transform (bool check_only_p)
  || (GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (reg))
  && WORD_REGISTER_OPERATIONS)))
{
- if (type == OP_OUT)
+ /* An OP_INOUT is required when reloading a subreg of a
+mode wider than a word to ensure that data beyond the
+word being reloaded is preserved.  Also automatically
+ensure that strict_low_part reloads are made into
+OP_INOUT which should already be true from the backend
+constraints.  */
+ if (type == OP_OUT
+ && (curr_static_id->operand[i].strict_low
+ || (GET_MODE_SIZE (GET_MODE (reg)) > UNITS_PER_WORD
+ && GET_MODE_SIZE (mode)
+< GET_MODE_SIZE (GET_MODE (reg)))))
type = OP_INOUT;
  loc = &SUBREG_REG (*loc);
  mode = GET_MODE (*loc);
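
The condition added by the second patch can be read as the following stand-alone predicate. It is only a sketch of the logic in the diff: the enum, the function name, and the plain integer sizes are hypothetical stand-ins for LRA's internal types.

```cpp
// Hypothetical stand-in for LRA's operand type.
enum OpType { OP_IN, OP_OUT, OP_INOUT };

// Promote an output reload of a subreg to in/out when either the operand is
// a strict_low_part, or the inner mode is wider than a word and the outer
// mode is strictly narrower than the inner mode. Sizes are in bytes.
OpType reload_type_for_subreg(OpType type, bool strict_low,
                              unsigned inner_size, unsigned outer_size,
                              unsigned units_per_word) {
    if (type == OP_OUT
        && (strict_low
            || (inner_size > units_per_word && outer_size < inner_size)))
        return OP_INOUT;
    return type;
}
```

With 4-byte words, a 4-byte output subreg of an 8-byte register becomes OP_INOUT, while a 2-byte output subreg of a 4-byte register stays OP_OUT unless it is a strict_low_part.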

Any thoughts on whether this is along the right track would be appreciated.

Thanks,
Matthew


Re: LTO crashes with fortran code in SPEC CPU 2006

2017-01-16 Thread Andrew Pinski
On Sun, Jan 15, 2017 at 4:15 PM, Andrew Pinski  wrote:
> On Sun, Jan 15, 2017 at 4:09 PM, kugan
>  wrote:
>>
>>
>> On 15/01/17 15:57, Andrew Pinski wrote:
>>>
>>> This is just an FYI until I reduce the testcases, but 5 benchmarks
>>> in SPEC CPU 2006 with Fortran code are causing an ICE on
>>> aarch64-linux-gnu with -Ofast -flto -mcpu=thunderx2t99
>>> -fno-aggressive-loop-optimizations -funroll-loops:
>>> lto1: internal compiler error: in ipa_get_type, at ipa-prop.h:448
>>> 0x107c58f ipa_get_type
>>> ../../gcc/gcc/ipa-prop.h:448
>>> 0x107c58f propagate_constants_across_call
>>> ../../gcc/gcc/ipa-cp.c:2259
>>> 0x1080f4f propagate_constants_topo
>>> ../../gcc/gcc/ipa-cp.c:3170
>>> 0x1080f4f ipcp_propagate_stage
>>> ../../gcc/gcc/ipa-cp.c:3267
>>> 0x1081fcb ipcp_driver
>>> ../../gcc/gcc/ipa-cp.c:4997
>>> Please submit a full bug report,
>>> with preprocessed source if appropriate.
>>> Please include the complete backtrace with any bug report.
>>> See  for instructions.
>>> lto-wrapper: fatal error: gfortran returned 1 exit status
>>>
>>> I don't know when this started as I am just starting to run SPEC CPU
>>> 2006 fp side with my spec cpu 2006 config.
>>
>>
>> I am seeing this too for aarch64 with -O3 -flto. It did work a few weeks back.
>> This must be a new bug.
>
> I am reducing the crash right now with 459.GemsFDTD since that one
> seems like the smallest one to reduce.


Reduced it and filed it as PR 79108.  Note the GC parameters are
needed to reproduce the bug.

Thanks,
Andrew

>
> Thanks,
> Andrew
>
>>
>> Thanks,
>> Kugan
>>
>>
>>>
>>> Thanks,
>>> Andrew
>>>
>>


make[1]: *** wait: No child processes during make -j8 check

2017-01-16 Thread Martin Sebor

I've run into this failure during make check in the past with
a very large make -j value (such as -j128), but today I've had
two consecutive make check runs fail with -j12 and -j8 on my 8
core laptop with not much else going on.  The last thing running
was the go test suite.  Has something changed recently that
could be behind it?  (My user process limit is 62863.)

Thanks
Martin


Re: make[1]: *** wait: No child processes during make -j8 check

2017-01-16 Thread Andrew Pinski
On Mon, Jan 16, 2017 at 4:37 PM, Martin Sebor  wrote:
> I've run into this failure during make check in the past with
> a very large make -j value (such as -j128), but today I've had
> two consecutive make check runs fail with -j12 and -j8 on my 8
> core laptop with not much else going on.  The last thing running
> was the go test suite.  Has something changed recently that
> could be behind it?  (My user process limit is 62863.)

I just did a -j32 build/test and it worked on my 32 core ARM64 machine.
I was also doing a -j128 a few days ago on another machine.

Thanks,
Andrew


>
> Thanks
> Martin