About the GCC mirror on igor.onlinedirect.bg

2014-11-19 Thread igor

Hello,
I am writing to inform you that, unfortunately, OnlineDirect (the sponsoring
company) was acquired and the Igor machine will be shut down in the coming
weeks.

Best regards,
Igor team



New mirror

2009-01-17 Thread igor
Hello, we have decided to run a new GCC mirror in Bulgaria. Here are the details.
Country: Bulgaria
City: Sofia
Bandwidth: 2 Gbps aggregated link to the Bulgarian Peering, 500 Mbps
international
Contact: i...@onlinedirect.bg
URL: http://gcc.igor.onlinedirect.bg/
FTP: ftp://gcc.igor.onlinedirect.bg/others/gcc/
Limited to 1000 connections; synced every 6 hours.

Best regards



return void from void function is allowed.

2006-10-31 Thread Igor Bukanov

GCC 4.1.2 and 4.0.3 incorrectly accept the following program:

void f();

void g()
{
   return f();
}

No warnings are issued on my Ubuntu Pentium-M box. Is this a known bug?

Regards, Igor


return void from void function is allowed.

2006-10-31 Thread Igor Bukanov

-- Forwarded message --
From: Igor Bukanov <[EMAIL PROTECTED]>
Date: Oct 31, 2006 9:48 PM
Subject: Re: return void from void function is allowed.
To: Mike Stump <[EMAIL PROTECTED]>


On 10/31/06, Mike Stump <[EMAIL PROTECTED]> wrote:


This is valid in C++.


My copy of the 1997 C++ public draft contains:

6.6.3  The return statement
...
2 A return statement without an expression can be used only in functions
  that do not return a value, that is, a function with the return value
  type void, a constructor (_class.ctor_), or a destructor (_class.dtor_).
  A return statement with an expression can be used only in functions
  returning a value; the value of the expression is returned to the caller
  of the function. If required, the expression is implicitly converted to
  the return type of the function in which it appears. A return statement
  can involve the construction and copy of a temporary object
  (_class.temporary_). Flowing off the end of a function is equivalent to
  a return with no value; this results in undefined behavior in a
  value-returning function.

My reading of that is that C++ does not allow returning a void expression from
a void function. Was this changed later?


And final thought, wrong mailing list...  gcc-help would have been better.

I thought bugs in GCC could be discussed here. Sorry if that is a wrong assumption.

Regards, Igor
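
For context on the exchange above: the final 1998 C++ standard relaxed the
draft wording quoted here, and [stmt.return] explicitly permits a return
statement whose expression has type (cv) void inside a function returning
(cv) void, while ISO C still treats this as a constraint violation. A minimal
sketch of the difference (the file name and flags are only illustrative):

/* return-void.c / return-void.cc */
void f(void);

void g(void)
{
    return f();  /* accepted by g++; rejected by gcc -std=c99 -pedantic-errors */
}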


gcc-4.1.2 testsuite report MAC OS 10.3.9 Power PC G4 darwin7.9.0

2007-04-14 Thread Igor Nazarov

Test Run By igor on Sat Apr 14 03:32:31 2007
Native configuration is powerpc-apple-darwin7.9.0

=== gcc tests ===

Schedule of variations:
unix

FAIL: gcc.c-torture/compile/pr23237.c  -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c  -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c  -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c  -O3 -fomit-frame-pointer  (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c  -O3 -fomit-frame-pointer -funroll-loops  (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  (test for excess errors)

FAIL: gcc.c-torture/compile/pr23237.c  -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c  -Os  (test for excess errors)

FAIL: tmpdir-gcc.dg-struct-layout-1/t001 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t024 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t025 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t026 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t027 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o-c_compat_y_tst.o execute


FAIL: gcc.dg/attr-weakref-1.c (test for excess errors)
FAIL: gcc.dg/builtins-18.c (test for excess errors)
FAIL: gcc.dg/builtins-20.c (test for excess errors)
FAIL: gcc.dg/builtins-55.c (test for excess errors)
FAIL: gcc.dg/darwin-version-1.c (test for excess errors)

FAIL: gcc.dg/torture/builtin-convert-1.c  -O0  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-1.c  -O1  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-1.c  -O2  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-1.c  -O3 -fomit-frame-pointer  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-1.c  -O3 -g  (test for excess errors)

FAIL: gcc.dg/torture/builtin-convert-1.c  -Os  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c  -O0  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c  -O1  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c  -O2  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c  -O3 -fomit-frame-pointer  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c  -O3 -g  (test for excess errors)

FAIL: gcc.dg/torture/builtin-convert-2.c  -Os  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c  -O0  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c  -O1  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c  -O2  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c  -O3 -fomit-frame-pointer  (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c  -O3 -g  (test for excess errors)

FAIL: gcc.dg/torture/builtin-convert-3.c  -Os  (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c  -O0  (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c  -O1  (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c  -O2  (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c  -O3 -fomit-frame-pointer  (test for excess errors)

FAIL: gcc.dg/torture/builtin-power-1.c  -O3 -g  (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c  -Os  (test for excess errors)

FAIL: gcc.target/powerpc/darwin-longlong.c execution test
FAIL: gcc.target/powerpc/pr18096-1.c stack frame too large (test for warnings, line 11)

FAIL: gcc.target/powerpc/pr18096-1.c (test for excess errors)
FAIL: gcc.target/powerpc/stabs-attrib-vect-darwin.c scan-assembler .stabs.*vi:\\(0,16\\)[EMAIL PROTECTED]


=== gcc Summary ===

# of expected passes            39466
# of unexpected failures        47
# of expected failures          98
# of untested testcases         28
# of unsupported tests          382


=== g++ tests ===

Schedule of variations:
unix

FAIL: g++.dg/abi/rtti3.C scan-assembler .weak[ \t]_?_ZTSPP1A
XPASS: g++.dg/tree-ssa/pr14814.C scan-tree-dump-times &this 0
FAIL: g++.dg/warn/huge-val1.C (test for excess errors)
FAIL: g++.dg/warn/weak1.C (test for excess errors)

FAIL: g++.dg/special/conpr-3.C execution test

XPASS: g++.old-deja/g++.eh/badalloc1.C execution test

=== g++ Summary ===

# of expected passes            12287
# of unexpected failures        4
# of unexpected successes       2
# of expected failures          67
# of unsupported tests          120

=== gfortran tests ===

Schedule of variations:
unix

FAIL: gfortran.dg/large_real_kind_2.F90  -O0  execution test
FAIL: gfortran.dg/large_real_kind_2.F90  -O1  execution test
FAIL: gfortran.dg/large_real_kind_2.F90  -O2  execution test
FAIL: gfortran.dg/large_real_kind_2.F90  -O3 -fomit-frame-pointer  execution test
FAIL: gfortran.dg/large_real_kind_2.F90  -O3 -fomit-f

Reorder/combine insns on superscalar arch

2016-01-14 Thread Igor Shevlyakov
Guys,

I'm trying to make the compiler generate better code for a superscalar
in-order machine but can't find the right way to do it.

Imagine the following code:

long f(long* p, long a, long b)
{
  long a1 = a << 2;
  long a2 = a1 + b;
  return p[a1] + p[a2];
}

By default the compiler generates something like this, in pseudo-asm:

    shl   r3, r3, 2
    add   r4, r3, r4
    ld8   r15, [r2 + r3 * 8]
    ld8   r2,  [r2 + r4 * 8]
    { add   r2, r2, r15   ; ret }

but it would be way better this way:

    { sh_add  r4, r4, (r3 << 2)   ; shl  r3, r3, 2 }
    { ld8     r15, [r2 + r3 * 8]  ; ld8  r2, [r2 + r4 * 8] }
    { add     r2, r2, r15         ; ret }

The second sequence is 2 cycles shorter. The combine pass even shows patterns
like this, but fails to transform them because they are wrapped in a parallel:

Failed to match this instruction:
(parallel [
    (set (reg:DI 56)
         (plus:DI (mult:DI (reg:DI 3 r3 [ a ])
                           (const_int 4 [0x4]))
                  (reg:DI 4 r4 [ b ])))
    (set (reg/v:DI 40 [ a1 ])
         (ashift:DI (reg:DI 3 r3 [ a ])
                    (const_int 2 [0x2])))
])

What would be a proper way to perform reorganizations like this in a general way?

The same goes for the pointer-increment case:

add r2, r2, 1
ld r3, [r2+0]

would be much better off like this:

{ ld r3, [r2 + 1] ; add r2, r2, 1 }

Are these kinds of things overlooked, or have I failed to set something up in
the machine-dependent portion?

Thanks a lot for your thoughts


Re: Reorder/combine insns on superscalar arch

2016-01-14 Thread Igor Shevlyakov
Thanks Jeff,

I really hoped that I had missed something and that there was a better answer.
But would it do any harm if the combiner tried to check every piece of such a
parallel and, if every component is matchable and the total cost is no worse,
emitted them separately?
It would change nothing for single-issue machines beyond some reordering, but
it would help many multi-issue ones...
What are the pitfalls of this approach?

Thanks

On Thu, Jan 14, 2016 at 8:55 PM, Jeff Law  wrote:
> On 01/14/2016 04:47 PM, Igor Shevlyakov wrote:
>>
>> Guys,
>>
>> I'm trying to make compiler to generate better code on superscalar
>> in-order machine but can't find the right way to do it.
>>
>> Imagine the following code:
>>
>> long f(long* p, long a, long b)
>> {
>>long a1 = a << 2;
>>long a2 = a1 + b;
>>return p[a1] + p[a2];
>> }
>
>
>
>>
>> by default compiler generates something like this in some pseudo-asm:
>>
>>  shl   r3, r3, 2
>>  add   r4, r3, r4
>>  ld8   r15, [r2 + r3 * 8]
>>  ld8   r2,  [r2 + r4 * 8]
>>  { add   r2, r2, r15   ; ret }
>>
>> but it would be way better this way:
>>
>>  { sh_add  r4, r4, (r3 << 2)   ; shl  r3, r3, 2 }
>>  { ld8     r15, [r2 + r3 * 8]  ; ld8  r2, [r2 + r4 * 8] }
>>  { add     r2, r2, r15         ; ret }
>
>
>
>>
>> 2nd sequence is 2 cycles shorter. Combiner pass even shows patterns
>> like this but fail to transform this as it wrapped in parallel:
>>
>> Failed to match this instruction:
>> (parallel [
>>  (set (reg:DI 56)
>>  (plus:DI (mult:DI (reg:DI 3 r3 [ a ])
>>  (const_int 4 [0x4]))
>>  (reg:DI 4 r4 [ b ])))
>>  (set (reg/v:DI 40 [ a1 ])
>>  (ashift:DI (reg:DI 3 r3 [ a ])
>>  (const_int 2 [0x2])))
>>  ])
>
> You can always write a pattern which matches the PARALLEL.  You can then
> either arrange to emit the assembly code from that pattern or split the
> pattern (after reload/lra)
>
>>
>> What would be a proper way to perform reorganizations like this in general
>> way?
>>
>> The same goes with the pointer increment:
>>
>> add r2, r2, 1
>> ld r3, [r2+0]
>>
>> would be much better off like this:
>>
>> { ld r3, [r2 + 1] ; add r2, r2, 1 }
>>
>> Are those kind of things overlooked or I failed to set something in
>> machine-dependent portion?
>
> Similarly.  You may also get some mileage from LEGITIMIZE_ADDRESS, though it
> may not see the add/load together which would hinder its ability to generate
> the code you want.
>
> Note that using a define_split to match these things prior to reload likely
> won't work because combine will likely see the split pattern as being the
> same cost as the original insns.
>
> In general the combiner really isn't concerned with superscalar issues,
> though you can tackle some superscalar things with creative patterns that
> match parallels or which match something more complex, but then split it up
> later.
>
> Note that GCC supports a number of superscalar architectures -- they were
> very common for workstations and high end embedded processors for many
> years.  MIPS, PPC, HPPA, even x86, etc all have variants which were tuned
> for superscalar code generation.  I'm sure there's tricks you can exploit in
> every one of those architectures to help generate code with fewer data
> dependencies and thus more opportunities to exploit the superscalar nature
> of your processor.
>
>
>
> jeff


Is this FE bug or am I missing something?

2016-09-11 Thread Igor Shevlyakov
Guys,

The small sample below fails (at least on 6.1) for multiple targets. The
difference between the two functions starts at the very first tree pass...

Please confirm that I'm not crazy and it's not supposed to be like this...

Thanks

--
#include "limits.h"
#include "stdio.h"

int* __attribute__((noinline)) f1(int* p, int x)
{
    return &p[x + 1];
}

int* __attribute__((noinline)) f2(int* p, int x)
{
    return &p[1 + x];
}

int P[10];

int main()
{
    int x = INT_MAX;
    if (f1(P, x) != f2(P, x)) {
        printf("Error!\n");
        abort();
    }
}
--


Re: Is this FE bug or am I missing something?

2016-09-12 Thread Igor Shevlyakov
Well, my concern is not what happens on overflow (which, in the second case,
-fsanitize=undefined will address), but rather the consistency of the two
cases.

p[x+1] generates RTL which leads to better generated code at the expense of
possible overflow, while p[1+x] never overflows but leads to worse code.
It would be beneficial to make the behaviour consistent between these two cases.

Thanks for your input

On Mon, Sep 12, 2016 at 12:51 AM, Marc Glisse  wrote:
> On Sun, 11 Sep 2016, Igor Shevlyakov wrote:
>
>> Small sample below fails (at least on 6.1) for multiple targets. The
>> difference between two functions start at the very first tree pass...
>
>
> You are missing -fsanitize=undefined (and #include <stdlib.h>).
>
> Please use the mailing list gcc-h...@gcc.gnu.org next time.
>
> --
> Marc Glisse


RE: Listing a maintainer for libcilkrts, and GCC's Cilk Plus implementation generally?

2014-09-23 Thread Zamyatin, Igor
> The original plan was for Balaji to take on this role; however, his assignment
> within Intel has changed and thus he's not going to have time to work on
> Cilk+ anymore.
> 
> Igor Zamyatin has been doing a fair amount of Cilk+ maintenance/bugfixing
> and it might make sense for him to own it in the long term if he's interested.

That's right. 
Can I add two records (Cilk Plus and libcilkrts) to the Various Maintainers section?

Thanks,
Igor

> 
> jeff


RE: Listing a maintainer for libcilkrts, and GCC's Cilk Plus implementation generally?

2015-03-06 Thread Zamyatin, Igor
> I apologize. They got caught up in other issues. They've been merged into
> our mainstream and I believe they were just posted to the cilkplus.org
> website and submitted to GCC.

I'm going to submit the latest Cilk runtime sources next week, so I will check
the mentioned change.

Thanks,
Igor

> 
>   - Barry
> 
> -Original Message-
> From: Thomas Schwinge [mailto:tho...@codesourcery.com]
> Sent: Thursday, March 5, 2015 7:42 PM
> To: Jeff Law
> Cc: Zamyatin, Igor; Iyer, Balaji V; gcc@gcc.gnu.org; Tannenbaum, Barry M;
> H.J. Lu; Jakub Jelinek
> Subject: Re: Listing a maintainer for libcilkrts, and GCC's Cilk Plus
> implementation generally?
> 
> Hi!
> 
> On Thu, 5 Mar 2015 13:39:44 -0700, Jeff Law  wrote:
> > On 02/23/15 14:41, H.J. Lu wrote:
> > > On Mon, Sep 29, 2014 at 4:00 AM, Jakub Jelinek 
> wrote:
> > >> On Mon, Sep 29, 2014 at 12:56:06PM +0200, Thomas Schwinge wrote:
> > >>> On Tue, 23 Sep 2014 11:02:30 +, "Zamyatin, Igor"
>  wrote:
> > >>>> Jeff Law wrote:
> > >>>>> The original plan was for Balaji to take on this role; however,
> > >>>>> his assignment within Intel has changed and thus he's not going
> > >>>>> to have time to work on
> > >>>>> Cilk+ anymore.
> > >>>>>
> > >>>>> Igor Zamyatin has been doing a fair amount of Cilk+
> > >>>>> maintenance/bugfixing and it might make sense for him to own it in
> the long term if he's interested.
> > >>>>
> > >>>> That's right.
> > >>>
> > >>> Thanks!
> > >>>
> > >>>> Can I add 2 records (cilk plus and libcilkrts) to Various Maintainers
> section?
> > >>>
> > >>> I understand Jeff's email as a pre-approval of such a patch.
> > >>
> > >> I think only SC can appoint maintainers, and while Jeff is in the
> > >> SC, my reading of that mail wasn't that it was the SC that has
> > >> acked that, but rather a question if Igor is willing to take that
> > >> role, which then would need to be acked by SC.
> > >
> > > Where are we on this?  Do we have a maintainer for Cilk Plus and its
> > > run-time library?
> > Not at this time.  There was a bit of blockage on various things with
> > the steering committee (who approves maintainers).  I've got a
> > half-dozen or so proposals queued (including Cilk maintainership).
> 
> What's the process then, that I get my Cilk Plus (libcilkrts) portability patches
> committed to GCC?  I was advised this must be routed through Intel (Barry M
> Tannenbaum CCed), which I have done months ago: I submitted the patches
> to Intel, and -- as I understood it -- Barry and I seemed to agree about them
> (at least I don't remember any requests for changes to be made on my side),
> but I have not seen a merge from Intel to update GCC's libcilkrts.  Should I
> now commit to GCC the pending patches,
> <http://news.gmane.org/find-root.php?message_id=%3C8738bae1mp.fsf%40kepler.schwinge.homeip.net%3E>
> and following?
> 
> 
> Grüße,
>  Thomas


Pta_flags enum overflow in i386.c

2011-07-13 Thread Igor Zamyatin
Hi All!

As you may see, the pta_flags enum in i386.c is almost full, so there is a
risk of overflow in the quite near future. A comment in the source code
advises to "widen struct pta flags", which is currently defined as unsigned,
but that does not look optimal.

What would be the most proper solution for this problem?


Thanks in advance,
Igor
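
A common way out, sketched here purely as an illustration (none of these names
are the real i386.c identifiers), is to keep the enumerators as bit positions
and build the masks in a 64-bit type, so the enum itself can never overflow:

#include <stdint.h>

/* Hypothetical sketch: enumerate bit positions instead of masks. */
enum pta_flag_bit {
    PTA_BIT_MMX,
    PTA_BIT_SSE,
    PTA_BIT_SSE2,
    /* ... one enumerator per ISA feature, up to 64 ... */
    PTA_BIT_LAST
};

typedef uint64_t pta_flags_t;
#define PTA(bit) ((pta_flags_t) 1 << (bit))

/* Example use: pta_flags_t flags = PTA(PTA_BIT_SSE) | PTA(PTA_BIT_SSE2); */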


Option to print word size, alignment on the target platform

2006-01-25 Thread Igor Bukanov
Is there any option to ask GCC to print various size and alignment information
for the target platform? This would be very useful during cross-compilation,
when one cannot run executables to autoconfigure such parameters.

Currently I am considering a hack like compiling the following source:

#include <stddef.h>

union aligned_fields {
    double d;
    void (*f)();
    ...
};

struct align_test {
    union aligned_fields u1;
    char c;
};

const char DATA_POINTER_SIZE[sizeof(void *)];
const char FUNCTION_POINTER_SIZE[sizeof(void (*)())];
const char UNIVERSAL_ALIGN[offsetof(struct align_test, c)];
const char SHORT_SIZE[sizeof(short)];

and then running "nm --print-size" from binutils for the target on it to get:
0004 0004 C DATA_POINTER_SIZE
0004 0004 C FUNCTION_POINTER_SIZE
0002 0002 C SHORT_SIZE
0008 0008 C UNIVERSAL_ALIGN

But I doubt that this is reliable. So perhaps there is something like
gcc -print-target-info?


Re: Option to print word size, alignment on the target platform

2006-01-25 Thread Igor Bukanov
On 1/25/06, Paul Brook <[EMAIL PROTECTED]> wrote:
> Autoconf already has tests for things like this. Something along the lines of:
>
> const char D_P_S_4[sizeof(void *) == 4 ? 1 : -1];
> const char D_P_S_8[sizeof(void *) == 8 ? 1 : -1];
>
> Then see which compiles, or grep the error messages.

Right, but is there any way to learn about the endianness of the platform or
the direction of stack growth just from knowing whether a program compiles or
not? GCC knows about these, and it would be nice if there were a way to expose
them.

Regards, Igor


Re: Option to print word size, alignment on the target platform

2006-01-25 Thread Igor Bukanov
On 1/25/06, Robert Dewar <[EMAIL PROTECTED]> wrote:
> A convenient way to get the endianness is to use
> the System.Bit_Order attribute in Ada.

But this requires running the program on the target, which is not possible
with a cross-compiler. Or is there a trick to declare something in Ada that
would force the program to fail to compile depending on the target endianness?

Regards, Igor
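
For completeness: GCC releases newer than this thread predefine macros that
expose much of this directly, so a cross compiler can be queried without
running any target code. A minimal sketch, assuming a GCC recent enough to
provide these macros:

/* Compile (or just preprocess) with the cross compiler and read the values;
   nothing has to run on the target. */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
const char target_is_little_endian = 1;
#else
const char target_is_little_endian = 0;
#endif

const char pointer_size  = __SIZEOF_POINTER__;
const char short_size    = __SIZEOF_SHORT__;
const char max_alignment = __BIGGEST_ALIGNMENT__;

Alternatively, "target-gcc -dM -E - < /dev/null" lists every predefined macro.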


GCC 4.1: too strict aliasing?

2006-05-15 Thread Igor Bukanov

Consider the following code, which starting with GCC 4.1.0 generates a
'dereferencing type-punned pointer will break strict-aliasing rules' warning:


~> cat test.c
struct inner {
   struct inner *next;
};

struct outer {
   struct inner base;
   int value;
};

/* List of outer elements where all outer.base.next point to struct outer */
struct outer_list {
   struct outer *head;
};

struct outer *search_list(struct outer_list *list, int value)
{
    struct outer *elem, **pelem;

    pelem = &list->head;
    while ((elem = *pelem)) {
        if (elem->value == value) {
            /* Hit, move atom's element to the front of the list. */
            *pelem = (struct outer*)elem->base.next;
            elem->base.next = &list->head->base;
            list->head = elem;
            return elem;
        }

        /*** LINE GENERATING WARNING */
        pelem = (struct outer **)&elem->base.next;
    }
    return 0;
}

~> gcc -c -Wall -O2 test.c
test.c: In function 'search_list':
test.c:29: warning: dereferencing type-punned pointer will break
strict-aliasing rules

But why is the warning generated? Isn't it guaranteed that
offsetof(struct outer, base) == 0, and that one can always safely cast
struct inner* to struct outer* when struct inner is part of struct outer,
so that struct outer* can alias struct inner*?
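
A minimal sketch of one way to sidestep the warning (it changes the declared
type of the list head, which may or may not be acceptable in the real code):
traverse purely through struct inner pointers and convert to struct outer only
when returning an element, relying on base being the first member, so no
cross-type pointer-to-pointer cast is ever dereferenced:

struct inner { struct inner *next; };

struct outer {
    struct inner base;
    int value;
};

struct outer_list {
    struct inner *head;              /* changed: head stored as the link type */
};

struct outer *search_list(struct outer_list *list, int value)
{
    struct inner **pelem = &list->head;
    struct inner *link;

    while ((link = *pelem)) {
        struct outer *elem = (struct outer *) link;  /* base is the first member */
        if (elem->value == value) {
            /* Hit, move the element to the front of the list. */
            *pelem = link->next;
            link->next = list->head;
            list->head = link;
            return elem;
        }
        pelem = &link->next;         /* stays a struct inner ** throughout */
    }
    return 0;
}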


Wrong code for i686 target with -O3 -flto

2013-07-22 Thread Igor Zamyatin
Hi All!

Unfortunately, the compiler currently generates wrong code for the i686 target
when the options -O3 and -flto are used. This started more than a month ago
and is reflected in PR57602.

Such a combination of options can be quite important, at least from the
performance point of view.

Since there has been almost no reaction to this PR, I'd like to ask either
that it be looked at in the foreseeable future or that the commit responsible
for the issue be reverted.


Thanks,
Igor


Intel® Memory Protection Extensions support in the GCC

2013-07-24 Thread Zamyatin, Igor
Hi All!

This is to let you know that enabling Intel® MPX technology (see details in
http://download-software.intel.com/sites/default/files/319433-015.pdf) in GCC
has been started. (The corresponding changes in binutils are here:
http://sourceware.org/ml/binutils/2013-07/msg00233.html.)

Currently the compiler changes for Intel® MPX have been put on the branch
svn://gcc.gnu.org/svn/gcc/branches/mpx (this will soon be reflected in
svn.html). Ilya Enkovich (in CC) will be the main person maintaining this
branch and submitting changes to trunk.

Some implementation details can be found on the wiki.

Thanks,
Igor 



Compilation flags in libgfortran

2013-10-15 Thread Igor Zamyatin
Hi All!

Is there any particular reason that matmul* modules from libgfortran
are compiled with -O2 -ftree-vectorize?

I see some regressions on the Atom processor after r202980
(http://gcc.gnu.org/ml/gcc-cvs/2013-09/msg00846.html).

Why not just use -O3 for those modules?


Thanks,
Igor


Re: Compilation flags in libgfortran

2013-10-16 Thread Igor Zamyatin
Thanks a lot for the explanation!

I can take care of the benchmarking, but only on Intel hardware... Do you
think that possible changes based on those results would be acceptable?

Thanks,
Igor

On Tue, Oct 15, 2013 at 11:46 PM, Janne Blomqvist
 wrote:
> On Tue, Oct 15, 2013 at 4:58 PM, Igor Zamyatin  wrote:
>> Hi All!
>>
>> Is there any particular reason that matmul* modules from libgfortran
>> are compiled with -O2 -ftree-vectorize?
>
> Yes, testing showed that it improved performance compared to the
> default options. See the thread starting at
>
> http://gcc.gnu.org/ml/fortran/2005-11/msg00366.html
>
> In the almost 8 years (!!) since the patch was merged, I believe the
> importance of vectorization for utilizing current processors has only
> increased.
>
> [snip]
>
>> Why not just use O3 for those modules?
>
> Back when the change was made, -ftree-vectorize wasn't enabled by -O3.
> IIRC I did some tests, and -O3 didn't really improve things beyond
> what "-O2 -funroll-loops -ftree-vectorize" already did. That was a
> while ago however, so if somebody (*wink*) would care to redo the
> benchmarks things might look different with today's GCC on today's
> hardware.
>
> Hope this helps,
>
> --
> Janne Blomqvist


Re: Compilation flags in libgfortran

2013-10-16 Thread Igor Zamyatin
Yeah, this is exactly my point. The Atom case just seems to have triggered that fact.

On Wed, Oct 16, 2013 at 2:22 PM, Kyrill Tkachov  wrote:
> On 16/10/13 10:37, pins...@gmail.com wrote:
>>>
>>> On Oct 15, 2013, at 6:58 AM, Igor Zamyatin  wrote:
>>> Hi All!
>>>
>>> Is there any particular reason that matmul* modules from libgfortran
>>> are compiled with -O2 -ftree-vectorize?
>>>
>>> I see some regressions on Atom processor after r202980
>>> (http://gcc.gnu.org/ml/gcc-cvs/2013-09/msg00846.html)
>>>
>>> Why not just use O3 for those modules?
>>
>> -O3 and -O2 -ftree-vectorize won't give much performance difference.  What
>> you are seeing is the cost model needs improvement; at least for atom.
>
> Hi all,
> I think http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01908.html introduced
> the new "cheap" vectoriser cost model that favors compilation time over
> runtime performance and is set as default for -O2. -O3 uses the "dynamic"
> model which potentially gives better runtime performance in exchange for
> longer compile times (if I understand the new rules correctly).
> Therefore, I'd expect -O3 to give a better vector performance than -O2...
>
> Kyrill
>
>


Re: Compilation flags in libgfortran

2013-10-19 Thread Igor Zamyatin
Yeah, I can try to do the benchmarking with that option set instead of -O3.

Thanks,
Igor

On Thu, Oct 17, 2013 at 2:19 PM, Richard Biener
 wrote:
> On Wed, Oct 16, 2013 at 12:22 PM, Kyrill Tkachov  
> wrote:
>> On 16/10/13 10:37, pins...@gmail.com wrote:
>>>>
>>>> On Oct 15, 2013, at 6:58 AM, Igor Zamyatin  wrote:
>>>> Hi All!
>>>>
>>>> Is there any particular reason that matmul* modules from libgfortran
>>>> are compiled with -O2 -ftree-vectorize?
>>>>
>>>> I see some regressions on Atom processor after r202980
>>>> (http://gcc.gnu.org/ml/gcc-cvs/2013-09/msg00846.html)
>>>>
>>>> Why not just use O3 for those modules?
>>>
>>> -O3 and -O2 -ftree-vectorize won't give much performance difference.  What
>>> you are seeing is the cost model needs improvement; at least for atom.
>>
>> Hi all,
>> I think http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01908.html introduced
>> the new "cheap" vectoriser cost model that favors compilation time over
>> runtime performance and is set as default for -O2. -O3 uses the "dynamic"
>> model which potentially gives better runtime performance in exchange for
>> longer compile times (if I understand the new rules correctly).
>> Therefore, I'd expect -O3 to give a better vector performance than -O2...
>
> But this suggests to compile with -O2 -ftree-vectorize
> -fvect-cost-model=dynamic, not building with -O3.
>
> Richard.
>
>> Kyrill
>>
>>


Re: GNU C extension: Function Error vs. Success

2014-03-10 Thread Igor Pashev

On 10.03.2014 18:27, Shahbaz Youssefi wrote:

FILE *fin = fopen("filename", "r") !! goto exit_no_file;

Or maybe permission denied? ;-)
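
For comparison, the standard-C spelling of the same check; as the reply above
hints, fopen can fail for reasons other than a missing file, and errno is what
distinguishes them (a sketch only, with a made-up helper name):

#include <stdio.h>
#include <errno.h>
#include <string.h>

FILE *open_or_report(const char *path)
{
    FILE *fin = fopen(path, "r");
    if (!fin)
        fprintf(stderr, "%s: %s\n", path, strerror(errno)); /* ENOENT, EACCES, ... */
    return fin;
}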


Tree loop if conversion at O2

2012-09-20 Thread Igor Zamyatin
Hi All!

Is there any particular reason why tree loop if-conversion (tree-if-conv.c)
isn't enabled by default at -O2? As far as I can see, this is true for all
platforms.

Thanks,
Igor


Bug repositories

2013-01-28 Thread Igor Kovacevic
Hello,
I'm a master's student writing my thesis on bug triaging in open-source
projects, and I am wondering whether I can access a large part of the bug
repository and, if so, how to do it.
Should I write a crawler/parser for Bugzilla, or use something else?
I need 5 to 8 years of development history.

Thanks a lot,

Igor K.


Request about adding a new micro support

2009-05-10 Thread Alessio Igor Bogani
Hi All,

Sorry for my bad English.
How can I add support to GCC for an 8-bit microcontroller (Harvard architecture)?
An RTFM link would be really appreciated. :-)

Thanks!

Ciao,
Alessio


lvx versus lxvd2x on power8

2017-04-10 Thread Igor Henrique Soares Nunes
Hi all,

I recently read this old discussion about when and why to use lxvd2x instead
of lvsl/lvx/vperm/lvx to load elements from memory into a vector:
https://gcc.gnu.org/ml/gcc/2015-03/msg00135.html

I had the same doubt and was also concerned about how the two approaches
perform. So I created the following project to check which one is faster and
how memory alignment influences the results:

https://github.com/PPC64/load_vec_cmp

This is simple code in which many loads (using both approaches) are executed
in a loop in order to measure which implementation is slower. The project also
takes alignment into account.

As can be seen in this plot
(https://raw.githubusercontent.com/igorsnunes/load_vec_cmp/master/doc/LoadVecCompare.png),
an unaligned load using lxvd2x takes more time.

The previous discussion (as far as I could see) concluded that lxvd2x performs
better than lvsl/lvx/vperm/lvx in all cases. Is that correct? Is my analysis
wrong?

This issue concerns me, since lxvd2x is heavily used in compiled code.

Regards,

Igor
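
For readers comparing the two idioms at the source level, a minimal sketch of
how they are usually spelled from C (PowerPC with VSX, built with -maltivec
-mvsx; the function names are ours): vec_ld maps to lvx, which ignores the low
four address bits and therefore needs the lvsl/vperm fixup for unaligned data,
while vec_vsx_ld maps to an lxvd2x-style load (plus a permute on little-endian)
that accepts unaligned addresses directly.

#include <altivec.h>

/* lvx-style load: only correct on its own for 16-byte-aligned p. */
vector signed int load_lvx(const signed int *p)
{
    return vec_ld(0, p);
}

/* lxvd2x-style load: handles unaligned p, the case measured in the plot. */
vector signed int load_vsx(const signed int *p)
{
    return vec_vsx_ld(0, p);
}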


GSoC 2023

2023-03-27 Thread Igor Putovny via Gcc
Dear all,

I am a student of computer science and I was thinking about applying for
Google Summer of Code 2023. Naturally, I wanted to reach out to you before
applying for GCC projects.

From the selected topics you are interested in, several grabbed my attention:
1. Bypass assembler when generating LTO object file
2. Extend the static analysis pass
3. Rust Front-End: HIR Dump
4. Rust Front-End: Improving user errors

I have to admit that I feel a bit intimidated by projects of "hard
difficulty", because I have seen how hard it is to find your way in a large
codebase (which GCC definitely is).

Therefore, I would like to ask you for your opinion about these topics and
the level of theoretical/practical experience with compilers you are
expecting.

As for the languages used, I have advanced knowledge of C and intermediate
knowledge of C++.

Thank you very much for your time.

Best regards,
Igor Putovný


GSoC 2023

2023-03-27 Thread Igor Putovny via Gcc
 Dear all,

I am an undergraduate student of computer science and I am interested in
GCC projects for Google Summer of Code 2023.

From the selected topics you are interested in, several grabbed my attention:
1. Bypass assembler when generating LTO object file
2. Rust Front-End: HIR Dump
3. Rust Front-End: Improving user errors

May I ask you for more information about these projects and other knowledge
or skills you are expecting?

Thank you very much for your time.

Best regards,
Igor Putovný