On Tue, 20 Oct 2015, Richard Biener wrote:
On Tue, Oct 20, 2015 at 8:46 AM, Hurugalawadi, Naveen
wrote:
Hi,
+/* Fold X + (X / CST) * -CST to X % CST. */
This one is still wrong
Removed.
I don't understand the point of the FLOAT_TYPE_P check.
The check was there in fold-const. So, just h
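For reference, the arithmetic behind the quoted comment can be checked in plain C. This is only a sketch of the identity for small operands under C99 truncating division; it says nothing about why the reviewed pattern was judged wrong, and the helper names are hypothetical, not part of GCC.

```c
#include <stdbool.h>

/* Hypothetical helper: checks X + (X / CST) * -CST == X % CST for one pair.
   The caller must keep cst nonzero and the operands small enough that
   neither -cst nor the product overflows.  */
static bool fold_identity_holds(int x, int cst)
{
    return x + (x / cst) * -cst == x % cst;
}

/* Sweeps all pairs in [-limit, limit], skipping cst == 0.  */
static bool fold_identity_holds_in_range(int limit)
{
    for (int x = -limit; x <= limit; x++)
        for (int cst = -limit; cst <= limit; cst++)
            if (cst != 0 && !fold_identity_holds(x, cst))
                return false;
    return true;
}
```

The sweep deliberately stays away from extreme values such as INT_MIN, where negating the constant would overflow.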
On Wed, 21 Oct 2015, Bernd Schmidt wrote:
> On 10/20/2015 08:34 PM, Alexander Monakov wrote:
> > (This patch serves as a straw man proposal to have something concrete for
> > discussion and further patches)
> >
> > On PTX, stack memory is private to each thread. When master thread
> > construct
On 10/20/2015 01:02 AM, Nikolai Bozhenov wrote:
On 10/15/2015 09:42 PM, Trevor Saunders wrote:
Sorry, a little late to the party.. but why is print_insn even in
rtl.h?
it seems that sched-vis.c is the only thing that uses it...
Andrew
I'm going to use it in the scheduler...
but then wouldn't
On 10/20/2015 04:00 PM, Joseph Myers wrote:
On Tue, 20 Oct 2015, Jeff Law wrote:
2015-10-20 Eric Botcazou
* fold-const.c (tree_binary_nonnegative_warnv_p) :
Recurse on operand #1 instead of operand #0.
: Do not recurse.
: Likewise.
Isn't this a function of t
On Wed, 21 Oct 2015, Bernd Schmidt wrote:
> On 10/20/2015 08:34 PM, Alexander Monakov wrote:
> > The NVPTX backend emits each function either as .func (callable only from
> > the
> > device code) or as .kernel (entry point for a parallel region). OpenMP
> > lowering adds "omp target entrypoint
I hate conditionally compiled code :(
Many ports in config-list.mk are currently failing to build with -Werror
because a recent varasm.c change introduces a parameter that is
conditionally unused.
One day we won't have these kinds of problems, or at least we'll have
a lot fewer of them.
I'
Hi,
As analyzed in PR67921, I think the issue is caused by fold_binary_loc which
folds:
4 - (sizetype) &c - (sizetype) ((int *) p1_8(D) + ((sizetype) a_23 * 24 +
4))
into below form:
((sizetype) -((int *) p1_8(D) + ((sizetype) a_23 * 24 + 4)) - (sizetype)
&c) + 4
Look the minus sizetype expres
On 10/20/2015 06:03 PM, Hans-Peter Nilsson wrote:
On Mon, 19 Oct 2015, Jeff Law wrote:
If I hack up GCC's old jump threader to avoid threading across backedges and
instead let the FSM threader handle that case, then we end up with cases where
the FSM threader creates irreducible loops with margi
Hi,
here is the updated patch that applies the changes suggested by Richard. I apologize
for the delay: testing failed several times on gcc10.fsffrance.org for me
with out-of-memory errors (which are unrelated), and I was traveling.
Bootstrapped/regtested x86_64-linux, OK?
* tree.c (verif
Hi,
>> use if (wi::bit_and (@2, @1) == 0)
Done.
>> and instead of the 2nd group
>> place a :c on the minus of the one not matching INTEGER_CSTs.
Done.
Just curious to know whether ":c" acts as a commutative operation in the input as
well as the output in this case?
Regression tested without any extra f
On Fri, Oct 9, 2015 at 8:04 PM, Bernd Schmidt wrote:
> On 10/09/2015 02:00 PM, Bin.Cheng wrote:
>>
>> I further bootstrapped and tested the attached patch on aarch64. Also three
>> cases in spec2k6/fp are improved by 3~6%, two cases in spec2k6/fp are
>> regressed by ~2%. Overall score is improved by ~0.8
Attached is a hopefully near-ready-for-commit version of the SH/FDPIC
patch. I believe I've addressed all comments by Oleg and Kaz on the
previous versions of the patch. I'm still working on drafting the
Changelog entry (there's a lot to go in it, and I might very well be
going into more detail tha
On Tue, Oct 20, 2015 at 4:37 PM, Bernd Schmidt wrote:
> On 10/15/2015 12:37 PM, H.J. Lu wrote:
>>
>> On Thu, Oct 15, 2015 at 1:44 AM, Richard Biener
>> wrote:
>>>
>>> On Wed, Oct 14, 2015 at 6:21 PM, H.J. Lu wrote:
By default, there is no visibility on builtin functions. When there is
We have code to suppress NSDMI when another member of the anonymous
union is explicitly initialized, but that wasn't handling the case where
an entire anonymous union is copied in a defaulted copy or move constructor.
Tested x86_64-pc-linux-gnu, applying to trunk.
commit b95998c461b67461f23d5b
On 10/20/2015 11:51 PM, Alexander Monakov wrote:
On Tue, 20 Oct 2015, Bernd Schmidt wrote:
My experience has been that there is practically no way of using bar.sync
reliably, since we can't control warp divergence and reconvergence at the
ptx level but the hardware bar.sync instruction only wor
On 10/21/2015 02:03 AM, Hans-Peter Nilsson wrote:
If you're using Thunderbird, you can quote non-inline patches by
selecting the parts you want to quote before replying.
Bernd
On 10/20/2015 08:34 PM, Alexander Monakov wrote:
(This patch serves as a straw man proposal to have something concrete for
discussion and further patches)
On PTX, stack memory is private to each thread. When the master thread constructs
'omp_data_o' on its own stack and passes it to other threads v
On Mon, 19 Oct 2015, Jeff Law wrote:
> If I hack up GCC's old jump threader to avoid threading across backedges and
> instead let the FSM threader handle that case, then we end up with cases where
> the FSM threader creates irreducible loops with marginal benefit.
>
> This can be seen in ssa-dom-th
Attached is a slightly updated patch that tweaks the diagnostic
messages to avoid assuming the English punctuation, and adds
a few test cases exercising the text of the diagnostics.
Martin
On 10/13/2015 11:22 AM, Martin Sebor wrote:
C++ placement new expression is susceptible to buffer overflow
On 10/20/2015 08:34 PM, Alexander Monakov wrote:
(note to reviewers: I'm not sure what we're after here, on the high level;
will be happy to rework the patch in a saner manner based on feedback, or even
drop it for now)
At the moment the attribute setting logic in omp-low.c is such that if a
fun
On 10/20/2015 08:34 PM, Alexander Monakov wrote:
The NVPTX backend emits each function either as .func (callable only from the
device code) or as .kernel (entry point for a parallel region). OpenMP
lowering adds "omp target entrypoint" attribute to functions outlined from
target regions. Unlik
On Tue, Oct 20, 2015 at 03:40:14PM -0500, David Edelsohn wrote:
> On Tue, Oct 20, 2015 at 3:38 PM, Szabolcs Nagy wrote:
> > 2015-10-20 Gregor Richards
> > Szabolcs Nagy
> >
> > * config/rs6000/secureplt.h (LINK_SECURE_PLT_DEFAULT_SPEC): Define.
> > * config/rs6000/
On 10/15/2015 12:37 PM, H.J. Lu wrote:
On Thu, Oct 15, 2015 at 1:44 AM, Richard Biener
wrote:
On Wed, Oct 14, 2015 at 6:21 PM, H.J. Lu wrote:
By default, there is no visibility on builtin functions. When there is
explicitly declared visibility on the C library function which a builtin
functi
On 10/16/2015 10:21 PM, Steve Ellcey wrote:
Here is the second part of the MIPS frame header optimization patch.
I'll leave reviewing the functionality to the MIPS maintainers. But...
+ return TARGET_OLDABI && flag_frame_header_optimization && (optimize > 0);
+ if ((fn != NULL)
On Wed, 21 Oct 2015, Eric Botcazou wrote:
> > Isn't this a function of the language and in some cases isn't it
> > implementation defined (true for C/C++ until C++11)?
>
> I don't think that C/C++ use FLOOR_MOD_EXPR, only Ada does AFAIK. In any
> case, I don't see how this can be implementation
> Isn't this a function of the language and in some cases isn't it
> implementation defined (true for C/C++ until C++11)?
I don't think that C/C++ use FLOOR_MOD_EXPR, only Ada does AFAIK. In any
case, I don't see how this can be implementation-defined given:
/* Division for integer result that
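To make the sign question concrete: with truncating division the remainder takes the sign of the dividend, while a floor-style modulo takes the sign of the divisor, which seems to be why the ChangeLog above looks at operand #1 rather than operand #0. A minimal C sketch, with hypothetical helper names (this is not GCC's implementation of FLOOR_MOD_EXPR):

```c
/* C99 '%' truncates toward zero, so a nonzero result has the sign
   of the dividend a.  */
static int trunc_mod(int a, int b)
{
    return a % b;
}

/* Floor-style modulo: a nonzero result has the sign of the divisor b.
   Assumes b != 0 and that a % b itself does not overflow.  */
static int floor_mod(int a, int b)
{
    int r = a % b;
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}
```

So for a floor-style modulo, knowing that the dividend is nonnegative tells you nothing about the result; only the divisor does.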
On Tue, 20 Oct 2015, Bernd Schmidt wrote:
> Consider
>
> struct t { int a; int b; };
> struct A { struct t v[2]; } a;
>
> So I think we've established that
> &a.v[2]
> is valid, giving a pointer just past the end of the structure. How about
> &a.v[2].a
> and
> &a.v[2].b
> The first of thes
On Tue, 20 Oct 2015, Jeff Law wrote:
> > 2015-10-20 Eric Botcazou
> >
> > * fold-const.c (tree_binary_nonnegative_warnv_p) :
> > Recurse on operand #1 instead of operand #0.
> > : Do not recurse.
> > : Likewise.
> Isn't this a function of the language and in some cases isn't it
On Tue, 20 Oct 2015, Bernd Schmidt wrote:
> On 10/20/2015 08:34 PM, Alexander Monakov wrote:
> > On NVPTX, there are 16 hardware barriers for each thread team, and each barrier has
> > a variable waiter count. The instruction 'bar.sync N, M;' waits on
> > barrier number N until M threads h
On 10/20/2015 11:41 PM, Cesar Philippidis wrote:
Was it this one that you're referring to Bernd? I think this is the
patch that introduces the "oacc ganglocal" attribute. It has bitrotted
significantly, though.
Yeah, the bits in nvptx.c are the ones I was referring to. Thanks!
What are you planni
On 10/20/2015 11:36 PM, Alexander Monakov wrote:
Thanks, NVPTX will need a low buf_fixed size, perhaps 64 bytes or so.
What about the generic case, should it use a more generous threshold,
or revert to existing unbounded alloca?
Any idea how big the required allocation size is in practice?
On 10/20/2015 02:13 PM, Bernd Schmidt wrote:
> On 10/20/2015 11:04 PM, Alexander Monakov wrote:
>> On Tue, 20 Oct 2015, Bernd Schmidt wrote:
>>
>>> On 10/20/2015 08:34 PM, Alexander Monakov wrote:
This allows emitting decls in 'shared' memory from the middle-end.
* config/nvptx/nvpt
On Tue, 20 Oct 2015, Bernd Schmidt wrote:
> On 10/20/2015 08:34 PM, Alexander Monakov wrote:
> > NVPTX does not support alloca or variable-length stack allocations, thus
> > heap allocation needs to be used instead. I've opted to make this a generic
> > change instead of guarding it with an #if
On 10/13/2015 09:59 AM, Ilya Enkovich wrote:
Looking into that, I got the impression that vector modes are used by C/C++
vector extensions only. And I think regression testing would reveal some
failures otherwise.
Maybe this stuff hasn't bled into the Fortran front-end, but the gfortran
front-end ce
---
gcc/config/i386/i386.c | 21 +
gcc/doc/tm.texi        |  7 +++
gcc/doc/tm.texi.in     |  2 ++
gcc/dwarf2out.c        | 48 +---
gcc/target.def         | 10 ++
gcc/targhooks.c        |  8
gcc/targhooks.h
---
gcc/testsuite/gcc.target/i386/addr-space-3.c | 10 ++
1 file changed, 10 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/i386/addr-space-3.c
diff --git a/gcc/testsuite/gcc.target/i386/addr-space-3.c
b/gcc/testsuite/gcc.target/i386/addr-space-3.c
new file mode 100644
index
---
gcc/doc/extend.texi | 46 --
1 file changed, 44 insertions(+), 2 deletions(-)
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index bdbf513..677a4d4 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1240,8 +1240,8 @@ As an extens
---
gcc/cselib.c | 22 +-
gcc/fold-const.c | 14 +++---
gcc/testsuite/gcc.target/i386/addr-space-2.c | 11 +++
3 files changed, 31 insertions(+), 16 deletions(-)
create mode 100644 gcc/testsuite/gcc.ta
While cmps and movs allow a segment override of the ds:esi
source, the es:edi source/destination cannot be overridden.
Simplify things in the backend for now by disallowing
segments for string insns entirely.
---
gcc/config/i386/i386-protos.h |  1 +
gcc/config/i386/i386.c        | 61 +
---
gcc/tree-ssa-address.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
index 042f9c9..bd10ae7 100644
--- a/gcc/tree-ssa-address.c
+++ b/gcc/tree-ssa-address.c
@@ -388,7 +388,7 @@ create_mem_ref_raw (tree type, tree alias_ptr_t
---
gcc/config/i386/i386.c | 10 ++
gcc/doc/tm.texi        |  5 +
gcc/doc/tm.texi.in     |  2 ++
gcc/fold-const.c       |  6 +-
gcc/gimple.c           | 12 +---
gcc/target.def         |  9 +
gcc/targhooks.c        |  9 +
gcc/targhooks.h        |  1 +
---
gcc/config/i386/i386-c.c      |  2 ++
gcc/config/i386/i386-protos.h |  1 +
gcc/config/i386/i386.c        | 39 ++-
3 files changed, 41 insertions(+), 1 deletion(-)
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 3f28f6c..e3a3012 100
Signed-off-by: Richard Henderson
---
gcc/config/i386/i386-protos.h |  3 +--
gcc/config/i386/i386.c        | 12 ++--
gcc/config/i386/i386.h        |  3 ++-
gcc/config/i386/predicates.md |  8
gcc/config/i386/rdos.h        |  2 +-
5 files changed, 14 insertions(+), 14 deletions
---
gcc/config/i386/i386-c.c                     |   6 +
gcc/config/i386/i386-protos.h                |   3 +
gcc/config/i386/i386.c                       | 176 +--
gcc/testsuite/gcc.target/i386/addr-space-1.c |  11 ++
4 files changed, 131 insertions(+), 65 deletions(-)
The current default of setting all undefined conversions
to null is not useful. It has caused all users to lie
and say that spaces are subsets when they are not, just so
that they can override the conversion.
---
gcc/expr.c | 30 ++
1 file changed, 18 insertion
---
gcc/config/i386/i386.md | 32 ++--
1 file changed, 26 insertions(+), 6 deletions(-)
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index d0c0d23..ccb672d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2595,9 +2595,19 @@
[(
If all address spaces use the same modes and forms, we would
be forced to replicate these hooks in the backend, which would
then require the creation of a new hook to replace
target_default_pointer_address_modes_p.
---
gcc/targhooks.c | 39 ---
1 file changed,
Changes since v1:
* Documentation for __seg_* and __SEG_*.
* Add a couple more test cases, suggested by comments on v1.
* Fix operands_equal_p and cselib wrt address spaces.
* Emit DW_AT_segment for x86 segments.
I think this includes all of the feedback from v1, except wrt
PR66768, which
On 10/20/2015 11:13 PM, Alexander Monakov wrote:
On Tue, 20 Oct 2015, Bernd Schmidt wrote:
On 10/20/2015 08:34 PM, Alexander Monakov wrote:
2. Make gomp_nvptx_main a device (.func) function. To have that work, we'd
need to additionally emit a "trampoline" of sorts in the NVPTX backend. For
On Tue, 20 Oct 2015, Bernd Schmidt wrote:
> On 10/20/2015 08:34 PM, Alexander Monakov wrote:
> > 2. Make gomp_nvptx_main a device (.func) function. To have that work, we'd
> > need to additionally emit a "trampoline" of sorts in the NVPTX backend. For
> > each OpenMP target entrypoint foo$_omp_
On 10/20/2015 11:04 PM, Alexander Monakov wrote:
On Tue, 20 Oct 2015, Bernd Schmidt wrote:
On 10/20/2015 08:34 PM, Alexander Monakov wrote:
This allows emitting decls in 'shared' memory from the middle-end.
* config/nvptx/nvptx.c (nvptx_legitimate_address_p): Adjust prototype.
(nvp
I discovered these two tests try and set an unreasonable vector_length. They're
attempting to check reduction behaviour, so I've applied this patch to trunk to
reduce the vector length.
I already fixed them on the gomp4 branch, and additional functionality there
will give a diagnostic on such
On 10/20/2015 08:34 PM, Alexander Monakov wrote:
The approach I've taken in libgomp/nvptx is to have a single entry point,
gomp_nvptx_main, that can take care of initial allocation, transferring
control to the target region function, and finalization.
At the moment it has the prototype:
void gomp_nv
On Tue, 20 Oct 2015, Bernd Schmidt wrote:
> On 10/20/2015 08:34 PM, Alexander Monakov wrote:
> > This allows emitting decls in 'shared' memory from the middle-end.
> >
> > * config/nvptx/nvptx.c (nvptx_legitimate_address_p): Adjust prototype.
> > (nvptx_section_for_decl): If type of dec
On Tue, 20 Oct 2015, Bernd Schmidt wrote:
> On 10/20/2015 08:34 PM, Alexander Monakov wrote:
> > Due to special treatment of types, emitting variables of type _Bool in
> > global scope is impossible: extern references are emitted with .u8, but
> > definitions use .u64. This patch fixes the issu
On 10/20/2015 08:34 PM, Alexander Monakov wrote:
On NVPTX, there are 16 hardware barriers for each thread team, and each barrier has
a variable waiter count. The instruction 'bar.sync N, M;' waits on
barrier number N until M threads have arrived. M should be pre-multiplied by
warp width. It
On 10/20/2015 08:34 PM, Alexander Monakov wrote:
This allows emitting decls in 'shared' memory from the middle-end.
* config/nvptx/nvptx.c (nvptx_legitimate_address_p): Adjust prototype.
(nvptx_section_for_decl): If type of decl has a specific address
space, return it.
On 10/20/2015 08:34 PM, Alexander Monakov wrote:
Due to special treatment of types, emitting variables of type _Bool in
global scope is impossible: extern references are emitted with .u8, but
definitions use .u64. This patch fixes the issue by treating the boolean type as
an integer type.
* c
On 10/20/2015 08:34 PM, Alexander Monakov wrote:
NVPTX does not support alloca or variable-length stack allocations, thus
heap allocation needs to be used instead. I've opted to make this a generic
change instead of guarding it with an #ifdef: libgomp usually leaves thread
stack size up to libc,
This patch by Chris Manghane changes the Go frontend to not check for
an invalid constant when lowering a binary expression. This fixes
https://golang.org/issue/12615 . Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu. Committed to mainline.
Ian
Index: gcc/go/gofrontend/MERGE
=
On Tue, Oct 20, 2015 at 3:38 PM, Szabolcs Nagy wrote:
> On 20/10/15 06:16, Alan Modra wrote:
>>
>> On Mon, Oct 19, 2015 at 08:10:32PM +0100, Szabolcs Nagy wrote:
>>>
>>> On 19/10/15 14:04, Szabolcs Nagy wrote:
On 19/10/15 12:12, Alan Modra wrote:
>
> On Thu, Oct 15, 2015 at 06:50
On 20/10/15 06:16, Alan Modra wrote:
On Mon, Oct 19, 2015 at 08:10:32PM +0100, Szabolcs Nagy wrote:
On 19/10/15 14:04, Szabolcs Nagy wrote:
On 19/10/15 12:12, Alan Modra wrote:
On Thu, Oct 15, 2015 at 06:50:50PM +0100, Szabolcs Nagy wrote:
A powerpc toolchain built with (or without) --enable-
Do you refer to this comment?
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66790#c20)
I do have to say that I am still uncomfortable with changing RRE to
use a MUST problem rather than a MAY problem. I see this as dumbing
down the compiler to provide the semantics of uninitialized variables
an
On 10/20/2015 06:53 PM, Joseph Myers wrote:
On Tue, 20 Oct 2015, Martin Sebor wrote:
I think -Warray-bounds should emit consistent diagnostics for invalid
array references regardless of the contexts. I.e., given
struct S {
int A [5][7];
int x;
} s;
these should bot
On 10/20/2015 10:17 AM, Marek Polacek wrote:
The function is_cilkplus_vector_p is defined both in c-parser.c and parser.c
and is exactly the same so it seems that it should rather be defined in the
common Cilk+ code (even though for such a small static inline fn it probably
doesn't matter much).
On 10/20/15 16:20, Ilya Verbin wrote:
On Tue, Oct 20, 2015 at 15:54:45 -0400, Nathan Sidwell wrote:
There might be a situation when some func or var is lost during regular LTO,
even if flag_openacc is present. In this case "missing OpenACC ..." message
would be wrong. And if flag_openacc is
On Tue, Oct 20, 2015 at 15:54:45 -0400, Nathan Sidwell wrote:
> @@ -1209,16 +1209,11 @@ input_overwrite_node (struct lto_file_de
>
>if (!success)
> {
> - if (flag_openacc)
> - {
> - if (TREE_CODE (node->decl) == FUNCTION_DECL)
> - error ("Missing routine function %
On 07/09/15 12:53, Kugan wrote:
>
> This is a new version of the patch posted in
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
> more testing and split the patch to make it easier to review.
> There are still a couple of issues to be addressed and I am working on them
Another small cleanup I noticed. We can use %qD to print a decl name.
Applied to gomp4 branch.
nathan
2015-10-20 Nathan Sidwell
* lto-cgraph.c (input_overwrite_node): Cleanup openacc diagnostic
emission.
Index: gcc/lto-cgraph.c
On 10/20/2015 01:16 PM, Pierre-Marie de Rodat wrote:
On 10/20/2015 12:01 PM, Jeff Law wrote:
* The patch series for transition to standard DWARF for Ada
(https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01857.html). There are 8
patches, each one depending on the previous one, except the 6/8 one
On 10/20/2015 12:01 PM, Jeff Law wrote:
* The patch series for transition to standard DWARF for Ada
(https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01857.html). There are 8
patches, each one depending on the previous one, except the 6/8 one
(https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01361.h
I made this change on the delayed folding branch and then noticed that
it broke pointer-arith-10.c, which you added to the testsuite. The
patch changes the -original dump from
return (char *) ((sizetype) p + (sizetype) i);
to
return (char *) i + (sizetype) p;
It's not clear to me why th
On 10/20/2015 10:57 AM, Joseph Myers wrote:
On Tue, 20 Oct 2015, Martin Sebor wrote:
An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.5.6).
Just-past-the-
On 10/20/2015 10:33 AM, Eric Botcazou wrote:
Hi,
this test started to fail recently as the result of the work of Richard S.,
but the underlying issue had been latent for a long time. It boils down to
this excerpt from the VRP1 dump file:
Found new range for _9: [0, 12]
marking stmt to be not s
On NVPTX, there are 16 hardware barriers for each thread team, and each barrier has
a variable waiter count. The instruction 'bar.sync N, M;' waits on
barrier number N until M threads have arrived. M should be pre-multiplied by
warp width. It's also possible to 'post' the barrier without susp
On NVPTX, we don't need most of target.c functionality, except for GOMP_teams.
Provide it as a copy of the generic implementation for now (it most likely
will need to change down the line: on NVPTX we do need to spawn several
thread blocks for #pragma omp teams).
Alternatively, it might make sense
Note: this patch will have to be more complex if we go with 'approach 2'
described in a later patch, 07/14 "launch target functions via gomp_nvptx_main".
For OpenMP offloading, libgomp invokes 'gomp_nvptx_main' as the accelerator
kernel, passing it a pointer to outlined target region function. Th
This patch ports team.c to nvptx by arranging an initialization/cleanup
routine, gomp_nvptx_main, that all (pre-started) threads can run. It
initializes a thread pool and proceeds to run gomp_thread_start in all threads
except thread zero, which runs original target region function.
Thread-privat
(This patch serves as a straw man proposal to have something concrete for
discussion and further patches)
On PTX, stack memory is private to each thread. When the master thread constructs
'omp_data_o' on its own stack and passes it to other threads via
GOMP_parallel by reference, other threads cannot
NVPTX provides vprintf, but there's no stream separation: everything is
printed as if into stdout. This is the minimal change to get error.c working.
* error.c [__nvptx__]: Replace vfprintf, fputs, fputc with [v]printf.
---
libgomp/error.c | 5 +
1 file changed, 5 insertions(+)
diff
The approach I've taken in libgomp/nvptx is to have a single entry point,
gomp_nvptx_main, that can take care of initial allocation, transferring
control to the target region function, and finalization.
At the moment it has the prototype:
void gomp_nvptx_main(void (*fn)(void*), void *fndata);
but it'
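The single-entry-point shape described in that message (set up, call the target region function, tear down) can be sketched in host-side C as follows. Everything here except the (fn, fndata) signature is an assumption made for illustration: the names sketch_entry_main, sketch_runtime and sketch_finalized are hypothetical, and the real gomp_nvptx_main does far more.

```c
#include <stdlib.h>

/* Hypothetical stand-in for whatever per-launch state the runtime needs.  */
struct sketch_runtime {
    int initialized;
};

/* Counts completed teardown steps so a caller can observe finalization.  */
static int sketch_finalized;

/* Same signature as the quoted prototype: allocate state, transfer
   control to the region function, then finalize.  */
static void sketch_entry_main(void (*fn)(void *), void *fndata)
{
    struct sketch_runtime *rt = malloc(sizeof *rt);
    if (rt == NULL)
        return;                 /* initial allocation failed */
    rt->initialized = 1;

    fn(fndata);                 /* run the outlined target region */

    free(rt);                   /* finalization */
    sketch_finalized++;
}

/* Example region function: increments the int that fndata points to.  */
static void sketch_region(void *data)
{
    *(int *) data += 1;
}
```

Calling sketch_entry_main(sketch_region, &counter) runs the region exactly once between setup and teardown.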
(note to reviewers: I'm not sure what we're after here, on the high level;
will be happy to rework the patch in a saner manner based on feedback, or even
drop it for now)
At the moment the attribute setting logic in omp-low.c is such that if a
function that should be present in target code does no
This provides minimal implementations of gomp_dynamic_max_threads and
omp_get_num_procs.
* config/nvptx/proc.c: New.
---
libgomp/config/nvptx/proc.c | 40
1 file changed, 40 insertions(+)
diff --git a/libgomp/config/nvptx/proc.c b/libgomp/config/n
This patch removes 0-size libgomp stubs where generic implementations can be
compiled for the NVPTX target.
It also removes non-stub critical.c, which contains assembly implementations
for GOMP_atomic_{start,end}, but does not contain implementations for
GOMP_critical_*. My understanding is that
NVPTX does not support alloca or variable-length stack allocations, thus
heap allocation needs to be used instead. I've opted to make this a generic
change instead of guarding it with an #ifdef: libgomp usually leaves thread
stack size up to libc, so avoiding unbounded stack allocation makes sense
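One common way to avoid unbounded alloca is a small fixed-size stack buffer with a heap fallback for larger requests. The sketch below illustrates that general technique only; it is not the code from the patch. BUF_FIXED and sum_with_scratch are hypothetical names, and the 64-byte value is an assumption taken from the "perhaps 64 bytes or so" remark elsewhere in the thread.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { BUF_FIXED = 64 };        /* assumed threshold, in bytes */

/* Copies n ints into scratch storage and sums them.  The scratch area
   lives on the stack when it fits in BUF_FIXED bytes and on the heap
   otherwise.  Returns 0 on success, -1 if the size is unrepresentable
   or the heap allocation fails.  */
static int sum_with_scratch(const int *src, size_t n, long *out)
{
    int fixed[BUF_FIXED / sizeof (int)];
    int *scratch = fixed;

    if (n > SIZE_MAX / sizeof *scratch)
        return -1;
    if (n > sizeof fixed / sizeof fixed[0]) {
        scratch = malloc(n * sizeof *scratch);
        if (scratch == NULL)
            return -1;
    }

    memcpy(scratch, src, n * sizeof *scratch);

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += scratch[i];

    if (scratch != fixed)
        free(scratch);          /* only the heap path owns memory */
    *out = total;
    return 0;
}
```

Stack use is bounded by BUF_FIXED regardless of n, at the cost of a malloc/free pair on the large path.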
This allows emitting decls in 'shared' memory from the middle-end.
* config/nvptx/nvptx.c (nvptx_legitimate_address_p): Adjust prototype.
(nvptx_section_for_decl): If type of decl has a specific address
space, return it.
(nvptx_addr_space_from_address): Ditto.
Due to special treatment of types, emitting variables of type _Bool in
global scope is impossible: extern references are emitted with .u8, but
definitions use .u64. This patch fixes the issue by treating the boolean type as
an integer type.
* config/nvptx/nvptx.c (init_output_initializer): Also
Hello,
This patch series moves libgomp/nvptx porting further along to get initial
bits of parallel execution working, mostly unbreaking the testsuite. Please
have a look! I'm interested in feedback, and would like to know if it's
suitable to become a part of a branch.
This patch series ports en
The NVPTX backend emits each function either as .func (callable only from the
device code) or as .kernel (entry point for a parallel region). OpenMP
lowering adds "omp target entrypoint" attribute to functions outlined from
target regions. Unlike OpenACC offloading, OpenMP offloading does not in
This patch by Chris Manghane fixes the Go frontend to correctly
diagnose an attempt to shift by a string value. This fixes
https://golang.org/issue/12618 . Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu. Committed to mainline.
Ian
Index: gcc/go/gofrontend/MERGE
==
David,
On 10/20/2015 11:17 AM, David Edelsohn wrote:
Did this revised patch address the comments about MIR from Kenny?
Do you refer to this comment?
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66790#c20)
I do have to say that I am still uncomfortable with changing RRE to
use a MUST problem
On 20 October 2015 at 17:26, Kyrill Tkachov wrote:
> Hi Marcus,
>
> On 20/10/15 17:05, Marcus Shawcroft wrote:
>>
>> On 16 October 2015 at 13:58, Kyrill Tkachov
>> wrote:
>>>
>>> Hi all,
>>>
>>> We already support load/store-pair operations on the D-registers when
>>> they
>>> contain an FP value
In preparing this patch set for trunk, I discovered I'd flubbed the calculations
for default contiguous looping. This fixes the calculation in the target-side
loop transformation code. I also realized that the calculation appropriate for
an accelerator is not the best for the host. For the la
On 20 October 2015 at 17:31, Kyrill Tkachov wrote:
>
> On 20/10/15 17:28, Ramana Radhakrishnan wrote:
>>
>> On Tue, Oct 20, 2015 at 4:26 PM, Marcus Shawcroft
>> wrote:
>>>
>>> On 19 October 2015 at 14:57, Kyrill Tkachov
>>> wrote:
>>>
2015-10-19 Kyrylo Tkachov
* config/aar
On Tue, 20 Oct 2015, Martin Sebor wrote:
> An array subscript is out of range, even if an object is apparently
> accessible with the given subscript (as in the lvalue expression
> a[1][7] given the declaration int a[4][5]) (6.5.6).
Just-past-the-end is only out of range if the dereference i
On 10/20/2015 09:48 AM, Bernd Schmidt wrote:
On 10/20/2015 05:31 PM, Martin Sebor wrote:
On 10/20/2015 07:20 AM, Bernd Schmidt wrote:
On 10/16/2015 09:34 PM, Martin Sebor wrote:
Thank you for the review. Attached is an updated patch that hopefully
addresses all your comments. I ran the check
On Tue, 20 Oct 2015, Martin Sebor wrote:
> I think -Warray-bounds should emit consistent diagnostics for invalid
> array references regardless of the contexts. I.e., given
>
> struct S {
> int A [5][7];
> int x;
> } s;
>
> these should both be diagnosed:
>
> int i =
On Tue, 20 Oct 2015, Marek Polacek wrote:
> Joseph noticed that we were wrongly accepting multiple attributes without
> commas. Thus fixed by breaking out of the loop when parsing the attributes --
> if we don't see a comma after an attribute, then the next tokens must be ));
> if
> not, then th
On 14/10/15 13:30, Wilco Dijkstra wrote:
Enable instruction fusion of dependent AESE; AESMC and AESD; AESIMC pairs. This
can give up to 2x speedup on many AArch64 implementations. Also model the crypto
instructions on Cortex-A57 according to the Optimization Guide.
Passes regression tests.
Hello.
The following patch fixes the HSA kernel-from-kernel dispatching mechanism,
where we forgot to wait in a loop for the dispatched child kernel. Apart from that,
there's a small follow-up which changes the naming scheme for HSA modules.
Martin
From eca686b6495bf7faa32ecb292f94558c6bfdbdce Mon Sep 1