[PATCH 24/25] Ignore LLVM's blank lines.

2018-09-05 Thread ams
The GCN toolchain must use the LLVM assembler and linker because there's no binutils port. The LLVM tools do not have the same diagnostic style as binutils, so the "blank line(s) in output" tests are inappropriate (and very noisy). The LLVM tools also have different command line options, so it's

[PATCH 25/25] Port testsuite to GCN

2018-09-05 Thread ams
This collection of miscellaneous patches configures the testsuite to run on AMD GCN in a standalone (i.e. not offloading) configuration. It assumes you have your DejaGnu set up to run binaries via the gcn-run tool. 2018-09-05 Andrew Stubbs Kwok Cheung Yeung Julian Br

[PATCH 23/25] Testsuite: GCN is always PIE.

2018-09-05 Thread ams
The GCN/HSA loader ignores the load address and uses a random location, so we build all GCN binaries as PIE, by default. This patch makes the necessary testsuite adjustments to make this work correctly. 2018-09-05 Andrew Stubbs gcc/testsuite/ * gcc.dg/graphite/scop-19.c: Chec

[PATCH 22/25] Add dg-require-effective-target exceptions

2018-09-05 Thread ams
There are a number of tests that fail because they assume that exceptions are available, but GCN does not support them, yet. This patch adds "dg-require-effective-target exceptions" in all the affected tests. There's probably an automatic way to test for exceptions, but the current implementatio

[PATCH 18/25] Fix interleaving of Fortran stop messages

2018-09-05 Thread ams
Fortran STOP and ERROR STOP use a different function to print the "STOP" string and the message string. On GCN this results in out-of-order output, such as "ERROR STOP ". This patch fixes the problem by making estr_write use the proper Fortran write, not C printf, so both parts are now output th

[PATCH 20/25] GCN libgcc.

2018-09-05 Thread ams
This patch contains the GCN port of libgcc. I've broken it out just to keep both parts more manageable. We have the usual stuff, plus a "gomp_print" implementation intended to provide a means to output text to console without using the full printf. Originally this was because we did not have a

[PATCH 19/25] GCN libgfortran.

2018-09-05 Thread ams
This patch contains the GCN port of libgfortran. We use the minimal configuration created for NVPTX. That's all that's required, besides the target-independent bug fixes posted already. 2018-09-05 Andrew Stubbs Kwok Cheung Yeung Julian Brown Tom de Vri

[PATCH 17/25] Fix Fortran STOP.

2018-09-05 Thread ams
The minimal libgfortran setup was created for NVPTX, but will also be used by AMD GCN. This patch simply removes an assumption that NVPTX is the only user. Specifically, NVPTX exit is broken, but AMD GCN exit works just fine. 2018-09-05 Andrew Stubbs libgfortran/ * runtime/mi

[PATCH 15/25] Don't double-count early-clobber matches.

2018-09-05 Thread ams
Given a pattern with a number of operands:
  (match_operand 0 "" "=&v")
  (match_operand 1 "" " v0")
  (match_operand 2 "" " v0")
  (match_operand 3 "" " v0")
GCC will currently increment "reject" once, for operand 0, and then decrement it once for each of the other operands, ending with reject == -2 an
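The arithmetic described in this snippet can be modelled in a few lines. This is only a sketch of the counting, not GCC's code: the function names are hypothetical, and the second function is an assumption about what "don't double-count" could mean (the snippet is truncated before it describes the fix).

```python
def reject_current(num_matching_operands):
    """Model of the counting described above (hypothetical helper)."""
    reject = 0
    reject += 1                      # operand 0 is early-clobber ("=&v")
    reject -= num_matching_operands  # one decrement per operand matching "0"
    return reject


def reject_single_count(num_matching_operands):
    """Assumed alternative: count the early-clobber match at most once."""
    reject = 1                       # operand 0 is early-clobber ("=&v")
    if num_matching_operands > 0:
        reject -= 1                  # a single decrement, however many match
    return reject


# Operands 1, 2 and 3 all use the matching constraint " v0".
current = reject_current(3)
balanced = reject_single_count(3)
```

With three matching operands the first model ends at -2, as the message says, while the assumed alternative ends at 0.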

[PATCH 14/25] Disable inefficient vectorization of elementwise loads/stores.

2018-09-05 Thread ams
If the autovectorizer tries to load a GCN 64-lane vector elementwise then it blows away the register file and produces horrible code. This patch simply disallows elementwise loads for such large vectors. Is there a better way to disable this in the middle-end? 2018-09-05 Julian Brown

[PATCH 13/25] Create TARGET_DISABLE_CURRENT_VECTOR_SIZE

2018-09-05 Thread ams
This feature probably ought to be reworked as a proper target hook, but I would like to know if this is the correct solution to the problem first. The problem is that GCN vectors have a fixed number of elements (64) and the vector size varies with element size. E.g. V64QI is 64 bytes and V64SI i
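To make the size relationship concrete, here is a small sketch that computes the byte size of each 64-lane vector mode from its element size. The helper name and the table are illustrative only; the element sizes are the usual ones for the QI, HI, SI and DI integer modes.

```python
LANES = 64  # GCN vectors always have 64 elements

# Element size in bytes for the standard integer machine modes.
ELEMENT_BYTES = {"QI": 1, "HI": 2, "SI": 4, "DI": 8}


def vector_bytes(element_mode):
    """Size in bytes of a 64-lane vector of the given element mode."""
    return LANES * ELEMENT_BYTES[element_mode]


sizes = {f"V64{mode}": vector_bytes(mode) for mode in ELEMENT_BYTES}
# sizes == {"V64QI": 64, "V64HI": 128, "V64SI": 256, "V64DI": 512}
```

So the lane count stays fixed while the vector size ranges from 64 to 512 bytes, which is why a single "current vector size" does not describe the target well.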

[PATCH 16/25] Fix IRA ICE.

2018-09-05 Thread ams
The IRA pass makes an assumption that any pseudos created after the pass begins were created explicitly by the pass itself and therefore will have corresponding entries in its other tables. The GCN back-end, however, often creates additional pseudos, in expand patterns, to represent the necessary

[PATCH 10/25] Convert BImode vectors.

2018-09-05 Thread ams
GCN uses V64BImode to represent vector masks in the middle-end, and DImode bit-masks to represent them in the back-end. These must be converted at expand time and the most convenient way is to simply use a SUBREG. This works fine except that simplify_subreg needs to be able to convert immediates
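As an illustration of the two representations, the sketch below packs 64 boolean lanes into one 64-bit integer and back. The function names are hypothetical, and placing lane 0 in bit 0 is an assumption made for the example, not something stated in the message.

```python
LANES = 64


def pack_mask(lanes):
    """Pack 64 booleans (one per lane) into a 64-bit integer bit-mask."""
    if len(lanes) != LANES:
        raise ValueError("expected exactly 64 lanes")
    mask = 0
    for index, active in enumerate(lanes):
        if active:
            mask |= 1 << index  # assumption: lane 0 maps to bit 0
    return mask


def unpack_mask(mask):
    """Expand a 64-bit integer bit-mask into 64 per-lane booleans."""
    return [bool((mask >> index) & 1) for index in range(LANES)]


# Only the two lowest lanes active.
example = pack_mask([True, True] + [False] * 62)
```

Converting an immediate between the two forms is this same packing applied to a constant, which is the kind of conversion the SUBREG simplification has to handle.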

[PATCH 12/25] Make default_static_chain return NULL in non-static functions

2018-09-05 Thread ams
This patch allows default_static_chain to be called from the back-end without it knowing if the function is static or not. Or, to put it another way, without duplicating the check everywhere it's used. 2018-09-05 Tom de Vries gcc/ * targhooks.c (default_static_chain): Return

[PATCH 11/25] Simplify vec_merge according to the mask.

2018-09-05 Thread ams
This patch was part of the original patch we acquired from Honza and Martin. It simplifies vector elements that are inactive, according to the mask. 2018-09-05 Jan Hubicka Martin Jambor * simplify-rtx.c (simplify_merge_mask): New function. (simplify_ternary_oper

[PATCH 08/25] Fix co-array allocation

2018-09-05 Thread ams
The Fortran front-end has a bug in which it uses "int" values for "size_t" parameters. I don't know why this isn't a problem for all 64-bit architectures, but GCN ends up with the data in the wrong argument register and/or stack slot, and bad things happen. This patch corrects the issue by setting

[PATCH 09/25] Elide repeated RTL elements.

2018-09-05 Thread ams
GCN's 64-lane vectors tend to make RTL dumps very long. This patch makes them far more bearable by eliding long sequences of the same element into "repeated" messages. 2018-09-05 Andrew Stubbs Jan Hubicka Martin Jambor * print-rtl.c (print_rtx_operand_code
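The idea of eliding runs can be sketched with a run-length pass over the printed elements. This is not the print-rtl.c implementation: the function name, the threshold and the exact wording of the "repeated" marker are assumptions for illustration.

```python
from itertools import groupby


def elide_repeats(elements, threshold=2):
    """Collapse runs longer than `threshold` into one element plus a marker."""
    out = []
    for value, run in groupby(elements):
        count = len(list(run))
        out.append(value)
        if count > threshold:
            # Print the element once, then note how many copies were skipped.
            out.append(f"repeated x{count - 1}")
        else:
            # Short runs are cheaper to print in full.
            out.extend([value] * (count - 1))
    return out


# A 64-lane vector constant where every lane is zero.
dump = elide_repeats(["(const_int 0)"] * 64)
# dump == ["(const_int 0)", "repeated x63"]
```

Short runs are left alone so that ordinary small vectors print exactly as before.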

[PATCH 06/25] Remove constant vec_select restriction.

2018-09-05 Thread ams
The vec_select operator is documented to require a const_int for the lane selector operand, but GCN has an instruction that can select the lane at runtime, so it seems reasonable to remove this restriction. This patch simply replaces assertions that the operand is constant with early exits from t

[PATCH 04/25] SPECIAL_REGNO_P

2018-09-05 Thread ams
GCN has some registers which are special purpose, but not "fixed" because we want the register allocator to track their usage and select alternatives that use different special registers (e.g. scalar cc vs. vector cc). Sometimes this leads the regrename pass to ICE. Quite how it gets confused is

[PATCH 03/25] Improve TARGET_MANGLE_DECL_ASSEMBLER_NAME.

2018-09-05 Thread ams
The HSA GPU drivers can't cope with binaries that have the same symbol defined multiple times, even though the names are not exported. This happens whenever there are file-scope static variables with matching names. I believe it's also an issue with switch tables. This is a bug, but outside our

[PATCH 05/25] Add sorry_at diagnostic function.

2018-09-05 Thread ams
The plain "sorry" diagnostic only gives the "current" location, which is typically the last line of the function or translation unit by the time we get to the back end. GCN uses "sorry" to report unsupported language features, such as static constructors, so it's useful to have a "sorry_at" variant.

[PATCH 07/25] [pr82089] Don't sign-extend SFV 1 in BImode

2018-09-05 Thread ams
This is an update of the patch posted to PR82089 long ago. We ran into the same bug on GCN, so we need this fixed as part of this series. 2018-09-05 Andrew Stubbs Tom de Vries PR82089 gcc/ * expmed.c (emit_cstore): Fix handling of result_mode == BImode

[PATCH 00/25] AMD GCN Port

2018-09-05 Thread ams
Hi All, This patch series contains the non-OpenACC/OpenMP portions of a port to AMD GCN3 and GCN5 GPU processors. It's sufficient to build single-threaded programs, with vectorization in the usual way. C and Fortran are supported, C++ is not supported, and the other front-ends have not been test

[PATCH 02/25] Propagate address spaces to builtins.

2018-09-05 Thread ams
At present, pointers passed to builtin functions, including atomic operators, are stripped of their address space properties. This doesn't seem to be deliberate; the code simply omits to copy them. Not only that, but it forces pointer sizes to Pmode, which isn't appropriate for all address spaces. This

[PATCH 01/25] Handle vectors that don't fit in an integer.

2018-09-05 Thread ams
GCN vector sizes range between 64 and 512 bytes, none of which have correspondingly sized integer modes. This breaks a number of assumptions throughout the compiler, but I don't really want to create modes just for this purpose. Instead, this patch fixes up the cases that I've found, so far, suc