Hi Tom,

Here is a link to our nvptx vector length patches on GitHub:
https://github.com/cesarjp/gcc/tree/trunk-og8-vl-private

Specifically, the code lives in the trunk-og8-vl-private branch. There
are four outstanding dependency patches:

  * Teach gfortran to lower OpenACC routine dims
    https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00368.html

    b186c651f37 [openacc] Make GFC default to -1 for OpenACC routine dims

  * Add target hook TARGET_GOACC_ADJUST_PARALLELISM
    https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00369.html

    49b2039013e [openacc] Add target hook TARGET_GOACC_ADJUST_PARALLELISM

  * Enable firstprivate OpenACC reductions
    https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00370.html

    1f70cdb7cf0 (HEAD -> trunk-og8-vl-private, github/trunk-og8-vl-private) [OpenACC] Enable firstprivate OpenACC reductions

  * Adjust offsets for present data clauses
    https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01213.html

    8bcda2f1a2b [libgomp, OpenACC] Adjust offsets for present data clauses

Of the patches in trunk-og8-vl-private, the following are just general
refactors and cleanups which do not change any functionality:

  7eb378e9b0c [nvptx] Generalize state propagation and synchronization
  10aa1f74d5a [nvptx] Use MAX, MIN, ROUND_UP macros
  9dfe611f3d8 [nvptx] Use TARGET_SET_CURRENT_FUNCTION
  4fbe0e812bd [nvptx] Add axis_dim
  fbe43dac79f [nvptx] Add thread count parm to bar.sync
  57d3f8c88ff [nvptx] only use one bar.sync barriers in OpenACC offloaded code
  f14d0e882eb [nvptx] Fix whitespace in nvptx_single and nvptx_neuter_pars
  82d81fffb0f [nvptx] make nvptx state propagation function names more generic
  95703737e09 [nvptx] consolidate offloaded function attributes into struct offload_attrs
  8c9e897c36d [nvptx] Rename worker_bcast variables oacc_bcast.
  45147e7e3f3 [nvptx] update openacc dim macros
  caa641ecfb4 [nvptx] Update insufficient launch message to accommodate large vectors

The following patches actually implement the new vector length
functionality.
Note that trunk doesn't support missing arguments between colons in
-fopenacc-dim, as in -fopenacc-dim=::64, so I had to remove or adjust a
couple of your test cases from og8.

  591973d3c3a [nvptx] use user-defined vectors when possible
  fb9cefa5b17 [nvptx] Handle large vector reductions
  5154d363d07 [nvptx] Force vl32 if calling vector-partitionable routines
  f62e3afcf6a [nvptx, openacc] Don't emit barriers for empty loops
  4cc408658fb [PR85246] [nvptx] Fix propagation of branch cond in vw-neutered code
  d97ed5fc580 [nvptx] Simplifly logic in nvptx_single
  62f0c5df3dd [nvptx] Enable worker partitioning with warp-sized vector_length
  f2cf96b0df3 [nvptx] Handle large vectors in libgomp
  eba014c260c [nvptx] Enable large vectors
  f31d8b98ca1 [nvptx] Add vector_length 128 testcases

Let me know if you encounter any problems with that GitHub branch. The
branch was recently rebased against trunk. Further, I bootstrapped and
regtested it on an x86_64 Linux target with nvptx offloading.

Thanks,
Cesar