[Mesa-dev] [PATCH 5/6] swr/rast: don't use 32-bit gathers for elements < 32-bits in size

2018-01-04 Thread Tim Rowley
Using a gather for elements less than 32-bits in size can cause pagefaults when loading the last elements in a page-aligned-sized buffer. --- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 61 +- 1 file changed, 60 insertions(+), 1 deletion(-) diff --git a/src/gallium/dr
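
The fault described above can be illustrated with a little offset arithmetic. This is a minimal sketch, not code from the patch: the helper names `LastElementOffset` and `LoadOverrunsBuffer` are hypothetical, and it assumes the buffer ends exactly on a page boundary with the following page unmapped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: byte offset of the last element in a tightly packed
// buffer of `bufferBytes` bytes holding elements of `elemBytes` bytes each.
constexpr std::size_t LastElementOffset(std::size_t bufferBytes, std::size_t elemBytes)
{
    return bufferBytes - elemBytes;
}

// Hypothetical helper: true if a load of `loadBytes` bytes starting at
// `offset` would touch bytes beyond the end of the buffer.
constexpr bool LoadOverrunsBuffer(std::size_t offset, std::size_t loadBytes,
                                  std::size_t bufferBytes)
{
    return offset + loadBytes > bufferBytes;
}
```

With a 4096-byte buffer, a 4-byte load at the last 8-bit element (offset 4095) reads three bytes past the end, while a load sized to the element stays in bounds.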

[Mesa-dev] [PATCH 1/6] swr/rast: SIMD16 builder - cleanup naming (simd2 -> simd16)

2018-01-04 Thread Tim Rowley
--- .../drivers/swr/rasterizer/jitter/builder.cpp | 76 +- .../drivers/swr/rasterizer/jitter/builder.h| 45 +++--- .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 133 .../drivers/swr/rasterizer/jitter/builder_misc.h | 50 +++--- .../drivers/swr/rast

[Mesa-dev] [PATCH 4/6] swr/rast: autogenerate named structs instead of literal structs

2018-01-04 Thread Tim Rowley
Results in far smaller and useful IR output. --- .../swr/rasterizer/codegen/templates/gen_llvm.hpp | 23 ++ 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_llvm.hpp b/src/gallium/drivers/swr/rasterizer/co

[Mesa-dev] [PATCH 3/6] swr/rast: SIMD16 fetch shader jitter cleanup

2018-01-04 Thread Tim Rowley
Bake in USE_SIMD16_BUILDER code paths (for USE_SIMD16_SHADER defined), remove USE_SIMD16_BUILDER define, remove deprecated pseudo-SIMD16 code paths. --- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 1118 +++- 1 file changed, 383 insertions(+), 735 deletions(-) diff --git a

[Mesa-dev] [PATCH 2/6] swr/rast: shuffle header files for msvc pre-compiled header usage

2018-01-04 Thread Tim Rowley
--- src/gallium/drivers/swr/Makefile.sources | 1 + .../drivers/swr/rasterizer/jitter/JitManager.cpp | 36 +- .../drivers/swr/rasterizer/jitter/JitManager.h | 46 +-- .../drivers/swr/rasterizer/jitter/blend_jit.cpp| 3 +- .../drivers/swr/rasterizer/jitter/builder.

[Mesa-dev] [PATCH 0/6] swr: update rasterizer

2018-01-04 Thread Tim Rowley
Highlights include simd16 cleanup (renaming and removing old codepaths), fixing a potential crash with the fetch shader, and code cleanups. Tim Rowley (6): swr/rast: SIMD16 builder - cleanup naming (simd2 -> simd16) swr/rast: shuffle header files for msvc pre-compiled header usage swr/r

[Mesa-dev] [PATCH 6/6] swr/rast: switch win32 jit format to COFF

2018-01-04 Thread Tim Rowley
Allows for call-stack and exception handling for jitted functions. --- src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp b/src/gallium/drivers/swr/rasterizer/jit

[Mesa-dev] [PATCH] swr/rast: fix invalid sign masks in avx512 simdlib code

2018-01-04 Thread Tim Rowley
Should be 0x8000 instead of 0x800. Cc: mesa-sta...@lists.freedesktop.org --- src/gallium/drivers/swr/rasterizer/common/simdlib_128_avx512.inl | 2 +- src/gallium/drivers/swr/rasterizer/common/simdlib_256_avx512.inl | 2 +- src/gallium/drivers/swr/rasterizer/common/simdlib_512_avx512.inl |

[Mesa-dev] [PATCH] swr/rast: fix build break for llvm-6

2018-01-02 Thread Tim Rowley
LLVM api change. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104381 --- src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp b/src/gallium/drivers/swr/rasterizer/jitter

[Mesa-dev] [PATCH 19/20] swr/rast: EXTRACT2 changed from vextract/vinsert to vshuffle

2017-12-14 Thread Tim Rowley
--- .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 60 ++ .../drivers/swr/rasterizer/jitter/builder_misc.h | 3 +- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 30 +-- 3 files changed, 32 insertions(+), 61 deletions(-) diff --git a/src/gallium/drivers/

[Mesa-dev] [PATCH 17/20] swr/rast: Replace VPSRL with LSHR

2017-12-14 Thread Tim Rowley
Replace use of x86 intrinsic with general llvm IR instruction. Generates the same final assembly. --- .../swr/rasterizer/codegen/gen_llvm_ir_macros.py | 2 -- .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 30 -- .../drivers/swr/rasterizer/jitter/builder_misc.h | 5
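
As a rough scalar model of what both forms compute, the sketch below applies a logical right shift to each 32-bit lane. It is not the jitter's builder code; `Lanes8` and `LogicalShiftRight` are hypothetical names, and the sketch only accepts shift counts below 32, where the two forms agree.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one SIMD8 register of 32-bit lanes.
using Lanes8 = std::array<std::uint32_t, 8>;

// Logical (zero-filling) right shift applied independently to each lane.
Lanes8 LogicalShiftRight(const Lanes8& v, unsigned count)
{
    assert(count < 32); // larger counts are outside the scope of this sketch
    Lanes8 result{};
    for (std::size_t i = 0; i < v.size(); ++i)
    {
        // Unsigned operands guarantee zeros are shifted in from the top.
        result[i] = v[i] >> count;
    }
    return result;
}
```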

[Mesa-dev] [PATCH 20/20] swr/rast: Move more RTAI handling out of binner

2017-12-14 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 13 + src/gallium/drivers/swr/rasterizer/core/clip.h | 1 + 2 files changed, 2 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/binner.cpp b/src/gallium/drivers/swr/rasterizer/core/binner

[Mesa-dev] [PATCH 13/20] swr/rast: SIMD16 Fetch - Fully widen 32-bit integer vertex components

2017-12-14 Thread Tim Rowley
Also widen the 16-bit and 8-bit integer vertex component gathers to SIMD16. --- .../swr/rasterizer/codegen/gen_llvm_ir_macros.py | 1 + .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 36 + .../drivers/swr/rasterizer/jitter/builder_misc.h | 3 + .../drivers/swr/rasterizer/jitter/f

[Mesa-dev] [PATCH 18/20] swr/rast: Fix cache of API thread event manager

2017-12-14 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/api.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/api.cpp b/src/gallium/drivers/swr/rasterizer/core/api.cpp index 25a3f34841..09b482dcc0 100644 --- a/src/gallium/drivers/swr/rasterizer/co

[Mesa-dev] [PATCH 12/20] swr/rast: Replace INSERT2 vextract/vinsert with JOIN2 vshuffle

2017-12-14 Thread Tim Rowley
--- .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 38 ++--- .../drivers/swr/rasterizer/jitter/builder_misc.h | 5 +- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 92 ++ 3 files changed, 30 insertions(+), 105 deletions(-) diff --git a/src/gallium/drivers/s

[Mesa-dev] [PATCH 15/20] swr/rast: Pull of RTAI gather & offset out of clip/bin code

2017-12-14 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 118 +++- src/gallium/drivers/swr/rasterizer/core/clip.cpp | 30 ++-- src/gallium/drivers/swr/rasterizer/core/clip.h | 35 +++-- src/gallium/drivers/swr/rasterizer/core/context.h | 4 +- .../drivers/swr/rasterizer/core

[Mesa-dev] [PATCH 16/20] swr/rast: Rework thread binding parameters for machine partitioning

2017-12-14 Thread Tim Rowley
Add BASE_NUMA_NODE, BASE_CORE, BASE_THREAD parameters to SwrCreateContext. Add optional SWR_API_THREADING_INFO parameter to SwrCreateContext to control reservation of API threads. Add SwrBindApiThread() function to allow binding of API threads to reserved HW threads. --- .../drivers/swr/rasteriz

[Mesa-dev] [PATCH 14/20] swr/rast: Remove no-op VBROADCAST of vID

2017-12-14 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp index ec3b5eafcc..1312ac0009 100644 --- a/src/galli

[Mesa-dev] [PATCH 01/20] swr/rast: Remove unneeded copy of gather mask

2017-12-14 Thread Tim Rowley
--- .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 22 +- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 80 ++ 2 files changed, 23 insertions(+), 79 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp b/src/gallium/drivers/swr

[Mesa-dev] [PATCH 02/20] swr/rast: Binner fixes for viewport index offset handling

2017-12-14 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 9 - src/gallium/drivers/swr/rasterizer/core/clip.h | 5 - 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/binner.cpp b/src/gallium/drivers/swr/rasterizer/core/binner.c

[Mesa-dev] [PATCH 00/20] swr: update rasterizer

2017-12-14 Thread Tim Rowley
Highlights include simd16 work, thread pool initialization rework, and code cleanup. Tim Rowley (20): swr/rast: Remove unneeded copy of gather mask swr/rast: Binner fixes for viewport index offset handling swr/rast: Corrections to multi-scissor handling swr/rast: WIP - Widen fetch shader

[Mesa-dev] [PATCH 04/20] swr/rast: WIP - Widen fetch shader to SIMD16

2017-12-14 Thread Tim Rowley
Widen vertex gather/storage to SIMD16 for all component types. --- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 716 - 1 file changed, 689 insertions(+), 27 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp b/src/gallium/drivers/swr/ras

[Mesa-dev] [PATCH 11/20] swr/rast: SIMD16 Fetch - Fully widen 16-bit float vertex components

2017-12-14 Thread Tim Rowley
--- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 55 +++--- 1 file changed, 48 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp index 2065db3475..c960dc77fb 100644

[Mesa-dev] [PATCH 03/20] swr/rast: Corrections to multi-scissor handling

2017-12-14 Thread Tim Rowley
binner's GatherScissors() will be turned into a real gather in the not too distant future. --- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 176 ++--- 1 file changed, 88 insertions(+), 88 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/binner.cpp b/src/g

[Mesa-dev] [PATCH 08/20] swr/rast: Pull most of the VPAI manipulation out of the binner/clipper

2017-12-14 Thread Tim Rowley
Move out of binner/clipper; hand them down from the frontend code instead. --- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 124 ++--- src/gallium/drivers/swr/rasterizer/core/clip.cpp | 25 ++--- src/gallium/drivers/swr/rasterizer/core/clip.h | 58 +++--- src/ga

[Mesa-dev] [PATCH 05/20] swr/rast: Convert gather masks to Nx1bit

2017-12-14 Thread Tim Rowley
Simplifies calling code, gets gather function interface closer to llvm's masked_gather. --- .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 20 + .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 34 +- 2 files changed, 14 insertions(+), 40 deletions(-) dif
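
To illustrate the difference between the two mask shapes, here is a minimal sketch that converts a per-lane 32-bit mask (where, as with AVX2 gathers, only the sign bit of each lane matters) into one bit per lane. The function name `SignMaskToBitMask` is hypothetical, and packing the bits into a `uint8_t` is only a scalar stand-in for an N x 1-bit vector.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Collapse an 8-lane mask of 32-bit values into 8 single bits.
// A lane is considered active when its most significant bit is set.
std::uint8_t SignMaskToBitMask(const std::array<std::uint32_t, 8>& laneMask)
{
    std::uint8_t bits = 0;
    for (std::size_t i = 0; i < laneMask.size(); ++i)
    {
        if (laneMask[i] & 0x80000000u)
        {
            bits = static_cast<std::uint8_t>(bits | (1u << i));
        }
    }
    return bits;
}
```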

[Mesa-dev] [PATCH 06/20] swr/rast: Rewrite Shuffle8bpcGatherd using shuffle

2017-12-14 Thread Tim Rowley
Ease future code maintenance, prepare for folding simd8 and simd16 versions. --- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 244 ++--- 1 file changed, 62 insertions(+), 182 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp b/src/gallium/d

[Mesa-dev] [PATCH 07/20] swr/rast: Move GatherScissors to header

2017-12-14 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 127 - src/gallium/drivers/swr/rasterizer/core/binner.h | 127 + 2 files changed, 127 insertions(+), 127 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/binner.cpp b/src/gallium/d

[Mesa-dev] [PATCH 10/20] swr/rast: SIMD16 Fetch - Fully widen 32-bit float vertex components

2017-12-14 Thread Tim Rowley
--- .../swr/rasterizer/codegen/gen_llvm_ir_macros.py | 3 +- .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 41 - .../drivers/swr/rasterizer/jitter/builder_misc.h | 7 +- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 175 ++--- 4 files changed, 194 inserti

[Mesa-dev] [PATCH 09/20] swr/rast: Pass prim to ClipSimd

2017-12-14 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/clip.h | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/clip.h b/src/gallium/drivers/swr/rasterizer/core/clip.h index 148f661ab4..8b947668d3 100644 --- a/src/gallium/drivers/swr/raste

[Mesa-dev] [PATCH 08/10] swr/rast: Simplify GATHER* jit builder api

2017-11-20 Thread Tim Rowley
General cleanup, and prep work for possibly moving to llvm masked gather intrinsic. --- .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 32 ++--- .../drivers/swr/rasterizer/jitter/builder_misc.h | 6 +-- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 56 +++---

[Mesa-dev] [PATCH 06/10] swr/rast: Cache eventmanager

2017-11-20 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/archrast/archrast.h | 1 + src/gallium/drivers/swr/rasterizer/core/api.cpp| 5 + src/gallium/drivers/swr/rasterizer/core/api.h | 3 +++ 3 files changed, 9 insertions(+) diff --git a/src/gallium/drivers/swr/rasterizer/archrast/archrast.h

[Mesa-dev] [PATCH 05/10] swr/rast: Enable AVX-512 targets in the jitter

2017-11-20 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/knobs.h| 8 src/gallium/drivers/swr/rasterizer/jitter/JitManager.h | 2 -- 2 files changed, 10 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/knobs.h b/src/gallium/drivers/swr/rasterizer/core/knobs.h index fe0a044ae8

[Mesa-dev] [PATCH 02/10] swr/rast: Widen fetch shader to SIMD16

2017-11-20 Thread Tim Rowley
Widen fetch shader to SIMD16, enable SIMD16 types in the jitter, and provide EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions. --- .../drivers/swr/rasterizer/jitter/builder.cpp | 20 .../drivers/swr/rasterizer/jitter/builder.h| 16 ++ .../drivers/swr/rasterizer/j

[Mesa-dev] [PATCH 03/10] swr/rast: Code style change (NFC)

2017-11-20 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/frontend.cpp | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp index e15b300979..2fe6cfcf69 100644 --- a/src/gallium/d

[Mesa-dev] [PATCH 00/10] swr: update rasterizer

2017-11-20 Thread Tim Rowley
Highlights are code cleanups and more progress on simd16. Tim Rowley (10): swr/rast: support flexible vertex layout for DS output swr/rast: Widen fetch shader to SIMD16 swr/rast: Code style change (NFC) swr/rast: Points with clipdistance can't go through simplepoints path swr

[Mesa-dev] [PATCH 07/10] swr/rast: Add alignment to transpose targets

2017-11-20 Thread Tim Rowley
Needed to ensure alignment for avx512. Fixes address sanitizer crash. --- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/binner.cpp b/src/gallium/drivers/swr/rasterize
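
The alignment requirement can be sketched as follows, assuming the targets hold one 512-bit vector each and are accessed with aligned loads and stores, which need 64-byte alignment. The type `AlignedSimd16` and the helper `IsAligned` are hypothetical and are not taken from binner.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical storage for one 512-bit vector (16 floats), forced to the
// 64-byte alignment that aligned 512-bit loads and stores require.
struct alignas(64) AlignedSimd16
{
    float lanes[16];
};

// True if `p` is a multiple of `alignment` bytes.
inline bool IsAligned(const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```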

[Mesa-dev] [PATCH 01/10] swr/rast: support flexible vertex layout for DS output

2017-11-20 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/frontend.cpp | 1 + src/gallium/drivers/swr/rasterizer/core/state.h | 2 ++ 2 files changed, 3 insertions(+) diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp index 211e9e4b07.

[Mesa-dev] [PATCH 04/10] swr/rast: Points with clipdistance can't go through simplepoints path

2017-11-20 Thread Tim Rowley
Fixes piglit glsl-1.20:vs-clip-vertex-primitives and glsl-1.30:vs-clip-distance-primitives. --- src/gallium/drivers/swr/rasterizer/core/frontend.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.h b/src/gallium/drivers/swr/ras

[Mesa-dev] [PATCH 09/10] swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader

2017-11-20 Thread Tim Rowley
Disabled for now. --- .../swr/rasterizer/codegen/gen_llvm_ir_macros.py | 1 + .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 126 +++-- .../drivers/swr/rasterizer/jitter/builder_misc.h | 31 - .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 91 ---

[Mesa-dev] [PATCH 10/10] swr/rast: Repair simd8 frontend code rot

2017-11-20 Thread Tim Rowley
Keep non-default simd8 frontend code running for comparison purposes. --- src/gallium/drivers/swr/rasterizer/core/frontend.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp b/src/gallium/drivers/swr/rasterizer/core/fronten

[Mesa-dev] [PATCH] swr/rast: Use gather instruction for i32gather_ps on simd16/avx512

2017-11-13 Thread Tim Rowley
Speed up avx512 platforms; fixes performance regression caused by switch to simdlib. Cc: mesa-sta...@lists.freedesktop.org --- .../drivers/swr/rasterizer/common/simdlib_512_avx512.inl | 12 +--- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/src/gallium/drivers/swr/rast

[Mesa-dev] [PATCH] swr/rast: Faster emulated simd16 permute

2017-11-13 Thread Tim Rowley
Speed up simd16 frontend (default) on avx/avx2 platforms; fixes performance regression caused by switch to simdlib. Cc: mesa-sta...@lists.freedesktop.org --- .../swr/rasterizer/common/simdlib_512_emu.inl | 34 +++--- 1 file changed, 11 insertions(+), 23 deletions(-) diff --g

[Mesa-dev] [PATCH] gallivm: allow arch rounding with avx512

2017-11-01 Thread Tim Rowley
Fixes piglit vs-roundeven-{float,vec[234]} with simd16 VS. --- src/gallium/auxiliary/gallivm/lp_bld_arit.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c index cf1958b3b6..a1edd349f1 1

[Mesa-dev] [PATCH] gallium: add more exceptions to tgsi_util_get_inst_usage_mask

2017-10-19 Thread Tim Rowley
A number of double/int64 operations don't have matching read and write usage masks, which the fallthrough case of tgsi_util_get_inst_usage_mask assumes for componentwise tagged instructions. No regressions in llvmpipe piglit; fixes a large number of swr regressions. --- src/gallium/auxiliary/tgsi

[Mesa-dev] [PATCH 6/7] swr/rast: Add api to override draws in flight

2017-10-19 Thread Tim Rowley
Allow draws in flight to be overridden via SWR_CREATECONTEXT_INFO. Patch by Jan Zielinski. --- src/gallium/drivers/swr/rasterizer/core/api.cpp| 26 +- src/gallium/drivers/swr/rasterizer/core/api.h | 4 src/gallium/drivers/swr/rasterizer/core/context.h | 2 ++

[Mesa-dev] [PATCH 4/7] swr/rast: Change DS memory allocation

2017-10-19 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/frontend.cpp | 4 ++-- src/gallium/drivers/swr/rasterizer/core/state.h | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp

[Mesa-dev] [PATCH 1/7] swr/rast: Minor changes for os-x

2017-10-19 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/threads.cpp | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/threads.cpp b/src/gallium/drivers/swr/rasterizer/core/threads.cpp index 4bb395d..9ece064 100644 --- a/src/gallium/drivers/swr/r

[Mesa-dev] [PATCH 0/7] swr: rasterizer update

2017-10-19 Thread Tim Rowley
Highlights are code cleanups, some more simd16 work (disabled by default), and tuning for the Intel Xeon Phi architecture. Tim Rowley (7): swr/rast: Minor changes for os-x swr/rast: Miscellaneous viewport array code changes swr/rast: Fix indentation swr/rast: Change DS memory allocation

[Mesa-dev] [PATCH 3/7] swr/rast: Fix indentation

2017-10-19 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/state.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/state.h b/src/gallium/drivers/swr/rasterizer/core/state.h index f7c9308..d9450fc 100644 --- a/src/gallium/drivers/swr/rasterizer/core/sta

[Mesa-dev] [PATCH 2/7] swr/rast: Miscellaneous viewport array code changes

2017-10-19 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 45 -- src/gallium/drivers/swr/rasterizer/core/clip.h | 14 +-- .../drivers/swr/rasterizer/core/frontend.cpp | 22 ++- src/gallium/drivers/swr/rasterizer/core/pa.h | 24 ++-- src/galliu

[Mesa-dev] [PATCH 7/7] swr: knob overrides for Intel Xeon Phi

2017-10-19 Thread Tim Rowley
Architecture benefits from having more threads/work outstanding. --- src/gallium/drivers/swr/swr_context.cpp | 27 +++ src/gallium/drivers/swr/swr_context.h | 2 ++ src/gallium/drivers/swr/swr_loader.cpp | 4 src/gallium/drivers/swr/swr_scratch.cpp | 2 +- src/ga

[Mesa-dev] [PATCH 5/7] swr/rast: Widen fetch shader to SIMD16 (disabled for now)

2017-10-19 Thread Tim Rowley
Refactored the gather operation to process 16 elements at a time via paired SIMD8 operations. --- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 441 - 1 file changed, 428 insertions(+), 13 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
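
The "paired SIMD8" idea can be modelled in scalar code: a 16-wide gather is built from two 8-wide gathers over the low and high halves of the index vector. This is only an illustrative sketch; `Simd8`, `Simd16`, `Gather8`, and `Gather16` are hypothetical names and do not correspond to the jitter's builder API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Simd8  = std::array<float, 8>;
using Simd16 = std::array<float, 16>;

// Scalar model of an 8-wide gather: lane i loads base[indices[i]].
Simd8 Gather8(const float* base, const std::int32_t* indices)
{
    Simd8 out{};
    for (std::size_t i = 0; i < out.size(); ++i)
    {
        out[i] = base[indices[i]];
    }
    return out;
}

// 16-wide gather expressed as two 8-wide gathers whose results are
// joined into the low and high halves of the output.
Simd16 Gather16(const float* base, const std::array<std::int32_t, 16>& indices)
{
    const Simd8 lo = Gather8(base, indices.data());
    const Simd8 hi = Gather8(base, indices.data() + 8);

    Simd16 out{};
    for (std::size_t i = 0; i < 8; ++i)
    {
        out[i]     = lo[i];
        out[i + 8] = hi[i];
    }
    return out;
}
```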

[Mesa-dev] [PATCH 0/2] gallium/swr: simd16 work in progress

2017-10-11 Thread Tim Rowley
Changes to allow the swr work in progress native simd16 pipeline. Currently enabling this via USE_SIMD16_SHADERS in knobs.h will run the fetch shader with double pumped simd8, the vertex shaders in native simd16, and the rest of the pipeline in simd8. Tim Rowley (2): gallium: allow 512-bit

[Mesa-dev] [PATCH 1/2] gallium: allow 512-bit vectors

2017-10-11 Thread Tim Rowley
Increase the max allowed vector size from 256 to 512. No piglit llvmpipe regressions running on avx2. Cc: Dave Airlie Cc: Jose Fonseca --- src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 14 +++--- src/gallium/auxiliary/gallivm/lp_bld_type.h | 4 ++-- 2 files changed, 9 insertio

[Mesa-dev] [PATCH 2/2] swr: simd16 shaders work in progress

2017-10-11 Thread Tim Rowley
Start building vertex shaders as simd16. Disabled by default, set USE_SIMD16_SHADERS in knobs.h to experiment. Cc: Bruce Cherniak --- src/gallium/drivers/swr/swr_screen.cpp | 6 ++ src/gallium/drivers/swr/swr_screen.h | 3 +++ src/gallium/drivers/swr/swr_shader.cpp | 14 --

[Mesa-dev] [PATCH 2/2] swr/rast: use proper alignment for debug transposedPrims

2017-10-03 Thread Tim Rowley
Causing a crash in ParaView waveletcontour.py test when _DEBUG defined due to vector aligned copy with unaligned address. --- src/gallium/drivers/swr/rasterizer/core/clip.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/clip.h b/src

[Mesa-dev] [PATCH 1/2] configure.ac: add _DEBUG to strip_unwanted_llvm_flags

2017-10-03 Thread Tim Rowley
Assert-enabled builds of llvm add _DEBUG to the LLVM_CFLAGS. This was causing a crash with swr running the ParaView waveletcontour.py test, due to a bug in our _DEBUG code. --- configure.ac | 1 + 1 file changed, 1 insertion(+) diff --git a/configure.ac b/configure.ac index 903a3979d4..b2768f46c

[Mesa-dev] [PATCH 7/9] swr/rast: Fix allocation of DS output data for USE_SIMD16_FRONTEND

2017-09-21 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/frontend.cpp | 16 ++-- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp index 22a5705..aea8e88 100644 --- a/src/galliu

[Mesa-dev] [PATCH 0/9] swr: update rasterizer

2017-09-21 Thread Tim Rowley
Highlights: large change in the geometry shader api, cleanups. Tim Rowley (9): swr/rast: Add support for R10G10B10_FLOAT_A2_UNORM pixel format swr/rast: New GS state/context API swr/rast: Fetch compile state changes swr/rast: Move SWR_GS_CONTEXT from thread local storage to stack swr

[Mesa-dev] [PATCH 4/9] swr/rast: Move SWR_GS_CONTEXT from thread local storage to stack

2017-09-21 Thread Tim Rowley
Move structure, as the size is significantly reduced due to dynamic allocation of the GS buffers. --- .../drivers/swr/rasterizer/core/frontend.cpp | 23 +++--- 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp

[Mesa-dev] [PATCH 1/9] swr/rast: Add support for R10G10B10_FLOAT_A2_UNORM pixel format

2017-09-21 Thread Tim Rowley
--- .../drivers/swr/rasterizer/common/formats.cpp | 27 +++--- .../drivers/swr/rasterizer/core/format_traits.h| 2 +- .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 16 ++--- 3 files changed, 28 insertions(+), 17 deletions(-) diff --git a/src/gallium/driver

[Mesa-dev] [PATCH 5/9] swr/rast: Properly sized null GS buffer

2017-09-21 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/frontend.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp index 15bc93d..22a5705 100644 --- a/src/gallium/drivers/swr/rast

[Mesa-dev] [PATCH 3/9] swr/rast: Fetch compile state changes

2017-09-21 Thread Tim Rowley
Add ForceSequentialAccessEnable and InstanceIDOffsetEnable bools to FETCH_COMPILE_STATE. --- src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp | 6 ++ src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h | 7 ++- 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/src/g

[Mesa-dev] [PATCH 8/9] swr/rast: Remove code supporting legacy llvm (<3.9)

2017-09-21 Thread Tim Rowley
--- .../drivers/swr/rasterizer/jitter/JitManager.cpp | 11 ++- .../drivers/swr/rasterizer/jitter/JitManager.h | 7 -- .../drivers/swr/rasterizer/jitter/builder_misc.cpp | 102 ++--- 3 files changed, 15 insertions(+), 105 deletions(-) diff --git a/src/gallium/drivers/swr/r

[Mesa-dev] [PATCH 9/9] swr/rast: Handle instanceID offset / Instance Stride enable

2017-09-21 Thread Tim Rowley
Supported in JitGatherVertices(); FetchJit::JitLoadVertices() may require similar changes, will need to address this if it is determined that this path is still in use. Handle Force Sequential Access in FetchJit::Create. --- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 46 ++-

[Mesa-dev] [PATCH 2/9] swr/rast: New GS state/context API

2017-09-21 Thread Tim Rowley
One piglit regression, which was a false pass: spec@glsl-1.50@execution@geometry@dynamic_input_array_index --- .../drivers/swr/rasterizer/core/frontend.cpp | 227 - src/gallium/drivers/swr/rasterizer/core/state.h| 55 +++-- src/gallium/drivers/swr/swr_shader.cpp

[Mesa-dev] [PATCH 6/9] swr/rast: Slightly more efficient blend jit

2017-09-21 Thread Tim Rowley
--- .../drivers/swr/rasterizer/jitter/blend_jit.cpp| 30 -- 1 file changed, 10 insertions(+), 20 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp index f2e6e53..3258639 100644 --- a

[Mesa-dev] [PATCH] swr/rast: remove llvm fence/atomics from generated files

2017-09-19 Thread Tim Rowley
We currently don't use these instructions, and since their API changed in llvm-5.0 having them in the autogen files broke the mesa release tarballs which ship with generated autogen files. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102847 CC: mesa-sta...@lists.freedesktop.org --- src/

[Mesa-dev] [PATCH 08/10] swr/rast: Missed conversion to SIMD_T

2017-09-11 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/binner.cpp b/src/gallium/drivers/swr/rasterizer/core/binner.cpp index a6713e8..e08e489 100644 --- a/src/gallium/drivers/swr/rasterizer

[Mesa-dev] [PATCH 01/10] swr/rast: Add new API SwrStallBE

2017-09-11 Thread Tim Rowley
SwrStallBE stalls the backend threads until all work submitted before the stall has finished. The frontend threads can continue to make forward progress. --- src/gallium/drivers/swr/rasterizer/core/api.cpp | 9 + src/gallium/drivers/swr/rasterizer/core/api.h | 8 2 files change

[Mesa-dev] [PATCH 07/10] swr/rast: whitespace changes

2017-09-11 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/jitter/jit_api.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/jit_api.h b/src/gallium/drivers/swr/rasterizer/jitter/jit_api.h index 9f69669..e589d2c 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/jit

[Mesa-dev] [PATCH 10/10] swr/rast: Fetch compile state changes

2017-09-11 Thread Tim Rowley
Add InstanceStrideEnable field and rename InstanceDataStepRate to InstanceAdvancementState in INPUT_ELEMENT_DESC structure. Add stubs for handling InstanceStrideEnable in FetchJit::JitLoadVertices() and FetchJit::JitGatherVertices() and assert if they are triggered. --- src/gallium/drivers/swr/ra

[Mesa-dev] [PATCH 05/10] swr/rast: Migrate memory pointers to gfxptr_t type

2017-09-11 Thread Tim Rowley
--- .../swr/rasterizer/codegen/gen_llvm_types.py| 2 +- src/gallium/drivers/swr/rasterizer/core/state.h | 5 +++-- .../drivers/swr/rasterizer/memory/StoreTile.h | 4 ++-- .../drivers/swr/rasterizer/memory/TilingFunctions.h | 2 +- src/gallium/drivers/swr/swr_context.cpp

[Mesa-dev] [PATCH 04/10] swr/rast: Remove hardcoded clip/cull slot from clipper

2017-09-11 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/clip.h | 35 +++--- 1 file changed, 21 insertions(+), 14 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/clip.h b/src/gallium/drivers/swr/rasterizer/core/clip.h index e0aaf81..cde5261 100644 --- a/src/gallium/drive

[Mesa-dev] [PATCH 06/10] swr/rast: add graph write to jit debug output

2017-09-11 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp index fc32b62..e4281f8 100644 --- a/src/galliu

[Mesa-dev] [PATCH 03/10] swr/rast: Start to remove hardcoded clipcull_dist vertex attrib slot

2017-09-11 Thread Tim Rowley
Add new field in SWR_BACKEND_STATE::vertexClipCullOffset to specify the start of the clip/cull section of the vertex header. Removed use of hardcoded slot from binner. --- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 11 ++- src/gallium/drivers/swr/rasterizer/core/state.h| 9

[Mesa-dev] [PATCH 09/10] swr/rast: adjust linux cpu topology identification code

2017-09-11 Thread Tim Rowley
Make more robust to handle strange configurations like a vmware exported 4-way numa X 1-core configuration. --- .../drivers/swr/rasterizer/core/threads.cpp| 81 ++ 1 file changed, 38 insertions(+), 43 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer

[Mesa-dev] [PATCH 02/10] swr/rast: Move clip/cull enables in API

2017-09-11 Thread Tim Rowley
Moved from SWR_RASTSTATE to SWR_BACKEND_STATE. --- .../drivers/swr/rasterizer/core/backend.cpp| 4 ++-- .../drivers/swr/rasterizer/core/backend_impl.h | 2 +- .../drivers/swr/rasterizer/core/backend_sample.cpp | 4 ++-- .../swr/rasterizer/core/backend_singlesample.cpp | 4 ++

[Mesa-dev] [PATCH 00/10] swr: update rasterizer

2017-09-11 Thread Tim Rowley
Mostly some api changes, plus making the cpu topology code a bit more robust in the face of some odd configurations seen in virtualized environments. No piglit or vtk ctest regressions. Tim Rowley (10): swr/rast: Add new API SwrStallBE swr/rast: Move clip/cull enables in API swr/rast

[Mesa-dev] [PATCH 6/8] swr/rast: SIMD16 FE remove templated immediates workaround

2017-09-05 Thread Tim Rowley
Fixed properly in gcc-compatible fashion. --- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 110 - 1 file changed, 20 insertions(+), 90 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/binner.cpp b/src/gallium/drivers/swr/rasterizer/core/binner.cpp ind

[Mesa-dev] [PATCH 7/8] swr/rast: Remove use of C++14 template variable

2017-09-05 Thread Tim Rowley
SWR rasterizer must remain C++11 compliant. --- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 6 +++--- src/gallium/drivers/swr/rasterizer/core/binner.h | 14 +++--- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/binner.cpp

[Mesa-dev] [PATCH 0/8] swr: update rasterizer

2017-09-05 Thread Tim Rowley
Highlight is starting to unify the simd/simd16 code, removing lots of temporary code duplication. No piglit or vtk test regressions. Tim Rowley (8): swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets swr: set caps for VB 4-byte alignment swr/rast: Removed some trailing

[Mesa-dev] [PATCH 5/8] swr/rast: SIMD16 PA - rename Assemble_simd16 to Assemble

2017-09-05 Thread Tim Rowley
For consistency and to support overloading. --- src/gallium/drivers/swr/rasterizer/core/clip.h | 18 +- .../drivers/swr/rasterizer/core/frontend.cpp | 6 +++--- src/gallium/drivers/swr/rasterizer/core/pa.h | 22 +++--- 3 files changed, 15 insertions

[Mesa-dev] [PATCH 8/8] swr/rast: FE/Clipper - unify SIMD8/16 functions using simdlib types

2017-09-05 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/core/clip.cpp | 16 +- src/gallium/drivers/swr/rasterizer/core/clip.h | 1650 ++ src/gallium/drivers/swr/rasterizer/core/state.h |7 + 3 files changed, 465 insertions(+), 1208 deletions(-) diff --git a/src/gallium/drivers/swr/ras

[Mesa-dev] [PATCH 2/8] swr: set caps for VB 4-byte alignment

2017-09-05 Thread Tim Rowley
Needed to compensate for change to fetch jit requiring alignment. Fixes regressions in piglit: vertex-buffer-offsets and about another hundred of the vs-input*byte* tests. --- src/gallium/drivers/swr/swr_screen.cpp | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/g

[Mesa-dev] [PATCH 1/8] swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets

2017-09-05 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py | 1 + src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp | 7 ++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py b/src/gallium/dr

[Mesa-dev] [PATCH 3/8] swr/rast: Removed some trailing whitespace caught during review

2017-09-05 Thread Tim Rowley
--- .../rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp | 4 ++-- src/gallium/drivers/swr/rasterizer/core/fifo.hpp | 4 ++-- src/gallium/drivers/swr/rasterizer/core/pa.h | 12 ++-- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/src/

[Mesa-dev] [PATCH] swr: limit pipe_draw_info->restart_index usage

2017-08-23 Thread Tim Rowley
Only copy this value when in restart drawing mode. Eliminates valgrind errors when running trivial programs. --- src/gallium/drivers/swr/swr_draw.cpp | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/swr_draw.cpp b/src/gallium/drivers/swr/swr_draw.cp
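
The idea described above can be sketched as follows. This is a minimal illustration, assuming the draw info has a boolean restart flag alongside the index; DrawInfo, FetchState and MakeFetchState are hypothetical stand-ins, not the actual swr_draw.cpp code, which is truncated in this listing.

```cpp
#include <cstdint>

// Hypothetical stand-in for the incoming draw parameters.
struct DrawInfo
{
    bool     primitive_restart; // true when restart drawing mode is active
    uint32_t restart_index;     // only meaningful when primitive_restart is true
};

// Hypothetical stand-in for the driver-side state that consumes it.
struct FetchState
{
    bool     bEnableCutIndex;
    uint32_t cutIndex;
};

// Read restart_index only in restart mode. When the mode is off the caller
// may have left the field uninitialized, and reading it is the kind of
// access that valgrind reports.
inline FetchState MakeFetchState(const DrawInfo& info)
{
    FetchState state = {}; // zero-initialize so cutIndex is always defined
    state.bEnableCutIndex = info.primitive_restart;
    if (info.primitive_restart)
    {
        state.cutIndex = info.restart_index;
    }
    return state;
}
```

With the guard in place the consumer sees a well-defined zero cut index whenever restart mode is off, regardless of what the caller left in the field.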

[Mesa-dev] [PATCH] configure: remove trailing "-a" in swr architecture test

2017-08-10 Thread Tim Rowley
Fixes "configure: line 27326: test: argument expected" CC: mesa-sta...@lists.freedesktop.org --- configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 5b12dd8..316e6a8 100644 --- a/configure.ac +++ b/configure.ac @@ -2545,7 +2545,7 @@ i

[Mesa-dev] [PATCH] swr/rast: [rasterizer core] fix invalid casting for calls to Interlocked* functions

2017-08-09 Thread Tim Rowley
CID: 1416243, 1416244, 1416255 CC: mesa-sta...@lists.freedesktop.org --- src/gallium/drivers/swr/rasterizer/core/api.cpp | 2 +- src/gallium/drivers/swr/rasterizer/core/context.h | 8 src/gallium/drivers/swr/rasterizer/core/threads.cpp | 4 ++-- 3 files changed, 7 insertions(+), 7 d

[Mesa-dev] [PATCH v2 17/17] swr/rast: fix core / knights split of AVX512 intrinsics

2017-08-01 Thread Tim Rowley
Move AVX512BW specific intrinsics to be Core-only. Move some AVX512F intrinsics back to common implementation file. --- .../drivers/swr/rasterizer/common/simdlib.hpp | 2 + .../swr/rasterizer/common/simdlib_512_avx512.inl | 53 + .../rasterizer/common/simdlib_512_avx512

[Mesa-dev] [PATCH v2 14/17] swr/rast: gen_knobs template code style

2017-08-01 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/codegen/templates/gen_knobs.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_knobs.cpp b/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_knobs.cpp index e6fe165..a95

[Mesa-dev] [PATCH v2 16/17] swr/rast: simplify knob default value setup

2017-08-01 Thread Tim Rowley
--- .../drivers/swr/rasterizer/codegen/templates/gen_knobs.h| 13 - src/gallium/drivers/swr/rasterizer/core/knobs_init.h| 12 +++- 2 files changed, 11 insertions(+), 14 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_knobs.h b/sr

[Mesa-dev] [PATCH v2 15/17] swr/rast: split gen_knobs templates into .h/.cpp

2017-08-01 Thread Tim Rowley
Switch to a 1:1 mapping template:generated for future maintenance. --- src/gallium/drivers/swr/Makefile.am| 3 +- src/gallium/drivers/swr/SConscript | 2 +- .../drivers/swr/rasterizer/codegen/gen_knobs.py| 14 +- .../swr/rasterizer/codegen/templates/gen_kno

[Mesa-dev] [PATCH v2 06/17] swr/rast: stop using MSFT types in platform independent code

2017-08-01 Thread Tim Rowley
--- src/gallium/drivers/swr/rasterizer/common/os.h | 6 -- src/gallium/drivers/swr/rasterizer/core/api.cpp| 2 +- src/gallium/drivers/swr/rasterizer/core/api.h | 4 ++-- src/gallium/drivers/swr/rasterizer/core/binner.cpp | 4 ++-- src/gallium/dr

[Mesa-dev] [PATCH v2 08/17] swr/rast: rename frontend pVertexStore

2017-08-01 Thread Tim Rowley
Rename to reflect global nature. --- src/gallium/drivers/swr/rasterizer/core/frontend.cpp | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp index f9eda83..

[Mesa-dev] [PATCH v2 13/17] swr/rast: switch gen_knobs.cpp license

2017-08-01 Thread Tim Rowley
Unintentionally added with an apache2 license; relicense to match the rest of the tree. --- .../swr/rasterizer/codegen/templates/gen_knobs.cpp | 29 +- 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_knobs

[Mesa-dev] [PATCH v2 12/17] swr/rast: fix scons gen_knobs.h dependency

2017-08-01 Thread Tim Rowley
Copy/paste error was duplicating a gen_knobs.cpp rule. --- src/gallium/drivers/swr/SConscript | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/SConscript b/src/gallium/drivers/swr/SConscript index a32807d..c578d7a 100644 --- a/src/gallium/drivers/swr/SCo

[Mesa-dev] [PATCH v2 09/17] swr/rast: vmask() implementations for KNL

2017-08-01 Thread Tim Rowley
--- .../swr/rasterizer/common/simdlib_512_avx512_knights.inl | 14 ++ 1 file changed, 14 insertions(+) diff --git a/src/gallium/drivers/swr/rasterizer/common/simdlib_512_avx512_knights.inl b/src/gallium/drivers/swr/rasterizer/common/simdlib_512_avx512_knights.inl index 17001be..2e
