Hi there,

This series adds support for ARB_compute_shader on Kepler GK104. GK110+ support is still unstable and needs more work.
Almost all dEQP compute tests pass (~97%). As usual, the failures are listed below. On the piglit side, only two tests fail, and both are related to missing image support.

Note that this series is built on top of "nvc0: avoid using magic numbers for the uniform_bo offsets".

Please review,
Thanks!

Samuel Pitoiset (11):
  nvc0: use a different offset for buffers and surfaces
  nvc0: bind driver cb for compute on c7[] for Kepler
  nvc0: bind shader buffers for compute on Kepler
  nvc0: bind constant buffers for compute on Kepler
  nvc0: allow to use more than 7 UBOs for compute on Kepler
  nvc0: bump the number of available UBOs for compute on Kepler
  nvc0: reduce likelihood of collision for real buffers on Kepler
  nvc0: add indirect compute support on Kepler
  nvc0/ir: fix wrong pred emission for ld lock on GK104
  nv50/ir: add atomics support on shared memory for Kepler
  nvc0: enable compute shaders on Kepler (GK104)

 .../drivers/nouveau/codegen/nv50_ir_driver.h       |   2 +
 .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  |   5 +-
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp      | 216 ++++++++++++++--
 .../nouveau/codegen/nv50_ir_lowering_nvc0.h        |  13 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h    |  17 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c    |  18 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c     |   4 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.h     |   1 -
 src/gallium/drivers/nouveau/nvc0/nve4_compute.c    | 282 +++++++++++++++++----
 src/gallium/drivers/nouveau/nvc0/nve4_compute.h    |  27 +-
 10 files changed, 468 insertions(+), 117 deletions(-)

-- 
2.7.1

deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec4: fail

The precision failures above are very likely related to sqrt. Same as on Fermi.

deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getboolean: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getfloat: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getinteger: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getinteger64: fail
deqp-gles31/functional/compute/basic/copy_image_to_ssbo_large: crash
deqp-gles31/functional/compute/basic/copy_image_to_ssbo_small: crash
deqp-gles31/functional/compute/basic/copy_ssbo_to_image_large: crash
deqp-gles31/functional/compute/basic/copy_ssbo_to_image_small: crash
deqp-gles31/functional/compute/basic/image_barrier_multiple: crash
deqp-gles31/functional/compute/basic/image_barrier_single: crash

No image support.

deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/isampler2darray: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/sampler2darrayshadow: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/sampler2dshadow: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/sampler3d: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/samplercubeshadow: fail

I don't know exactly what happens with these tests, because all the other sampler tests pass. My assumption is that these failures are related to the use of float instead of vec4. I'm not sure whether this is related to compute support, though. This needs investigation.

deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1000x1000_drawcount_5000: crash
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1000x1000_drawcount_5000: crash

We submit too fast and the kernel kills the pushbuf. The same tests fail on Fermi, so this can be fixed later. Unrelated to compute support.

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev