[Mesa-dev] [Bug 87886] constant fps drops with Intel and Radeon on Source games

2015-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=87886

--- Comment #13 from Michel Dänzer  ---
Does it still happen with the Radeon card with a 3.19-rc kernel?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/gen6: Fix crash with VS+TF after rendering with GS

2015-01-07 Thread Iago Toral Quiroga
Rendering with a GS and then using transform feedback with a program that does
not have a GS can crash in gen6. The reason for this is that
brw_begin_transform_feedback checks brw->geometry_program to decide if there
is a GS program, but this is not correct: brw->geometry_program is updated when
issuing drawing commands, so after rendering with a GS it will be non-NULL
until we draw again with a program that does not have a GS. If the next
program uses TF, we will call glBeginTransformFeedback before issuing
the drawing command and hence brw->geometry_program will be non-NULL if
the previous rendering used a GS. The right thing to do here is to check
ctx->_Shader->CurrentProgram[MESA_SHADER_GEOMETRY] instead. This is what the
gen7 code path does too.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=87694
---
 src/mesa/drivers/dri/i965/gen6_sol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index 0dafd0f..ff93de3 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -241,7 +241,7 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
 
    assert(brw->gen == 6);
 
-   if (brw->geometry_program) {
+   if (ctx->_Shader->CurrentProgram[MESA_SHADER_GEOMETRY]) {
       /* BRW_NEW_GEOMETRY_PROGRAM */
       shaderprog =
          ctx->_Shader->CurrentProgram[MESA_SHADER_GEOMETRY];
-- 
1.9.1
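
The stale-state problem described in the commit message can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in (struct api_state, struct driver_state, and all function names are invented for illustration and are not the i965 code): one pointer is updated when a program is bound, the other only when a draw is issued, and the two checks disagree between a bind and the next draw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GL context and the driver context. */
struct program { int id; };

struct api_state {
   const struct program *current_gs;   /* updated as soon as a program is bound */
};

struct driver_state {
   const struct program *cached_gs;    /* refreshed only when a draw is issued */
};

/* Binding a program changes API state only. */
static void bind_geometry_program(struct api_state *api, const struct program *gs)
{
   api->current_gs = gs;
}

/* Drawing copies the bound program into the driver's cached pointer. */
static void draw(const struct api_state *api, struct driver_state *drv)
{
   drv->cached_gs = api->current_gs;
}

/* Buggy check: consults state that may be left over from the previous draw. */
static bool begin_tf_sees_gs_stale(const struct driver_state *drv)
{
   return drv->cached_gs != NULL;
}

/* Fixed check: consults the program that is actually bound right now. */
static bool begin_tf_sees_gs_current(const struct api_state *api)
{
   return api->current_gs != NULL;
}
```

After binding a GS, drawing, and then unbinding the GS, the stale check still reports a GS while the current check does not. The two agree again only once another draw has been issued.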



Re: [Mesa-dev] [PATCH] egl: dri2: Use present extension. (Was: Re: [RFC] egl: Add DRI3 support to the EGL backend.)

2015-01-07 Thread Joonas Lahtinen
This is still awaiting comments.

I'd rather hear which modifications are wanted than try to guess.

On Mon, 2014-11-10 at 15:18 +0200, Joonas Lahtinen wrote:
> Hi,
> 
> On Fri, 2014-11-07 at 17:40 -0800, Eric Anholt wrote:
> > Ian Romanick  writes:
> > 
> > > On 11/06/2014 06:16 PM, Michel Dänzer wrote:
> > >> On 06.11.2014 19:18, Joonas Lahtinen wrote:
> > >>> On Thu, 2014-11-06 at 18:12 +0900, Michel Dänzer wrote:
> > >>>> On 05.11.2014 20:14, Joonas Lahtinen wrote:
> > >
> > > Modified not refer to DRI3, just uses the present extension to get rid
> > > of the excess buffer invalidations.
> > 
> > >>>> AFAICT there's no fallback from your changes to the current behaviour if
> > >>>> the X server doesn't support the Present extension. There probably needs
> > >>>> to be such a fallback.
> > >>>
> > >>> It gets rid of such nasty hack (the intel_viewport one), that I thought
> > >>> there is no point making fallback. Because without this, the egl dri2
> > >>> backend is fundamentally broken anyway.
> > >> 
> > >> Well, AFAICT your code uses Present extension functionality
> > >> unconditionally, without checking that the X server supports Present. I
> > >> can't see how that could possibly work on an X server which doesn't
> > >> support Present, but I think it would be better to keep it working at
> > >> least as badly as it does now in that case. :)
> > >
> > > I was going to say pretty much the same thing.  Aren't there (non-Intel)
> > > drivers that don't do Present?  If I'm not mistaken, some parts of DRI3
> > > (not sure about Present) are even disabled in the Intel driver when SNA
> > > is in use... or at least that was the case at one point.
> > 
> > They actually get a fallback implementation if there's no driver
> > support, which would be sufficient for this code.
> > 
> > However, Present is too new for Mesa to be unconditionally relying on in
> > my opinion.
> 
> Based on above discussion, I would bring back the dynamic detection like
> in the original patch. But for present extension instead of DRI3.
> Technically it would be very much the same, different naming
> conventions. And also, re-use the USE_INVALIDATE extension instead of
> adding DRI3 extension.
> 
> Would that be an acceptable solution?
> 
> Regards, Joonas
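
The detect-then-fall-back approach discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: struct server_caps, enum swap_path and choose_swap_path are hypothetical names, and the probe itself is left out (a real backend would have to query the X server, for example with an extension query such as xcb_get_extension_data()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of what the X server advertises. */
struct server_caps {
   bool has_present;
};

enum swap_path {
   SWAP_PATH_DRI2_LEGACY,   /* current behaviour, kept as the fallback */
   SWAP_PATH_PRESENT        /* new path that relies on the Present extension */
};

/* Decide once, at display initialization, which path to use.
 * A missing capability record is treated the same as "no Present". */
static enum swap_path choose_swap_path(const struct server_caps *caps)
{
   if (caps != NULL && caps->has_present)
      return SWAP_PATH_PRESENT;
   return SWAP_PATH_DRI2_LEGACY;
}
```

Keeping the decision in one place means the rest of the backend never calls Present functionality unconditionally, which is the concern raised by the reviewers.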


[Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Marek Olšák
From: Marek Olšák 

set_shader_resources is unused.

set_shader_buffers should support shader atomic counter buffers and shader
storage buffers from OpenGL.

The plan is to use slots 0..15 for atomic counters and slots 16..31
for storage buffers. Atomic counters are planned to be supported first.

This doesn't add any interface for images. The documentation is added
for future reference.
---

This is the interface only. I don't plan to do anything else for now.
Comments welcome.

 src/gallium/docs/source/context.rst             | 16
 src/gallium/docs/source/screen.rst              |  4 ++--
 src/gallium/drivers/galahad/glhd_context.c      |  2 +-
 src/gallium/drivers/ilo/ilo_state.c             |  2 +-
 src/gallium/drivers/nouveau/nouveau_buffer.c    |  2 +-
 src/gallium/drivers/nouveau/nouveau_screen.c    |  2 +-
 src/gallium/drivers/nouveau/nv50/nv50_formats.c |  2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c   |  2 +-
 src/gallium/include/pipe/p_context.h            | 20 +++-
 src/gallium/include/pipe/p_defines.h            |  2 +-
 src/gallium/include/pipe/p_state.h              | 10 ++
 11 files changed, 38 insertions(+), 26 deletions(-)

diff --git a/src/gallium/docs/source/context.rst b/src/gallium/docs/source/context.rst
index 5861f46..73fd35f 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -126,14 +126,14 @@ from a shader without an associated sampler.  This means that they
 have no support for floating point coordinates, address wrap modes or
 filtering.
 
-Shader resources are specified for all the shader stages at once using
-the ``set_shader_resources`` method.  When binding texture resources,
-the ``level``, ``first_layer`` and ``last_layer`` pipe_surface fields
-specify the mipmap level and the range of layers the texture will be
-constrained to.  In the case of buffers, ``first_element`` and
-``last_element`` specify the range within the buffer that will be used
-by the shader resource.  Writes to a shader resource are only allowed
-when the ``writable`` flag is set.
+There are 2 types of shader resources: buffers and images.
+
+Buffers are specified using the ``set_shader_buffers`` method.
+
+Images are specified using the ``set_shader_images`` method. When binding
+images, the ``level``, ``first_layer`` and ``last_layer`` pipe_image_view
+fields specify the mipmap level and the range of layers the image will be
+constrained to.
 
 Surfaces
 
diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 55d114c..c81ad66 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -403,8 +403,8 @@ resources might be created and handled quite differently.
   process.
 * ``PIPE_BIND_GLOBAL``: A buffer that can be mapped into the global
   address space of a compute program.
-* ``PIPE_BIND_SHADER_RESOURCE``: A buffer or texture that can be
-  bound to the graphics pipeline as a shader resource.
+* ``PIPE_BIND_SHADER_BUFFER``: A buffer that can be bound to a shader where
+  it should support reads, writes, and atomics.
 * ``PIPE_BIND_COMPUTE_RESOURCE``: A buffer or texture that can be
   bound to the compute program as a shader resource.
 * ``PIPE_BIND_COMMAND_ARGS_BUFFER``: A buffer that may be sourced by the
diff --git a/src/gallium/drivers/galahad/glhd_context.c b/src/gallium/drivers/galahad/glhd_context.c
index 37ea170..383d76c 100644
--- a/src/gallium/drivers/galahad/glhd_context.c
+++ b/src/gallium/drivers/galahad/glhd_context.c
@@ -1017,7 +1017,7 @@ galahad_context_create(struct pipe_screen *_screen, struct pipe_context *pipe)
    GLHD_PIPE_INIT(set_scissor_states);
    GLHD_PIPE_INIT(set_viewport_states);
    GLHD_PIPE_INIT(set_sampler_views);
-   //GLHD_PIPE_INIT(set_shader_resources);
+   //GLHD_PIPE_INIT(set_shader_buffers);
    GLHD_PIPE_INIT(set_vertex_buffers);
    GLHD_PIPE_INIT(set_index_buffer);
    GLHD_PIPE_INIT(create_stream_output_target);
diff --git a/src/gallium/drivers/ilo/ilo_state.c b/src/gallium/drivers/ilo/ilo_state.c
index b852f9f..09209ec 100644
--- a/src/gallium/drivers/ilo/ilo_state.c
+++ b/src/gallium/drivers/ilo/ilo_state.c
@@ -1267,7 +1267,7 @@ ilo_init_state_functions(struct ilo_context *ilo)
    ilo->base.set_scissor_states = ilo_set_scissor_states;
    ilo->base.set_viewport_states = ilo_set_viewport_states;
    ilo->base.set_sampler_views = ilo_set_sampler_views;
-   ilo->base.set_shader_resources = ilo_set_shader_resources;
+   //ilo->base.set_shader_resources = ilo_set_shader_resources;
    ilo->base.set_vertex_buffers = ilo_set_vertex_buffers;
    ilo->base.set_index_buffer = ilo_set_index_buffer;
 
diff --git a/src/gallium/drivers/nouveau/nouveau_buffer.c b/src/gallium/drivers/nouveau/nouveau_buffer.c
index 49ff100..722c516 100644
--- a/src/gallium/drivers/nouveau/nouveau_buffer.c
+++ b/src/gallium/drivers/nouveau/nouveau_buffer.c
@@ -44,7 +44,7 @@ nouveau_buffer_allocate(struct nouveau_s
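
The slot plan from the commit message above (slots 0..15 for atomic counters, slots 16..31 for storage buffers) can be sketched as a small mapping function. The enum, macro and function names here are hypothetical and chosen only for illustration. They are not part of the proposed Gallium interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Slot ranges taken from the commit message; all names are hypothetical. */
#define ATOMIC_COUNTER_FIRST_SLOT  0u
#define STORAGE_BUFFER_FIRST_SLOT 16u
#define SLOTS_PER_KIND            16u

enum buffer_kind {
   BUFFER_KIND_ATOMIC_COUNTER,
   BUFFER_KIND_STORAGE
};

/* Map a per-kind binding index (0..15) to a shader buffer slot (0..31).
 * Returns false, leaving *slot untouched, when the index is out of range. */
static bool shader_buffer_slot(enum buffer_kind kind, unsigned index,
                               unsigned *slot)
{
   unsigned first;

   if (index >= SLOTS_PER_KIND)
      return false;

   first = (kind == BUFFER_KIND_STORAGE) ? STORAGE_BUFFER_FIRST_SLOT
                                         : ATOMIC_COUNTER_FIRST_SLOT;
   *slot = first + index;
   return true;
}
```

With this layout a state tracker could pass both kinds of buffers through a single binding call and let the slot number alone distinguish them, which appears to be the intent of the plan.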

Re: [Mesa-dev] [PATCH 04/13] radeonsi: enable LLVM optimizations that assume no NaNs for non-compute shaders

2015-01-07 Thread Marek Olšák
How about the attached patch?

Marek

On Wed, Jan 7, 2015 at 1:23 AM, Tom Stellard  wrote:
> On Wed, Jan 07, 2015 at 01:13:37AM +0100, Marek Olšák wrote:
>> Neither. It's because we use DX10_CLAMP, which converts NaNs to 0.
>>
>
> Ok, could we add a dx10_clamp bit to si_shader and make this attribute
> conditional on that bit.  I'm concerned someone may remove DX10_CLAMP
> and forget to also remove this attribute.
>
> -Tom
>
>> Marek
>>
>> On Wed, Jan 7, 2015 at 12:51 AM, Tom Stellard  wrote:
>> > On Mon, Jan 05, 2015 at 12:18:43AM +0100, Marek Olšák wrote:
>> >> From: Marek Olšák 
>> >>
>> >> ---
>> >>  src/gallium/drivers/radeon/radeon_llvm_emit.c | 1 +
>> >>  1 file changed, 1 insertion(+)
>> >>
>> >> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/radeon_llvm_emit.c
>> >> index dc871d7..e3be72c 100644
>> >> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
>> >> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
>> >> @@ -83,6 +83,7 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
>> >>
>> >>   if (type != TGSI_PROCESSOR_COMPUTE) {
>> >>   LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
>> >> + LLVMAddTargetDependentFunctionAttr(F, "enable-no-nans-fp-math", "true");
>> >
>> > Is this required by the OpenGL spec or is it just to fix broken/old
>> > games?
>> >
>> > -Tom
>> >
>> >>   }
>> >>  }
>> >>
>> >> --
>> >> 2.1.0
>> >>
From d960b773f3bc99928b4aab5c4344aea671595849 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Sun, 4 Jan 2015 17:08:57 +0100
Subject: [PATCH] radeonsi: enable LLVM optimizations that assume no NaNs for
 non-compute shaders

v2: complete rewrite
---
 src/gallium/drivers/radeonsi/si_shader.c        | 7 +++
 src/gallium/drivers/radeonsi/si_shader.h        | 1 +
 src/gallium/drivers/radeonsi/si_state_shaders.c | 8
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 5d61a54..cf28860 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2369,6 +2369,10 @@ static void create_function(struct si_shader_context *si_shader_ctx)
 	radeon_llvm_create_func(&si_shader_ctx->radeon_bld, params, num_params);
 	radeon_llvm_shader_type(si_shader_ctx->radeon_bld.main_fn, si_shader_ctx->type);
 
+	if (shader->dx10_clamp_mode)
+		LLVMAddTargetDependentFunctionAttr(si_shader_ctx->radeon_bld.main_fn,
+		   "enable-no-nans-fp-math", "true");
+
 	for (i = 0; i <= last_sgpr; ++i) {
 		LLVMValueRef P = LLVMGetParam(si_shader_ctx->radeon_bld.main_fn, i);
 
@@ -2723,6 +2727,9 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
 	radeon_llvm_context_init(&si_shader_ctx.radeon_bld);
 	bld_base = &si_shader_ctx.radeon_bld.soa.bld_base;
 
+	if (sel->type != PIPE_SHADER_COMPUTE)
+		shader->dx10_clamp_mode = true;
+
 	if (sel->info.uses_kill)
 		shader->db_shader_control |= S_02880C_KILL_ENABLE(1);
 
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index 21692f0..08e344a 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -160,6 +160,7 @@ struct si_shader {
 	bool			uses_instanceid;
 	unsigned		nr_pos_exports;
 	bool			is_gs_copy_shader;
+	bool			dx10_clamp_mode; /* convert NaNs to 0 */
 };
 
 static inline struct tgsi_shader_info *si_get_vs_info(struct si_context *sctx)
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
index e51d50e..817a990 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -65,7 +65,7 @@ static void si_shader_es(struct si_shader *shader)
 		   S_00B328_VGPRS((shader->num_vgprs - 1) / 4) |
 		   S_00B328_SGPRS((num_sgprs - 1) / 8) |
 		   S_00B328_VGPR_COMP_CNT(vgpr_comp_cnt) |
-		   S_00B328_DX10_CLAMP(1));
+		   S_00B328_DX10_CLAMP(shader->dx10_clamp_mode));
 	si_pm4_set_reg(pm4, R_00B32C_SPI_SHADER_PGM_RSRC2_ES,
 		   S_00B32C_USER_SGPR(num_user_sgprs));
 }
@@ -134,7 +134,7 @@ static void si_shader_gs(struct si_shader *shader)
 	si_pm4_set_reg(pm4, R_00B228_SPI_SHADER_PGM_RSRC1_GS,
 		   S_00B228_VGPRS((shader->num_vgprs - 1) / 4) |
 		   S_00B228_SGPRS((num_sgprs - 1) / 8) |
-		   S_00B228_DX10_CLAMP(1));
+		   S_00B228_DX10_CLAMP(shader->dx10_clamp_mode));
 	si_pm4_set_reg(pm4, R_00B22C_SPI_SHADER_PGM_RSRC2_GS,
 		   S_00B22C_USER_SGPR(num_user_sgprs));
 }
@@ -209,7 +209,7 @@ static void si_shader_vs(struct si_shader *shader)
 		   S_00B128_VGPRS((shader->num_vgprs - 1) / 4) |
 		   S_00B128_SG
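
The idea behind the v2 patch, as requested in the review, is that the hardware clamp bit and the "no NaNs" compiler assumption are both derived from one flag so they cannot drift apart. The sketch below illustrates that coupling with hypothetical names (struct shader_flags, hw_clamp_bit, may_assume_no_nans). It also includes a plain software model of a clamp that maps NaN to 0, which is how the thread describes DX10_CLAMP. The [0, 1] range is an assumption, and the real hardware behaviour is not modeled.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical mirror of the flag the patch adds to struct si_shader. */
struct shader_flags {
   bool dx10_clamp_mode;   /* hardware clamp converts NaNs to 0 */
};

/* Value to program into the DX10_CLAMP register field. */
static unsigned hw_clamp_bit(const struct shader_flags *s)
{
   return s->dx10_clamp_mode ? 1u : 0u;
}

/* Whether the compiler may be told that NaNs do not matter.
 * Reads the same flag as hw_clamp_bit, so the two stay in sync. */
static bool may_assume_no_nans(const struct shader_flags *s)
{
   return s->dx10_clamp_mode;
}

/* Software model of a clamp to [0, 1] that maps NaN to 0. */
static float clamp01_nan_to_zero(float x)
{
   if (x != x)          /* only NaN compares unequal to itself */
      return 0.0f;
   if (x < 0.0f)
      return 0.0f;
   if (x > 1.0f)
      return 1.0f;
   return x;
}
```

If someone later clears the flag for a shader type, both the register field and the optimization hint change together, which addresses the concern about removing one and forgetting the other.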

Re: [Mesa-dev] [PATCH] egl: dri2: Use present extension. (Was: Re: [RFC] egl: Add DRI3 support to the EGL backend.)

2015-01-07 Thread Axel Davy

On 07/01/2015 10:24, Joonas Lahtinen wrote:

> This is still awaiting comments.
>
> I'd rather hear which modifications are wanted than try to guess.



Well, ideally you would implement DRI3/Present support instead of 
complementing DRI2 support with Present.

Why improve the old solution instead of switching to the new one?

I know DRI3 support is being held back by a few bugs in the stack, 
but if what you want is just a small improvement to reduce overhead, 
then I would think the answer is rather to implement that feature 
with DRI3.


Axel Davy


[Mesa-dev] [Bug 87886] constant fps drops with Intel and Radeon on Source games

2015-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=87886

--- Comment #14 from Gustaw Smolarczyk  ---
If you are using a 3.18 kernel, you could also try the previous one (3.17.x). I
have a similar problem on radeon (though it's TAHITI, so radeonsi) and found
that it is a kernel regression.

In my case, I can easily reproduce it by playing Minecraft: after loading a
world, there will always be a series of 1-3s pauses in the first minute.

https://bugzilla.kernel.org/show_bug.cgi?id=90741



Re: [Mesa-dev] [PATCH v3 14/24] mesa: Autogenerate most of format_pack.c

2015-01-07 Thread Samuel Iglesias Gonsálvez
On Monday, December 15, 2014 12:30:27 PM Samuel Iglesias Gonsálvez wrote:
> On Thursday, December 11, 2014 11:55:10 AM Jason Ekstrand wrote:
> > On Tue, Dec 9, 2014 at 4:06 AM, Iago Toral Quiroga 
> > wrote:
> > 
> > [snip]
> > 
> > > new file mode 100644
> > > index 000..5f6809e
> > > --- /dev/null
> > > +++ b/src/mesa/main/format_pack.py
> > > @@ -0,0 +1,907 @@
> > > +#!/usr/bin/env python
> > > +
> > > +from mako.template import Template
> > > +from sys import argv
> > > +
> > > +string = """/*
> > > + * Mesa 3-D graphics library
> > > + *
> > > + * Copyright (c) 2011 VMware, Inc.
> > > + * Copyright (c) 2014 Intel Corporation.
> > > + *
> > > > + * Permission is hereby granted, free of charge, to any person obtaining a
> > > > + * copy of this software and associated documentation files (the "Software"),
> > > > + * to deal in the Software without restriction, including without limitation
> > > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > > > + * and/or sell copies of the Software, and to permit persons to whom the
> > > > + * Software is furnished to do so, subject to the following conditions:
> > > > + *
> > > > + * The above copyright notice and this permission notice shall be included
> > > > + * in all copies or substantial portions of the Software.
> > > > + *
> > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> > > > + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > > > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > > > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > > > + * OTHER DEALINGS IN THE SOFTWARE.
> > > > + */
> > > +
> > > +
> > > +/**
> > > + * Color, depth, stencil packing functions.
> > > + * Used to pack basic color, depth and stencil formats to specific
> > > + * hardware formats.
> > > + *
> > > + * There are both per-pixel and per-row packing functions:
> > > > + * - The former will be used by swrast to write values to the color, depth,
> > > > + *   stencil buffers when drawing points, lines and masked spans.
> > > > + * - The latter will be used for image-oriented functions like glDrawPixels,
> > > > + *   glAccum, and glTexImage.
> > > + */
> > > +
> > > +#include 
> > > +
> > > +#include "colormac.h"
> > > +#include "format_pack.h"
> > > +#include "format_utils.h"
> > > +#include "macros.h"
> > > +#include "../../gallium/auxiliary/util/u_format_rgb9e5.h"
> > > +#include "../../gallium/auxiliary/util/u_format_r11g11b10f.h"
> > > +#include "util/format_srgb.h"
> > > +
> > > > +#define UNPACK(SRC, OFFSET, BITS) (((SRC) >> (OFFSET)) & MAX_UINT(BITS))
> > > +#define PACK(SRC, OFFSET, BITS) (((SRC) & MAX_UINT(BITS)) << (OFFSET))
> > > +
> > > +<%
> > > +import format_parser as parser
> > > +
> > > +formats = parser.parse(argv[1])
> > > +
> > > +rgb_formats = []
> > > +for f in formats:
> > > +   if f.name == 'MESA_FORMAT_NONE':
> > > +  continue
> > > +   if f.colorspace not in ('rgb', 'srgb'):
> > > +  continue
> > > +
> > > +   rgb_formats.append(f)
> > > +%>
> > > +
> > > +/* ubyte packing functions */
> > > +
> > > +%for f in rgb_formats:
> > > > +   %if f.name in ('MESA_FORMAT_R9G9B9E5_FLOAT', 'MESA_FORMAT_R11G11B10_FLOAT'):
> > > +  <% continue %>
> > > +   %elif f.is_compressed():
> > > +  <% continue %>
> > > +   %endif
> > > +
> > > +static inline void
> > > +pack_ubyte_${f.short_name()}(const GLubyte src[4], void *dst)
> > > +{
> > > +   %for (i, c) in enumerate(f.channels):
> > > +  <% i = f.swizzle.inverse()[i] %>
> > > +  %if c.type == 'x':
> > > + <% continue %>
> > > +  %endif
> > > +
> > > +  ${c.datatype()} ${c.name} =
> > > +  %if c.type == parser.UNSIGNED:
> > > + %if f.colorspace == 'srgb' and c.name in 'rgb':
> > > +util_format_linear_to_srgb_8unorm(src[${i}]);
> > > + %else:
> > > +_mesa_unorm_to_unorm(src[${i}], 8, ${c.size});
> > > + %endif
> > > +  %elif c.type == parser.SIGNED:
> > > + _mesa_unorm_to_snorm(src[${i}], 8, ${c.size});
> > > +  %elif c.type == parser.FLOAT:
> > > + %if c.size == 32:
> > > +_mesa_unorm_to_float(src[${i}], 8);
> > > + %elif c.size == 16:
> > > +_mesa_unorm_to_half(src[${i}], 8);
> > > + %else:
> > > +<% assert False %>
> > > + %endif
> > > +  %else:
> > > + <% assert False %>
> > > +  %endif
> > > +   %endfor
> > > +
> > > +   %if f.layout == parser.ARRAY:
> > > +  ${f.datatype()} *d = (${f.datatype()} *)dst;
> > > +  %for (i, c) in enumerate(f.channels):
> > > + %if c.type == 'x':
> > > +<% continue %>
> > > + %endif
> > > +   

Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 5:56 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> set_shader_resources is unused.
>
> set_shader_buffers should support shader atomic counter buffers and shader
> storage buffers from OpenGL.
>
> The plan is to use slots 0..15 for atomic counters and slots 16..31
> for storage buffers. Atomic counters are planned to be supported first.
>
> This doesn't add any interface for images. The documentation is added
> for future reference.
> ---
>
> This is the interface only. I don't plan to do anything else for now.
> Comments welcome.

Can you clarify how this is better than the set_shader_resources
interface, which can also be shared for images (which will need to
support texture buffers...)?

FWIW, there's already an impl for nve4 images using
set_shader_resources (not sure how Christoph had tested it, I think
using some preliminary OpenCL C -> TGSI converter with image support).

Are these buffers fundamentally different than images? We'll still
need atomic support for images as well...

Also how do you anticipate this will be integrated into TGSI? Right
now there's a TGSI_FILE_RESOURCE -- will there be a new
TGSI_FILE_BUFFER and TGSI_FILE_IMAGE?


Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Aditya Avinash
Hi,
Sounds great, but do you think a separate buffer pipe is required for this?
Changing the constant buffer to a generic buffer (with alu+load+store) could help.

What about R600? Do we have to add

r600_init_atom(rctx, &rctx->shaderbuf_state[PIPE_SHADER_VERTEX].atom, id++,
               r600_emit_vs_shader_buffers, 0);

to the backend? Will this be specific to atomics?

Thank you!

On Wed, Jan 7, 2015 at 4:56 AM, Marek Olšák  wrote:

> From: Marek Olšák 
>
> set_shader_resources is unused.
>
> set_shader_buffers should support shader atomic counter buffers and shader
> storage buffers from OpenGL.
>
> The plan is to use slots 0..15 for atomic counters and slots 16..31
> for storage buffers. Atomic counters are planned to be supported first.
>
> This doesn't add any interface for images. The documentation is added
> for future reference.
> ---
>
> This is the interface only. I don't plan to do anything else for now.
> Comments welcome.

Re: [Mesa-dev] [PATCH 04/13] radeonsi: enable LLVM optimizations that assume no NaNs for non-compute shaders

2015-01-07 Thread Alex Deucher
On Wed, Jan 7, 2015 at 6:58 AM, Marek Olšák  wrote:
> How about the attached patch?

Reviewed-by: Alex Deucher 

>
> Marek
>
> On Wed, Jan 7, 2015 at 1:23 AM, Tom Stellard  wrote:
>> On Wed, Jan 07, 2015 at 01:13:37AM +0100, Marek Olšák wrote:
>>> Neither. It's because we use DX10_CLAMP, which converts NaNs to 0.
>>>
>>
>> Ok, could we add a dx10_clamp bit to si_shader and make this attribute
>> conditional on that bit.  I'm concerned someone may remove DX10_CLAMP
>> and forget to also remove this attribute.
>>
>> -Tom
>>
>>> Marek
>>>
>>> On Wed, Jan 7, 2015 at 12:51 AM, Tom Stellard  wrote:
>>> > On Mon, Jan 05, 2015 at 12:18:43AM +0100, Marek Olšák wrote:
>>> >> From: Marek Olšák 
>>> >>
>>> >> ---
>>> >>  src/gallium/drivers/radeon/radeon_llvm_emit.c | 1 +
>>> >>  1 file changed, 1 insertion(+)
>>> >>
>>> >> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/radeon_llvm_emit.c
>>> >> index dc871d7..e3be72c 100644
>>> >> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
>>> >> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
>>> >> @@ -83,6 +83,7 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
>>> >>
>>> >>   if (type != TGSI_PROCESSOR_COMPUTE) {
>>> >>   LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
>>> >> + LLVMAddTargetDependentFunctionAttr(F, "enable-no-nans-fp-math", "true");
>>> >
>>> > Is this required by the OpenGL spec or is it just to fix broken/old
>>> > games?
>>> >
>>> > -Tom
>>> >
>>> >>   }
>>> >>  }
>>> >>
>>> >> --
>>> >> 2.1.0
>>> >>
>>> >> ___
>>> >> mesa-dev mailing list
>>> >> mesa-dev@lists.freedesktop.org
>>> >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Marek Olšák
On Wed, Jan 7, 2015 at 2:41 PM, Ilia Mirkin  wrote:
> On Wed, Jan 7, 2015 at 5:56 AM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> set_shader_resources is unused.
>>
>> set_shader_buffers should support shader atomic counter buffers and shader
>> storage buffers from OpenGL.
>>
>> The plan is to use slots 0..15 for atomic counters and slots 16..31
>> for storage buffers. Atomic counters are planned to be supported first.
>>
>> This doesn't add any interface for images. The documentation is added
>> for future reference.
>> ---
>>
>> This is the interface only. I don't plan to do anything else for now.
>> Comments welcome.
>
> Can you clarify how this is better than the set_shader_resources
> interface, which can also be shared for images (which will need to
> support texture buffers...)?

1) You don't need to create any views for these. Creating,
initializing, referencing, and destroying views is work that should be
avoided if it's unnecessary.

2) It saves space for resource descriptions on SI (both memory and
cache). A buffer slot needs 4 dwords, but a texture (image) slot needs
8 dwords.

Original DX11 AMD hardware (Evergreen) will have to merge
set_shader_buffers, set_shader_images, and set_framebuffer_state
anyway. One less function won't make it much easier. Post-DX11
hardware (SI) can do pretty much anything, but this solution is more
efficient for that hardware.
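
To put point 2 in numbers, here is a small C sketch of the descriptor
footprint. It assumes the 32 slots from the proposal (0..15 for atomic
counters, 16..31 for storage buffers) and 4-byte dwords. The constant and
function names are made up for illustration and are not radeonsi code.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in this thread for SI; the names are illustrative only. */
enum {
   DWORD_BYTES        = 4,
   BUFFER_SLOT_DWORDS = 4,   /* untyped buffer descriptor */
   IMAGE_SLOT_DWORDS  = 8,   /* texture (image) descriptor */
   NUM_SHADER_BUFFERS = 32   /* slots 0..15 atomics, 16..31 storage */
};

/* Bytes needed for `slots` descriptors of `dwords_per_slot` dwords each. */
static size_t
descriptor_bytes(unsigned slots, unsigned dwords_per_slot)
{
   assert(dwords_per_slot > 0);
   return (size_t)slots * dwords_per_slot * DWORD_BYTES;
}
```

With these numbers, 32 buffer slots take 512 bytes of descriptors, against
1024 bytes if every slot had to be an image descriptor.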

>
> FWIW, there's already an impl for nve4 images using
> set_shader_resources (not sure how Christoph had tested it, I think
> using some preliminary OpenCL C -> TGSI converter with image support).
>
> Are these buffers fundamentally different than images? We'll still
> need atomic support for images as well...

The main differences are:
- shader buffers don't have a view or a format; pipe_resources are set directly.
- shader images have a view and a format; this also includes buffers
that have a format.
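
As a rough sketch of that split, in C: the struct names and the
binds_as_image helper are hypothetical and are not the definitions from
this patch. The level, first_layer and last_layer fields come from the
documentation hunk; offset, size and the integer format are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

struct resource;   /* stand-in for struct pipe_resource */

/* Untyped buffer binding: the resource is set directly, no view, no format. */
struct sketch_shader_buffer {
   struct resource *buffer;
   unsigned offset;   /* assumed: byte offset into the buffer */
   unsigned size;     /* assumed: bound range in bytes */
};

/* Image binding: goes through a view that carries a format. */
struct sketch_image_view {
   struct resource *resource;
   int format;             /* stand-in for enum pipe_format */
   unsigned level;
   unsigned first_layer;
   unsigned last_layer;
};

/* A buffer that has a format is bound as an image, not as a shader buffer. */
static bool
binds_as_image(bool is_buffer, bool has_format)
{
   return !is_buffer || has_format;
}
```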

>
> Also how do you anticipate this will be integrated into TGSI? Right
> now there's a TGSI_FILE_RESOURCE -- will there be a new
> TGSI_FILE_BUFFER and TGSI_FILE_IMAGE?

Yes, this needs to be changed as well.

Opinions?

Marek


Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Aditya Avinash
On Wed, Jan 7, 2015 at 4:56 AM, Marek Olšák  wrote:

> From: Marek Olšák 
>
> set_shader_resources is unused.
>
> set_shader_buffers should support shader atomic counter buffers and shader
> storage buffers from OpenGL.
>
> The plan is to use slots 0..15 for atomic counters and slots 16..31
> for storage buffers. Atomic counters are planned to be supported first.
>
> This doesn't add any interface for images. The documentation is added
> for future reference.
> ---
>
> This is the interface only. I don't plan to do anything else for now.
> Comments welcome.
>
>  src/gallium/docs/source/context.rst | 16 
>  src/gallium/docs/source/screen.rst  |  4 ++--
>  src/gallium/drivers/galahad/glhd_context.c  |  2 +-
>  src/gallium/drivers/ilo/ilo_state.c |  2 +-
>  src/gallium/drivers/nouveau/nouveau_buffer.c|  2 +-
>  src/gallium/drivers/nouveau/nouveau_screen.c|  2 +-
>  src/gallium/drivers/nouveau/nv50/nv50_formats.c |  2 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   |  2 +-
>  src/gallium/include/pipe/p_context.h| 20 +++-
>  src/gallium/include/pipe/p_defines.h|  2 +-
>  src/gallium/include/pipe/p_state.h  | 10 ++
>  11 files changed, 38 insertions(+), 26 deletions(-)
>
> diff --git a/src/gallium/docs/source/context.rst
> b/src/gallium/docs/source/context.rst
> index 5861f46..73fd35f 100644
> --- a/src/gallium/docs/source/context.rst
> +++ b/src/gallium/docs/source/context.rst
> @@ -126,14 +126,14 @@ from a shader without an associated sampler.  This
> means that they
>  have no support for floating point coordinates, address wrap modes or
>  filtering.
>
> -Shader resources are specified for all the shader stages at once using
> -the ``set_shader_resources`` method.  When binding texture resources,
> -the ``level``, ``first_layer`` and ``last_layer`` pipe_surface fields
> -specify the mipmap level and the range of layers the texture will be
> -constrained to.  In the case of buffers, ``first_element`` and
> -``last_element`` specify the range within the buffer that will be used
> -by the shader resource.  Writes to a shader resource are only allowed
> -when the ``writable`` flag is set.
> +There are 2 types of shader resources: buffers and images.
> +
> +Buffers are specified using the ``set_shader_buffers`` method.
> +
> +Images are specified using the ``set_shader_images`` method. When binding
> +images, the ``level``, ``first_layer`` and ``last_layer`` pipe_image_view
> +fields specify the mipmap level and the range of layers the image will be
> +constrained to.
>
>  Surfaces
>  
>

set_shader_images is not defined in this patch.
Will it look similar to pipe_surface or pipe_sampler_view?


> diff --git a/src/gallium/docs/source/screen.rst
> b/src/gallium/docs/source/screen.rst
> index 55d114c..c81ad66 100644
> --- a/src/gallium/docs/source/screen.rst
> +++ b/src/gallium/docs/source/screen.rst
> @@ -403,8 +403,8 @@ resources might be created and handled quite
> differently.
>process.
>  * ``PIPE_BIND_GLOBAL``: A buffer that can be mapped into the global
>address space of a compute program.
> -* ``PIPE_BIND_SHADER_RESOURCE``: A buffer or texture that can be
> -  bound to the graphics pipeline as a shader resource.
> +* ``PIPE_BIND_SHADER_BUFFER``: A buffer that can be bound to a shader where
> +  it should support reads, writes, and atomics.
>  * ``PIPE_BIND_COMPUTE_RESOURCE``: A buffer or texture that can be
>bound to the compute program as a shader resource.
>  * ``PIPE_BIND_COMMAND_ARGS_BUFFER``: A buffer that may be sourced by the
> diff --git a/src/gallium/drivers/galahad/glhd_context.c
> b/src/gallium/drivers/galahad/glhd_context.c
> index 37ea170..383d76c 100644
> --- a/src/gallium/drivers/galahad/glhd_context.c
> +++ b/src/gallium/drivers/galahad/glhd_context.c
> @@ -1017,7 +1017,7 @@ galahad_context_create(struct pipe_screen *_screen,
> struct pipe_context *pipe)
> GLHD_PIPE_INIT(set_scissor_states);
> GLHD_PIPE_INIT(set_viewport_states);
> GLHD_PIPE_INIT(set_sampler_views);
> -   //GLHD_PIPE_INIT(set_shader_resources);
> +   //GLHD_PIPE_INIT(set_shader_buffers);
> GLHD_PIPE_INIT(set_vertex_buffers);
> GLHD_PIPE_INIT(set_index_buffer);
> GLHD_PIPE_INIT(create_stream_output_target);
> diff --git a/src/gallium/drivers/ilo/ilo_state.c
> b/src/gallium/drivers/ilo/ilo_state.c
> index b852f9f..09209ec 100644
> --- a/src/gallium/drivers/ilo/ilo_state.c
> +++ b/src/gallium/drivers/ilo/ilo_state.c
> @@ -1267,7 +1267,7 @@ ilo_init_state_functions(struct ilo_context *ilo)
> ilo->base.set_scissor_states = ilo_set_scissor_states;
> ilo->base.set_viewport_states = ilo_set_viewport_states;
> ilo->base.set_sampler_views = ilo_set_sampler_views;
> -   ilo->base.set_shader_resources = ilo_set_shader_resources;
> +   //ilo->base.set_shader_resources = ilo_set_shader_resources;
> ilo->base.set_vertex_buffers = ilo_set_vertex_buffe

Re: [Mesa-dev] [PATCH v3 14/24] mesa: Autogenerate most of format_pack.c

2015-01-07 Thread Jason Ekstrand
On Jan 7, 2015 4:45 AM, "Samuel Iglesias Gonsálvez" 
wrote:
>
> On Monday, December 15, 2014 12:30:27 PM Samuel Iglesias Gonsálvez wrote:
> > On Thursday, December 11, 2014 11:55:10 AM Jason Ekstrand wrote:
> > > On Tue, Dec 9, 2014 at 4:06 AM, Iago Toral Quiroga 
> > > wrote:
> > >
> > > [snip]
> > >
> > > > new file mode 100644
> > > > index 000..5f6809e
> > > > --- /dev/null
> > > > +++ b/src/mesa/main/format_pack.py
> > > > @@ -0,0 +1,907 @@
> > > > +#!/usr/bin/env python
> > > > +
> > > > +from mako.template import Template
> > > > +from sys import argv
> > > > +
> > > > +string = """/*
> > > > + * Mesa 3-D graphics library
> > > > + *
> > > > + * Copyright (c) 2011 VMware, Inc.
> > > > + * Copyright (c) 2014 Intel Corporation.
> > > > + *
> > > > + * Permission is hereby granted, free of charge, to any person obtaining a
> > > > + * copy of this software and associated documentation files (the "Software"),
> > > > + * to deal in the Software without restriction, including without limitation
> > > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > > > + * and/or sell copies of the Software, and to permit persons to whom the
> > > > + * Software is furnished to do so, subject to the following conditions:
> > > > + *
> > > > + * The above copyright notice and this permission notice shall be included
> > > > + * in all copies or substantial portions of the Software.
> > > > + *
> > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> > > > + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > > > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > > > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > > > + * OTHER DEALINGS IN THE SOFTWARE.
> > > > + */
> > > > +
> > > > +
> > > > +/**
> > > > + * Color, depth, stencil packing functions.
> > > > + * Used to pack basic color, depth and stencil formats to specific
> > > > + * hardware formats.
> > > > + *
> > > > + * There are both per-pixel and per-row packing functions:
> > > > + * - The former will be used by swrast to write values to the color, depth,
> > > > + *   stencil buffers when drawing points, lines and masked spans.
> > > > + * - The later will be used for image-oriented functions like glDrawPixels,
> > > > + *   glAccum, and glTexImage.
> > > > + */
> > > > +
> > > > +#include 
> > > > +
> > > > +#include "colormac.h"
> > > > +#include "format_pack.h"
> > > > +#include "format_utils.h"
> > > > +#include "macros.h"
> > > > +#include "../../gallium/auxiliary/util/u_format_rgb9e5.h"
> > > > +#include "../../gallium/auxiliary/util/u_format_r11g11b10f.h"
> > > > +#include "util/format_srgb.h"
> > > > +
> > > > +#define UNPACK(SRC, OFFSET, BITS) (((SRC) >> (OFFSET)) & MAX_UINT(BITS))
> > > > +#define PACK(SRC, OFFSET, BITS) (((SRC) & MAX_UINT(BITS)) << (OFFSET))
> > > > +
> > > > +<%
> > > > +import format_parser as parser
> > > > +
> > > > +formats = parser.parse(argv[1])
> > > > +
> > > > +rgb_formats = []
> > > > +for f in formats:
> > > > +   if f.name == 'MESA_FORMAT_NONE':
> > > > +  continue
> > > > +   if f.colorspace not in ('rgb', 'srgb'):
> > > > +  continue
> > > > +
> > > > +   rgb_formats.append(f)
> > > > +%>
> > > > +
> > > > +/* ubyte packing functions */
> > > > +
> > > > +%for f in rgb_formats:
> > > > +   %if f.name in ('MESA_FORMAT_R9G9B9E5_FLOAT', 'MESA_FORMAT_R11G11B10_FLOAT'):
> > > > +  <% continue %>
> > > > +   %elif f.is_compressed():
> > > > +  <% continue %>
> > > > +   %endif
> > > > +
> > > > +static inline void
> > > > +pack_ubyte_${f.short_name()}(const GLubyte src[4], void *dst)
> > > > +{
> > > > +   %for (i, c) in enumerate(f.channels):
> > > > +  <% i = f.swizzle.inverse()[i] %>
> > > > +  %if c.type == 'x':
> > > > + <% continue %>
> > > > +  %endif
> > > > +
> > > > +  ${c.datatype()} ${c.name} =
> > > > +  %if c.type == parser.UNSIGNED:
> > > > + %if f.colorspace == 'srgb' and c.name in 'rgb':
> > > > +util_format_linear_to_srgb_8unorm(src[${i}]);
> > > > + %else:
> > > > +_mesa_unorm_to_unorm(src[${i}], 8, ${c.size});
> > > > + %endif
> > > > +  %elif c.type == parser.SIGNED:
> > > > + _mesa_unorm_to_snorm(src[${i}], 8, ${c.size});
> > > > +  %elif c.type == parser.FLOAT:
> > > > + %if c.size == 32:
> > > > +_mesa_unorm_to_float(src[${i}], 8);
> > > > + %elif c.size == 16:
> > > > +_mesa_unorm_to_half(src[${i}], 8);
> > > > + %else:
> > > > +<% assert False %>
> > > > + %endif
> > > > +  %else:
> 

Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Marek Olšák
On Wed, Jan 7, 2015 at 3:44 PM, Aditya Avinash  wrote:
>
>
> On Wed, Jan 7, 2015 at 4:56 AM, Marek Olšák  wrote:
>>
>> From: Marek Olšák 
>>
>> set_shader_resources is unused.
>>
>> set_shader_buffers should support shader atomic counter buffers and shader
>> storage buffers from OpenGL.
>>
>> The plan is to use slots 0..15 for atomic counters and slots 16..31
>> for storage buffers. Atomic counters are planned to be supported first.
>>
>> This doesn't add any interface for images. The documentation is added
>> for future reference.
>> ---
>>
>> This is the interface only. I don't plan to do anything else for now.
>> Comments welcome.
>>
>>  src/gallium/docs/source/context.rst | 16 
>>  src/gallium/docs/source/screen.rst  |  4 ++--
>>  src/gallium/drivers/galahad/glhd_context.c  |  2 +-
>>  src/gallium/drivers/ilo/ilo_state.c |  2 +-
>>  src/gallium/drivers/nouveau/nouveau_buffer.c|  2 +-
>>  src/gallium/drivers/nouveau/nouveau_screen.c|  2 +-
>>  src/gallium/drivers/nouveau/nv50/nv50_formats.c |  2 +-
>>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   |  2 +-
>>  src/gallium/include/pipe/p_context.h| 20 +++-
>>  src/gallium/include/pipe/p_defines.h|  2 +-
>>  src/gallium/include/pipe/p_state.h  | 10 ++
>>  11 files changed, 38 insertions(+), 26 deletions(-)
>>
>> diff --git a/src/gallium/docs/source/context.rst
>> b/src/gallium/docs/source/context.rst
>> index 5861f46..73fd35f 100644
>> --- a/src/gallium/docs/source/context.rst
>> +++ b/src/gallium/docs/source/context.rst
>> @@ -126,14 +126,14 @@ from a shader without an associated sampler.  This
>> means that they
>>  have no support for floating point coordinates, address wrap modes or
>>  filtering.
>>
>> -Shader resources are specified for all the shader stages at once using
>> -the ``set_shader_resources`` method.  When binding texture resources,
>> -the ``level``, ``first_layer`` and ``last_layer`` pipe_surface fields
>> -specify the mipmap level and the range of layers the texture will be
>> -constrained to.  In the case of buffers, ``first_element`` and
>> -``last_element`` specify the range within the buffer that will be used
>> -by the shader resource.  Writes to a shader resource are only allowed
>> -when the ``writable`` flag is set.
>> +There are 2 types of shader resources: buffers and images.
>> +
>> +Buffers are specified using the ``set_shader_buffers`` method.
>> +
>> +Images are specified using the ``set_shader_images`` method. When binding
>> +images, the ``level``, ``first_layer`` and ``last_layer`` pipe_image_view
>> +fields specify the mipmap level and the range of layers the image will be
>> +constrained to.
>>
>>  Surfaces
>>  
>
>
> set_shader_images are not defined in this patch.
> Will it look similar to pipe_surface or pipe_sampler_view?

There will be a separate view for images if this is approved.

Marek


Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Aditya Avinash
Oh. So, we get better performance if we use atomic counters as buffers
rather than textures (images) [manipulating views is expensive].

Am I right?

On Wed, Jan 7, 2015 at 8:52 AM, Marek Olšák  wrote:

> On Wed, Jan 7, 2015 at 3:44 PM, Aditya Avinash 
> wrote:
> >
> >
> > On Wed, Jan 7, 2015 at 4:56 AM, Marek Olšák  wrote:
> >>
> >> From: Marek Olšák 
> >>
> >> set_shader_resources is unused.
> >>
> >> set_shader_buffers should support shader atomic counter buffers and shader
> >> storage buffers from OpenGL.
> >>
> >> The plan is to use slots 0..15 for atomic counters and slots 16..31
> >> for storage buffers. Atomic counters are planned to be supported first.
> >>
> >> This doesn't add any interface for images. The documentation is added
> >> for future reference.
> >> ---
> >>
> >> This is the interface only. I don't plan to do anything else for now.
> >> Comments welcome.
> >>
> >>  src/gallium/docs/source/context.rst | 16 
> >>  src/gallium/docs/source/screen.rst  |  4 ++--
> >>  src/gallium/drivers/galahad/glhd_context.c  |  2 +-
> >>  src/gallium/drivers/ilo/ilo_state.c |  2 +-
> >>  src/gallium/drivers/nouveau/nouveau_buffer.c|  2 +-
> >>  src/gallium/drivers/nouveau/nouveau_screen.c|  2 +-
> >>  src/gallium/drivers/nouveau/nv50/nv50_formats.c |  2 +-
> >>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   |  2 +-
> >>  src/gallium/include/pipe/p_context.h| 20 +++-
> >>  src/gallium/include/pipe/p_defines.h|  2 +-
> >>  src/gallium/include/pipe/p_state.h  | 10 ++
> >>  11 files changed, 38 insertions(+), 26 deletions(-)
> >>
> >> diff --git a/src/gallium/docs/source/context.rst
> >> b/src/gallium/docs/source/context.rst
> >> index 5861f46..73fd35f 100644
> >> --- a/src/gallium/docs/source/context.rst
> >> +++ b/src/gallium/docs/source/context.rst
> >> @@ -126,14 +126,14 @@ from a shader without an associated sampler.  This
> >> means that they
> >>  have no support for floating point coordinates, address wrap modes or
> >>  filtering.
> >>
> >> -Shader resources are specified for all the shader stages at once using
> >> -the ``set_shader_resources`` method.  When binding texture resources,
> >> -the ``level``, ``first_layer`` and ``last_layer`` pipe_surface fields
> >> -specify the mipmap level and the range of layers the texture will be
> >> -constrained to.  In the case of buffers, ``first_element`` and
> >> -``last_element`` specify the range within the buffer that will be used
> >> -by the shader resource.  Writes to a shader resource are only allowed
> >> -when the ``writable`` flag is set.
> >> +There are 2 types of shader resources: buffers and images.
> >> +
> >> +Buffers are specified using the ``set_shader_buffers`` method.
> >> +
> >> +Images are specified using the ``set_shader_images`` method. When binding
> >> +images, the ``level``, ``first_layer`` and ``last_layer`` pipe_image_view
> >> +fields specify the mipmap level and the range of layers the image will be
> >> +constrained to.
> >>
> >>  Surfaces
> >>  
> >
> >
> > set_shader_images are not defined in this patch.
> > Will it look similar to pipe_surface or pipe_sampler_view?
>
> There will be a separate view for images if this is approved.
>
> Marek
>



-- 
Regards,

*Aditya Atluri,*

*USA.*


Re: [Mesa-dev] [PATCH 06/16] glsl: fix assignment of multiple scalar and vecs to matrices.

2015-01-07 Thread Samuel Iglesias Gonsálvez
This patch is still awaiting comments.

Sam
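
For anyone skimming the thread, the behaviour the quoted patch aims for can
be sketched outside the compiler: every constructor parameter is consumed
in order, and a parameter that does not fit in the current column spills
into the following ones. This is a standalone C illustration with made-up
names (fill_matrix, struct param), not the ast_function.cpp code.

```c
#include <assert.h>

#define MAX_COMPONENTS 16

/* One constructor argument: a scalar or vector with `count` components. */
struct param {
   unsigned count;
   float values[4];
};

/*
 * Fill a column-major matrix of `cols` x `rows` from the parameters in order.
 * A parameter may span several columns; components past the end of the
 * matrix are dropped.  Returns the number of components written.
 */
static unsigned
fill_matrix(float *mat, unsigned cols, unsigned rows,
            const struct param *params, unsigned num_params)
{
   unsigned col_idx = 0, row_idx = 0, written = 0;

   assert(cols * rows <= MAX_COMPONENTS);

   for (unsigned p = 0; p < num_params; p++) {
      unsigned rhs_base = 0;

      /* Keep emitting pieces while this parameter still has data and the
       * matrix still has room, possibly in a later column. */
      while (rhs_base < params[p].count && col_idx < cols) {
         unsigned remaining_in_column = rows - row_idx;
         unsigned remaining_in_param = params[p].count - rhs_base;
         unsigned count = remaining_in_param < remaining_in_column ?
                          remaining_in_param : remaining_in_column;

         for (unsigned i = 0; i < count; i++)
            mat[col_idx * rows + row_idx + i] = params[p].values[rhs_base + i];

         rhs_base += count;
         row_idx += count;
         written += count;

         if (row_idx == rows) {   /* column full, move to the next one */
            col_idx++;
            row_idx = 0;
         }
      }
   }
   return written;
}
```

In the dEQP case named in the commit message, a float, a bvec4, an ivec2 and
a bool (8 components) fill a mat4x2 (4 columns of 2 rows), and the bvec4
alone touches three columns.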

On Thursday, December 11, 2014 11:34:12 PM Eduardo Lima Mitev wrote:
> From: Samuel Iglesias Gonsalvez 
> 
> When a vec has more elements than row components in a matrix, the
> code could end up failing an assert inside assign_to_matrix_column().
> 
> This patch makes sure that when there is still room in the matrix for
> more elements (but in other columns of the matrix), the data is actually
> assigned.
> 
> This patch fixes the following dEQP test:
> 
>  
> dEQP-GLES3.functional.shaders.conversions.matrix_combine.float_bvec4_ivec2_bool_to_mat4x2_vertex
> 
> No piglit regressions observed
> 
> Signed-off-by: Samuel Iglesias Gonsalvez 
> ---
>  src/glsl/ast_function.cpp | 121 ++
>  1 file changed, 68 insertions(+), 53 deletions(-)
> 
> diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
> index cbff9d8..451e3be 100644
> --- a/src/glsl/ast_function.cpp
> +++ b/src/glsl/ast_function.cpp
> @@ -1334,67 +1334,82 @@ emit_inline_matrix_constructor(const glsl_type *type,
>unsigned row_idx = 0;
> 
>foreach_in_list(ir_rvalue, rhs, parameters) {
> -  const unsigned components_remaining_this_column = rows - row_idx;
> +  unsigned components_remaining_this_column;
>unsigned rhs_components = rhs->type->components();
>unsigned rhs_base = 0;
> 
> -  /* Since the parameter might be used in the RHS of two assignments,
> -   * generate a temporary and copy the paramter there.
> -   */
> -  ir_variable *rhs_var =
> - new(ctx) ir_variable(rhs->type, "mat_ctor_vec", ir_var_temporary);
> -  instructions->push_tail(rhs_var);
> -
> -  ir_dereference *rhs_var_ref =
> - new(ctx) ir_dereference_variable(rhs_var);
> -  ir_instruction *inst = new(ctx) ir_assignment(rhs_var_ref, rhs, NULL);
> -  instructions->push_tail(inst);
> -
> -  /* Assign the current parameter to as many components of the matrix
> -   * as it will fill.
> -   *
> -   * NOTE: A single vector parameter can span two matrix columns.  A
> -   * single vec4, for example, can completely fill a mat2.
> -   */
> -  if (rhs_components >= components_remaining_this_column) {
> - const unsigned count = MIN2(rhs_components,
> - components_remaining_this_column);
> -
> - rhs_var_ref = new(ctx) ir_dereference_variable(rhs_var);
> -
> - ir_instruction *inst = assign_to_matrix_column(var, col_idx,
> -row_idx,
> -rhs_var_ref, 0,
> -count, ctx);
> - instructions->push_tail(inst);
> -
> - rhs_base = count;
> + /* Since the parameter might be used in the RHS of two assignments,
> +  * generate a temporary and copy the paramter there.
> +  */
> + ir_variable *rhs_var =
> +new(ctx) ir_variable(rhs->type, "mat_ctor_vec", ir_var_temporary);
> + instructions->push_tail(rhs_var);
> +
> + ir_dereference *rhs_var_ref =
> +new(ctx) ir_dereference_variable(rhs_var);
> + ir_instruction *inst = new(ctx) ir_assignment(rhs_var_ref, rhs, NULL);
> + instructions->push_tail(inst);
> +
> + do {
> +components_remaining_this_column = rows - row_idx;
> +/* Assign the current parameter to as many components of the matrix
> + * as it will fill.
> + *
> + * NOTE: A single vector parameter can span two matrix columns.  A
> + * single vec4, for example, can completely fill a mat2.
> + */
> +if (components_remaining_this_column > 0 &&
> +(rhs_components - rhs_base) >= components_remaining_this_column) {
> +   const unsigned count = MIN2(rhs_components - rhs_base,
> +   components_remaining_this_column);
> 
> - col_idx++;
> - row_idx = 0;
> -  }
> +   rhs_var_ref = new(ctx) ir_dereference_variable(rhs_var);
> 
> -  /* If there is data left in the parameter and components left to be
> -   * set in the destination, emit another assignment.  It is possible
> -   * that the assignment could be of a vec4 to the last element of the
> -   * matrix.  In this case col_idx==cols, but there is still data
> -   * left in the source parameter.  Obviously, don't emit an assignment
> -   * to data outside the destination matrix.
> -   */
> -  if ((col_idx < cols) && (rhs_base < rhs_components)) {
> - const unsigned count = rhs_components - rhs_base;
> +   ir_instruction *inst = assign_to_matrix_column(var, col_idx,
> +  row_idx, +  
> 

Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 9:35 AM, Marek Olšák  wrote:
> On Wed, Jan 7, 2015 at 2:41 PM, Ilia Mirkin  wrote:
>> On Wed, Jan 7, 2015 at 5:56 AM, Marek Olšák  wrote:
>>> From: Marek Olšák 
>>>
>>> set_shader_resources is unused.
>>>
>>> set_shader_buffers should support shader atomic counter buffers and shader
>>> storage buffers from OpenGL.
>>>
>>> The plan is to use slots 0..15 for atomic counters and slots 16..31
>>> for storage buffers. Atomic counters are planned to be supported first.
>>>
>>> This doesn't add any interface for images. The documentation is added
>>> for future reference.
>>> ---
>>>
>>> This is the interface only. I don't plan to do anything else for now.
>>> Comments welcome.
>>
>> Can you clarify how this is better than the set_shader_resources
>> interface, which can also be shared for images (which will need to
>> support texture buffers...)?
>
> 1) You don't need to create any views for these. Creating,
> initializing, referencing, and destroying views is work that should be
> avoided if it's unnecessary.

I guess you mean surfaces? You still have to bind a reference to the
backing buffer _somewhere_...

>
> 2) It saves space for resource descriptions on SI (both memory and
> cache). A buffer slot needs 4 dwords, but a texture (image) slot needs
> 8 dwords.
>
> Original DX11 AMD hardware (Evergreen) will have to merge
> set_shader_buffers, set_shader_images, and set_framebuffer_state
> anyway. One less function won't make it much easier. Post-DX11
> hardware (SI) can do pretty much anything, but this solution is more
> efficient for that hardware.
>
>>
>> FWIW, there's already an impl for nve4 images using
>> set_shader_resources (not sure how Christoph had tested it, I think
>> using some preliminary OpenCL C -> TGSI converter with image support).
>>
>> Are these buffers fundamentally different than images? We'll still
>> need atomic support for images as well...
>
> The main difference is:
> - shader buffers don't have a view and format. pipe_resources are set directly.
> - shader images have a view and format, this also includes buffers
> that have a format.
>
>>
>> Also how do you anticipate this will be integrated into TGSI? Right
>> now there's a TGSI_FILE_RESOURCE -- will there be a new
>> TGSI_FILE_BUFFER and TGSI_FILE_IMAGE?
>
> Yes, this needs to be changed as well.
>
> Opinions?

OK, well, this interface also seems workable. From what I can tell,
nve0 (kepler) is more similar to radeonsi in this regard, and nv50
isn't realistically going to gain support for this (blob driver
doesn't either). The wildcard is nvc0, which I haven't really traced
for image stuff yet. I guess instructions like LOAD/STORE/ATOM* would
be able to take either IMAGE or BUFFER things? Or separate instruction
variants?

This does, however, create an asymmetry with the compute interface,
which currently just has

   void (*set_compute_resources)(struct pipe_context *,
 unsigned start, unsigned count,
 struct pipe_surface **resources);

Should that be changed over to the buffer/image interface as well?

  -ilia


Re: [Mesa-dev] [PATCH 118/133] nir: Add a sampler index indirect to nir_tex_instr

2015-01-07 Thread Connor Abbott
On Tue, Jan 6, 2015 at 6:36 PM, Jason Ekstrand  wrote:
>
>
> On Mon, Jan 5, 2015 at 10:45 PM, Connor Abbott  wrote:
>>
>> I created nir_tex_src_sampler_index for exactly this purpose, which
>> fits in with the "stick all the sources in an array so we can easily
>> iterate over them" philosophy. If you decide to keep with this
>> solution, though, at least remove that.
>
>
> Sorry, I completely missed that.  My only gripe is that it doesn't really
> follow the rest of our base_offset + indirect philosophy.  Is that the way
> you were intending to use it?  i.e. direct just has sampler_index and
> indirect is sampler_index + nir_tex_src_sampler_index.  If so, maybe we
> should rename it to nir_tex_src_sampler_indirect.
>
> I'm 100% ok with that, It just isn't at all clear how the two work together.
> --Jason

Well, when I added nir_tex_src_sampler_index, it was more of a "I know
we'll need something like this eventually so I'll stick it here to
remind myself/other people when the time comes" thing, and I wasn't
sure which option would be better. So you can keep it and always set
sampler_index to 0 when it's indirect, or rename it - whatever's
easier to do, so long as it's consistent.
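
To make the base + indirect convention concrete, here is a standalone C
sketch. The field names mirror the quoted patch (sampler_index,
has_sampler_indirect, sampler_indirect_max), but the struct and the
resolve_sampler helper are illustrative only: in NIR the indirect part is a
nir_src evaluated by the shader, not an integer known on the CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Cut-down stand-in for the sampler fields of nir_tex_instr. */
struct tex_sampler_ref {
   unsigned sampler_index;         /* constant base */
   bool has_sampler_indirect;      /* is there a dynamic part? */
   unsigned sampler_indirect_max;  /* largest value the dynamic part may take */
};

/* Effective sampler: the base alone when direct, base + indirect otherwise. */
static unsigned
resolve_sampler(const struct tex_sampler_ref *ref, unsigned indirect_value)
{
   if (!ref->has_sampler_indirect)
      return ref->sampler_index;

   assert(indirect_value <= ref->sampler_indirect_max);
   return ref->sampler_index + indirect_value;
}
```

If sampler_index is instead always set to 0 for indirect access, the same
helper applies unchanged with a base of 0.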

>
>>
>>
>> On Tue, Dec 16, 2014 at 1:13 AM, Jason Ekstrand 
>> wrote:
>> > ---
>> >  src/glsl/nir/nir.c  | 11 +++
>> >  src/glsl/nir/nir.h  | 10 ++
>> >  src/glsl/nir/nir_print.c|  4 
>> >  src/glsl/nir/nir_validate.c |  3 +++
>> >  4 files changed, 28 insertions(+)
>> >
>> > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
>> > index 60c9cff..8bcc64a 100644
>> > --- a/src/glsl/nir/nir.c
>> > +++ b/src/glsl/nir/nir.c
>> > @@ -461,6 +461,13 @@ nir_tex_instr_create(void *mem_ctx, unsigned
>> > num_srcs)
>> > instr->has_predicate = false;
>> > src_init(&instr->predicate);
>> >
>> > +   instr->sampler_index = 0;
>> > +   instr->has_sampler_indirect = false;
>> > +   src_init(&instr->sampler_indirect);
>> > +   instr->sampler_indirect_max = 0;
>> > +
>> > +   instr->sampler = NULL;
>> > +
>> > return instr;
>> >  }
>> >
>> > @@ -1529,6 +1536,10 @@ visit_tex_src(nir_tex_instr *instr,
>> > nir_foreach_src_cb cb, void *state)
>> >if (!visit_src(&instr->predicate, cb, state))
>> >   return false;
>> >
>> > +   if (instr->has_sampler_indirect)
>> > +  if (!visit_src(&instr->sampler_indirect, cb, state))
>> > + return false;
>> > +
>> > if (instr->sampler != NULL)
>> >if (!visit_deref_src(instr->sampler, cb, state))
>> >   return false;
>> > diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> > index 32bf634..bc7a226 100644
>> > --- a/src/glsl/nir/nir.h
>> > +++ b/src/glsl/nir/nir.h
>> > @@ -838,7 +838,17 @@ typedef struct {
>> > /* gather component selector */
>> > unsigned component : 2;
>> >
>> > +   /** The sampler index
>> > +*
>> > +* If has_indirect is true, then the sampler index is given by
>> > +* sampler_index + sampler_indirect where sampler_indirect has a maximum
>> > +* possible value of sampler_indirect_max.
>> > +*/
>> > unsigned sampler_index;
>> > +   bool has_sampler_indirect;
>> > +   nir_src sampler_indirect;
>> > +   unsigned sampler_indirect_max;
>> > +
>> > nir_deref_var *sampler; /* if this is NULL, use sampler_index
>> > instead */
>> >  } nir_tex_instr;
>> >
>> > diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
>> > index 962e408..67df9a5 100644
>> > --- a/src/glsl/nir/nir_print.c
>> > +++ b/src/glsl/nir/nir_print.c
>> > @@ -498,6 +498,10 @@ print_tex_instr(nir_tex_instr *instr,
>> > print_var_state *state, FILE *fp)
>> >print_deref(instr->sampler, state, fp);
>> > } else {
>> >fprintf(fp, "%u", instr->sampler_index);
>> > +  if (instr->has_sampler_indirect) {
>> > + fprintf(fp, " + ");
>> > + print_src(&instr->sampler_indirect, fp);
>> > +  }
>> > }
>> >
>> > fprintf(fp, " (sampler)");
>> > diff --git a/src/glsl/nir/nir_validate.c b/src/glsl/nir/nir_validate.c
>> > index e565b3c..ed6e482 100644
>> > --- a/src/glsl/nir/nir_validate.c
>> > +++ b/src/glsl/nir/nir_validate.c
>> > @@ -399,6 +399,9 @@ validate_tex_instr(nir_tex_instr *instr,
>> > validate_state *state)
>> >validate_src(&instr->src[i], state);
>> > }
>> >
>> > +   if (instr->has_sampler_indirect)
>> > +  validate_src(&instr->sampler_indirect, state);
>> > +
>> > if (instr->sampler != NULL)
>> >validate_deref_var(instr->sampler, state);
>> >  }
>> > --
>> > 2.2.0
>> >
>
>


Re: [Mesa-dev] [PATCH 04/13] radeonsi: enable LLVM optimizations that assume no NaNs for non-compute shaders

2015-01-07 Thread Tom Stellard
On Wed, Jan 07, 2015 at 12:58:12PM +0100, Marek Olšák wrote:
> How about the attached patch?
> 

Looks good, thanks.

Reviewed-by: Tom Stellard 

> Marek
> 
> On Wed, Jan 7, 2015 at 1:23 AM, Tom Stellard  wrote:
> > On Wed, Jan 07, 2015 at 01:13:37AM +0100, Marek Olšák wrote:
> >> Neither. It's because we use DX10_CLAMP, which converts NaNs to 0.
> >>
> >
> > Ok, could we add a dx10_clamp bit to si_shader and make this attribute
> > conditional on that bit.  I'm concerned someone may remove DX10_CLAMP
> > and forget to also remove this attribute.
> >
> > -Tom
> >
> >> Marek
> >>
> >> On Wed, Jan 7, 2015 at 12:51 AM, Tom Stellard  wrote:
> >> > On Mon, Jan 05, 2015 at 12:18:43AM +0100, Marek Olšák wrote:
> >> >> From: Marek Olšák 
> >> >>
> >> >> ---
> >> >>  src/gallium/drivers/radeon/radeon_llvm_emit.c | 1 +
> >> >>  1 file changed, 1 insertion(+)
> >> >>
> >> >> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c 
> >> >> b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> >> >> index dc871d7..e3be72c 100644
> >> >> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
> >> >> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> >> >> @@ -83,6 +83,7 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned 
> >> >> type)
> >> >>
> >> >>   if (type != TGSI_PROCESSOR_COMPUTE) {
> >> >>   LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", 
> >> >> "true");
> >> >> + LLVMAddTargetDependentFunctionAttr(F, 
> >> >> "enable-no-nans-fp-math", "true");
> >> >
> >> > Is this required by the OpenGL spec or is it just to fix broken/old
> >> > games?
> >> >
> >> > -Tom
> >> >
> >> >>   }
> >> >>  }
> >> >>
> >> >> --
> >> >> 2.1.0
> >> >>

> From d960b773f3bc99928b4aab5c4344aea671595849 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
> Date: Sun, 4 Jan 2015 17:08:57 +0100
> Subject: [PATCH] radeonsi: enable LLVM optimizations that assume no NaNs for
>  non-compute shaders
> 
> v2: complete rewrite
> ---
>  src/gallium/drivers/radeonsi/si_shader.c| 7 +++
>  src/gallium/drivers/radeonsi/si_shader.h| 1 +
>  src/gallium/drivers/radeonsi/si_state_shaders.c | 8 
>  3 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 5d61a54..cf28860 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -2369,6 +2369,10 @@ static void create_function(struct si_shader_context 
> *si_shader_ctx)
>   radeon_llvm_create_func(&si_shader_ctx->radeon_bld, params, num_params);
>   radeon_llvm_shader_type(si_shader_ctx->radeon_bld.main_fn, 
> si_shader_ctx->type);
>  
> + if (shader->dx10_clamp_mode)
> + 
> LLVMAddTargetDependentFunctionAttr(si_shader_ctx->radeon_bld.main_fn,
> +"enable-no-nans-fp-math", 
> "true");
> +
>   for (i = 0; i <= last_sgpr; ++i) {
>   LLVMValueRef P = 
> LLVMGetParam(si_shader_ctx->radeon_bld.main_fn, i);
>  
> @@ -2723,6 +2727,9 @@ int si_shader_create(struct si_screen *sscreen, struct 
> si_shader *shader)
>   radeon_llvm_context_init(&si_shader_ctx.radeon_bld);
>   bld_base = &si_shader_ctx.radeon_bld.soa.bld_base;
>  
> + if (sel->type != PIPE_SHADER_COMPUTE)
> + shader->dx10_clamp_mode = true;
> +
>   if (sel->info.uses_kill)
>   shader->db_shader_control |= S_02880C_KILL_ENABLE(1);
>  
> diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
> b/src/gallium/drivers/radeonsi/si_shader.h
> index 21692f0..08e344a 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.h
> +++ b/src/gallium/drivers/radeonsi/si_shader.h
> @@ -160,6 +160,7 @@ struct si_shader {
>   booluses_instanceid;
>   unsignednr_pos_exports;
>   boolis_gs_copy_shader;
> + booldx10_clamp_mode; /* convert NaNs to 0 */
>  };
>  
>  static inline struct tgsi_shader_info *si_get_vs_info(struct si_context 
> *sctx)
> diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
> b/src/gallium/drivers/radeonsi/si_state_shaders.c
> index e51d50e..817a990 100644
> --- a/src/gallium/drivers/radeonsi/si_state_shaders.c
> +++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
> @@ -65,7 +65,7 @@ static void si_shader_es(struct si_shader *shader)
>  S_00B328_VGPRS((shader->num_vgprs - 1) / 4) |
>  S_00B328_SGPRS((num_sgprs - 1) / 8) |
>  S_00B328_VGPR_COMP_CNT(vgpr_comp_cnt) |
> -S_00B328_DX10_CLAMP(1));
> +S_00B328_DX10_CLAMP(shader->dx10_clamp_mode));
>   si_pm4_set_reg(pm4, R_00B32C_SPI_SHADER_PGM_RSRC2_ES,

Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Marek Olšák
On Wed, Jan 7, 2015 at 2:42 PM, Aditya Avinash  wrote:
> Hi,
> Sounds great but, do you think a separate buffer pipe is required for this?
> Changing Constant buffer to a generic buffer (with alu+load+store) can help.

No, constant buffers should remain unchanged.

>
> What about for R600? Do we have to add
>
> r600_init_atom(rctx, &rctx->shaderbuf_state[PIPE_SHADER_VERTEX].atom, id++,
> r600_emit_vs_shader_buffers, 0);
>
> to backend? Will this be specific to Atomics?

No, atomic buffers should be set in the exact same way as colorbuffers
on r600 except that the RAT bit should be set. Search the r600g driver
for "RAT(1)". I think it supports them already. The shader
instructions for accessing such buffers begin with "MEM_RAT".

Marek

>
> Thank you!!
>
> On Wed, Jan 7, 2015 at 4:56 AM, Marek Olšák  wrote:
>>
>> From: Marek Olšák 
>>
>> set_shader_resources is unused.
>>
>> set_shader_buffers should support shader atomic counter buffers and shader
>> storage buffers from OpenGL.
>>
>> The plan is to use slots 0..15 for atomic counters and slots 16..31
>> for storage buffers. Atomic counters are planned to be supported first.
>>
>> This doesn't add any interface for images. The documentation is added
>> for future reference.
>> ---
>>
>> This is the interface only. I don't plan to do anything else for now.
>> Comments welcome.
>>
>>  src/gallium/docs/source/context.rst | 16 
>>  src/gallium/docs/source/screen.rst  |  4 ++--
>>  src/gallium/drivers/galahad/glhd_context.c  |  2 +-
>>  src/gallium/drivers/ilo/ilo_state.c |  2 +-
>>  src/gallium/drivers/nouveau/nouveau_buffer.c|  2 +-
>>  src/gallium/drivers/nouveau/nouveau_screen.c|  2 +-
>>  src/gallium/drivers/nouveau/nv50/nv50_formats.c |  2 +-
>>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   |  2 +-
>>  src/gallium/include/pipe/p_context.h| 20 +++-
>>  src/gallium/include/pipe/p_defines.h|  2 +-
>>  src/gallium/include/pipe/p_state.h  | 10 ++
>>  11 files changed, 38 insertions(+), 26 deletions(-)
>>
>> diff --git a/src/gallium/docs/source/context.rst
>> b/src/gallium/docs/source/context.rst
>> index 5861f46..73fd35f 100644
>> --- a/src/gallium/docs/source/context.rst
>> +++ b/src/gallium/docs/source/context.rst
>> @@ -126,14 +126,14 @@ from a shader without an associated sampler.  This
>> means that they
>>  have no support for floating point coordinates, address wrap modes or
>>  filtering.
>>
>> -Shader resources are specified for all the shader stages at once using
>> -the ``set_shader_resources`` method.  When binding texture resources,
>> -the ``level``, ``first_layer`` and ``last_layer`` pipe_surface fields
>> -specify the mipmap level and the range of layers the texture will be
>> -constrained to.  In the case of buffers, ``first_element`` and
>> -``last_element`` specify the range within the buffer that will be used
>> -by the shader resource.  Writes to a shader resource are only allowed
>> -when the ``writable`` flag is set.
>> +There are 2 types of shader resources: buffers and images.
>> +
>> +Buffers are specified using the ``set_shader_buffers`` method.
>> +
>> +Images are specified using the ``set_shader_images`` method. When binding
>> +images, the ``level``, ``first_layer`` and ``last_layer`` pipe_image_view
>> +fields specify the mipmap level and the range of layers the image will be
>> +constrained to.
>>
>>  Surfaces
>>  
>> diff --git a/src/gallium/docs/source/screen.rst
>> b/src/gallium/docs/source/screen.rst
>> index 55d114c..c81ad66 100644
>> --- a/src/gallium/docs/source/screen.rst
>> +++ b/src/gallium/docs/source/screen.rst
>> @@ -403,8 +403,8 @@ resources might be created and handled quite
>> differently.
>>process.
>>  * ``PIPE_BIND_GLOBAL``: A buffer that can be mapped into the global
>>address space of a compute program.
>> -* ``PIPE_BIND_SHADER_RESOURCE``: A buffer or texture that can be
>> -  bound to the graphics pipeline as a shader resource.
>> +* ``PIPE_BIND_SHADER_BUFFER``: A buffer that can be bound to a shader
>> where
>> +  it should support reads, writes, and atomics.
>>  * ``PIPE_BIND_COMPUTE_RESOURCE``: A buffer or texture that can be
>>bound to the compute program as a shader resource.
>>  * ``PIPE_BIND_COMMAND_ARGS_BUFFER``: A buffer that may be sourced by the
>> diff --git a/src/gallium/drivers/galahad/glhd_context.c
>> b/src/gallium/drivers/galahad/glhd_context.c
>> index 37ea170..383d76c 100644
>> --- a/src/gallium/drivers/galahad/glhd_context.c
>> +++ b/src/gallium/drivers/galahad/glhd_context.c
>> @@ -1017,7 +1017,7 @@ galahad_context_create(struct pipe_screen *_screen,
>> struct pipe_context *pipe)
>> GLHD_PIPE_INIT(set_scissor_states);
>> GLHD_PIPE_INIT(set_viewport_states);
>> GLHD_PIPE_INIT(set_sampler_views);
>> -   //GLHD_PIPE_INIT(set_shader_resources);
>> +   //GLHD_PIPE_INIT(set_shader_buffers);
>> GLHD_PIPE_INIT(set_vertex_buffers);
>> 

Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Alex Deucher
On Wed, Jan 7, 2015 at 11:29 AM, Marek Olšák  wrote:
> On Wed, Jan 7, 2015 at 2:42 PM, Aditya Avinash  
> wrote:
>> Hi,
>> Sounds great but, do you think a separate buffer pipe is required for this?
>> Changing Constant buffer to a generic buffer (with alu+load+store) can help.
>
> No, constant buffers should remain unchanged.
>
>>
>> What about for R600? Do we have to add
>>
>> r600_init_atom(rctx, &rctx->shaderbuf_state[PIPE_SHADER_VERTEX].atom, id++,
>> r600_emit_vs_shader_buffers, 0);
>>
>> to backend? Will this be specific to Atomics?
>
> No, atomic buffers should be set in the exact same way as colorbuffers
> on r600 except that the RAT bit should be set. Search the r600g driver
> for "RAT(1)". I think it supports them already. The shader
> instructions for accessing such buffers begin with "MEM_RAT".
>

For reference, RAT = Random Access Target

Alex

> Marek
>
>>
>> Thank you!!
>>
>> On Wed, Jan 7, 2015 at 4:56 AM, Marek Olšák  wrote:
>>>
>>> From: Marek Olšák 
>>>
>>> set_shader_resources is unused.
>>>
>>> set_shader_buffers should support shader atomic counter buffers and shader
>>> storage buffers from OpenGL.
>>>
>>> The plan is to use slots 0..15 for atomic counters and slots 16..31
>>> for storage buffers. Atomic counters are planned to be supported first.
>>>
>>> This doesn't add any interface for images. The documentation is added
>>> for future reference.
>>> ---
>>>
>>> This is the interface only. I don't plan to do anything else for now.
>>> Comments welcome.
>>>
>>>  src/gallium/docs/source/context.rst | 16 
>>>  src/gallium/docs/source/screen.rst  |  4 ++--
>>>  src/gallium/drivers/galahad/glhd_context.c  |  2 +-
>>>  src/gallium/drivers/ilo/ilo_state.c |  2 +-
>>>  src/gallium/drivers/nouveau/nouveau_buffer.c|  2 +-
>>>  src/gallium/drivers/nouveau/nouveau_screen.c|  2 +-
>>>  src/gallium/drivers/nouveau/nv50/nv50_formats.c |  2 +-
>>>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   |  2 +-
>>>  src/gallium/include/pipe/p_context.h| 20 +++-
>>>  src/gallium/include/pipe/p_defines.h|  2 +-
>>>  src/gallium/include/pipe/p_state.h  | 10 ++
>>>  11 files changed, 38 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/src/gallium/docs/source/context.rst
>>> b/src/gallium/docs/source/context.rst
>>> index 5861f46..73fd35f 100644
>>> --- a/src/gallium/docs/source/context.rst
>>> +++ b/src/gallium/docs/source/context.rst
>>> @@ -126,14 +126,14 @@ from a shader without an associated sampler.  This
>>> means that they
>>>  have no support for floating point coordinates, address wrap modes or
>>>  filtering.
>>>
>>> -Shader resources are specified for all the shader stages at once using
>>> -the ``set_shader_resources`` method.  When binding texture resources,
>>> -the ``level``, ``first_layer`` and ``last_layer`` pipe_surface fields
>>> -specify the mipmap level and the range of layers the texture will be
>>> -constrained to.  In the case of buffers, ``first_element`` and
>>> -``last_element`` specify the range within the buffer that will be used
>>> -by the shader resource.  Writes to a shader resource are only allowed
>>> -when the ``writable`` flag is set.
>>> +There are 2 types of shader resources: buffers and images.
>>> +
>>> +Buffers are specified using the ``set_shader_buffers`` method.
>>> +
>>> +Images are specified using the ``set_shader_images`` method. When binding
>>> +images, the ``level``, ``first_layer`` and ``last_layer`` pipe_image_view
>>> +fields specify the mipmap level and the range of layers the image will be
>>> +constrained to.
>>>
>>>  Surfaces
>>>  
>>> diff --git a/src/gallium/docs/source/screen.rst
>>> b/src/gallium/docs/source/screen.rst
>>> index 55d114c..c81ad66 100644
>>> --- a/src/gallium/docs/source/screen.rst
>>> +++ b/src/gallium/docs/source/screen.rst
>>> @@ -403,8 +403,8 @@ resources might be created and handled quite
>>> differently.
>>>process.
>>>  * ``PIPE_BIND_GLOBAL``: A buffer that can be mapped into the global
>>>address space of a compute program.
>>> -* ``PIPE_BIND_SHADER_RESOURCE``: A buffer or texture that can be
>>> -  bound to the graphics pipeline as a shader resource.
>>> +* ``PIPE_BIND_SHADER_BUFFER``: A buffer that can be bound to a shader
>>> where
>>> +  it should support reads, writes, and atomics.
>>>  * ``PIPE_BIND_COMPUTE_RESOURCE``: A buffer or texture that can be
>>>bound to the compute program as a shader resource.
>>>  * ``PIPE_BIND_COMMAND_ARGS_BUFFER``: A buffer that may be sourced by the
>>> diff --git a/src/gallium/drivers/galahad/glhd_context.c
>>> b/src/gallium/drivers/galahad/glhd_context.c
>>> index 37ea170..383d76c 100644
>>> --- a/src/gallium/drivers/galahad/glhd_context.c
>>> +++ b/src/gallium/drivers/galahad/glhd_context.c
>>> @@ -1017,7 +1017,7 @@ galahad_context_create(struct pipe_screen *_screen,
>>> struct pipe_context *pipe)
>>> GLHD_PIPE_INIT(set_scissor_states);

[Mesa-dev] [PATCH 09/53] st/nine: CubeTexture: fix GetLevelDesc

2015-01-07 Thread Axel Davy
This->surfaces contains the surfaces associated with the levels
and faces. This->surfaces[6*Level] is what we want here,
since it gives us a face descriptor for the level 'Level'.
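
To illustrate the indexing, here is a minimal sketch. It assumes the six
faces of each level are stored contiguously (level-major order), which is
what the 6*Level index implies. cube_surface_index and CUBE_FACES are
hypothetical names used only for this example, not nine functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of nine: a cube texture has six faces
 * per mip level. */
enum { CUBE_FACES = 6 };

/* Index into a flat surface array, assuming the six faces of a level
 * are stored next to each other (level-major order). */
static size_t cube_surface_index(unsigned level, unsigned face)
{
    assert(face < CUBE_FACES);
    return (size_t)level * CUBE_FACES + face;
}
```

With this layout, indexing by Level alone only gives the right entry for
level 0; for any other level it lands on a face of an earlier level.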

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
Signed-off-by: Xavier Bouchoux 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/cubetexture9.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/cubetexture9.c 
b/src/gallium/state_trackers/nine/cubetexture9.c
index 9f5d8e2..2c607c0 100644
--- a/src/gallium/state_trackers/nine/cubetexture9.c
+++ b/src/gallium/state_trackers/nine/cubetexture9.c
@@ -146,7 +146,7 @@ NineCubeTexture9_GetLevelDesc( struct NineCubeTexture9 
*This,
 user_assert(Level == 0 || !(This->base.base.usage & 
D3DUSAGE_AUTOGENMIPMAP),
 D3DERR_INVALIDCALL);
 
-*pDesc = This->surfaces[Level]->desc;
+*pDesc = This->surfaces[Level * 6]->desc;
 
 return D3D_OK;
 }
-- 
2.1.3



[Mesa-dev] [PATCH 08/53] st/nine: NineBaseTexture9: fix setting of last_layer

2015-01-07 Thread Axel Davy
Use the same settings as u_sampler_view_default_template.
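
A minimal sketch of the rule, using simplified stand-in types rather than
the real gallium structs. The sketch_* names are hypothetical, and the
array_size of 6 for cube textures is an assumption made for the example.

```c
#include <assert.h>

/* Simplified stand-ins for the gallium types; only the fields the
 * rule needs. All sketch_* names are hypothetical. */
enum sketch_target {
    SKETCH_TEXTURE_2D,
    SKETCH_TEXTURE_3D,
    SKETCH_TEXTURE_CUBE
};

struct sketch_resource {
    enum sketch_target target;
    unsigned depth0;      /* number of depth slices for 3D textures */
    unsigned array_size;  /* number of layers otherwise (assumed 6 for cubes) */
};

/* 3D textures take the last layer from the depth, everything else
 * from the array size. */
static unsigned sketch_last_layer(const struct sketch_resource *res)
{
    if (res->target == SKETCH_TEXTURE_3D) {
        assert(res->depth0 >= 1);
        return res->depth0 - 1;
    }
    assert(res->array_size >= 1);
    return res->array_size - 1;
}
```

Under that assumption a cube texture still ends up with last_layer 5, but
the value now comes from the resource instead of a hardcoded constant.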

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/basetexture9.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/basetexture9.c 
b/src/gallium/state_trackers/nine/basetexture9.c
index 12da1e0..af4778b 100644
--- a/src/gallium/state_trackers/nine/basetexture9.c
+++ b/src/gallium/state_trackers/nine/basetexture9.c
@@ -480,8 +480,8 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 
*This,
 
 templ.format = sRGB ? util_format_srgb(resource->format) : 
resource->format;
 templ.u.tex.first_layer = 0;
-templ.u.tex.last_layer = (resource->target == PIPE_TEXTURE_CUBE) ?
-5 : (This->base.info.depth0 - 1);
+templ.u.tex.last_layer = resource->target == PIPE_TEXTURE_3D ?
+ resource->depth0 - 1 : resource->array_size - 1;
 templ.u.tex.first_level = 0;
 templ.u.tex.last_level = resource->last_level;
 templ.swizzle_r = swizzle[0];
-- 
2.1.3



[Mesa-dev] [PATCH 10/53] st/nine: Fix crash when deleting non-implicit swapchain

2015-01-07 Thread Axel Davy
The implicit swapchains are destroyed when the device instance is
destroyed. This is not the case for non-implicit swapchains, however,
and the application may have kept a reference to the swapchain
buffers in order to reuse them.
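
The difference between destroying and releasing can be sketched with a
plain reference count. This is only an illustration: sketch_object,
sketch_destroy and sketch_release are hypothetical and do not mirror the
NineUnknown implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted object, for illustration only. */
struct sketch_object {
    int refs;
    bool destroyed;
};

/* Tears the object down regardless of who else still holds it. */
static void sketch_destroy(struct sketch_object *obj)
{
    obj->destroyed = true;
}

/* Drops one reference and only tears down on the last one. */
static int sketch_release(struct sketch_object *obj)
{
    assert(obj->refs > 0);
    if (--obj->refs == 0)
        sketch_destroy(obj);
    return obj->refs;
}
```

If the swapchain and the application each hold one reference, releasing
the swapchain's reference leaves the buffer alive for the application,
whereas destroying it outright would leave the application with a
dangling pointer.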

Fixes problems with battle.net launcher.

Cc: "10.4" 
Tested-by: Nick Sarnie 
Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/swapchain9.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/swapchain9.c 
b/src/gallium/state_trackers/nine/swapchain9.c
index bf87aaf..24ff905 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -467,7 +467,7 @@ NineSwapChain9_dtor( struct NineSwapChain9 *This )
 
 if (This->buffers) {
 for (i = 0; i < This->params.BackBufferCount; i++) {
-NineUnknown_Destroy(NineUnknown(This->buffers[i]));
+NineUnknown_Release(NineUnknown(This->buffers[i]));
 ID3DPresent_DestroyD3DWindowBuffer(This->present, 
This->present_handles[i]);
 if (This->present_buffers)
 pipe_resource_reference(&(This->present_buffers[i]), NULL);
-- 
2.1.3



[Mesa-dev] [PATCH 01/53] st/nine: query: remove unused variable (trivial)

2015-01-07 Thread Axel Davy
From: David Heidelberg 

Signed-off-by: David Heidelberg 
Reviewed-by: Axel Davy 
---
 src/gallium/state_trackers/nine/query9.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/query9.c 
b/src/gallium/state_trackers/nine/query9.c
index 6df4ead..466b4ba 100644
--- a/src/gallium/state_trackers/nine/query9.c
+++ b/src/gallium/state_trackers/nine/query9.c
@@ -205,7 +205,6 @@ NineQuery9_GetData( struct NineQuery9 *This,
 {
 struct pipe_context *pipe = This->base.device->pipe;
 boolean ok, wait_query_result = FALSE;
-unsigned i;
 union pipe_query_result presult;
 union nine_query_result nresult;
 
-- 
2.1.3



[Mesa-dev] [PATCH 06/53] st/nine: Fix D3DRS_POINTSPRITE support

2015-01-07 Thread Axel Davy
From: xavier 

It's done by testing the existence of the point sprite output register *after* 
parsing the vertex shader.

Reviewed-by: David Heidelberg 
Reviewed-by: Axel Davy 
Signed-off-by: Xavier Bouchoux 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index c2a0f4d..fcc1c68 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2842,9 +2842,6 @@ nine_translate_shader(struct NineDevice9 *device, struct 
nine_shader_info *info)
 ureg_property(tx->ureg, TGSI_PROPERTY_FS_COORD_PIXEL_CENTER, 
TGSI_FS_COORD_PIXEL_CENTER_INTEGER);
 }
 
-if (!ureg_dst_is_undef(tx->regs.oPts))
-info->point_size = TRUE;
-
 while (!sm1_parse_eof(tx))
 sm1_parse_instruction(tx);
 tx->parse++; /* for byte_size */
@@ -2860,6 +2857,9 @@ nine_translate_shader(struct NineDevice9 *device, struct 
nine_shader_info *info)
 
 ureg_END(tx->ureg);
 
+if (IS_VS && !ureg_dst_is_undef(tx->regs.oPts))
+info->point_size = TRUE;
+
 if (debug_get_bool_option("NINE_TGSI_DUMP", FALSE)) {
 unsigned count;
 const struct tgsi_token *toks = ureg_get_tokens(tx->ureg, &count);
-- 
2.1.3



[Mesa-dev] [PATCH 13/53] st/nine: Hack to generate resource if it doesn't exist when getting view

2015-01-07 Thread Axel Davy
From: Stanislaw Halik 

Buffers in the MANAGED pool are supposed to keep their content in a RAM buffer,
with a copy in VRAM if there is enough memory (the driver manages memory and
decides when to delete the VRAM copy).

This is not implemented properly in nine: a VRAM copy is created when the RAM
buffer is filled, and the VRAM copy is then kept in sync with updates to the
RAM buffer.

Due to some issues (in the implementation or in app logic), we can end up
trying to create a sampler view of the resource before the VRAM resource has
been created. This hack creates the resource when we hit this case, which
prevents crashing but doesn't help with the resource content.
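
The shape of the workaround, as a minimal sketch: create the missing
resource on demand when a view is requested. All sketch_* names are
hypothetical, and the int storage merely stands in for a VRAM resource.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical texture with a lazily created GPU-side resource. */
struct sketch_texture {
    int *resource;     /* stands in for the VRAM copy; NULL until created */
    int storage;       /* backing store used by this sketch */
    unsigned uploads;  /* how many times the resource was created */
};

/* Creates the resource; the content is not guaranteed to be valid. */
static void sketch_upload(struct sketch_texture *tex)
{
    tex->resource = &tex->storage;
    tex->uploads++;
}

/* Returns the resource, creating it first if nobody has done so yet. */
static int *sketch_get_view(struct sketch_texture *tex)
{
    if (tex->resource == NULL)
        sketch_upload(tex);
    assert(tex->resource != NULL);
    return tex->resource;
}
```

As in the patch, this only guarantees that something exists to create a
view of; it says nothing about whether the content is up to date.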

This fixes several games crashing at launch.

Acked-by: Axel Davy 
Acked-by: David Heidelberg 
Signed-off-by: Stanislaw Halik 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/basetexture9.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/state_trackers/nine/basetexture9.c 
b/src/gallium/state_trackers/nine/basetexture9.c
index fb5a61a..ccfd199 100644
--- a/src/gallium/state_trackers/nine/basetexture9.c
+++ b/src/gallium/state_trackers/nine/basetexture9.c
@@ -457,6 +457,9 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 
*This,
if (unlikely(This->format == D3DFMT_NULL))
 return D3D_OK;
 NineBaseTexture9_Dump(This);
+/* hack due to incorrect POOL_MANAGED handling */
+NineBaseTexture9_GenerateMipSubLevels(This);
+resource = This->base.resource;
 }
 assert(resource);
 
-- 
2.1.3



[Mesa-dev] [PATCH 16/53] st/nine: Rework of boolean constants

2015-01-07 Thread Axel Davy
Convert them to shader booleans at an earlier stage.
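
A minimal sketch of the conversion, done once when the constants are set
rather than later at upload time. The sketch_* names are hypothetical.
The all-ones value 0xFFFFFFFF for integer-capable shaders is an
assumption about the "true" encoding, and the float case assumes IEEE 754
single precision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit pattern of a float, similar in spirit to what fui() does. */
static uint32_t sketch_float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Store API booleans as shader booleans. The "true" value depends on
 * whether the shader stage supports integers; 0xFFFFFFFF is an
 * assumption of this sketch. */
static void sketch_set_bools(uint32_t *dst, const int *src,
                             unsigned count, int has_integers)
{
    const uint32_t bool_true =
        has_integers ? 0xFFFFFFFFu : sketch_float_bits(1.0f);

    assert(dst != NULL && src != NULL);
    for (unsigned i = 0; i < count; i++)
        dst[i] = src[i] ? bool_true : 0;
}

/* Read them back as plain 0/1 values, whatever encoding was stored. */
static void sketch_get_bools(int *dst, const uint32_t *src, unsigned count)
{
    assert(dst != NULL && src != NULL);
    for (unsigned i = 0; i < count; i++)
        dst[i] = src[i] != 0;
}
```

Because the getter only tests for non-zero, the application sees the
same boolean values back no matter which encoding the driver needs.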

Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/device9.c| 35 +---
 src/gallium/state_trackers/nine/device9.h|  6 ++---
 src/gallium/state_trackers/nine/nine_state.c | 13 +++
 3 files changed, 22 insertions(+), 32 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 1d97688..d747a7a 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -293,13 +293,6 @@ NineDevice9_ctor( struct NineDevice9 *This,
 return E_OUTOFMEMORY;
 }
 
-This->vs_bool_true = pScreen->get_shader_param(pScreen,
-PIPE_SHADER_VERTEX,
-PIPE_SHADER_CAP_INTEGERS) ? 0xFFFFFFFF : fui(1.0f);
-This->ps_bool_true = pScreen->get_shader_param(pScreen,
-PIPE_SHADER_FRAGMENT,
-PIPE_SHADER_CAP_INTEGERS) ? 0xFFFFFFFF : fui(1.0f);
-
 /* Allocate upload helper for drivers that suck (from st pov ;). */
 {
 unsigned bind = 0;
@@ -314,6 +307,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
 }
 
 This->driver_caps.window_space_position_support = 
GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
+This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, 
PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
+This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, 
PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
 
 nine_ff_init(This); /* initialize fixed function code */
 
@@ -2981,6 +2976,8 @@ NineDevice9_SetVertexShaderConstantB( struct NineDevice9 
*This,
   UINT BoolCount )
 {
 struct nine_state *state = This->update;
+int i;
+uint32_t bool_true = This->driver_caps.vs_integer ? 0xFFFFFFFF : fui(1.0f);
 
 DBG("This=%p StartRegister=%u pConstantData=%p BoolCount=%u\n",
 This, StartRegister, pConstantData, BoolCount);
@@ -2989,9 +2986,8 @@ NineDevice9_SetVertexShaderConstantB( struct NineDevice9 
*This,
 user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, 
D3DERR_INVALIDCALL);
 user_assert(pConstantData, D3DERR_INVALIDCALL);
 
-memcpy(&state->vs_const_b[StartRegister],
-   pConstantData,
-   BoolCount * sizeof(state->vs_const_b[0]));
+for (i = 0; i < BoolCount; i++)
+state->vs_const_b[StartRegister + i] = pConstantData[i] ? bool_true : 
0;
 
 state->changed.vs_const_b |= ((1 << BoolCount) - 1) << StartRegister;
 state->changed.group |= NINE_STATE_VS_CONST;
@@ -3006,14 +3002,14 @@ NineDevice9_GetVertexShaderConstantB( struct 
NineDevice9 *This,
   UINT BoolCount )
 {
 const struct nine_state *state = &This->state;
+int i;
 
 user_assert(StartRegister  < NINE_MAX_CONST_B, 
D3DERR_INVALIDCALL);
 user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, 
D3DERR_INVALIDCALL);
 user_assert(pConstantData, D3DERR_INVALIDCALL);
 
-memcpy(pConstantData,
-   &state->vs_const_b[StartRegister],
-   BoolCount * sizeof(state->vs_const_b[0]));
+for (i = 0; i < BoolCount; i++)
+pConstantData[i] = state->vs_const_b[StartRegister + i] != 0 ? TRUE : 
FALSE;
 
 return D3D_OK;
 }
@@ -3286,6 +3282,8 @@ NineDevice9_SetPixelShaderConstantB( struct NineDevice9 
*This,
  UINT BoolCount )
 {
 struct nine_state *state = This->update;
+int i;
+uint32_t bool_true = This->driver_caps.ps_integer ? 0xFFFFFFFF : fui(1.0f);
 
 DBG("This=%p StartRegister=%u pConstantData=%p BoolCount=%u\n",
 This, StartRegister, pConstantData, BoolCount);
@@ -3294,9 +3292,8 @@ NineDevice9_SetPixelShaderConstantB( struct NineDevice9 
*This,
 user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, 
D3DERR_INVALIDCALL);
 user_assert(pConstantData, D3DERR_INVALIDCALL);
 
-memcpy(&state->ps_const_b[StartRegister],
-   pConstantData,
-   BoolCount * sizeof(state->ps_const_b[0]));
+for (i = 0; i < BoolCount; i++)
+state->ps_const_b[StartRegister + i] = pConstantData[i] ? bool_true : 
0;
 
 state->changed.ps_const_b |= ((1 << BoolCount) - 1) << StartRegister;
 state->changed.group |= NINE_STATE_PS_CONST;
@@ -3311,14 +3308,14 @@ NineDevice9_GetPixelShaderConstantB( struct NineDevice9 
*This,
  UINT BoolCount )
 {
 const struct nine_state *state = &This->state;
+int i;
 
 user_assert(StartRegister  < NINE_MAX_CONST_B, 
D3DERR_INVALIDCALL);
 user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, 
D3DERR_INVALIDCALL);
 user_assert(pConstantData, D3DERR_INVALIDCALL);
 
-memcpy(pConstantData,
-   &state->ps_const_b[StartRegister],
-   BoolCount * sizeof(state->ps_const_b[0]));
+for (i = 0; i < BoolCount; i++)
+pConstantData[i] = state->ps_const_b[StartRegister + i] != 0 ? T

[Mesa-dev] [PATCH 07/53] st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS

2015-01-07 Thread Axel Davy
The cap means D3DFVF_XYZRHW vertices will see clipping.
This is not the case when
PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since
it'll disable clipping.
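
The pattern is to build the unconditional part of the mask first and add
the cap only when it actually holds. A minimal sketch, with hypothetical
SKETCH_* bit values that do not match the real D3D9 constants:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap bits for this sketch; not the real D3D9 values. */
#define SKETCH_CAP_CULLCCW     (1u << 0)
#define SKETCH_CAP_CLIPTLVERTS (1u << 1)

/* Advertise CLIPTLVERTS only when the driver does not support
 * window-space positions, since that feature disables clipping of
 * pre-transformed vertices. */
static uint32_t sketch_misc_caps(int window_space_position)
{
    uint32_t caps = SKETCH_CAP_CULLCCW;  /* always-on caps */

    if (!window_space_position)
        caps |= SKETCH_CAP_CLIPTLVERTS;
    return caps;
}
```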

Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/adapter9.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/adapter9.c 
b/src/gallium/state_trackers/nine/adapter9.c
index e409d5f..871a9a3 100644
--- a/src/gallium/state_trackers/nine/adapter9.c
+++ b/src/gallium/state_trackers/nine/adapter9.c
@@ -549,7 +549,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
D3DPMISCCAPS_CULLCCW |
D3DPMISCCAPS_COLORWRITEENABLE |
D3DPMISCCAPS_CLIPPLANESCALEDPOINTS |
-   D3DPMISCCAPS_CLIPTLVERTS |
+   /*D3DPMISCCAPS_CLIPTLVERTS |*/
D3DPMISCCAPS_TSSARGTEMP |
D3DPMISCCAPS_BLENDOP |
D3DPIPECAP(INDEP_BLEND_ENABLE, 
D3DPMISCCAPS_INDEPENDENTWRITEMASKS) |
@@ -560,6 +560,8 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
D3DPIPECAP(MIXED_COLORBUFFER_FORMATS, 
D3DPMISCCAPS_MRTINDEPENDENTBITDEPTHS) |
D3DPMISCCAPS_MRTPOSTPIXELSHADERBLENDING |
/*D3DPMISCCAPS_FOGVERTEXCLAMPED*/0;
+if (!screen->get_param(screen, PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION))
+pCaps->PrimitiveMiscCaps |= D3DPMISCCAPS_CLIPTLVERTS;
 
 pCaps->RasterCaps =
 D3DPIPECAP(ANISOTROPIC_FILTER, D3DPRASTERCAPS_ANISOTROPY) |
-- 
2.1.3



[Mesa-dev] [PATCH 03/53] st/nine: Additional defines to d3dtypes.h

2015-01-07 Thread Axel Davy
From: xavier 

Reviewed-by: David Heidelberg 
Reviewed-by: Axel Davy 
Signed-off-by: Xavier Bouchoux 

Cc: "10.4" 
---
 include/D3D9/d3d9types.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/include/D3D9/d3d9types.h b/include/D3D9/d3d9types.h
index 0a8f9e5..e53e389 100644
--- a/include/D3D9/d3d9types.h
+++ b/include/D3D9/d3d9types.h
@@ -224,6 +224,8 @@ typedef struct _RGNDATA {
 #define D3DERR_INVALIDDEVICE MAKE_D3DHRESULT(2155)
 #define D3DERR_INVALIDCALL   MAKE_D3DHRESULT(2156)
 #define D3DERR_DRIVERINVALIDCALL MAKE_D3DHRESULT(2157)
+#define D3DERR_DEVICEREMOVED MAKE_D3DHRESULT(2160)
+#define D3DERR_DEVICEHUNGMAKE_D3DHRESULT(2164)
 
 /
  * Bitmasks *
@@ -331,6 +333,7 @@ typedef struct _RGNDATA {
 
 #define D3DPRESENT_DONOTWAIT  0x0001
 #define D3DPRESENT_LINEAR_CONTENT 0x0002
+#define D3DPRESENT_RATE_DEFAULT0
 
 #define D3DCREATE_FPU_PRESERVE  0x0002
 #define D3DCREATE_MULTITHREADED 0x0004
@@ -344,6 +347,13 @@ typedef struct _RGNDATA {
 #define D3DSTREAMSOURCE_INDEXEDDATA  (1 << 30)
 #define D3DSTREAMSOURCE_INSTANCEDATA (2 << 30)
 
+/* D3DRS_COLORWRITEENABLE */
+#define D3DCOLORWRITEENABLE_RED (1L << 0)
+#define D3DCOLORWRITEENABLE_GREEN   (1L << 1)
+#define D3DCOLORWRITEENABLE_BLUE(1L << 2)
+#define D3DCOLORWRITEENABLE_ALPHA   (1L << 3)
+
+
 /
  * Function macros  *
  ***/
-- 
2.1.3



[Mesa-dev] [PATCH 11/53] st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format

2015-01-07 Thread Axel Davy
Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/cubetexture9.c   |  8 
 src/gallium/state_trackers/nine/texture9.c   |  9 -
 src/gallium/state_trackers/nine/volumetexture9.c | 10 +-
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/cubetexture9.c 
b/src/gallium/state_trackers/nine/cubetexture9.c
index 2c607c0..43db8cb 100644
--- a/src/gallium/state_trackers/nine/cubetexture9.c
+++ b/src/gallium/state_trackers/nine/cubetexture9.c
@@ -38,6 +38,8 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
HANDLE *pSharedHandle )
 {
 struct pipe_resource *info = &This->base.base.info;
+struct pipe_screen *screen = pParams->device->screen;
+enum pipe_format pf;
 unsigned i;
 D3DSURFACE_DESC sfdesc;
 HRESULT hr;
@@ -55,6 +57,12 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
 if (Usage & D3DUSAGE_AUTOGENMIPMAP)
 Levels = 0;
 
+pf = d3d9_to_pipe_format(Format);
+if (pf == PIPE_FORMAT_NONE ||
+!screen->is_format_supported(screen, pf, PIPE_TEXTURE_CUBE, 0, 
PIPE_BIND_SAMPLER_VIEW)) {
+return D3DERR_INVALIDCALL;
+}
+
 info->screen = pParams->device->screen;
 info->target = PIPE_TEXTURE_CUBE;
 info->format = d3d9_to_pipe_format(Format);
diff --git a/src/gallium/state_trackers/nine/texture9.c 
b/src/gallium/state_trackers/nine/texture9.c
index 8852142..4d7e950 100644
--- a/src/gallium/state_trackers/nine/texture9.c
+++ b/src/gallium/state_trackers/nine/texture9.c
@@ -47,6 +47,7 @@ NineTexture9_ctor( struct NineTexture9 *This,
 struct pipe_screen *screen = pParams->device->screen;
 struct pipe_resource *info = &This->base.base.info;
 struct pipe_resource *resource;
+enum pipe_format pf;
 unsigned l;
 D3DSURFACE_DESC sfdesc;
 HRESULT hr;
@@ -92,9 +93,15 @@ NineTexture9_ctor( struct NineTexture9 *This,
 if (Usage & D3DUSAGE_AUTOGENMIPMAP)
 Levels = 0;
 
+pf = d3d9_to_pipe_format(Format);
+if (Format != D3DFMT_NULL && (pf == PIPE_FORMAT_NONE ||
+!screen->is_format_supported(screen, pf, PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW))) {
+return D3DERR_INVALIDCALL;
+}
+
 info->screen = screen;
 info->target = PIPE_TEXTURE_2D;
-info->format = d3d9_to_pipe_format(Format);
+info->format = pf;
 info->width0 = Width;
 info->height0 = Height;
 info->depth0 = 1;
diff --git a/src/gallium/state_trackers/nine/volumetexture9.c b/src/gallium/state_trackers/nine/volumetexture9.c
index 9366dc9..f116899 100644
--- a/src/gallium/state_trackers/nine/volumetexture9.c
+++ b/src/gallium/state_trackers/nine/volumetexture9.c
@@ -37,6 +37,8 @@ NineVolumeTexture9_ctor( struct NineVolumeTexture9 *This,
  HANDLE *pSharedHandle )
 {
 struct pipe_resource *info = &This->base.base.info;
+struct pipe_screen *screen = pParams->device->screen;
+enum pipe_format pf;
 unsigned l;
 D3DVOLUME_DESC voldesc;
 HRESULT hr;
@@ -57,9 +59,15 @@ NineVolumeTexture9_ctor( struct NineVolumeTexture9 *This,
 if (Usage & D3DUSAGE_AUTOGENMIPMAP)
 Levels = 0;
 
+pf = d3d9_to_pipe_format(Format);
+if (pf == PIPE_FORMAT_NONE ||
+!screen->is_format_supported(screen, pf, PIPE_TEXTURE_3D, 0, PIPE_BIND_SAMPLER_VIEW)) {
+return D3DERR_INVALIDCALL;
+}
+
 info->screen = pParams->device->screen;
 info->target = PIPE_TEXTURE_3D;
-info->format = d3d9_to_pipe_format(Format);
+info->format = pf;
 info->width0 = Width;
 info->height0 = Height;
 info->depth0 = Depth;
-- 
2.1.3



[Mesa-dev] [PATCH 00/53] Gallium Nine fixes

2015-01-07 Thread Axel Davy
Most of the patches in this series are fixes,
which is why most of them are Cc'd to 10.4.

The patches can be retrieved here:
https://github.com/iXit/Mesa-3D/tree/submit_mesa

Some of them may not obviously look like fixes, so a little
explanation is required:

"st/nine: Add ATI1 and ATI2 support"

These two compression formats have long been supported by all three
vendors. Some apps, like Mass Effect 2, use them without checking
whether they are supported, and crash.
This patch adds support for these formats, which fixes
these games. It does, however, make a small difference for Unigine
Heaven, which properly checks for support of the format and
uses it after the patch.

"st/nine: Fix POW implementation" and other similar short patches to nine_shader

In some games, a small difference in how the instructions behave in corner
cases will make some textures black, some shadows red, etc.
Several games are fixed by these patches (though we still have some games with
black textures or red shadows :-) )

"st/nine: implement TEXM3x2DEPTH" and similar

These are ps 1.X instructions. Few games (mostly old ones) use these
instructions. All graphics cards are supposed to support them, and
any game trying to use them before these patches would crash.
So implementing these missing instructions is a fix.


The patches of this series have been tested by several testers over the last
few months. We do not expect any regressions. That doesn't mean there are
no mistakes in these patches, so please review :-)

Axel Davy (48):
  st/nine: Fix clip state logic
  st/nine: Add new texture format strings
  st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS
  st/nine: NineBaseTexture9: fix setting of last_layer
  st/nine: CubeTexture: fix GetLevelDesc
  st/nine: Fix crash when deleting non-implicit swapchain
  st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of
bad format
  st/nine: NineBaseTexture9: update sampler view creation
  st/nine: Check if srgb format is supported before trying to use it.
  st/nine: Add ATI1 and ATI2 support
  st/nine: Rework of boolean constants
  st/nine: Convert integer constants to floats before storing them when
cards don't support integers
  st/nine: Remove some shader unused code
  st/nine: Clamp color inputs for ps <= 2.0 at ps level instead of vs
  st/nine: Saturate oFog and oPts vs outputs
  st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs
  st/nine: Fix typo for M4x4
  st/nine: Fix POW implementation
  st/nine: Handle RSQ special cases
  st/nine: Handle NRM with input of null norm
  st/nine: Correct LOG on negative values
  st/nine: Rewrite LOOP implementation, and a0 aL handling
  st/nine: Match REP implementation to LOOP
  st/nine: Fix CND implementation
  st/nine: Remove duplicated code for ps texcoord input declaration
  st/nine: Clamp ps 1.X constants
  st/nine: Fix some fixed function pipeline operation
  st/nine: Fix CALLNZ implementation
  st/nine: Implement TEXCOORD special behaviours
  st/nine: Fill missing dst and src number for some instructions.
  st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC
  st/nine: implement TEXM3x2DEPTH
  st/nine: Implement TEXM3x2TEX
  st/nine: Implement TEXM3x3SPEC
  st/nine: Implement TEXDEPTH
  st/nine: Implement TEXDP3
  st/nine: Implement TEXDP3TEX
  st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB
  st/nine: Implement ps3 advanced input definition feature
  st/nine: Correct rules for relative adressing and constants.
  st/nine: Remove unused code for ps
  st/nine: Fix sm3 relative addressing for non-debug build
  st/nine: Add variables containing the size of the constant buffers
  st/nine: Allocate the correct size for the user constant buffer
  st/nine: Allocate vs constbuf buffer for indirect addressing once.
  st/nine: Explicit nine requirements
  st/nine: Change comment relating to vertex shader inputs not matching
declaration
  st/nine: Correctly handle when ff vs should have no texture coord
input/output

David Heidelberg (1):
  st/nine: query: remove unused variable (trivial)

Stanislaw Halik (1):
  st/nine: Hack to generate resource if it doesn't exist when getting
view

xavier (3):
  st/nine: Additional defines to d3dtypes.h
  st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9
  st/nine: Fix D3DRS_POINTSPRITE support

 include/D3D9/d3d9.h                            |  10 +
 include/D3D9/d3d9types.h                       |  13 +
 src/gallium/state_trackers/nine/adapter9.c     | 113 ++--
 src/gallium/state_trackers/nine/basetexture9.c |  60 +-
 src/gallium/state_trackers/nine/cubetexture9.c |  14 +-
 src/gallium/state_trackers/nine/device9.c      | 130 +++--
 src/gallium/state_trackers/nine/device9.h      |   8 +-
 src/gallium/state_trackers/nine/nine_ff.c      |  37 +-
 src/gallium/state_trackers/nine/nine_pipe.h    |   5 +
 src/gallium/state_trackers/nine/nine_shader.c  | 699 ---
 src/gallium/state_trackers/nine/nine_state.c  

[Mesa-dev] [PATCH 05/53] st/nine: Add new texture format strings

2015-01-07 Thread Axel Davy
Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 include/D3D9/d3d9types.h| 3 +++
 src/gallium/state_trackers/nine/nine_pipe.h | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/include/D3D9/d3d9types.h b/include/D3D9/d3d9types.h
index e53e389..456ae9f 100644
--- a/include/D3D9/d3d9types.h
+++ b/include/D3D9/d3d9types.h
@@ -649,10 +649,13 @@ typedef enum _D3DFORMAT {
 D3DFMT_A1 = 118,
 D3DFMT_A2B10G10R10_XR_BIAS = 119,
 D3DFMT_BINARYBUFFER = 199,
+D3DFMT_ATI1 = MAKEFOURCC('A', 'T', 'I', '1'),
+D3DFMT_ATI2 = MAKEFOURCC('A', 'T', 'I', '2'),
 D3DFMT_DF16 = MAKEFOURCC('D', 'F', '1', '6'),
 D3DFMT_DF24 = MAKEFOURCC('D', 'F', '2', '4'),
 D3DFMT_INTZ = MAKEFOURCC('I', 'N', 'T', 'Z'),
 D3DFMT_NULL = MAKEFOURCC('N', 'U', 'L', 'L'),
+D3DFMT_NVDB = MAKEFOURCC('N', 'V', 'D', 'B'),
 D3DFMT_NV11 = MAKEFOURCC('N', 'V', '1', '1'),
 D3DFMT_NV12 = MAKEFOURCC('N', 'V', '1', '2'),
 D3DFMT_Y210 = MAKEFOURCC('Y', '2', '1', '0'),
diff --git a/src/gallium/state_trackers/nine/nine_pipe.h b/src/gallium/state_trackers/nine/nine_pipe.h
index 1fd1694..06e4dc9 100644
--- a/src/gallium/state_trackers/nine/nine_pipe.h
+++ b/src/gallium/state_trackers/nine/nine_pipe.h
@@ -249,6 +249,8 @@ d3dformat_to_string(D3DFORMAT fmt)
 case D3DFMT_DXT3: return "D3DFMT_DXT3";
 case D3DFMT_DXT4: return "D3DFMT_DXT4";
 case D3DFMT_DXT5: return "D3DFMT_DXT5";
+case D3DFMT_ATI1: return "D3DFMT_ATI1";
+case D3DFMT_ATI2: return "D3DFMT_ATI2";
 case D3DFMT_D16_LOCKABLE: return "D3DFMT_D16_LOCKABLE";
 case D3DFMT_D32: return "D3DFMT_D32";
 case D3DFMT_D15S1: return "D3DFMT_D15S1";
@@ -279,6 +281,7 @@ d3dformat_to_string(D3DFORMAT fmt)
 case D3DFMT_DF16: return "D3DFMT_DF16";
 case D3DFMT_DF24: return "D3DFMT_DF24";
 case D3DFMT_INTZ: return "D3DFMT_INTZ";
+case D3DFMT_NVDB: return "D3DFMT_NVDB";
 case D3DFMT_NULL: return "D3DFMT_NULL";
 default:
 break;
-- 
2.1.3
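
The new D3DFMT_ATI1, D3DFMT_ATI2 and D3DFMT_NVDB values above are FOURCC
codes. As a minimal sketch of what such a value looks like numerically,
assuming the usual little-endian FOURCC packing (first character in the low
byte), the following may help. SKETCH_MAKEFOURCC and sketch_fourcc_to_string
are hypothetical names used only for illustration; they are not part of the
patch or of the D3D9 headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed packing: first character in the lowest byte, as FOURCC codes
 * are conventionally stored. Hypothetical stand-in for MAKEFOURCC. */
#define SKETCH_MAKEFOURCC(a, b, c, d) \
    ((uint32_t)(uint8_t)(a) | ((uint32_t)(uint8_t)(b) << 8) | \
     ((uint32_t)(uint8_t)(c) << 16) | ((uint32_t)(uint8_t)(d) << 24))

/* Hypothetical helper: decode a FOURCC back into a NUL-terminated string,
 * e.g. for a debug message about an unknown format. */
static void sketch_fourcc_to_string(uint32_t fourcc, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((fourcc >> (8 * i)) & 0xff);
    out[4] = '\0';
}
```

Because the enum values are plain 32-bit integers, they can be used as case
labels, which is what the d3dformat_to_string() hunk relies on.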



[Mesa-dev] [PATCH 04/53] st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9

2015-01-07 Thread Axel Davy
From: xavier 

Reviewed-by: David Heidelberg 
Reviewed-by: Axel Davy 
Signed-off-by: Xavier Bouchoux 

Cc: "10.4" 
---
 include/D3D9/d3d9.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/include/D3D9/d3d9.h b/include/D3D9/d3d9.h
index f872be7..e8b5214 100644
--- a/include/D3D9/d3d9.h
+++ b/include/D3D9/d3d9.h
@@ -399,6 +399,16 @@ struct IDirect3DVolume9 : public IUnknown
virtual HRESULT WINAPI UnlockBox() = 0;
 };
 
+struct IDirect3DVolumeTexture9 : public IDirect3DBaseTexture9
+{
+virtual HRESULT WINAPI GetLevelDesc(UINT Level, D3DVOLUME_DESC *pDesc) = 0;
+virtual HRESULT WINAPI GetVolumeLevel(UINT Level, IDirect3DVolume9 **ppVolumeLevel) = 0;
+virtual HRESULT WINAPI LockBox(UINT Level, D3DLOCKED_BOX *pLockedVolume, const D3DBOX *pBox, DWORD Flags) = 0;
+virtual HRESULT WINAPI UnlockBox(UINT Level) = 0;
+virtual HRESULT WINAPI AddDirtyBox(const D3DBOX *pDirtyBox) = 0;
+};
+
+
 #else /* __cplusplus */
 
 extern const GUID IID_IDirect3D9;
-- 
2.1.3



[Mesa-dev] [PATCH 02/53] st/nine: Fix clip state logic

2015-01-07 Thread Axel Davy
The clip state was set again on every state update because the dirty flag
was never cleared, incurring an overhead.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_state.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_state.c b/src/gallium/state_trackers/nine/nine_state.c
index 4175803..e4e6788 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -839,8 +839,10 @@ nine_update_state(struct NineDevice9 *device, uint32_t mask)
 }
 }
 
-if (state->changed.ucp)
+if (state->changed.ucp) {
 pipe->set_clip_state(pipe, &state->clip);
+state->changed.ucp = 0;
+}
 
 if (group & (NINE_STATE_FREQ_GROUP_1 | NINE_STATE_VS)) {
 if (group & (NINE_STATE_TEXTURE | NINE_STATE_SAMPLER))
-- 
2.1.3
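
The fix above is the usual dirty-flag pattern: push the state to the driver
only when it is marked changed, then clear the mark. A minimal sketch,
assuming simplified stand-in types (sketch_state, sketch_pipe and
sketch_update_state are hypothetical and only mimic the shape of the real
code):

```c
#include <assert.h>

/* Hypothetical stand-ins for the device state and the pipe context. */
struct sketch_state {
    unsigned changed_ucp;    /* non-zero when the clip planes were modified */
};

struct sketch_pipe {
    unsigned set_clip_calls; /* counts how often the state was pushed */
};

/* Push the clip state only when it is dirty, then clear the dirty flag so
 * that the next update does not push it again. */
static void sketch_update_state(struct sketch_pipe *pipe,
                                struct sketch_state *state)
{
    if (state->changed_ucp) {
        pipe->set_clip_calls++;  /* stands in for pipe->set_clip_state() */
        state->changed_ucp = 0;
    }
}
```

Without the line that clears the flag, every call would take the branch
again, which matches the overhead described in the commit message.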



[Mesa-dev] [PATCH 12/53] st/nine: NineBaseTexture9: update sampler view creation

2015-01-07 Thread Axel Davy
While the previous code generally behaved correctly,
this new code is more readable (it no longer checks all gallium formats
manually) and has better-defined behaviour for depth stencil resources.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/basetexture9.c | 39 +-
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/src/gallium/state_trackers/nine/basetexture9.c b/src/gallium/state_trackers/nine/basetexture9.c
index af4778b..fb5a61a 100644
--- a/src/gallium/state_trackers/nine/basetexture9.c
+++ b/src/gallium/state_trackers/nine/basetexture9.c
@@ -436,6 +436,10 @@ NineBaseTexture9_CreatePipeResource( struct NineBaseTexture9 *This,
 return D3D_OK;
 }
 
+#define SWIZZLE_TO_REPLACE(s) (s == UTIL_FORMAT_SWIZZLE_0 || \
+   s == UTIL_FORMAT_SWIZZLE_1 || \
+   s == UTIL_FORMAT_SWIZZLE_NONE)
+
 HRESULT
 NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
 const int sRGB )
@@ -444,6 +448,7 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
 struct pipe_context *pipe = This->pipe;
 struct pipe_resource *resource = This->base.resource;
 struct pipe_sampler_view templ;
+unsigned i;
 uint8_t swizzle[4];
 
 DBG("This=%p sRGB=%d\n", This, sRGB);
@@ -463,20 +468,28 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
 swizzle[3] = PIPE_SWIZZLE_ALPHA;
 desc = util_format_description(resource->format);
 if (desc->colorspace == UTIL_FORMAT_COLORSPACE_ZS) {
-/* ZZZ1 -> 0Z01 (see end of docs/source/tgsi.rst)
- * XXX: but it's wrong
-swizzle[0] = PIPE_SWIZZLE_ZERO;
-swizzle[2] = PIPE_SWIZZLE_ZERO; */
-} else
-if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_X &&
-desc->swizzle[3] == UTIL_FORMAT_SWIZZLE_1) {
-/* R001/RG01 -> R111/RG11 */
-if (desc->swizzle[1] == UTIL_FORMAT_SWIZZLE_0)
-swizzle[1] = PIPE_SWIZZLE_ONE;
-if (desc->swizzle[2] == UTIL_FORMAT_SWIZZLE_0)
-swizzle[2] = PIPE_SWIZZLE_ONE;
+/* msdn doc says default values are R = B = 0.0,
+ * A = 1.0. This implictly indicates the green channel
+ * is always filled with content. However games seem to
+ * look for depth in the r channel, like gallium does.
+ * Moreover it's what dx10 states. In addition, some documentation
+ * seems to indicate depth is the only thing given for depth-stencil
+ * formats. Thus reword the spec by: R should contain the depth.
+ * R, G and B default values are 0.0, while A default value is 1.0 */
+if (SWIZZLE_TO_REPLACE(desc->swizzle[0]))
+swizzle[0] = PIPE_SWIZZLE_ZERO;
+swizzle[1] = PIPE_SWIZZLE_ZERO;
+swizzle[2] = PIPE_SWIZZLE_ZERO;
+swizzle[3] = PIPE_SWIZZLE_ONE;
+} else if (resource->format != PIPE_FORMAT_A8_UNORM) {
+/* A8 is the only exception that should have 0.0 as default values
+ * for RGB. It is already what gallium does. All the other ones
+ * should have 1.0 for non-defined values */
+for (i = 0; i < 4; i++) {
+if (SWIZZLE_TO_REPLACE(desc->swizzle[i]))
+swizzle[i] = PIPE_SWIZZLE_ONE;
+}
 }
-/* but 000A remains unchanged */
 
 templ.format = sRGB ? util_format_srgb(resource->format) : resource->format;
 templ.u.tex.first_layer = 0;
-- 
2.1.3



[Mesa-dev] [PATCH 28/53] st/nine: Match REP implementation to LOOP

2015-01-07 Thread Axel Davy
The previous implementation was fine;
this just uses a decreasing counter
instead of an increasing one.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_shader.c | 41 +++
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index 21b06ce..88d4c07 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1562,9 +1562,7 @@ DECL_SPECIAL(REP)
 unsigned *label;
 struct ureg_src rep = tx_src_param(tx, &tx->insn.src[0]);
 struct ureg_dst ctr;
-struct ureg_dst tmp = tx_scratch_scalar(tx);
-struct ureg_src imm =
-tx->native_integers ? ureg_imm1u(ureg, 0) : ureg_imm1f(ureg, 0.0f);
+struct ureg_dst tmp;
 
 label = tx_bgnloop(tx);
 ctr = tx_get_loopctr(tx, FALSE);
@@ -1572,33 +1570,40 @@ DECL_SPECIAL(REP)
 /* NOTE: rep must be constant, so we don't have to save the count */
 assert(rep.File == TGSI_FILE_CONSTANT || rep.File == TGSI_FILE_IMMEDIATE);
 
-ureg_MOV(ureg, ctr, imm);
+ureg_MOV(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0), rep);
+/* in the case ctr is float, remove 0.5 to avoid precision issues for comparisons */
+if (!tx->native_integers)
+ureg_ADD(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0), ureg_src(ctr), ureg_imm1f(ureg, -0.5f));
+
 ureg_BGNLOOP(ureg, label);
-if (tx->native_integers)
-{
-ureg_USGE(ureg, tmp, tx_src_scalar(ctr), rep);
-ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
-}
-else
-{
-ureg_SGE(ureg, tmp, tx_src_scalar(ctr), rep);
+tmp = tx_scratch_scalar(tx);
+
+/* stop when crt.x <= 0 */
+if (!tx->native_integers) {
+ureg_SLE(ureg, tmp, ureg_scalar(ureg_src(ctr), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 0.0f));
 ureg_IF(ureg, tx_src_scalar(tmp), tx_cond(tx));
+} else {
+ureg_ISGE(ureg, tmp, ureg_imm1i(ureg, 0), ureg_scalar(ureg_src(ctr), TGSI_SWIZZLE_X));
+ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
 }
 ureg_BRK(ureg);
 tx_endcond(tx);
 ureg_ENDIF(ureg);
 
-if (tx->native_integers) {
-ureg_UADD(ureg, ctr, tx_src_scalar(ctr), ureg_imm1u(ureg, 1));
-} else {
-ureg_ADD(ureg, ctr, tx_src_scalar(ctr), ureg_imm1f(ureg, 1.0f));
-}
-
 return D3D_OK;
 }
 
 DECL_SPECIAL(ENDREP)
 {
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst ctr = tx_get_loopctr(tx, FALSE);
+
+if (!tx->native_integers) {
+ureg_ADD(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0), ureg_src(ctr), ureg_imm1f(ureg, -1.0f));
+} else {
+ureg_UADD(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0), ureg_src(ctr), ureg_imm1i(ureg, -1.0));
+}
+
 ureg_ENDLOOP(tx->ureg, tx_endloop(tx));
 return D3D_OK;
 }
-- 
2.1.3
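
To illustrate the decreasing counter described above, here is a minimal
sketch in plain C of the loop the patch emits, as far as it can be read from
the diff: the counter starts at the repeat count (minus 0.5 in the float
case), the loop breaks once the counter is <= 0, and ENDREP subtracts 1. The
function names are hypothetical; this only models the control flow and is
not TGSI.

```c
#include <assert.h>

/* Float path: the counter is biased by -0.5 so that the "<= 0" test does
 * not depend on comparing against an exact float value. */
static unsigned sketch_rep_iterations_float(float rep)
{
    float ctr = rep - 0.5f;
    unsigned iterations = 0;

    for (;;) {
        if (ctr <= 0.0f)   /* corresponds to SLE + IF + BRK */
            break;
        iterations++;      /* loop body */
        ctr += -1.0f;      /* corresponds to the ADD in ENDREP */
    }
    return iterations;
}

/* Integer path: no bias is needed. */
static unsigned sketch_rep_iterations_int(int rep)
{
    int ctr = rep;
    unsigned iterations = 0;

    for (;;) {
        if (0 >= ctr)      /* corresponds to ISGE(0, ctr) + UIF + BRK */
            break;
        iterations++;
        ctr += -1;
    }
    return iterations;
}
```

With rep = 3 the float counter takes the values 2.5, 1.5, 0.5 and then -0.5,
so the body runs three times, the same as in the integer path.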



[Mesa-dev] [PATCH 35/53] st/nine: Fill missing dst and src number for some instructions.

2015-01-07 Thread Axel Davy
Not filling them correctly results in bad padding and a later crash.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 46 +--
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index cf3f646..2b0349f 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2124,7 +2124,7 @@ DECL_SPECIAL(TEXREG2GB)
 
 DECL_SPECIAL(TEXM3x2PAD)
 {
-STUB(D3DERR_INVALIDCALL);
+return D3D_OK; /* this is just padding */
 }
 
 DECL_SPECIAL(TEXM3x2TEX)
@@ -2361,12 +2361,12 @@ struct sm1_op_info inst_table[] =
 _OPI(M3x3, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(M3x3)),
 _OPI(M3x2, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(M3x2)),
 
-_OPI(CALL,    CAL,     V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(CALL)),
-_OPI(CALLNZ,  CAL,     V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(CALLNZ)),
+_OPI(CALL,    CAL,     V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(CALL)),
+_OPI(CALLNZ,  CAL,     V(2,0), V(3,0), V(2,1), V(3,0), 0, 2, SPECIAL(CALLNZ)),
 _OPI(LOOP,    BGNLOOP, V(2,0), V(3,0), V(3,0), V(3,0), 0, 2, SPECIAL(LOOP)),
 _OPI(RET,     RET,     V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(RET)),
 _OPI(ENDLOOP, ENDLOOP, V(2,0), V(3,0), V(3,0), V(3,0), 0, 0, SPECIAL(ENDLOOP)),
-_OPI(LABEL,   NOP,     V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(LABEL)),
+_OPI(LABEL,   NOP,     V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(LABEL)),
 
 _OPI(DCL, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 0, 0, SPECIAL(DCL)),
 
@@ -2401,16 +2401,16 @@ struct sm1_op_info inst_table[] =
 _OPI(TEX,          TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 0, SPECIAL(TEX)),
 _OPI(TEX,          TEX, V(0,0), V(0,0), V(1,4), V(1,4), 1, 1, SPECIAL(TEXLD_14)),
 _OPI(TEX,          TEX, V(0,0), V(0,0), V(2,0), V(3,0), 1, 2, SPECIAL(TEXLD)),
-_OPI(TEXBEM,       TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXBEM)),
-_OPI(TEXBEML,      TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXBEML)),
-_OPI(TEXREG2AR,    TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXREG2AR)),
-_OPI(TEXREG2GB,    TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXREG2GB)),
-_OPI(TEXM3x2PAD,   TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x2PAD)),
-_OPI(TEXM3x2TEX,   TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x2TEX)),
-_OPI(TEXM3x3PAD,   TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3PAD)),
-_OPI(TEXM3x3TEX,   TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3)),
-_OPI(TEXM3x3SPEC,  TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3SPEC)),
-_OPI(TEXM3x3VSPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3VSPEC)),
+_OPI(TEXBEM,       TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXBEM)),
+_OPI(TEXBEML,      TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXBEML)),
+_OPI(TEXREG2AR,    TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXREG2AR)),
+_OPI(TEXREG2GB,    TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXREG2GB)),
+_OPI(TEXM3x2PAD,   TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x2PAD)),
+_OPI(TEXM3x2TEX,   TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x2TEX)),
+_OPI(TEXM3x3PAD,   TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3PAD)),
+_OPI(TEXM3x3TEX,   TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3)),
+_OPI(TEXM3x3SPEC,  TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 2, SPECIAL(TEXM3x3SPEC)),
+_OPI(TEXM3x3VSPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3VSPEC)),
 
 _OPI(EXPP, EXP, V(0,0), V(1,1), V(0,0), V(0,0), 1, 1, NULL),
 _OPI(EXPP, EX2, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
@@ -2420,23 +2420,23 @@ struct sm1_op_info inst_table[] =
 _OPI(DEF, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 0, SPECIAL(DEF)),
 
 /* More tex stuff */
-_OPI(TEXREG2RGB,   TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXREG2RGB)),
-_OPI(TEXDP3TEX,    TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXDP3TEX)),
-_OPI(TEXM3x2DEPTH, TEX, V(0,0), V(0,0), V(1,3), V(1,3), 0, 0, SPECIAL(TEXM3x2DEPTH)),
-_OPI(TEXDP3,       TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXDP3)),
-_OPI(TEXM3x3,      TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXM3x3)),
-_OPI(TEXDEPTH,     TEX, V(0,0), V(0,0), V(1,4), V(1,4), 0, 0, SPECIAL(TEXDEPTH)),
+_OPI(TEXREG2RGB,   TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXREG2RGB)),
+_OPI(TEXDP3TEX,    TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXDP3TEX)),
+_OPI(TEXM3x2DEPTH, TEX, V(0,0), V(0,0), V(1,3), V(1,3), 1, 1, SPECIAL(TEXM3x2DEPTH)),
+_OPI(TEXDP3,   TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 

[Mesa-dev] [PATCH 47/53] st/nine: Fix sm3 relative addressing for non-debug build

2015-01-07 Thread Axel Davy
Relative addressing needs the constant buffer to get all
the correct constants, even those defined by the shader.

The code that copies the shader constants to the constant buffer
was enabled only for debug builds. Enable it always.

Cc: "10.4" 
Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_state.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_state.c b/src/gallium/state_trackers/nine/nine_state.c
index 870b1b0..0137a78 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -496,7 +496,6 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
 state->changed.vs_const_b = 0;
 }
 
-#ifdef DEBUG
 if (device->state.vs->lconstf.ranges) {
 /* TODO: Can we make it so that we don't have to copy everything ? */
 const struct nine_lconstf *lconstf =  &device->state.vs->lconstf;
@@ -514,14 +513,11 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
 }
 cb.user_buffer = dst;
 }
-#endif
 
 pipe->set_constant_buffer(pipe, PIPE_SHADER_VERTEX, 0, &cb);
 
-#ifdef DEBUG
 if (device->state.vs->lconstf.ranges)
 FREE((void *)cb.user_buffer);
-#endif
 
 if (device->state.changed.vs_const_f) {
 struct nine_range *r = device->state.changed.vs_const_f;
-- 
2.1.3



[Mesa-dev] [PATCH 41/53] st/nine: Implement TEXDP3

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index c484e7c..02fb69e 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2239,7 +2239,17 @@ DECL_SPECIAL(TEXM3x2DEPTH)
 
 DECL_SPECIAL(TEXDP3)
 {
-STUB(D3DERR_INVALIDCALL);
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+const int m = tx->insn.dst[0].idx;
+const int n = tx->insn.src[0].idx;
+assert(m >= 0 && m > n);
+
+tx_texcoord_alloc(tx, m);
+
+ureg_DP3(ureg, dst, tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
+
+return D3D_OK;
 }
 
 DECL_SPECIAL(TEXM3x3)
-- 
2.1.3



[Mesa-dev] [PATCH 14/53] st/nine: Check if srgb format is supported before trying to use it.

2015-01-07 Thread Axel Davy
According to MSDN, if we don't support sRGB we must act as if the user
didn't ask for it.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/basetexture9.c | 11 ++-
 src/gallium/state_trackers/nine/surface9.c | 10 +-
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/basetexture9.c b/src/gallium/state_trackers/nine/basetexture9.c
index ccfd199..ffccafd 100644
--- a/src/gallium/state_trackers/nine/basetexture9.c
+++ b/src/gallium/state_trackers/nine/basetexture9.c
@@ -446,8 +446,10 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
 {
 const struct util_format_description *desc;
 struct pipe_context *pipe = This->pipe;
+struct pipe_screen *screen = pipe->screen;
 struct pipe_resource *resource = This->base.resource;
 struct pipe_sampler_view templ;
+enum pipe_format srgb_format;
 unsigned i;
 uint8_t swizzle[4];
 
@@ -494,7 +496,14 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
 }
 }
 
-templ.format = sRGB ? util_format_srgb(resource->format) : resource->format;
+/* if requested and supported, convert to the sRGB format */
+srgb_format = util_format_srgb(resource->format);
+if (sRGB && srgb_format != PIPE_FORMAT_NONE &&
+screen->is_format_supported(screen, srgb_format,
+resource->target, 0, resource->bind))
+templ.format = srgb_format;
+else
+templ.format = resource->format;
 templ.u.tex.first_layer = 0;
 templ.u.tex.last_layer = resource->target == PIPE_TEXTURE_3D ?
  resource->depth0 - 1 : resource->array_size - 1;
diff --git a/src/gallium/state_trackers/nine/surface9.c b/src/gallium/state_trackers/nine/surface9.c
index e19d24b..5928892 100644
--- a/src/gallium/state_trackers/nine/surface9.c
+++ b/src/gallium/state_trackers/nine/surface9.c
@@ -150,14 +150,22 @@ struct pipe_surface *
 NineSurface9_CreatePipeSurface( struct NineSurface9 *This, const int sRGB )
 {
 struct pipe_context *pipe = This->pipe;
+struct pipe_screen *screen = pipe->screen;
 struct pipe_resource *resource = This->base.resource;
 struct pipe_surface templ;
+enum pipe_format srgb_format;
 
 assert(This->desc.Pool == D3DPOOL_DEFAULT ||
This->desc.Pool == D3DPOOL_MANAGED);
 assert(resource);
 
-templ.format = sRGB ? util_format_srgb(resource->format) : resource->format;
+srgb_format = util_format_srgb(resource->format);
+if (sRGB && srgb_format != PIPE_FORMAT_NONE &&
+screen->is_format_supported(screen, srgb_format,
+resource->target, 0, resource->bind))
+templ.format = srgb_format;
+else
+templ.format = resource->format;
 templ.u.tex.level = This->level;
 templ.u.tex.first_layer = This->layer;
 templ.u.tex.last_layer = This->layer;
-- 
2.1.3
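
The selection logic added in both hunks above can be summarised as: use the
sRGB variant only if it was requested, exists, and is supported; otherwise
fall back to the base format. A minimal sketch with hypothetical stand-in
types (the enum, the predicate type and all sketch_ names are assumptions
for illustration, not gallium API):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, tiny stand-in for enum pipe_format. */
enum sketch_format {
    SKETCH_FORMAT_NONE = 0,
    SKETCH_FORMAT_RGBA8,
    SKETCH_FORMAT_RGBA8_SRGB,
    SKETCH_FORMAT_R16F
};

/* Stand-in for the screen's is_format_supported() callback. */
typedef bool (*sketch_supported_fn)(enum sketch_format fmt);

/* Stand-in for util_format_srgb(): NONE when there is no sRGB variant. */
static enum sketch_format sketch_srgb_variant(enum sketch_format fmt)
{
    return fmt == SKETCH_FORMAT_RGBA8 ? SKETCH_FORMAT_RGBA8_SRGB
                                      : SKETCH_FORMAT_NONE;
}

/* Use the sRGB variant only if requested, existing and supported. */
static enum sketch_format sketch_pick_format(enum sketch_format base,
                                             bool want_srgb,
                                             sketch_supported_fn supported)
{
    enum sketch_format srgb = sketch_srgb_variant(base);

    if (want_srgb && srgb != SKETCH_FORMAT_NONE && supported(srgb))
        return srgb;
    return base;
}

/* Two example "drivers" for trying the function out. */
static bool sketch_supports_all(enum sketch_format fmt)
{
    return fmt != SKETCH_FORMAT_NONE;
}

static bool sketch_supports_no_srgb(enum sketch_format fmt)
{
    return fmt != SKETCH_FORMAT_NONE && fmt != SKETCH_FORMAT_RGBA8_SRGB;
}
```

The fallback branch is what implements "act as if the user didn't ask for
it": the caller still gets a usable format instead of an error.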



[Mesa-dev] [PATCH 32/53] st/nine: Fix some fixed function pipeline operation

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_ff.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c b/src/gallium/state_trackers/nine/nine_ff.c
index a6bd360..d2b30f8 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -1151,10 +1151,10 @@ ps_do_ts_op(struct ps_build_ctx *ps, unsigned top, struct ureg_dst dst, struct u
 ureg_MUL(ureg, ureg_saturate(dst), ureg_src(tmp), ureg_imm4f(ureg,4.0,4.0,4.0,4.0));
 break;
 case D3DTOP_MULTIPLYADD:
-ureg_MAD(ureg, dst, arg[2], arg[0], arg[1]);
+ureg_MAD(ureg, dst, arg[1], arg[2], arg[0]);
 break;
 case D3DTOP_LERP:
-ureg_LRP(ureg, dst, arg[1], arg[2], arg[0]);
+ureg_LRP(ureg, dst, arg[0], arg[1], arg[2]);
 break;
 case D3DTOP_DISABLE:
 /* no-op ? */
@@ -1278,6 +1278,8 @@ nine_ff_build_ps(struct NineDevice9 *device, struct nine_ff_ps_key *key)
 (key->ts[0].resultarg != 0 /* not current */ ||
  key->ts[0].colorop == D3DTOP_DISABLE ||
  key->ts[0].alphaop == D3DTOP_DISABLE ||
+ key->ts[0].colorop == D3DTOP_BLENDCURRENTALPHA ||
+ key->ts[0].alphaop == D3DTOP_BLENDCURRENTALPHA ||
  key->ts[0].colorarg0 == D3DTA_CURRENT ||
  key->ts[0].colorarg1 == D3DTA_CURRENT ||
  key->ts[0].colorarg2 == D3DTA_CURRENT ||
-- 
2.1.3
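
To make the operand reordering above easier to follow, here is a small
scalar sketch of what the patched calls compute, assuming the usual TGSI
definitions MAD(a, b, c) = a * b + c and LRP(a, b, c) = a * b + (1 - a) * c.
The sketch_ functions are hypothetical and work on single floats instead of
registers; arg[] follows the indexing used in ps_do_ts_op().

```c
#include <assert.h>

/* Scalar models of the two TGSI opcodes (assumed definitions, see above). */
static float sketch_mad(float a, float b, float c)
{
    return a * b + c;
}

static float sketch_lrp(float a, float b, float c)
{
    return a * b + (1.0f - a) * c;
}

/* Operand order after the patch: MAD(arg[1], arg[2], arg[0]). */
static float sketch_top_multiplyadd(const float arg[3])
{
    return sketch_mad(arg[1], arg[2], arg[0]);
}

/* Operand order after the patch: LRP(arg[0], arg[1], arg[2]). */
static float sketch_top_lerp(const float arg[3])
{
    return sketch_lrp(arg[0], arg[1], arg[2]);
}
```

So after the patch, D3DTOP_MULTIPLYADD yields arg[0] + arg[1] * arg[2], and
D3DTOP_LERP uses arg[0] as the interpolation factor between arg[1] and
arg[2].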



[Mesa-dev] [PATCH 48/53] st/nine: Add variables containing the size of the constant buffers

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/device9.c | 10 ++
 src/gallium/state_trackers/nine/device9.h |  2 ++
 src/gallium/state_trackers/nine/stateblock9.c |  4 ++--
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c
index 5d1a507..cae9239 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -109,7 +109,7 @@ NineDevice9_RestoreNonCSOState( struct NineDevice9 *This, unsigned mask )
 cb.buffer = This->constbuf_vs;
 cb.user_buffer = NULL;
 }
-cb.buffer_size = This->constbuf_vs->width0;
+cb.buffer_size = This->vs_const_size;
 pipe->set_constant_buffer(pipe, PIPE_SHADER_VERTEX, 0, &cb);
 
 if (This->prefer_user_constbuf) {
@@ -117,7 +117,7 @@ NineDevice9_RestoreNonCSOState( struct NineDevice9 *This, unsigned mask )
 } else {
 cb.buffer = This->constbuf_ps;
 }
-cb.buffer_size = This->constbuf_ps->width0;
+cb.buffer_size = This->ps_const_size;
 pipe->set_constant_buffer(pipe, PIPE_SHADER_FRAGMENT, 0, &cb);
 }
 
@@ -262,6 +262,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
 This->max_ps_const_f = max_const_ps -
(NINE_MAX_CONST_I + NINE_MAX_CONST_B / 4);
 
+This->vs_const_size = max_const_vs * sizeof(float[4]);
+This->ps_const_size = max_const_ps * sizeof(float[4]);
 /* Include space for I,B constants for user constbuf. */
 This->state.vs_const_f = CALLOC(NINE_MAX_CONST_ALL, sizeof(float[4]));
 This->state.ps_const_f = CALLOC(NINE_MAX_CONST_ALL, sizeof(float[4]));
@@ -283,10 +285,10 @@ NineDevice9_ctor( struct NineDevice9 *This,
 tmpl.bind = PIPE_BIND_CONSTANT_BUFFER;
 tmpl.flags = 0;
 
-tmpl.width0 = max_const_vs * sizeof(float[4]);
+tmpl.width0 = This->vs_const_size;
 This->constbuf_vs = pScreen->resource_create(pScreen, &tmpl);
 
-tmpl.width0 = max_const_ps * sizeof(float[4]);
+tmpl.width0 = This->ps_const_size;
 This->constbuf_ps = pScreen->resource_create(pScreen, &tmpl);
 
 if (!This->constbuf_vs || !This->constbuf_ps)
diff --git a/src/gallium/state_trackers/nine/device9.h b/src/gallium/state_trackers/nine/device9.h
index cf2138a..65e39f0 100644
--- a/src/gallium/state_trackers/nine/device9.h
+++ b/src/gallium/state_trackers/nine/device9.h
@@ -77,6 +77,8 @@ struct NineDevice9
 
 struct pipe_resource *constbuf_vs;
 struct pipe_resource *constbuf_ps;
+uint16_t vs_const_size;
+uint16_t ps_const_size;
 uint16_t max_vs_const_f;
 uint16_t max_ps_const_f;;
 
diff --git a/src/gallium/state_trackers/nine/stateblock9.c b/src/gallium/state_trackers/nine/stateblock9.c
index 36b5e77..220b196 100644
--- a/src/gallium/state_trackers/nine/stateblock9.c
+++ b/src/gallium/state_trackers/nine/stateblock9.c
@@ -43,8 +43,8 @@ NineStateBlock9_ctor( struct NineStateBlock9 *This,
 
 This->type = type;
 
-This->state.vs_const_f = MALLOC(pParams->device->constbuf_vs->width0);
-This->state.ps_const_f = MALLOC(pParams->device->constbuf_ps->width0);
+This->state.vs_const_f = MALLOC(This->base.device->vs_const_size);
+This->state.ps_const_f = MALLOC(This->base.device->ps_const_size);
 if (!This->state.vs_const_f || !This->state.ps_const_f)
 return E_OUTOFMEMORY;
 
-- 
2.1.3



[Mesa-dev] [PATCH 37/53] st/nine: implement TEXM3x2DEPTH

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 440f6f7..ac86237 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2154,7 +2154,32 @@ DECL_SPECIAL(TEXDP3TEX)
 
 DECL_SPECIAL(TEXM3x2DEPTH)
 {
-STUB(D3DERR_INVALIDCALL);
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst tmp;
+const int m = tx->insn.dst[0].idx - 1;
+const int n = tx->insn.src[0].idx;
+assert(m >= 0 && m > n);
+
+tx_texcoord_alloc(tx, m);
+tx_texcoord_alloc(tx, m+1);
+
+tmp = tx_scratch(tx);
+
+/* performs the matrix multiplication */
+ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
+ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
+
+ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Z), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
+/* tmp.x = 'z', tmp.y = 'w', tmp.z = 1/'w'. */
+ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Z));
+/* res = 'w' == 0 ? 1.0 : z/w */
+ureg_CMP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_negate(ureg_abs(ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y))),
+ ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 1.0f));
+/* replace the depth for depth testing with the result */
+tx->regs.oDepth = ureg_DECL_output_masked(ureg, TGSI_SEMANTIC_POSITION, 0, TGSI_WRITEMASK_Z);
+ureg_MOV(ureg, tx->regs.oDepth, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
+/* note that we write nothing to the destination, since it's disallowed to use it afterward */
+return D3D_OK;
 }
 
 DECL_SPECIAL(TEXDP3)
-- 
2.1.3



[Mesa-dev] [PATCH 17/53] st/nine: Convert integer constants to floats before storing them when cards don't support integers

2015-01-07 Thread Axel Davy
The shader code already behaves as if they are floats when the card
doesn't support integers.
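
A minimal sketch of the idea, not the patch itself: each integer is converted
to a float value and the float's bit pattern is what gets stored, so a shader
that reads the slot as a float sees the right number. The helpers float_bits()
and bits_float() are hypothetical stand-ins for the fui()/uif() calls used in
the diff, and the function names are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for fui()/uif(): reinterpret bits without casting. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Set path: convert each integer to float, keep the float's bit pattern. */
static void store_const_i_as_float(uint32_t stored[4], const int32_t src[4])
{
    for (int c = 0; c < 4; c++)
        stored[c] = float_bits((float)src[c]);
}

/* Get path: reinterpret the stored bits as float, convert back to integer. */
static void load_const_i_from_float(int32_t dst[4], const uint32_t stored[4])
{
    for (int c = 0; c < 4; c++)
        dst[c] = (int32_t)bits_float(stored[c]);
}
```

The round trip is exact only for integers that a 32-bit float can represent
(magnitude up to 2^24); larger values would be rounded on the Set path.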

Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/device9.c | 65 ---
 1 file changed, 52 insertions(+), 13 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index d747a7a..5d1a507 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -2932,6 +2932,7 @@ NineDevice9_SetVertexShaderConstantI( struct NineDevice9 
*This,
   UINT Vector4iCount )
 {
 struct nine_state *state = This->update;
+int i;
 
 DBG("This=%p StartRegister=%u pConstantData=%p Vector4iCount=%u\n",
 This, StartRegister, pConstantData, Vector4iCount);
@@ -2940,9 +2941,18 @@ NineDevice9_SetVertexShaderConstantI( struct NineDevice9 
*This,
 user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, 
D3DERR_INVALIDCALL);
 user_assert(pConstantData, D3DERR_INVALIDCALL);
 
-memcpy(&state->vs_const_i[StartRegister][0],
-   pConstantData,
-   Vector4iCount * sizeof(state->vs_const_i[0]));
+if (This->driver_caps.vs_integer) {
+memcpy(&state->vs_const_i[StartRegister][0],
+   pConstantData,
+   Vector4iCount * sizeof(state->vs_const_i[0]));
+} else {
+for (i = 0; i < Vector4iCount; i++) {
+state->vs_const_i[StartRegister+i][0] = 
fui((float)(pConstantData[4*i]));
+state->vs_const_i[StartRegister+i][1] = 
fui((float)(pConstantData[4*i+1]));
+state->vs_const_i[StartRegister+i][2] = 
fui((float)(pConstantData[4*i+2]));
+state->vs_const_i[StartRegister+i][3] = 
fui((float)(pConstantData[4*i+3]));
+}
+}
 
 state->changed.vs_const_i |= ((1 << Vector4iCount) - 1) << StartRegister;
 state->changed.group |= NINE_STATE_VS_CONST;
@@ -2957,14 +2967,24 @@ NineDevice9_GetVertexShaderConstantI( struct 
NineDevice9 *This,
   UINT Vector4iCount )
 {
 const struct nine_state *state = &This->state;
+int i;
 
 user_assert(StartRegister  < NINE_MAX_CONST_I, 
D3DERR_INVALIDCALL);
 user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, 
D3DERR_INVALIDCALL);
 user_assert(pConstantData, D3DERR_INVALIDCALL);
 
-memcpy(pConstantData,
-   &state->vs_const_i[StartRegister][0],
-   Vector4iCount * sizeof(state->vs_const_i[0]));
+if (This->driver_caps.vs_integer) {
+memcpy(pConstantData,
+   &state->vs_const_i[StartRegister][0],
+   Vector4iCount * sizeof(state->vs_const_i[0]));
+} else {
+for (i = 0; i < Vector4iCount; i++) {
+pConstantData[4*i] = (int32_t) 
uif(state->vs_const_i[StartRegister+i][0]);
+pConstantData[4*i+1] = (int32_t) 
uif(state->vs_const_i[StartRegister+i][1]);
+pConstantData[4*i+2] = (int32_t) 
uif(state->vs_const_i[StartRegister+i][2]);
+pConstantData[4*i+3] = (int32_t) 
uif(state->vs_const_i[StartRegister+i][3]);
+}
+}
 
 return D3D_OK;
 }
@@ -3238,6 +3258,7 @@ NineDevice9_SetPixelShaderConstantI( struct NineDevice9 
*This,
  UINT Vector4iCount )
 {
 struct nine_state *state = This->update;
+int i;
 
 DBG("This=%p StartRegister=%u pConstantData=%p Vector4iCount=%u\n",
 This, StartRegister, pConstantData, Vector4iCount);
@@ -3246,10 +3267,18 @@ NineDevice9_SetPixelShaderConstantI( struct NineDevice9 
*This,
 user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, 
D3DERR_INVALIDCALL);
 user_assert(pConstantData, D3DERR_INVALIDCALL);
 
-memcpy(&state->ps_const_i[StartRegister][0],
-   pConstantData,
-   Vector4iCount * sizeof(state->ps_const_i[0]));
-
+if (This->driver_caps.ps_integer) {
+memcpy(&state->ps_const_i[StartRegister][0],
+   pConstantData,
+   Vector4iCount * sizeof(state->ps_const_i[0]));
+} else {
+for (i = 0; i < Vector4iCount; i++) {
+state->ps_const_i[StartRegister+i][0] = 
fui((float)(pConstantData[4*i]));
+state->ps_const_i[StartRegister+i][1] = 
fui((float)(pConstantData[4*i+1]));
+state->ps_const_i[StartRegister+i][2] = 
fui((float)(pConstantData[4*i+2]));
+state->ps_const_i[StartRegister+i][3] = 
fui((float)(pConstantData[4*i+3]));
+}
+}
 state->changed.ps_const_i |= ((1 << Vector4iCount) - 1) << StartRegister;
 state->changed.group |= NINE_STATE_PS_CONST;
 
@@ -3263,14 +3292,24 @@ NineDevice9_GetPixelShaderConstantI( struct NineDevice9 
*This,
  UINT Vector4iCount )
 {
 const struct nine_state *state = &This->state;
+int i;
 
 user_assert(StartRegister  < NINE_MAX_CONST_I, 
D3DERR_INVALIDCALL);
 

[Mesa-dev] [PATCH 23/53] st/nine: Fix POW implementation

2015-01-07 Thread Axel Davy
D3D9 POW doesn't map directly to TGSI POW, since we should
take the absolute value of src0.

Fixes black textures in some games.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 6f8ddcc..da77da5 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1946,6 +1946,17 @@ DECL_SPECIAL(DEFI)
 return D3D_OK;
 }
 
+DECL_SPECIAL(POW)
+{
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_src src[2] = {
+tx_src_param(tx, &tx->insn.src[0]),
+tx_src_param(tx, &tx->insn.src[1])
+};
+ureg_POW(tx->ureg, dst, ureg_abs(src[0]), src[1]);
+return D3D_OK;
+}
+
 DECL_SPECIAL(NRM)
 {
 struct ureg_program *ureg = tx->ureg;
@@ -2288,7 +2299,7 @@ struct sm1_op_info inst_table[] =
 
 _OPI(DCL, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 0, 0, SPECIAL(DCL)),
 
-_OPI(POW, POW, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL),
+_OPI(POW, POW, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(POW)),
 _OPI(CRS, XPD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* XXX: .w */
 _OPI(SGN, SSG, V(2,0), V(3,0), V(0,0), V(0,0), 1, 3, SPECIAL(SGN)), /* 
ignore src1,2 */
 _OPI(ABS, ABS, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL),
-- 
2.1.3



[Mesa-dev] [PATCH 36/53] st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC

2015-01-07 Thread Axel Davy
The fix is that this line:
"src[s] = tx->regs.vT[s];" is wrong if s doesn't start from 0.
Instead access tx->regs.vT directly when needed.
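
For the TEXM3x3VSPEC part, the comments in the diff give the lookup vector as
2 * (N.E / N.N) * N - E, where N is the result of the matrix multiply and E is
built from the .w components of the three texture coordinate registers. A
minimal scalar sketch of that formula follows; struct vec3, dot3() and
vspec_lookup_vector() are hypothetical names, and the order of operations
merely mirrors the RCP/MUL sequence in the patch.

```c
struct vec3 { float x, y, z; };

static float dot3(struct vec3 a, struct vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Returns 2 * (N.E / N.N) * N - E, the coordinate used for the texture lookup. */
static struct vec3 vspec_lookup_vector(struct vec3 n, struct vec3 e)
{
    float s = 1.0f / dot3(n, n);   /* s = 1 / N.N       */
    s *= dot3(n, e);               /* s = N.E / N.N     */
    s *= 2.0f;                     /* s = 2 * N.E / N.N */

    struct vec3 r = {
        s * n.x - e.x,
        s * n.y - e.y,
        s * n.z - e.z
    };
    return r;
}
```

Note that the sketch does not guard against N.N == 0; neither does the formula
as stated in the comments.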

Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 53 ++-
 1 file changed, 36 insertions(+), 17 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 2b0349f..440f6f7 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2142,11 +2142,6 @@ DECL_SPECIAL(TEXM3x3SPEC)
 STUB(D3DERR_INVALIDCALL);
 }
 
-DECL_SPECIAL(TEXM3x3VSPEC)
-{
-STUB(D3DERR_INVALIDCALL);
-}
-
 DECL_SPECIAL(TEXREG2RGB)
 {
 STUB(D3DERR_INVALIDCALL);
@@ -2171,28 +2166,52 @@ DECL_SPECIAL(TEXM3x3)
 {
 struct ureg_program *ureg = tx->ureg;
 struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
-struct ureg_src src[4];
-int s;
+struct ureg_src sample;
+struct ureg_dst E, tmp;
 const int m = tx->insn.dst[0].idx - 2;
 const int n = tx->insn.src[0].idx;
 assert(m >= 0 && m > n);
 
-for (s = m; s <= (m + 2); ++s) {
-tx_texcoord_alloc(tx, s);
-src[s] = tx->regs.vT[s];
-}
-ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), src[0], 
ureg_src(tx->regs.tS[n]));
-ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), src[1], 
ureg_src(tx->regs.tS[n]));
-ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), src[2], 
ureg_src(tx->regs.tS[n]));
+tx_texcoord_alloc(tx, m);
+tx_texcoord_alloc(tx, m+1);
+tx_texcoord_alloc(tx, m+2);
+
+ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], 
ureg_src(tx->regs.tS[n]));
+ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], 
ureg_src(tx->regs.tS[n]));
+ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), tx->regs.vT[m+2], 
ureg_src(tx->regs.tS[n]));
 
 switch (tx->insn.opcode) {
 case D3DSIO_TEXM3x3:
 ureg_MOV(ureg, ureg_writemask(dst, TGSI_WRITEMASK_W), ureg_imm1f(ureg, 
1.0f));
 break;
 case D3DSIO_TEXM3x3TEX:
-src[3] = ureg_DECL_sampler(ureg, m + 2);
+sample = ureg_DECL_sampler(ureg, m + 2);
+tx->info->sampler_mask |= 1 << (m + 2);
+ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(dst), 
sample);
+break;
+case D3DSIO_TEXM3x3VSPEC:
+sample = ureg_DECL_sampler(ureg, m + 2);
 tx->info->sampler_mask |= 1 << (m + 2);
-ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(dst), 
src[3]);
+E = tx_scratch(tx);
+tmp = ureg_writemask(tx_scratch(tx), TGSI_WRITEMASK_XYZ);
+ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_X), 
ureg_scalar(tx->regs.vT[m], TGSI_SWIZZLE_W));
+ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_Y), 
ureg_scalar(tx->regs.vT[m+1], TGSI_SWIZZLE_W));
+ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_Z), 
ureg_scalar(tx->regs.vT[m+2], TGSI_SWIZZLE_W));
+/* At this step, dst = N = (u', w', z').
+ * We want dst to be the texture sampled at (u'', w'', z''), with
+ * (u'', w'', z'') = 2 * (N.E / N.N) * N - E */
+ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_src(dst), 
ureg_src(dst));
+ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), 
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
+/* at this step tmp.x = 1/N.N */
+ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), ureg_src(dst), 
ureg_src(E));
+/* at this step tmp.y = N.E */
+ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), 
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), 
TGSI_SWIZZLE_Y));
+/* at this step tmp.x = N.E/N.N */
+ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), 
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 2.0f));
+ureg_MUL(ureg, tmp, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), 
ureg_src(dst));
+/* at this step tmp.xyz = 2 * (N.E / N.N) * N */
+ureg_SUB(ureg, tmp, ureg_src(tmp), ureg_src(E));
+ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(tmp), 
sample);
 break;
 default:
 return D3DERR_INVALIDCALL;
@@ -2410,7 +2429,7 @@ struct sm1_op_info inst_table[] =
 _OPI(TEXM3x3PAD,   TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, 
SPECIAL(TEXM3x3PAD)),
 _OPI(TEXM3x3TEX,   TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, 
SPECIAL(TEXM3x3)),
 _OPI(TEXM3x3SPEC,  TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 2, 
SPECIAL(TEXM3x3SPEC)),
-_OPI(TEXM3x3VSPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, 
SPECIAL(TEXM3x3VSPEC)),
+_OPI(TEXM3x3VSPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, 
SPECIAL(TEXM3x3)),
 
 _OPI(EXPP, EXP, V(0,0), V(1,1), V(0,0), V(0,0), 1, 1, NULL),
 _OPI(EXPP, EX2, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
-- 
2.1.3


[Mesa-dev] [PATCH 31/53] st/nine: Clamp ps 1.X constants

2015-01-07 Thread Axel Davy
This is Wine (and Windows) behaviour.
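
As a rough scalar illustration of the clamp (the patch emits it per register
with a MIN followed by a MAX), assuming the range is [-1, 1] as in the diff;
ps1x_clamp_const() is a hypothetical name:

```c
/* Clamp one constant component to [-1, 1]: MIN with 1.0, then MAX with -1.0. */
static float ps1x_clamp_const(float v)
{
    float t = (v < 1.0f) ? v : 1.0f;    /* MIN(v, 1.0)  */
    return (t > -1.0f) ? t : -1.0f;     /* MAX(t, -1.0) */
}
```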

Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 3fefce4..fb01408 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -854,6 +854,13 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 nine_info_mark_const_f_used(tx->info, param->idx);
 src = ureg_src_register(TGSI_FILE_CONSTANT, param->idx);
 }
+if (!IS_VS && tx->version.major < 2) {
+/* ps 1.X clamps constants */
+tmp = tx_scratch(tx);
+ureg_MIN(ureg, tmp, src, ureg_imm1f(ureg, 1.0f));
+ureg_MAX(ureg, tmp, ureg_src(tmp), ureg_imm1f(ureg, -1.0f));
+src = ureg_src(tmp);
+}
 break;
 case D3DSPR_CONST2:
 case D3DSPR_CONST3:
-- 
2.1.3



[Mesa-dev] [PATCH 19/53] st/nine: Clamp color inputs for ps <= 2.0 at ps level instead of vs

2015-01-07 Thread Axel Davy
Nine code was clamping color outputs for vs < 3,
but the MSDN docs say it is done in the ps.
Wine seems to clamp them at the vs level.

It makes more sense to clamp at vs level for performance,
but according to the docs, ps 2.x shouldn't see clamping.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 8b96673..b0c08ad 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -800,9 +800,21 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 } else {
 if (tx->version.major < 3) {
 assert(!param->rel);
-src = ureg_DECL_fs_input(tx->ureg, TGSI_SEMANTIC_COLOR,
- param->idx,
- TGSI_INTERPOLATE_PERSPECTIVE);
+assert(param->idx < 2);
+if (ureg_src_is_undef(tx->regs.vC[param->idx])) {
+src = ureg_DECL_fs_input(ureg,
+ TGSI_SEMANTIC_COLOR,
+ param->idx,
+ TGSI_INTERPOLATE_PERSPECTIVE);
+/* ps <= 2.0: diffuse and specular are clamped to [0, 1] */
+if (tx->version.major < 2 || tx->version.minor == 0) {
+tmp = ureg_DECL_temporary(ureg);
+ureg_MOV(ureg, ureg_saturate(tmp), src);
+tx->regs.vC[param->idx] = ureg_src(tmp);
+} else
+tx->regs.vC[param->idx] = src;
+}
+src = tx->regs.vC[param->idx];
 } else {
 assert(!param->rel); /* TODO */
 assert(param->idx < Elements(tx->regs.v));
@@ -1045,8 +1057,6 @@ _tx_dst_param(struct shader_translator *tx, const struct 
sm1_dst_param *param)
 tx->regs.oCol[param->idx] =
ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_COLOR, param->idx);
 dst = tx->regs.oCol[param->idx];
-if (IS_VS && tx->version.major < 3)
-dst = ureg_saturate(dst);
 break;
 case D3DSPR_DEPTHOUT:
 assert(!param->rel);
-- 
2.1.3



[Mesa-dev] [PATCH 29/53] st/nine: Fix CND implementation

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 
Signed-off-by: Tiziano Bacocco 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 88d4c07..8bcf67b 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1394,7 +1394,7 @@ DECL_SPECIAL(CND)
 struct ureg_dst cgt;
 struct ureg_src cnd;
 
-if (tx->insn.coissue && tx->version.major == 1 && tx->version.minor < 4) {
+if (tx->insn.coissue && tx->version.major == 1 && tx->version.minor < 4 && 
tx->insn.dst[0].mask != NINED3DSP_WRITEMASK_3) {
 ureg_MOV(tx->ureg,
  dst, tx_src_param(tx, &tx->insn.src[1]));
 return D3D_OK;
@@ -1403,16 +1403,14 @@ DECL_SPECIAL(CND)
 cnd = tx_src_param(tx, &tx->insn.src[0]);
 cgt = tx_scratch(tx);
 
-if (tx->version.major == 1 && tx->version.minor < 4) {
-cgt.WriteMask = TGSI_WRITEMASK_W;
-ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
+if (tx->version.major == 1 && tx->version.minor < 4)
 cnd = ureg_scalar(cnd, TGSI_SWIZZLE_W);
-} else {
-ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
-}
-ureg_CMP(tx->ureg, dst,
+
+ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
+
+ureg_CMP(tx->ureg, dst, ureg_negate(ureg_src(cgt)),
  tx_src_param(tx, &tx->insn.src[1]),
- tx_src_param(tx, &tx->insn.src[2]), ureg_negate(cnd));
+ tx_src_param(tx, &tx->insn.src[2]));
 return D3D_OK;
 }
 
-- 
2.1.3



[Mesa-dev] [PATCH 44/53] st/nine: Implement ps3 advanced input definition feature

2015-01-07 Thread Axel Davy
ps3 allows definitions of the inputs like:
DCL_TEXCOORD0 v0.xy;
DCL_NORMAL2 v0.z;
DCL_NORMAL3 v0.w;

Nine wouldn't have handled this situation properly.

Apparently very few applications use this feature.

One issue remains with this new implementation:
indirect addressing on the ps inputs is allowed.

Since the inputs are not contiguous here (we allocate temps),
it cannot be implemented (we currently have an assert for that
in the code, and at least one app was reported to need this to
work).
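
To illustrate what combining several declarations into one register means, here
is a simplified sketch: components selected by the declaration's mask are taken
from the new input, the others keep what earlier declarations put there. It
assumes component i of the new input lands in component i of the register and
ignores the swizzle handling in the patch; merge_by_mask() is a hypothetical
name.

```c
/* Overwrite only the components of reg selected by mask (bit 0 = x ... bit 3 = w). */
static void merge_by_mask(float reg[4], const float in[4], unsigned mask)
{
    for (int c = 0; c < 4; c++) {
        if (mask & (1u << c))
            reg[c] = in[c];
    }
}
```

With the example above, v0 would first receive .xy from TEXCOORD0 (mask 0x3),
then .z from NORMAL2 (mask 0x4) and .w from NORMAL3 (mask 0x8).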

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_shader.c | 31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 9d4fb2f..69e35a2 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1900,6 +1900,29 @@ nine_tgsi_to_interp_mode(struct 
tgsi_declaration_semantic *sem)
 }
 }
 
+static void ps3_concat_inputs(struct shader_translator *tx,
+  struct sm1_semantic *sem,
+  struct ureg_src src)
+{
+unsigned idx = sem->reg.idx;
+struct ureg_src previous_reg = tx->regs.v[idx];
+struct ureg_dst tmp = ureg_DECL_temporary(tx->ureg);
+BYTE swizzle[4] = {TGSI_SWIZZLE_X, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_W};
+BYTE mask = sem->reg.mask;
+
+assert (mask);
+while (!(mask & 0x01)) {
+swizzle[0] = swizzle[1];
+swizzle[1] = swizzle[2];
+swizzle[2] = swizzle[3];
+mask = mask >> 1;
+}
+ureg_MOV(tx->ureg, tmp, previous_reg);
+ureg_MOV(tx->ureg, ureg_writemask(tmp, sem->reg.mask),
+ ureg_swizzle(src, swizzle[0], swizzle[1], swizzle[2], swizzle[3]));
+tx->regs.v[idx] = ureg_src(tmp);
+}
+
 DECL_SPECIAL(DCL)
 {
 struct ureg_program *ureg = tx->ureg;
@@ -1907,6 +1930,7 @@ DECL_SPECIAL(DCL)
 boolean is_sampler;
 struct tgsi_declaration_semantic tgsi;
 struct sm1_semantic sem;
+struct ureg_src src;
 sm1_read_semantic(tx, &sem);
 
 is_input = sem.reg.file == D3DSPR_INPUT;
@@ -1962,11 +1986,16 @@ DECL_SPECIAL(DCL)
 if (is_input && tx->version.major >= 3) {
 /* SM3 only, SM2 input semantic determined by file */
 assert(sem.reg.idx < Elements(tx->regs.v));
-tx->regs.v[sem.reg.idx] = ureg_DECL_fs_input_cyl_centroid(
+src = ureg_DECL_fs_input_cyl_centroid(
 ureg, tgsi.Name, tgsi.Index,
 nine_tgsi_to_interp_mode(&tgsi),
 0, /* cylwrap */
 sem.reg.mod & NINED3DSPDM_CENTROID);
+if (sem.reg.mask == NINED3DSP_WRITEMASK_ALL ||
+ureg_src_is_undef(tx->regs.v[sem.reg.idx]))
+tx->regs.v[sem.reg.idx] = src;
+else
+ps3_concat_inputs(tx, &sem, src);
 } else
 if (!is_input && 0) { /* declare in COLOROUT/DEPTHOUT case */
 /* FragColor or FragDepth */
-- 
2.1.3



[Mesa-dev] [PATCH 21/53] st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs

2015-01-07 Thread Axel Davy
Let's say we have c1 and c2 declared in the shader and c0 given by the app.

Then here we would have read c0, c1 and c2 as given by the app, instead
of the correct c0 from the app and c1, c2 from the shader declarations.

This correction fixes several issues in some games.

Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 6320f36..0b378d5 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1168,16 +1168,19 @@ NineTranslateInstruction_Mkxn(struct shader_translator 
*tx, const unsigned k, co
 struct ureg_program *ureg = tx->ureg;
 struct ureg_dst dst;
 struct ureg_src src[2];
+struct sm1_src_param *src_mat = &tx->insn.src[1];
 unsigned i;
 
 dst = tx_dst_param(tx, &tx->insn.dst[0]);
 src[0] = tx_src_param(tx, &tx->insn.src[0]);
-src[1] = tx_src_param(tx, &tx->insn.src[1]);
 
-for (i = 0; i < n; i++, src[1].Index++)
+for (i = 0; i < n; i++)
 {
 const unsigned m = (1 << i);
 
+src[1] = tx_src_param(tx, src_mat);
+src_mat->idx++;
+
 if (!(dst.WriteMask & m))
 continue;
 
-- 
2.1.3



[Mesa-dev] [PATCH 45/53] st/nine: Correct rules for relative addressing and constants.

2015-01-07 Thread Axel Davy
Relative addressing for constants is possible only for vs float
constants.
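
Stated as a predicate, the rule the new asserts enforce could look like the
sketch below. The enum and function names are hypothetical and only meant to
spell out the rule; they are not part of nine.

```c
#include <stdbool.h>

/* Hypothetical labels for the three constant register files. */
enum const_file { CONST_FLOAT, CONST_INT, CONST_BOOL };

/* Relative addressing is valid only for float constants in a vertex shader. */
static bool rel_addressing_allowed(enum const_file file, bool is_vs)
{
    return file == CONST_FLOAT && is_vs;
}
```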

Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 69e35a2..4d59cba 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -847,6 +847,7 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 src = ureg_src_register(TGSI_FILE_SAMPLER, param->idx);
 break;
 case D3DSPR_CONST:
+assert(!param->rel || IS_VS);
 if (param->rel)
 tx->indirect_const_access = TRUE;
 if (param->rel || !tx_lconstf(tx, &src, param->idx)) {
@@ -870,19 +871,20 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 src = ureg_imm1f(ureg, 0.0f);
 break;
 case D3DSPR_CONSTINT:
-if (param->rel || !tx_lconsti(tx, &src, param->idx)) {
-if (!param->rel)
-nine_info_mark_const_i_used(tx->info, param->idx);
+/* relative addressing only possible for float constants in vs */
+assert(!param->rel);
+if (!tx_lconsti(tx, &src, param->idx)) {
+nine_info_mark_const_i_used(tx->info, param->idx);
 src = ureg_src_register(TGSI_FILE_CONSTANT,
 tx->info->const_i_base + param->idx);
 }
 break;
 case D3DSPR_CONSTBOOL:
-if (param->rel || !tx_lconstb(tx, &src, param->idx)) {
+assert(!param->rel);
+if (!tx_lconstb(tx, &src, param->idx)) {
char r = param->idx / 4;
char s = param->idx & 3;
-   if (!param->rel)
-   nine_info_mark_const_b_used(tx->info, param->idx);
+   nine_info_mark_const_b_used(tx->info, param->idx);
src = ureg_src_register(TGSI_FILE_CONSTANT,
tx->info->const_b_base + r);
src = ureg_swizzle(src, s, s, s, s);
-- 
2.1.3



[Mesa-dev] [PATCH 40/53] st/nine: Implement TEXDEPTH

2015-01-07 Thread Axel Davy
Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 3b29f58..c484e7c 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2301,7 +2301,28 @@ DECL_SPECIAL(TEXM3x3)
 
 DECL_SPECIAL(TEXDEPTH)
 {
-STUB(D3DERR_INVALIDCALL);
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst r5;
+struct ureg_src r5r, r5g;
+
+assert(tx->insn.dst[0].idx == 5); /* instruction must get r5 here */
+
+/* we must replace the depth by r5.g == 0 ? 1.0f : r5.r/r5.g.
+ * r5 won't be used afterward, thus we can use r5.ba */
+r5 = tx->regs.r[5];
+r5r = ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_X);
+r5g = ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_Y);
+
+ureg_RCP(ureg, ureg_writemask(r5, TGSI_WRITEMASK_Z), r5g);
+ureg_MUL(ureg, ureg_writemask(r5, TGSI_WRITEMASK_X), r5r, ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_Z));
+/* r5.r = r/g */
+ureg_CMP(ureg, ureg_writemask(r5, TGSI_WRITEMASK_X), ureg_negate(ureg_abs(r5g)),
+ r5r, ureg_imm1f(ureg, 1.0f));
+/* replace the depth for depth testing with the result */
+tx->regs.oDepth = ureg_DECL_output_masked(ureg, TGSI_SEMANTIC_POSITION, 0, TGSI_WRITEMASK_Z);
+ureg_MOV(ureg, tx->regs.oDepth, r5r);
+
+return D3D_OK;
 }
 
 DECL_SPECIAL(BEM)
-- 
2.1.3



[Mesa-dev] [PATCH 46/53] st/nine: Remove unused code for ps

2015-01-07 Thread Axel Davy
Since constant indirect addressing is not allowed for ps,
we can remove our code to handle that.

Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_state.c   | 43 +++---
 src/gallium/state_trackers/nine/pixelshader9.c | 10 +++---
 src/gallium/state_trackers/nine/pixelshader9.h |  2 --
 3 files changed, 15 insertions(+), 40 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index 00da62b..870b1b0 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -352,8 +352,8 @@ update_constants(struct NineDevice9 *device, unsigned 
shader_type)
 const unsigned usage = PIPE_TRANSFER_WRITE | PIPE_TRANSFER_DISCARD_RANGE;
 unsigned x = 0; /* silence warning */
 unsigned i, c, n;
-const struct nine_lconstf *lconstf;
-struct nine_range *r, *p;
+struct nine_range *r, *p, *lconstf_ranges;
+float *lconstf_data;
 
 box.y = 0;
 box.z = 0;
@@ -381,7 +381,9 @@ update_constants(struct NineDevice9 *device, unsigned 
shader_type)
 device->state.changed.vs_const_b = 0;
 const_b = device->state.vs_const_b;
 
-lconstf = &device->state.vs->lconstf;
+lconstf_ranges = device->state.vs->lconstf.ranges;
+lconstf_data = device->state.vs->lconstf.data;
+
 device->state.ff.clobber.vs_const = TRUE;
 device->state.changed.group &= ~NINE_STATE_VS_CONST;
 } else {
@@ -405,7 +407,9 @@ update_constants(struct NineDevice9 *device, unsigned 
shader_type)
 device->state.changed.ps_const_b = 0;
 const_b = device->state.ps_const_b;
 
-lconstf = &device->state.ps->lconstf;
+lconstf_ranges = NULL;
+lconstf_data = NULL;
+
 device->state.ff.clobber.ps_const = TRUE;
 device->state.changed.group &= ~NINE_STATE_PS_CONST;
 }
@@ -452,14 +456,14 @@ update_constants(struct NineDevice9 *device, unsigned 
shader_type)
 }
 
 /* TODO: only upload these when shader itself changes */
-if (lconstf->ranges) {
+if (lconstf_ranges) {
 unsigned n = 0;
-struct nine_range *r = lconstf->ranges;
+struct nine_range *r = lconstf_ranges;
 while (r) {
 box.x = r->bgn * 4 * sizeof(float);
 n += r->end - r->bgn;
 box.width = (r->end - r->bgn) * 4 * sizeof(float);
-data = &lconstf->data[4 * n];
+data = &lconstf_data[4 * n];
 pipe->transfer_inline_write(pipe, buf, 0, usage, &box, data, 0, 0);
 r = r->next;
 }
@@ -556,33 +560,8 @@ update_ps_constants_userbuf(struct NineDevice9 *device)
 state->changed.ps_const_b = 0;
 }
 
-#ifdef DEBUG
-if (device->state.ps->lconstf.ranges) {
-/* TODO: Can we make it so that we don't have to copy everything ? */
-const struct nine_lconstf *lconstf =  &device->state.ps->lconstf;
-const struct nine_range *r = lconstf->ranges;
-unsigned n = 0;
-float *dst = (float *)MALLOC(cb.buffer_size);
-float *src = (float *)cb.user_buffer;
-memcpy(dst, src, cb.buffer_size);
-while (r) {
-unsigned p = r->bgn;
-unsigned c = r->end - r->bgn;
-memcpy(&dst[p * 4], &lconstf->data[n * 4], c * 4 * sizeof(float));
-n += c;
-r = r->next;
-}
-cb.user_buffer = dst;
-}
-#endif
-
 pipe->set_constant_buffer(pipe, PIPE_SHADER_FRAGMENT, 0, &cb);
 
-#ifdef DEBUG
-if (device->state.ps->lconstf.ranges)
-FREE((void *)cb.user_buffer);
-#endif
-
 if (device->state.changed.ps_const_f) {
 struct nine_range *r = device->state.changed.ps_const_f;
 struct nine_range *p = r;
diff --git a/src/gallium/state_trackers/nine/pixelshader9.c 
b/src/gallium/state_trackers/nine/pixelshader9.c
index ac204ff..dcd2346 100644
--- a/src/gallium/state_trackers/nine/pixelshader9.c
+++ b/src/gallium/state_trackers/nine/pixelshader9.c
@@ -72,9 +72,10 @@ NinePixelShader9_ctor( struct NinePixelShader9 *This,
 This->sampler_mask = info.sampler_mask;
 This->rt_mask = info.rt_mask;
 This->const_used_size = info.const_used_size;
-if (info.const_used_size == ~0)
-This->const_used_size = NINE_CONSTBUF_SIZE(device->max_ps_const_f);
-This->lconstf = info.lconstf;
+/* no constant relative addressing for ps */
+assert(info.const_used_size != ~0);
+assert(info.lconstf.data == NULL);
+assert(info.lconstf.ranges == NULL);
 
 return D3D_OK;
 }
@@ -100,9 +101,6 @@ NinePixelShader9_dtor( struct NinePixelShader9 *This )
 
 FREE((void *)This->byte_code.tokens); /* const_cast */
 
-FREE(This->lconstf.data);
-FREE(This->lconstf.ranges);
-
 NineUnknown_dtor(&This->base);
 }
 
diff --git a/src/gallium/state_trackers/nine/pixelshader9.h 
b/src/gallium/state_trackers/nine/pixelshader9.h
index 5e00b46..5e2219c 100644
--- a/src/

[Mesa-dev] [PATCH 24/53] st/nine: Handle RSQ special cases

2015-01-07 Thread Axel Davy
We should use the absolute value of the input as input to ureg_RSQ.

Moreover, an input of 0.0 should return FLT_MAX.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index da77da5..4dee5f5 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1957,6 +1957,17 @@ DECL_SPECIAL(POW)
 return D3D_OK;
 }
 
+DECL_SPECIAL(RSQ)
+{
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
+struct ureg_dst tmp = tx_scratch(tx);
+ureg_RSQ(ureg, tmp, ureg_abs(src));
+ureg_MIN(ureg, dst, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
+return D3D_OK;
+}
+
 DECL_SPECIAL(NRM)
 {
 struct ureg_program *ureg = tx->ureg;
@@ -2270,7 +2281,7 @@ struct sm1_op_info inst_table[] =
 _OPI(MAD, MAD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 4 */
 _OPI(MUL, MUL, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 5 */
 _OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 6 */
-_OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 7 */
+_OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RSQ)), /* 7 */
 _OPI(DP3, DP3, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 8 */
 _OPI(DP4, DP4, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 9 */
 _OPI(MIN, MIN, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 10 */
-- 
2.1.3



[Mesa-dev] [PATCH 53/53] st/nine: Correctly handle when ff vs should have no texture coord input/output

2015-01-07 Thread Axel Davy
The previous code's semantics were:

. if the ff ps will not run a ff stage, then do not output texture coords
for this stage for the vs
. if XYZRHW is used (position_t), use only the mode where input coordinates
are copied to the outputs.

The problem arises when apps don't provide texture inputs. When an app
specifies PASSTHRU, it means: copy the texture coord input to the texture
coord output if there is such an input. The case where there is no texture
coord input wasn't handled correctly.

Drivers like r300 dislike it when the vs has inputs that are not fed.

Moreover, if the app uses the ff vs with a programmable ps, we shouldn't look
at the parameters of the ff ps to decide whether to output texture
coordinates.

The new code's semantics are:

. if XYZRHW is used, restrict to PASSTHRU
. if PASSTHRU is used and no texture input is declared, then do not output
texture coords for this stage

The case where ff ps needs a texture coord input and ff vs doesn't output
it is not handled, and should probably be a runtime error.
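
The two rules above can be sketched in isolation as follows. This is only an
illustration: the enum values and the function name resolve_tc_gen are
hypothetical stand-ins for the NINED3DTSS_TCI_* constants and the loop body
in the diff below, and the numeric values are assumptions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for NINED3DTSS_TCI_*; the numeric values are
 * assumptions chosen so that generated modes compare greater than PASSTHRU. */
enum tci_mode {
    TCI_DISABLE = 0,
    TCI_PASSTHRU = 1,
    TCI_GENERATED = 2 /* any texgen mode beyond passthru */
};

/* Decide the texcoord generation mode for one stage.
 * position_t:         XYZRHW (pre-transformed) vertices are used
 * has_texcoord_input: the vertex declaration feeds this texcoord */
static unsigned resolve_tc_gen(unsigned gen, bool position_t,
                               bool has_texcoord_input)
{
    if (position_t && gen > TCI_PASSTHRU)
        gen = TCI_PASSTHRU;   /* XYZRHW: restrict to PASSTHRU */
    if (!has_texcoord_input && gen == TCI_PASSTHRU)
        gen = TCI_DISABLE;    /* nothing to pass through: no output */
    return gen;
}
```

Note that the order of the two checks matters: an XYZRHW stage with a
generated mode and no texcoord input is first restricted to PASSTHRU and
then disabled.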

This fixes 3Dmark05, which uses ff vs with programmable ps.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_ff.c | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c b/src/gallium/state_trackers/nine/nine_ff.c
index d2b30f8..2e3470f 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -1377,11 +1377,13 @@ nine_ff_get_vs(struct NineDevice9 *device)
 struct vs_build_ctx bld;
 struct nine_ff_vs_key key;
 unsigned s, i;
+char input_texture_coord[8];
 
 assert(sizeof(key) <= sizeof(key.value32));
 
 memset(&key, 0, sizeof(key));
 memset(&bld, 0, sizeof(bld));
+memset(&input_texture_coord, 0, sizeof(input_texture_coord));
 
 bld.key = &key;
 
@@ -1399,6 +1401,13 @@ nine_ff_get_vs(struct NineDevice9 *device)
 key.color1in_one = 0;
 else if (usage == NINE_DECLUSAGE_PSIZE)
 key.vertexpointsize = 1;
+else if (usage % NINE_DECLUSAGE_COUNT == NINE_DECLUSAGE_TEXCOORD) {
+s = usage / NINE_DECLUSAGE_COUNT;
+if (s < 8)
+input_texture_coord[s] = 1;
+else
+DBG("FF given texture coordinate >= 8. Ignoring\n");
+}
 }
 }
 if (!key.vertexpointsize)
@@ -1436,18 +1445,18 @@ nine_ff_get_vs(struct NineDevice9 *device)
 }
 
 for (s = 0; s < 8; ++s) {
-if (state->ff.tex_stage[s][D3DTSS_COLOROP] == D3DTOP_DISABLE &&
-state->ff.tex_stage[s][D3DTSS_ALPHAOP] == D3DTOP_DISABLE)
-break;
+unsigned gen = (state->ff.tex_stage[s][D3DTSS_TEXCOORDINDEX] >> 16) + 1;
+unsigned dim = MIN2(state->ff.tex_stage[s][D3DTSS_TEXTURETRANSFORMFLAGS] & 0x7, 4);
+
+if (key.position_t && gen > NINED3DTSS_TCI_PASSTHRU)
+gen = NINED3DTSS_TCI_PASSTHRU;
+
+if (!input_texture_coord[s] && gen == NINED3DTSS_TCI_PASSTHRU)
+gen = NINED3DTSS_TCI_DISABLE;
+
+key.tc_gen |= gen << (s * 3);
 key.tc_idx |= (state->ff.tex_stage[s][D3DTSS_TEXCOORDINDEX] & 7) << (s * 3);
-if (!key.position_t) {
-unsigned gen = (state->ff.tex_stage[s][D3DTSS_TEXCOORDINDEX] >> 16) + 1;
-unsigned dim = MIN2(state->ff.tex_stage[s][D3DTSS_TEXTURETRANSFORMFLAGS] & 0x7, 4);
-key.tc_gen |= gen << (s * 3);
-key.tc_dim |= dim << (s * 3);
-} else {
-key.tc_gen |= NINED3DTSS_TCI_PASSTHRU << (s * 3);
-}
+key.tc_dim |= dim << (s * 3);
 }
 
 vs = util_hash_table_get(device->ff.ht_vs, &key);
-- 
2.1.3



[Mesa-dev] [PATCH 22/53] st/nine: Fix typo for M4x4

2015-01-07 Thread Axel Davy
Cc: "10.4" 
Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index 0b378d5..6f8ddcc 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1337,7 +1337,7 @@ NineTranslateInstruction_Generic(struct shader_translator *);
 
 DECL_SPECIAL(M4x4)
 {
-return NineTranslateInstruction_Mkxn(tx, 4, 3);
+return NineTranslateInstruction_Mkxn(tx, 4, 4);
 }
 
 DECL_SPECIAL(M4x3)
-- 
2.1.3



[Mesa-dev] [PATCH 38/53] st/nine: Implement TEXM3x2TEX

2015-01-07 Thread Axel Davy
Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index ac86237..20a8e8a 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2129,7 +2129,25 @@ DECL_SPECIAL(TEXM3x2PAD)
 
 DECL_SPECIAL(TEXM3x2TEX)
 {
-STUB(D3DERR_INVALIDCALL);
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_src sample;
+const int m = tx->insn.dst[0].idx - 1;
+const int n = tx->insn.src[0].idx;
+assert(m >= 0 && m > n);
+
+tx_texcoord_alloc(tx, m);
+tx_texcoord_alloc(tx, m+1);
+
+/* performs the matrix multiplication */
+ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
+ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
+
+sample = ureg_DECL_sampler(ureg, m + 1);
+tx->info->sampler_mask |= 1 << (m + 1);
+ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 1), ureg_src(dst), sample);
+
+return D3D_OK;
 }
 
 DECL_SPECIAL(TEXM3x3PAD)
-- 
2.1.3



[Mesa-dev] [PATCH 27/53] st/nine: Rewrite LOOP implementation, and a0 aL handling

2015-01-07 Thread Axel Davy
Previous implementation didn't work well with nested loops.

Instead of using several address registers, put a0 and aL
into normal registers, and copy them to one address register when
we need to use them.

Wine tests loop_index_test() and nested_loop_test() now pass correctly.

Fixes r600g crash while loading Bioshock -
bug https://bugs.freedesktop.org/show_bug.cgi?id=85696
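
For illustration, the way aL is resolved with nested loop and rep blocks
(the tx_get_loopal helper in the diff below) can be sketched on its own.
The struct and function names here are hypothetical, and MAX_LOOP_DEPTH is
an assumed stand-in for NINE_MAX_LOOP_DEPTH.

```c
#include <stdbool.h>

#define MAX_LOOP_DEPTH 64 /* assumed value; stands in for NINE_MAX_LOOP_DEPTH */

/* Hypothetical, simplified view of the translator's loop nesting state. */
struct loop_stack {
    bool is_loop[MAX_LOOP_DEPTH]; /* true: loop, false: rep */
    int depth;                    /* number of currently open blocks */
};

/* Walk outwards from the innermost block and return the nesting level of
 * the closest enclosing 'loop' (rep blocks carry no aL value).
 * Returns -1 when aL is requested outside of any loop. */
static int find_al_level(const struct loop_stack *ls)
{
    for (int level = ls->depth - 1; level >= 0; level--) {
        if (ls->is_loop[level])
            return level;
    }
    return -1;
}
```

In a loop - rep - endrep - endloop sequence, a use of aL inside the rep
resolves to the outer loop's counter, which is the case the previous
implementation did not handle well.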

Tested-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 156 +++---
 1 file changed, 92 insertions(+), 64 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index 327dd2c..21b06ce 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -466,14 +466,14 @@ struct shader_translator
 struct ureg_src vFace;
 struct ureg_src s;
 struct ureg_dst p;
-struct ureg_dst a;
+struct ureg_dst address;
+struct ureg_dst a0;
 struct ureg_dst tS[8]; /* texture stage registers */
 struct ureg_dst tdst; /* scratch dst if we need extra modifiers */
 struct ureg_dst t[5]; /* scratch TEMPs */
 struct ureg_src vC[2]; /* PS color in */
 struct ureg_src vT[8]; /* PS texcoord in */
 struct ureg_dst rL[NINE_MAX_LOOP_DEPTH]; /* loop ctr */
-struct ureg_dst aL[NINE_MAX_LOOP_DEPTH]; /* loop ctr ADDR register */
 } regs;
 unsigned num_temp; /* Elements(regs.r) */
 unsigned num_scratch;
@@ -482,6 +482,7 @@ struct shader_translator
 unsigned cond_depth;
 unsigned loop_labels[NINE_MAX_LOOP_DEPTH];
 unsigned cond_labels[NINE_MAX_COND_DEPTH];
+boolean loop_or_rep[NINE_MAX_LOOP_DEPTH]; /* true: loop, false: rep */
 
 unsigned *inst_labels; /* LABEL op */
 unsigned num_inst_labels;
@@ -659,8 +660,10 @@ static INLINE void
 tx_addr_alloc(struct shader_translator *tx, INT idx)
 {
 assert(idx == 0);
-if (ureg_dst_is_undef(tx->regs.a))
-tx->regs.a = ureg_DECL_address(tx->ureg);
+if (ureg_dst_is_undef(tx->regs.address))
+tx->regs.address = ureg_DECL_address(tx->ureg);
+if (ureg_dst_is_undef(tx->regs.a0))
+tx->regs.a0 = ureg_DECL_temporary(tx->ureg);
 }
 
 static INLINE void
@@ -702,7 +705,7 @@ tx_endloop(struct shader_translator *tx)
 }
 
 static struct ureg_dst
-tx_get_loopctr(struct shader_translator *tx)
+tx_get_loopctr(struct shader_translator *tx, boolean loop_or_rep)
 {
 const unsigned l = tx->loop_depth - 1;
 
@@ -712,26 +715,32 @@ tx_get_loopctr(struct shader_translator *tx)
 return ureg_dst_undef();
 }
 
-if (ureg_dst_is_undef(tx->regs.aL[l]))
-{
-struct ureg_dst rreg = ureg_DECL_local_temporary(tx->ureg);
-struct ureg_dst areg = ureg_DECL_address(tx->ureg);
-unsigned c;
-
-assert(l % 4 == 0);
-for (c = l; c < (l + 4) && c < Elements(tx->regs.aL); ++c) {
-tx->regs.rL[c] = ureg_writemask(rreg, 1 << (c & 3));
-tx->regs.aL[c] = ureg_writemask(areg, 1 << (c & 3));
-}
+if (ureg_dst_is_undef(tx->regs.rL[l])) {
+/* loop or rep ctr creation */
+tx->regs.rL[l] = ureg_DECL_local_temporary(tx->ureg);
+tx->loop_or_rep[l] = loop_or_rep;
 }
+/* loop - rep - endloop - endrep not allowed */
+assert(tx->loop_or_rep[l] == loop_or_rep);
+
 return tx->regs.rL[l];
 }
-static struct ureg_dst
-tx_get_aL(struct shader_translator *tx)
+
+static struct ureg_src
+tx_get_loopal(struct shader_translator *tx)
 {
-if (!ureg_dst_is_undef(tx_get_loopctr(tx)))
-return tx->regs.aL[tx->loop_depth - 1];
-return ureg_dst_undef();
+int loop_level = tx->loop_depth - 1;
+
+while (loop_level >= 0) {
+/* handle loop - rep - endrep - endloop case */
+if (tx->loop_or_rep[loop_level])
+/* the value is in the loop counter y component (nine implementation) */
+return ureg_scalar(ureg_src(tx->regs.rL[loop_level]), TGSI_SWIZZLE_Y);
+loop_level--;
+}
+
+DBG("aL counter requested outside of loop\n");
+return ureg_src_undef();
 }
 
 static INLINE unsigned *
@@ -782,8 +791,12 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
 case D3DSPR_ADDR:
 assert(!param->rel);
 if (IS_VS) {
-tx_addr_alloc(tx, param->idx);
-src = ureg_src(tx->regs.a);
+assert(param->idx == 0);
+/* the address register (vs only) must be
+ * assigned before use */
+assert(!ureg_dst_is_undef(tx->regs.a0));
+ureg_ARR(ureg, tx->regs.address, ureg_src(tx->regs.a0));
+src = ureg_src(tx->regs.address);
 } else {
 if (tx->version.major < 2 && tx->version.minor < 4) {
 /* no subroutines, so should be defined */
@@ -869,7 +882,13 @@ tx_src_param(struct shader_translator *tx, c

[Mesa-dev] [PATCH 51/53] st/nine: Explicit nine requirements

2015-01-07 Thread Axel Davy
This patch raises nine's requirements and disables nine for old
hw that doesn't meet them.

It would be possible to make a lot of things work on this hw,
though not everything, but that needs special care in the
code, and since the hw is very old, it's better to explicitly drop
support for it. We already have a hard time supporting r500,
which is apparently the most flexible dx9-only card.
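
As a rough sketch of the two-tier check this patch adds (a hard minimum,
then a warning for cards at the limit), the logic can be written against a
plain struct. The struct, enum and function names are hypothetical; only the
thresholds are taken from the diff below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the caps queried in the diff below. */
struct stage_caps {
    size_t max_const_buffer_size; /* bytes */
    int max_temps;
    int max_inputs;
};

struct adapter_caps {
    bool sm3;
    struct stage_caps vs, ps;
};

enum nine_support { NINE_UNSUPPORTED, NINE_AT_LIMIT, NINE_OK };

static enum nine_support classify_adapter(const struct adapter_caps *c)
{
    const size_t vec4 = sizeof(float[4]);

    /* minimum requirements: refuse to create the adapter */
    if (!c->sm3 ||
        c->vs.max_const_buffer_size < 256 * vec4 ||
        c->ps.max_const_buffer_size < 244 * vec4 ||
        c->vs.max_temps < 32 || c->ps.max_temps < 32 ||
        c->vs.max_inputs < 16 || c->ps.max_inputs < 10)
        return NINE_UNSUPPORTED;

    /* tight hardware (the r500 case): works, but only a warning is issued */
    if (c->vs.max_const_buffer_size < 276 * vec4 ||
        c->vs.max_temps < 40 || c->ps.max_temps < 40 ||
        c->ps.max_inputs < 20)
        return NINE_AT_LIMIT;

    return NINE_OK;
}
```

The first tier corresponds to returning D3DERR_DRIVERINTERNALERROR in the
patch, the second to the ERR() warning that still returns D3D_OK.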

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/adapter9.c | 106 +
 src/gallium/state_trackers/nine/device9.c  |   9 +--
 2 files changed, 66 insertions(+), 49 deletions(-)

diff --git a/src/gallium/state_trackers/nine/adapter9.c b/src/gallium/state_trackers/nine/adapter9.c
index 481f863..b339e6e 100644
--- a/src/gallium/state_trackers/nine/adapter9.c
+++ b/src/gallium/state_trackers/nine/adapter9.c
@@ -39,6 +39,7 @@ NineAdapter9_ctor( struct NineAdapter9 *This,
struct NineUnknownParams *pParams,
struct d3dadapter9_context *pCTX )
 {
+struct pipe_screen *hal = pCTX->hal;
 HRESULT hr = NineUnknown_ctor(&This->base, pParams);
 if (FAILED(hr)) { return hr; }
 
@@ -46,7 +47,7 @@ NineAdapter9_ctor( struct NineAdapter9 *This,
 nine_dump_D3DADAPTER_IDENTIFIER9(DBG_CHANNEL, &pCTX->identifier);
 
 This->ctx = pCTX;
-if (!This->ctx->hal->get_param(This->ctx->hal, PIPE_CAP_CLIP_HALFZ)) {
+if (!hal->get_param(hal, PIPE_CAP_CLIP_HALFZ)) {
 ERR("Driver doesn't support d3d9 coordinates\n");
 return D3DERR_DRIVERINTERNALERROR;
 }
@@ -54,7 +55,44 @@ NineAdapter9_ctor( struct NineAdapter9 *This,
 !This->ctx->ref->get_param(This->ctx->ref, PIPE_CAP_CLIP_HALFZ)) {
 ERR("Warning: Sotware rendering driver doesn't support d3d9 coordinates\n");
 }
-
+/* Old cards had tricks to bypass some restrictions to implement
+ * everything and fit tight the requirements: number of constants,
+ * number of temp registers, special behaviours, etc. Since we don't
+ * have access to all this, we need a bit more than what dx9 required.
+ * For example we have to use more than 32 temp registers to emulate
+ * behaviours, while some dx9 hw don't have more. As for sm2 hardware,
+ * we could support vs2 / ps2 for them but it needs some more care, and
+ * as these are very old, we choose to drop support for them */
+
+/* checks minimum requirements, most are vs3/ps3 strict requirements */
+if (!hal->get_param(hal, PIPE_CAP_SM3) ||
+hal->get_shader_param(hal, PIPE_SHADER_VERTEX,
+  PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE) < 256 * sizeof(float[4]) ||
+hal->get_shader_param(hal, PIPE_SHADER_FRAGMENT,
+  PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE) < 244 * sizeof(float[4]) ||
+hal->get_shader_param(hal, PIPE_SHADER_VERTEX,
+  PIPE_SHADER_CAP_MAX_TEMPS) < 32 ||
+hal->get_shader_param(hal, PIPE_SHADER_FRAGMENT,
+  PIPE_SHADER_CAP_MAX_TEMPS) < 32 ||
+hal->get_shader_param(hal, PIPE_SHADER_VERTEX,
+  PIPE_SHADER_CAP_MAX_INPUTS) < 16 ||
+hal->get_shader_param(hal, PIPE_SHADER_FRAGMENT,
+  PIPE_SHADER_CAP_MAX_INPUTS) < 10) {
+ERR("Your card is not supported by Gallium Nine. Minimum requirement "
+"is >= r500, >= nv50, >= i965\n");
+return D3DERR_DRIVERINTERNALERROR;
+}
+/* for r500 */
+if (hal->get_shader_param(hal, PIPE_SHADER_VERTEX,
+  PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE) < 276 * sizeof(float[4]) || /* we put bool and int constants with float constants */
+hal->get_shader_param(hal, PIPE_SHADER_VERTEX,
+  PIPE_SHADER_CAP_MAX_TEMPS) < 40 || /* we use some more temp registers */
+hal->get_shader_param(hal, PIPE_SHADER_FRAGMENT,
+  PIPE_SHADER_CAP_MAX_TEMPS) < 40 ||
+hal->get_shader_param(hal, PIPE_SHADER_FRAGMENT,
+  PIPE_SHADER_CAP_MAX_INPUTS) < 20) /* we don't pack inputs as much as we could */
+ERR("Your card is at the limit of Gallium Nine requirements. Some games "
+"may run into issues because requirements are too tight\n");
 return D3D_OK;
 }
 
@@ -472,7 +510,6 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
 D3DCAPS9 *pCaps )
 {
 struct pipe_screen *screen;
-boolean sm3, vs;
 HRESULT hr;
 
 DBG("This=%p DeviceType=%s pCaps=%p\n", This,
@@ -492,10 +529,6 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
 #define D3DNPIPECAP(pcap, d3dcap) \
 (screen->get_param(screen, PIPE_CAP_##pcap) ? 0 : (d3dcap))
 
-sm3 = screen->get_param(screen, PIPE_CAP_SM3);
-vs = !!(screen->get_shader_param(screen, PIPE_SHADER_VERTEX,
- PIPE_SHADER_CAP_MAX_INSTRUCTIONS));
-
  

[Mesa-dev] [PATCH 39/53] st/nine: Implement TEXM3x3SPEC

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 39 ++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index 20a8e8a..3b29f58 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2157,7 +2157,44 @@ DECL_SPECIAL(TEXM3x3PAD)
 
 DECL_SPECIAL(TEXM3x3SPEC)
 {
-STUB(D3DERR_INVALIDCALL);
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_src E = tx_src_param(tx, &tx->insn.src[1]);
+struct ureg_src sample;
+struct ureg_dst tmp;
+const int m = tx->insn.dst[0].idx - 2;
+const int n = tx->insn.src[0].idx;
+assert(m >= 0 && m > n);
+
+tx_texcoord_alloc(tx, m);
+tx_texcoord_alloc(tx, m+1);
+tx_texcoord_alloc(tx, m+2);
+
+ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
+ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
+ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), tx->regs.vT[m+2], ureg_src(tx->regs.tS[n]));
+
+sample = ureg_DECL_sampler(ureg, m + 2);
+tx->info->sampler_mask |= 1 << (m + 2);
+tmp = ureg_writemask(tx_scratch(tx), TGSI_WRITEMASK_XYZ);
+
+/* At this step, dst = N = (u', w', z').
+ * We want dst to be the texture sampled at (u'', w'', z''), with
+ * (u'', w'', z'') = 2 * (N.E / N.N) * N - E */
+ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_src(dst), ureg_src(dst));
+ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
+/* at this step tmp.x = 1/N.N */
+ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), ureg_src(dst), E);
+/* at this step tmp.y = N.E */
+ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
+/* at this step tmp.x = N.E/N.N */
+ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 2.0f));
+ureg_MUL(ureg, tmp, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_src(dst));
+/* at this step tmp.xyz = 2 * (N.E / N.N) * N */
+ureg_SUB(ureg, tmp, ureg_src(tmp), E);
+ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(tmp), sample);
+
+return D3D_OK;
 }
 
 DECL_SPECIAL(TEXREG2RGB)
-- 
2.1.3



[Mesa-dev] [PATCH 26/53] st/nine: Correct LOG on negative values

2015-01-07 Thread Axel Davy
We should take the absolute value of the input.

Also return -FLT_MAX instead of -Inf for an input of 0.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index 48492b4..327dd2c 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1968,6 +1968,17 @@ DECL_SPECIAL(RSQ)
 return D3D_OK;
 }
 
+DECL_SPECIAL(LOG)
+{
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst tmp = tx_scratch_scalar(tx);
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
+ureg_LG2(ureg, tmp, ureg_abs(src));
+ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), tx_src_scalar(tmp));
+return D3D_OK;
+}
+
 DECL_SPECIAL(NRM)
 {
 struct ureg_program *ureg = tx->ureg;
@@ -2291,7 +2302,7 @@ struct sm1_op_info inst_table[] =
 _OPI(SLT, SLT, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 12 */
 _OPI(SGE, SGE, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 13 */
 _OPI(EXP, EX2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 14 */
-_OPI(LOG, LG2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 15 */
+_OPI(LOG, LG2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(LOG)), /* 15 */
 _OPI(LIT, LIT, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL), /* 16 */
 _OPI(DST, DST, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 17 */
 _OPI(LRP, LRP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 18 */
@@ -2355,7 +2366,7 @@ struct sm1_op_info inst_table[] =
 
 _OPI(EXPP, EXP, V(0,0), V(1,1), V(0,0), V(0,0), 1, 1, NULL),
 _OPI(EXPP, EX2, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
-_OPI(LOGP, LG2, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
+_OPI(LOGP, LG2, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, SPECIAL(LOG)),
 _OPI(CND,  NOP, V(0,0), V(0,0), V(0,0), V(1,4), 1, 3, SPECIAL(CND)),
 
 _OPI(DEF, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 0, SPECIAL(DEF)),
-- 
2.1.3



[Mesa-dev] [PATCH 50/53] st/nine: Allocate vs constbuf buffer for indirect addressing once.

2015-01-07 Thread Axel Davy
When the shader does indirect addressing on the constants,
we allocate a temporary constant buffer into which we copy
the user constants provided by the app and
the constants defined in the shader.

This patch makes this buffer be allocated once.
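
A minimal sketch of the pattern, assuming a much simplified state struct:
the scratch buffer is allocated together with the other constant storage and
reused on every update, instead of a MALLOC/FREE pair per update. All names
below are hypothetical and do not match the nine structures.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified constant state. */
struct const_state {
    size_t size_bytes;   /* size of the constant storage */
    float *user_consts;  /* constants set by the app */
    float *lconstf_temp; /* scratch buffer, allocated once */
};

static int const_state_init(struct const_state *s, size_t size_bytes)
{
    s->size_bytes = size_bytes;
    s->user_consts = calloc(size_bytes, 1);
    s->lconstf_temp = calloc(size_bytes, 1);
    if (!s->user_consts || !s->lconstf_temp) {
        free(s->user_consts);
        free(s->lconstf_temp);
        s->user_consts = s->lconstf_temp = NULL;
        return -1; /* out of memory */
    }
    return 0;
}

/* Copy the user constants into the scratch buffer, then overlay one range
 * of shader-local constants. The caller must ensure that the range
 * [first_float, first_float + num_floats) fits in the buffer. */
static const float *const_state_merge(struct const_state *s,
                                      const float *local,
                                      size_t first_float, size_t num_floats)
{
    memcpy(s->lconstf_temp, s->user_consts, s->size_bytes);
    memcpy(s->lconstf_temp + first_float, local, num_floats * sizeof(float));
    return s->lconstf_temp; /* same pointer on every call */
}

static void const_state_fini(struct const_state *s)
{
    free(s->user_consts);
    free(s->lconstf_temp);
}
```

The trade-off is a buffer that lives as long as the device in exchange for
no allocation on the draw path.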

Signed-off-by: Axel Davy 
Signed-off-by: Tiziano Bacocco 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/device9.c| 5 -
 src/gallium/state_trackers/nine/nine_state.c | 5 +
 src/gallium/state_trackers/nine/nine_state.h | 1 +
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c
index 68036c0..98f0965 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -267,7 +267,9 @@ NineDevice9_ctor( struct NineDevice9 *This,
 /* Include space for I,B constants for user constbuf. */
 This->state.vs_const_f = CALLOC(This->vs_const_size, 1);
 This->state.ps_const_f = CALLOC(This->ps_const_size, 1);
-if (!This->state.vs_const_f || !This->state.ps_const_f)
+This->state.vs_lconstf_temp = CALLOC(This->vs_const_size,1);
+if (!This->state.vs_const_f || !This->state.ps_const_f ||
+!This->state.vs_lconstf_temp)
 return E_OUTOFMEMORY;
 
 if (strstr(pScreen->get_name(pScreen), "AMD") ||
@@ -347,6 +349,7 @@ NineDevice9_dtor( struct NineDevice9 *This )
 pipe_resource_reference(&This->constbuf_ps, NULL);
 FREE(This->state.vs_const_f);
 FREE(This->state.ps_const_f);
+FREE(This->state.vs_lconstf_temp);
 
 if (This->swapchains) {
 for (i = 0; i < This->nswapchains; ++i)
diff --git a/src/gallium/state_trackers/nine/nine_state.c b/src/gallium/state_trackers/nine/nine_state.c
index dc97529..09b4401 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -501,7 +501,7 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
 const struct nine_lconstf *lconstf =  &device->state.vs->lconstf;
 const struct nine_range *r = lconstf->ranges;
 unsigned n = 0;
-float *dst = (float *)MALLOC(cb.buffer_size);
+float *dst = device->state.vs_lconstf_temp;
 float *src = (float *)cb.user_buffer;
 memcpy(dst, src, cb.buffer_size);
 while (r) {
@@ -516,9 +516,6 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
 
 pipe->set_constant_buffer(pipe, PIPE_SHADER_VERTEX, 0, &cb);
 
-if (device->state.vs->lconstf.ranges)
-FREE((void *)cb.user_buffer);
-
 if (device->state.changed.vs_const_f) {
 struct nine_range *r = device->state.changed.vs_const_f;
 struct nine_range *p = r;
diff --git a/src/gallium/state_trackers/nine/nine_state.h b/src/gallium/state_trackers/nine/nine_state.h
index 742c6f6..58ca8c9 100644
--- a/src/gallium/state_trackers/nine/nine_state.h
+++ b/src/gallium/state_trackers/nine/nine_state.h
@@ -144,6 +144,7 @@ struct nine_state
 float *vs_const_f;
 intvs_const_i[NINE_MAX_CONST_I][4];
 BOOL   vs_const_b[NINE_MAX_CONST_B];
+float *vs_lconstf_temp;
 uint32_t vs_key;
 
 struct NinePixelShader9 *ps;
-- 
2.1.3



[Mesa-dev] [PATCH 34/53] st/nine: Implement TEXCOORD special behaviours

2015-01-07 Thread Axel Davy
texcoord for ps < 1_4 should clamp the values between 0 and 1.

texcrd (texcoord ps 1_4) does not clamp and can be used with
two modifiers, _dw and _dz, which mean the channels are divided
by w or z.
Implement those in shared code, since the same modifiers can be used
for texld ps 1_4.
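
The behaviours described above can be sketched on plain floats as follows.
The names are hypothetical; setting w to 1 in the ps < 1_4 case and dividing
all four channels follow the diff below rather than this message.

```c
struct vec4 { float x, y, z, w; };

static float clamp01(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* texcoord, ps < 1_4: xyz clamped to [0, 1], w set to 1 */
static struct vec4 texcoord_ps1x(struct vec4 t)
{
    struct vec4 r = { clamp01(t.x), clamp01(t.y), clamp01(t.z), 1.0f };
    return r;
}

/* _dw source modifier: every channel divided by w, no clamping */
static struct vec4 apply_dw(struct vec4 t)
{
    struct vec4 r = { t.x / t.w, t.y / t.w, t.z / t.w, t.w / t.w };
    return r;
}

/* _dz source modifier: every channel divided by z, no clamping */
static struct vec4 apply_dz(struct vec4 t)
{
    struct vec4 r = { t.x / t.z, t.y / t.z, t.z / t.z, t.w / t.z };
    return r;
}
```

Because the modifiers act on the source operand, handling them once in the
source-parameter path covers both texcrd and texld for ps 1_4.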

Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 29 ++-
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index 899aee8..cf3f646 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -937,6 +937,23 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
 if (param->rel)
 src = ureg_src_indirect(src, tx_src_param(tx, param->rel));
 
+switch (param->mod) {
+case NINED3DSPSM_DW:
+tmp = tx_scratch(tx);
+ureg_MOV(ureg, tmp, src);
+ureg_DIV(ureg, tmp, ureg_src(tmp), ureg_swizzle(ureg_src(tmp), NINE_SWIZZLE4(W,W,W,W)));
+src = ureg_src(tmp);
+break;
+case NINED3DSPSM_DZ:
+tmp = tx_scratch(tx);
+ureg_MOV(ureg, tmp, src);
+ureg_DIV(ureg, tmp, ureg_src(tmp), ureg_swizzle(ureg_src(tmp), NINE_SWIZZLE4(Z,Z,Z,Z)));
+src = ureg_src(tmp);
+break;
+default:
+break;
+}
+
 if (param->swizzle != NINED3DSP_NOSWIZZLE)
 src = ureg_swizzle(src,
(param->swizzle >> 0) & 0x3,
@@ -979,7 +996,7 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
 break;
 case NINED3DSPSM_DZ:
 case NINED3DSPSM_DW:
-/* handled in instruction */
+/* Already handled*/
 break;
 case NINED3DSPSM_SIGN:
 tmp = tx_scratch(tx);
@@ -2049,7 +2066,8 @@ DECL_SPECIAL(TEXCOORD)
 struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
 
 tx_texcoord_alloc(tx, s);
-ureg_MOV(ureg, dst, tx->regs.vT[s]); /* XXX is this sufficient ? */
+ureg_MOV(ureg, ureg_writemask(ureg_saturate(dst), TGSI_WRITEMASK_XYZ), tx->regs.vT[s]);
+ureg_MOV(ureg, ureg_writemask(dst, TGSI_WRITEMASK_W), ureg_imm1f(tx->ureg, 1.0f));
 
 return D3D_OK;
 }
@@ -2057,11 +2075,12 @@ DECL_SPECIAL(TEXCOORD)
 DECL_SPECIAL(TEXCOORD_ps14)
 {
 struct ureg_program *ureg = tx->ureg;
-const unsigned s = tx->insn.src[0].idx;
+struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
 struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
 
-tx_texcoord_alloc(tx, s);
-ureg_MOV(ureg, dst, tx->regs.vT[s]); /* XXX is this sufficient ? */
+assert(tx->insn.src[0].file == D3DSPR_TEXTURE);
+
+ureg_MOV(ureg, dst, src);
 
 return D3D_OK;
 }
-- 
2.1.3



[Mesa-dev] [PATCH 52/53] st/nine: Change comment related to vertex shader inputs not matching declaration

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_state.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_state.c b/src/gallium/state_trackers/nine/nine_state.c
index 09b4401..99173fa 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -214,11 +214,12 @@ update_vertex_elements(struct NineDevice9 *device)
 if (state->stream_freq[b] & D3DSTREAMSOURCE_INSTANCEDATA)
 ve[n].instance_divisor = state->stream_freq[b] & 0x7F;
 } else {
-/* TODO:
- * If drivers don't want to handle this, insert a dummy buffer.
- * But on which stream ?
- */
-/* no data, disable */
+/* TODO: MSDN doesn't specify what should happen when the vertex
+ * declaration doesn't match the vertex shader inputs.
+ * Some websites say the code will pass but nothing will get rendered.
+ * We should check and implement the correct behaviour. */
+/* Put PIPE_FORMAT_NONE.
+ * Some drivers (r300) are very unhappy with that */
 ve[n].src_format = PIPE_FORMAT_NONE;
 ve[n].src_offset = 0;
 ve[n].instance_divisor = 0;
-- 
2.1.3



[Mesa-dev] [PATCH 33/53] st/nine: Fix CALLNZ implementation

2015-01-07 Thread Axel Davy
Nothing seems to indicate that the negation modifier would be stored in the
instruction flags instead of the source modifier. tx_src_param has
already handled it if it is in the source modifier.

In addition, when the card supports native integers, booleans
are stored as 32-bit ints and are equal to
0 or 0x.

Given that 0x would be NaN if interpreted as a float, it is better to use
UIF than IF.
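
To illustrate the NaN concern, here is a small sketch that reinterprets a
32-bit pattern as a float. It assumes the "true" value is the all-ones
pattern 0xFFFFFFFF; that is an assumption, since the value is not spelled
out in full above. The helper name is hypothetical.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE 754 single-precision float.
 * memcpy avoids the aliasing problems of a pointer cast. */
static float bits_as_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}
```

With all exponent bits set and a non-zero mantissa, bits_as_float(0xFFFFFFFFu)
is a NaN, so a float-based condition on such a boolean is fragile, while an
integer test against zero is unambiguous.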

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_shader.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index fb01408..899aee8 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1431,17 +1431,12 @@ DECL_SPECIAL(CALL)
 DECL_SPECIAL(CALLNZ)
 {
 struct ureg_program *ureg = tx->ureg;
-struct ureg_dst tmp = tx_scratch_scalar(tx);
 struct ureg_src src = tx_src_param(tx, &tx->insn.src[1]);
 
-/* NOTE: source should be const bool, so we can use NOT/SUB instead of [U]SNE 0 */
-if (!tx->insn.flags) {
-if (tx->native_integers)
-ureg_NOT(ureg, tmp, src);
-else
-ureg_SUB(ureg, tmp, ureg_imm1f(ureg, 1.0f), src);
-}
-ureg_IF(ureg, tx->insn.flags ? src : tx_src_scalar(tmp), tx_cond(tx));
+if (!tx->native_integers)
+ureg_IF(ureg, src, tx_cond(tx));
+else
+ureg_UIF(ureg, src, tx_cond(tx));
 ureg_CAL(ureg, &tx->inst_labels[tx->insn.src[0].idx]);
 tx_endcond(tx);
 ureg_ENDIF(ureg);
-- 
2.1.3



[Mesa-dev] [PATCH 30/53] st/nine: Remove duplicated code for ps texcoord input declaration

2015-01-07 Thread Axel Davy
Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_shader.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index 8bcf67b..3fefce4 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2046,8 +2046,7 @@ DECL_SPECIAL(TEXCOORD)
 const unsigned s = tx->insn.dst[0].idx;
 struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
 
-if (ureg_src_is_undef(tx->regs.vT[s]))
-tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
+tx_texcoord_alloc(tx, s);
 ureg_MOV(ureg, dst, tx->regs.vT[s]); /* XXX is this sufficient ? */
 
 return D3D_OK;
@@ -2059,8 +2058,7 @@ DECL_SPECIAL(TEXCOORD_ps14)
 const unsigned s = tx->insn.src[0].idx;
 struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
 
-if (ureg_src_is_undef(tx->regs.vT[s]))
-tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
+tx_texcoord_alloc(tx, s);
 ureg_MOV(ureg, dst, tx->regs.vT[s]); /* XXX is this sufficient ? */
 
 return D3D_OK;
@@ -2159,8 +2157,7 @@ DECL_SPECIAL(TEXM3x3)
 assert(m >= 0 && m > n);
 
 for (s = m; s <= (m + 2); ++s) {
-if (ureg_src_is_undef(tx->regs.vT[s]))
-tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
+tx_texcoord_alloc(tx, s);
 src[s] = tx->regs.vT[s];
 }
 ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), src[0], ureg_src(tx->regs.tS[n]));
@@ -2244,8 +2241,7 @@ DECL_SPECIAL(TEX)
 struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
 struct ureg_src src[2];
 
-if (ureg_src_is_undef(tx->regs.vT[s]))
-tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
+tx_texcoord_alloc(tx, s);
 
 src[0] = tx->regs.vT[s];
 src[1] = ureg_DECL_sampler(ureg, s);
-- 
2.1.3



[Mesa-dev] [PATCH 43/53] st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 39 ---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index 794a1db..9d4fb2f 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2114,12 +2114,34 @@ DECL_SPECIAL(TEXBEML)
 
 DECL_SPECIAL(TEXREG2AR)
 {
-STUB(D3DERR_INVALIDCALL);
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_src sample;
+const int m = tx->insn.dst[0].idx;
+const int n = tx->insn.src[0].idx;
+assert(m >= 0 && m > n);
+
+sample = ureg_DECL_sampler(ureg, m);
+tx->info->sampler_mask |= 1 << m;
+ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_swizzle(ureg_src(tx->regs.tS[n]), NINE_SWIZZLE4(W,X,X,X)), sample);
+
+return D3D_OK;
 }
 
 DECL_SPECIAL(TEXREG2GB)
 {
-STUB(D3DERR_INVALIDCALL);
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_src sample;
+const int m = tx->insn.dst[0].idx;
+const int n = tx->insn.src[0].idx;
+assert(m >= 0 && m > n);
+
+sample = ureg_DECL_sampler(ureg, m);
+tx->info->sampler_mask |= 1 << m;
+ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_swizzle(ureg_src(tx->regs.tS[n]), NINE_SWIZZLE4(Y,Z,Z,Z)), sample);
+
+return D3D_OK;
 }
 
 DECL_SPECIAL(TEXM3x2PAD)
@@ -2199,7 +2221,18 @@ DECL_SPECIAL(TEXM3x3SPEC)
 
 DECL_SPECIAL(TEXREG2RGB)
 {
-STUB(D3DERR_INVALIDCALL);
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_src sample;
+const int m = tx->insn.dst[0].idx;
+const int n = tx->insn.src[0].idx;
+assert(m >= 0 && m > n);
+
+sample = ureg_DECL_sampler(ureg, m);
+tx->info->sampler_mask |= 1 << m;
+ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_src(tx->regs.tS[n]), sample);
+
+return D3D_OK;
 }
 
 DECL_SPECIAL(TEXDP3TEX)
-- 
2.1.3



[Mesa-dev] [PATCH 18/53] st/nine: Remove some shader unused code

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 23 +--
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index fcc1c68..8b96673 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -35,11 +35,6 @@
 
 #define DBG_CHANNEL DBG_SHADER
 
-#if 1
-#define NINE_TGSI_LAZY_DEVS /* don't use TGSI_OPCODE_BREAKC */
-#endif
-#define NINE_TGSI_LAZY_R600 /* don't use TGSI_OPCODE_DP2A */
-
 #define DUMP(args...) _nine_debug_printf(DBG_CHANNEL, NULL, args)
 
 
@@ -1542,24 +1537,16 @@ DECL_SPECIAL(REP)
 if (tx->native_integers)
 {
 ureg_USGE(ureg, tmp, tx_src_scalar(ctr), rep);
-#ifdef NINE_TGSI_LAZY_DEVS
 ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
-#endif
 }
 else
 {
 ureg_SGE(ureg, tmp, tx_src_scalar(ctr), rep);
-#ifdef NINE_TGSI_LAZY_DEVS
 ureg_IF(ureg, tx_src_scalar(tmp), tx_cond(tx));
-#endif
 }
-#ifdef NINE_TGSI_LAZY_DEVS
 ureg_BRK(ureg);
 tx_endcond(tx);
 ureg_ENDIF(ureg);
-#else
-ureg_BREAKC(ureg, tx_src_scalar(tmp));
-#endif
 
 if (tx->native_integers) {
 ureg_UADD(ureg, ctr, tx_src_scalar(ctr), ureg_imm1u(ureg, 1));
@@ -1637,14 +1624,10 @@ DECL_SPECIAL(BREAKC)
 src[0] = tx_src_param(tx, &tx->insn.src[0]);
 src[1] = tx_src_param(tx, &tx->insn.src[1]);
 ureg_insn(tx->ureg, cmp_op, &tmp, 1, src, 2);
-#ifdef NINE_TGSI_LAZY_DEVS
 ureg_IF(tx->ureg, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), tx_cond(tx));
 ureg_BRK(tx->ureg);
 tx_endcond(tx);
 ureg_ENDIF(tx->ureg);
-#else
-ureg_BREAKC(tx->ureg, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
-#endif
 return D3D_OK;
 }
 
@@ -1964,7 +1947,6 @@ DECL_SPECIAL(NRM)
 
 DECL_SPECIAL(DP2ADD)
 {
-#ifdef NINE_TGSI_LAZY_R600
 struct ureg_dst tmp = tx_scratch_scalar(tx);
 struct ureg_src dp2 = tx_src_scalar(tmp);
 struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
@@ -1978,9 +1960,6 @@ DECL_SPECIAL(DP2ADD)
 ureg_ADD(tx->ureg, dst, src[2], dp2);
 
 return D3D_OK;
-#else
-return NineTranslateInstruction_Generic(tx);
-#endif
 }
 
 DECL_SPECIAL(TEXCOORD)
@@ -2355,7 +2334,7 @@ struct sm1_op_info inst_table[] =
 /* Misc */
 _OPI(CMP,CMP,  V(0,0), V(0,0), V(1,2), V(3,0), 1, 3, SPECIAL(CMP)), /* reversed */
 _OPI(BEM,NOP,  V(0,0), V(0,0), V(1,4), V(1,4), 0, 0, SPECIAL(BEM)),
-_OPI(DP2ADD, DP2A, V(0,0), V(0,0), V(2,0), V(3,0), 1, 3, SPECIAL(DP2ADD)), /* for radeons */
+_OPI(DP2ADD, NOP,  V(0,0), V(0,0), V(2,0), V(3,0), 1, 3, SPECIAL(DP2ADD)), /* for radeons */
 _OPI(DSX,DDX,  V(0,0), V(0,0), V(2,1), V(3,0), 1, 1, NULL),
 _OPI(DSY,DDY,  V(0,0), V(0,0), V(2,1), V(3,0), 1, 1, NULL),
 _OPI(TEXLDD, TXD,  V(0,0), V(0,0), V(2,1), V(3,0), 1, 4, SPECIAL(TEXLDD)),
-- 
2.1.3



[Mesa-dev] [PATCH 20/53] st/nine: Saturate oFog and oPts vs outputs

2015-01-07 Thread Axel Davy
According to docs and Wine, these two vs outputs have
to be saturated.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index b0c08ad..6320f36 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1023,13 +1023,13 @@ _tx_dst_param(struct shader_translator *tx, const struct sm1_dst_param *param)
 case 1:
 if (ureg_dst_is_undef(tx->regs.oFog))
 tx->regs.oFog =
-ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_FOG, 0);
+ureg_saturate(ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_FOG, 0));
 dst = tx->regs.oFog;
 break;
 case 2:
 if (ureg_dst_is_undef(tx->regs.oPts))
 tx->regs.oPts =
-ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_PSIZE, 0);
+ureg_saturate(ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_PSIZE, 0));
 dst = tx->regs.oPts;
 break;
 default:
-- 
2.1.3



[Mesa-dev] [PATCH 15/53] st/nine: Add ATI1 and ATI2 support

2015-01-07 Thread Axel Davy
Adds ATI1 and ATI2 support to nine.

They map to PIPE_FORMAT_RGTC1_UNORM and PIPE_FORMAT_RGTC2_UNORM,
but need special handling.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 
Signed-off-by: Xavier Bouchoux 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/adapter9.c   |  3 +++
 src/gallium/state_trackers/nine/basetexture9.c   |  9 ++---
 src/gallium/state_trackers/nine/cubetexture9.c   |  4 
 src/gallium/state_trackers/nine/nine_pipe.h  |  2 ++
 src/gallium/state_trackers/nine/surface9.c   | 19 +++
 src/gallium/state_trackers/nine/volumetexture9.c |  4 
 6 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/nine/adapter9.c b/src/gallium/state_trackers/nine/adapter9.c
index 871a9a3..481f863 100644
--- a/src/gallium/state_trackers/nine/adapter9.c
+++ b/src/gallium/state_trackers/nine/adapter9.c
@@ -302,6 +302,9 @@ NineAdapter9_CheckDeviceFormat( struct NineAdapter9 *This,
 return D3DERR_NOTAVAILABLE;
 }
 
+/* we support ATI1 and ATI2 hack only for 2D textures */
+if (RType != D3DRTYPE_TEXTURE && (CheckFormat == D3DFMT_ATI1 || CheckFormat == D3DFMT_ATI2))
+return D3DERR_NOTAVAILABLE;
 /* if (Usage & D3DUSAGE_NONSECURE) { don't know the implications of this } */
 /* if (Usage & D3DUSAGE_SOFTWAREPROCESSING) { we can always support this } */
 
diff --git a/src/gallium/state_trackers/nine/basetexture9.c b/src/gallium/state_trackers/nine/basetexture9.c
index ffccafd..ea9af94 100644
--- a/src/gallium/state_trackers/nine/basetexture9.c
+++ b/src/gallium/state_trackers/nine/basetexture9.c
@@ -486,9 +486,12 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
 swizzle[1] = PIPE_SWIZZLE_ZERO;
 swizzle[2] = PIPE_SWIZZLE_ZERO;
 swizzle[3] = PIPE_SWIZZLE_ONE;
-} else if (resource->format != PIPE_FORMAT_A8_UNORM) {
-/* A8 is the only exception that should have 0.0 as default values
- * for RGB. It is already what gallium does. All the other ones
+} else if (resource->format != PIPE_FORMAT_A8_UNORM &&
+   resource->format != PIPE_FORMAT_RGTC1_UNORM) {
+/* exceptions:
+ * A8 should have 0.0 as default values for RGB.
+ * ATI1/RGTC1 should be r 0 0 1 (tested on windows).
+ * It is already what gallium does. All the other ones
  * should have 1.0 for non-defined values */
 for (i = 0; i < 4; i++) {
 if (SWIZZLE_TO_REPLACE(desc->swizzle[i]))
diff --git a/src/gallium/state_trackers/nine/cubetexture9.c b/src/gallium/state_trackers/nine/cubetexture9.c
index 43db8cb..32635ad 100644
--- a/src/gallium/state_trackers/nine/cubetexture9.c
+++ b/src/gallium/state_trackers/nine/cubetexture9.c
@@ -63,6 +63,10 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
 return D3DERR_INVALIDCALL;
 }
 
+/* We support ATI1 and ATI2 hacks only for 2D textures */
+if (Format == D3DFMT_ATI1 || Format == D3DFMT_ATI2)
+return D3DERR_INVALIDCALL;
+
 info->screen = pParams->device->screen;
 info->target = PIPE_TEXTURE_CUBE;
 info->format = d3d9_to_pipe_format(Format);
diff --git a/src/gallium/state_trackers/nine/nine_pipe.h b/src/gallium/state_trackers/nine/nine_pipe.h
index 06e4dc9..41792f0 100644
--- a/src/gallium/state_trackers/nine/nine_pipe.h
+++ b/src/gallium/state_trackers/nine/nine_pipe.h
@@ -185,6 +185,8 @@ d3d9_to_pipe_format(D3DFORMAT format)
 case D3DFMT_DXT3: return PIPE_FORMAT_DXT3_RGBA;
 case D3DFMT_DXT4: return PIPE_FORMAT_DXT5_RGBA; /* XXX */
 case D3DFMT_DXT5: return PIPE_FORMAT_DXT5_RGBA;
+case D3DFMT_ATI1: return PIPE_FORMAT_RGTC1_UNORM;
+case D3DFMT_ATI2: return PIPE_FORMAT_RGTC2_UNORM;
 case D3DFMT_UYVY: return PIPE_FORMAT_UYVY;
 case D3DFMT_YUY2: return PIPE_FORMAT_YUYV; /* XXX check */
 case D3DFMT_NV12: return PIPE_FORMAT_NV12;
diff --git a/src/gallium/state_trackers/nine/surface9.c b/src/gallium/state_trackers/nine/surface9.c
index 5928892..b3c7c18 100644
--- a/src/gallium/state_trackers/nine/surface9.c
+++ b/src/gallium/state_trackers/nine/surface9.c
@@ -38,6 +38,8 @@
 
 #define DBG_CHANNEL DBG_SURFACE
 
+#define is_ATI1_ATI2(format) (format == PIPE_FORMAT_RGTC1_UNORM || format == PIPE_FORMAT_RGTC2_UNORM)
+
 HRESULT
 NineSurface9_ctor( struct NineSurface9 *This,
struct NineUnknownParams *pParams,
@@ -382,10 +384,19 @@ NineSurface9_LockRect( struct NineSurface9 *This,
 
 if (This->data) {
 DBG("returning system memory\n");
-
-pLockedRect->Pitch = This->stride;
-pLockedRect->pBits = NineSurface9_GetSystemMemPointer(This,
-  box.x, box.y);
+/* ATI1 and ATI2 need special handling, because of d3d9 bug.
+ * We must advertise to the application as if it is uncompressed
+ * and bpp 8, and the app has a workaround to work with the fact
+ * that it is actually

[Mesa-dev] [PATCH 42/53] st/nine: Implement TEXDP3TEX

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index 02fb69e..794a1db 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2204,7 +2204,25 @@ DECL_SPECIAL(TEXREG2RGB)
 
 DECL_SPECIAL(TEXDP3TEX)
 {
-STUB(D3DERR_INVALIDCALL);
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
+struct ureg_dst tmp;
+struct ureg_src sample;
+const int m = tx->insn.dst[0].idx;
+const int n = tx->insn.src[0].idx;
+assert(m >= 0 && m > n);
+
+tx_texcoord_alloc(tx, m);
+
+tmp = tx_scratch(tx);
+ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
+ureg_MOV(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_YZ), ureg_imm1f(ureg, 0.0f));
+
+sample = ureg_DECL_sampler(ureg, m);
+tx->info->sampler_mask |= 1 << m;
+ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_src(tmp), sample);
+
+return D3D_OK;
 }
 
 DECL_SPECIAL(TEXM3x2DEPTH)
-- 
2.1.3



[Mesa-dev] [PATCH 25/53] st/nine: Handle NRM with input of null norm

2015-01-07 Thread Axel Davy
When the input's xyz are 0.0, the output
should be 0.0. This is due to the fact that
Inf * 0 = 0 for dx9. To handle this case,
cap the result of RSQ to FLT_MAX. We have
FLT_MAX * 0 = 0.

Reviewed-by: David Heidelberg 
Signed-off-by: Axel Davy 

Cc: "10.4" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
index 4dee5f5..48492b4 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1973,10 +1973,12 @@ DECL_SPECIAL(NRM)
 struct ureg_program *ureg = tx->ureg;
 struct ureg_dst tmp = tx_scratch_scalar(tx);
 struct ureg_src nrm = tx_src_scalar(tmp);
+struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
 struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
 ureg_DP3(ureg, tmp, src, src);
 ureg_RSQ(ureg, tmp, nrm);
-ureg_MUL(ureg, tx_dst_param(tx, &tx->insn.dst[0]), src, nrm);
+ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), nrm);
+ureg_MUL(ureg, dst, src, nrm);
 return D3D_OK;
 }
 
-- 
2.1.3



[Mesa-dev] [PATCH 49/53] st/nine: Allocate the correct size for the user constant buffer

2015-01-07 Thread Axel Davy
Signed-off-by: Axel Davy 
Cc: "10.4" 
---
 src/gallium/state_trackers/nine/device9.c| 6 +++---
 src/gallium/state_trackers/nine/nine_state.c | 7 ---
 src/gallium/state_trackers/nine/nine_state.h | 2 +-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c
index cae9239..68036c0 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -62,7 +62,7 @@ NineDevice9_SetDefaultState( struct NineDevice9 *This, boolean is_reset )
 
 assert(!This->is_recording);
 
-nine_state_set_defaults(&This->state, &This->caps, is_reset);
+nine_state_set_defaults(This, &This->caps, is_reset);
 
 This->state.viewport.X = 0;
 This->state.viewport.Y = 0;
@@ -265,8 +265,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
 This->vs_const_size = max_const_vs * sizeof(float[4]);
 This->ps_const_size = max_const_ps * sizeof(float[4]);
 /* Include space for I,B constants for user constbuf. */
-This->state.vs_const_f = CALLOC(NINE_MAX_CONST_ALL, sizeof(float[4]));
-This->state.ps_const_f = CALLOC(NINE_MAX_CONST_ALL, sizeof(float[4]));
+This->state.vs_const_f = CALLOC(This->vs_const_size, 1);
+This->state.ps_const_f = CALLOC(This->ps_const_size, 1);
 if (!This->state.vs_const_f || !This->state.ps_const_f)
 return E_OUTOFMEMORY;
 
diff --git a/src/gallium/state_trackers/nine/nine_state.c b/src/gallium/state_trackers/nine/nine_state.c
index 0137a78..dc97529 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -1000,9 +1000,10 @@ static const DWORD nine_samp_state_defaults[NINED3DSAMP_LAST + 1] =
 [NINED3DSAMP_SHADOW] = 0
 };
 void
-nine_state_set_defaults(struct nine_state *state, const D3DCAPS9 *caps,
+nine_state_set_defaults(struct NineDevice9 *device, const D3DCAPS9 *caps,
 boolean is_reset)
 {
+struct nine_state *state = &device->state;
 unsigned s;
 
 /* Initialize defaults.
@@ -1023,9 +1024,9 @@ nine_state_set_defaults(struct nine_state *state, const D3DCAPS9 *caps,
 }
 
 if (state->vs_const_f)
-memset(state->vs_const_f, 0, NINE_MAX_CONST_F * 4 * sizeof(float));
+memset(state->vs_const_f, 0, device->vs_const_size);
 if (state->ps_const_f)
-memset(state->ps_const_f, 0, NINE_MAX_CONST_F * 4 * sizeof(float));
+memset(state->ps_const_f, 0, device->ps_const_size);
 
 /* Cap dependent initial state:
  */
diff --git a/src/gallium/state_trackers/nine/nine_state.h b/src/gallium/state_trackers/nine/nine_state.h
index 3e0162c..742c6f6 100644
--- a/src/gallium/state_trackers/nine/nine_state.h
+++ b/src/gallium/state_trackers/nine/nine_state.h
@@ -218,7 +218,7 @@ struct NineDevice9;
 
 boolean nine_update_state(struct NineDevice9 *, uint32_t group_mask);
 
-void nine_state_set_defaults(struct nine_state *, const D3DCAPS9 *,
+void nine_state_set_defaults(struct NineDevice9 *, const D3DCAPS9 *,
  boolean is_reset);
 void nine_state_clear(struct nine_state *, const boolean device);
 
-- 
2.1.3



Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Marek Olšák
Yes. I expect lower overhead on SI. This interface should have no
impact on r600g.

Marek

On Wed, Jan 7, 2015 at 3:59 PM, Aditya Avinash  wrote:
> Oh. So, we get better performance if we use atomic counters as buffers
> rather than textures (images) [manipulating views are expensive].
>
> Am I right?
>
> On Wed, Jan 7, 2015 at 8:52 AM, Marek Olšák  wrote:
>>
>> On Wed, Jan 7, 2015 at 3:44 PM, Aditya Avinash 
>> wrote:
>> >
>> >
>> > On Wed, Jan 7, 2015 at 4:56 AM, Marek Olšák  wrote:
>> >>
>> >> From: Marek Olšák 
>> >>
>> >> set_shader_resources is unused.
>> >>
>> >> set_shader_buffers should support shader atomic counter buffers and
>> >> shader
>> >> storage buffers from OpenGL.
>> >>
>> >> The plan is to use slots 0..15 for atomic counters and slots 16..31
>> >> for storage buffers. Atomic counters are planned to be supported first.
>> >>
>> >> This doesn't add any interface for images. The documentation is added
>> >> for future reference.
>> >> ---
>> >>
>> >> This is the interface only. I don't plan to do anything else for now.
>> >> Comments welcome.
>> >>
>> >>  src/gallium/docs/source/context.rst | 16 
>> >>  src/gallium/docs/source/screen.rst  |  4 ++--
>> >>  src/gallium/drivers/galahad/glhd_context.c  |  2 +-
>> >>  src/gallium/drivers/ilo/ilo_state.c |  2 +-
>> >>  src/gallium/drivers/nouveau/nouveau_buffer.c|  2 +-
>> >>  src/gallium/drivers/nouveau/nouveau_screen.c|  2 +-
>> >>  src/gallium/drivers/nouveau/nv50/nv50_formats.c |  2 +-
>> >>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   |  2 +-
>> >>  src/gallium/include/pipe/p_context.h| 20
>> >> +++-
>> >>  src/gallium/include/pipe/p_defines.h|  2 +-
>> >>  src/gallium/include/pipe/p_state.h  | 10 ++
>> >>  11 files changed, 38 insertions(+), 26 deletions(-)
>> >>
>> >> diff --git a/src/gallium/docs/source/context.rst
>> >> b/src/gallium/docs/source/context.rst
>> >> index 5861f46..73fd35f 100644
>> >> --- a/src/gallium/docs/source/context.rst
>> >> +++ b/src/gallium/docs/source/context.rst
>> >> @@ -126,14 +126,14 @@ from a shader without an associated sampler.
>> >> This
>> >> means that they
>> >>  have no support for floating point coordinates, address wrap modes or
>> >>  filtering.
>> >>
>> >> -Shader resources are specified for all the shader stages at once using
>> >> -the ``set_shader_resources`` method.  When binding texture resources,
>> >> -the ``level``, ``first_layer`` and ``last_layer`` pipe_surface fields
>> >> -specify the mipmap level and the range of layers the texture will be
>> >> -constrained to.  In the case of buffers, ``first_element`` and
>> >> -``last_element`` specify the range within the buffer that will be used
>> >> -by the shader resource.  Writes to a shader resource are only allowed
>> >> -when the ``writable`` flag is set.
>> >> +There are 2 types of shader resources: buffers and images.
>> >> +
>> >> +Buffers are specified using the ``set_shader_buffers`` method.
>> >> +
>> >> +Images are specified using the ``set_shader_images`` method. When
>> >> binding
>> >> +images, the ``level``, ``first_layer`` and ``last_layer``
>> >> pipe_image_view
>> >> +fields specify the mipmap level and the range of layers the image will
>> >> be
>> >> +constrained to.
>> >>
>> >>  Surfaces
>> >>  
>> >
>> >
>> > set_shader_images are not defined in this patch.
>> > Will it look similar to pipe_surface or pipe_sampler_view?
>>
>> There will be a separate view for images if this is approved.
>>
>> Marek
>
>
>
>
> --
> Regards,
> Aditya Atluri,
> USA.
>
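
The slot plan quoted above (0..15 for atomic counters, 16..31 for storage buffers) can be written down as a small mapping. A minimal sketch, assuming the layout is adopted as proposed; the enum and function names are hypothetical and do not exist in the gallium interface:

```c
#include <assert.h>

/* Proposed split of the 32 shader buffer slots. */
enum {
    SKETCH_ATOMIC_SLOT_BASE   = 0,
    SKETCH_ATOMIC_SLOT_COUNT  = 16,
    SKETCH_STORAGE_SLOT_BASE  = 16,
    SKETCH_STORAGE_SLOT_COUNT = 16
};

/* Slot used for the i-th atomic counter buffer. */
static unsigned atomic_counter_slot(unsigned i)
{
    assert(i < SKETCH_ATOMIC_SLOT_COUNT);
    return SKETCH_ATOMIC_SLOT_BASE + i;
}

/* Slot used for the i-th shader storage buffer. */
static unsigned storage_buffer_slot(unsigned i)
{
    assert(i < SKETCH_STORAGE_SLOT_COUNT);
    return SKETCH_STORAGE_SLOT_BASE + i;
}

/* Nonzero when a slot falls in the atomic counter range. */
static int slot_is_atomic(unsigned slot)
{
    return slot < SKETCH_STORAGE_SLOT_BASE;
}
```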


Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Aditya Avinash
Thank you!

On Wednesday, January 7, 2015, Marek Olšák  wrote:

> On Wed, Jan 7, 2015 at 2:42 PM, Aditya Avinash  > wrote:
> > Hi,
> > Sounds great but, do you think a separate buffer pipe is required for
> this?
> > Changing Constant buffer to a generic buffer (with alu+load+store) can
> help.
>
> No, constant buffers should remain unchanged.
>
> >
> > What about for R600? Do we have to add
> >
> > r600_init_atom(rctx, &rctx->shaderbuf_state[PIPE_SHADER_VERTEX].atom,
> id++,
> > r600_emit_vs_shader_buffers, 0);
> >
> > to backend? Will this be specific to Atomics?
>
> No, atomic buffers should be set in the exact same way as colorbuffers
> on r600 except that the RAT bit should be set. Search the r600g driver
> for "RAT(1)". I think it supports them already. The shader
> instructions for accessing such buffers begin with "MEM_RAT".
>
> Marek
>
> >
> > Thank you!!
> >
> > On Wed, Jan 7, 2015 at 4:56 AM, Marek Olšák  > wrote:
> >>
> >> From: Marek Olšák >
> >>
> >> set_shader_resources is unused.
> >>
> >> set_shader_buffers should support shader atomic counter buffers and
> shader
> >> storage buffers from OpenGL.
> >>
> >> The plan is to use slots 0..15 for atomic counters and slots 16..31
> >> for storage buffers. Atomic counters are planned to be supported first.
> >>
> >> This doesn't add any interface for images. The documentation is added
> >> for future reference.
> >> ---
> >>
> >> This is the interface only. I don't plan to do anything else for now.
> >> Comments welcome.
> >>
> >>  src/gallium/docs/source/context.rst | 16 
> >>  src/gallium/docs/source/screen.rst  |  4 ++--
> >>  src/gallium/drivers/galahad/glhd_context.c  |  2 +-
> >>  src/gallium/drivers/ilo/ilo_state.c |  2 +-
> >>  src/gallium/drivers/nouveau/nouveau_buffer.c|  2 +-
> >>  src/gallium/drivers/nouveau/nouveau_screen.c|  2 +-
> >>  src/gallium/drivers/nouveau/nv50/nv50_formats.c |  2 +-
> >>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   |  2 +-
> >>  src/gallium/include/pipe/p_context.h| 20
> +++-
> >>  src/gallium/include/pipe/p_defines.h|  2 +-
> >>  src/gallium/include/pipe/p_state.h  | 10 ++
> >>  11 files changed, 38 insertions(+), 26 deletions(-)
> >>
> >> diff --git a/src/gallium/docs/source/context.rst
> >> b/src/gallium/docs/source/context.rst
> >> index 5861f46..73fd35f 100644
> >> --- a/src/gallium/docs/source/context.rst
> >> +++ b/src/gallium/docs/source/context.rst
> >> @@ -126,14 +126,14 @@ from a shader without an associated sampler.  This
> >> means that they
> >>  have no support for floating point coordinates, address wrap modes or
> >>  filtering.
> >>
> >> -Shader resources are specified for all the shader stages at once using
> >> -the ``set_shader_resources`` method.  When binding texture resources,
> >> -the ``level``, ``first_layer`` and ``last_layer`` pipe_surface fields
> >> -specify the mipmap level and the range of layers the texture will be
> >> -constrained to.  In the case of buffers, ``first_element`` and
> >> -``last_element`` specify the range within the buffer that will be used
> >> -by the shader resource.  Writes to a shader resource are only allowed
> >> -when the ``writable`` flag is set.
> >> +There are 2 types of shader resources: buffers and images.
> >> +
> >> +Buffers are specified using the ``set_shader_buffers`` method.
> >> +
> >> +Images are specified using the ``set_shader_images`` method. When
> binding
> >> +images, the ``level``, ``first_layer`` and ``last_layer``
> pipe_image_view
> >> +fields specify the mipmap level and the range of layers the image will
> be
> >> +constrained to.
> >>
> >>  Surfaces
> >>  
> >> diff --git a/src/gallium/docs/source/screen.rst
> >> b/src/gallium/docs/source/screen.rst
> >> index 55d114c..c81ad66 100644
> >> --- a/src/gallium/docs/source/screen.rst
> >> +++ b/src/gallium/docs/source/screen.rst
> >> @@ -403,8 +403,8 @@ resources might be created and handled quite
> >> differently.
> >>process.
> >>  * ``PIPE_BIND_GLOBAL``: A buffer that can be mapped into the global
> >>address space of a compute program.
> >> -* ``PIPE_BIND_SHADER_RESOURCE``: A buffer or texture that can be
> >> -  bound to the graphics pipeline as a shader resource.
> >> +* ``PIPE_BIND_SHADER_BUFFER``: A buffer that can be bound to a shader
> >> where
> >> +  it should support reads, writes, and atomics.
> >>  * ``PIPE_BIND_COMPUTE_RESOURCE``: A buffer or texture that can be
> >>bound to the compute program as a shader resource.
> >>  * ``PIPE_BIND_COMMAND_ARGS_BUFFER``: A buffer that may be sourced by
> the
> >> diff --git a/src/gallium/drivers/galahad/glhd_context.c
> >> b/src/gallium/drivers/galahad/glhd_context.c
> >> index 37ea170..383d76c 100644
> >> --- a/src/gallium/drivers/galahad/glhd_context.c
> >> +++ b/src/gallium/drivers/galahad/glhd_context.c
> >> @@ -1017,7 +1017,7 @@ galahad_context_create(struct pipe_screen
> *_s

Re: [Mesa-dev] [PATCH 03/53] st/nine: Additional defines to d3dtypes.h

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
> From: xavier 

Would be nice to fix up the from name... you can do this with 'git
commit --amend --author asdf' or by reimporting the patch.

>
> Reviewed-by: David Heidelberg 
> Reviewed-by: Axel Davy 
> Signed-off-by: Xavier Bouchoux 
>
> Cc: "10.4" 
> ---
>  include/D3D9/d3d9types.h | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/include/D3D9/d3d9types.h b/include/D3D9/d3d9types.h
> index 0a8f9e5..e53e389 100644
> --- a/include/D3D9/d3d9types.h
> +++ b/include/D3D9/d3d9types.h
> @@ -224,6 +224,8 @@ typedef struct _RGNDATA {
>  #define D3DERR_INVALIDDEVICE MAKE_D3DHRESULT(2155)
>  #define D3DERR_INVALIDCALL   MAKE_D3DHRESULT(2156)
>  #define D3DERR_DRIVERINVALIDCALL MAKE_D3DHRESULT(2157)
> +#define D3DERR_DEVICEREMOVED MAKE_D3DHRESULT(2160)
> +#define D3DERR_DEVICEHUNGMAKE_D3DHRESULT(2164)
>
>  /
>   * Bitmasks *
> @@ -331,6 +333,7 @@ typedef struct _RGNDATA {
>
>  #define D3DPRESENT_DONOTWAIT  0x0001
>  #define D3DPRESENT_LINEAR_CONTENT 0x0002
> +#define D3DPRESENT_RATE_DEFAULT0
>
>  #define D3DCREATE_FPU_PRESERVE  0x0002
>  #define D3DCREATE_MULTITHREADED 0x0004
> @@ -344,6 +347,13 @@ typedef struct _RGNDATA {
>  #define D3DSTREAMSOURCE_INDEXEDDATA  (1 << 30)
>  #define D3DSTREAMSOURCE_INSTANCEDATA (2 << 30)
>
> +/* D3DRS_COLORWRITEENABLE */
> +#define D3DCOLORWRITEENABLE_RED (1L << 0)
> +#define D3DCOLORWRITEENABLE_GREEN   (1L << 1)
> +#define D3DCOLORWRITEENABLE_BLUE(1L << 2)
> +#define D3DCOLORWRITEENABLE_ALPHA   (1L << 3)
> +
> +
>  /
>   * Function macros  *
>   ***/
> --
> 2.1.3
>


Re: [Mesa-dev] [PATCH 07/53] st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
> The cap means D3DFVF_XYZRHW vertices will see clipping.
> This is not the case when
> PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since
> it'll disable clipping.
>
> Signed-off-by: Axel Davy 
>
> Cc: "10.4" 
> ---
>  src/gallium/state_trackers/nine/adapter9.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/nine/adapter9.c b/src/gallium/state_trackers/nine/adapter9.c
> index e409d5f..871a9a3 100644
> --- a/src/gallium/state_trackers/nine/adapter9.c
> +++ b/src/gallium/state_trackers/nine/adapter9.c
> @@ -549,7 +549,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
> D3DPMISCCAPS_CULLCCW |
> D3DPMISCCAPS_COLORWRITEENABLE |
> D3DPMISCCAPS_CLIPPLANESCALEDPOINTS |
> -   D3DPMISCCAPS_CLIPTLVERTS |
> +   /*D3DPMISCCAPS_CLIPTLVERTS |*/

Why is this commented out and not just removed?

> D3DPMISCCAPS_TSSARGTEMP |
> D3DPMISCCAPS_BLENDOP |
> D3DPIPECAP(INDEP_BLEND_ENABLE, D3DPMISCCAPS_INDEPENDENTWRITEMASKS) |
> @@ -560,6 +560,8 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
> D3DPIPECAP(MIXED_COLORBUFFER_FORMATS, D3DPMISCCAPS_MRTINDEPENDENTBITDEPTHS) |
> D3DPMISCCAPS_MRTPOSTPIXELSHADERBLENDING |
> /*D3DPMISCCAPS_FOGVERTEXCLAMPED*/0;
> +if (!screen->get_param(screen, PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION))
> +pCaps->PrimitiveMiscCaps |= D3DPMISCCAPS_CLIPTLVERTS;

Just to confirm, when that cap is available, you *always* use the
window space position?

  -ilia
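
The change under review moves one capability bit out of the unconditional mask and sets it only when the driver lacks window-space position support. A minimal sketch of that conditional, with placeholder bit values; the SKETCH_ names and misc_caps() are hypothetical and do not carry the real D3D9 constants:

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bits for illustration only. */
#define SKETCH_CAP_BLENDOP     (1u << 0)
#define SKETCH_CAP_CLIPTLVERTS (1u << 1)

/* Build the caps mask: the clipping bit is advertised only when the
 * window-space position path is not available. */
static uint32_t misc_caps(bool vs_window_space_position)
{
    uint32_t caps = SKETCH_CAP_BLENDOP;  /* unconditional caps */
    if (!vs_window_space_position)
        caps |= SKETCH_CAP_CLIPTLVERTS;
    return caps;
}
```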


Re: [Mesa-dev] [PATCH 11/53] st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
> Reviewed-by: David Heidelberg 
> Signed-off-by: Axel Davy 
> ---
>  src/gallium/state_trackers/nine/cubetexture9.c   |  8 
>  src/gallium/state_trackers/nine/texture9.c   |  9 -
>  src/gallium/state_trackers/nine/volumetexture9.c | 10 +-
>  3 files changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/state_trackers/nine/cubetexture9.c b/src/gallium/state_trackers/nine/cubetexture9.c
> index 2c607c0..43db8cb 100644
> --- a/src/gallium/state_trackers/nine/cubetexture9.c
> +++ b/src/gallium/state_trackers/nine/cubetexture9.c
> @@ -38,6 +38,8 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
> HANDLE *pSharedHandle )
>  {
>  struct pipe_resource *info = &This->base.base.info;
> +struct pipe_screen *screen = pParams->device->screen;
> +enum pipe_format pf;
>  unsigned i;
>  D3DSURFACE_DESC sfdesc;
>  HRESULT hr;
> @@ -55,6 +57,12 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
>  if (Usage & D3DUSAGE_AUTOGENMIPMAP)
>  Levels = 0;
>
> +pf = d3d9_to_pipe_format(Format);
> +if (pf == PIPE_FORMAT_NONE ||
> +!screen->is_format_supported(screen, pf, PIPE_TEXTURE_CUBE, 0, PIPE_BIND_SAMPLER_VIEW)) {
> +return D3DERR_INVALIDCALL;
> +}
> +
>  info->screen = pParams->device->screen;
>  info->target = PIPE_TEXTURE_CUBE;
>  info->format = d3d9_to_pipe_format(Format);

info->format = pf; here as well for parity with the other code?

> diff --git a/src/gallium/state_trackers/nine/texture9.c b/src/gallium/state_trackers/nine/texture9.c
> index 8852142..4d7e950 100644
> --- a/src/gallium/state_trackers/nine/texture9.c
> +++ b/src/gallium/state_trackers/nine/texture9.c
> @@ -47,6 +47,7 @@ NineTexture9_ctor( struct NineTexture9 *This,
>  struct pipe_screen *screen = pParams->device->screen;
>  struct pipe_resource *info = &This->base.base.info;
>  struct pipe_resource *resource;
> +enum pipe_format pf;
>  unsigned l;
>  D3DSURFACE_DESC sfdesc;
>  HRESULT hr;
> @@ -92,9 +93,15 @@ NineTexture9_ctor( struct NineTexture9 *This,
>  if (Usage & D3DUSAGE_AUTOGENMIPMAP)
>  Levels = 0;
>
> +pf = d3d9_to_pipe_format(Format);
> +if (Format != D3DFMT_NULL && (pf == PIPE_FORMAT_NONE ||

None of the others have this check... is null valid here but not for
cube/volume?

  -ilia
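
The first review comment asks that the translated format be computed once, validated, and then reused for the assignment. A minimal sketch of that flow with stand-in types; the sketch_ names and the support table are hypothetical and are not the gallium or nine API:

```c
/* Stand-in format enum; SKETCH_FORMAT_NONE marks "no translation". */
enum sketch_format {
    SKETCH_FORMAT_NONE = 0,
    SKETCH_FORMAT_RGBA8,
    SKETCH_FORMAT_DXT1
};

/* Hypothetical API-to-driver format translation. */
static enum sketch_format sketch_translate(int api_format)
{
    switch (api_format) {
    case 1:  return SKETCH_FORMAT_RGBA8;
    case 2:  return SKETCH_FORMAT_DXT1;
    default: return SKETCH_FORMAT_NONE;
    }
}

/* Hypothetical support query: pretend only RGBA8 is supported. */
static int sketch_is_supported(enum sketch_format pf)
{
    return pf == SKETCH_FORMAT_RGBA8;
}

/* Translate once, validate, and hand back the same value for reuse.
 * Returns SKETCH_FORMAT_NONE when the texture must be rejected. */
static enum sketch_format sketch_validated_format(int api_format)
{
    enum sketch_format pf = sketch_translate(api_format);
    if (pf == SKETCH_FORMAT_NONE || !sketch_is_supported(pf))
        return SKETCH_FORMAT_NONE;
    return pf;
}
```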


Re: [Mesa-dev] [PATCH 12/53] st/nine: NineBaseTexture9: update sampler view creation

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
> While previous code was having the correct behaviour in general,
> this new code is more readable (without checking all gallium formats
> manually) and has a more defined behaviour for depth stencil resources.
>
> Reviewed-by: David Heidelberg 
> Signed-off-by: Axel Davy 
> Cc: "10.4" 
> ---
>  src/gallium/state_trackers/nine/basetexture9.c | 39 
> +-
>  1 file changed, 26 insertions(+), 13 deletions(-)
>
> diff --git a/src/gallium/state_trackers/nine/basetexture9.c 
> b/src/gallium/state_trackers/nine/basetexture9.c
> index af4778b..fb5a61a 100644
> --- a/src/gallium/state_trackers/nine/basetexture9.c
> +++ b/src/gallium/state_trackers/nine/basetexture9.c
> @@ -436,6 +436,10 @@ NineBaseTexture9_CreatePipeResource( struct 
> NineBaseTexture9 *This,
>  return D3D_OK;
>  }
>
> +#define SWIZZLE_TO_REPLACE(s) (s == UTIL_FORMAT_SWIZZLE_0 || \
> +   s == UTIL_FORMAT_SWIZZLE_1 || \
> +   s == UTIL_FORMAT_SWIZZLE_NONE)
> +
>  HRESULT
>  NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
>  const int sRGB )
> @@ -444,6 +448,7 @@ NineBaseTexture9_UpdateSamplerView( struct 
> NineBaseTexture9 *This,
>  struct pipe_context *pipe = This->pipe;
>  struct pipe_resource *resource = This->base.resource;
>  struct pipe_sampler_view templ;
> +unsigned i;
>  uint8_t swizzle[4];
>
>  DBG("This=%p sRGB=%d\n", This, sRGB);
> @@ -463,20 +468,28 @@ NineBaseTexture9_UpdateSamplerView( struct 
> NineBaseTexture9 *This,
>  swizzle[3] = PIPE_SWIZZLE_ALPHA;
>  desc = util_format_description(resource->format);
>  if (desc->colorspace == UTIL_FORMAT_COLORSPACE_ZS) {
> -/* ZZZ1 -> 0Z01 (see end of docs/source/tgsi.rst)
> - * XXX: but it's wrong
> -swizzle[0] = PIPE_SWIZZLE_ZERO;
> -swizzle[2] = PIPE_SWIZZLE_ZERO; */
> -} else
> -if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_X &&
> -desc->swizzle[3] == UTIL_FORMAT_SWIZZLE_1) {
> -/* R001/RG01 -> R111/RG11 */
> -if (desc->swizzle[1] == UTIL_FORMAT_SWIZZLE_0)
> -swizzle[1] = PIPE_SWIZZLE_ONE;
> -if (desc->swizzle[2] == UTIL_FORMAT_SWIZZLE_0)
> -swizzle[2] = PIPE_SWIZZLE_ONE;
> +/* msdn doc says default values are R = B = 0.0,
> + * A = 1.0. This implictly indicates the green channel
> + * is always filled with content. However games seem to
> + * look for depth in the r channel, like gallium does.
> + * Moreover it's what dx10 states. In addition, some documentation
> + * seems to indicate depth is the only thing given for depth-stencil
> + * formats. Thus reword the spec by: R should contain the depth.
> + * R, G and B default values are 0.0, while A default value is 1.0 */
> +if (SWIZZLE_TO_REPLACE(desc->swizzle[0]))
> +swizzle[0] = PIPE_SWIZZLE_ZERO;
> +swizzle[1] = PIPE_SWIZZLE_ZERO;
> +swizzle[2] = PIPE_SWIZZLE_ZERO;
> +swizzle[3] = PIPE_SWIZZLE_ONE;
> +} else if (resource->format != PIPE_FORMAT_A8_UNORM) {

Not sure what all the formats supported are, but take a look at
util_format_is_alpha -- it's a more fool-proof check of an alpha-only
format. However perhaps A8_UNORM is the only possible alpha-only
format, in which case this is fine as-is.
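
For reference, here is a rough standalone sketch of what a description-based alpha-only test looks like, as opposed to comparing against one specific format. The enum and struct are simplified stand-ins rather than the real u_format.h definitions, and is_alpha_only is a hypothetical helper that only mirrors the general idea behind util_format_is_alpha (as far as I remember it):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for gallium's format swizzle values (assumption). */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1, SWZ_NONE };

/* Hypothetical, cut-down format description: one swizzle per RGBA output. */
struct fmt_desc {
    enum swz swizzle[4];
};

/* Alpha-only: R, G and B read as constant 0 and alpha is taken from the
 * first stored channel. Any format matching this layout passes, not just
 * one hard-coded enum value. */
static bool is_alpha_only(const struct fmt_desc *d)
{
    return d->swizzle[0] == SWZ_0 &&
           d->swizzle[1] == SWZ_0 &&
           d->swizzle[2] == SWZ_0 &&
           d->swizzle[3] == SWZ_X;
}
```

With a check of this shape, a 16-bit or float alpha-only format would be caught as well, which is the case the plain A8_UNORM comparison would miss.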

> +/* A8 is the only exception that should have 0.0 as default values
> + * for RGB. It is already what gallium does. All the other ones
> + * should have 1.0 for non-defined values */
> +for (i = 0; i < 4; i++) {
> +if (SWIZZLE_TO_REPLACE(desc->swizzle[i]))
> +swizzle[i] = PIPE_SWIZZLE_ONE;
> +}
>  }
> -/* but 000A remains unchanged */
>
>  templ.format = sRGB ? util_format_srgb(resource->format) : 
> resource->format;
>  templ.u.tex.first_layer = 0;
> --
> 2.1.3
>


Re: [Mesa-dev] [PATCH 07/53] st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS

2015-01-07 Thread Axel Davy

On Wed, 7 Jan 2015, Ilia Mirkin wrote:
> On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
>> The cap means D3DFVF_XYZRHW vertices will see clipping.
>> This is not the case when
>> PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since
>> it'll disable clipping.
>>
>> Signed-off-by: Axel Davy 
>>
>> Cc: "10.4" 
>> ---
>>  src/gallium/state_trackers/nine/adapter9.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/state_trackers/nine/adapter9.c 
>> b/src/gallium/state_trackers/nine/adapter9.c
>> index e409d5f..871a9a3 100644
>> --- a/src/gallium/state_trackers/nine/adapter9.c
>> +++ b/src/gallium/state_trackers/nine/adapter9.c
>> @@ -549,7 +549,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
>> D3DPMISCCAPS_CULLCCW |
>> D3DPMISCCAPS_COLORWRITEENABLE |
>> D3DPMISCCAPS_CLIPPLANESCALEDPOINTS |
>> -   D3DPMISCCAPS_CLIPTLVERTS |
>> +   /*D3DPMISCCAPS_CLIPTLVERTS |*/
>
> Why is this commented out and not just removed?

In adapter9, we write all possible flags, but comment out those that we
don't support.

>> D3DPMISCCAPS_TSSARGTEMP |
>> D3DPMISCCAPS_BLENDOP |
>> D3DPIPECAP(INDEP_BLEND_ENABLE, 
>> D3DPMISCCAPS_INDEPENDENTWRITEMASKS) |
>> @@ -560,6 +560,8 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
>> D3DPIPECAP(MIXED_COLORBUFFER_FORMATS, 
>> D3DPMISCCAPS_MRTINDEPENDENTBITDEPTHS) |
>> D3DPMISCCAPS_MRTPOSTPIXELSHADERBLENDING |
>> /*D3DPMISCCAPS_FOGVERTEXCLAMPED*/0;
>> +if (!screen->get_param(screen, PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION))
>> +pCaps->PrimitiveMiscCaps |= D3DPMISCCAPS_CLIPTLVERTS;
>
> Just to confirm, when that cap is available, you *always* use the
> window space position?

D3DPMISCCAPS_CLIPTLVERTS indicates the device will clip
D3DFVF_XYZRHW vertices. This is what happens with our
fallback for D3DFVF_XYZRHW.

When we have PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION, we use
the vs_window_space_position flag to implement D3DFVF_XYZRHW.
According to spec, vs_window_space_position will bypass clipping
too.
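
In other words, roughly the following. This is a standalone sketch with made-up bit values and a hypothetical misc_caps helper, not the actual adapter9.c code; has_window_space_position stands in for the result of the PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION query:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only (assumption); the real
 * D3DPMISCCAPS_* constants come from the D3D9 headers. */
#define CAP_CULLCCW     (1u << 0)
#define CAP_CLIPTLVERTS (1u << 1)

/* Build the misc caps mask. The flag is advertised only when the
 * fallback path, which does clip XYZRHW vertices, is the one in use. */
static uint32_t misc_caps(bool has_window_space_position)
{
    uint32_t caps = CAP_CULLCCW; /* flags that are always supported */

    if (!has_window_space_position)
        caps |= CAP_CLIPTLVERTS;

    return caps;
}
```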



>  -ilia




Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Marek Olšák
On Wed, Jan 7, 2015 at 4:39 PM, Ilia Mirkin  wrote:
> On Wed, Jan 7, 2015 at 9:35 AM, Marek Olšák  wrote:
>> On Wed, Jan 7, 2015 at 2:41 PM, Ilia Mirkin  wrote:
>>> On Wed, Jan 7, 2015 at 5:56 AM, Marek Olšák  wrote:
>>>> From: Marek Olšák 
>>>>
>>>> set_shader_resources is unused.
>>>>
>>>> set_shader_buffers should support shader atomic counter buffers and shader
>>>> storage buffers from OpenGL.
>>>>
>>>> The plan is to use slots 0..15 for atomic counters and slots 16..31
>>>> for storage buffers. Atomic counters are planned to be supported first.
>>>>
>>>> This doesn't add any interface for images. The documentation is added
>>>> for future reference.
>>>> ---
>>>>
>>>> This is the interface only. I don't plan to do anything else for now.
>>>> Comments welcome.
>>>
>>> Can you clarify how this is better than the set_shader_resources
>>> interface, which can also be shared for images (which will need to
>>> support texture buffers...)?
>>
>> 1) You don't need to create any views for these. Creating,
>> initializing, referencing, and destroying views is work that should be
>> avoided if it's unnecessary.
>
> I guess you mean surfaces? You still have to bind a reference to the
> backing buffer _somewhere_...

Surfaces = views. Roland suggested that we use a separate view type
for writable resources, so I propose that we add pipe_image_view for
images. That will let drivers decide whether they want to treat them
like ordinary textures (sampler views) or whether images are just
colorbuffers in disguise (surfaces).

>
>>
>> 2) It saves space for resource descriptions on SI (both memory and
>> cache). A buffer slot needs 4 dwords, but a texture (image) slot needs
>> 8 dwords.
>>
>> Original DX11 AMD hardware (Evergreen) will have to merge
>> set_shader_buffers, set_shader_images, and set_framebuffer_state
>> anyway. One less function won't make it much easier. Post-DX11
>> hardware (SI) can do pretty much anything, but this solution is more
>> efficient for that hardware.
>>
>>>
>>> FWIW, there's already an impl for nve4 images using
>>> set_shader_resources (not sure how Christoph had tested it, I think
>>> using some preliminary OpenCL C -> TGSI converter with image support).
>>>
>>> Are these buffers fundamentally different than images? We'll still
>>> need atomic support for images as well...
>>
>> The main difference is:
>> - shader buffers don't have a view and format. pipe_resources are set 
>> directly.
>> - shader images have a view and format, this also includes buffers
>> that have a format.
>>
>>>
>>> Also how do you anticipate this will be integrated into TGSI? Right
>>> now there's a TGSI_FILE_RESOURCE -- will there be a new
>>> TGSI_FILE_BUFFER and TGSI_FILE_IMAGE?
>>
>> Yes, this needs to be changed as well.
>>
>> Opinions?
>
> OK, well, this interface also seems workable. From what I can tell,
> nve0 (kepler) is more similar to radeonsi in this regard, and nv50
> isn't realistically going to gain support for this (blob driver
> doesn't either). The wildcard is nvc0, which I haven't really traced
> for image stuff yet. I guess instructions like LOAD/STORE/ATOM* would
> be able to take either IMAGE or BUFFER things? Or separate instruction
> variants?

This is yet to be decided. Are there any differences between opcodes
for buffers and images? If yes, we need separate opcodes. Whatever
happens, I don't see any problem with using the same opcodes for
different resource types.
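
To make the slot plan and the descriptor-size argument quoted above a bit more concrete, here is a small standalone sketch. All names in it (slot_kind, table_bytes, the macros) are hypothetical and not part of the proposed interface; only the slot ranges and the 4-dword/8-dword figures are taken from this thread:

```c
#include <assert.h>

/* Slot ranges from the patch description; macro names are made up. */
#define FIRST_ATOMIC_SLOT   0
#define FIRST_STORAGE_SLOT 16
#define NUM_BUFFER_SLOTS   32

/* SI descriptor sizes as stated in this thread, in dwords. */
#define BUFFER_DESC_DWORDS 4
#define IMAGE_DESC_DWORDS  8

enum buffer_kind { BUF_ATOMIC_COUNTER, BUF_STORAGE, BUF_INVALID };

/* Classify a set_shader_buffers slot under the planned layout. */
static enum buffer_kind slot_kind(unsigned slot)
{
    if (slot < FIRST_STORAGE_SLOT)
        return BUF_ATOMIC_COUNTER;
    if (slot < NUM_BUFFER_SLOTS)
        return BUF_STORAGE;
    return BUF_INVALID;
}

/* Bytes needed for a table of n descriptors (one dword is 4 bytes). */
static unsigned table_bytes(unsigned n, unsigned dwords_per_desc)
{
    return n * dwords_per_desc * 4;
}
```

With 32 slots, the buffer table comes to 512 bytes, whereas the same number of image-sized descriptors would take 1024 bytes.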

>
> This does, however, present an asymmetry to the compute interface,
> which currently just has
>
>void (*set_compute_resources)(struct pipe_context *,
>  unsigned start, unsigned count,
>  struct pipe_surface **resources);
>
> Should that be changed over to the buffer/image interface as well?

Let's not touch any OpenCL-related stuff. We will see what we can do
with OpenCL after this interface is implemented and tested with
OpenGL.

I'd like to say that I won't have time to work on this in the
foreseeable future.

Marek


Re: [Mesa-dev] [PATCH 16/53] st/nine: Rework of boolean constants

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
> Convert them to shader booleans at earlier stage

Why? What's wrong with the conversion as it is now?

>
> Signed-off-by: Axel Davy 
>
> Cc: "10.4" 
> ---
>  src/gallium/state_trackers/nine/device9.c| 35 
> +---
>  src/gallium/state_trackers/nine/device9.h|  6 ++---
>  src/gallium/state_trackers/nine/nine_state.c | 13 +++
>  3 files changed, 22 insertions(+), 32 deletions(-)
>
> @@ -3311,14 +3308,14 @@ NineDevice9_GetPixelShaderConstantB( struct 
> NineDevice9 *This,
>   UINT BoolCount )
>  {
>  const struct nine_state *state = &This->state;
> +int i;
>
>  user_assert(StartRegister  < NINE_MAX_CONST_B, 
> D3DERR_INVALIDCALL);
>  user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, 
> D3DERR_INVALIDCALL);
>  user_assert(pConstantData, D3DERR_INVALIDCALL);
>
> -memcpy(pConstantData,
> -   &state->ps_const_b[StartRegister],
> -   BoolCount * sizeof(state->ps_const_b[0]));
> +for (i = 0; i < BoolCount; i++)
> +pConstantData[i] = state->ps_const_b[StartRegister + i] != 0 ? TRUE 
> : FALSE;

The !=0 doesn't really add anything does it?
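
To spell that out, a tiny standalone comparison (BOOL, TRUE and FALSE are local stand-ins for the Windows definitions, and the two helper names are made up). The ternary is what normalizes a stored shader boolean back to TRUE/FALSE; the explicit comparison changes nothing:

```c
#include <assert.h>

/* Local stand-ins for the Windows BOOL/TRUE/FALSE definitions. */
typedef int BOOL;
#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif

/* Form used in the patch. */
static BOOL with_compare(BOOL v)
{
    return v != 0 ? TRUE : FALSE;
}

/* Same thing without the explicit comparison: in C, the condition of
 * ?: is already tested against zero. */
static BOOL without_compare(BOOL v)
{
    return v ? TRUE : FALSE;
}
```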

>
>  return D3D_OK;
>  }
> diff --git a/src/gallium/state_trackers/nine/device9.h 
> b/src/gallium/state_trackers/nine/device9.h
> index 3649e1b..cf2138a 100644
> --- a/src/gallium/state_trackers/nine/device9.h
> +++ b/src/gallium/state_trackers/nine/device9.h
> @@ -78,9 +78,7 @@ struct NineDevice9
>  struct pipe_resource *constbuf_vs;
>  struct pipe_resource *constbuf_ps;
>  uint16_t max_vs_const_f;
> -uint16_t max_ps_const_f;
> -uint32_t vs_bool_true;
> -uint32_t ps_bool_true;
> +uint16_t max_ps_const_f;;

Extra ;

>
>  struct gen_mipmap_state *gen_mipmap;
>
> @@ -111,6 +109,8 @@ struct NineDevice9
>  boolean user_vbufs;
>  boolean user_ibufs;
>  boolean window_space_position_support;
> +boolean vs_integer;
> +boolean ps_integer;
>  } driver_caps;
>
>  struct u_upload_mgr *upload;
> diff --git a/src/gallium/state_trackers/nine/nine_state.c 
> b/src/gallium/state_trackers/nine/nine_state.c
> index e4e6788..00da62b 100644
> --- a/src/gallium/state_trackers/nine/nine_state.c
> +++ b/src/gallium/state_trackers/nine/nine_state.c
> @@ -347,7 +347,6 @@ update_constants(struct NineDevice9 *device, unsigned 
> shader_type)
>  const int *const_i;
>  const BOOL *const_b;
>  uint32_t data_b[NINE_MAX_CONST_B];
> -uint32_t b_true;
>  uint16_t dirty_i;
>  uint16_t dirty_b;
>  const unsigned usage = PIPE_TRANSFER_WRITE | PIPE_TRANSFER_DISCARD_RANGE;
> @@ -381,7 +380,6 @@ update_constants(struct NineDevice9 *device, unsigned 
> shader_type)
>  dirty_b = device->state.changed.vs_const_b;
>  device->state.changed.vs_const_b = 0;
>  const_b = device->state.vs_const_b;
> -b_true = device->vs_bool_true;
>
>  lconstf = &device->state.vs->lconstf;
>  device->state.ff.clobber.vs_const = TRUE;
> @@ -406,7 +404,6 @@ update_constants(struct NineDevice9 *device, unsigned 
> shader_type)
>  dirty_b = device->state.changed.ps_const_b;
>  device->state.changed.ps_const_b = 0;
>  const_b = device->state.ps_const_b;
> -b_true = device->ps_bool_true;
>
>  lconstf = &device->state.ps->lconstf;
>  device->state.ff.clobber.ps_const = TRUE;
> @@ -421,7 +418,7 @@ update_constants(struct NineDevice9 *device, unsigned 
> shader_type)
> x = buf->width0 - (NINE_MAX_CONST_B - i) * 4;
> c -= i;
> for (n = 0; n < c; ++n, ++i)
> -  data_b[n] = const_b[i] ? b_true : 0;
> +  data_b[n] = const_b[i];

memcpy?
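
Something along these lines, as a standalone sketch (copy_loop and copy_bulk are hypothetical names, and BOOL is a local stand-in for the Windows typedef). The bulk copy is only equivalent as long as BOOL and uint32_t have the same size, hence the static assert:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int BOOL; /* stand-in for the Windows typedef */

/* The bulk copy below relies on both element types being 32 bits wide. */
_Static_assert(sizeof(BOOL) == sizeof(uint32_t), "BOOL must be 32 bits");

/* Element-wise copy, as the loop in the patch does it. */
static void copy_loop(uint32_t *dst, const BOOL *src, unsigned count)
{
    unsigned n;
    for (n = 0; n < count; ++n)
        dst[n] = (uint32_t)src[n];
}

/* Single memcpy over the same range. */
static void copy_bulk(uint32_t *dst, const BOOL *src, unsigned count)
{
    memcpy(dst, src, count * sizeof(*dst));
}
```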

> box.x = x;
> box.width = n * 4;
> DBG("upload ConstantB [%u .. %u]\n", x, x + n - 1);
> @@ -491,9 +488,7 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
>  if (state->changed.vs_const_b) {
>  int *idst = (int *)&state->vs_const_f[4 * device->max_vs_const_f];
>  uint32_t *bdst = (uint32_t *)&idst[4 * NINE_MAX_CONST_I];
> -int i;
> -for (i = 0; i < NINE_MAX_CONST_B; ++i)
> -bdst[i] = state->vs_const_b[i] ? device->vs_bool_true : 0;
> +memcpy(bdst, state->vs_const_b, sizeof(state->vs_const_b));
>  state->changed.vs_const_b = 0;
>  }
>
> @@ -557,9 +552,7 @@ update_ps_constants_userbuf(struct NineDevice9 *device)
>  if (state->changed.ps_const_b) {
>  int *idst = (int *)&state->ps_const_f[4 * device->max_ps_const_f];
>  uint32_t *bdst = (uint32_t *)&idst[4 * NINE_MAX_CONST_I];
> -int i;
> -for (i = 0; i < NINE_MAX_CONST_B; ++i)
> -bdst[i] = state->ps_const_b[i] ? device->ps_bool_true : 0;
> +memcpy(bdst, state->ps_const_b, sizeof(state->ps_const_b));
>  state->changed.ps_const_b = 0;
>  }
>
> --
> 2.1.3
>
> 

Re: [Mesa-dev] [PATCH 18/53] st/nine: Remove some shader unused code

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
> Signed-off-by: Axel Davy 
> Cc: "10.4" 
> ---
>  src/gallium/state_trackers/nine/nine_shader.c | 23 +--
>  1 file changed, 1 insertion(+), 22 deletions(-)
>
> diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
> b/src/gallium/state_trackers/nine/nine_shader.c
> index fcc1c68..8b96673 100644
> --- a/src/gallium/state_trackers/nine/nine_shader.c
> +++ b/src/gallium/state_trackers/nine/nine_shader.c
> @@ -2355,7 +2334,7 @@ struct sm1_op_info inst_table[] =
>  /* Misc */
>  _OPI(CMP,CMP,  V(0,0), V(0,0), V(1,2), V(3,0), 1, 3, SPECIAL(CMP)), 
> /* reversed */
>  _OPI(BEM,NOP,  V(0,0), V(0,0), V(1,4), V(1,4), 0, 0, SPECIAL(BEM)),
> -_OPI(DP2ADD, DP2A, V(0,0), V(0,0), V(2,0), V(3,0), 1, 3, 
> SPECIAL(DP2ADD)), /* for radeons */
> +_OPI(DP2ADD, NOP,  V(0,0), V(0,0), V(2,0), V(3,0), 1, 3, 
> SPECIAL(DP2ADD)), /* for radeons */

Not just for radeons anymore...

  -ilia


Re: [Mesa-dev] [PATCH 15/53] st/nine: Add ATI1 and ATI2 support

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
> Adds ATI1 and ATI2 support to nine.
>
> They map to PIPE_FORMAT_RGTC1_UNORM and PIPE_FORMAT_RGTC2_UNORM,
> but need special handling.
>
> Reviewed-by: David Heidelberg 
> Signed-off-by: Axel Davy 
> Signed-off-by: Xavier Bouchoux 
>
> Cc: "10.4" 
> ---
>  src/gallium/state_trackers/nine/adapter9.c   |  3 +++
>  src/gallium/state_trackers/nine/basetexture9.c   |  9 ++---
>  src/gallium/state_trackers/nine/cubetexture9.c   |  4 
>  src/gallium/state_trackers/nine/nine_pipe.h  |  2 ++
>  src/gallium/state_trackers/nine/surface9.c   | 19 +++
>  src/gallium/state_trackers/nine/volumetexture9.c |  4 
>  6 files changed, 34 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/state_trackers/nine/adapter9.c 
> b/src/gallium/state_trackers/nine/adapter9.c
> index 871a9a3..481f863 100644
> --- a/src/gallium/state_trackers/nine/adapter9.c
> +++ b/src/gallium/state_trackers/nine/adapter9.c
> @@ -302,6 +302,9 @@ NineAdapter9_CheckDeviceFormat( struct NineAdapter9 *This,
>  return D3DERR_NOTAVAILABLE;
>  }
>
> +/* we support ATI1 and ATI2 hack only for 2D textures */
> +if (RType != D3DRTYPE_TEXTURE && (CheckFormat == D3DFMT_ATI1 || 
> CheckFormat == D3DFMT_ATI2))
> +return D3DERR_NOTAVAILABLE;
>  /* if (Usage & D3DUSAGE_NONSECURE) { don't know the implications of this 
> } */
>  /* if (Usage & D3DUSAGE_SOFTWAREPROCESSING) { we can always support this 
> } */
>
> diff --git a/src/gallium/state_trackers/nine/basetexture9.c 
> b/src/gallium/state_trackers/nine/basetexture9.c
> index ffccafd..ea9af94 100644
> --- a/src/gallium/state_trackers/nine/basetexture9.c
> +++ b/src/gallium/state_trackers/nine/basetexture9.c
> @@ -486,9 +486,12 @@ NineBaseTexture9_UpdateSamplerView( struct 
> NineBaseTexture9 *This,
>  swizzle[1] = PIPE_SWIZZLE_ZERO;
>  swizzle[2] = PIPE_SWIZZLE_ZERO;
>  swizzle[3] = PIPE_SWIZZLE_ONE;
> -} else if (resource->format != PIPE_FORMAT_A8_UNORM) {
> -/* A8 is the only exception that should have 0.0 as default values
> - * for RGB. It is already what gallium does. All the other ones
> +} else if (resource->format != PIPE_FORMAT_A8_UNORM &&
> +   resource->format != PIPE_FORMAT_RGTC1_UNORM) {
> +/* exceptions:
> + * A8 should have 0.0 as default values for RGB.
> + * ATI1/RGTC1 should be r 0 0 1 (tested on windows).

But RGTC2 is rg11??

> + * It is already what gallium does. All the other ones
>   * should have 1.0 for non-defined values */
>  for (i = 0; i < 4; i++) {
>  if (SWIZZLE_TO_REPLACE(desc->swizzle[i]))
> diff --git a/src/gallium/state_trackers/nine/cubetexture9.c 
> b/src/gallium/state_trackers/nine/cubetexture9.c
> index 43db8cb..32635ad 100644
> --- a/src/gallium/state_trackers/nine/cubetexture9.c
> +++ b/src/gallium/state_trackers/nine/cubetexture9.c
> @@ -63,6 +63,10 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
>  return D3DERR_INVALIDCALL;
>  }
>
> +/* We support ATI1 and ATI2 hacks only for 2D textures */
> +if (Format == D3DFMT_ATI1 || Format == D3DFMT_ATI2)
> +return D3DERR_INVALIDCALL;
> +
>  info->screen = pParams->device->screen;
>  info->target = PIPE_TEXTURE_CUBE;
>  info->format = d3d9_to_pipe_format(Format);
> diff --git a/src/gallium/state_trackers/nine/nine_pipe.h 
> b/src/gallium/state_trackers/nine/nine_pipe.h
> index 06e4dc9..41792f0 100644
> --- a/src/gallium/state_trackers/nine/nine_pipe.h
> +++ b/src/gallium/state_trackers/nine/nine_pipe.h
> @@ -185,6 +185,8 @@ d3d9_to_pipe_format(D3DFORMAT format)
>  case D3DFMT_DXT3: return PIPE_FORMAT_DXT3_RGBA;
>  case D3DFMT_DXT4: return PIPE_FORMAT_DXT5_RGBA; /* XXX */
>  case D3DFMT_DXT5: return PIPE_FORMAT_DXT5_RGBA;
> +case D3DFMT_ATI1: return PIPE_FORMAT_RGTC1_UNORM;
> +case D3DFMT_ATI2: return PIPE_FORMAT_RGTC2_UNORM;
>  case D3DFMT_UYVY: return PIPE_FORMAT_UYVY;
>  case D3DFMT_YUY2: return PIPE_FORMAT_YUYV; /* XXX check */
>  case D3DFMT_NV12: return PIPE_FORMAT_NV12;
> diff --git a/src/gallium/state_trackers/nine/surface9.c 
> b/src/gallium/state_trackers/nine/surface9.c
> index 5928892..b3c7c18 100644
> --- a/src/gallium/state_trackers/nine/surface9.c
> +++ b/src/gallium/state_trackers/nine/surface9.c
> @@ -38,6 +38,8 @@
>
>  #define DBG_CHANNEL DBG_SURFACE
>
> +#define is_ATI1_ATI2(format) (format == PIPE_FORMAT_RGTC1_UNORM || format == 
> PIPE_FORMAT_RGTC2_UNORM)

The macro is only used once... is it really worth keeping around?

> +
>  HRESULT
>  NineSurface9_ctor( struct NineSurface9 *This,
> struct NineUnknownParams *pParams,
> @@ -382,10 +384,19 @@ NineSurface9_LockRect( struct NineSurface9 *This,
>
>  if (This->data) {
>  DBG("returning system memory\n");
> -
> -pLockedRect->Pitch = This->stride;
> -pLockedRect->pBits = NineSurface9_GetSys

Re: [Mesa-dev] [PATCH] gallium: remove set_shader_resources, add set_shader_buffers for untyped buffers

2015-01-07 Thread Roland Scheidegger
Hmm I'm not quite sure what to think of it. Apparently
set_shader_resources was a closer match to what d3d does (seems to use
UAVs for everything as far as I can tell, plus they are global and not
per stage - I guess another reason why they were set together with RTs).
I guess set_shader_buffers would then cover what you can access in d3d
as StructuredBuffer and RWStructuredBuffer whereas set_shader_images
would cover what can be accessed declared as RWBuffer/RWTextureXD? In
that case I guess this would be manageable for translation, though a
driver would need to figure out what to set with
set_shader_buffers/set_shader_images on their own based on the actual
shaders. But if hardware works like that I don't really oppose that.
As for the slot numbers though d3d11.1 seems to support 64 UAVs -
globally but no further restrictions as far as I can tell (so all could
be atomic counters, for instance).
d3d also has some initial count feature to set the current offset (or
set to -1 to keep current value). I guess this works similarly to how
setting offset for streamout buffers work (a major pita to deal with).

And I think too that the compute interface should match - I guess though
since you now can set this per shader stage you don't need a separate
interface at all (for the record I found how you set that with d3d in a
compute shader, instead of OMSetRenderTargetsAndUnorderedAccessViews()
you use CSSetUnorderedAccessViews()).

Roland


Am 07.01.2015 um 11:56 schrieb Marek Olšák:
> From: Marek Olšák 
> 
> set_shader_resources is unused.
> 
> set_shader_buffers should support shader atomic counter buffers and shader
> storage buffers from OpenGL.
> 
> The plan is to use slots 0..15 for atomic counters and slots 16..31
> for storage buffers. Atomic counters are planned to be supported first.
> 
> This doesn't add any interface for images. The documentation is added
> for future reference.
> ---
> 
> This is the interface only. I don't plan to do anything else for now.
> Comments welcome.
> 
>  src/gallium/docs/source/context.rst | 16 
>  src/gallium/docs/source/screen.rst  |  4 ++--
>  src/gallium/drivers/galahad/glhd_context.c  |  2 +-
>  src/gallium/drivers/ilo/ilo_state.c |  2 +-
>  src/gallium/drivers/nouveau/nouveau_buffer.c|  2 +-
>  src/gallium/drivers/nouveau/nouveau_screen.c|  2 +-
>  src/gallium/drivers/nouveau/nv50/nv50_formats.c |  2 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   |  2 +-
>  src/gallium/include/pipe/p_context.h| 20 +++-
>  src/gallium/include/pipe/p_defines.h|  2 +-
>  src/gallium/include/pipe/p_state.h  | 10 ++
>  11 files changed, 38 insertions(+), 26 deletions(-)
> 
> diff --git a/src/gallium/docs/source/context.rst 
> b/src/gallium/docs/source/context.rst
> index 5861f46..73fd35f 100644
> --- a/src/gallium/docs/source/context.rst
> +++ b/src/gallium/docs/source/context.rst
> @@ -126,14 +126,14 @@ from a shader without an associated sampler.  This 
> means that they
>  have no support for floating point coordinates, address wrap modes or
>  filtering.
>  
> -Shader resources are specified for all the shader stages at once using
> -the ``set_shader_resources`` method.  When binding texture resources,
> -the ``level``, ``first_layer`` and ``last_layer`` pipe_surface fields
> -specify the mipmap level and the range of layers the texture will be
> -constrained to.  In the case of buffers, ``first_element`` and
> -``last_element`` specify the range within the buffer that will be used
> -by the shader resource.  Writes to a shader resource are only allowed
> -when the ``writable`` flag is set.
> +There are 2 types of shader resources: buffers and images.
> +
> +Buffers are specified using the ``set_shader_buffers`` method.
> +
> +Images are specified using the ``set_shader_images`` method. When binding
> +images, the ``level``, ``first_layer`` and ``last_layer`` pipe_image_view
> +fields specify the mipmap level and the range of layers the image will be
> +constrained to.
>  
>  Surfaces
>  
> diff --git a/src/gallium/docs/source/screen.rst 
> b/src/gallium/docs/source/screen.rst
> index 55d114c..c81ad66 100644
> --- a/src/gallium/docs/source/screen.rst
> +++ b/src/gallium/docs/source/screen.rst
> @@ -403,8 +403,8 @@ resources might be created and handled quite differently.
>process.
>  * ``PIPE_BIND_GLOBAL``: A buffer that can be mapped into the global
>address space of a compute program.
> -* ``PIPE_BIND_SHADER_RESOURCE``: A buffer or texture that can be
> -  bound to the graphics pipeline as a shader resource.
> +* ``PIPE_BIND_SHADER_BUFFER``: A buffer that can be bound to a shader where
> +  it should support reads, writes, and atomics.
>  * ``PIPE_BIND_COMPUTE_RESOURCE``: A buffer or texture that can be
>bound to the compute program as a shader resource.
>  * ``PIPE_BIND_COMMAND_ARGS_BUFFER``: A buffer that may be sourced by the
> diff

Re: [Mesa-dev] [PATCH 118/133] nir: Add a sampler index indirect to nir_tex_instr

2015-01-07 Thread Jason Ekstrand
On Wed, Jan 7, 2015 at 7:55 AM, Connor Abbott  wrote:

> On Tue, Jan 6, 2015 at 6:36 PM, Jason Ekstrand 
> wrote:
> >
> >
> > On Mon, Jan 5, 2015 at 10:45 PM, Connor Abbott 
> wrote:
> >>
> >> I created nir_tex_src_sampler_index for exactly this purpose, which
> >> fits in with the "stick all the sources in an array so we can easily
> >> iterate over them" philosophy. If you decide to keep with this
> >> solution, though, at least remove that.
> >
> >
> > Sorry, I completely missed that.  My only gripe is that it doesn't really
> > follow the rest of our base_offset + indirect philosophy.  Is that the
> way
> > you were intending to use it?  i.e. direct just has sampler_index and
> > indirect is sampler_index + nir_tex_src_sampler_index.  If so, maybe we
> > should rename it to nir_tex_src_sampler_indirect.
> >
> > I'm 100% ok with that, It just isn't at all clear how the two work
> together.
> > --Jason
>
> Well, when I added nir_tex_src_sampler_index, it was more of a "I know
> we'll need something like this eventually so I'll stick it here to
> remind myself/other people when the time comes" thing, and I wasn't
> sure which option would be better. So you can keep it and always set
> sampler_index to 0 when it's indirect, or rename it - whatever's
> easier to do, so long as it's consistent.
>

I think I'll go ahead and rename it.  It's more consistent with the rest of
the texture instruction stuff to have it in the list and it's more
consistent with other things to have it be an offset applied to the index.
It's going to generate the same backend code either way.
--Jason
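
For anyone following along, the "base index plus indirect offset" scheme amounts to something like the sketch below. The struct is a cut-down, hypothetical model of the fields from the patch, not the real nir_tex_instr, and both helper functions are made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Cut-down model of the fields under discussion (assumption). */
struct tex_sampler_ref {
    unsigned sampler_index;        /* constant base index */
    bool has_indirect;             /* is there a dynamic offset? */
    unsigned sampler_indirect_max; /* largest value the offset may take */
};

/* Sampler actually used for a given runtime value of the offset. */
static unsigned effective_sampler(const struct tex_sampler_ref *ref,
                                  unsigned indirect_value)
{
    if (!ref->has_indirect)
        return ref->sampler_index;

    assert(indirect_value <= ref->sampler_indirect_max);
    return ref->sampler_index + indirect_value;
}

/* One past the last sampler the instruction can possibly touch. */
static unsigned sampler_range_end(const struct tex_sampler_ref *ref)
{
    unsigned max_off = ref->has_indirect ? ref->sampler_indirect_max : 0;
    return ref->sampler_index + max_off + 1;
}
```

Keeping the constant base separate from the offset means the direct case needs no extra source at all, and a backend can still see the whole range [sampler_index, sampler_range_end) that an indirect access may hit.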


>
> >
> >>
> >>
> >> On Tue, Dec 16, 2014 at 1:13 AM, Jason Ekstrand 
> >> wrote:
> >> > ---
> >> >  src/glsl/nir/nir.c  | 11 +++
> >> >  src/glsl/nir/nir.h  | 10 ++
> >> >  src/glsl/nir/nir_print.c|  4 
> >> >  src/glsl/nir/nir_validate.c |  3 +++
> >> >  4 files changed, 28 insertions(+)
> >> >
> >> > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> >> > index 60c9cff..8bcc64a 100644
> >> > --- a/src/glsl/nir/nir.c
> >> > +++ b/src/glsl/nir/nir.c
> >> > @@ -461,6 +461,13 @@ nir_tex_instr_create(void *mem_ctx, unsigned
> >> > num_srcs)
> >> > instr->has_predicate = false;
> >> > src_init(&instr->predicate);
> >> >
> >> > +   instr->sampler_index = 0;
> >> > +   instr->has_sampler_indirect = false;
> >> > +   src_init(&instr->sampler_indirect);
> >> > +   instr->sampler_indirect_max = 0;
> >> > +
> >> > +   instr->sampler = NULL;
> >> > +
> >> > return instr;
> >> >  }
> >> >
> >> > @@ -1529,6 +1536,10 @@ visit_tex_src(nir_tex_instr *instr,
> >> > nir_foreach_src_cb cb, void *state)
> >> >if (!visit_src(&instr->predicate, cb, state))
> >> >   return false;
> >> >
> >> > +   if (instr->has_sampler_indirect)
> >> > +  if (!visit_src(&instr->sampler_indirect, cb, state))
> >> > + return false;
> >> > +
> >> > if (instr->sampler != NULL)
> >> >if (!visit_deref_src(instr->sampler, cb, state))
> >> >   return false;
> >> > diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> >> > index 32bf634..bc7a226 100644
> >> > --- a/src/glsl/nir/nir.h
> >> > +++ b/src/glsl/nir/nir.h
> >> > @@ -838,7 +838,17 @@ typedef struct {
> >> > /* gather component selector */
> >> > unsigned component : 2;
> >> >
> >> > +   /** The sampler index
> >> > +*
> >> > +* If has_indirect is true, then the sampler index is given by
> >> > +* sampler_index + sampler_indirect where sampler_indirect has a
> >> > maximum
> >> > +* possible value of sampler_indirect_max.
> >> > +*/
> >> > unsigned sampler_index;
> >> > +   bool has_sampler_indirect;
> >> > +   nir_src sampler_indirect;
> >> > +   unsigned sampler_indirect_max;
> >> > +
> >> > nir_deref_var *sampler; /* if this is NULL, use sampler_index
> >> > instead */
> >> >  } nir_tex_instr;
> >> >
> >> > diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
> >> > index 962e408..67df9a5 100644
> >> > --- a/src/glsl/nir/nir_print.c
> >> > +++ b/src/glsl/nir/nir_print.c
> >> > @@ -498,6 +498,10 @@ print_tex_instr(nir_tex_instr *instr,
> >> > print_var_state *state, FILE *fp)
> >> >print_deref(instr->sampler, state, fp);
> >> > } else {
> >> >fprintf(fp, "%u", instr->sampler_index);
> >> > +  if (instr->has_sampler_indirect) {
> >> > + fprintf(fp, " + ");
> >> > + print_src(&instr->sampler_indirect, fp);
> >> > +  }
> >> > }
> >> >
> >> > fprintf(fp, " (sampler)");
> >> > diff --git a/src/glsl/nir/nir_validate.c b/src/glsl/nir/nir_validate.c
> >> > index e565b3c..ed6e482 100644
> >> > --- a/src/glsl/nir/nir_validate.c
> >> > +++ b/src/glsl/nir/nir_validate.c
> >> > @@ -399,6 +399,9 @@ validate_tex_instr(nir_tex_instr *instr,
> >> > validate_state *state)
> >> >validate_src(&instr->src[i], state);
> >> > }
> >> >
> >> > +   if (instr->has_sampler_indirect)
> >> > +  validate_src(&instr->sampler_indirect, s

Re: [Mesa-dev] [PATCH 118/133] nir: Add a sampler index indirect to nir_tex_instr

2015-01-07 Thread Jason Ekstrand
On Wed, Jan 7, 2015 at 9:58 AM, Jason Ekstrand  wrote:

>
>
> On Wed, Jan 7, 2015 at 7:55 AM, Connor Abbott  wrote:
>
>> On Tue, Jan 6, 2015 at 6:36 PM, Jason Ekstrand 
>> wrote:
>> >
>> >
>> > On Mon, Jan 5, 2015 at 10:45 PM, Connor Abbott 
>> wrote:
>> >>
>> >> I created nir_tex_src_sampler_index for exactly this purpose, which
>> >> fits in with the "stick all the sources in an array so we can easily
>> >> iterate over them" philosophy. If you decide to keep with this
>> >> solution, though, at least remove that.
>> >
>> >
>> > Sorry, I completely missed that.  My only gripe is that it doesn't
>> really
>> > follow the rest of our base_offset + indirect philosophy.  Is that the
>> way
>> > you were intending to use it?  i.e. direct just has sampler_index and
>> > indirect is sampler_index + nir_tex_src_sampler_index.  If so, maybe we
>> > should rename it to nir_tex_src_sampler_indirect.
>> >
>> > I'm 100% ok with that, It just isn't at all clear how the two work
>> together.
>> > --Jason
>>
>> Well, when I added nir_tex_src_sampler_index, it was more of a "I know
>> we'll need something like this eventually so I'll stick it here to
>> remind myself/other people when the time comes" thing, and I wasn't
>> sure which option would be better. So you can keep it and always set
>> sampler_index to 0 when it's indirect, or rename it - whatever's
>> easier to do, so long as it's consistent.
>>
>
> I think I'll go ahead and rename it.  It's more consistent with the rest
> of the texture instruction stuff to have it in the list and it's more
> consistent with other things to have it be an offset applied to the index.
> It's going to generate the same backend code either way.
>

Also, it allows backends to do something more interesting if they know that
the index is always the base of the array.
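
A minimal sketch of the "base index plus indirect offset" scheme discussed in this thread. The struct and function names are hypothetical stand-ins, not the actual nir_tex_instr layout or any NIR helper; the only idea taken from the thread is that the effective sampler is sampler_index plus an optional indirect offset bounded by a known maximum.

```c
#include <assert.h>

/* Hypothetical stand-in for the sampler-related fields of a texture
 * instruction; not the real nir_tex_instr. */
struct tex_sampler_ref {
   unsigned sampler_index;   /* constant base index */
   int has_indirect;         /* nonzero if an indirect offset applies */
   unsigned indirect_max;    /* largest value the offset may take */
};

/* Effective index = base + offset, following the base_offset + indirect
 * convention. The offset is ignored for direct accesses. */
static unsigned
effective_sampler_index(const struct tex_sampler_ref *ref, unsigned indirect)
{
   if (!ref->has_indirect)
      return ref->sampler_index;

   assert(indirect <= ref->indirect_max);
   return ref->sampler_index + indirect;
}
```

With this shape, a backend that sees has_indirect set knows the base is a compile-time constant and only the offset varies at run time.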


> --Jason
>
>
>>
>> >
>> >>
>> >>
>> >> On Tue, Dec 16, 2014 at 1:13 AM, Jason Ekstrand 
>> >> wrote:
>> >> > ---
>> >> >  src/glsl/nir/nir.c  | 11 +++
>> >> >  src/glsl/nir/nir.h  | 10 ++
>> >> >  src/glsl/nir/nir_print.c|  4 
>> >> >  src/glsl/nir/nir_validate.c |  3 +++
>> >> >  4 files changed, 28 insertions(+)
>> >> >
>> >> > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
>> >> > index 60c9cff..8bcc64a 100644
>> >> > --- a/src/glsl/nir/nir.c
>> >> > +++ b/src/glsl/nir/nir.c
>> >> > @@ -461,6 +461,13 @@ nir_tex_instr_create(void *mem_ctx, unsigned
>> >> > num_srcs)
>> >> > instr->has_predicate = false;
>> >> > src_init(&instr->predicate);
>> >> >
>> >> > +   instr->sampler_index = 0;
>> >> > +   instr->has_sampler_indirect = false;
>> >> > +   src_init(&instr->sampler_indirect);
>> >> > +   instr->sampler_indirect_max = 0;
>> >> > +
>> >> > +   instr->sampler = NULL;
>> >> > +
>> >> > return instr;
>> >> >  }
>> >> >
>> >> > @@ -1529,6 +1536,10 @@ visit_tex_src(nir_tex_instr *instr,
>> >> > nir_foreach_src_cb cb, void *state)
>> >> >if (!visit_src(&instr->predicate, cb, state))
>> >> >   return false;
>> >> >
>> >> > +   if (instr->has_sampler_indirect)
>> >> > +  if (!visit_src(&instr->sampler_indirect, cb, state))
>> >> > + return false;
>> >> > +
>> >> > if (instr->sampler != NULL)
>> >> >if (!visit_deref_src(instr->sampler, cb, state))
>> >> >   return false;
>> >> > diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> >> > index 32bf634..bc7a226 100644
>> >> > --- a/src/glsl/nir/nir.h
>> >> > +++ b/src/glsl/nir/nir.h
>> >> > @@ -838,7 +838,17 @@ typedef struct {
>> >> > /* gather component selector */
>> >> > unsigned component : 2;
>> >> >
>> >> > +   /** The sampler index
>> >> > +*
>> >> > +* If has_indirect is true, then the sampler index is given by
>> >> > +* sampler_index + sampler_indirect where sampler_indirect has a
>> >> > maximum
>> >> > +* possible value of sampler_indirect_max.
>> >> > +*/
>> >> > unsigned sampler_index;
>> >> > +   bool has_sampler_indirect;
>> >> > +   nir_src sampler_indirect;
>> >> > +   unsigned sampler_indirect_max;
>> >> > +
>> >> > nir_deref_var *sampler; /* if this is NULL, use sampler_index
>> >> > instead */
>> >> >  } nir_tex_instr;
>> >> >
>> >> > diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
>> >> > index 962e408..67df9a5 100644
>> >> > --- a/src/glsl/nir/nir_print.c
>> >> > +++ b/src/glsl/nir/nir_print.c
>> >> > @@ -498,6 +498,10 @@ print_tex_instr(nir_tex_instr *instr,
>> >> > print_var_state *state, FILE *fp)
>> >> >print_deref(instr->sampler, state, fp);
>> >> > } else {
>> >> >fprintf(fp, "%u", instr->sampler_index);
>> >> > +  if (instr->has_sampler_indirect) {
>> >> > + fprintf(fp, " + ");
>> >> > + print_src(&instr->sampler_indirect, fp);
>> >> > +  }
>> >> > }
>> >> >
>> >> > fprintf(fp, " (sampler)");
>> >> > diff --git a/src/glsl/nir/nir_validate.c
>> b/src/glsl/nir/nir_validate.c
>> >> > index e565b3c..ed6e482 100644
>> >> > --- a/src/glsl/nir

Re: [Mesa-dev] [PATCH 24/53] st/nine: Handle RSQ special cases

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
> We should use the absolute value of the input as input to ureg_RSQ.
>
> Moreover, an input of 0.0 should return FLT_MAX.
>
> Reviewed-by: David Heidelberg 
> Signed-off-by: Axel Davy 
>
> Cc: "10.4" 
> ---
>  src/gallium/state_trackers/nine/nine_shader.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
> b/src/gallium/state_trackers/nine/nine_shader.c
> index da77da5..4dee5f5 100644
> --- a/src/gallium/state_trackers/nine/nine_shader.c
> +++ b/src/gallium/state_trackers/nine/nine_shader.c
> @@ -1957,6 +1957,17 @@ DECL_SPECIAL(POW)
>  return D3D_OK;
>  }
>
> +DECL_SPECIAL(RSQ)
> +{
> +struct ureg_program *ureg = tx->ureg;
> +struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
> +struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
> +struct ureg_dst tmp = tx_scratch(tx);
> +ureg_RSQ(ureg, tmp, ureg_abs(src));
> +ureg_MIN(ureg, dst, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));

When would this MIN not return the value in tmp? In the description
you say that RSQ(0.0) should return FLT_MAX... is the theory that
MIN(NaN, FLT_MAX) == FLT_MAX? Is RSQ(0) in tgsi defined to return NaN?

  -ilia
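
For reference, a small C sketch of how a MIN against FLT_MAX behaves for the two inputs in question, assuming plain IEEE 754 float semantics. This says nothing about what TGSI guarantees for RSQ(0) or for MIN with a NaN operand; the helper name is hypothetical and MIN is modeled as a simple compare-and-select.

```c
#include <assert.h>
#include <float.h>
#include <math.h>   /* only for the INFINITY and NAN macros */

/* MIN modeled as compare-and-select: the result is r only when r is
 * strictly smaller than FLT_MAX. Any comparison with NaN is false, so a
 * NaN input also yields FLT_MAX with this operand order. */
static float
min_with_flt_max(float r)
{
   return (r < FLT_MAX) ? r : FLT_MAX;
}
```

Under these assumptions the MIN changes the value when the RSQ result is +Inf (which is what 1/sqrt(0) gives in IEEE arithmetic) and, with this particular select order, when it is NaN; every finite result passes through unchanged.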
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 25/53] st/nine: Handle NRM with input of null norm

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
> When the input's xyz are 0.0, the output
> should be 0.0. This is due to the fact that
> Inf * 0 = 0 for dx9. To handle this case,
> cap the result of RSQ to FLT_MAX. We have
> FLT_MAX * 0 = 0.
>
> Reviewed-by: David Heidelberg 
> Signed-off-by: Axel Davy 
>
> Cc: "10.4" 
> ---
>  src/gallium/state_trackers/nine/nine_shader.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
> b/src/gallium/state_trackers/nine/nine_shader.c
> index 4dee5f5..48492b4 100644
> --- a/src/gallium/state_trackers/nine/nine_shader.c
> +++ b/src/gallium/state_trackers/nine/nine_shader.c
> @@ -1973,10 +1973,12 @@ DECL_SPECIAL(NRM)
>  struct ureg_program *ureg = tx->ureg;
>  struct ureg_dst tmp = tx_scratch_scalar(tx);
>  struct ureg_src nrm = tx_src_scalar(tmp);
> +struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
>  struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
>  ureg_DP3(ureg, tmp, src, src);
>  ureg_RSQ(ureg, tmp, nrm);
> -ureg_MUL(ureg, tx_dst_param(tx, &tx->insn.dst[0]), src, nrm);
> +ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), nrm);
> +ureg_MUL(ureg, dst, src, nrm);

Was this supposed to use tmp instead of nrm? Otherwise tmp is
unused... Also, same question as before wrt the MIN.

  -ilia


Re: [Mesa-dev] [PATCH 25/53] st/nine: Handle NRM with input of null norm

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 1:11 PM, Ilia Mirkin  wrote:
> On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
>> When the input's xyz are 0.0, the output
>> should be 0.0. This is due to the fact that
>> Inf * 0 = 0 for dx9. To handle this case,
>> cap the result of RSQ to FLT_MAX. We have
>> FLT_MAX * 0 = 0.
>>
>> Reviewed-by: David Heidelberg 
>> Signed-off-by: Axel Davy 
>>
>> Cc: "10.4" 
>> ---
>>  src/gallium/state_trackers/nine/nine_shader.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
>> b/src/gallium/state_trackers/nine/nine_shader.c
>> index 4dee5f5..48492b4 100644
>> --- a/src/gallium/state_trackers/nine/nine_shader.c
>> +++ b/src/gallium/state_trackers/nine/nine_shader.c
>> @@ -1973,10 +1973,12 @@ DECL_SPECIAL(NRM)
>>  struct ureg_program *ureg = tx->ureg;
>>  struct ureg_dst tmp = tx_scratch_scalar(tx);
>>  struct ureg_src nrm = tx_src_scalar(tmp);
>> +struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
>>  struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
>>  ureg_DP3(ureg, tmp, src, src);
>>  ureg_RSQ(ureg, tmp, nrm);
>> -ureg_MUL(ureg, tx_dst_param(tx, &tx->insn.dst[0]), src, nrm);
>> +ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), nrm);
>> +ureg_MUL(ureg, dst, src, nrm);
>
> Was this supposed to use tmp instead of nrm? Otherwise tmp is
> unused... Also, same question as before wrt the MIN.

Er wait, I see, nrm == tmp..
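
A short C sketch of the arithmetic behind the commit message, assuming IEEE 754 floats: Inf * 0 is NaN, while FLT_MAX * 0 is 0, so capping the reciprocal square root before the multiply gives a zero output for a zero-length input. The function name is hypothetical and the RSQ result is passed in rather than computed.

```c
#include <assert.h>
#include <float.h>
#include <math.h>   /* only for the INFINITY macro */

/* Scale one vector component by a reciprocal length that has first been
 * capped at FLT_MAX, mirroring the MIN-then-MUL order in the patch. */
static float
scale_component(float component, float rsq)
{
   float capped = (rsq < FLT_MAX) ? rsq : FLT_MAX;
   return component * capped;
}
```

Without the cap, a zero component multiplied by an infinite rsq would come out as NaN under IEEE rules, which is not the 0.0 that the commit message says dx9 produces.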


[Mesa-dev] [RFC] mesa/st: Avoid passing a NULL buffer to the drivers

2015-01-07 Thread Tobias Klausmann
If we capture transform feedback from n streams into (n-1) buffers, we
encounter a NULL buffer; use buffer (n-1) to capture the output of stream n.

This fixes one piglit test with nvc0:
   arb_gpu_shader5-xfb-streams-without-invocations

Signed-off-by: Tobias Klausmann 
---
 src/mesa/state_tracker/st_cb_xformfb.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/state_tracker/st_cb_xformfb.c 
b/src/mesa/state_tracker/st_cb_xformfb.c
index 8f75eda..5a12da4 100644
--- a/src/mesa/state_tracker/st_cb_xformfb.c
+++ b/src/mesa/state_tracker/st_cb_xformfb.c
@@ -123,6 +123,11 @@ st_begin_transform_feedback(struct gl_context *ctx, GLenum 
mode,
   struct st_buffer_object *bo = st_buffer_object(sobj->base.Buffers[i]);
 
   if (bo) {
+ if (!bo->buffer)
+/* If we capture transform feedback from n streams into (n-1)
+ * buffers we have to write to buffer (n-1) for stream n.
+ */
+bo = st_buffer_object(sobj->base.Buffers[i-1]);
  /* Check whether we need to recreate the target. */
  if (!sobj->targets[i] ||
  sobj->targets[i] == sobj->draw_count ||
-- 
2.2.1



Re: [Mesa-dev] [PATCH] mesa: Add mesa SHA-1 functions

2015-01-07 Thread Carl Worth
On Wed, Jan 07 2015, Jose Fonseca wrote:
> I lost track of email a bit over the Christmas period.  Just noticed I had 
> flagged this one for reply. Sorry.

No worries. Thanks for following up now. :-)

> Do you still need me to test anything on Windows? If so are the patches 
> in some pull-able git repos by any chance?

Yes, some testing on Windows would be great. I've got these patches
here:

git://people.freedesktop.org/~cworth/mesa

And testing that the build works fine with or without one of the
potential Windows crypto libraries available would be great. Look for
lines like the following in the configure output:

Shader cache:    yes
With SHA1 from:  libnettle

And you can manually control this by passing options such as:

./configure --disable-shader-cache

or:

./configure --enable-shader-cache --with-sha1=CryptoAPI

The possible values for --with-sha1 are listed in "./configure --help"
and include the following:

libc, libmd, libnettle, libgcrypt, libcrypto, libsha1,
CommonCrypto, CryptoAPI

As I said earlier in the thread, I've tested libnettle, libgcrypt, and
libcrypto on my Linux machine. So any touch testing of any of the other
options, (particularly those available only on Windows), would be great.

In that branch, there's not actually any code that calls into any of the
sha1 functions. So you'll basically just be testing configuration,
building, and linking. If you'd like to go the extra step and verify
that the code can be called and actually do something, then you could
use something like the attached patch which simply prints the computed
sha1 for any compiled shader.
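
As background for the buffer sizes in the attached patch (20 bytes for the digest, 41 chars for the string): a SHA-1 digest is 20 bytes, so its hex form is 40 characters plus a terminating NUL. The helper below is a hypothetical illustration of that conversion and is not the implementation of _mesa_sha1_format.

```c
#include <assert.h>

/* Hypothetical helper: write 20 digest bytes as 40 lowercase hex
 * characters followed by a NUL terminator. */
static void
digest_to_hex(char out[41], const unsigned char digest[20])
{
   static const char hex[] = "0123456789abcdef";

   for (int i = 0; i < 20; i++) {
      out[2 * i]     = hex[digest[i] >> 4];     /* high nibble */
      out[2 * i + 1] = hex[digest[i] & 0x0f];   /* low nibble */
   }
   out[40] = '\0';
}
```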

Please let me know if you have any questions, and what testing results
you get.

Thanks,

-Carl

From 07bd85f5c620361ad0ea358f01a8a0b5139f1239 Mon Sep 17 00:00:00 2001
From: Carl Worth 
Date: Wed, 7 Jan 2015 11:07:59 -0800
Subject: [PATCH] Exercise the recently-added sha1 code.

This commit is not intended to be pushed upstream. It simply adds a
print message for each compiled shader, giving the computed sha1 of
the shader source. (This is intended to provide minimal testing that
the sha1 code detected by the configure script actually links and
runs.)
---
 src/glsl/glsl_parser_extras.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 7bfc39e..39a4749 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -33,6 +33,7 @@ extern "C" {
 }
 
 #include "util/ralloc.h"
+#include "util/sha1.h"
 #include "ast.h"
 #include "glsl_parser_extras.h"
 #include "glsl_parser.h"
@@ -1449,11 +1450,17 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader,
struct _mesa_glsl_parse_state *state =
   new(shader) _mesa_glsl_parse_state(ctx, shader->Stage, shader);
const char *source = shader->Source;
+   unsigned char sha1[20];
+   char sha1_str[41];
 
if (ctx->Const.GenerateTemporaryNames)
   (void) p_atomic_cmpxchg(&ir_variable::temporaries_allocate_names,
   false, true);
 
+   _mesa_sha1_compute(source, strlen(source), sha1);
+   _mesa_sha1_format(sha1_str, sha1);
+   printf("Computed sha1 of GLSL source string: %s\n", sha1_str);
+
state->error = glcpp_preprocess(state, &source, &state->info_log,
  &ctx->Extensions, ctx);
 
-- 
2.1.4





[Mesa-dev] [PATCH] radeonsi: Fix crash when destroying si_screen

2015-01-07 Thread Tom Stellard
We need to dispose of the cached LLVMTargetMachine before calling
r600_destroy_common_screen(), because this function frees the si_screen
object, which invalidates the LLVMTargetMachine pointer.
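
The ordering problem can be sketched in isolation. All types and functions below are hypothetical stand-ins (they are not the radeonsi or LLVM APIs); the point is only that a member pointer has to be read and released while the owning object is still alive.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the target machine and the screen. */
struct machine { int unused; };
struct screen { struct machine *tm; };

static int disposed_count;

static void
dispose_machine(struct machine *tm)
{
   free(tm);
   disposed_count++;
}

/* Stand-in for the common destroy function, which frees the screen. */
static void
destroy_common_screen(struct screen *s)
{
   free(s);
}

static void
destroy_screen(struct screen *s)
{
   dispose_machine(s->tm);     /* reads s->tm while s is still valid */
   destroy_common_screen(s);   /* frees s; s must not be used afterwards */
}
```

Swapping the two calls in destroy_screen() would read s->tm from freed memory, which is the kind of crash the patch description is about.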

https://bugs.freedesktop.org/show_bug.cgi?id=88170
---
 src/gallium/drivers/radeonsi/si_pipe.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 38bff31..e3f8fcf 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -424,11 +424,13 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
if (!sscreen->b.ws->unref(sscreen->b.ws))
return;
 
-   r600_destroy_common_screen(&sscreen->b);
-
 #if HAVE_LLVM >= 0x0306
+   // r600_destroy_common_screen() frees sscreen, so we need to make
+   // sure to dispose the TargetMachine before we call it.
LLVMDisposeTargetMachine(sscreen->tm);
 #endif
+
+   r600_destroy_common_screen(&sscreen->b);
 }
 
 #define SI_TILE_MODE_COLOR_2D_8BPP  14
-- 
1.9.0



[Mesa-dev] [Bug 87886] constant fps drops with Intel and Radeon on Source games

2015-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=87886

--- Comment #15 from almos  ---
I tried l4d2 again with mesa 10.5-dev (git-1829f9c), and still nothing. Kernel
is the same as before (3.17.7). Do I need to underclock my CPU to see the lag
spikes?

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 87886] constant fps drops with Intel and Radeon on Source games

2015-01-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=87886

--- Comment #16 from Stéphane Travostino  ---
Is it possible there's a weird interaction with PRIME? @almos, does your system
have a muxless setup?

My CPU isn't underclocked nor undervolted, and using the performance governor
doesn't help in any way.

Also, I had the same problem with 3.17.6 -- I'll soon try again with an updated
mesa and Linux 3.19-rc



Re: [Mesa-dev] [PATCH 01/13] radeonsi: reduce the size of si_pm4_state

2015-01-07 Thread Tom Stellard
On Mon, Jan 05, 2015 at 12:18:40AM +0100, Marek Olšák wrote:
> From: Marek Olšák 
> 
> - the relocs array is unused, remove it
> - ndw is at most 115 (init), set 140 as the maximum
> - compute needs 4 buffers per state, graphics only needs 1; set 4 as the 
> maximum

The number of buffers per state depends on the input arguments to the kernel,
so it can be far more than 4.

OpenCL requires that the input buffer be at least 256 bytes, which means that,
at minimum, we must be able to support:

(256 / 8) + 3 (internal buffers) = 35 buffers

This patch breaks most of the OpenCL tests; can we revert it for now?
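
The arithmetic above, spelled out as a small sketch. The 8-byte divisor is read here as the size of one pointer-sized kernel argument; that interpretation and all the names are assumptions, not something the message states.

```c
#include <assert.h>

enum {
   MIN_KERNEL_INPUT_BYTES = 256,  /* minimum input buffer size cited above */
   ASSUMED_ARG_BYTES      = 8,    /* assumption: one 8-byte pointer per buffer */
   INTERNAL_BUFFERS       = 3     /* internal buffers cited above */
};

/* Smallest per-state buffer limit that covers the worst case above. */
static unsigned
min_buffers_per_state(void)
{
   return MIN_KERNEL_INPUT_BYTES / ASSUMED_ARG_BYTES + INTERNAL_BUFFERS;
}
```

Under these assumptions the result is 35, which is well above the limit of 4 set by the patch.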

-Tom

> ---
>  src/gallium/drivers/radeonsi/si_pm4.c | 6 +-
>  src/gallium/drivers/radeonsi/si_pm4.h | 9 ++---
>  2 files changed, 3 insertions(+), 12 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_pm4.c 
> b/src/gallium/drivers/radeonsi/si_pm4.c
> index 954eb6e..21ab9f2 100644
> --- a/src/gallium/drivers/radeonsi/si_pm4.c
> +++ b/src/gallium/drivers/radeonsi/si_pm4.c
> @@ -145,17 +145,13 @@ unsigned si_pm4_dirty_dw(struct si_context *sctx)
>  void si_pm4_emit(struct si_context *sctx, struct si_pm4_state *state)
>  {
>   struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs;
> +
>   for (int i = 0; i < state->nbo; ++i) {
>   r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx, 
> state->bo[i],
> state->bo_usage[i], 
> state->bo_priority[i]);
>   }
>  
>   memcpy(&cs->buf[cs->cdw], state->pm4, state->ndw * 4);
> -
> - for (int i = 0; i < state->nrelocs; ++i) {
> - cs->buf[cs->cdw + state->relocs[i]] += cs->cdw << 2;
> - }
> -
>   cs->cdw += state->ndw;
>  
>  #if SI_TRACE_CS
> diff --git a/src/gallium/drivers/radeonsi/si_pm4.h 
> b/src/gallium/drivers/radeonsi/si_pm4.h
> index 8680a9e..388bb4b 100644
> --- a/src/gallium/drivers/radeonsi/si_pm4.h
> +++ b/src/gallium/drivers/radeonsi/si_pm4.h
> @@ -29,9 +29,8 @@
>  
>  #include "radeon/drm/radeon_winsys.h"
>  
> -#define SI_PM4_MAX_DW 256
> -#define SI_PM4_MAX_BO 32
> -#define SI_PM4_MAX_RELOCS 4
> +#define SI_PM4_MAX_DW 140
> +#define SI_PM4_MAX_BO 4
>  
>  // forward defines
>  struct si_context;
> @@ -54,10 +53,6 @@ struct si_pm4_state
>   enum radeon_bo_usage bo_usage[SI_PM4_MAX_BO];
>   enum radeon_bo_priority bo_priority[SI_PM4_MAX_BO];
>  
> - /* relocs for shader data */
> - unsigned nrelocs;
> - unsigned relocs[SI_PM4_MAX_RELOCS];
> -
>   bool compute_pkt;
>  };
>  
> -- 
> 2.1.0
> 


Re: [Mesa-dev] [PATCH 28/53] st/nine: Match REP implementation to LOOP

2015-01-07 Thread Ilia Mirkin
On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy  wrote:
> Previous implementation was fine,
> just instead of having increasing counter,
> have a decreasing counter.
>
> Signed-off-by: Axel Davy 
> ---
>  src/gallium/state_trackers/nine/nine_shader.c | 41 
> +++
>  1 file changed, 23 insertions(+), 18 deletions(-)
>
> diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
> b/src/gallium/state_trackers/nine/nine_shader.c
> index 21b06ce..88d4c07 100644
> --- a/src/gallium/state_trackers/nine/nine_shader.c
> +++ b/src/gallium/state_trackers/nine/nine_shader.c
> @@ -1562,9 +1562,7 @@ DECL_SPECIAL(REP)
>  unsigned *label;
>  struct ureg_src rep = tx_src_param(tx, &tx->insn.src[0]);
>  struct ureg_dst ctr;
> -struct ureg_dst tmp = tx_scratch_scalar(tx);
> -struct ureg_src imm =
> -tx->native_integers ? ureg_imm1u(ureg, 0) : ureg_imm1f(ureg, 0.0f);
> +struct ureg_dst tmp;
>
>  label = tx_bgnloop(tx);
>  ctr = tx_get_loopctr(tx, FALSE);
> @@ -1572,33 +1570,40 @@ DECL_SPECIAL(REP)
>  /* NOTE: rep must be constant, so we don't have to save the count */
>  assert(rep.File == TGSI_FILE_CONSTANT || rep.File == 
> TGSI_FILE_IMMEDIATE);
>
> -ureg_MOV(ureg, ctr, imm);
> +ureg_MOV(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0), rep);
> +/* in the case ctr is float, remove 0.5 to avoid precision issues for 
> comparisons */
> +if (!tx->native_integers)
> +ureg_ADD(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0), 
> ureg_src(ctr), ureg_imm1f(ureg, -0.5f));
> +
>  ureg_BGNLOOP(ureg, label);
> -if (tx->native_integers)
> -{
> -ureg_USGE(ureg, tmp, tx_src_scalar(ctr), rep);
> -ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
> -}
> -else
> -{
> -ureg_SGE(ureg, tmp, tx_src_scalar(ctr), rep);
> +tmp = tx_scratch_scalar(tx);
> +
> +/* stop when crt.x <= 0 */

ctr.x

> +if (!tx->native_integers) {
> +ureg_SLE(ureg, tmp, ureg_scalar(ureg_src(ctr), TGSI_SWIZZLE_X), 
> ureg_imm1f(ureg, 0.0f));

It would be less confusing if this were the same as below, i.e. switch
it to SGT. Or switch the other one to ISLT. Having one be greater and
one less-than is weird.

>  ureg_IF(ureg, tx_src_scalar(tmp), tx_cond(tx));
> +} else {
> +ureg_ISGE(ureg, tmp, ureg_imm1i(ureg, 0), ureg_scalar(ureg_src(ctr), 
> TGSI_SWIZZLE_X));
> +ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
>  }
>  ureg_BRK(ureg);
>  tx_endcond(tx);
>  ureg_ENDIF(ureg);
>
> -if (tx->native_integers) {
> -ureg_UADD(ureg, ctr, tx_src_scalar(ctr), ureg_imm1u(ureg, 1));
> -} else {
> -ureg_ADD(ureg, ctr, tx_src_scalar(ctr), ureg_imm1f(ureg, 1.0f));
> -}
> -
>  return D3D_OK;
>  }
>
>  DECL_SPECIAL(ENDREP)
>  {
> +struct ureg_program *ureg = tx->ureg;
> +struct ureg_dst ctr = tx_get_loopctr(tx, FALSE);
> +
> +if (!tx->native_integers) {
> +ureg_ADD(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0), 
> ureg_src(ctr), ureg_imm1f(ureg, -1.0f));
> +} else {
> +ureg_UADD(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0), 
> ureg_src(ctr), ureg_imm1i(ureg, -1.0));

-1 -- presumably this is an integer...

But at what point does ctr get converted to an integer? At the
beginning, when you subtract 0.5, it appears to still be a float... I
think that ADD is wrong and needs to be a UADD? (But then obviously
-0.5 doesn't make sense).
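
For the float path only, the counting scheme in the quoted patch can be sketched as below: start at rep - 0.5, stop once the counter is <= 0, and subtract 1.0 per iteration. The function name is hypothetical and this is plain C float arithmetic, not TGSI; it does not address the native-integer path questioned above.

```c
#include <assert.h>

/* Count how many times the loop body runs with a decreasing float
 * counter that starts half a step below the repeat count. */
static unsigned
count_rep_iterations(unsigned rep)
{
   float ctr = (float)rep - 0.5f;
   unsigned iterations = 0;

   for (;;) {
      if (ctr <= 0.0f)   /* stop when ctr <= 0 */
         break;
      iterations++;      /* loop body */
      ctr -= 1.0f;       /* decrement done at the end of the body */
   }
   return iterations;
}
```

The half-step offset keeps the counter away from exactly 0.0 at the comparison, so small rounding errors cannot add or drop an iteration for the repeat counts exercised here.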

> +}
> +
>  ureg_ENDLOOP(tx->ureg, tx_endloop(tx));
>  return D3D_OK;
>  }
> --
> 2.1.3
>

