Re: [Mesa-dev] RFC: remove ctx->Driver.TextureMemCpy() hook

2011-12-02 Thread Keith Whitwell
On Fri, 2011-12-02 at 08:14 -0700, Brian Paul wrote:
> This hook was added many years ago to allow using an alternative 
> implementation of memcpy() for glTexImage() that was faster under some 
> circumstances.
> 
> The code is still present in the state tracker in st_cb_texture.c
> 
> The hook is only used in texstore.c in the memcpy_texture() helper. 
> It's not used for glCompressedTex[Sub]Image nor a few other places 
> where it could have been used.
> 
> The non-gallium drivers just set ctx->Driver.TextureMemCpy = memcpy so 
> it's really not utilized there.
> 
> If we think that using regular memcpy() everywhere is OK, I'd like to 
> remove this hook.  I haven't done any investigation into whether the 
> assembly __memcpy() function in st_cb_teximage.c is really any faster 
> nowadays.  But if there really is a benefit to this function, we could 
> use it in more places.
> 
> Any comments?

That was a very long time ago.  I'd be surprised if the problem
persists.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC]Improves st_finalize_texture cycles consumption

2012-01-08 Thread Keith Whitwell
I don't have the code handy (and haven't looked at it in a while), but wonder 
if finer-grained tracking of dirtiness would help?  Or more generally trying to 
preserve more computed results across state changes?

Keith

- Original Message -
> Hi,
> 
> I did some profiling with perf under nexuiz and found that
> st_finalize_texture
> function was one of the most cycle consumming. (~1,50% whereas
> darkplaces took ~30%)
> 
> I rewrite some part of this function to make it a bit faster ; with
> these 2 patches,
> st_finalize_texture consumption went down to ~1%, so a 40-50% boost.
> This does however not translate to more fps to Nexuiz : if there is
> any improvement,
> it is not noticeable (too much noise in measurements). On the other
> hand, the function
> has become less readable. I had to manually unroll loops and use
> intermediate values
> (gcc does not do it automaticaly, using default parameters).
> Of course I think that we should make less call to this function to
> see a true gain,
> but this would require more work.
> 
> Regards,
> Vincent
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] state_trackers/dri/sw: Implement texture_from_pixmap.

2011-08-31 Thread Keith Whitwell
On Wed, 2011-08-31 at 04:55 -0700, Jose Fonseca wrote:
> I haven't tested but the whole patch series looks good AFAICT.
> 
> I'm really happy to see this work completed, as it was excluding the 
> llvmpipe/softpipe from a very big class of apps. Thanks for taking the 
> initiative!

Likewise!  Thanks for taking the time to figure this stuff out.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Building with -fno-builtin-memcmp for improved performance

2011-09-20 Thread Keith Whitwell
On Tue, 2011-09-20 at 10:59 +0200, Fabio wrote:
> There was a discussion some months ago about using -fno-builtin-memcmp for 
> improving memcmp performance:
> http://lists.freedesktop.org/archives/mesa-dev/2011-June/009078.html
> 
> Since then, was it properly addressed in mesa or the flag is still 
> recommended? If so, what about adding it in configure.ac?

I've been meaning to follow up on this too.  I don't know the answer,
but pinging Roland in case he does.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Building with -fno-builtin-memcmp for improved performance

2011-09-20 Thread Keith Whitwell
On Tue, 2011-09-20 at 16:02 +0200, Roland Scheidegger wrote:
> Am 20.09.2011 12:35, schrieb Keith Whitwell:
> > On Tue, 2011-09-20 at 10:59 +0200, Fabio wrote:
> >> There was a discussion some months ago about using -fno-builtin-memcmp for 
> >> improving memcmp performance:
> >> http://lists.freedesktop.org/archives/mesa-dev/2011-June/009078.html
> >>
> >> Since then, was it properly addressed in mesa or the flag is still 
> >> recommended? If so, what about adding it in configure.ac?
> > 
> > I've been meaning to follow up on this too.  I don't know the answer,
> > but pinging Roland in case he does.
> 
> I guess it is still recommended.
> Ideally this is really something which should be fixed in gcc - the
> compiler has all the knowledge about fixed alignment and size (if any)
> (and more importantly knows if only a binary answer is needed which
> makes this much easier) and doesn't need to do any function call.
> If you enable that flag and some platform just has the same primitive
> repz cmpsb sequence in the system library it will just get even slower,
> though I guess chances of that happening are slim (with the possible
> exception of windows).
> I think in most cases it won't make much difference, so nobody cared to
> implement that change. It is most likely still a good idea unless gcc
> addressed that in the meantime...

Hmm, it seemed like it made a big difference in the earlier
discussion...

I should take a look at reducing the size of the struct (as mentioned
before), but surely there's some way to pull in a better memcmp??

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Building with -fno-builtin-memcmp for improved performance

2011-09-20 Thread Keith Whitwell
On Tue, 2011-09-20 at 16:35 +0200, Roland Scheidegger wrote:
> Am 20.09.2011 16:15, schrieb Keith Whitwell:
> > On Tue, 2011-09-20 at 16:02 +0200, Roland Scheidegger wrote:
> >> Am 20.09.2011 12:35, schrieb Keith Whitwell:
> >>> On Tue, 2011-09-20 at 10:59 +0200, Fabio wrote:
> >>>> There was a discussion some months ago about using -fno-builtin-memcmp 
> >>>> for 
> >>>> improving memcmp performance:
> >>>> http://lists.freedesktop.org/archives/mesa-dev/2011-June/009078.html
> >>>>
> >>>> Since then, was it properly addressed in mesa or the flag is still 
> >>>> recommended? If so, what about adding it in configure.ac?
> >>>
> >>> I've been meaning to follow up on this too.  I don't know the answer,
> >>> but pinging Roland in case he does.
> >>
> >> I guess it is still recommended.
> >> Ideally this is really something which should be fixed in gcc - the
> >> compiler has all the knowledge about fixed alignment and size (if any)
> >> (and more importantly knows if only a binary answer is needed which
> >> makes this much easier) and doesn't need to do any function call.
> >> If you enable that flag and some platform just has the same primitive
> >> repz cmpsb sequence in the system library it will just get even slower,
> >> though I guess chances of that happening are slim (with the possible
> >> exception of windows).
> >> I think in most cases it won't make much difference, so nobody cared to
> >> implement that change. It is most likely still a good idea unless gcc
> >> addressed that in the meantime...
> > 
> > Hmm, it seemed like it made a big difference in the earlier
> > discussion...
> Yes for llvmpipe and one app at least.
> But that struct being compared there is most likely the biggest (by far)
> anywhere (at least which is compared in a regular fashion).
> 
> > I should take a look at reducing the size of the struct (as mentioned
> > before), but surely there's some way to pull in a better memcmp??
> 
> Well, apart from using -fno-builtin-memcmp we could build our own
> memcmpxx, though the version I did there (returning binary only result
> and assuming 32bit alignment/size allowing gcc to optimize it) was still
> slower for large sizes than -fno-builtin-memcmp. Of course we could
> optimize it more (e.g. for 64bit aligned/sized things, or using
> hand-coded sse2 versions using 128bit at-a-time comparisons) but then it
> gets more complicated, so I wasn't sure it was worth it.
> 
> For reference here are the earlier numbers (ipers with llvmpipe):
> original ipers: 12.1 fps
> optimized struct compare: 16.8 fps
> -fno-builtin-memcmp: 18.1 fps
> 
> And this was the function I used for getting the numbers:
> 
> static INLINE int util_cmp_struct(const void *src1, const void *src2,
> unsigned count)
> {
>   /* hmm pointer casting is evil */
>   const uint32_t *src1_ptr = (uint32_t *)src1;
>   const uint32_t *src2_ptr = (uint32_t *)src2;
>   unsigned i;
>   assert(count % 4 == 0);
>   for (i = 0; i < count/4; i++) {
> if (*src1_ptr != *src2_ptr) {
>   return 1;
> }
> src1_ptr++;
> src2_ptr++;
>   }
>   return 0;
> }

OK, maybe the first thing to do is fix the compared struct, then let's
see if there's anything significant left for a better memcmp to extract.

I can find some time to do that in the next few days.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/7] intel: Remove the pbo zero-copy code.

2011-09-21 Thread Keith Whitwell
I'm suprised that fragile code lasted as long as it did...

Looks good to me.

Keith

On Wed, 2011-09-21 at 10:15 -0700, Eric Anholt wrote:
> There were notes about the possibility of slowdowns due to zcopy from
> a PBO due to thrashing around of the region.  Slowdowns are even more
> likely now that textures are generally tiled, which a zcopy wouldn't
> get.  Additionally, there were no checks on the buffer size to ensure
> that the hardware-required rounding was present, which could result in
> GPU hangs on large zcopy PBOs.
> ---
>  src/mesa/drivers/dri/intel/intel_buffer_objects.c |   45 
>  src/mesa/drivers/dri/intel/intel_buffer_objects.h |   12 --
>  src/mesa/drivers/dri/intel/intel_regions.c|  119 
> -
>  src/mesa/drivers/dri/intel/intel_regions.h|   11 --
>  src/mesa/drivers/dri/intel/intel_tex_image.c  |   60 ---
>  5 files changed, 0 insertions(+), 247 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/intel/intel_buffer_objects.c 
> b/src/mesa/drivers/dri/intel/intel_buffer_objects.c
> index d35a50e..4df2d76 100644
> --- a/src/mesa/drivers/dri/intel/intel_buffer_objects.c
> +++ b/src/mesa/drivers/dri/intel/intel_buffer_objects.c
> @@ -79,30 +79,6 @@ intel_bufferobj_alloc(struct gl_context * ctx, GLuint 
> name, GLenum target)
> return &obj->Base;
>  }
>  
> -/* Break the COW tie to the region.  The region gets to keep the data.
> - */
> -void
> -intel_bufferobj_release_region(struct intel_buffer_object *intel_obj)
> -{
> -   assert(intel_obj->region->buffer == intel_obj->buffer);
> -   intel_obj->region->pbo = NULL;
> -   intel_obj->region = NULL;
> -
> -   release_buffer(intel_obj);
> -}
> -
> -/* Break the COW tie to the region.  Both the pbo and the region end
> - * up with a copy of the data.
> - */
> -void
> -intel_bufferobj_cow(struct intel_context *intel,
> -struct intel_buffer_object *intel_obj)
> -{
> -   assert(intel_obj->region);
> -   intel_region_cow(intel, intel_obj->region);
> -}
> -
> -
>  /**
>   * Deallocate/free a vertex/pixel buffer object.
>   * Called via glDeleteBuffersARB().
> @@ -122,9 +98,6 @@ intel_bufferobj_free(struct gl_context * ctx, struct 
> gl_buffer_object *obj)
>intel_bufferobj_unmap(ctx, obj);
>  
> free(intel_obj->sys_buffer);
> -   if (intel_obj->region) {
> -  intel_bufferobj_release_region(intel_obj);
> -   }
>  
> drm_intel_bo_unreference(intel_obj->buffer);
> free(intel_obj);
> @@ -160,9 +133,6 @@ intel_bufferobj_data(struct gl_context * ctx,
>  
> assert(!obj->Pointer); /* Mesa should have unmapped it */
>  
> -   if (intel_obj->region)
> -  intel_bufferobj_release_region(intel_obj);
> -
> if (intel_obj->buffer != NULL)
>release_buffer(intel_obj);
>  
> @@ -219,9 +189,6 @@ intel_bufferobj_subdata(struct gl_context * ctx,
>  
> assert(intel_obj);
>  
> -   if (intel_obj->region)
> -  intel_bufferobj_cow(intel, intel_obj);
> -
> /* If we have a single copy in system memory, update that */
> if (intel_obj->sys_buffer) {
>if (intel_obj->source)
> @@ -347,9 +314,6 @@ intel_bufferobj_map_range(struct gl_context * ctx,
>intel_obj->sys_buffer = NULL;
> }
>  
> -   if (intel_obj->region)
> -  intel_bufferobj_cow(intel, intel_obj);
> -
> /* If the mapping is synchronized with other GL operations, flush
>  * the batchbuffer so that GEM knows about the buffer access for later
>  * syncing.
> @@ -510,15 +474,6 @@ intel_bufferobj_buffer(struct intel_context *intel,
> struct intel_buffer_object *intel_obj,
>  GLuint flag)
>  {
> -   if (intel_obj->region) {
> -  if (flag == INTEL_WRITE_PART)
> - intel_bufferobj_cow(intel, intel_obj);
> -  else if (flag == INTEL_WRITE_FULL) {
> - intel_bufferobj_release_region(intel_obj);
> -  intel_bufferobj_alloc_buffer(intel, intel_obj);
> -  }
> -   }
> -
> if (intel_obj->source)
>release_buffer(intel_obj);
>  
> diff --git a/src/mesa/drivers/dri/intel/intel_buffer_objects.h 
> b/src/mesa/drivers/dri/intel/intel_buffer_objects.h
> index d75cdbf..b174e93 100644
> --- a/src/mesa/drivers/dri/intel/intel_buffer_objects.h
> +++ b/src/mesa/drivers/dri/intel/intel_buffer_objects.h
> @@ -31,7 +31,6 @@
>  #include "main/mtypes.h"
>  
>  struct intel_context;
> -struct intel_region;
>  struct gl_buffer_object;
>  
> 
> @@ -47,10 +46,6 @@ struct intel_buffer_object
> /** System memory buffer data, if not using a BO to store the data. */
> void *sys_buffer;
>  
> -   struct intel_region *region; /* Is there a zero-copy texture
> -   associated with this (pixel)
> -   buffer object? */
> -
> drm_intel_bo *range_map_bo;
> void *range_map_buffer;
> unsigned int range_map_offset;
> @@ -102,11 +97,4 @@ intel_buffer_object(struct gl_buffer_object *obj)
> return (struct intel_buffer_object *) obj;
>  }

Re: [Mesa-dev] libgallium.so and miscelaneous buildsystem patches

2011-10-05 Thread Keith Whitwell
On Wed, 2011-10-05 at 20:14 +1100, Christopher James Halse Rogers wrote:
> On Wed, 2011-10-05 at 09:24 +0200, Joakim Sindholt wrote:
> > On Tue, 2011-10-04 at 17:58 +0200, Fabio wrote:
> > > Can the patches at
> > > http://lists.freedesktop.org/archives/mesa-dev/2011-August/011099.html
> > > be considered for merging?
> > > 
> > > Sharing libgallium should save some MB of installed space.
> > 
> > And be an ABI nightmare for distributions
> 
> No; it's a private library.  Distributions will happily ship a
> libgallium built from exactly the same source that the DRI drivers are
> built from.  Indeed, that's what currently happens for those
> distributions with ship with --enable-shared-dricore, and what happens
> in Ubuntu, where we've got this patch series applied in our never-ending
> quest to cram a fully-featured linux system on a 700MB CD.
> 
> Saving 20-odd megabytes is really useful there :)

An alternative would be to build all the drivers into a single library
for maximal sharing.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] llvmpipe: fix a crash in non-SSE path

2011-10-30 Thread Keith Whitwell
Looks good to me.

Keith

On Sun, 2011-10-30 at 20:05 +0800, Chia-I Wu wrote:
> From: Chia-I Wu 
> 
> It is a typo went unnoticed.
> ---
>  src/gallium/drivers/llvmpipe/lp_rast_tri.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/src/gallium/drivers/llvmpipe/lp_rast_tri.c 
> b/src/gallium/drivers/llvmpipe/lp_rast_tri.c
> index 3adfbaa..71d0ddf 100644
> --- a/src/gallium/drivers/llvmpipe/lp_rast_tri.c
> +++ b/src/gallium/drivers/llvmpipe/lp_rast_tri.c
> @@ -129,7 +129,7 @@ lp_rast_triangle_4_16(struct lp_rasterizer_task *task,
> union lp_rast_cmd_arg arg2;
> arg2.triangle.tri = arg.triangle.tri;
> arg2.triangle.plane_mask = (1<<4)-1;
> -   lp_rast_triangle_3(task, arg2);
> +   lp_rast_triangle_4(task, arg2);
>  }
>  
>  void


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] RFC: Remove tgsi-sse2.

2011-11-08 Thread Keith Whitwell
On Tue, 2011-11-08 at 07:47 -0800, Jose Fonseca wrote:
> tgsi_exec is simple; llvm is fast; and tgsi_sse2 ends up being neither. So 
> really serves no purpose and is currently broken.
> 

Sounds good to me!

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] TGSI declarations missing type info

2011-11-14 Thread Keith Whitwell
On Sun, 2011-11-13 at 14:43 -0600, Bryan Cain wrote:
> On 11/13/2011 09:06 AM, Dave Airlie wrote:
> > Hi guys,
> >
> > Just been looking at llvmpipe integer support and it seems like we
> > lose some information about the type of data stored into temporaries,
> >
> > after st_glsl_to_cpp we no longer know what type the temporaries are,
> > and llvm would really like to know and I can't see any reason that
> > TGSI doesn't contain the info. Having untyped temp decls means we'd
> > have to allocate some sort of "union" via aliases I guess in llvmpipe
> > for all temps so we can store int/float in them.
> >
> > I've attached a run of glsl-vs-loop from llvmpipe with integer opcodes
> > forced on. (llvmpipe-int-test branch of my repo).
> >
> > Dave.
> 
> If you do add types to TGSI registers, it's worth noting that the
> internal IR used by glsl_to_tgsi (glsl_to_tgsi_instruction) already the
> types of all src and dst registers, and it's only lost when converting
> that to TGSI.  However, it was only intended to be good enough to
> determine whether to emit an integer or float instruction, so there
> might be some mistakes remaining somewhere that would need to be corrected.
> 

I'd certainly support the idea of adding type information to TGSI.  It
would mean that any SM4-to-TGSI translator would have to do type
inference, but afaik SM4 is the only place where that would have to
happen -- all other potential sources of TGSI either have type
information (like IR as noted above), or are pretty much float-only
(like SM3).  

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] TGSI declarations missing type info

2011-11-14 Thread Keith Whitwell
On Mon, 2011-11-14 at 09:42 +, Keith Whitwell wrote:
> On Sun, 2011-11-13 at 14:43 -0600, Bryan Cain wrote:
> > On 11/13/2011 09:06 AM, Dave Airlie wrote:
> > > Hi guys,
> > >
> > > Just been looking at llvmpipe integer support and it seems like we
> > > lose some information about the type of data stored into temporaries,
> > >
> > > after st_glsl_to_cpp we no longer know what type the temporaries are,
> > > and llvm would really like to know and I can't see any reason that
> > > TGSI doesn't contain the info. Having untyped temp decls means we'd
> > > have to allocate some sort of "union" via aliases I guess in llvmpipe
> > > for all temps so we can store int/float in them.
> > >
> > > I've attached a run of glsl-vs-loop from llvmpipe with integer opcodes
> > > forced on. (llvmpipe-int-test branch of my repo).
> > >
> > > Dave.
> > 
> > If you do add types to TGSI registers, it's worth noting that the
> > internal IR used by glsl_to_tgsi (glsl_to_tgsi_instruction) already the
> > types of all src and dst registers, and it's only lost when converting
> > that to TGSI.  However, it was only intended to be good enough to
> > determine whether to emit an integer or float instruction, so there
> > might be some mistakes remaining somewhere that would need to be corrected.
> > 
> 
> I'd certainly support the idea of adding type information to TGSI.  It
> would mean that any SM4-to-TGSI translator would have to do type
> inference, but afaik SM4 is the only place where that would have to
> happen -- all other potential sources of TGSI either have type
> information (like IR as noted above), or are pretty much float-only
> (like SM3).  

Note I'm mainly posting this because I previously held the opposite view
fairly strongly & wanted to make sure that nobody feels they need to
keep on considering that...

Keith 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/6] gallium: remove PIPE_CAP_GLSL and enable GLSL unconditionally

2011-11-18 Thread Keith Whitwell


- Original Message -
> On 11/18/2011 11:27 AM, Marek Olšák wrote:
> > Only i965g does not enable GLSL, but that driver has been
> > unmaintained and
> > bitrotting for quite a while anyway.
> 
> It doesn't even do GLSL?  I'm pretty shocked, I figured it at least
> did
> that.  Is it even worth keeping around in the tree?  Seems like it's
> just creating extra work for you guys, having to update it for
> Gallium
> changes...when ultimately, nobody's using it.
>

I agree -- this was never finished & isn't likely to be either.

Keith
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] GLSL IR int-to-float pass

2011-05-25 Thread Keith Whitwell
On Wed, 2011-05-25 at 09:32 -0400, Jerome Glisse wrote:
> On Tue, May 24, 2011 at 8:09 PM, Bryan Cain  wrote:
> > Hi,
> >
> > In the past few days, I've been working on native integer support in my
> > GLSL to TGSI translator.  Something that's come to my attention is that
> > supporting Gallium targets with and without integer support using a
> > single GLSL IR backend will more or less require a GLSL IR pass to
> > convert int, uint, and possibly bool variables and operations to floats.
> >
> > Currently, this is done directly in the backend, in both ir_to_mesa and
> > st_glsl_to_tgsi.  However, the mod_to_fract and div_to_mul_rcp lowering
> > passes for GLSL IR need to know whether to lower integer modulus and
> > division operations to their corresponding float operations.  (They both
> > do this in Mesa master without asking the backend, but that will be easy
> > to change later.)  So a GLSL IR pass will be needed to do the type lowering.
> >
> > Such a pass would also have the advantage of less duplicated
> > functionality between backends, since ir_to_mesa could also take
> > advantage of the pass to eliminate some code.
> >
> > I'm more than willing to try writing such a pass myself if no one else
> > is interested in doing it, but I figure I should make sure there are no
> > objections before starting on it.
> >
> > Bryan
> 
> TGSI needs to grow type support (int, uint and possibly int8,16,32..)

Or go away entirely...

I'm not trying to impose a direction on this, but it seems like the GLSL
IR->TGSI converter (once running) could be pushed down into the
individual drivers and GLSL IR or a close cousin of it could become the
gallium-level interface.  Then individual drivers could be modified to
consume IR directly.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] softpipe: Anisotropic filtering extension

2011-06-06 Thread Keith Whitwell
Andreas,

This looks very interesting.  Ultimately llvmpipe would want to have aniso as 
well, but performance would be much more important there.  Do you have a 
feeling for what shortcuts the hardware implementations are taking?

Keith

- Original Message -
From: "Andreas Faenger" 
To: mesa-dev@lists.freedesktop.org
Cc: "a faenger" 
Sent: Monday, 6 June, 2011 8:13:15 AM
Subject: [Mesa-dev] [PATCH] softpipe: Anisotropic filtering extension

Hi,

as requested by Paul, I've converted the patch which provides anisotropic 
filtering for swrast to softpipe. The rendering results of both version are 
almost identical and are much better compared to typical HW rendering, e.g. 
NVIDIA which produces a lot more aliasing.

Andreas

Andreas Faenger (1):
  softpipe: Anisotropic filtering extension.

 src/gallium/drivers/softpipe/sp_screen.c |4 +-
 src/gallium/drivers/softpipe/sp_tex_sample.c |  331 ++
 2 files changed, 333 insertions(+), 2 deletions(-)

-- 
1.7.4.msysgit.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/mesa: improved is_interleaved_arrays() checking

2011-06-14 Thread Keith Whitwell
On Tue, 2011-06-14 at 09:39 -0600, Brian Paul wrote:
> Good question.  I was thinking that the interleaved vs. 
> non-interleaved paths could probably be merged with a little work.  I 
> don't remember the original reason for doing things as they are.

I think it enabled an easier upload path within the driver/state-tracker
-- memcpy a single range to a single VBO, rather than gathering.

Now that the upload is potentially code-generated, that may no longer
matter as much.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/6] Overhaul of Gallium configure options

2011-06-14 Thread Keith Whitwell
On Tue, 2011-06-14 at 18:25 +0200, Marek Olšák wrote:
> Hi,
> 
> This series reworks some of our configure options to make Gallium easier to 
> configure.
> 
> First, there is a new option --with-gallium-drivers=DIRS, which replaces the 
> current heap of options --enable-gallium-DRIVER. --disable-gallium is removed 
> as well, instead, --with-gallium-drivers= without parameters should be used 
> to disable Gallium.
> 
> --enable-gallium-egl is removed. having --enable-egl and 
> --with-gallium-drivers=somedriver is sufficient.
> 
> --with-state-trackers is removed as well. The list of state trackers is 
> automatically deduced from the --enable-API options (the vega,egl state 
> trackers) and --with-driver=dri|xlib (the dri,glx state trackers). Some state 
> trackers lack an enable flag now, so these two have been added to make the 
> list complete: --enable-xorg and --enable-d3d1x.
> 
> In order to be able to "git bisect run" through this change, you can specify 
> both the old and new options at the same time. Those that are unsupported are 
> ignored.
> 
> Other than that, I am enabling r600g by default and removing r300g and r600g 
> from scons. I am not a fan of having multiple build systems and most people 
> prefer autoconf anyway. It's not like anybody needs to build those drivers on 
> Windows.

I did use r600g + scons for the little bit of work I did there, and if I
went back to it, it would continue to be with scons...

Is there a significant cost to you having it there? 

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] A gallium XA state tracker

2011-06-15 Thread Keith Whitwell
On Wed, 2011-06-15 at 11:29 +0200, Thomas Hellstrom wrote:
> Hi!
> 
> I just pushed an initial commit of an X Acceleration state tracker to 
> the xa_branch.
> 
> The idea is that in the long run it will be replacing the Xorg state 
> tracker, which can then move back to a modular xf86-video-modesetting. 
> It will also be responsible for the acceleration part of an updated 
> vmwgfx X driver
> 
>  From the README:
> 
> 8<--
> The XA state tracker is intended as a versioned interface to gallium for
> xorg driver writers. Initially it's mostly based on Zack Rusin's
> composite / video work for the Xorg state tracker.
> 
> The motivation behind this state tracker is that the Xorg state tracker has
> a number of interfaces to work with:
> 
> 1) The Xorg sdk (versioned)
> 2) Gallium3D (not versioned)
> 3) KMS modesetting (versioned)
> 4) Driver-private (hopefully versioned)
> 
> Since Gallium3D is versioned, the Xorg state tracker needs to be compiled

Hi Thomas!  Is there a missing "not" before versioned in the above
sentence?

Keith


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] r600g: implement fragment and vertex color clamp

2011-06-27 Thread Keith Whitwell
On Mon, 2011-06-27 at 15:32 +0200, Marek Olšák wrote:
> On Mon, Jun 27, 2011 at 2:38 PM, Roland Scheidegger
>  wrote:
> > Am 25.06.2011 00:22, schrieb Vadim Girlin:
> >> On 06/24/2011 11:38 PM, Jerome Glisse wrote:
> >>> On Fri, Jun 24, 2011 at 12:29 PM, Vadim
> Girlin
> >>> wrote:
>  Fixes https://bugs.freedesktop.org/show_bug.cgi?id=38440
> 
>  Signed-off-by: Vadim Girlin
> >>>
> >>> As discussed previously, there is better to handle this. I think
> best
> >>> solution is to always add the instruction and to conditionally
> execute
> >>> them thanks to the boolean constant. If this reveal to have a too
> big
> >>> impact on shader, other solution i see is adding a cf block with
> those
> >>> instructions and to enable or disable that block (cf_nop) and
> reupload
> >>> shader that would avoid a rebuild.
> >>
> >> I know its not optimal to do a full rebuild, but rebuild is needed
> only
> >> when the application will use the same shader in different clamping
> >> states. It won't be a problem if the application doesn't change
> clamping
> >> state or if it changes the state but uses each shader in one state
> only.
> >> So assuming that typical app will not use one shader in both
> states, it
> >> shouldn't be a problem. Is this assumption wrong? I'm not really
> sure
> >> because I have no much experience in this. But if it's wrong then
> it's
> >> probably better for performance to build and cache both versions.
> > I tend to think you're right apps probably don't want to use the
> same
> > shader both with and without clamping.
> 
> It still can be changed by st/mesa or by u_blitter and u_blit for
> various reasons. IIRC, the OpenGL default is TRUE if the current
> framebuffer is fixed-point including texture_snorm and FALSE
> otherwise, so changing the framebuffer may change the clamp color
> state. Besides that, the u_blitter and u_blit operations always
> disable the clamping, so if a framebuffer is fixed-point and thus
> clamp color state is TRUE (if not changed by an app), the driver may
> receive state changes that turn the clamping on, off, on, off,... with
> the blit operations turning it off and everything else turning it on.
> The state might be changing pretty much all the time and doing full
> shader rebuilds repeatedly may turn some apps into a slideshow.

I haven't looked at the code, maybe this is irrelevant for some reason,
but the alternative to doing rebuilds when this type of state changes is
to permit >1 compiled version of the shader to exist, parameterized in
different ways.  That way the ping-pong scenario you describe results in
swapping between shaders (which should be cheap), rather than
rebuilding.

> Therefore we must ensure that a fragment shader is set/built as late
> as possible, i.e. in draw_vbo. Each shader variant should be compiled
> once at most and stored for later use. create_fs_state and
> bind_fs_state should not do anything except copying the parameters. 

Actually it sounds like you're describing the same idea here...

Keith


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] llvmpipe: Optimize new fs state setup

2011-06-29 Thread Keith Whitwell
On Wed, 2011-06-29 at 13:19 -0400, Adam Jackson wrote:
> Perversely, do this by eliminating the comparison between stored and
> current fs state.  On ipers, a perf trace showed try_update_scene_state
> using 31% of a CPU, and 98% of that was in 'repz cmpsb', ie, the memcmp.
> Taking that out takes try_update_scene_state down to 6.5% of the
> profile; more importantly, ipers goes from 10 to 14fps and gears goes
> from 790 to 830fps.

Some of the motivation for that memcpy is about keeping the memory usage
of the binned scene from exploding and forcing unnecessary flushes on
more complex apps.

I wonder if there is a way to improve the dirty flag handling to avoid
ending up in that memcpy so often?


Note that freeglut is probably dominating your gears numbers by trying
to reinitialize your SpaceBall device (I don't have one either) on every
swapbuffers.

http://lists.freedesktop.org/archives/mesa-dev/2011-February/005599.html


Keith


> Signed-off-by: Adam Jackson 
> ---
>  src/gallium/drivers/llvmpipe/lp_setup.c |   61 
> ++-
>  1 files changed, 27 insertions(+), 34 deletions(-)
> 
> diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c 
> b/src/gallium/drivers/llvmpipe/lp_setup.c
> index cbe06e5..9118db5 100644
> --- a/src/gallium/drivers/llvmpipe/lp_setup.c
> +++ b/src/gallium/drivers/llvmpipe/lp_setup.c
> @@ -839,42 +839,35 @@ try_update_scene_state( struct lp_setup_context *setup )
>setup->dirty |= LP_SETUP_NEW_FS;
> }
>  
> -
> if (setup->dirty & LP_SETUP_NEW_FS) {
> -  if (!setup->fs.stored ||
> -  memcmp(setup->fs.stored,
> - &setup->fs.current,
> - sizeof setup->fs.current) != 0)
> -  {
> - struct lp_rast_state *stored;
> - uint i;
> - 
> - /* The fs state that's been stored in the scene is different from
> -  * the new, current state.  So allocate a new lp_rast_state object
> -  * and append it to the bin's setup data buffer.
> -  */
> - stored = (struct lp_rast_state *) lp_scene_alloc(scene, sizeof 
> *stored);
> - if (!stored) {
> -assert(!new_scene);
> -return FALSE;
> - }
> +  struct lp_rast_state *stored;
> +  uint i;
> +  
> +  /* The fs state that's been stored in the scene is different from
> +   * the new, current state.  So allocate a new lp_rast_state object
> +   * and append it to the bin's setup data buffer.
> +   */
> +  stored = (struct lp_rast_state *) lp_scene_alloc(scene, sizeof 
> *stored);
> +  if (!stored) {
> + assert(!new_scene);
> + return FALSE;
> +  }
>  
> - memcpy(stored,
> -&setup->fs.current,
> -sizeof setup->fs.current);
> - setup->fs.stored = stored;
> - 
> - /* The scene now references the textures in the rasterization
> -  * state record.  Note that now.
> -  */
> - for (i = 0; i < Elements(setup->fs.current_tex); i++) {
> -if (setup->fs.current_tex[i]) {
> -   if (!lp_scene_add_resource_reference(scene,
> -setup->fs.current_tex[i],
> -new_scene)) {
> -  assert(!new_scene);
> -  return FALSE;
> -   }
> +  memcpy(stored,
> + &setup->fs.current,
> + sizeof setup->fs.current);
> +  setup->fs.stored = stored;
> +  
> +  /* The scene now references the textures in the rasterization
> +   * state record.  Note that now.
> +   */
> +  for (i = 0; i < Elements(setup->fs.current_tex); i++) {
> + if (setup->fs.current_tex[i]) {
> +if (!lp_scene_add_resource_reference(scene,
> + setup->fs.current_tex[i],
> + new_scene)) {
> +   assert(!new_scene);
> +   return FALSE;
>  }
>   }
>}


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] llvmpipe: Optimize new fs state setup

2011-06-30 Thread Keith Whitwell
On Wed, 2011-06-29 at 16:16 -0700, Corbin Simpson wrote:
> Okay, so maybe I'm failing to recognize the exact situation here, but
> wouldn't it be possible to mark the FS state with a serial number and
> just compare those? Or are these FS states not CSO-cached?

No, the struct being compared is poorly named & collides with a CSO
entity.  It's really all the state which the compiled fragment shader
will reference when it is later invoked.  It's all packed into a single
struct because it's easier to pass a single parameter to llvm-compiled
shaders and add/change that parameter, but it is somewhat non-orthogonal
and we end up generating too many of them.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] llvmpipe: Optimize new fs state setup

2011-06-30 Thread Keith Whitwell
On Thu, 2011-06-30 at 03:36 +0200, Roland Scheidegger wrote:
> Ok in fact there's a gcc bug about memcmp:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> In short gcc's memcmp builtin is totally lame and loses to glibc's
> memcmp (including call overhead, no knowledge about alignment etc.) even
> when comparing only very few bytes (and loses BIG time for lots of bytes
> to compare). Oops. Well at least if the strings are the same (I'd guess
> if the first byte is different it's hard to beat the gcc builtin...).
> So this is really a gcc bug. The bug is quite old though with no fix in
> sight apparently so might need to think about some workaround (but just
> not doing the comparison doesn't look like the right idea, since
> apparently it would be faster with the comparison if gcc's memcmp got
> fixed).

Looking at the struct again (it's been a while), it seems like it could
be rearranged to be variable-sized and on average significantly smaller:

struct lp_rast_state {
   struct lp_jit_context jit_context;
   struct lp_fragment_shader_variant *variant;
};

struct lp_jit_context {
   const float *constants;
   float alpha_ref_value;
   uint32_t stencil_ref_front, stencil_ref_back;
   uint8_t *blend_color;
   struct lp_jit_texture textures[PIPE_MAX_SAMPLERS];
};

If we moved the jit_context part behind "variant", and then hopefully
note that most of those lp_jit_texture structs are not in use.  That
would save time on the memcmp *and* space in the binned data.

It's weird this wasn't showing up in past profiling.

Kieth


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] llvmpipe: Optimize new fs state setup

2011-06-30 Thread Keith Whitwell
On Thu, 2011-06-30 at 03:27 -0700, Jose Fonseca wrote:
> 
> - Original Message -
> > On Thu, 2011-06-30 at 03:36 +0200, Roland Scheidegger wrote:
> > > Ok in fact there's a gcc bug about memcmp:
> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > > In short gcc's memcmp builtin is totally lame and loses to glibc's
> > > memcmp (including call overhead, no knowledge about alignment etc.)
> > > even
> > > when comparing only very few bytes (and loses BIG time for lots of
> > > bytes
> > > to compare). Oops. Well at least if the strings are the same (I'd
> > > guess
> > > if the first byte is different it's hard to beat the gcc
> > > builtin...).
> > > So this is really a gcc bug. The bug is quite old though with no
> > > fix in
> > > sight apparently so might need to think about some workaround (but
> > > just
> > > not doing the comparison doesn't look like the right idea, since
> > > apparently it would be faster with the comparison if gcc's memcmp
> > > got
> > > fixed).
> > 
> > Looking at the struct again (it's been a while), it seems like it
> > could
> > be rearranged to be variable-sized and on average significantly
> > smaller:
> > 
> > struct lp_rast_state {
> >struct lp_jit_context jit_context;
> >struct lp_fragment_shader_variant *variant;
> > };
> > 
> > struct lp_jit_context {
> >const float *constants;
> >float alpha_ref_value;
> >uint32_t stencil_ref_front, stencil_ref_back;
> >uint8_t *blend_color;
> >struct lp_jit_texture textures[PIPE_MAX_SAMPLERS];
> > };
> > 
> > If we moved the jit_context part behind "variant", and then hopefully
> > note that most of those lp_jit_texture structs are not in use.  That
> > would save time on the memcmp *and* space in the binned data.
> 
> Yeah, sounds a good idea.
> 
> But there's some subtletly to computing the number of textures: it
>  can't be just the NULL textures, because they may be reffered by the
>  JIT code, which has no NULL checks and  relies on the state setup to
>  provide storage for all textures, or dummy memory if one is not bound.

So it's a property of the variant, right?  We should just store that
information when we generate the llvm variant.

> I think a better idea would be:
> - split the texture/sampler state
> - to make the lp_jit_context::textures an array of pointers, and put the 
> struct lp_jit_texture in the pipe_texture object themselves
> - to make the lp_jit_context::samplers an array of pointers, and put the 
> struct lp_jit_sampler in the pipe_sampler_state CSO

I like this too - it's somewhat more involved of course.

In fact the two are orthogonal -- the struct below can still be shrunk
significantly by knowing how many samplers & textures the variant refers
to.  Interleaving them or packing them would reduce the bytes to be
compared.

Alternatively there could be just a pointer in jit_context to
textures/samplers binned elsewhere.

> struct lp_jit_context {
> struct lp_jit_texture *textures[PIPE_MAX_SAMPLERS];
> struct lp_jit_sampler *samplers[PIPE_MAX_SAMPLERS];
> };

The jit context above seems to have lost some of its fields...

The next step might be to split the context into four parts: textures,
samplers, constants, "other", and have jit_context just be a set of
pointers into the binned data:

struct lp_jit_context {
 struct lp_jit_texture **textures;
 struct lp_jit_sampler **samplers;
 const float *constants;
 const struct lp_jit_other *other;   
};

struct lp_jit_other {
   float alpha_ref_value;
   uint32_t stencil_ref_front;
   uint32_t stencil_ref_back;
   uint8_t *blend_color;
};

> struct lp_jit_texture
> {
>uint32_t width;
>uint32_t height;
>uint32_t depth;
>uint32_t first_level;
>uint32_t last_level;
>uint32_t row_stride[LP_MAX_TEXTURE_LEVELS];
>uint32_t img_stride[LP_MAX_TEXTURE_LEVELS];
>const void *data[LP_MAX_TEXTURE_LEVELS];
>/* sampler state, actually */
>float min_lod;
>float max_lod;
>float lod_bias;
>float border_color[4];
> };
> 
> struct lp_jit_sampler
> {
>float min_lod;
>float max_lod;
>float lod_bias;
>float border_color[4];
> };
> 
> 
> Jose


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] llvmpipe: Optimize new fs state setup

2011-06-30 Thread Keith Whitwell
On Thu, 2011-06-30 at 17:53 +0200, Roland Scheidegger wrote:
> Am 30.06.2011 16:14, schrieb Adam Jackson:
> > On Thu, 2011-06-30 at 03:36 +0200, Roland Scheidegger wrote:
> >> Ok in fact there's a gcc bug about memcmp:
> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> >> In short gcc's memcmp builtin is totally lame and loses to glibc's
> >> memcmp (including call overhead, no knowledge about alignment etc.) even
> >> when comparing only very few bytes (and loses BIG time for lots of bytes
> >> to compare). Oops. Well at least if the strings are the same (I'd guess
> >> if the first byte is different it's hard to beat the gcc builtin...).
> >> So this is really a gcc bug. The bug is quite old though with no fix in
> >> sight apparently so might need to think about some workaround (but just
> >> not doing the comparison doesn't look like the right idea, since
> >> apparently it would be faster with the comparison if gcc's memcmp got
> >> fixed).
> > 
> > How do things fare if you build with -fno-builtin-memcmp?
> 
> This is even faster:
> original ipers: 12.1 fps
> ajax patch: 15.5 fps
> optimized struct compare: 16.8 fps
> -fno-builtin-memcmp: 18.1 fps
> 
> Looks like we have a winner :-) I guess glibc optimizes the hell out of
> it (in contrast to the other results, this affected all memcmp though I
> don't know if any others benefited from that on average).
> As noted by Keith though the struct we compare is really large (over 4k)
> so trimming the size might be a good idea anyway (of course the 4k size
> also meant any call overhead and non-optimal code due to glibc not
> knowing alignment beforehand and usage of return value is completely
> insignificant).
> A 50% improvement from disabling a compiler optimization, lol.

We probably what this everywhere throughout Mesa & Gallium...

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/13] gallium/util: implement pack functions for Z32F and Z32F_S8X24

2011-07-01 Thread Keith Whitwell
On Fri, 2011-07-01 at 02:29 +0200, Marek Olšák wrote:
> The suffix of 64 means it returns uint64_t.

It might be slightly clearer to call these functions util_pack64_{xxx}
-- currently it reads as if it is packing 64-bit source data.

Keith

> ---
>  src/gallium/auxiliary/util/u_pack_color.h |   64 
> +
>  1 files changed, 64 insertions(+), 0 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/util/u_pack_color.h 
> b/src/gallium/auxiliary/util/u_pack_color.h
> index 5378f2d..d2dfba5 100644
> --- a/src/gallium/auxiliary/util/u_pack_color.h
> +++ b/src/gallium/auxiliary/util/u_pack_color.h
> @@ -458,6 +458,19 @@ util_pack_mask_z(enum pipe_format format, uint32_t z)
> }
>  }
>  
> +
> +static INLINE uint64_t
> +util_pack_mask_z64(enum pipe_format format, uint32_t z)
> +{
> +   switch (format) {
> +   case PIPE_FORMAT_Z32_FLOAT_S8X24_USCALED:
> +  return z;
> +   default:
> +  return util_pack_mask_z(format, z);
> +   }
> +}
> +
> +
>  static INLINE uint32_t
>  util_pack_mask_z_stencil(enum pipe_format format, uint32_t z, uint8_t s)
>  {
> @@ -481,6 +494,21 @@ util_pack_mask_z_stencil(enum pipe_format format, 
> uint32_t z, uint8_t s)
>  }
>  
> 
> +static INLINE uint64_t
> +util_pack_mask_z_stencil64(enum pipe_format format, uint32_t z, uint8_t s)
> +{
> +   uint64_t packed;
> +
> +   switch (format) {
> +   case PIPE_FORMAT_Z32_FLOAT_S8X24_USCALED:
> +  packed = util_pack_mask_z64(format, z);
> +  packed |= (uint64_t)s << 32ull;
> +  return packed;
> +   default:
> +  return util_pack_mask_z_stencil(format, z, s);
> +   }
> +}
> +
>  
>  /**
>   * Note: it's assumed that z is in [0,1]
> @@ -525,6 +553,24 @@ util_pack_z(enum pipe_format format, double z)
>return 0;
> }
>  }
> +
> +
> +static INLINE uint64_t
> +util_pack_z64(enum pipe_format format, double z)
> +{
> +   union fi fui;
> +
> +   if (z == 0)
> +  return 0;
> +
> +   switch (format) {
> +   case PIPE_FORMAT_Z32_FLOAT_S8X24_USCALED:
> +  fui.f = (float)z;
> +  return fui.ui;
> +   default:
> +  return util_pack_z(format, z);
> +   }
> +}
>   
>  
>  /**
> @@ -554,6 +600,24 @@ util_pack_z_stencil(enum pipe_format format, double z, 
> uint8_t s)
>  }
>  
> 
> +static INLINE uint64_t
> +util_pack_z_stencil64(enum pipe_format format, double z, uint8_t s)
> +{
> +   uint64_t packed;
> +
> +   switch (format) {
> +   case PIPE_FORMAT_Z32_FLOAT_S8X24_USCALED:
> +  packed = util_pack_z64(format, z);
> +  packed |= (uint64_t)s << 32ull;
> +  break;
> +   default:
> +  return util_pack_z_stencil(format, z, s);
> +   }
> +
> +   return packed;
> +}
> +
> +
>  /**
>   * Pack 4 ubytes into a 4-byte word
>   */


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] Gallium: fix buffer overflow

2011-07-01 Thread Keith Whitwell
This looks good to me -- Jose?

Keith

On Thu, 2011-06-30 at 03:33 +0100, Micael Dias wrote:
> ---
>  src/gallium/auxiliary/draw/draw_llvm.c |   12 
>  1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
> b/src/gallium/auxiliary/draw/draw_llvm.c
> index 56c26f5..19134f3 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm.c
> +++ b/src/gallium/auxiliary/draw/draw_llvm.c
> @@ -1163,6 +1163,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
> draw_llvm_variant *variant)
> struct lp_build_loop_state lp_loop;
> const int max_vertices = 4;
> LLVMValueRef outputs[PIPE_MAX_SHADER_OUTPUTS][NUM_CHANNELS];
> +   LLVMValueRef fetch_max;
> void *code;
> struct lp_build_sampler_soa *sampler = 0;
> LLVMValueRef ret, ret_ptr;
> @@ -1234,6 +1235,10 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
> draw_llvm_variant *variant)
>draw_llvm_variant_key_samplers(&variant->key),
>context_ptr);
>  
> +   fetch_max = LLVMBuildSub(builder, count,
> +lp_build_const_int32(gallivm, 1),
> +"fetch_max");
> +
>  #if DEBUG_STORE
> lp_build_printf(builder, "start = %d, end = %d, step = %d\n",
> start, end, step);
> @@ -1257,6 +1262,13 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
> draw_llvm_variant *variant)
>  builder,
>  lp_loop.counter,
>  lp_build_const_int32(gallivm, i), "");
> + LLVMValueRef fetch_ptr;
> +
> + /* make sure we're not out of bounds which can happen
> +  * if fetch_count % 4 != 0, because on the last iteration
> +  * a few of the 4 vertex fetches will be out of bounds */
> + true_index = lp_build_min(&bld, true_index, fetch_max);
> + 
>   for (j = 0; j < draw->pt.nr_vertex_elements; ++j) {
>  struct pipe_vertex_element *velem = &draw->pt.vertex_element[j];
>  LLVMValueRef vb_index = lp_build_const_int32(gallivm, 
> velem->vertex_buffer_index);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/13] gallium/util: implement pack functions for Z32F and Z32F_S8X24

2011-07-01 Thread Keith Whitwell
On Fri, 2011-07-01 at 14:42 +0200, Marek Olšák wrote:
> On Fri, Jul 1, 2011 at 10:49 AM, Keith Whitwell  wrote:
> > On Fri, 2011-07-01 at 02:29 +0200, Marek Olšák wrote:
> >> The suffix of 64 means it returns uint64_t.
> >
> > It might be slightly clearer to call these functions util_pack64_{xxx}
> > -- currently it reads as if it is packing 64-bit source data.
> 
> Yeah, that's nicer. Here's the diff I am going to squash with the
> patches 11 and 12.

Looks great!

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] About merging pipe-video to master

2011-07-12 Thread Keith Whitwell
On Mon, 2011-07-11 at 18:24 +0200, Christian König wrote:
> Hi guys,
> 
> as the subject already indicates: I'm about to merge pipe-video to
> master and just wanted to ask if anybody has still any objections?
> 
> After following Jose and Younes discussion on mesa-dev about how to
> design such an abstraction layer I took another round of cleaning up the
> interface and moved some more parts into the state tracker.
> 
> So the interface between the state tracker and drivers only consist of
> the following now:
> 
> 1. two additional functions for the screen object: get_video_param and
> is_video_format_supported. get_video_param gets a parameter for a
> specified codec (like max width/height of decoding target, which could
> be smaller than texture max width/height), and is_video_format_supported
> which checks if a texture format is supported as a decoding target for a
> codec.
> 
> 2. create_video_decoder function in the pipe_context object, which
> creates a decoder object for a given codec. The decoder object in turn
> includes everything needed to decode a video stream of that codec and
> uses pipe_video_decode_buffer objects to hold the input data of a single
> frame of that video codec.
> 
> 3. create_video_buffer function in the pipe_context object, which
> creates a video_buffer object to store a decoded video frame. This
> video_buffer object is then used for both rendering to the screen with
> normal pipe_context functionality and also as the input for reference
> frames back to the decoder.
> 
> The pipe_video_buffer object is there because I think hardware decoders
> need some special memory layout of the different planes of a yuv buffer.
> There is a standard implementation that just uses normal textures as the
> different planes for yuv buffer, which can be used by a driver when
> there is no need for a special memory layout or when the driver just
> uses shader based decoding.
> 
> The other option would be adding a PIPE_BIND_VIDEO_BUFFER flag to the
> resource creation functions, but I don't want to repeat functionality in
> the different drivers and as far as I can see the current resource
> functions (samplers/surfaces) can't be used to create a surface for just
> one plane/component of a yuv buffer and we could still clean that up to
> use the standard resource functions if the need arise.

I'm a bit unsure about what's the best approach here, though at this
stage I'm happy with your approach and don't think it needs to be
changed before any merge.

But speaking in general terms, individual planes map well onto 8-bit
single-component texture images (L8 or similar) and any hardware
requirements (pitch, memory pool, etc) for the individual plane could be
specified with a PIPE_BIND_VIDEO_BUFFER flag.

However, it's also easy to imagine hardware having special requirements
about the positioning of the planes relative to one another, similar to
how mipmaps must be layed out in hardware-specific ways.

If we did decide to get rid of video_buffers and integrate the concept
with pipe_resources, it seems like there would need to be a way to
specify this at resource creation - either a planar YUV format, or some
other marking on the resource.

I don't have easy answers for that, and in the meantime I don't think
it's important enough to hold up pipe-video, which is looking now like a
good step forward.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] About merging pipe-video to master

2011-07-12 Thread Keith Whitwell
On Tue, 2011-07-12 at 11:13 -0400, Younes Manton wrote:
> 2011/7/12 Keith Whitwell :
> > I'm a bit unsure about what's the best approach here, though at this
> > stage I'm happy with your approach and don't think it needs to be
> > changed before any merge.
> >
> > But speaking in general terms, individual planes map well onto 8-bit
> > single-component texture images (L8 or similar) and any hardware
> > requirements (pitch, memory pool, etc) for the individual plane could be
> > specified with a PIPE_BIND_VIDEO_BUFFER flag.
> >
> > However, it's also easy to imagine hardware having special requirements
> > about the positioning of the planes relative to one another, similar to
> > how mipmaps must be layed out in hardware-specific ways.
> >
> > If we did decide to get rid of video_buffers and integrate the concept
> > with pipe_resources, it seems like there would need to be a way to
> > specify this at resource creation - either a planar YUV format, or some
> > other marking on the resource.
> >
> > I don't have easy answers for that, and in the meantime I don't think
> > it's important enough to hold up pipe-video, which is looking now like a
> > good step forward.
> >
> > Keith
> 
> 
> I've considered that. The problem that brings up is what happens when
> you need to hand that planar surface over to the 3D context as a
> texture to be sampled from for color conversion. From the state
> tracker's POV you've just handed over a single texture with
> corresponding vertex attribs, texcoords, shaders, etc, but in reality
> your 3D engine will have to treat each plane as a separate texture.
> You could special-case planar textures and internally create extra
> state objs and fix up the shader, but the extra complexity buys you
> nothing except a "nicer looking" planar texture representation in the
> interface and is ugly in itself.
> 
> Anyhow, Christian, your changes look alright to me. Again, I don't
> think this interface has to be perfect now to be acceptable.

Agreed.

Keith


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] swrast: initial multi-threaded span rendering

2011-08-10 Thread Keith Whitwell
I'm not sure it makes a lot of sense to be optimizing swrast at this
stage.  Take a look at llvmpipe and perhaps consider improving the
multithreading already in place in that rasterizer, which is far better
optimized than swrast already.

Keith

On Wed, 2011-08-10 at 08:07 +, Andreas Fänger wrote:
> Optional parallel rendering of spans using OpenMP.
> Initial implementation for aa triangles. A new option for scons is
> also provided to activate the openmp support (off by default).
> ---
>  common.py  |1 +
>  scons/gallium.py   |   12 +++
>  src/mesa/swrast/s_aatritemp.h  |   68 ++-
>  src/mesa/swrast/s_context.c|   26 ---
>  src/mesa/swrast/s_texcombine.c |4 ++
>  src/mesa/tnl/t_pipeline.c  |   12 +++
>  6 files changed, 87 insertions(+), 36 deletions(-)
> 
> diff --git a/common.py b/common.py
> index 8657030..cfee1b5 100644
> --- a/common.py
> +++ b/common.py
> @@ -88,6 +88,7 @@ def AddOptions(opts):
>   opts.Add('toolchain', 'compiler toolchain', default_toolchain)
>   opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support', 
> 'no'))
>   opts.Add(BoolOption('llvm', 'use LLVM', default_llvm))
> + opts.Add(BoolOption('openmp', 'EXPERIMENTAL: compile with openmp 
> (swrast)', 'no'))
>   opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes'))
>   opts.Add(BoolOption('profile', 'DEPRECATED: profile build', 'no'))
>   opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))
> diff --git a/scons/gallium.py b/scons/gallium.py
> index 8cd3bc7..7135251 100755
> --- a/scons/gallium.py
> +++ b/scons/gallium.py
> @@ -596,6 +596,18 @@ def generate(env):
>  libs += ['m', 'pthread', 'dl']
>  env.Append(LIBS = libs)
>  
> +# OpenMP
> +if env['openmp']:
> +if env['msvc']:
> +env.Append(CCFLAGS = ['/openmp'])
> +# When building openmp release VS2008 link.exe crashes with 
> LNK1103 error.
> +# Workaround: overwrite PDB flags with empty value as it isn't 
> required anyways
> +if env['build'] == 'release':
> +env['PDB'] = ''
> +if env['gcc']:
> +env.Append(CCFLAGS = ['-fopenmp'])
> +env.Append(LIBS = ['gomp'])
> +
>  # Load tools
>  env.Tool('lex')
>  env.Tool('yacc')
> diff --git a/src/mesa/swrast/s_aatritemp.h b/src/mesa/swrast/s_aatritemp.h
> index 91d4f7a..005d12c 100644
> --- a/src/mesa/swrast/s_aatritemp.h
> +++ b/src/mesa/swrast/s_aatritemp.h
> @@ -181,13 +181,18 @@
>const GLfloat *pMax = vMax->attrib[FRAG_ATTRIB_WPOS];
>const GLfloat dxdy = majDx / majDy;
>const GLfloat xAdj = dxdy < 0.0F ? -dxdy : 0.0F;
> -  GLfloat x = pMin[0] - (yMin - iyMin) * dxdy;
>GLint iy;
> -  for (iy = iyMin; iy < iyMax; iy++, x += dxdy) {
> +  #pragma omp parallel for schedule(dynamic) private(iy) 
> firstprivate(span)
> +  for (iy = iyMin; iy < iyMax; iy++) {
> + GLfloat x = pMin[0] - (yMin - iy) * dxdy;
>   GLint ix, startX = (GLint) (x - xAdj);
>   GLuint count;
>   GLfloat coverage = 0.0F;
>  
> +#ifdef _OPENMP
> + /* each thread needs to use a different (global) SpanArrays 
> variable */
> + span.array = SWRAST_CONTEXT(ctx)->SpanArrays + omp_get_thread_num();
> +#endif
>   /* skip over fragments with zero coverage */
>   while (startX < MAX_WIDTH) {
>  coverage = compute_coveragef(pMin, pMid, pMax, startX, iy);
> @@ -228,13 +233,12 @@
>  coverage = compute_coveragef(pMin, pMid, pMax, ix, iy);
>   }
>   
> - if (ix <= startX)
> -continue;
> - 
> - span.x = startX;
> - span.y = iy;
> - span.end = (GLuint) ix - (GLuint) startX;
> - _swrast_write_rgba_span(ctx, &span);
> + if (ix > startX) {
> +span.x = startX;
> +span.y = iy;
> +span.end = (GLuint) ix - (GLuint) startX;
> +_swrast_write_rgba_span(ctx, &span);
> + }
>}
> }
> else {
> @@ -244,13 +248,18 @@
>const GLfloat *pMax = vMax->attrib[FRAG_ATTRIB_WPOS];
>const GLfloat dxdy = majDx / majDy;
>const GLfloat xAdj = dxdy > 0 ? dxdy : 0.0F;
> -  GLfloat x = pMin[0] - (yMin - iyMin) * dxdy;
>GLint iy;
> -  for (iy = iyMin; iy < iyMax; iy++, x += dxdy) {
> +  #pragma omp parallel for schedule(dynamic) private(iy) 
> firstprivate(span)
> +  for (iy = iyMin; iy < iyMax; iy++) {
> + GLfloat x = pMin[0] - (yMin - iy) * dxdy;
>   GLint ix, left, startX = (GLint) (x + xAdj);
>   GLuint count, n;
>   GLfloat coverage = 0.0F;
>   
> +#ifdef _OPENMP
> + /* each thread needs to use a different (global) SpanArrays 
> variable */
> + span.array = SWRAST_CONTEXT(ctx)->SpanArrays + omp_get_thread_num();
> +#endif
>  

Re: [Mesa-dev] [PATCH] st/mesa: fix incorrect loop over instruction src regs

2011-08-17 Thread Keith Whitwell
On Wed, 2011-08-17 at 09:36 -0500, Bryan Cain wrote:
> The usual commit message prefix for changes to glsl_to_tgsi is
> "glsl_to_tgsi", not "st/mesa".
> 
> On 08/16/2011 05:33 PM, Brian Paul wrote:
> > The array of src regs is of size 3, not 4.
> > ---
> >  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
> > b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> > index aef23e7..7b90c81 100644
> > --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> > +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> > @@ -3443,7 +3443,7 @@ 
> > glsl_to_tgsi_visitor::eliminate_dead_code_advanced(void)
> >   /* Continuing the block, clear any channels from the write array 
> > that
> >* are read by this instruction.
> >*/
> > - for (int i = 0; i < 4; i++) {
> > + for (unsigned i = 0; i < Elements(inst->src); i++) {
> 
> Why not just use 3 here?

Elements(inst->src) is self-documenting.  

3 is just a number and to figure out if it was the correct number you'd
have to go and look at the header file to see if it matched the value
there.

Both should generate the same compiled code.

Keith


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/12] mesa: Fix incorrect access parameter passed to MapBuffer

2011-08-22 Thread Keith Whitwell
Your analysis sounds reasonable to me, Ian.  Looks good.

Keith

On Mon, 2011-08-22 at 00:33 -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> The code previously passed GL_DYNAMIC_DRAW for the access parameter.
> By inspection, I believe that all drivers would treat this as
> GL_READ_WRITE because it's not GL_READ_ONLY and it's not
> GL_WRITE_ONLY.  However, my guess is that this code actually wants to
> use GL_WRITE_ONLY.
> 
> Cc: Eric Anholt 
> Cc: Keith Whitwell 
> ---
>  src/mesa/drivers/dri/i965/brw_draw_upload.c |4 +---
>  src/mesa/main/api_arrayelt.c|4 +---
>  2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
> b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> index 66c42aa..3b95244 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> @@ -689,9 +689,7 @@ static void brw_prepare_indices(struct brw_context *brw)
> * rebase it into a temporary.
> */
> if ((get_size(index_buffer->type) - 1) & offset) {
> -   GLubyte *map = ctx->Driver.MapBuffer(ctx,
> -GL_DYNAMIC_DRAW_ARB,
> -bufferobj);
> +   GLubyte *map = ctx->Driver.MapBuffer(ctx, GL_READ_WRITE, 
> bufferobj);
> map += offset;
>  
>  intel_upload_data(&brw->intel, map, ib_size, ib_type_size,
> diff --git a/src/mesa/main/api_arrayelt.c b/src/mesa/main/api_arrayelt.c
> index 6400c8f..b897a33 100644
> --- a/src/mesa/main/api_arrayelt.c
> +++ b/src/mesa/main/api_arrayelt.c
> @@ -1602,9 +1602,7 @@ void _ae_map_vbos( struct gl_context *ctx )
>_ae_update_state(ctx);
>  
> for (i = 0; i < actx->nr_vbos; i++)
> -  ctx->Driver.MapBuffer(ctx,
> - GL_DYNAMIC_DRAW_ARB,
> - actx->vbo[i]);
> +  ctx->Driver.MapBuffer(ctx, GL_READ_WRITE, actx->vbo[i]);
>  
> if (actx->nr_vbos)
>actx->mapped_vbos = GL_TRUE;


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] DEATH to old drivers!

2011-08-25 Thread Keith Whitwell
On Wed, 2011-08-24 at 20:46 -0400, Kristian Høgsberg wrote:
> On Wed, Aug 24, 2011 at 3:11 PM, Ian Romanick  wrote:
> > -BEGIN PGP SIGNED MESSAGE-
> > Hash: SHA1
> >
> > I'd like to propose giving the ax to a bunch of old, unmaintained
> > drivers.  I've been doing a bunch of refactoring and reworking of core
> > Mesa code, and these drivers have been causing me problems for a number
> > of reasons.
> >
> > 1. The hardware is so old that it doesn't support a lot of features that
> > have been common for 12+ years.
> >
> > 2. The drivers are so unmaintained that even hacking in new features
> > with dummy implementations is painful.
> >
> > 3. The drivers are so buggy that many piglit tests hang the GPU.  I
> > tried doing a piglit run on a Rage128 Pro that I have, but I gave up
> > after having to blacklist 15 tests.
> >
> > It also seems that at least some distros (e.g., Fedora) have stopped
> > shipping non-DRI2 drivers.  If nobody is shipping it, nobody is using it.
> >
> > My specific proposal is:
> >
> >  - Remove all DRI1 drivers: i810, mach64, mga, r128, savage, sis, tdfx,
> > and unichrome.
> >
> >  - Remove all unmaintained Windows drivers: gldirect, icd.
> >
> >  - Remove beos.
> >
> >  - Remove fbdev (this is swrast on raw fbdev).
> >
> > Opinions?
> 
> I wasn't going to chime in with another "me too", but just make it
> clear that there's a pretty strong concensus, here we go: yes please!
> And I've done a good deal of work in the DRI interface area and the
> maintenence burden is real, no matter what the back seat drivers say.

I will though:  Me too!

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] tgsi: add TXQ support.

2011-08-25 Thread Keith Whitwell
On Thu, 2011-08-25 at 07:28 -0600, Brian Paul wrote:
> How would the TXQ instruction be implemented for a hardware driver?
> 
> Is there really a HW GPU instruction that returns the size of a texture?

Yes, that's correct.

> Otherwise, this seems like something we could implement in the state 
> tracker by putting the texture size into a constant buffer slot.  Then 
> we'd have it for all drivers.

I think that's a good fallback for hardware that's missing this
capability, but DX10+ hardware should be expected to have it.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] tgsi: add TXQ support.

2011-08-25 Thread Keith Whitwell
On Thu, 2011-08-25 at 15:00 +0100, Dave Airlie wrote:
> On Thu, Aug 25, 2011 at 2:43 PM, Keith Whitwell  wrote:
> > On Thu, 2011-08-25 at 07:28 -0600, Brian Paul wrote:
> >> How would the TXQ instruction be implemented for a hardware driver?
> >>
> >> Is there really a HW GPU instruction that returns the size of a texture?
> >
> > Yes, that's correct.
> >
> >> Otherwise, this seems like something we could implement in the state
> >> tracker by putting the texture size into a constant buffer slot.  Then
> >> we'd have it for all drivers.
> >
> > I think that's a good fallback for hardware that's missing this
> > capability, but DX10+ hardware should be expected to have it.
> 
> I can't see us caring really, its part of GLSL1.30 which pretty muhc
> means GL3.0, which pretty much means DX10.

Sounds fair enough...

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] [PATCH] util: Remove check_os_katmai_support.

2010-08-16 Thread Keith Whitwell
I think this is fine.  It's been a very long time since we had to worry
about this.

Keith

On Mon, 2010-08-16 at 01:17 -0700, Vinson Lee wrote:
> I am proposing to remove the check_os_katmai_support function from 
> u_cpu_detect.
> 
> util: Remove check_os_katmai_support.
> 
> check_os_katmai_support checks that the operating system running on a 
> SSE-capable processor supports SSE. This is necessary for unpatched 2.2.x and 
> earlier kernels. 2.4.x and later kernels support SSE.
> 
> check_os_katmai_support will disable SSE capabilities for 32-bit x86 
> operating systems for which there is no code path. Currently, this function 
> handles Linux, Windows, and several BSDs. Mac OS, Cygwin, and Solaris are 
> several operating systems with no code paths.
> 
> Rather than add code for the unhandled operating systems, remove this 
> function altogether. This will fix SSE detection on all recent 32-bit x86 
> operating systems. This completely breaks functionality on unpatched 2.2.x 
> and earlier kernels, although there are likely no Gallium3D users on such 
> operating systems.
> 
> 
> diff --git a/src/gallium/auxiliary/util/u_cpu_detect.c 
> b/src/gallium/auxiliary/util/u_cpu_detect.c
> index 5056351..b9b9f92 100644
> --- a/src/gallium/auxiliary/util/u_cpu_detect.c
> +++ b/src/gallium/auxiliary/util/u_cpu_detect.c
> @@ -194,123 +194,8 @@ check_os_altivec_support(void)
>  }
>  #endif /* PIPE_ARCH_PPC */
>  
> -/* If we're running on a processor that can do SSE, let's see if we
> - * are allowed to or not.  This will catch 2.4.0 or later kernels that
> - * haven't been configured for a Pentium III but are running on one,
> - * and RedHat patched 2.2 kernels that have broken exception handling
> - * support for user space apps that do SSE.
> - */
> -#if defined(PIPE_ARCH_X86) || defined (PIPE_ARCH_X86_64)
> -static void
> -check_os_katmai_support(void)
> -{
> -#if defined(PIPE_ARCH_X86)
> -#if defined(PIPE_OS_FREEBSD)
> -   int has_sse=0, ret;
> -   int len = sizeof (has_sse);
> -
> -   ret = sysctlbyname("hw.instruction_sse", &has_sse, &len, NULL, 0);
> -   if (ret || !has_sse)
> -  util_cpu_caps.has_sse=0;
> -
> -#elif defined(PIPE_OS_NETBSD) || defined(PIPE_OS_OPENBSD)
> -   int has_sse, has_sse2, ret, mib[2];
> -   int varlen;
> -
> -   mib[0] = CTL_MACHDEP;
> -   mib[1] = CPU_SSE;
> -   varlen = sizeof (has_sse);
> -
> -   ret = sysctl(mib, 2, &has_sse, &varlen, NULL, 0);
> -   if (ret < 0 || !has_sse) {
> -  util_cpu_caps.has_sse = 0;
> -   } else {
> -  util_cpu_caps.has_sse = 1;
> -   }
> -
> -   mib[1] = CPU_SSE2;
> -   varlen = sizeof (has_sse2);
> -   ret = sysctl(mib, 2, &has_sse2, &varlen, NULL, 0);
> -   if (ret < 0 || !has_sse2) {
> -  util_cpu_caps.has_sse2 = 0;
> -   } else {
> -  util_cpu_caps.has_sse2 = 1;
> -   }
> -   util_cpu_caps.has_sse = 0; /* FIXME ?!?!? */
> -
> -
> -#elif defined(PIPE_OS_WINDOWS)
> -   LPTOP_LEVEL_EXCEPTION_FILTER exc_fil;
> -   if (util_cpu_caps.has_sse) {
> -  exc_fil = SetUnhandledExceptionFilter(win32_sig_handler_sse);
> -#if defined(PIPE_CC_GCC)
> -  __asm __volatile ("xorps %xmm0, %xmm0");
> -#elif defined(PIPE_CC_MSVC)
> -  __asm {
> -  xorps xmm0, xmm0/* executing SSE instruction */
> -  }
> -#else
> -#error Unsupported compiler
> -#endif
> -  SetUnhandledExceptionFilter(exc_fil);
> -   }
> -#elif defined(PIPE_OS_LINUX)
> -   struct sigaction saved_sigill;
> -   struct sigaction saved_sigfpe;
> -
> -   /* Save the original signal handlers.
> -   */
> -   sigaction(SIGILL, NULL, &saved_sigill);
> -   sigaction(SIGFPE, NULL, &saved_sigfpe);
> -
> -   signal(SIGILL, (void (*)(int))sigill_handler_sse);
> -   signal(SIGFPE, (void (*)(int))sigfpe_handler_sse);
> -
> -   /* Emulate test for OSFXSR in CR4.  The OS will set this bit if it
> -* supports the extended FPU save and restore required for SSE.  If
> -* we execute an SSE instruction on a PIII and get a SIGILL, the OS
> -* doesn't support Streaming SIMD Exceptions, even if the processor
> -* does.
> -*/
> -   if (util_cpu_caps.has_sse) {
> -  __asm __volatile ("xorps %xmm1, %xmm0");
> -   }
> -
> -   /* Emulate test for OSXMMEXCPT in CR4.  The OS will set this bit if
> -* it supports unmasked SIMD FPU exceptions.  If we unmask the
> -* exceptions, do a SIMD divide-by-zero and get a SIGILL, the OS
> -* doesn't support unmasked SIMD FPU exceptions.  If we get a SIGFPE
> -* as expected, we're okay but we need to clean up after it.
> -*
> -* Are we being too stringent in our requirement that the OS support
> -* unmasked exceptions?  Certain RedHat 2.2 kernels enable SSE by
> -* setting CR4.OSFXSR but don't support unmasked exceptions.  Win98
> -* doesn't even support them.  We at least know the user-space SSE
> -* support is good in kernels that do support unmasked exceptions,
> -* and therefore to be safe I'm going to leave this test in here.
> -*/
> -   if (util_cpu_caps.has_sse) {
>

Re: [Mesa-dev] Merging translate and unnormalized-coords-hint?

2010-08-16 Thread Keith Whitwell
On Mon, 2010-08-16 at 04:54 -0700, Luca Barbieri wrote:
> I added the two patchsets I posted to the list to the two branches
> named in the subject.
> 
> The version pushed contain slight changes over the ones sent to the ML:
> 1. In translate, Win64 support has been further fixed to use the
> proper macro (_WIN64) and properly preserve xmm registers
> 2. unnormalized-coords-hint has been changed to add support for
> unnormalized coordinates in st_Bitmap
> 
> There seemed to be no fundamental opposition to the changes, and I
> fixed the issues raised.
> 
> The translate branch can be tested with
> src/gallium/tests/unit/translate_test and the piglit draw-vertices and
> draw-elements test with softpipe.
> The unnormalized-coords-hint branch can be partially tested with
> softpipe and has otherwise been tested as part of work on the nv30
> driver.
> 
> Should I merge them?

I'm very happy to merge the translate code, providing you're satisfied
that the remaining win64 discussion has been concluded.  Also, please
delete the feature branch once it's been merged.


In terms of the unnormalized change, I think I'd like to go over it one
more time.

It looks like there are a few things happening at once:

a) The state tracker is informing the driver whether it will use
normalized texcoords, unnormalized texcoords or both for a given
texture.

b) There is a query from the state tracker to the driver to find out
which it prefers (normalized vs unnormalized) for a given texture.

These two usages seem somewhat disjoint, and the mechanism for the query
is via a new channel we haven't used for queries in the past - ie based
on the driver modifying some of the values in the create-resource
template.

Is this a fair summary of the intentions of the change?  If so, my
request would be to divorce the two meanings of this flag -- keep the
PIPE_RESOURCE_FLAG for state_tracker->driver communications, ie. (a),
and use an explicit query for driver->state_tracker communications, ie
(b).

In this model, the state tracker would query the driver explicitly to
find out what normalization to use for internal rendering, and pass
through the API constraints otherwise.  

To represent all possibilities you'd need two flags, one for normalized
and one for unnormalized, such that OpenCL could set (NORMALIZED |
UNNORMALIZED).

Would that work for your needs?

Keith


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] [PATCH 0/4] Add frequency declaration for vertex elements

2010-08-16 Thread Keith Whitwell
Luca,

It seems like there is an alternate fix possible -- modify the mesa
fixed-function vertex program generator to put these constant values in
the constant buffer, rather than passing them as vertex data.  That
would remove the need for us to have this unique capability at the
gallium level that none of the other graphics abstractions seem to
consider necessary.

I think we're putting more effort into accommodating the behaviour of
that code than really makes sense...

Keith

On Mon, 2010-08-16 at 04:30 -0700, Luca Barbieri wrote:
> I recently posted a patch that would set instance_divisor to 0x for
> constant elements, so that the driver could take that knowledge into account
> when creating the vertex elements CSO.
> 
> However, in Direct3D 11, the instance id system value is specified as a 32-bit
> unsigned integers, which can wrap to 0 and thus assume the value 0x.
> 
> Hence, such a divisor would have a different meaning in Direct3D 11, so my
> approach was the wrong one.
> 
> Here is a new approach, which instead adds a new field to pipe_vertex_element
> called "frequency", which indicates whether the element is per-vertex,
> per-instance or constant.
> 
> Direct3D 11 has a similar feature with the InputSlotClass field in
> D3D11_INPUT_ELEMENT_DESC, but lacks support for constant elements, since
> these are needed only to support "legacy OpenGL style" immediate attribute
> specification.
> 
> Note that currently draw uses translate in a suboptimal way to duplicate
> constant and per-instance elements.
> 
> While translate could be improved to not repeatedly fetch them, the
> real fix should be applied to draw (but this probably doesn't matter
> because draw_llvm doesn't seem to use translate).
> 
> draw_llvm could also perhaps use fetching constant elements only once: this
> patchset only eliminates the stride multiplication and index bounds checking.
> 
> We could theoretically add only a "constant vs non-constant field" instead,
> but this wouldn't reduce the lines of code changed, and deviates needlessly
> from Direct3D 11.
> 
> Luca Barbieri (4):
>   gallium: introduce explicit frequency declaration for vertex elements
>   gallium: set frequency = PIPE_ELEMENT_FREQUENCY_PER_VERTEX explicitly
>   mesa/st: specify constant frequency for elements
>   draw: optimize for vertex element frequency (esp. draw_llvm)
> 
>  src/gallium/auxiliary/draw/draw_llvm.c   |   54 
> --
>  src/gallium/auxiliary/draw/draw_pt.c |   15 --
>  src/gallium/auxiliary/draw/draw_pt_vcache.c  |2 +-
>  src/gallium/auxiliary/util/u_blit.c  |1 +
>  src/gallium/auxiliary/util/u_blitter.c   |1 +
>  src/gallium/auxiliary/util/u_gen_mipmap.c|1 +
>  src/gallium/docs/d3d11ddi.txt|2 +-
>  src/gallium/docs/source/context.rst  |   23 --
>  src/gallium/docs/source/cso/velems.rst   |   11 +++-
>  src/gallium/drivers/r300/r300_render_translate.c |1 +
>  src/gallium/include/pipe/p_defines.h |7 +++
>  src/gallium/include/pipe/p_state.h   |   14 --
>  src/gallium/state_trackers/python/p_context.i|1 +
>  src/gallium/state_trackers/vega/polygon.c|1 +
>  src/gallium/state_trackers/vega/vg_context.c |1 +
>  src/gallium/state_trackers/xorg/xorg_renderer.c  |1 +
>  src/gallium/tests/graw/tri-instanced.c   |3 +
>  src/gallium/tests/trivial/quad-tex.c |2 +
>  src/gallium/tests/trivial/tri.c  |2 +
>  src/mesa/state_tracker/st_cb_drawtex.c   |1 +
>  src/mesa/state_tracker/st_context.c  |1 +
>  src/mesa/state_tracker/st_draw.c |4 ++
>  src/mesa/state_tracker/st_draw_feedback.c|2 +
>  23 files changed, 108 insertions(+), 43 deletions(-)
> 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] gallium: add resource normalization flags (v3)

2010-08-17 Thread Keith Whitwell
On Tue, 2010-08-17 at 00:09 -0700, Luca Barbieri wrote:
> -#define PIPE_BIND_STREAM_OUTPUT(1 << 11) /*
> set_stream_output_buffers */
> +
> +/* Sampler views can be created based on this texture. Only the
> + * normalization preferred by the driver can be used, unless the
> other
> + * flags below are set as well. Only clamp wrap modes are allowed. */
> +#define PIPE_BIND_SAMPLER_VIEW_ANY (1 << 2) /* get_sampler_view
> */

I don't see the need for this "any" flag -- if this is an internally
generated texture, the state tracker can query the driver, find out what
normalization it prefers, and then use that explicitly.  


> +/* State trackers must set this flag if they/the user API need to be
> able to use
> + * unnormalized coodinates with clamp, clamp-to-edge or
> clamp-to-border wrap
> + * mode with this resource when a sampler view based on it is bound.
> + *
> + * OpenCL and OpenGL TEXTURE_RECTANGLE textures will have this flag
> set.
> + */
> +#define PIPE_BIND_SAMPLER_VIEW_UNNORMALIZED_CLAMP ((1 << 2) | (1 <<
> 3))
> +
> +/* State trackers must set this flag if they/the user API need to be
> able to use
> + * unnormalized coordinates with non-clamp wrap modes with this
> resource
> + * when a sampler view based on it is bound.
> + *
> + * OpenCL textures will have this flag set.
> + */
> +#define PIPE_BIND_SAMPLER_VIEW_UNNORMALIZED_NON_CLAMP ((1 << 2) | (1
> << 4))
> +
> +/* State trackers must set this flag if they/the user API need to be
> able to use
> + * normalized coordinates with any wrap mode with this resource
> + * when a sampler view based on it is bound.
> + *
> + * OpenCL, OpenGL TEXTURE_2D and D3D11 textures will have this flag
> set.
> + */
> +#define PIPE_BIND_SAMPLER_VIEW_NORMALIZED ((1 << 2) | (1 << 5))


Is there a practical difference at the driver level between the two
unnormalized versions?  

I know some hardware will support one and not the other, but that's
something to be handled with queries.

Once we get around to creating a texture, is it going to get layed out
differently depending on whether it uses clamp vs wrap addressing?

Keith 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] gallium: add resource normalization flags (v3)

2010-08-17 Thread Keith Whitwell
On Tue, 2010-08-17 at 00:09 -0700, Luca Barbieri wrote:
> +
> +/* State trackers should support using either normalization in all
> internal drawing
> + * code, using these flag to tell which one to use.
> + *
> + * If they do not have such support, then they should indicate the
> + * normalization needed using the bind flags, but they should be
> aware
> + * that if it is different than the API-needed one, some drivers may
> + * fail creation because they only support one normalization at once.
> + *
> + * State trackers must not set these flags.
> + *
> + * Drivers should set these flags to inform the state tracker of the
> normalization
> + * it should use in internal drawing code, if they prefer any.
> + *
> + * Drivers who need to always have the same normalization used for a
> given
> + * resource must set these flags according to the bind flags above,
> and refuse
> + * creation if too many normalization bind flags are set.
> + */
> +#define PIPE_RESOURCE_FLAG_PREFER_UNNORMALIZED_COORDS (1 << 1)
> +#define PIPE_RESOURCE_FLAG_PREFER_NORMALIZED_COORDS (1 << 2)
> + 

Again, I don't see why this needs to be per-resource information --
surely the hardware has a single preference that it keeps under all
circumstances?

Or even if it's more complex than that, it's hard to believe it is so
random it needs to be handled per-resource.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Merging translate and unnormalized-coords-hint?

2010-08-17 Thread Keith Whitwell

-
In terms of the unnormalized change, I think I'd like to go over it one
more time.

It looks like there are a few things happening at once:

a) The state tracker is informing the driver whether it will use
normalized texcoords, unnormalized texcoords or both for a given
texture.

b) There is a query from the state tracker to the driver to find out
which it prefers (normalized vs unnormalized) for a given texture.

These two usages seem somewhat disjoint, and the mechanism for the query
is via a new channel we haven't used for queries in the past - ie based
on the driver modifying some of the values in the create-resource
template.

Is this a fair summary of the intentions of the change? 
--

On Mon, 2010-08-16 at 07:20 -0700, Luca Barbieri wrote:
> > Is this a fair summary of the intentions of the change?
> It's an excellent summary.

(Just adding it back in...)

What's wrong with addressing these needs respectively by:

a) Adding a new pipe_texture_target enum PIPE_TEXTURE_RECT to capture
the GL usage (unnormalized, clamp).  Think about CL later.

b) Adding a pipe cap to determine hardware preference for state-tracker
generated rendering (prefer TEXTURE_RECT vs TEXTURE_2D).  For API
rendering (ie non-state-tracker-generated), pass through exactly what
the API asks for.

Roland suggested an alternate mechanism for b: adding a cap for whether
the hw supports NPOT + normalized, which would also be fine for me.

Keith


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] gallium & texture rectangles

2010-08-18 Thread Keith Whitwell
On Tue, 2010-08-17 at 13:16 -0700, Luca Barbieri wrote:
> > Using a flag instead of a new texture target allows to avoid hundreds of
> > changes to existing code, and allows drivers for modern hardware to
> > just ignore this flag.
> I grepped a bit through the code, and a new texture target seems
> easier than expected: apparently there are only about 36 checks for
> PIPE_TEXTURE_2D (grepping for
> ([!=]=.*PIPE_TEXTURE_2D|PIPE_TEXTURE_2D.*[!=]=|case.*PIPE_TEXTURE_2D)).
> 
> It's still going to be more intrusive than a flag that almost
> everything just ignores (and with much greater risk of introducing
> bugs), but it could perhaps be an option, if the consensus is to add a
> new target.

A new texture target is much less surprising than any of these other
suggestions.  GL describes this behaviour as a texture target and its
meaning is well understood.

I think modifying 36 usages is pretty manageable, it's a one time cost
and keeps gallium within a set of concepts with which people are already
familiar.

I appreciate all the work you've put into looking at alternatives, but
at this stage I'm going to be firm - if PIPE_TEXTURE_RECT can be made to
work, that's the direction we should be taking.  I haven't seen anything
so far that suggests it isn't a feasible approach.

So if that's ok, let's start from a minimal suggestion and really try to
avoid letting complexity creep into the interface design.

What about this for a starting poing:
   - PIPE_TEXTURE_RECT texture target, following GL semantics
   - PIPE_CAP_TEXTURE_2D_NPOT  (nv30 should not advertise this)

Is there anything more you actually require right now?

Keith



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] gallium & texture rectangles

2010-08-18 Thread Keith Whitwell
On Wed, 2010-08-18 at 08:01 -0700, Luca Barbieri wrote:
> > I appreciate all the work you've put into looking at alternatives, but
> > at this stage I'm going to be firm - if PIPE_TEXTURE_RECT can be made to
> > work, that's the direction we should be taking.  I haven't seen anything
> > so far that suggests it isn't a feasible approach.
> Yes, it seems feasible, it's just a matter of choosing between tradeoffs.
> Right now I think it's more important to actually make any choice
> rather than making a specific choice.

Agreed.

> Effectively, implementing this is a matter of changing the resource flag to 
> be a
> target, and change all existing code that checks for PIPE_TEXTURE_2D to check
> for PIPE_TEXTURE_RECT as well.

Sounds perfect.

> > What about this for a starting poing:
> >   - PIPE_TEXTURE_RECT texture target, following GL semantics
> OK.
> 
> >   - PIPE_CAP_TEXTURE_2D_NPOT  (nv30 should not advertise this)
> We already have this right now, it's called PIPE_CAP_NPOT_TEXTURES.

OK, my bad.

> > Is there anything more you actually require right now?
> State trackers must support both normalizations and have some criteria
> to choose between them.
> 
> I see these options:
> 1. Have a cap to decide whether to use PIPE_TEXTURE_RECT for internal
> textures, and use unnormalized if and only if the target is
> PIPE_TEXTURE_RECT

I like this very much.  

If there was a way to simplify it even further I'd like it yet more.  

For instance use 2D/normalized for internal rendering iff the driver
advertises CAP_2D_NPOT, otherwise use RECT/non-normalized.

> 2. Have a cap to decide whether to use PIPE_TEXTURE_RECT for internal
> textures, and use the two preference flags to choose the normalization
> 3. Have state trackers always use PIPE_TEXTURE_2D, and use the two
> preference flags to choose the normalization
> 4. Have state trackers always use PIPE_TEXTURE_RECT, and use the two
> preference flags to choose the normalization

Hmm - my intention was that TEXTURE_RECT implies non-normalized
coordinates -- so this doesn't look like it works.

> 5. Have state trackers use yet another new target, e.g.
> "PIPE_SURFACE", that would be intended for "2D operations", and use
> the two preference flags to choose the normalization
> 
> Instead of two preference flags, only one can be used, but this loses
> the ability for the driver to express no preference, and only saves a
> single line of code.

> My latest patchset uses option 3.
> 
> What is your preference in this?

I like (1) for reasons of interface simplicity.  If there was an option
zero, that would be better still...

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] gallium & texture rectangles

2010-08-18 Thread Keith Whitwell
On Wed, 2010-08-18 at 09:08 -0700, Luca Barbieri wrote:
> > I have a feeling that CL performance will not matter that much for
> > nvfx and r300, compared to nv50 and r600.
> 
> Sure.
> The point is that if you can't use normalized coordinates at all on
> PIPE_TEXTURE_RECT, you can't implement OpenCL well on nv50 and r600.
> Hence, that should be allowed on OpenCL capable cards (and not on others).

That's fine.  When CL comes along, we can add a further capability to
indicate the driver permits this. 

> > r300 having to emit a shader instruction unconditionally in this case
> > really isn't that big of a deal.
> On nVidia cards, this doesn't matter, since they never prefer the
> normalization opposite to the OpenGL-required one.
> So if the maintainers of Radeon and software renderers, who might
> care, don't, I suppose we could as well not introduce the feature,
> even though I'm not totally convinced of this.

At this stage, nobody's asking for it.  If that changes, we can revisit
the issue and consider adding something more.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] gallium & texture rectangles

2010-08-18 Thread Keith Whitwell
On Wed, 2010-08-18 at 10:27 -0700, Luca Barbieri wrote:
> I pushed a first version on the gallium-rect-textures branch (not
> tested beyond compilation).
> 
> As a consequence of the decisions made in this thread, the interface
> is exactly identical to OpenGL, and internal drawing code works
> exactly as an OpenGL application not requiring
> ARB_texture_non_power_of_two would do it (no opinion implied on
> whether this is good or not).

Luca,

Thanks for your patience on this.  I know you've put a lot of effort
into getting this into shape.

Even if it isn't your first choice of interface, I think this looks like
a reasonable way forward and I hope addresses the underlying concern
adequately.

Couple of minor comments:
- You've documented FLEXIBLE_SAMPLING in the main part of the docs, I
think I'd prefer to keep forward-looking/todo items separate from the
documentation of the interface as it currently stands.  Can these
paragraphs go in a new section?

- actually that's all.

It looks good to me, thanks again Luca.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] gallium & texture rectangles

2010-08-18 Thread keith whitwell
On Wed, Aug 18, 2010 at 7:03 PM, Luca Barbieri  wrote:
>> Couple of minor comments:
>> - You've documented FLEXIBLE_SAMPLING in the main part of the docs, I
>> think I'd prefer to keep forward-looking/todo items separate from the
>> documentation of the interface as it currently stands.  Can these
>> paragraphs go in a new section?
>
> The reason is that currently several drivers already implicitly support this.
>
> I think the most sensible route would be to just say that drivers can
> in practice not have (some of?) those restrictions and that OpenCL
> will probably require drivers to have none of them, since adding the
> restrictions to the drivers just to remove them later seems silly.

Yes, gallium isn't that kind of interface to start with.  Drivers
aren't required to have any particular behaviour in response to state
tracker requests which exceed the defined bounds of the interface --
those would be considered state tracker bugs.

So generally the drivers shouldn't be worrying about validating state
tracker requests nor restricting themselves to only servicing legal
requests.  We can beef up galahad to do that for debugging purposes,
and in release builds the assumption is that the state tracker
performs any necessary API validation (potentially none, depending on
the API).

Keith
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Merging gallium-rect-textures and debug-refcnt?

2010-08-20 Thread Keith Whitwell
On Fri, 2010-08-20 at 00:59 -0700, Luca Barbieri wrote:
> gallium-rect-textures adds the PIPE_TEXTURE_RECT target as discussed
> in the "gallium & texture rectangles" thread.
> I tested nv30, nv40, softpipe and "softpipe with NPOT disabled" using piglit.


Yes, definitely.  Thanks again for your efforts on this, Luca.

> debug-refcnt adds the ability to log reference count modifications on
> Gallium objects to a file, which allows to track down leaks and
> compute several kinds of statistics on resource creation.
> It changes pipe_resource_reference in a way that gets optimized away
> in non-debug builds, and turns into checking a flag, and calling
> another function only if an environment variable is set.

Sounds like a useful facility, I hadn't noticed these commits though -
let me take a look.

I see some direct header file inclusions, not sure if that's an issue
for embedded platforms - maybe Jose can comment.  Possibly some of these
facilities are more os/ than util/  ?


Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] TGSI Sanity Checker

2010-08-20 Thread Keith Whitwell
On Fri, 2010-08-20 at 00:45 -0700, Corbin Simpson wrote:
> I can't find the email where we were discussing TGSI sanity. Did we
> want to move the sanity checker to galahad?

It only makes sense to move it there if galahad will end up being the
only user of it -- I'm not sure we're ready to make that statement yet.

But certainly galahad should be invoking it & I'd hope that it can be
worked on in its current location?

> Also should I be double-checking the documentation around galahad
> tests and making the documentation specify some of these caveats? The
> docs are currently really loose on specifying things and don't have
> any of the SHOULD/MUST/ALLOWED feel of the GL specs, but this might
> not be a problem given Gallium's flexibility.

I think it's more a function of the relative amounts of put into GL and
gallium docs.  The first step was to capture the basic thrust of the
interface, but if you've got time & inclination to take it a step
further, that's fantastic and appreciated as always...

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Merging gallium-rect-textures and debug-refcnt?

2010-08-20 Thread Keith Whitwell
On Fri, 2010-08-20 at 01:58 -0700, Luca Barbieri wrote:
> There is also another small issue: a new tool is necessary to
> post-process the traces, to resolve function names and line numbers.
> I put it a new directory called "src/gallium/tools" since none of the
> existing places seem appropriate.
> Is this a good idea?

Maybe we should merge it with the gallium/tests directory, or rather
create a tree gallium/progs with tests, tools, etc underneath?

This isn't your problem, but one thing that bugs me about the scons
builds is all the graw programs getting re-linked on every build.  We'll
have to figure out how to stop that if gallium/progs starts to become an
active place.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Merging gallium-rect-textures and debug-refcnt?

2010-08-20 Thread Keith Whitwell
On Fri, 2010-08-20 at 03:01 -0700, Luca Barbieri wrote:
> I pushed a new version as debug-refcnt-2, which uses os_stream instead of 
> FILE*.
> A new commit adds a printf facility to os_stream to support this.
> 
> It still uses the sprintf functions from stdio.h, but I suppose this is OK.
> If a platform doesn't have those, they can be taken from a BSD libc
> (or gnulib, glibc, Linux/BSD kernels, etc.).

Hasn't this already happened somewhere - util/u_snprintf.c ?  

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Merging gallium-rect-textures and debug-refcnt?

2010-08-20 Thread Keith Whitwell
On Fri, 2010-08-20 at 04:11 -0700, Luca Barbieri wrote:
> > Hasn't this already happened somewhere - util/u_snprintf.c ?
> Indeed, I'll fix it to use those.
> 
> There's something (independent from this) that bugs me though.
> 
> Why does Gallium feel the need to implement all this stuff with ad-hoc
> names, instead of, for instance, just implementing a function called
> "sprintf" if the platform doesn't provide it?
> Other similar instances are the "INLINE" macro instead of an "inline",
> "MALLOC" instead of "malloc", PIPE_ARCH_X86 instead of "__i386__", and
> surely many others.
> On the same note, if, for instance, "stdio.h" doesn't exist, one can
> just write an header called "stdio.h" and add that directory to the
> include path on the affected platform, instead of inventing another
> header name.

I guess the issue would be if there was something with that name already
but which wasn't workable for some reason.  Windows provides busted and
incomplete implementations of some of these functions, for instance.

> The result of this is that the codebase is cluttered with non-standard
> (and often ugly) conventions, and it is trivial to write code that
> works perfectly on Linux, but not on other platforms, because it uses
> the standard naming instead of the strange Gallium-specific names (as
> you can see in the case of this patchset).
> 
> As a byproduct of this, hardware drivers tend to drift away for this
> unusual convention, since they are never built for Windows. For
> instance, nouveau drivers use "inline" instead of the "INLINE" that
> Gallium code is supposed to use.

Gallium followed core mesa for this, and that convention is still in
core mesa, eg:

static INLINE void
_mesa_init_accum_dispatch(struct _glapi_table *disp)
{
}

I'd be ok to see this become "inline", I don't see any obvious problems
that can't be worked around.  Brian's on holidays atm, but when he's
back it would make sense to have his input on this as well.

> Given that I suspect that essentially no one outside VMware does Mesa
> development on a non-Unix platform, it seems a good idea to make sure
> that such code also works on Windows automatically without needing to
> be fixed each time.

I'm sure we're not the only ones.  But yes, if we can say in general
that there's a way to reduce the textual discrepancies without impacting
the ability to build and run on other platforms, then great.

Let's take it step by step though.

> Would a set of regular expressions plus simple patches that tries to
> fix all instances of this all over the codebase be positively
> accepted?
> I seem to recall that a similar substitution was applied to the
> non-Gallium parts of the Mesa.

I think there has been some movement there, but things like INLINE, etc.
still exist in core Mesa.

Keith


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Merging gallium-rect-textures and debug-refcnt?

2010-08-20 Thread Keith Whitwell
On Fri, 2010-08-20 at 06:35 -0700, José Fonseca wrote:
> On Fri, 2010-08-20 at 06:17 -0700, Luca Barbieri wrote:
> > > And define magic is very brittle. Especially with C++: you #define
> > > printf to be something else, but nothing prevents a class or a namespace
> > > to have the printf symbol in its scope.
> > 
> > Yes, but hopefully that's going to be very rare.
> > 
> > Alternatively, you can do this:
> > 1. Compile with cl /Dsprintf=dontuse_windows_sprintf
> > 
> > 2. Put in a header:
> > #include 
> > #undef sprintf
> > 
> > static __forceinline sprintf(...)
> > {
> >  my_vsprintf(...)
> > }
> > 
> > 3. Put in a C file:
> > void my_vsprintf()
> > {...}
> > 
> > If the header is not included, linking will fail (due to the reference
> > to dontuse_windows_sprintf), otherwise the custom version will be used
> > whenever sprintf is called, while still having sprintf as a (forced
> > inline) function.
> > 
> > It's also possible to ask the compiler to include the header with a
> > command line switch (-include in gcc), so that fully standard code
> > will work.
> > 
> > As a third option, you can have a build step that creates a patched
> > copy of the Windows headers, kind of like GCC's fixincludes.
> > 
> > > And again, using the CRT for strings is subject to the
> > > current locale. Care has to be taken when using it for stuff other that
> > > debug printing (e.g., shader parsing, which was exactly the bug what
> > > happened before in Mesa).
> > 
> > Only some functions respect the locale.
> > 
> > The glibc documentation says this:
> >  `strcmp' does not take sorting conventions of the language the
> >  strings are written in into account.  To get that one has to use
> >  `strcoll'.
> > 
> > And MSDN says this:
> > In locales for which the character set and the lexicographic character
> > order differ, use strcoll rather than strcmp for lexicographic
> > comparison of strings according to the LC_COLLATE category setting of
> > the current locale. Thus, to perform a lexicographic comparison of the
> > locale in the above example, use strcoll rather than strcmp.
> > Alternatively, you can use strxfrm on the original strings, then use
> > strcmp on the resulting strings
> > 
> > > Another example are math.h's sinf/cosf etc. It's indeed quite convenient
> > > to have the standard names. But some CRTs use #defines other inlines,
> > > and then suddenly different versions of the compiler/CRT do different
> > > things.
> > 
> > Isn't that going to be equivalent if the macros use the argument only once?
> > 
> > > I still have the plan to make a CGRT [1] module to centralize all
> > > existing OS portibility and abstract data types used everywhere in the
> > > mesa tree.
> > 
> > How about just making everything in the Mesa tree depend
> > unconditionally on Gallium, and let them use auxiliary/, os/, etc.
> > directly?
> 
> No. Some people would have an indigestion, and it Gallium itself would
> benefit from the cleanup of seperating OS/debug/ADT stuff from the
> gallium speific utilities. ATM everything is a big spaghetti inside
> auxiliary/* 

Yeah, I think it's a better plan to suck the foundational stuff out of
util/ and os/ and turn that into a new compatibility layer that both
mesa and gallium can use.

The os/ stuff is fairly clean, but util is a mix of various unrelated
helpers which need to be separated.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Merging gallium-rect-textures and debug-refcnt?

2010-08-20 Thread Keith Whitwell
On Fri, 2010-08-20 at 07:40 -0700, Luca Barbieri wrote:
> Does debug-refcnt-2 look good now?

Yes, looks good to me.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCHES] clang compatibility

2010-08-24 Thread Keith Whitwell
On Mon, 2010-08-23 at 14:09 -0700, José Fonseca wrote:
> On Sun, 2010-08-22 at 02:35 -0700, nobled wrote:
> > The first three attached patches make it possible to compile Mesa with
> > LLVM/Clang:
> > 1. Add -lstdc++ when linking glsl_compiler and glcpp
> > 2. Move -lstdc++ from the Gallium-specific Makefile.dri to
> > DRI_LIB_DEPS in configure (fixes linking classic Mesa drivers)
> > 3. Since autoconf gives GCC=yes even when using clang (since it just
> > tests for the __GNUC__ macro), don't check for a minimum version of
> > 3.3 if $(CC) points to a clang executable. (unfortunately I'm not sure
> > how to properly detect clang, short of test-compiling a file that
> > contains #ifdef __clang__. I.e. if $(CC) = 'cc', and 'cc' is an
> > alternatives symlink to llvm-clang, this doesn't detect that case.)
> > 
> > The rest are just fixes to compiler warnings:
> > 4. dri: Fix implicit declaration
> > 5. program: Fix struct/class confusion
> > 6. dr/radeon: Fix printf format
> > 7. llvmpipe: Fix memory leak
> > 
> > With the first three patches, I can compile Mesa with clang 2.7 in
> > Ubuntu Lucid if I export three variables before configure:
> > export CC=llvm-clang
> > export CXX=llvm-clang
> > export CPPFLAGS=/usr/lib/clang/1.1/include
> > ./configure
> > (Yeah, the third one is really prone to breakage with new versions and
> > I'm still trying to figure out how to not need it; it should also get
> > passed as part of MKDEP_OPTIONS in configure.ac, TBH.)
> 
> The llvmpipe patch looks correct to me, as draw_destroy doesn't destroy
> vbuf.
> 
> But if so then interface looks bad to me -- what's the point of putting
> a destroy callback in the interface if it's never used by the other side
> of the interface? Perhaps Keith can make a call of what's the right
> thing to do here.

Yes, that could be cleaned up if anyone has the time to do so.
Basically there's a question mark about who owns the vbuf entity - the
driver or draw module?  It was the driver initially, but now it really
looks like it should be the draw module.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa (master): r300g: fix gl_PointCoord

2010-08-24 Thread keith whitwell
The point state is complex, there are actually a lot of variations
built into GL itself which aren't obvious on first reading of the
spec.  Probably the most obscure is that wide points and point sprites
are specified to actually be rasterized differently, ie color
different pixels.  I'm doing this from memory, but Roland dug into it
in depth.  Wide points have an explicit set of rules for which pixels
they cover, but point sprites are to be rasterized with the same rules
as a quad of similar size (again from memory).  The docs do explain
it, but it's complex none the less, and I do still wonder if there is
a clear way of capturing all the variations.

On Wed, Aug 25, 2010 at 3:57 AM, Marek Olšák
 wrote:
> Module: Mesa
> Branch: master
> Commit: 879a73023189eed488db2840b829aa5c78e5ba3f
> URL:    
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=879a73023189eed488db2840b829aa5c78e5ba3f
>
> Author: Marek Olšák 
> Date:   Wed Aug 25 04:55:01 2010 +0200
>
> r300g: fix gl_PointCoord
>
> Is this hackish or is this the correct way to use point_quad_rasterization?
> Copied from nvfx.
>
> ---
>
>  src/gallium/drivers/r300/r300_state.c |    9 +++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/r300/r300_state.c 
> b/src/gallium/drivers/r300/r300_state.c
> index 3e35d7a..47e359c 100644
> --- a/src/gallium/drivers/r300/r300_state.c
> +++ b/src/gallium/drivers/r300/r300_state.c
> @@ -950,6 +950,11 @@ static void* r300_create_rs_state(struct pipe_context* 
> pipe,
>     rs->rs = *state;
>     rs->rs_draw = *state;
>
> +    /* Generate point sprite texture coordinates in GENERIC0
> +     * if point_quad_rasterization is TRUE. */
> +    rs->rs.sprite_coord_enable = state->point_quad_rasterization *
> +                                 (state->sprite_coord_enable | 1);
> +
>     /* Override some states for Draw. */
>     rs->rs_draw.sprite_coord_enable = 0; /* We can do this in HW. */
>
> @@ -1051,10 +1056,10 @@ static void* r300_create_rs_state(struct 
> pipe_context* pipe,
>
>     /* Point sprites */
>     stuffing_enable = 0;
> -    if (state->sprite_coord_enable) {
> +    if (rs->rs.sprite_coord_enable) {
>         stuffing_enable = R300_GB_POINT_STUFF_ENABLE;
>         for (i = 0; i < 8; i++) {
> -            if (state->sprite_coord_enable & (1 << i))
> +            if (rs->rs.sprite_coord_enable & (1 << i))
>                 stuffing_enable |=
>                     R300_GB_TEX_ST << (R300_GB_TEX0_SOURCE_SHIFT + (i*2));
>         }
>
> ___
> mesa-commit mailing list
> mesa-com...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-commit
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] vertex shader that processes 0 vertex data

2010-09-01 Thread keith whitwell
On Wed, Sep 1, 2010 at 7:14 AM, Dave Airlie  wrote:
> I was looking at glsl-vs-point-size in piglit today and it doesn't
> work on gallium.
>
> void main()
> {
>        gl_Position = vec4(0.0, 0.0, 0.0, 1.0);
>        gl_PointSize = 16.0;
>        gl_FrontColor = vec4(1.0, 1.0, 1.0, 1.0);
> }
>
> Since the vertex shader doesn't have any declared inputs,
>
> we work out the vp in gallium and then in st_draw.c:st_draw_vbo we hit
>
>   if (num_vbuffers == 0 || num_velements == 0)
>      return;
>
> Any ideas on what should happen here?
>

I hadn't really considered this case, but it's clearly valid.  What
should be happening seems clear - there should be zero vertex buffers
bound and zero vertex elements referencing them and the draw call
should be legal.

Basically, the test above is incorrect and should be removed.  I
suspect the same assumption is built in elsewhere in the code and
you'll hit a few asserts and crashes in the process of getting this
working.

Keith
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] swizzling in llvmpipe [was: other stuff]

2010-09-01 Thread Keith Whitwell
On Wed, 2010-09-01 at 09:24 -0700, Luca Barbieri wrote:
> > It's an impressive amount of work you did here. I'll comment only on the
> > llvmpipe of the changes for now.
> 
> Thanks for your feedback!
> 
> > Admittedly, always using a floating point is not ideal. A better
> > solution would be to choose a swizzled data type (unorm8, fixed point,
> > float, etc) that matched the color buffer format.
> 
> Exactly, also because we'll want unnormalized formats and 64-bit formats too.
> 
> > But we've been seeing some results which point that the whole color
> > buffer swizzling idea might be overrated: it increases memory bandwidth
> > usage substantially,
> 
> Why?
> It should decrease it due to a lower number of cache misses, thanks to
> having a 2D instead of a 1D neighborhood of a pixel in the cache.

Well, don't forget that you have to populate the tile from somewhere -
so you'll hit all of the same cachelines that the non-swizzled version
would have.  

We still get locality from binning, meaning that all accesses to a group
of cachelines come in a single burst, after which they are done with and
can migrate out of L1 cache according to the processors own mechanisms.
With swizzling, we need to write them out ourselves, and try and do so
without blowing the caches (which is possible with non-temporal writes,
but it's still an extra operation).

A tile covers the same number of cachelines either way, and in normal
rendering the entire tile gets written, even if it's just with the clear
color.

In any case, this isn't really an argument that will be resolved by
discourse - one day someone will do the work to build a non-swizzling
version of llvmpipe & it will either be faster than what's there
currently or not.

It's also worth noting that things like improving texture sampling are
far higher on the list.  We do quite well in non-textured mesa demos
relative to i965 (50% from a single core seems typical), but drop behind
drastically in things like tunnel, etc.  Swizzling or not won't bridge
that gap.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] swizzling in llvmpipe [was: other stuff]

2010-09-01 Thread keith whitwell
On Wed, Sep 1, 2010 at 7:54 PM, Luca Barbieri  wrote:
>> It still sounds that you're referring to sampling from a texture and not
>> rendering/blending to it. Of course the are related (we only want one
>> swizzled layout at most), but the current swizzled layout was chosen to
>> make blending easy; and not to make texture sampling easy.
>>
>> No SoA swizzled layout makes texture sampling easier or more efficient.
>
> Yes, I incorrectly assumed that sampling were performed using the
> swizzled SoA layout too, which is indeed not the case and probably
> would be a bad idea.
>
> Perhaps always keeping the texture in a linear layout and converting
> before/after processing tiles could be an option? It seems the current
> code can do this, but only does in some cases (not totally sure).

We switched to this a little while ago, though the code is more
structured towards the original approach so it isn't easy to tell.

Z buffers are special as they are normally never accessed outside
shaders, so they stay swizzled always.  That's about it.

> That would even allow to keep the tiles in a SoA floating point layout
> without any significant RAM usage disadvantage.
> This may perhaps be faster than the current scheme, since you don't
> have to do format conversions during rendering (but you get a larger
> cache footprint, and need to clamp anyway for correctness before and
> after blending).

Switching to this scheme was indeed a win.  If nothing changes, this
is where we'll stay but at some point I'll make the experiment of
shading directly into the linear format.

Keith
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] [BRANCH] Floating point textures and rendering for Mesa, softpipe and llvmpipe

2010-09-01 Thread keith whitwell
On Wed, Sep 1, 2010 at 8:34 PM, tom fogal  wrote:
> Luca Barbieri  writes:
>> > It's an impressive amount of work you did here. I'll comment only
>> > on the llvmpipe of the changes for now.
>>
>> Thanks for your feedback!
>
> While we're on the topic: yes, this is great to see Luca.  Thank you!
>
>> > So instead of going through a lot of work to support multiple
>> > swizzled types I'd prefer to keep the current simplistic (always
>> > 8bit unorm) swizzled type, and simply ignore errors in the
>> > clamping/precision loss when rendering to formats with higher
>> > precision dynamic range.
>> >
>> > In summary, apart of your fragment clamping changes, I'd prefer to
>> > keep the rest of llvmpipe unchanged (and innacurate) for the time
>> > being.
>>
>> Note that this totally destroys the ability to use llvmpipe for high
>> dynamic range rendering, which fundamentally depends on the ability
>> to store unclamped and relatively high precision values.
>
> FWIW, we've got an app that allows one to choose the bit depth we do
> our compositing in, and the difference between 8 and 16 for some data
> is quite striking.  16 and 32 not so much, though sometimes visible.

Is that 8-fixed and 16-float (ie half-float)?  Or does a 16-bit fixed
format also work?

Keith
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa (shader-work): glsl: introduce ir_binop_all_equal and ir_binop_any_equal, allow vector cmps

2010-09-09 Thread Keith Whitwell
On Wed, 2010-09-08 at 21:30 -0700, Marek Olšák wrote:
> On Thu, Sep 9, 2010 at 2:35 AM, Eric Anholt  wrote:
> However, the fact that when I ask about performance nobody
> says "OMG the
> texture upload/download/copy/etc. support in gallium is so
> great and is
> way faster than anybody managed in classic because it catches
> all the
> corner cases" makes me really scared about it.
> 
> OK so there you have it: The texture upload/download/copy is way
> faster in r300g than anyone managed in r300c. For example, the ETQW
> main menu uses some texture streaming and I get 1 fps in r300c and
> about 30 fps in r300g. That's not something one can ignore.
> 
> The transfers in r300g were really simple to implement, it's just
> transfer, map, unmap, transfer (transfer=blit). The code is clean and
> isolated (about 250 LoC). This is just the driver code. There is also
> some additional transfer code in st/mesa, but I didn't have to care
> about that.
> 
> The overall speed of r300g is either at the same level as r300c or
> higher, based on what application you look at. Various users have
> reported that, unlike r300c, all compiz effects just work, kwin works,
> a lot more games work, and the driver is faster in some apps. We used
> to have some performance issues in Tremulous not so long ago, but
> that's been fixed since then. Of course, one can find synthetic tests
> where one driver is always faster than another. I am talking about
> real world applications here. For example, I no longer have to kill
> Strogg in ETQW with the lowest details on my R580 for it to be smooth.
> 
> r300g is quite optimized (I say "quite", because you're never sure),
> so the overhead of other mesa components is larger than other Gallium
> drivers might be able to see. In Tremulous, the overhead of r300g
> +libdrm is under 50% of the whole time spent in Mesa, and that's
> without using Draw, so we should start caring about the speed of
> st/mesa and mesa/main. The only obvious way to get some speed there
> seems to be merging Mesa core with st/mesa, dropping the classic
> driver inteface, and simplifying the whole thing. I guess this won't
> happen until everybody moves to Gallium.

There's a lot that can be improved in st/mesa with regard to assembling
the vertex buffers & vertex elements -- very little of this work is
reused between subsequent primitives, but with a bit of analysis and
dirty state tracking I think a good improvement is possible.

st/mesa just hasn't had a lot of love - it got to a working state fairly
early, but has been largely neglected since.  

The fact it's a bit ugly is one factor that puts people off, but I guess
it's also the one part of gallium which doesn't benefit from gallium --
ie. you still have to deal with all the overlapping & conflicting GL
semantics that made things so confusing in the first place.

> It's a sure thing Gallium is the future, and it doesn't seem to be a
> good idea to implement e.g. LLVM-powered vertex shaders for classic,
> considering the same thing has already been implemented and now stable
> in Gallium.
> 
> The only disadvantage of Gallium seems to be TGSI. It's not better
> than Mesa IR, because all shaders must pass through Mesa IR anyway. I
> suppose using GLSL IR or something similar would help some drivers
> produce more optimized shaders (I am getting at temporary arrays here,
> which r300 hardware partially supports).

I'm really open to improving (or replacing) TGSI & I agree that passing
some sort of semantically rich IR would be a big win - especially now
there is a source for such a thing.

There's a reasonably painless path to getting there, I think, where we
could start off passing the new IR but have drivers initially slot in a
helper to drop it down to TGSI - ie push the IR->TGSI translation down
into each driver & then eliminate it piecewise.

I've just been incredibly swamped the last month or so, so I haven't had
much chance to follow up with a plan to get there, but the steps to do
this seem fairly clear.

Probably the first thing would be to define an IR which is a derivitive
of the new Mesa IR, without any dependencies into mesa concepts - ie.
which talks about constant buffers, etc, instead of GL parameters.  I
haven't had even the beginning of an opportunity to see how hard this
would be.


Keith


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/10] mesa/st: set compiler options based on Gallium shader caps

2010-09-13 Thread Keith Whitwell
On Sat, 2010-09-11 at 15:48 -0700, Luca Barbieri wrote:
> Jose Fonseca seemed open to accepting this, but no final go ahead was given.
> 
> > We need this to enable full loop unrolling for r3xx->r4xx fragment shaders,
> > which don't support loops. It's needed for the blur shader in KWin to work.
> > This is a regression since the GLSL compiler merge, because the previous
> > compiler did unroll loops automatically. I'd like to get this fixed for 7.9.
> 
> nv30 has the same needs (and other cards are likely to get some
> performance boost from unrolling).

Yes, let's go ahead with this.  I'm assuming you've integrated the other
comments from Zack, Jose, etc.

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vbo: Set FLUSH_UPDATE_CURRENT when setting vertex attibutes

2010-09-13 Thread keith whitwell
Hey Kristian,

The first question is whether this is necessary - from vague memory I
have an idea that current attributes need not be updated by vertex
buffer rendering - ie. it's optional/implementation-dependent.

I assume you're concerned with the case where you have something like

   // ctx->Current.Color is xyz

   glDrawArrays();

   // has ctx->Current.Color been updated??

But assuming I'm wrong about that & we really do want to make
DrawArrays set the current values, the patch looks good...

Keith


2010/9/13 Kristian Høgsberg :
> Setting constant vertex attributes with glDrawArrays() doesn't work right
> because the last attribute isn't copied to ctx->Current.  Typically,
> only the last attribute doesn't get set, since vbo_exec_wrap_upgrade_vertex()
> ends up getting called when setting a new attribute, and it will copy all
> previously set attributes to Current.
> ---
>  src/mesa/vbo/vbo_exec_api.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
> I'm not too familiar with this code, so I'd appreciate if somebody who
> knows the vbo code better could take a quick look.
>
> Kristian
>
> diff --git a/src/mesa/vbo/vbo_exec_api.c b/src/mesa/vbo/vbo_exec_api.c
> index 9df75a8..90c3dd4 100644
> --- a/src/mesa/vbo/vbo_exec_api.c
> +++ b/src/mesa/vbo/vbo_exec_api.c
> @@ -359,6 +359,9 @@ static void vbo_exec_fixup_vertex( GLcontext *ctx,
>  do {                                                           \
>    struct vbo_exec_context *exec = &vbo_context(ctx)->exec;    \
>                                                                \
> +   /* FLUSH_UPDATE_CURRENT needs to be set manually */         \
> +   exec->ctx->Driver.NeedFlush |= FLUSH_UPDATE_CURRENT;                \
> +                                                               \
>    if (exec->vtx.active_sz[A] != N)                            \
>       vbo_exec_fixup_vertex(ctx, A, N);                        \
>                                                                \
> --
> 1.7.2.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vbo: Set FLUSH_UPDATE_CURRENT when setting vertex attibutes

2010-09-14 Thread Keith Whitwell
On Tue, 2010-09-14 at 08:18 -0700, Chia-I Wu wrote:
> 2010/9/14 Kristian Høgsberg :
> > 2010/9/14 Chia-I Wu :
> >> 2010/9/14 Kristian Høgsberg :
> >>> 2010/9/13 keith whitwell :
> >>>> Hey Kristian,
> >>>>
> >>>> The first question is whether this is necessary - from vague memory I
> >>>> have an idea that current attributes need not be updated by vertex
> >>>> buffer rendering - ie. it's optional/implementation-dependent.
> >>>>
> >>>> I assume you're concerned with the case where you have something like
> >>>>
> >>>>   // ctx->Current.Color is xyz
> >>>>
> >>>>   glDrawArrays();
> >>>>
> >>>>   // has ctx->Current.Color been updated??
> >>>>
> >>>> But assuming I'm wrong about that & we really do want to make
> >>>> DrawArrays set the current values, the patch looks good...
> >>>
> >>> No, what I'm seeing is that the code in question sets three generic
> >>> vertex attributes and then calls glDrawArrays().  The value of the
> >>> last attribute is not propagates into the shader.
> >>>
> >>> The problem is that the vertex array code keeps the values in
> >>> exec->vtx.vertex, but the implementation of glDrawArrays looks in
> >>> ctx->Current (that's what I assume, I didn't track that down).  When
> >>> the code hits a case where the size of an attribute is smaller that
> >>> what we're trying to set, it recomputes the layout of the
> >>> exec->vtx.vertex values and as a side effect copies the
> >>> exec->vtx.vertex values to ctx->Current.  Since we start out with
> >>> attrsz == 0 for all attributes, each new attribute will trigger this
> >>> recomputation and thus effectively flushes all previous values to
> >>> ctx->Current.  Which is why all but the last attribute make it to the
> >>> shader.
> >>>
> >>> Note that the ATTR macro is defined differently, depending on
> >>> FEATURE_beginend - the !FEATURE_beginend case sets the
> >>> FLUSH_UPDATE_CURRENT flag too.  I don't know why we wouldn't also set
> >>> it in the FEATURE_beginend case, not using begin/end in that case is
> >>> still an option.
> >> The way glColor4f is dispatched depends on whether it is GL or ES:
> >>
> >>  GL (with FEATURE_beginend): glColor4f -> neutral_Color4f -> vbo_Color4f
> >>  ES (w/o  FEATURE_beginend): glColor4f -> _es_Color4f -> vbo_Color4f
> >>
> >> In the former case, FLUSH_UPDATE_CURRENT should have been set by
> >> vbo_exec_BeginVertices which is called by neutral_Color4f.  In the latter 
> >> case,
> >> the flag must be set in vbo_Color4f.  Could it be a bug some where in 
> >> vtxfmt?
> >>
> >> One issue I noticed just last week is that the current scheme does not take
> >> into account "ES with FEATURE_beginend".  This happens with --enable-gles2
> >> build.  Since FEATURE_beginend is enabled in such build, vbo_Color4f does 
> >> not
> >> set FLUSH_UPDATE_CURRENT.  Yet, an ES context does not use neutral_Color4f.
> >
> > And that's exactly the problem I have.  Setting FLUSH_UPDATE_CURRENT
> > in the ATTR macro makes sure that the FLUSH_CURRENT in
> > vbo_exec_DrawArrays (and other array draw funcs) ends up calling
> > vbo_exec_copy_to_current(), which then pushes the values into
> > ctx->Current before the draw call.  I don't see a problem with this
> > approach; in a begin/end, this flag is already set, so there's no
> > overhead, outside begin/end (ES2) it's required for correct behaviour.
> If this patch is to be applied, it makes sense to also stop setting
> FLUSH_UPDATE_CURRENT in vbo_exec_BeginVertices, and remove the
> !FEATURE_beginend case.
> 
> But then, does it make sense to remove exec vtxfmt in main/vtxfmt.c and the
> related code (a big chunk!), and define ATTR to
> 
>   #define ATTR(...) do {
>  struct vbo_exec_context *exec = &vbo_context(ctx)->exec;
> 
>  if (!exec->ctx->Driver.NeedFlush)
> vbo_exec_BeginVertices(ctx);
> 
>  /* ... */
>   }
> 
> From what I can tell, and I am be wrong, exec vtxfmt and PRE_LOOPBACK are to
> avoid any unnecessary operation in vtxfmt functions, even as cheap as setting
> FLUSH_UPDATE_CURRENT repeatedly.  If that cannot be avoided, there seems to be
> no need to have exec vtxfmt.

This is all ancient code & getting rid of large chunks of it is probably
very appropriate...

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa (master): glsl: Fix ' format not a string literal and no format arguments' warning.

2010-09-16 Thread Keith Whitwell
I saw these warnings also on scons builds.  Vinson - if you set "quiet=no", 
scons will print out the full gcc invocation, there may be a clue there what's 
causing this.

Keith

From: mesa-dev-bounces+keithw=vmware@lists.freedesktop.org 
[mesa-dev-bounces+keithw=vmware@lists.freedesktop.org] On Behalf Of Ian 
Romanick [...@freedesktop.org]
Sent: Thursday, September 16, 2010 2:31 PM
To: mesa-dev@lists.freedesktop.org
Subject: Re: [Mesa-dev] Mesa (master): glsl: Fix '  format not a string 
literal and no format arguments' warning.

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Vinson Lee wrote:
> Module: Mesa
> Branch: master
> Commit: f20f2cc3306310f6fc4c338f91cfac10f98335d3
> URL:
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=f20f2cc3306310f6fc4c338f91cfac10f98335d3
>
> Author: Vinson Lee 
> Date:   Wed Sep 15 05:17:57 2010 -0700
>
> glsl: Fix 'format not a string literal and no format arguments' warning.
>
> Fix the following GCC warning.
> loop_controls.cpp: In function 'int calculate_iterations(ir_rvalue*, 
> ir_rvalue*, ir_rvalue*, ir_expression_operation)':
> loop_controls.cpp:88: warning: format not a string literal and no format 
> arguments
>
> ---
>
>  src/glsl/loop_controls.cpp |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/glsl/loop_controls.cpp b/src/glsl/loop_controls.cpp
> index 9619d8a..2ef3d30 100644
> --- a/src/glsl/loop_controls.cpp
> +++ b/src/glsl/loop_controls.cpp
> @@ -85,7 +85,7 @@ int
>  calculate_iterations(ir_rvalue *from, ir_rvalue *to, ir_rvalue *increment,
>enum ir_expression_operation op)
>  {
> -   void *mem_ctx = talloc_init(__func__);
> +   void *mem_ctx = talloc_init("%s", __func__);

If __func__ is not showing up as a string literal, something else is
wrong.  This is supposed to be a #define generated by the compiler.  Is
this  a case where we're wrapping __func__ for Visual Studio and
something is going wrong (since this is GCC)?  I also find this odd
because I don't get this warning on my builds... and I enable a lot of
extra warning flags.

>
> ir_expression *const sub =
>new(mem_ctx) ir_expression(ir_binop_sub, from->type, to, from);

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkySHDsACgkQX1gOwKyEAw/n/ACePdgvPa0xwSs02PqSQdnTF1pr
HekAn07NDlPtU2bvUGQSf1W0tFLtOWa1
=R3UQ
-END PGP SIGNATURE-
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/docs: Fixed a typo in the SCS opcode description.

2010-09-19 Thread keith whitwell
Looks good, thanks Tilman.

Keith

On Sun, Sep 19, 2010 at 8:24 AM, Tilman Sauerbeck  wrote:
> Signed-off-by: Tilman Sauerbeck 
> ---
>  src/gallium/docs/source/tgsi.rst |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/gallium/docs/source/tgsi.rst 
> b/src/gallium/docs/source/tgsi.rst
> index e588c5b..4c1f47a 100644
> --- a/src/gallium/docs/source/tgsi.rst
> +++ b/src/gallium/docs/source/tgsi.rst
> @@ -726,7 +726,7 @@ This instruction replicates its result.
>
>   dst.z = 0
>
> -  dst.y = 1
> +  dst.w = 1
>
>
>  .. opcode:: TXB - Texture Lookup With Bias
> --
> 1.7.2.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa (d3d1x): d3d1x: add new Direct3D 10/11 COM state tracker for Gallium

2010-09-20 Thread Keith Whitwell
Luca,

This is an amazing achievement -- not least because even excluding
this you've been incredibly productive recently.  I hope you haven't
got any more monster projects about to uncloak...

A couple of questions - it looks like this is a drop-in for the
d3d10/11 runtime, rather than an implementation of the DDI.  I think
that makes sense, but it could also be possible to split it into two
pieces implementing either side of the d3d10 DDI interface.  Any
thoughts on whether that's interesting to you?

Just trying to wrap my head around this, and how D3D10 works on
linux...  Right now, the test applications must be some sort of hybrid
between a regular posix/linux app and a win32/com/etc application,
right?  So you've essentially taken d3d10 plus a minimal amount of
dxgi, com, etc, sufficient to support the graphics apis, and
implemented that on linux+gallium?

I can see this being a useful tool for people porting win32 games to
linux, but at the same time there will probably be ongoing confusion
about which bits of win32 are a part of this platform -- I guess
that's where wine comes in.

Will this codebase work on windows, ie as a drop-in replacement for
the d3d10 runtime?  Or would it, with a bit of work?

Keith

On Mon, Sep 20, 2010 at 8:58 PM, Luca Barbieri
 wrote:
> Module: Mesa
> Branch: d3d1x
> Commit: e80d59faaa410bfc78af64204bc4055b837c7fad
> URL:    
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=e80d59faaa410bfc78af64204bc4055b837c7fad
>
> Author: Luca Barbieri 
> Date:   Sun Sep 12 02:49:36 2010 +0200
>
> d3d1x: add new Direct3D 10/11 COM state tracker for Gallium
>
> This is a new implementation of the Direct3D 11 COM API for Gallium.
>
> Direct3D 10 and 10.1 implementations are also provided, which are
> automatically generated with s/D3D11/D3D10/g plus a bunch of #ifs.
>
> While this in an initial version, most of the code is there (limited
> to what Gallium can express), and tri, gears and texturing demos
> are working.
>
> The primary goal is to realize Gallium's promise of multiple API
> support, and provide an API that can be easily implemented with just
> a very thin wrapper over Gallium, instead of the enormous amount of
> complex code needed for OpenGL.
>
> The secondary goal is to run Windows Direct3D 10/11 games on Linux
> using Wine.
> Wine dlls are currently not provided, but adding them should be
> quite easy.
>
> Fglrx and nvidia drivers can also be supported by writing a Gallium
> driver that talks to them using OpenGL, which is a relatively easy
> task.
> Thanks to the great design of Direct3D 10/11 and closeness to Gallium,
> this approach should not result in detectable overhead, and is the
> most maintainable way to do it, providing a path to switch to the
> open Gallium drivers once they are on par with the proprietary ones.
>
> Currently Wine has a very limited Direct3D 10 implementation, and
> completely lacks a Direct3D 11 implementation.
>
> Note that Direct3D 10/11 are completely different from Direct3D 9
> and earlier, and thus warrant a fully separate implementation.
>
> The third goal is to provide a superior alternative to OpenGL for
> graphics programming on non-Windows systems, particularly Linux
> and other free and open systems.
>
> Thanks to a very clean and well-though design done from scratch,
> the Direct3D 10/11 APIs are vastly better than OpenGL and can be
> supported with orders of magnitude less code and development time,
> as you can see by comparing the lines of code of this commit and
> those in the existing Mesa OpenGL implementation.
>
> This would have been true for the Longs Peak proposal as well, but
> unfortunately it was abandoned by Khronos, leaving the OpenGL
> ecosystem without a graphics API with a modern design.
>
> A binding of Direct3D 10/11 to EGL would solve this issue in the
> most economical way possible, and this would be great to provide
> in Mesa, since DXGI, the API used to bind Direct3D 10/11 to Windows,
> is a bit suboptimal, especially on non-Windows platforms.
>
> Finally, a mature Direct3D 10/11 implementation is intrinsically going
> to be faster and more reliable than an OpenGL implementation, thanks
> to the dramatically smaller API and the segregation of all nontrivial
> work to object creation that the application must perform ahead of
> time.
>
> Currently, this commit contains:
> - Independently created headers for Direct3D 10, 10.1, 11 and DXGI 1.1,
>  partially based on the existing Wine headers for D3D10 and DXGI 1.0
> - A parser for Direct3D 10/11 DXBC and TokenizedProgramFormat (TPF)
> - A shader translator from TokenizedProgramFormat to TGSI
> - Implementation of the Direct3D 11 core interfaces
> - Automatically generated implementation of Direct3D 10 and 10.1
> - Implementation of DXGI using the "native" framework of the EGL st
> - Demos, usable either on Windows or on this implementation
>  - d3d11tri, a clone of tri
>  - d3d11tex, a (multi)texturing demo
>  - d3d11gears, an improved version of glxgears

Re: [Mesa-dev] Mesa (d3d1x): d3d1x: add new Direct3D 10/11 COM state tracker for Gallium

2010-09-21 Thread Keith Whitwell
On Mon, 2010-09-20 at 16:28 -0700, Luca Barbieri wrote:
> > A couple of questions - it looks like this is a drop-in for the
> > d3d10/11 runtime, rather than an implementation of the DDI.
> Yes.
> 
> > I think
> > that makes sense, but it could also be possible to split it into two
> > pieces implementing either side of the d3d10 DDI interface.  Any
> > thoughts on whether that's interesting to you?
> 
> I wrote it this way first of all because it's clearly easier to just
> write the code to support one interface, rather than writing two
> pieces, and it avoids unnecessary reliance on Microsoft interfaces,
> which often tend to be imperfectly documented.
> Not going through the DDI also clearly reduces CPU overhead and keeps
> the codebase simpler.
> 
> I think a DDI implementation over Gallium could just live along as a
> sibling to the COM implementation, sharing common code, which is
> already split out into modules such as d3d1xshader and d3d1xstutil.
> The shader parser and translator can be fully shared and several
> conversions (e.g. DXGI_FORMAT -> pipe_format) are already separate
> from the main code, although perhaps more could be factored out.
> 
> Instead, layering the COM API over the DDI API doesn't necessarily
> seem to be a win, especially because Gallium is so close to the
> D3D10/11 interfaces that it's not even clear that using the DDI is
> much easier than just using Gallium directly.
> 
> I don't think I'll do it myself as an hobby project though.

Sounds good Luca, just interested in your plans for this.

I don't see any reason not to merge this to master straight away -- this
is all self-contained in its own directory & doesn't seem like it will
regress anything else...

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] separate depth and stencil buffers in gallium

2010-09-22 Thread Keith Whitwell
On Wed, Sep 22, 2010 at 10:30 AM, Dave Airlie  wrote:
> So Evergreen hardware appears to have only completely separate depth
> and stencil buffers and doesn't natively support a combnined DS buffer
> from what I can see. I'm awaiting clarification from AMD.
>
> Now gallium and st/mesa seem to be quite dedicated to the whole
> combined DS cause.

What formats exactly does it support?

It's interesting because DX10 (and presumably 11) always talk about
combined buffers - but the abstraction is such (with staging
resources, no direct mapping of the buffers, etc) that there's nothing
which constrains the layout to be an interleaved depth+stencil.   IE.
you could quite happily allocate the combined depth/stencil as a
planar depth buffer and a separate planar stencil buffer - both hidden
behind the same resource handle.

I would have hoped we'd have the same flexibility in gallium -
basically that nobody should be able to tell whether depth & stencil
are swizzled together or separate.

The obvious case where the app & state tracker might be alerted to
your unusual layout is in transfers.  An interim solution would be to
swizzle/unswizzle depth buffer transfer data (or organize for the card
to do so for you).


> I'm mainly posting just wondering if anyone else has considered this
> or any other hardware this might be useful for exists, or if anyone
> can speak to the pitfalls I'll face.
>
> I've got some initial done in 30 mins hacks
> http://cgit.freedesktop.org/~airlied/mesa/log/?h=sep-zs

I'll take a look.

Keith
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] separate depth and stencil buffers in gallium

2010-09-22 Thread Keith Whitwell
On Wed, Sep 22, 2010 at 11:15 AM, Dave Airlie  wrote:

>>> I'm mainly posting just wondering if anyone else has considered this
>>> or any other hardware this might be useful for exists, or if anyone
>>> can speak to the pitfalls I'll face.
>>>
>>> I've got some initial done in 30 mins hacks
>>> http://cgit.freedesktop.org/~airlied/mesa/log/?h=sep-zs
>
> So I'm guessing I've taken the wrong approach here from reading this,
> I should probably not expose this to gallium, and just make sure the DDX
> allocates a large enough buffer for two planes.
>

This sounds good, especially if sampling needs yet more logic.

Presumably things like Hyper-Z require their own, additional storage?

Keith
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] D3D1x Revert

2010-09-23 Thread Keith Whitwell
>From my point of view, I'd like to get a list of the specific issues the wine 
>guys have, and work through that list resolving each issue in turn - either by 
>modifying the code or demonstrating that the concern is unfounded.  If neither 
>of these is possible, then we need to make a call one way or another.

Looking back at the emails what I have so far is concern about the file 
"tpf.h".   I'd like to understand whether people actually think that this has 
been created improperly, or rather that people believe that it is actually 
legal but does not meet wine's standards and that those standards should also 
now apply to mesa.

IE. are we:
 (a) being alterted to improperly authored code, or
 (b) being asked to impose new restrictions on our project to fit in with the 
choices of another one?

These are two quite different conversations, and I'd like to understand which 
one we're having.

Keith






From: mesa-dev-bounces+keithw=vmware@lists.freedesktop.org 
[mesa-dev-bounces+keithw=vmware@lists.freedesktop.org] On Behalf Of Jose 
Fonseca [jfons...@vmware.com]
Sent: Thursday, September 23, 2010 7:33 AM
To: Corbin Simpson
Cc: mesa-dev@lists.freedesktop.org
Subject: Re: [Mesa-dev] D3D1x Revert

Either WINE developers or Luca will feel alienated. So, I think the discussion 
should be *now*, before taking further decision.

Personally, I still don't understand what's special about this code. If no 
Microsoft code was ever copied or used in d3d1x then I don't see how WINE 
developers are tainted by the d3d1x state tracker presence than by the other 
stuff in master now.

As I said before, we all read third party NDA hardware specs and reference code 
in the past. It is standard practice. Several other members of the community 
had to reverse engineer hardware operation. I think it would be dishonest to 
allows ourselves this much so far, and refuse Luca's code because he read 
publicly available Microsoft docs or reference code.

My understanding is that this is more than about legality of this code: 
Microsoft can sue whoever they want, with or without legal basis, and WINE 
developers want to publicly show beyond doubt that they we're not near even 
miles from Microsoft code, to deter Microsoft to sue them, and so incur in 
legal expenses.

But I don't see how we can accommodate that, and continue to maintain that 
Gallium is about the abstraction of many graphic APIs and many OSes. Even Mesa 
was always about many OSes too.

Personally, I'd like to encourage initiatives of supporting more APIs and more 
OSes to Mesa/Gallium such as this one.

And to be honest, WINE developers did a disservice to themselves by openly 
stating their concerns. They put themselves between the rock and the wall with 
that. For future reference, if people have this sort of doubts, they should 
contact the project maintainers (e.g Brian, Keith) privately.

Jose



From: mesa-dev-bounces+jfonseca=vmware@lists.freedesktop.org 
[mesa-dev-bounces+jfonseca=vmware@lists.freedesktop.org] On Behalf Of 
Corbin Simpson [mostawesomed...@gmail.com]
Sent: Wednesday, September 22, 2010 23:23
To: mesa-dev@lists.freedesktop.org
Subject: [Mesa-dev] D3D1x Revert

Can I revert this merge out of master? The Wine developers that
contribute to Mesa are feeling very alienated by this code, and I
think that it could stand to have some more discussion, especially
about its legality.

~ C.

--
When the facts change, I change my mind. What do you do, sir? ~ Keynes

Corbin Simpson

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] util: fix util_pack_color for B4G4R4A4

2010-09-24 Thread Keith Whitwell
Looks good to me Marek -- good catch.

Keith

On Thu, 2010-09-23 at 17:42 -0700, Marek Olšák wrote:
> NOTE: This is a candidate for the 7.9 branch.
> ---
>  src/gallium/auxiliary/util/u_pack_color.h |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/util/u_pack_color.h 
> b/src/gallium/auxiliary/util/u_pack_color.h
> index aae8b8b..c90b0fd 100644
> --- a/src/gallium/auxiliary/util/u_pack_color.h
> +++ b/src/gallium/auxiliary/util/u_pack_color.h
> @@ -394,7 +394,7 @@ util_pack_color(const float rgba[4], enum pipe_format 
> format, union util_color *
>return;
> case PIPE_FORMAT_B4G4R4A4_UNORM:
>{
> - uc->ub = ((a & 0xf0) << 8) | ((r & 0xf0) << 4) | ((g & 0xf0) << 0) 
> | (b >> 4);
> + uc->us = ((a & 0xf0) << 8) | ((r & 0xf0) << 4) | ((g & 0xf0) << 0) 
> | (b >> 4);
>}
>return;
> case PIPE_FORMAT_A8_UNORM:


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] r600g old design -> new design

2010-09-29 Thread Keith Whitwell
On Wed, 2010-09-29 at 03:35 -0700, Michel Dänzer wrote:
> On Die, 2010-09-28 at 11:40 -0400, Jerome Glisse wrote: 
> > 
> > - use score for placing bo, bo placement will be recorded in bo structure 
> > and
> > each time a state is bind bo score will be updated (bo bound as framebuffer
> > will get their score for placing into vram increase while bo bound as small
> > vertex buffer will endup in GTT, also anytime a bo is mapped for transfer 
> > for
> > CPU read its score for GTT placement increase thus bo that are often updated
> > by CPU will more likely place into GTT)
> 
> Beware that the EXA 'classic' scheme originally (with the 'greedy' and
> 'smart' heuristics) used a score like that to determine whether a pixmap
> should reside in VRAM or system memory, and it could result in quite
> erratic / inconsistent / unreproducible behaviour when the score hovered
> around the threshold. It could make fixing or even reproducing problems
> harder than it is already.

Did trying some sort of hysteresis - eg. having different thresholds for
upload vs. download help?  

Keith


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] first attempt at shader stencil export

2010-09-30 Thread Keith Whitwell
On Wed, 2010-09-29 at 23:41 -0700, Dave Airlie wrote:
> some background:
> 
> so on r600g, the only way to directly write to the stencil is via the
> shader, using a transfer would require an extra step to flush the DS
> buffer out via the pipe again to make it useable by the hw as a DS
> buffer. So using the current gallium stencil draw would be a major
> overhead, compared to just doing it properly.
> 
> So to do it properly I decided to expose the
> GL_ARB_shader_stencil_export type functionality.
> 
> http://cgit.freedesktop.org/~airlied/mesa/log/?h=gallium-stencil-export
> 
> 7 commits in here:
> 
> two simple one liners, add the cap to gallium headers, and add
> FRAG_RESULT_STENCIL to mesa.
> then a "fix" to the mesa texstore so it will store S8 in an 8/24 so we
> can put the stencil values in a texture.
> mesa state tracker support to use the new capability to hw accel
> drawpixels on the stencil (outputs to Y of FRAG_RESULT_STENCIL).
> r600g support for the capability take the second
> TGSI_SEMANTIC_POSITION output and use its Y value as stencil (looking
> at this now, I should probably be taking the X value really).
> very initial mesa/GLSL support for the extension (completely untested).
> initial softpipe support for the capability a demonstration.
> 
> TODO:
> probably stop using Y (this was just because the hw uses Y), the GLSL
> extension just defines an integer output for the fragprog.
> fix the 24/8 texstore variant.
> write some test code for piglit and test the GL extension/GLSL bits.
> 
> I'm a lot more interested in the non-GL extension bits as it allows
> stencil writes to work properly on r600g.

Dave,

This all looks great.  I wasn't really aware of this extension, but it
looks like a good way to expose the functionality you need.

Can you make some updates to the gallium/docs directory to reflect the
additions you're making to the gallium interface? 

Keith

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa (master): r600g: use constant buffer instead of register for constant

2010-10-06 Thread Keith Whitwell
Hi Jerome,

I was playing with this driver on my new machine (rv710) & was
impressed with how well it worked.  Unfortunately the next update
everything went black...

I traced it down to this commit - it basically seems like all the
vertex-buffer constants (at least) are ending up as zero, so most
geometry doesn't get transformed properly.  There are a few exceptions
(like trivial/drawelements) which submit clip-space vertices & don't
need transformation.

I don't really know anything about how r600 hardware works, so thought
I'd try & see if anyone else has an idea what's wrong...

Keith

On Thu, Sep 30, 2010 at 6:47 PM, Jerome Glisse
 wrote:
> Module: Mesa
> Branch: master
> Commit: 153105cfbfd8d6ff30de144605016f6e4f2a1b9e
> URL:    
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=153105cfbfd8d6ff30de144605016f6e4f2a1b9e
>
> Author: Jerome Glisse 
> Date:   Thu Sep 30 10:43:26 2010 -0400
>
> r600g: use constant buffer instead of register for constant
>
> Signed-off-by: Jerome Glisse 
>
> ---
>
>  src/gallium/drivers/r600/r600_state.c         |   47 
> +++--
>  src/gallium/drivers/r600/r600d.h              |    5 +++
>  src/gallium/winsys/r600/drm/r600_hw_context.c |   34 +++---
>  src/gallium/winsys/r600/drm/r600d.h           |    5 +++
>  4 files changed, 36 insertions(+), 55 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_state.c 
> b/src/gallium/drivers/r600/r600_state.c
> index 23323f1..23c2e59 100644
> --- a/src/gallium/drivers/r600/r600_state.c
> +++ b/src/gallium/drivers/r600/r600_state.c
> @@ -1148,41 +1148,35 @@ static void r600_set_constant_buffer(struct 
> pipe_context *ctx, uint shader, uint
>                                        struct pipe_resource *buffer)
>  {
>        struct r600_pipe_context *rctx = (struct r600_pipe_context *)ctx;
> -       struct r600_pipe_state *rstate;
> -       struct pipe_transfer *transfer;
> -       unsigned *nconst = NULL;
> -       u32 *ptr, offset;
> +       struct r600_resource *rbuffer = (struct r600_resource*)buffer;
>
>        switch (shader) {
>        case PIPE_SHADER_VERTEX:
> -               rstate = rctx->vs_const;
> -               nconst = &rctx->vs_nconst;
> -               offset = R_03_SQ_ALU_CONSTANT0_0 + 0x1000;
> +               rctx->vs_const_buffer.nregs = 0;
> +               r600_pipe_state_add_reg(&rctx->vs_const_buffer,
> +                                       R_028180_ALU_CONST_BUFFER_SIZE_VS_0,
> +                                       ALIGN_DIVUP(buffer->width0 >> 4, 16),
> +                                       0x, NULL);
> +               r600_pipe_state_add_reg(&rctx->vs_const_buffer,
> +                                       R_028980_ALU_CONST_CACHE_VS_0,
> +                                       0, 0x, rbuffer->bo);
> +               r600_context_pipe_state_set(&rctx->ctx, 
> &rctx->vs_const_buffer);
>                break;
>        case PIPE_SHADER_FRAGMENT:
> -               rstate = rctx->ps_const;
> -               nconst = &rctx->ps_nconst;
> -               offset = R_03_SQ_ALU_CONSTANT0_0;
> +               rctx->ps_const_buffer.nregs = 0;
> +               r600_pipe_state_add_reg(&rctx->ps_const_buffer,
> +                                       R_028140_ALU_CONST_BUFFER_SIZE_PS_0,
> +                                       ALIGN_DIVUP(buffer->width0 >> 4, 16),
> +                                       0x, NULL);
> +               r600_pipe_state_add_reg(&rctx->ps_const_buffer,
> +                                       R_028940_ALU_CONST_CACHE_PS_0,
> +                                       0, 0x, rbuffer->bo);
> +               r600_context_pipe_state_set(&rctx->ctx, 
> &rctx->ps_const_buffer);
>                break;
>        default:
>                R600_ERR("unsupported %d\n", shader);
>                return;
>        }
> -       if (buffer && buffer->width0 > 0) {
> -               *nconst = buffer->width0 / 16;
> -               ptr = pipe_buffer_map(ctx, buffer, PIPE_TRANSFER_READ, 
> &transfer);
> -               if (ptr == NULL)
> -                       return;
> -               for (int i = 0; i < *nconst; i++, offset += 0x10) {
> -                       rstate[i].nregs = 0;
> -                       r600_pipe_state_add_reg(&rstate[i], offset + 0x0, 
> ptr[i * 4 + 0], 0x, NULL);
> -                       r600_pipe_state_add_reg(&rstate[i], offset + 0x4, 
> ptr[i * 4 + 1], 0x, NULL);
> -                       r600_pipe_state_add_reg(&rstate[i], offset + 0x8, 
> ptr[i * 4 + 2], 0x, NULL);
> -                       r600_pipe_state_add_reg(&rstate[i], offset + 0xC, 
> ptr[i * 4 + 3], 0x, NULL);
> -                       r600_context_pipe_state_set(&rctx->ctx, &rstate[i]);
> -               }
> -               pipe_buffer_unmap(ctx, buffer, transfer);
> -       }
>  }
>
>  static void *r600_create_shader_state(struct pipe_context *ctx,
> @@ -1191,6 +1185,7 @@ static void *r600_create_shader_sta

[Mesa-dev] Fwd: Mesa (master): r600g: use constant buffer instead of register for constant

2010-10-06 Thread Keith Whitwell
(using the correct mesa3d-dev address)


-- Forwarded message --
From: Keith Whitwell 
Date: Wed, Oct 6, 2010 at 9:07 AM
Subject: Re: Mesa (master): r600g: use constant buffer instead of
register for constant
To: mesa-dev@lists.freedesktop.org
Cc: mesa3d-dev 


Hi Jerome,

I was playing with this driver on my new machine (rv710) & was
impressed with how well it worked.  Unfortunately the next update
everything went black...

I traced it down to this commit - it basically seems like all the
vertex-buffer constants (at least) are ending up as zero, so most
geometry doesn't get transformed properly.  There are a few exceptions
(like trivial/drawelements) which submit clip-space vertices & don't
need transformation.

I don't really know anything about how r600 hardware works, so thought
I'd try & see if anyone else has an idea what's wrong...

Keith

On Thu, Sep 30, 2010 at 6:47 PM, Jerome Glisse
 wrote:
> Module: Mesa
> Branch: master
> Commit: 153105cfbfd8d6ff30de144605016f6e4f2a1b9e
> URL:    
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=153105cfbfd8d6ff30de144605016f6e4f2a1b9e
>
> Author: Jerome Glisse 
> Date:   Thu Sep 30 10:43:26 2010 -0400
>
> r600g: use constant buffer instead of register for constant
>
> Signed-off-by: Jerome Glisse 
>
> ---
>
>  src/gallium/drivers/r600/r600_state.c         |   47 
> +++--
>  src/gallium/drivers/r600/r600d.h              |    5 +++
>  src/gallium/winsys/r600/drm/r600_hw_context.c |   34 +++---
>  src/gallium/winsys/r600/drm/r600d.h           |    5 +++
>  4 files changed, 36 insertions(+), 55 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_state.c 
> b/src/gallium/drivers/r600/r600_state.c
> index 23323f1..23c2e59 100644
> --- a/src/gallium/drivers/r600/r600_state.c
> +++ b/src/gallium/drivers/r600/r600_state.c
> @@ -1148,41 +1148,35 @@ static void r600_set_constant_buffer(struct 
> pipe_context *ctx, uint shader, uint
>                                        struct pipe_resource *buffer)
>  {
>        struct r600_pipe_context *rctx = (struct r600_pipe_context *)ctx;
> -       struct r600_pipe_state *rstate;
> -       struct pipe_transfer *transfer;
> -       unsigned *nconst = NULL;
> -       u32 *ptr, offset;
> +       struct r600_resource *rbuffer = (struct r600_resource*)buffer;
>
>        switch (shader) {
>        case PIPE_SHADER_VERTEX:
> -               rstate = rctx->vs_const;
> -               nconst = &rctx->vs_nconst;
> -               offset = R_03_SQ_ALU_CONSTANT0_0 + 0x1000;
> +               rctx->vs_const_buffer.nregs = 0;
> +               r600_pipe_state_add_reg(&rctx->vs_const_buffer,
> +                                       R_028180_ALU_CONST_BUFFER_SIZE_VS_0,
> +                                       ALIGN_DIVUP(buffer->width0 >> 4, 16),
> +                                       0x, NULL);
> +               r600_pipe_state_add_reg(&rctx->vs_const_buffer,
> +                                       R_028980_ALU_CONST_CACHE_VS_0,
> +                                       0, 0x, rbuffer->bo);
> +               r600_context_pipe_state_set(&rctx->ctx, 
> &rctx->vs_const_buffer);
>                break;
>        case PIPE_SHADER_FRAGMENT:
> -               rstate = rctx->ps_const;
> -               nconst = &rctx->ps_nconst;
> -               offset = R_03_SQ_ALU_CONSTANT0_0;
> +               rctx->ps_const_buffer.nregs = 0;
> +               r600_pipe_state_add_reg(&rctx->ps_const_buffer,
> +                                       R_028140_ALU_CONST_BUFFER_SIZE_PS_0,
> +                                       ALIGN_DIVUP(buffer->width0 >> 4, 16),
> +                                       0x, NULL);
> +               r600_pipe_state_add_reg(&rctx->ps_const_buffer,
> +                                       R_028940_ALU_CONST_CACHE_PS_0,
> +                                       0, 0x, rbuffer->bo);
> +               r600_context_pipe_state_set(&rctx->ctx, 
> &rctx->ps_const_buffer);
>                break;
>        default:
>                R600_ERR("unsupported %d\n", shader);
>                return;
>        }
> -       if (buffer && buffer->width0 > 0) {
> -               *nconst = buffer->width0 / 16;
> -               ptr = pipe_buffer_map(ctx, buffer, PIPE_TRANSFER_READ, 
> &transfer);
> -               if (ptr == NULL)
> -                       return;
> -               for (int i = 0; i < *nconst; i++, offset += 0x10) {
> -                       rstate[i].nregs = 0;
> -                       r600

Re: [Mesa-dev] Mesa (master): r600g: use constant buffer instead of register for constant

2010-10-06 Thread Keith Whitwell
Hmm, same results on the machine's built-in rs880 (whatever that is...)

Keith

On Wed, Oct 6, 2010 at 9:08 AM, Keith Whitwell  wrote:
> (using the correct mesa3d-dev address)
>
>
> -- Forwarded message ------
> From: Keith Whitwell 
> Date: Wed, Oct 6, 2010 at 9:07 AM
> Subject: Re: Mesa (master): r600g: use constant buffer instead of
> register for constant
> To: mesa-dev@lists.freedesktop.org
> Cc: mesa3d-dev 
>
>
> Hi Jerome,
>
> I was playing with this driver on my new machine (rv710) & was
> impressed with how well it worked.  Unfortunately the next update
> everything went black...
>
> I traced it down to this commit - it basically seems like all the
> vertex-buffer constants (at least) are ending up as zero, so most
> geometry doesn't get transformed properly.  There are a few exceptions
> (like trivial/drawelements) which submit clip-space vertices & don't
> need transformation.
>
> I don't really know anything about how r600 hardware works, so thought
> I'd try & see if anyone else has an idea what's wrong...
>
> Keith
>
> On Thu, Sep 30, 2010 at 6:47 PM, Jerome Glisse
>  wrote:
>> Module: Mesa
>> Branch: master
>> Commit: 153105cfbfd8d6ff30de144605016f6e4f2a1b9e
>> URL:    
>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=153105cfbfd8d6ff30de144605016f6e4f2a1b9e
>>
>> Author: Jerome Glisse 
>> Date:   Thu Sep 30 10:43:26 2010 -0400
>>
>> r600g: use constant buffer instead of register for constant
>>
>> Signed-off-by: Jerome Glisse 
>>
>> ---
>>
>>  src/gallium/drivers/r600/r600_state.c         |   47 
>> +++--
>>  src/gallium/drivers/r600/r600d.h              |    5 +++
>>  src/gallium/winsys/r600/drm/r600_hw_context.c |   34 +++---
>>  src/gallium/winsys/r600/drm/r600d.h           |    5 +++
>>  4 files changed, 36 insertions(+), 55 deletions(-)
>>
>> diff --git a/src/gallium/drivers/r600/r600_state.c 
>> b/src/gallium/drivers/r600/r600_state.c
>> index 23323f1..23c2e59 100644
>> --- a/src/gallium/drivers/r600/r600_state.c
>> +++ b/src/gallium/drivers/r600/r600_state.c
>> @@ -1148,41 +1148,35 @@ static void r600_set_constant_buffer(struct 
>> pipe_context *ctx, uint shader, uint
>>                                        struct pipe_resource *buffer)
>>  {
>>        struct r600_pipe_context *rctx = (struct r600_pipe_context *)ctx;
>> -       struct r600_pipe_state *rstate;
>> -       struct pipe_transfer *transfer;
>> -       unsigned *nconst = NULL;
>> -       u32 *ptr, offset;
>> +       struct r600_resource *rbuffer = (struct r600_resource*)buffer;
>>
>>        switch (shader) {
>>        case PIPE_SHADER_VERTEX:
>> -               rstate = rctx->vs_const;
>> -               nconst = &rctx->vs_nconst;
>> -               offset = R_03_SQ_ALU_CONSTANT0_0 + 0x1000;
>> +               rctx->vs_const_buffer.nregs = 0;
>> +               r600_pipe_state_add_reg(&rctx->vs_const_buffer,
>> +                                       R_028180_ALU_CONST_BUFFER_SIZE_VS_0,
>> +                                       ALIGN_DIVUP(buffer->width0 >> 4, 16),
>> +                                       0x, NULL);
>> +               r600_pipe_state_add_reg(&rctx->vs_const_buffer,
>> +                                       R_028980_ALU_CONST_CACHE_VS_0,
>> +                                       0, 0x, rbuffer->bo);
>> +               r600_context_pipe_state_set(&rctx->ctx, 
>> &rctx->vs_const_buffer);
>>                break;
>>        case PIPE_SHADER_FRAGMENT:
>> -               rstate = rctx->ps_const;
>> -               nconst = &rctx->ps_nconst;
>> -               offset = R_03_SQ_ALU_CONSTANT0_0;
>> +               rctx->ps_const_buffer.nregs = 0;
>> +               r600_pipe_state_add_reg(&rctx->ps_const_buffer,
>> +                                       R_028140_ALU_CONST_BUFFER_SIZE_PS_0,
>> +                                       ALIGN_DIVUP(buffer->width0 >> 4, 16),
>> +                                       0x, NULL);
>> +               r600_pipe_state_add_reg(&rctx->ps_const_buffer,
>> +                                       R_028940_ALU_CONST_CACHE_PS_0,
>> +                                       0, 0x, rbuffer->bo);
>> +               r600_context_pipe_state_set(&rctx->ctx, 
>> &rctx->ps_const_buffer);
>>                break;
>>        default:
>>            

Re: [Mesa-dev] Mesa (master): r600g: use constant buffer instead of register for constant

2010-10-06 Thread Keith Whitwell
Yes, I belive so (at work now, so can't double-check).  I started with
the top of tree last night, which is well past this commit.

Keith

On Wed, 2010-10-06 at 02:12 -0700, Dave Airlie wrote:
> On Wed, Oct 6, 2010 at 6:29 PM, Keith Whitwell  
> wrote:
> > Hmm, same results on the machine's built-in rs880 (whatever that is...)
> >
> 
> You have tested with a commit after this?
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=585e4098aa0cb68a2cfce55ced5c585bd20aba24
> 
> Dave.
> 
> > Keith
> >
> > On Wed, Oct 6, 2010 at 9:08 AM, Keith Whitwell  
> > wrote:
> >> (using the correct mesa3d-dev address)
> >>
> >>
> >> -- Forwarded message --
> >> From: Keith Whitwell 
> >> Date: Wed, Oct 6, 2010 at 9:07 AM
> >> Subject: Re: Mesa (master): r600g: use constant buffer instead of
> >> register for constant
> >> To: mesa-dev@lists.freedesktop.org
> >> Cc: mesa3d-dev 
> >>
> >>
> >> Hi Jerome,
> >>
> >> I was playing with this driver on my new machine (rv710) & was
> >> impressed with how well it worked.  Unfortunately the next update
> >> everything went black...
> >>
> >> I traced it down to this commit - it basically seems like all the
> >> vertex-buffer constants (at least) are ending up as zero, so most
> >> geometry doesn't get transformed properly.  There are a few exceptions
> >> (like trivial/drawelements) which submit clip-space vertices & don't
> >> need transformation.
> >>
> >> I don't really know anything about how r600 hardware works, so thought
> >> I'd try & see if anyone else has an idea what's wrong...
> >>
> >> Keith
> >>
> >> On Thu, Sep 30, 2010 at 6:47 PM, Jerome Glisse
> >>  wrote:
> >>> Module: Mesa
> >>> Branch: master
> >>> Commit: 153105cfbfd8d6ff30de144605016f6e4f2a1b9e
> >>> URL:
> >>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=153105cfbfd8d6ff30de144605016f6e4f2a1b9e
> >>>
> >>> Author: Jerome Glisse 
> >>> Date:   Thu Sep 30 10:43:26 2010 -0400
> >>>
> >>> r600g: use constant buffer instead of register for constant
> >>>
> >>> Signed-off-by: Jerome Glisse 
> >>>
> >>> ---
> >>>
> >>>  src/gallium/drivers/r600/r600_state.c |   47 
> >>> +++--
> >>>  src/gallium/drivers/r600/r600d.h  |5 +++
> >>>  src/gallium/winsys/r600/drm/r600_hw_context.c |   34 +++---
> >>>  src/gallium/winsys/r600/drm/r600d.h   |5 +++
> >>>  4 files changed, 36 insertions(+), 55 deletions(-)
> >>>
> >>> diff --git a/src/gallium/drivers/r600/r600_state.c 
> >>> b/src/gallium/drivers/r600/r600_state.c
> >>> index 23323f1..23c2e59 100644
> >>> --- a/src/gallium/drivers/r600/r600_state.c
> >>> +++ b/src/gallium/drivers/r600/r600_state.c
> >>> @@ -1148,41 +1148,35 @@ static void r600_set_constant_buffer(struct 
> >>> pipe_context *ctx, uint shader, uint
> >>>struct pipe_resource *buffer)
> >>>  {
> >>>struct r600_pipe_context *rctx = (struct r600_pipe_context *)ctx;
> >>> -   struct r600_pipe_state *rstate;
> >>> -   struct pipe_transfer *transfer;
> >>> -   unsigned *nconst = NULL;
> >>> -   u32 *ptr, offset;
> >>> +   struct r600_resource *rbuffer = (struct r600_resource*)buffer;
> >>>
> >>>switch (shader) {
> >>>case PIPE_SHADER_VERTEX:
> >>> -   rstate = rctx->vs_const;
> >>> -   nconst = &rctx->vs_nconst;
> >>> -   offset = R_03_SQ_ALU_CONSTANT0_0 + 0x1000;
> >>> +   rctx->vs_const_buffer.nregs = 0;
> >>> +   r600_pipe_state_add_reg(&rctx->vs_const_buffer,
> >>> +   
> >>> R_028180_ALU_CONST_BUFFER_SIZE_VS_0,
> >>> +   ALIGN_DIVUP(buffer->width0 >> 4, 
> >>> 16),
> >>> +   0x, NULL);
> >>> +   r600_pipe_state_add_reg(&rctx->vs_const_buffer,
> >>> +   R_028980_ALU_CONST_CACHE_VS_0,

Re: [Mesa-dev] Mesa (master): r600g: use constant buffer instead of register for constant

2010-10-07 Thread Keith Whitwell
Hmm, this seems to have been transient issue - a few more package
updates & everything's working again...  Sorry for the noise...

Keith

On Wed, Oct 6, 2010 at 2:25 PM, Jerome Glisse  wrote:
> On Wed, Oct 6, 2010 at 5:19 AM, Keith Whitwell  wrote:
>> Yes, I belive so (at work now, so can't double-check).  I started with
>> the top of tree last night, which is well past this commit.
>>
>> Keith
>
> Strange i am using rv710 without issue, however my rs780 seems to have
> similar issue as one you describe, thought the commit updating the way
> we flush read cache seemed to fixed it. I will try to take a closer look 
> today.
>
> Cheers,
> Jerome
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] Drop the "neutral" tnl module

2010-10-13 Thread Keith Whitwell
Looks good Kristian.  This has been worthy of cleanup for some time...

Keith

2010/10/13 Kristian Høgsberg :
> 2010/10/13 Kristian Høgsberg :
>> Just always check for FLUSH_UPDATE_CURRENT and call Driver.BeginVertices
>> when necessary.  By using the unlikely() macros, this ends up as
>> a 10% performance improvement (for isosurf, anyway) over the old,
>> complicated function pointer swapping.
>> ---
>
> I should say that this also fixes the bug we discussed a few weeks back:
>
>  http://lists.freedesktop.org/archives/mesa-dev/2010-September/002950.html
>
> The root cause, it turns out, was that I forgot to set up the neutral
> tnl module in case of a DRI driver that supports both GL and GLES (ie
> has FEATURE_beginend set) but is used for a GLES2 context.  That would
> have been a more minimal patch, but this gets rid of the pointer
> swapping and is faster.
>
> Kristian
>
>>  src/mesa/main/context.c     |    5 ---
>>  src/mesa/main/mtypes.h      |   29 --
>>  src/mesa/main/vtxfmt.c      |   67 
>> +-
>>  src/mesa/vbo/vbo_exec_api.c |   14 
>>  4 files changed, 9 insertions(+), 106 deletions(-)
>>
>> diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
>> index 41f30ca..bb2dbf4 100644
>> --- a/src/mesa/main/context.c
>> +++ b/src/mesa/main/context.c
>> @@ -949,11 +949,6 @@ _mesa_initialize_context_for_api(struct gl_context *ctx,
>>
>>    switch (ctx->API) {
>>    case API_OPENGL:
>> -      /* Neutral tnl module stuff */
>> -      _mesa_init_exec_vtxfmt( ctx );
>> -      ctx->TnlModule.Current = NULL;
>> -      ctx->TnlModule.SwapCount = 0;
>> -
>>  #if FEATURE_dlist
>>       ctx->Save = _mesa_create_save_table();
>>       if (!ctx->Save) {
>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>> index aace09d..6702032 100644
>> --- a/src/mesa/main/mtypes.h
>> +++ b/src/mesa/main/mtypes.h
>> @@ -2942,32 +2942,6 @@ struct gl_matrix_stack
>>  #include "dd.h"
>>
>>
>> -#define NUM_VERTEX_FORMAT_ENTRIES (sizeof(GLvertexformat) / sizeof(void *))
>> -
>> -/**
>> - * Core Mesa's support for tnl modules:
>> - */
>> -struct gl_tnl_module
>> -{
>> -   /**
>> -    * Vertex format to be lazily swapped into current dispatch.
>> -    */
>> -   const GLvertexformat *Current;
>> -
>> -   /**
>> -    * \name Record of functions swapped out.
>> -    * On restore, only need to swap these functions back in.
>> -    */
>> -   /*...@{*/
>> -   struct {
>> -       _glapi_proc * location;
>> -       _glapi_proc function;
>> -   } Swapped[NUM_VERTEX_FORMAT_ENTRIES];
>> -   GLuint SwapCount;
>> -   /*...@}*/
>> -};
>> -
>> -
>>  /**
>>  * Display list flags.
>>  * Strictly this is a tnl-private concept, but it doesn't seem
>> @@ -3231,9 +3205,6 @@ struct gl_context
>>     */
>>    GLboolean mvp_with_dp4;
>>
>> -   /** Core tnl module support */
>> -   struct gl_tnl_module TnlModule;
>> -
>>    /**
>>     * \name Hooks for module contexts.
>>     *
>> diff --git a/src/mesa/main/vtxfmt.c b/src/mesa/main/vtxfmt.c
>> index 284e777..887c7d9 100644
>> --- a/src/mesa/main/vtxfmt.c
>> +++ b/src/mesa/main/vtxfmt.c
>> @@ -34,51 +34,11 @@
>>  #include "vtxfmt.h"
>>  #include "eval.h"
>>  #include "dlist.h"
>> +#include "main/dispatch.h"
>>
>>
>>  #if FEATURE_beginend
>>
>> -
>> -/* The neutral vertex format.  This wraps all tnl module functions,
>> - * verifying that the currently-installed module is valid and then
>> - * installing the function pointers in a lazy fashion.  It records the
>> - * function pointers that have been swapped out, which allows a fast
>> - * restoration of the neutral module in almost all cases -- a typical
>> - * app might only require 4-6 functions to be modified from the neutral
>> - * baseline, and only restoring these is certainly preferable to doing
>> - * the entire module's 60 or so function pointers.
>> - */
>> -
>> -#define PRE_LOOPBACK( FUNC )                                           \
>> -{                                                                      \
>> -   GET_CURRENT_CONTEXT(ctx);                                           \
>> -   struct gl_tnl_module * const tnl = &(ctx->TnlModule);               \
>> -   const int tmp_offset = _gloffset_ ## FUNC ;                         \
>> -                                                                       \
>> -   ASSERT( tnl->Current );                                             \
>> -   ASSERT( tnl->SwapCount < NUM_VERTEX_FORMAT_ENTRIES );               \
>> -   ASSERT( tmp_offset >= 0 );                                          \
>> -                                                                        \
>> -   if (tnl->SwapCount == 0)                                             \
>> -      ctx->Driver.BeginVertices( ctx );                                 \
>> -                                                                        \
>> -   /* Save the swapped function's dispatch entry so it can be */        \
>> -   /* restored later. */                      

Re: [Mesa-dev] Demos (master): mipmap_tunnel: new test to examine mipmap filtering

2010-10-14 Thread Keith Whitwell
Isn't this a quite similar concept to tests/texfilt?

Keith


On Thu, Oct 14, 2010 at 3:55 PM, Brian Paul
 wrote:
> Module: Demos
> Branch: master
> Commit: 4d981d192bcff29fd85c794415148988518c6eae
> URL:    
> http://cgit.freedesktop.org/mesa/demos/commit/?id=4d981d192bcff29fd85c794415148988518c6eae
>
> Author: Brian Paul 
> Date:   Thu Oct 14 08:49:01 2010 -0600
>
> mipmap_tunnel: new test to examine mipmap filtering
>
> ---
>
>  src/tests/Makefile.am     |    1 +
>  src/tests/mipmap_tunnel.c |  250 
> +
>  2 files changed, 251 insertions(+), 0 deletions(-)
>
> diff --git a/src/tests/Makefile.am b/src/tests/Makefile.am
> index 03e3c97..cd7424b 100644
> --- a/src/tests/Makefile.am
> +++ b/src/tests/Makefile.am
> @@ -107,6 +107,7 @@ noinst_PROGRAMS = \
>        mipmap_comp_tests \
>        mipmap_limits \
>        mipmap_view \
> +       mipmap_tunnel \
>        multipal \
>        multitexarray \
>        multiwindow \
> diff --git a/src/tests/mipmap_tunnel.c b/src/tests/mipmap_tunnel.c
> new file mode 100644
> index 000..05c4e9e
> --- /dev/null
> +++ b/src/tests/mipmap_tunnel.c
> @@ -0,0 +1,250 @@
> +/**
> + * Display trilinear mipmap filtering quality.
> + * We look down a long tunnel shape which has a mipmapped texture
> + * applied to it.  Ideally, the transition from one mipmap level to
> + * another should be nice and regular/circular.
> + * This sort of test is frequently seen in online articles about GPU
> + * texture filtering.
> + *
> + * Brian Paul
> + * 13 Oct 2010
> + */
> +
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +
> +static GLfloat LodBias = 0.0;
> +static GLboolean NearestFilter = GL_FALSE;
> +static GLfloat Zpos = -10.0, Zrot = 0.0;
> +static GLuint TexObj;
> +
> +#define TEX_SIZE 1024
> +
> +
> +/** Make a solid-colored texture image */
> +static void
> +MakeImage(int level, int width, int height, const GLubyte color[4])
> +{
> +   const int makeStripes = 0;
> +   GLubyte img[TEX_SIZE * TEX_SIZE * 3];
> +   int i, j;
> +   for (i = 0; i < height; i++) {
> +      for (j = 0; j < width; j++) {
> +         int k = (i * width + j) * 3;
> +         int p = (i / 8) & makeStripes;
> +         if (p == 0) {
> +            img[k + 0] = color[0];
> +            img[k + 1] = color[1];
> +            img[k + 2] = color[2];
> +         }
> +         else {
> +            img[k + 0] = 0;
> +            img[k + 1] = 0;
> +            img[k + 2] = 0;
> +         }
> +      }
> +   }
> +
> +   glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
> +   glTexImage2D(GL_TEXTURE_2D, level, GL_RGB, width, height, 0,
> +                GL_RGB, GL_UNSIGNED_BYTE, img);
> +}
> +
> +
> +/** Make a mipmap in which each level is a different, solid color */
> +static void
> +MakeMipmap(void)
> +{
> +   static const GLubyte colors[12][3] = {
> +      {255, 0, 0},
> +      {0, 255, 0},
> +      {0, 0, 255},
> +      {0, 255, 255},
> +      {255, 0, 255},
> +      {255, 255, 0},
> +      {255, 0, 0},
> +      {0, 255, 0},
> +      {0, 0, 255},
> +      {0, 255, 255},
> +      {255, 0, 255},
> +      {255, 255, 0},
> +   };
> +   int i, sz = TEX_SIZE;
> +
> +   for (i = 0; sz > 0; i++) {
> +      MakeImage(i, sz, sz, colors[i]);
> +      printf("Level %d size: %d x %d\n", i, sz, sz);
> +      sz /= 2;
> +   }
> +}
> +
> +
> +static void
> +Init(void)
> +{
> +   glClearColor(.5, .5, .5, .5);
> +
> +   glGenTextures(1, &TexObj);
> +   glBindTexture(GL_TEXTURE_2D, TexObj);
> +   MakeMipmap();
> +
> +   glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
> +   glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
> +   glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);
> +
> +   printf("GL_RENDERER = %s\n", (char *) glGetString(GL_RENDERER));
> +   printf("GL_VERSION = %s\n", (char *) glGetString(GL_VERSION));
> +}
> +
> +
> +static void
> +DrawTunnel(void)
> +{
> +   const float radius = 10.0, height = 500.0;
> +   const int slices = 24, stacks = 52;
> +   const float bias = 0.995;
> +   GLUquadric *q = gluNewQuadric();
> +
> +   glPushMatrix();
> +      glRotatef(180, 1, 0, 0);
> +      glEnable(GL_TEXTURE_2D);
> +      gluQuadricTexture(q, GL_TRUE);
> +      gluCylinder(q, radius, radius, height, slices, stacks);
> +
> +      glDisable(GL_TEXTURE_2D);
> +      glColor3f(0, 0, 0);
> +      gluQuadricDrawStyle(q, GLU_LINE);
> +      gluCylinder(q, bias*radius, bias*radius, height/4, slices, stacks/4);
> +   glPopMatrix();
> +
> +   gluDeleteQuadric(q);
> +}
> +
> +
> +static void
> +PrintString(const char *s)
> +{
> +   while (*s) {
> +      glutBitmapCharacter(GLUT_BITMAP_8_BY_13, (int) *s);
> +      s++;
> +   }
> +}
> +
> +
> +static void
> +Display(void)
> +{
> +   char str[100];
> +
> +   glBindTexture(GL_TEXTURE_2D, TexObj);
> +
> +   if (NearestFilter) {
> +      glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
> +      glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,
> +                      GL_NEAR

[Mesa-dev] [PATCH 1/3] r600/drm: fix segfaults in winsys create failure path

2010-10-14 Thread Keith Whitwell
Would try to destroy radeon->cman, radeon->kman both which were still
NULL.
---
 src/gallium/winsys/r600/drm/r600_drm.c |   12 +---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/gallium/winsys/r600/drm/r600_drm.c 
b/src/gallium/winsys/r600/drm/r600_drm.c
index 5f175a4..4916843 100644
--- a/src/gallium/winsys/r600/drm/r600_drm.c
+++ b/src/gallium/winsys/r600/drm/r600_drm.c
@@ -179,9 +179,15 @@ struct radeon *radeon_decref(struct radeon *radeon)
return NULL;
}
 
-   radeon->cman->destroy(radeon->cman);
-   radeon->kman->destroy(radeon->kman);
-   drmClose(radeon->fd);
+if (radeon->cman)
+   radeon->cman->destroy(radeon->cman);
+
+if (radeon->kman)
+   radeon->kman->destroy(radeon->kman);
+
+if (radeon->fd >= 0)
+   drmClose(radeon->fd);
+
free(radeon);
return NULL;
 }
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] r600g: emit hardware linewidth

2010-10-14 Thread Keith Whitwell
Tested with demos/pixeltest - line rasterization doesn't seem to be
set up for GL conventions yet, but at least width is respected now.
---
 src/gallium/drivers/r600/r600_state.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index 7b0aaef..2c0a200 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -499,7 +499,10 @@ static void *r600_create_rs_state(struct pipe_context *ctx,
tmp = (unsigned)(state->point_size * 8.0);
r600_pipe_state_add_reg(rstate, R_028A00_PA_SU_POINT_SIZE, 
S_028A00_HEIGHT(tmp) | S_028A00_WIDTH(tmp), 0x, NULL);
r600_pipe_state_add_reg(rstate, R_028A04_PA_SU_POINT_MINMAX, 
0x8000, 0x, NULL);
-   r600_pipe_state_add_reg(rstate, R_028A08_PA_SU_LINE_CNTL, 0x0008, 
0x, NULL);
+
+   tmp = (unsigned)(state->line_width * 8.0);
+   r600_pipe_state_add_reg(rstate, R_028A08_PA_SU_LINE_CNTL, 
S_028A08_WIDTH(tmp), 0x, NULL);
+
r600_pipe_state_add_reg(rstate, R_028A0C_PA_SC_LINE_STIPPLE, 
0x0005, 0x, NULL);
r600_pipe_state_add_reg(rstate, R_028A48_PA_SC_MPASS_PS_CNTL, 
0x, 0x, NULL);
r600_pipe_state_add_reg(rstate, R_028C00_PA_SC_LINE_CNTL, 0x0400, 
0x, NULL);
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] r600g: handle unbind of constant buffers

2010-10-14 Thread Keith Whitwell
Statetrackers can unbind a constant buffer slot by calling

   pipe->set_constant_buffer(pipe, shader, slot, NULL)

The driver should unbind the buffer and potentially allow its storage
to be released.
---
 src/gallium/drivers/r600/r600_state.c |   20 
 1 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index 2c0a200..b5ae187 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -1160,28 +1160,40 @@ static void r600_set_constant_buffer(struct 
pipe_context *ctx, uint shader, uint
 {
struct r600_pipe_context *rctx = (struct r600_pipe_context *)ctx;
struct r600_resource *rbuffer = (struct r600_resource*)buffer;
+   unsigned width = 0;
+   unsigned offset = 0;
+   struct r600_bo *bo = NULL;
+   
+   /* Note that the state tracker can unbind constant buffers by
+* passing NULL here.
+*/
+   if (buffer != NULL) {
+   width = buffer->width0;
+   offset = r600_bo_offset(rbuffer->bo);
+   bo = rbuffer->bo;
+   }
 
switch (shader) {
case PIPE_SHADER_VERTEX:
rctx->vs_const_buffer.nregs = 0;
r600_pipe_state_add_reg(&rctx->vs_const_buffer,
R_028180_ALU_CONST_BUFFER_SIZE_VS_0,
-   ALIGN_DIVUP(buffer->width0 >> 4, 16),
+   ALIGN_DIVUP(width >> 4, 16),
0x, NULL);
r600_pipe_state_add_reg(&rctx->vs_const_buffer,
R_028980_ALU_CONST_CACHE_VS_0,
-   r600_bo_offset(rbuffer->bo) >> 8, 
0x, rbuffer->bo);
+   offset >> 8, 0x, bo);
r600_context_pipe_state_set(&rctx->ctx, &rctx->vs_const_buffer);
break;
case PIPE_SHADER_FRAGMENT:
rctx->ps_const_buffer.nregs = 0;
r600_pipe_state_add_reg(&rctx->ps_const_buffer,
R_028140_ALU_CONST_BUFFER_SIZE_PS_0,
-   ALIGN_DIVUP(buffer->width0 >> 4, 16),
+   ALIGN_DIVUP(width >> 4, 16),
0x, NULL);
r600_pipe_state_add_reg(&rctx->ps_const_buffer,
R_028940_ALU_CONST_CACHE_PS_0,
-   r600_bo_offset(rbuffer->bo) >> 8, 
0x, rbuffer->bo);
+   offset >> 8, 0x, bo);
r600_context_pipe_state_set(&rctx->ctx, &rctx->ps_const_buffer);
break;
default:
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] r600g: handle unbind of constant buffers

2010-10-14 Thread Keith Whitwell
Dave,

Sorry for being confusing, but this patch doesn't actually work - this
is what I think should be happening, but at the moment I have to
return early to avoid segfaults on unbinding constant buffers, ie:

+   if (buffer == NULL) {
+return;
+}
+else {
width = buffer->width0;
offset = r600_bo_offset(rbuffer->bo);
bo = rbuffer->bo;


Otherwise I end up with a crash here:

#0  r600_context_block_emit_dirty (ctx=0x837cc20, draw=0xb1c4)
at src/gallium/winsys/r600/drm/r600_priv.h:153
153 r600_context_bo_reloc(ctx,
(gdb) list
148
149 for (int j = 0; j < block->nreg; j++) {
150 if (block->pm4_bo_index[j]) {
151 /* find relocation */
152 id = block->pm4_bo_index[j];
153 r600_context_bo_reloc(ctx,
154
&block->pm4[block->reloc[id].bo_pm4_index],
155 block->reloc[id].bo);
156 r600_context_bo_flush(ctx,
157 block->reloc[id].flush_flags,


I haven't really figured out the state emit mechanism in this driver
yet...  Have you got any guidance what needs to be done here?

Keith


On Thu, Oct 14, 2010 at 4:42 PM, Keith Whitwell  wrote:
> Statetrackers can unbind a constant buffer slot by calling
>
>   pipe->set_constant_buffer(pipe, shader, slot, NULL)
>
> The driver should unbind the buffer and potentially allow its storage
> to be released.
> ---
>  src/gallium/drivers/r600/r600_state.c |   20 
>  1 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_state.c 
> b/src/gallium/drivers/r600/r600_state.c
> index 2c0a200..b5ae187 100644
> --- a/src/gallium/drivers/r600/r600_state.c
> +++ b/src/gallium/drivers/r600/r600_state.c
> @@ -1160,28 +1160,40 @@ static void r600_set_constant_buffer(struct 
> pipe_context *ctx, uint shader, uint
>  {
>        struct r600_pipe_context *rctx = (struct r600_pipe_context *)ctx;
>        struct r600_resource *rbuffer = (struct r600_resource*)buffer;
> +       unsigned width = 0;
> +       unsigned offset = 0;
> +       struct r600_bo *bo = NULL;
> +
> +       /* Note that the state tracker can unbind constant buffers by
> +        * passing NULL here.
> +        */
> +       if (buffer != NULL) {
> +               width = buffer->width0;
> +               offset = r600_bo_offset(rbuffer->bo);
> +               bo = rbuffer->bo;
> +       }
>
>        switch (shader) {
>        case PIPE_SHADER_VERTEX:
>                rctx->vs_const_buffer.nregs = 0;
>                r600_pipe_state_add_reg(&rctx->vs_const_buffer,
>                                        R_028180_ALU_CONST_BUFFER_SIZE_VS_0,
> -                                       ALIGN_DIVUP(buffer->width0 >> 4, 16),
> +                                       ALIGN_DIVUP(width >> 4, 16),
>                                        0x, NULL);
>                r600_pipe_state_add_reg(&rctx->vs_const_buffer,
>                                        R_028980_ALU_CONST_CACHE_VS_0,
> -                                       r600_bo_offset(rbuffer->bo) >> 8, 
> 0x, rbuffer->bo);
> +                                       offset >> 8, 0x, bo);
>                r600_context_pipe_state_set(&rctx->ctx, 
> &rctx->vs_const_buffer);
>                break;
>        case PIPE_SHADER_FRAGMENT:
>                rctx->ps_const_buffer.nregs = 0;
>                r600_pipe_state_add_reg(&rctx->ps_const_buffer,
>                                        R_028140_ALU_CONST_BUFFER_SIZE_PS_0,
> -                                       ALIGN_DIVUP(buffer->width0 >> 4, 16),
> +                                       ALIGN_DIVUP(width >> 4, 16),
>                                        0x, NULL);
>                r600_pipe_state_add_reg(&rctx->ps_const_buffer,
>                                        R_028940_ALU_CONST_CACHE_PS_0,
> -                                       r600_bo_offset(rbuffer->bo) >> 8, 
> 0x, rbuffer->bo);
> +                                       offset >> 8, 0x, bo);
>                r600_context_pipe_state_set(&rctx->ctx, 
> &rctx->ps_const_buffer);
>                break;
>        default:
> --
> 1.7.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] r600g: handle absolute modifier in shader translator

2010-10-14 Thread Keith Whitwell
This was being classed as unsupported in one place but used in others.
Enabling it seems to work fine.
---
 src/gallium/drivers/r600/r600_shader.c |9 -
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 141adcc..b53d478 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -389,11 +389,9 @@ static int tgsi_is_supported(struct r600_shader_ctx *ctx)
}
 #endif
for (j = 0; j < i->Instruction.NumSrcRegs; j++) {
-   if (i->Src[j].Register.Dimension ||
-   i->Src[j].Register.Absolute) {
-   R600_ERR("unsupported src %d (dimension %d|absolute 
%d)\n", j,
-i->Src[j].Register.Dimension,
-i->Src[j].Register.Absolute);
+   if (i->Src[j].Register.Dimension) {
+   R600_ERR("unsupported src %d (dimension %d)\n", j,
+i->Src[j].Register.Dimension);
return -EINVAL;
}
}
@@ -760,6 +758,7 @@ static int tgsi_src(struct r600_shader_ctx *ctx,
if (tgsi_src->Register.Indirect)
r600_src->rel = V_SQ_REL_RELATIVE;
r600_src->neg = tgsi_src->Register.Negate;
+   r600_src->abs = tgsi_src->Register.Absolute;
r600_src->sel += ctx->file_offset[tgsi_src->Register.File];
return 0;
 }
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Proposal for a long-term shader compiler (and IR) architecture

2010-10-18 Thread Keith Whitwell
On Mon, Oct 18, 2010 at 9:18 AM, Jerome Glisse  wrote:
> On Fri, Oct 15, 2010 at 7:44 PM, John Kessenich  wrote:
>> Hi,
>> LunarG has decided to work on an open source, long-term, highly-functional,
>> and modular shader and kernel compiler stack. Attached is our high-level
>> proposal for this compiler architecture (LunarGLASS).  We would like to
>> solicit feedback from the open source community on doing this.
>> I have read several posts here where it seems the time has come for
>> something like this, and in that spirit, I hope this is consistent with the
>> desire and direction many contributors to this list have already alluded to.
>> Perhaps the biggest point of the proposal is to standardize on LLVM as an
>> intermediate representation.  This is actually done at two levels within the
>> proposal; one at a high-level IR close to the source language and one at a
>> low-level IR close to the target architecture.  The full picture is in the
>> attached document.
>> Based on feedback to this proposal, our next step is to more precisely
>> define the two forms of LLVM IR.
>> Please let me know if you have any trouble reading the attached, or any
>> questions, or any feedback regarding the proposal.
>> Thanks,
>> JohnK
>
>
> Just a quick reply (i won't have carefully read through this proposition 
> before
> couple weeks) last time i check LLVM didn't seemed to fit the bill for GPU,
> newer GPU can be seen as close to scalar but not completely, there are
> restriction on instruction packing and the amount of data computation
> unit of gpu can access per cycle, also register allocation is different
> from normal CPU, you don't wan to do register peeling on GPU. So from
> my POV instruction scheduling & packing and register allocation are
> interlace process (where you store variable impact instruction packing).
> Also on newer gpu it makes sense to use a mixed scalar/vector representation
> to preserve things like dot product. Last loop, jump, function have kind
> of unsual restriction unlike any CPU (thought i haven't broad CPU knowledge)
>
> Bottom line is i don't think LLVM is anywhere near what would help us.


I think this is the big question mark with this proposal -- basically
can it be done?

I believe John feels the answer to that is yes, it can, with some
work.  From my point of view, I think I need to actually see it - but
it sounds like this is what John is saying they're going to do.

At a high level, LLVM is very compelling - there's a lot of work going
on for it, a lot of people enhancing it, etc.  Now, if it's possible
to leverage that for shader compilation, I think that's very
interesting.

So basically I think it's necessary to figure out what would
constitute evidence that LLVM is capable of doing the job, and make
getting to that point a priority.

If it can't be done, we'll find out quickly, if it can then we can
stop debating whether or not it's possible.

Keith
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/6] st/mesa: Reset the index buffer before destroying the pipe context.

2010-11-01 Thread Keith Whitwell
Tilman,

This looks good - it makes sense to also reset the constant buffers,
etc, at the same point...

Keith

On Sun, Oct 31, 2010 at 4:38 PM, Tilman Sauerbeck  wrote:
> Signed-off-by: Tilman Sauerbeck 
> ---
>  src/mesa/state_tracker/st_context.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_context.c 
> b/src/mesa/state_tracker/st_context.c
> index 75fd695..b5ea6d0 100644
> --- a/src/mesa/state_tracker/st_context.c
> +++ b/src/mesa/state_tracker/st_context.c
> @@ -236,6 +236,8 @@ void st_destroy_context( struct st_context *st )
>    }
>    pipe_surface_reference(&st->state.framebuffer.zsbuf, NULL);
>
> +   pipe->set_index_buffer(pipe, NULL);
> +
>    _mesa_delete_program_cache(st->ctx, st->pixel_xfer.cache);
>
>    _vbo_DestroyContext(st->ctx);
> --
> 1.7.3.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vbo: Avoid unnecessary copy to/from current in vertex format upgrade.

2010-11-02 Thread Keith Whitwell
Francisco,

This looks good - my only comment is that there seem to be two distinct
changes in this patch -- the modification to VBO behaviour when adding a
new attribute being one, but the changes to vbo_exec_draw.c seem to be
an unrelated cleanup.  Is that correct?

Ordinarily I wouldn't bother pointing it out, but VBO is fairly complex
code & being able to isolate any regression to the smallest change
possible would be preferable -- would it be possible to split the patch
into two pieces & commit separately?

Also regarding testing, it makes sense to run through all the mesa-progs
demos/ and samples/ directory, as they tend to have a good selection of
odd cases for this code.

Keith


On Mon, 2010-11-01 at 13:06 -0700, Francisco Jerez wrote:
> Rebuilding the vertex format from scratch every time we see a new
> vertex attribute is rather costly, new attributes can be appended at
> the end avoiding a copy to current and then back again, and the full
> attr pointer recalculation.
> 
> In the not so likely case of an already existing attribute having its
> size increased the old behavior is preserved, this could be optimized
> more, not sure if it's worth it.
> 
> It's a modest improvement in FlightGear (that game punishes the VBO
> module pretty hard in general, framerate goes from some 46 FPS to 50
> FPS with the nouveau classic driver).
> ---
> I've run piglit before and after and apparently there are no
> regressions, Tested-by's on different hardware would be appreciated
> though.
> 
>  src/mesa/vbo/vbo_exec_api.c  |   98
> +-
>  src/mesa/vbo/vbo_exec_draw.c |   13 ++---
>  2 files changed, 63 insertions(+), 48 deletions(-)
> 
> diff --git a/src/mesa/vbo/vbo_exec_api.c b/src/mesa/vbo/vbo_exec_api.c
> index 4988c51..9b2d59f 100644
> --- a/src/mesa/vbo/vbo_exec_api.c
> +++ b/src/mesa/vbo/vbo_exec_api.c
> @@ -220,8 +220,9 @@ static void vbo_exec_wrap_upgrade_vertex( struct
> vbo_exec_context *exec,
> struct gl_context *ctx = exec->ctx;
> struct vbo_context *vbo = vbo_context(ctx);
> GLint lastcount = exec->vtx.vert_count;
> -   GLfloat *tmp;
> -   GLuint oldsz;
> +   GLfloat *old_attrptr[VBO_ATTRIB_MAX];
> +   GLuint old_vtx_size = exec->vtx.vertex_size;
> +   GLuint oldsz = exec->vtx.attrsz[attr];
> GLuint i;
>  
> /* Run pipeline on current vertices, copy wrapped vertices
> @@ -229,86 +230,103 @@ static void
> vbo_exec_wrap_upgrade_vertex( struct vbo_exec_context *exec,
>  */
> vbo_exec_wrap_buffers( exec );
>  
> +   if (unlikely(exec->vtx.copied.nr)) {
> +  /* We're in the middle of a primitive, keep the old vertex
> +   * format around to be able to translate the copied vertices to
> +   * the new format.
> +   */
> +  memcpy(old_attrptr, exec->vtx.attrptr, sizeof(old_attrptr));
> +   }
>  
> -   /* Do a COPY_TO_CURRENT to ensure back-copying works for the case
> -* when the attribute already exists in the vertex and is having
> -* its size increased.  
> -*/
> -   vbo_exec_copy_to_current( exec );
> -
> +   if (unlikely(oldsz)) {
> +  /* Do a COPY_TO_CURRENT to ensure back-copying works for the
> +   * case when the attribute already exists in the vertex and is
> +   * having its size increased.
> +   */
> +  vbo_exec_copy_to_current( exec );
> +   }
>  
> /* Heuristic: Attempt to isolate attributes received outside
>  * begin/end so that they don't bloat the vertices.
>  */
> if (ctx->Driver.CurrentExecPrimitive == PRIM_OUTSIDE_BEGIN_END &&
> -   exec->vtx.attrsz[attr] == 0 && 
> -   lastcount > 8 &&
> -   exec->vtx.vertex_size) {
> +   !oldsz && lastcount > 8 && exec->vtx.vertex_size) {
> +  vbo_exec_copy_to_current( exec );
>reset_attrfv( exec );
> }
>  
> /* Fix up sizes:
>  */
> -   oldsz = exec->vtx.attrsz[attr];
> exec->vtx.attrsz[attr] = newsz;
> -
> exec->vtx.vertex_size += newsz - oldsz;
> exec->vtx.max_vert = ((VBO_VERT_BUFFER_SIZE -
> exec->vtx.buffer_used) / 
>   (exec->vtx.vertex_size * sizeof(GLfloat)));
> exec->vtx.vert_count = 0;
> exec->vtx.buffer_ptr = exec->vtx.buffer_map;
> -   
>  
> -   /* Recalculate all the attrptr[] values
> -*/
> -   for (i = 0, tmp = exec->vtx.vertex ; i < VBO_ATTRIB_MAX ; i++) {
> -  if (exec->vtx.attrsz[i]) {
> -exec->vtx.attrptr[i] = tmp;
> -tmp += exec->vtx.attrsz[i];
> +   if (unlikely(oldsz)) {
> +  /* Size changed, recalculate all the attrptr[] values
> +   */
> +  GLfloat *tmp = exec->vtx.vertex;
> +
> +  for (i = 0 ; i < VBO_ATTRIB_MAX ; i++) {
> +if (exec->vtx.attrsz[i]) {
> +   exec->vtx.attrptr[i] = tmp;
> +   tmp += exec->vtx.attrsz[i];
> +}
> +else
> +   exec->vtx.attrptr[i] = NULL; /* will not be dereferenced
> */
>}
> -  else 
> -exec->vtx.attrptr[i] = NULL; /* will not be dereferenced */
> -   }
>  
> -   /* Copy fro

Re: [Mesa-dev] [PATCH] vbo: Avoid unnecessary copy to/from current in vertex format upgrade.

2010-11-02 Thread Keith Whitwell
On Tue, 2010-11-02 at 11:21 -0700, Francisco Jerez wrote:
> Keith Whitwell  writes:
> 
> > Francisco,
> >
> > This looks good - my only comment is that there seem to be two distinct
> > changes in this patch -- the modification to VBO behaviour when adding a
> > new attribute being one, but the changes to vbo_exec_draw.c seem to be
> > an unrelated cleanup.  Is that correct?
> >
> > Ordinarily I wouldn't bother pointing it out, but VBO is fairly complex
> > code & being able to isolate any regression to the smallest change
> > possible would be preferable -- would it be possible to split the patch
> > into two pieces & commit separately?
> >
> Yes, I can split it in two pieces if you want, but that hunk isn't an
> unrelated cleanup, the bulk of this patch deals with fixing several
> places that assume that vertex attributes are stored in VBO_ATTRIB
> order, vbo_exec_bind_arrays() was one of them.

OK - if they're bound together then don't worry about splitting - it
wasn't easy to tell by looking at the patch.  The change looks good to
me.

Keith



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] r600g: propagate usage flags in texture transfers

2010-11-02 Thread Keith Whitwell
---
 src/gallium/drivers/r600/r600_texture.c |   27 ++-
 1 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index 4ebd5b7..7222b43 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include "state_tracker/drm_driver.h"
+#include "pipebuffer/pb_buffer.h"
 #include "r600_pipe.h"
 #include "r600_resource.h"
 #include "r600_state_inlines.h"
@@ -537,6 +538,7 @@ void* r600_texture_transfer_map(struct pipe_context *ctx,
enum pipe_format format = transfer->resource->format;
struct radeon *radeon = (struct radeon *)ctx->screen->winsys;
unsigned offset = 0;
+   unsigned usage = 0;
char *map;
 
if (rtransfer->linear_texture) {
@@ -553,7 +555,30 @@ void* r600_texture_transfer_map(struct pipe_context *ctx,
transfer->box.y / util_format_get_blockheight(format) * 
transfer->stride +
transfer->box.x / util_format_get_blockwidth(format) * 
util_format_get_blocksize(format);
}
-   map = r600_bo_map(radeon, bo, 0, ctx);
+
+   if (transfer->usage & PIPE_TRANSFER_WRITE) {
+   usage |= PB_USAGE_CPU_WRITE;
+
+   if (transfer->usage & PIPE_TRANSFER_DISCARD) {
+   }
+
+   if (transfer->usage & PIPE_TRANSFER_FLUSH_EXPLICIT) {
+   }
+   }
+
+   if (transfer->usage & PIPE_TRANSFER_READ) {
+   usage |= PB_USAGE_CPU_READ;
+   }
+
+   if (transfer->usage & PIPE_TRANSFER_DONTBLOCK) {
+   usage |= PB_USAGE_DONTBLOCK;
+   }
+
+   if (transfer->usage & PIPE_TRANSFER_UNSYNCHRONIZED) {
+   usage |= PB_USAGE_UNSYNCHRONIZED;
+   }
+
+   map = r600_bo_map(radeon, bo, usage, ctx);
if (!map) {
return NULL;
}
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] r600g: propogate resource usage flags to winsys, use to choose bo domains

2010-11-02 Thread Keith Whitwell
This opens the question of what interface the winsys layer should
really have for talking about these concepts.

For now I'm using the existing gallium resource usage concept, but
there is no reason not use terms closer to what the hardware
understands - eg. the domains themselves.
---
 src/gallium/drivers/r600/r600.h   |3 ++-
 src/gallium/drivers/r600/r600_buffer.c|7 ---
 src/gallium/drivers/r600/r600_shader.c|2 +-
 src/gallium/drivers/r600/r600_texture.c   |2 +-
 src/gallium/winsys/r600/drm/r600_bo.c |   20 +---
 src/gallium/winsys/r600/drm/r600_hw_context.c |   13 +
 src/gallium/winsys/r600/drm/r600_priv.h   |1 +
 7 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h
index 62d9832..5ec607b 100644
--- a/src/gallium/drivers/r600/r600.h
+++ b/src/gallium/drivers/r600/r600.h
@@ -112,7 +112,8 @@ struct r600_tiling_info *r600_get_tiling_info(struct radeon 
*radeon);
 /* r600_bo.c */
 struct r600_bo;
 struct r600_bo *r600_bo(struct radeon *radeon,
- unsigned size, unsigned alignment, unsigned 
usage);
+unsigned size, unsigned alignment,
+unsigned binding, unsigned usage);
 struct r600_bo *r600_bo_handle(struct radeon *radeon,
   unsigned handle, unsigned *array_mode);
 void *r600_bo_map(struct radeon *radeon, struct r600_bo *bo, unsigned usage, 
void *ctx);
diff --git a/src/gallium/drivers/r600/r600_buffer.c 
b/src/gallium/drivers/r600/r600_buffer.c
index 455aa2e..3c45d78 100644
--- a/src/gallium/drivers/r600/r600_buffer.c
+++ b/src/gallium/drivers/r600/r600_buffer.c
@@ -86,7 +86,7 @@ struct pipe_resource *r600_buffer_create(struct pipe_screen 
*screen,
rbuffer->r.base.vtbl = &r600_buffer_vtbl;
rbuffer->r.size = rbuffer->r.base.b.width0;
rbuffer->r.domain = r600_domain_from_usage(rbuffer->r.base.b.bind);
-   bo = r600_bo((struct radeon*)screen->winsys, rbuffer->r.base.b.width0, 
alignment, rbuffer->r.base.b.bind);
+   bo = r600_bo((struct radeon*)screen->winsys, rbuffer->r.base.b.width0, 
alignment, rbuffer->r.base.b.bind, rbuffer->r.base.b.usage);
if (bo == NULL) {
FREE(rbuffer);
return NULL;
@@ -156,8 +156,9 @@ static void *r600_buffer_transfer_map(struct pipe_context 
*pipe,
r600_bo_reference((struct radeon*)pipe->winsys, 
&rbuffer->r.bo, NULL);
rbuffer->num_ranges = 0;
rbuffer->r.bo = r600_bo((struct 
radeon*)pipe->winsys,
-
rbuffer->r.base.b.width0, 0,
-
rbuffer->r.base.b.bind);
+
rbuffer->r.base.b.width0, 0,
+rbuffer->r.base.b.bind,
+
rbuffer->r.base.b.usage);
break;
}
}
diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 4106587..1a0b35d 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -218,7 +218,7 @@ static int r600_pipe_shader(struct pipe_context *ctx, 
struct r600_pipe_shader *s
 
/* copy new shader */
if (shader->bo == NULL) {
-   shader->bo = r600_bo(rctx->radeon, rshader->bc.ndw * 4, 4096, 
0);
+   shader->bo = r600_bo(rctx->radeon, rshader->bc.ndw * 4, 4096, 
0, 0);
if (shader->bo == NULL) {
return -ENOMEM;
}
diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index 7222b43..9a52cfa 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -294,7 +294,7 @@ r600_texture_create_object(struct pipe_screen *screen,
resource->size = rtex->size;
 
if (!resource->bo) {
-   resource->bo = r600_bo(radeon, rtex->size, 4096, 0);
+   resource->bo = r600_bo(radeon, rtex->size, 4096, base->bind, 
base->usage);
if (!resource->bo) {
FREE(rtex);
return NULL;
diff --git a/src/gallium/winsys/r600/drm/r600_bo.c 
b/src/gallium/winsys/r600/drm/r600_bo.c
index 7d54ff1..de46d50 100644
--- a/src/gallium/winsys/r600/drm/r600_bo.c
+++ b/src/gallium/winsys/r600/drm/r600_bo.c
@@ -29,23 +29,37 @@
 #include "radeon_drm.h"
 #include "r600_priv.h"
 #include "r600d.h"
+#include "radeon_drm.h"
 
 struct r600_bo *r600_bo(struct radeon *radeon,
- unsigned size, unsigned alignment, unsigned 
usage)
+   unsigne

[Mesa-dev] [PATCH 3/5] r600g: use a buffer in GTT as intermediate on texture up and downloads

2010-11-02 Thread Keith Whitwell
Generalize the existing tiled_buffer path in texture transfers for use
in some non-tiled up and downloads.

Use a staging buffer, which the winsys will restrict to GTT memory.

GTT buffers have the major advantage when they are mapped, they are
cachable, which is a very nice property for downloads, usually the CPU
will want to do look at the data it downloaded.
---
 src/gallium/drivers/r600/r600_resource.h |2 +-
 src/gallium/drivers/r600/r600_texture.c  |   85 ++
 2 files changed, 64 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_resource.h 
b/src/gallium/drivers/r600/r600_resource.h
index d152285..d24d5a1 100644
--- a/src/gallium/drivers/r600/r600_resource.h
+++ b/src/gallium/drivers/r600/r600_resource.h
@@ -35,7 +35,7 @@ struct r600_transfer {
/* Buffer transfer. */
struct pipe_transfer*buffer_transfer;
unsignedoffset;
-   struct pipe_resource*linear_texture;
+   struct pipe_resource*staging_texture;
 };
 
 /* This gets further specialized into either buffer or texture
diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index 9a52cfa..8fbe4a0 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -40,8 +40,8 @@
 
 extern struct u_resource_vtbl r600_texture_vtbl;
 
-/* Copy from a tiled texture to a detiled one. */
-static void r600_copy_from_tiled_texture(struct pipe_context *ctx, struct 
r600_transfer *rtransfer)
+/* Copy from a full GPU texture to a transfer's staging one. */
+static void r600_copy_to_staging_texture(struct pipe_context *ctx, struct 
r600_transfer *rtransfer)
 {
struct pipe_transfer *transfer = (struct pipe_transfer*)rtransfer;
struct pipe_resource *texture = transfer->resource;
@@ -49,15 +49,15 @@ static void r600_copy_from_tiled_texture(struct 
pipe_context *ctx, struct r600_t
 
subdst.face = 0;
subdst.level = 0;
-   ctx->resource_copy_region(ctx, rtransfer->linear_texture,
+   ctx->resource_copy_region(ctx, rtransfer->staging_texture,
subdst, 0, 0, 0, texture, transfer->sr,
transfer->box.x, transfer->box.y, 
transfer->box.z,
transfer->box.width, transfer->box.height);
 }
 
 
-/* Copy from a detiled texture to a tiled one. */
-static void r600_copy_into_tiled_texture(struct pipe_context *ctx, struct 
r600_transfer *rtransfer)
+/* Copy from a transfer's staging texture to a full GPU one. */
+static void r600_copy_from_staging_texture(struct pipe_context *ctx, struct 
r600_transfer *rtransfer)
 {
struct pipe_transfer *transfer = (struct pipe_transfer*)rtransfer;
struct pipe_resource *texture = transfer->resource;
@@ -67,7 +67,7 @@ static void r600_copy_into_tiled_texture(struct pipe_context 
*ctx, struct r600_t
subsrc.level = 0;
ctx->resource_copy_region(ctx, texture, transfer->sr,
  transfer->box.x, transfer->box.y, 
transfer->box.z,
- rtransfer->linear_texture, subsrc,
+ rtransfer->staging_texture, subsrc,
  0, 0, 0,
  transfer->box.width, transfer->box.height);
 
@@ -435,10 +435,20 @@ int r600_texture_depth_flush(struct pipe_context *ctx,
}
 
 out:
+   /* XXX: only do this if the depth texture has actually changed:
+*/
r600_blit_uncompress_depth_ptr(ctx, rtex);
return 0;
 }
 
+/* Needs adjustment for pixelformat:
+ */
+static INLINE unsigned u_box_volume( const struct pipe_box *box )
+{
+return box->width * box->depth * box->height;
+};
+
+
 struct pipe_transfer* r600_texture_get_transfer(struct pipe_context *ctx,
struct pipe_resource *texture,
struct pipe_subresource sr,
@@ -449,6 +459,35 @@ struct pipe_transfer* r600_texture_get_transfer(struct 
pipe_context *ctx,
struct pipe_resource resource;
struct r600_transfer *trans;
int r;
+   boolean use_staging_texture = FALSE;
+   boolean discard = FALSE;
+
+   if (!(usage & PIPE_TRANSFER_READ) && (usage & PIPE_TRANSFER_DISCARD))
+   discard = TRUE;
+
+   /* We cannot map a tiled texture directly because the data is
+* in a different order, therefore we do detiling using a blit.
+*
+* Also, use a temporary in GTT memory for read transfers, as
+* the CPU is much happier reading out of cached system memory
+* than uncached VRAM.
+*/
+   if (rtex->tiled)
+   use_staging_texture = TRUE;
+
+if (usage & PIPE_TRANSFER_READ &&
+u_box_volume(box) > 1024)
+use_staging_texture = TRUE;
+
+/* XXX

[Mesa-dev] [PATCH 4/5] r600g: remove unused flink, domain fields from r600_resource

2010-11-02 Thread Keith Whitwell
These were being set but not used anywhere.
---
 src/gallium/drivers/r600/r600_buffer.c   |   27 ---
 src/gallium/drivers/r600/r600_resource.h |5 -
 src/gallium/drivers/r600/r600_texture.c  |1 -
 3 files changed, 0 insertions(+), 33 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_buffer.c 
b/src/gallium/drivers/r600/r600_buffer.c
index 3c45d78..ed97b6e 100644
--- a/src/gallium/drivers/r600/r600_buffer.c
+++ b/src/gallium/drivers/r600/r600_buffer.c
@@ -38,32 +38,6 @@
 
 extern struct u_resource_vtbl r600_buffer_vtbl;
 
-u32 r600_domain_from_usage(unsigned usage)
-{
-   u32 domain = RADEON_GEM_DOMAIN_GTT;
-
-   if (usage & PIPE_BIND_RENDER_TARGET) {
-   domain |= RADEON_GEM_DOMAIN_VRAM;
-   }
-   if (usage & PIPE_BIND_DEPTH_STENCIL) {
-   domain |= RADEON_GEM_DOMAIN_VRAM;
-   }
-   if (usage & PIPE_BIND_SAMPLER_VIEW) {
-   domain |= RADEON_GEM_DOMAIN_VRAM;
-   }
-   /* also need BIND_BLIT_SOURCE/DESTINATION ? */
-   if (usage & PIPE_BIND_VERTEX_BUFFER) {
-   domain |= RADEON_GEM_DOMAIN_GTT;
-   }
-   if (usage & PIPE_BIND_INDEX_BUFFER) {
-   domain |= RADEON_GEM_DOMAIN_GTT;
-   }
-   if (usage & PIPE_BIND_CONSTANT_BUFFER) {
-   domain |= RADEON_GEM_DOMAIN_VRAM;
-   }
-
-   return domain;
-}
 
 struct pipe_resource *r600_buffer_create(struct pipe_screen *screen,
 const struct pipe_resource *templ)
@@ -85,7 +59,6 @@ struct pipe_resource *r600_buffer_create(struct pipe_screen 
*screen,
rbuffer->r.base.b.screen = screen;
rbuffer->r.base.vtbl = &r600_buffer_vtbl;
rbuffer->r.size = rbuffer->r.base.b.width0;
-   rbuffer->r.domain = r600_domain_from_usage(rbuffer->r.base.b.bind);
bo = r600_bo((struct radeon*)screen->winsys, rbuffer->r.base.b.width0, 
alignment, rbuffer->r.base.b.bind, rbuffer->r.base.b.usage);
if (bo == NULL) {
FREE(rbuffer);
diff --git a/src/gallium/drivers/r600/r600_resource.h 
b/src/gallium/drivers/r600/r600_resource.h
index d24d5a1..7a2d1f4 100644
--- a/src/gallium/drivers/r600/r600_resource.h
+++ b/src/gallium/drivers/r600/r600_resource.h
@@ -45,8 +45,6 @@ struct r600_transfer {
 struct r600_resource {
struct u_resource   base;
struct r600_bo  *bo;
-   u32 domain;
-   u32 flink;
u32 size;
 };
 
@@ -68,9 +66,6 @@ struct r600_resource_texture {
 
 void r600_init_screen_resource_functions(struct pipe_screen *screen);
 
-/* r600_buffer */
-u32 r600_domain_from_usage(unsigned usage);
-
 /* r600_texture */
 struct pipe_resource *r600_texture_create(struct pipe_screen *screen,
const struct pipe_resource *templ);
diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index 8fbe4a0..c92f634 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -284,7 +284,6 @@ r600_texture_create_object(struct pipe_screen *screen,
pipe_reference_init(&resource->base.b.reference, 1);
resource->base.b.screen = screen;
resource->bo = bo;
-   resource->domain = r600_domain_from_usage(resource->base.b.bind);
rtex->pitch_override = pitch_in_bytes_override;
 
if (array_mode)
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] r600g: set hardware pixel centers according to gl_rasterization_rules

2010-11-02 Thread Keith Whitwell
These were previously being left in the default (D3D) mode.  This mean
that triangles were drawn slightly incorrectly, but also because this
state is relied on by the u_blitter code, all blits were half a pixel
off.
---
 src/gallium/drivers/r600/r600_state.c |5 +
 src/gallium/drivers/r600/r600d.h  |4 
 src/gallium/winsys/r600/drm/r600_hw_context.c |1 +
 src/gallium/winsys/r600/drm/r600d.h   |1 +
 4 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index ccd7421..17e64b1 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -475,6 +475,11 @@ static void *r600_create_rs_state(struct pipe_context *ctx,
r600_pipe_state_add_reg(rstate, R_028A0C_PA_SC_LINE_STIPPLE, 
0x0005, 0x, NULL);
r600_pipe_state_add_reg(rstate, R_028A48_PA_SC_MPASS_PS_CNTL, 
0x, 0x, NULL);
r600_pipe_state_add_reg(rstate, R_028C00_PA_SC_LINE_CNTL, 0x0400, 
0x, NULL);
+
+   r600_pipe_state_add_reg(rstate, R_028C08_PA_SU_VTX_CNTL,
+   
S_028C08_PIX_CENTER_HALF(state->gl_rasterization_rules),
+   0x, NULL);
+
r600_pipe_state_add_reg(rstate, R_028C0C_PA_CL_GB_VERT_CLIP_ADJ, 
0x3F80, 0x, NULL);
r600_pipe_state_add_reg(rstate, R_028C10_PA_CL_GB_VERT_DISC_ADJ, 
0x3F80, 0x, NULL);
r600_pipe_state_add_reg(rstate, R_028C14_PA_CL_GB_HORZ_CLIP_ADJ, 
0x3F80, 0x, NULL);
diff --git a/src/gallium/drivers/r600/r600d.h b/src/gallium/drivers/r600/r600d.h
index a3cb5b8..ae19bfb 100644
--- a/src/gallium/drivers/r600/r600d.h
+++ b/src/gallium/drivers/r600/r600d.h
@@ -2100,6 +2100,10 @@
 #define   G_028C00_LAST_PIXEL(x)   (((x) >> 10) & 0x1)
 #define   C_028C00_LAST_PIXEL  0xFBFF
 #define R_028C04_PA_SC_AA_CONFIG 0x028C04
+#define R_028C08_PA_SU_VTX_CNTL  0x028C08
+#define   S_028C08_PIX_CENTER_HALF(x)  (((x) & 0x1) << 0)
+#define   G_028C08_PIX_CENTER_HALF(x)  (((x) >> 0) & 0x1)
+#define   C_028C08_PIX_CENTER_HALF 0xFFFE
 #define R_028C1C_PA_SC_AA_SAMPLE_LOCS_MCTX   0x028C1C
 #define R_028C48_PA_SC_AA_MASK   0x028C48
 #define R_028810_PA_CL_CLIP_CNTL 0x028810
diff --git a/src/gallium/winsys/r600/drm/r600_hw_context.c 
b/src/gallium/winsys/r600/drm/r600_hw_context.c
index effb228..c33f81e 100644
--- a/src/gallium/winsys/r600/drm/r600_hw_context.c
+++ b/src/gallium/winsys/r600/drm/r600_hw_context.c
@@ -384,6 +384,7 @@ static const struct r600_reg r600_context_reg_list[] = {
{PKT3_SET_CONTEXT_REG, R600_CONTEXT_REG_OFFSET, 
R_028A0C_PA_SC_LINE_STIPPLE, 0, 0, 0},
{PKT3_SET_CONTEXT_REG, R600_CONTEXT_REG_OFFSET, 
R_028A48_PA_SC_MPASS_PS_CNTL, 0, 0, 0},
{PKT3_SET_CONTEXT_REG, R600_CONTEXT_REG_OFFSET, 
R_028C00_PA_SC_LINE_CNTL, 0, 0, 0},
+   {PKT3_SET_CONTEXT_REG, R600_CONTEXT_REG_OFFSET, 
R_028C08_PA_SU_VTX_CNTL, 0, 0, 0},
{PKT3_SET_CONTEXT_REG, R600_CONTEXT_REG_OFFSET, 
R_028C0C_PA_CL_GB_VERT_CLIP_ADJ, 0, 0, 0},
{PKT3_SET_CONTEXT_REG, R600_CONTEXT_REG_OFFSET, 
R_028C10_PA_CL_GB_VERT_DISC_ADJ, 0, 0, 0},
{PKT3_SET_CONTEXT_REG, R600_CONTEXT_REG_OFFSET, 
R_028C14_PA_CL_GB_HORZ_CLIP_ADJ, 0, 0, 0},
diff --git a/src/gallium/winsys/r600/drm/r600d.h 
b/src/gallium/winsys/r600/drm/r600d.h
index d91f773..5ca7456 100644
--- a/src/gallium/winsys/r600/drm/r600d.h
+++ b/src/gallium/winsys/r600/drm/r600d.h
@@ -795,6 +795,7 @@
 #define R_028A48_PA_SC_MPASS_PS_CNTL 0x028A48
 #define R_028C00_PA_SC_LINE_CNTL 0x028C00
 #define R_028C04_PA_SC_AA_CONFIG 0x028C04
+#define R_028C08_PA_SU_VTX_CNTL  0x028C08
 #define R_028C1C_PA_SC_AA_SAMPLE_LOCS_MCTX   0x028C1C
 #define R_028C48_PA_SC_AA_MASK   0x028C48
 #define R_028810_PA_CL_CLIP_CNTL 0x028810
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/5] r600g: propogate resource usage flags to winsys, use to choose bo domains

2010-11-02 Thread Keith Whitwell
Hmm, after "cleaning" these patches up to mail out, I'm now seeing some
problems with this one...  sigh.  I'll resend shortly.

Keith

On Tue, 2010-11-02 at 11:40 -0700, Keith Whitwell wrote:
> This opens the question of what interface the winsys layer should
> really have for talking about these concepts.
> 
> For now I'm using the existing gallium resource usage concept, but
> there is no reason not use terms closer to what the hardware
> understands - eg. the domains themselves.
> ---
>  src/gallium/drivers/r600/r600.h   |3 ++-
>  src/gallium/drivers/r600/r600_buffer.c|7 ---
>  src/gallium/drivers/r600/r600_shader.c|2 +-
>  src/gallium/drivers/r600/r600_texture.c   |2 +-
>  src/gallium/winsys/r600/drm/r600_bo.c |   20 +---
>  src/gallium/winsys/r600/drm/r600_hw_context.c |   13 +
>  src/gallium/winsys/r600/drm/r600_priv.h   |1 +
>  7 files changed, 35 insertions(+), 13 deletions(-)
> 
> diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h
> index 62d9832..5ec607b 100644
> --- a/src/gallium/drivers/r600/r600.h
> +++ b/src/gallium/drivers/r600/r600.h
> @@ -112,7 +112,8 @@ struct r600_tiling_info *r600_get_tiling_info(struct 
> radeon *radeon);
>  /* r600_bo.c */
>  struct r600_bo;
>  struct r600_bo *r600_bo(struct radeon *radeon,
> -   unsigned size, unsigned alignment, unsigned 
> usage);
> +unsigned size, unsigned alignment,
> +unsigned binding, unsigned usage);
>  struct r600_bo *r600_bo_handle(struct radeon *radeon,
>  unsigned handle, unsigned *array_mode);
>  void *r600_bo_map(struct radeon *radeon, struct r600_bo *bo, unsigned usage, 
> void *ctx);
> diff --git a/src/gallium/drivers/r600/r600_buffer.c 
> b/src/gallium/drivers/r600/r600_buffer.c
> index 455aa2e..3c45d78 100644
> --- a/src/gallium/drivers/r600/r600_buffer.c
> +++ b/src/gallium/drivers/r600/r600_buffer.c
> @@ -86,7 +86,7 @@ struct pipe_resource *r600_buffer_create(struct pipe_screen 
> *screen,
>   rbuffer->r.base.vtbl = &r600_buffer_vtbl;
>   rbuffer->r.size = rbuffer->r.base.b.width0;
>   rbuffer->r.domain = r600_domain_from_usage(rbuffer->r.base.b.bind);
> - bo = r600_bo((struct radeon*)screen->winsys, rbuffer->r.base.b.width0, 
> alignment, rbuffer->r.base.b.bind);
> + bo = r600_bo((struct radeon*)screen->winsys, rbuffer->r.base.b.width0, 
> alignment, rbuffer->r.base.b.bind, rbuffer->r.base.b.usage);
>   if (bo == NULL) {
>   FREE(rbuffer);
>   return NULL;
> @@ -156,8 +156,9 @@ static void *r600_buffer_transfer_map(struct pipe_context 
> *pipe,
>   r600_bo_reference((struct radeon*)pipe->winsys, 
> &rbuffer->r.bo, NULL);
>   rbuffer->num_ranges = 0;
>   rbuffer->r.bo = r600_bo((struct 
> radeon*)pipe->winsys,
> -  
> rbuffer->r.base.b.width0, 0,
> -  
> rbuffer->r.base.b.bind);
> +
> rbuffer->r.base.b.width0, 0,
> +
> rbuffer->r.base.b.bind,
> +
> rbuffer->r.base.b.usage);
>   break;
>   }
>   }
> diff --git a/src/gallium/drivers/r600/r600_shader.c 
> b/src/gallium/drivers/r600/r600_shader.c
> index 4106587..1a0b35d 100644
> --- a/src/gallium/drivers/r600/r600_shader.c
> +++ b/src/gallium/drivers/r600/r600_shader.c
> @@ -218,7 +218,7 @@ static int r600_pipe_shader(struct pipe_context *ctx, 
> struct r600_pipe_shader *s
>  
>   /* copy new shader */
>   if (shader->bo == NULL) {
> - shader->bo = r600_bo(rctx->radeon, rshader->bc.ndw * 4, 4096, 
> 0);
> + shader->bo = r600_bo(rctx->radeon, rshader->bc.ndw * 4, 4096, 
> 0, 0);
>   if (shader->bo == NULL) {
>   return -ENOMEM;
>   }
> diff --git a/src/gallium/drivers/r600/r600_texture.c 
> b/src/gallium/drivers/r600/r600_texture.c
> index 7222b43..9a52cfa 100644
> --- a/src/gallium/drivers/r600/r600_texture.c
> +++ b/src/gallium/drivers/r600/r600_texture.c
> @@ -294,7 +294,7 @@ r600_texture_create_object(struct pipe_screen *screen,
>   resource->size = rtex->size;
>  
>   if (!resource->bo) {
> - 

[Mesa-dev] Resending r600g fixes & improvements

2010-11-02 Thread Keith Whitwell
Restore lost hunk in patch 2.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] r600g: propagate usage flags in texture transfers

2010-11-02 Thread Keith Whitwell
---
 src/gallium/drivers/r600/r600_texture.c |   27 ++-
 1 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index 4ebd5b7..7222b43 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include "state_tracker/drm_driver.h"
+#include "pipebuffer/pb_buffer.h"
 #include "r600_pipe.h"
 #include "r600_resource.h"
 #include "r600_state_inlines.h"
@@ -537,6 +538,7 @@ void* r600_texture_transfer_map(struct pipe_context *ctx,
enum pipe_format format = transfer->resource->format;
struct radeon *radeon = (struct radeon *)ctx->screen->winsys;
unsigned offset = 0;
+   unsigned usage = 0;
char *map;
 
if (rtransfer->linear_texture) {
@@ -553,7 +555,30 @@ void* r600_texture_transfer_map(struct pipe_context *ctx,
transfer->box.y / util_format_get_blockheight(format) * 
transfer->stride +
transfer->box.x / util_format_get_blockwidth(format) * 
util_format_get_blocksize(format);
}
-   map = r600_bo_map(radeon, bo, 0, ctx);
+
+   if (transfer->usage & PIPE_TRANSFER_WRITE) {
+   usage |= PB_USAGE_CPU_WRITE;
+
+   if (transfer->usage & PIPE_TRANSFER_DISCARD) {
+   }
+
+   if (transfer->usage & PIPE_TRANSFER_FLUSH_EXPLICIT) {
+   }
+   }
+
+   if (transfer->usage & PIPE_TRANSFER_READ) {
+   usage |= PB_USAGE_CPU_READ;
+   }
+
+   if (transfer->usage & PIPE_TRANSFER_DONTBLOCK) {
+   usage |= PB_USAGE_DONTBLOCK;
+   }
+
+   if (transfer->usage & PIPE_TRANSFER_UNSYNCHRONIZED) {
+   usage |= PB_USAGE_UNSYNCHRONIZED;
+   }
+
+   map = r600_bo_map(radeon, bo, usage, ctx);
if (!map) {
return NULL;
}
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] r600g: propogate resource usage flags to winsys, use to choose bo domains

2010-11-02 Thread Keith Whitwell
This opens the question of what interface the winsys layer should
really have for talking about these concepts.

For now I'm using the existing gallium resource usage concept, but
there is no reason not use terms closer to what the hardware
understands - eg. the domains themselves.
---
 src/gallium/drivers/r600/r600.h   |3 ++-
 src/gallium/drivers/r600/r600_buffer.c|7 ---
 src/gallium/drivers/r600/r600_shader.c|2 +-
 src/gallium/drivers/r600/r600_texture.c   |2 +-
 src/gallium/winsys/r600/drm/r600_bo.c |   24 +---
 src/gallium/winsys/r600/drm/r600_hw_context.c |   13 +
 src/gallium/winsys/r600/drm/r600_priv.h   |1 +
 7 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h
index 62d9832..5ec607b 100644
--- a/src/gallium/drivers/r600/r600.h
+++ b/src/gallium/drivers/r600/r600.h
@@ -112,7 +112,8 @@ struct r600_tiling_info *r600_get_tiling_info(struct radeon 
*radeon);
 /* r600_bo.c */
 struct r600_bo;
 struct r600_bo *r600_bo(struct radeon *radeon,
- unsigned size, unsigned alignment, unsigned 
usage);
+unsigned size, unsigned alignment,
+unsigned binding, unsigned usage);
 struct r600_bo *r600_bo_handle(struct radeon *radeon,
   unsigned handle, unsigned *array_mode);
 void *r600_bo_map(struct radeon *radeon, struct r600_bo *bo, unsigned usage, 
void *ctx);
diff --git a/src/gallium/drivers/r600/r600_buffer.c 
b/src/gallium/drivers/r600/r600_buffer.c
index 455aa2e..3c45d78 100644
--- a/src/gallium/drivers/r600/r600_buffer.c
+++ b/src/gallium/drivers/r600/r600_buffer.c
@@ -86,7 +86,7 @@ struct pipe_resource *r600_buffer_create(struct pipe_screen 
*screen,
rbuffer->r.base.vtbl = &r600_buffer_vtbl;
rbuffer->r.size = rbuffer->r.base.b.width0;
rbuffer->r.domain = r600_domain_from_usage(rbuffer->r.base.b.bind);
-   bo = r600_bo((struct radeon*)screen->winsys, rbuffer->r.base.b.width0, 
alignment, rbuffer->r.base.b.bind);
+   bo = r600_bo((struct radeon*)screen->winsys, rbuffer->r.base.b.width0, 
alignment, rbuffer->r.base.b.bind, rbuffer->r.base.b.usage);
if (bo == NULL) {
FREE(rbuffer);
return NULL;
@@ -156,8 +156,9 @@ static void *r600_buffer_transfer_map(struct pipe_context 
*pipe,
r600_bo_reference((struct radeon*)pipe->winsys, 
&rbuffer->r.bo, NULL);
rbuffer->num_ranges = 0;
rbuffer->r.bo = r600_bo((struct 
radeon*)pipe->winsys,
-
rbuffer->r.base.b.width0, 0,
-
rbuffer->r.base.b.bind);
+
rbuffer->r.base.b.width0, 0,
+rbuffer->r.base.b.bind,
+
rbuffer->r.base.b.usage);
break;
}
}
diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 4106587..1a0b35d 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -218,7 +218,7 @@ static int r600_pipe_shader(struct pipe_context *ctx, 
struct r600_pipe_shader *s
 
/* copy new shader */
if (shader->bo == NULL) {
-   shader->bo = r600_bo(rctx->radeon, rshader->bc.ndw * 4, 4096, 
0);
+   shader->bo = r600_bo(rctx->radeon, rshader->bc.ndw * 4, 4096, 
0, 0);
if (shader->bo == NULL) {
return -ENOMEM;
}
diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index 7222b43..9a52cfa 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -294,7 +294,7 @@ r600_texture_create_object(struct pipe_screen *screen,
resource->size = rtex->size;
 
if (!resource->bo) {
-   resource->bo = r600_bo(radeon, rtex->size, 4096, 0);
+   resource->bo = r600_bo(radeon, rtex->size, 4096, base->bind, 
base->usage);
if (!resource->bo) {
FREE(rtex);
return NULL;
diff --git a/src/gallium/winsys/r600/drm/r600_bo.c 
b/src/gallium/winsys/r600/drm/r600_bo.c
index 7d54ff1..9b9aec5 100644
--- a/src/gallium/winsys/r600/drm/r600_bo.c
+++ b/src/gallium/winsys/r600/drm/r600_bo.c
@@ -29,23 +29,37 @@
 #include "radeon_drm.h"
 #include "r600_priv.h"
 #include "r600d.h"
+#include "radeon_drm.h"
 
 struct r600_bo *r600_bo(struct radeon *radeon,
- unsigned size, unsigned alignment, unsigned 
usage)
+   uns

[Mesa-dev] [PATCH 3/5] r600g: use a buffer in GTT as intermediate on texture up and downloads

2010-11-02 Thread Keith Whitwell
Generalize the existing tiled_buffer path in texture transfers for use
in some non-tiled up and downloads.

Use a staging buffer, which the winsys will restrict to GTT memory.

GTT buffers have the major advantage when they are mapped, they are
cachable, which is a very nice property for downloads, usually the CPU
will want to do look at the data it downloaded.
---
 src/gallium/drivers/r600/r600_resource.h |2 +-
 src/gallium/drivers/r600/r600_texture.c  |   85 ++
 2 files changed, 64 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_resource.h 
b/src/gallium/drivers/r600/r600_resource.h
index d152285..d24d5a1 100644
--- a/src/gallium/drivers/r600/r600_resource.h
+++ b/src/gallium/drivers/r600/r600_resource.h
@@ -35,7 +35,7 @@ struct r600_transfer {
/* Buffer transfer. */
struct pipe_transfer*buffer_transfer;
unsignedoffset;
-   struct pipe_resource*linear_texture;
+   struct pipe_resource*staging_texture;
 };
 
 /* This gets further specialized into either buffer or texture
diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index 9a52cfa..8fbe4a0 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -40,8 +40,8 @@
 
 extern struct u_resource_vtbl r600_texture_vtbl;
 
-/* Copy from a tiled texture to a detiled one. */
-static void r600_copy_from_tiled_texture(struct pipe_context *ctx, struct 
r600_transfer *rtransfer)
+/* Copy from a full GPU texture to a transfer's staging one. */
+static void r600_copy_to_staging_texture(struct pipe_context *ctx, struct 
r600_transfer *rtransfer)
 {
struct pipe_transfer *transfer = (struct pipe_transfer*)rtransfer;
struct pipe_resource *texture = transfer->resource;
@@ -49,15 +49,15 @@ static void r600_copy_from_tiled_texture(struct 
pipe_context *ctx, struct r600_t
 
subdst.face = 0;
subdst.level = 0;
-   ctx->resource_copy_region(ctx, rtransfer->linear_texture,
+   ctx->resource_copy_region(ctx, rtransfer->staging_texture,
subdst, 0, 0, 0, texture, transfer->sr,
transfer->box.x, transfer->box.y, 
transfer->box.z,
transfer->box.width, transfer->box.height);
 }
 
 
-/* Copy from a detiled texture to a tiled one. */
-static void r600_copy_into_tiled_texture(struct pipe_context *ctx, struct 
r600_transfer *rtransfer)
+/* Copy from a transfer's staging texture to a full GPU one. */
+static void r600_copy_from_staging_texture(struct pipe_context *ctx, struct 
r600_transfer *rtransfer)
 {
struct pipe_transfer *transfer = (struct pipe_transfer*)rtransfer;
struct pipe_resource *texture = transfer->resource;
@@ -67,7 +67,7 @@ static void r600_copy_into_tiled_texture(struct pipe_context 
*ctx, struct r600_t
subsrc.level = 0;
ctx->resource_copy_region(ctx, texture, transfer->sr,
  transfer->box.x, transfer->box.y, 
transfer->box.z,
- rtransfer->linear_texture, subsrc,
+ rtransfer->staging_texture, subsrc,
  0, 0, 0,
  transfer->box.width, transfer->box.height);
 
@@ -435,10 +435,20 @@ int r600_texture_depth_flush(struct pipe_context *ctx,
}
 
 out:
+   /* XXX: only do this if the depth texture has actually changed:
+*/
r600_blit_uncompress_depth_ptr(ctx, rtex);
return 0;
 }
 
+/* Needs adjustment for pixelformat:
+ */
+static INLINE unsigned u_box_volume( const struct pipe_box *box )
+{
+return box->width * box->depth * box->height;
+};
+
+
 struct pipe_transfer* r600_texture_get_transfer(struct pipe_context *ctx,
struct pipe_resource *texture,
struct pipe_subresource sr,
@@ -449,6 +459,35 @@ struct pipe_transfer* r600_texture_get_transfer(struct 
pipe_context *ctx,
struct pipe_resource resource;
struct r600_transfer *trans;
int r;
+   boolean use_staging_texture = FALSE;
+   boolean discard = FALSE;
+
+   if (!(usage & PIPE_TRANSFER_READ) && (usage & PIPE_TRANSFER_DISCARD))
+   discard = TRUE;
+
+   /* We cannot map a tiled texture directly because the data is
+* in a different order, therefore we do detiling using a blit.
+*
+* Also, use a temporary in GTT memory for read transfers, as
+* the CPU is much happier reading out of cached system memory
+* than uncached VRAM.
+*/
+   if (rtex->tiled)
+   use_staging_texture = TRUE;
+
+if (usage & PIPE_TRANSFER_READ &&
+u_box_volume(box) > 1024)
+use_staging_texture = TRUE;
+
+/* XXX

[Mesa-dev] [PATCH 4/5] r600g: remove unused flink, domain fields from r600_resource

2010-11-02 Thread Keith Whitwell
These were being set but not used anywhere.
---
 src/gallium/drivers/r600/r600_buffer.c   |   27 ---
 src/gallium/drivers/r600/r600_resource.h |5 -
 src/gallium/drivers/r600/r600_texture.c  |1 -
 3 files changed, 0 insertions(+), 33 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_buffer.c 
b/src/gallium/drivers/r600/r600_buffer.c
index 3c45d78..ed97b6e 100644
--- a/src/gallium/drivers/r600/r600_buffer.c
+++ b/src/gallium/drivers/r600/r600_buffer.c
@@ -38,32 +38,6 @@
 
 extern struct u_resource_vtbl r600_buffer_vtbl;
 
-u32 r600_domain_from_usage(unsigned usage)
-{
-   u32 domain = RADEON_GEM_DOMAIN_GTT;
-
-   if (usage & PIPE_BIND_RENDER_TARGET) {
-   domain |= RADEON_GEM_DOMAIN_VRAM;
-   }
-   if (usage & PIPE_BIND_DEPTH_STENCIL) {
-   domain |= RADEON_GEM_DOMAIN_VRAM;
-   }
-   if (usage & PIPE_BIND_SAMPLER_VIEW) {
-   domain |= RADEON_GEM_DOMAIN_VRAM;
-   }
-   /* also need BIND_BLIT_SOURCE/DESTINATION ? */
-   if (usage & PIPE_BIND_VERTEX_BUFFER) {
-   domain |= RADEON_GEM_DOMAIN_GTT;
-   }
-   if (usage & PIPE_BIND_INDEX_BUFFER) {
-   domain |= RADEON_GEM_DOMAIN_GTT;
-   }
-   if (usage & PIPE_BIND_CONSTANT_BUFFER) {
-   domain |= RADEON_GEM_DOMAIN_VRAM;
-   }
-
-   return domain;
-}
 
 struct pipe_resource *r600_buffer_create(struct pipe_screen *screen,
 const struct pipe_resource *templ)
@@ -85,7 +59,6 @@ struct pipe_resource *r600_buffer_create(struct pipe_screen 
*screen,
rbuffer->r.base.b.screen = screen;
rbuffer->r.base.vtbl = &r600_buffer_vtbl;
rbuffer->r.size = rbuffer->r.base.b.width0;
-   rbuffer->r.domain = r600_domain_from_usage(rbuffer->r.base.b.bind);
bo = r600_bo((struct radeon*)screen->winsys, rbuffer->r.base.b.width0, 
alignment, rbuffer->r.base.b.bind, rbuffer->r.base.b.usage);
if (bo == NULL) {
FREE(rbuffer);
diff --git a/src/gallium/drivers/r600/r600_resource.h 
b/src/gallium/drivers/r600/r600_resource.h
index d24d5a1..7a2d1f4 100644
--- a/src/gallium/drivers/r600/r600_resource.h
+++ b/src/gallium/drivers/r600/r600_resource.h
@@ -45,8 +45,6 @@ struct r600_transfer {
 struct r600_resource {
struct u_resource   base;
struct r600_bo  *bo;
-   u32 domain;
-   u32 flink;
u32 size;
 };
 
@@ -68,9 +66,6 @@ struct r600_resource_texture {
 
 void r600_init_screen_resource_functions(struct pipe_screen *screen);
 
-/* r600_buffer */
-u32 r600_domain_from_usage(unsigned usage);
-
 /* r600_texture */
 struct pipe_resource *r600_texture_create(struct pipe_screen *screen,
const struct pipe_resource *templ);
diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index 8fbe4a0..c92f634 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -284,7 +284,6 @@ r600_texture_create_object(struct pipe_screen *screen,
pipe_reference_init(&resource->base.b.reference, 1);
resource->base.b.screen = screen;
resource->bo = bo;
-   resource->domain = r600_domain_from_usage(resource->base.b.bind);
rtex->pitch_override = pitch_in_bytes_override;
 
if (array_mode)
-- 
1.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   3   >