Re: [Mesa-dev] [PATCH] gallium/os: add os_wait_until_zero

2015-06-26 Thread Patrick Baggett
On Fri, Jun 26, 2015 at 11:40 AM, Marek Olšák  wrote:

> If p_atomic_read is fine, then this patch is fine too. So you're
> telling that this should work:
>
> while (p_atomic_read(var));
>
> I wouldn't be concerned about a memory barrier. This is only 1 int, so
> it should make its way into the shared cache eventually.
>
>
Yes, it does make it to the shared cache, but the assumption is that the
compiler will actually generate code to check the memory location more than
once. I've personally been bitten by this assumption - it's a bad one. Ilia
is right. If you have a variable that doesn't appear to be modified at all,
but you, the programmer, know it will be modified by another thread, you're
asking for an infinite loop. The only guarantee you get is that if this
code ran in isolation on a single thread, it will do what you told it to.
Consider even a trivial transformation:

while(1) {

if(var == 0) break;

}

The compiler can optimize this to a single statement:

if(var != 0) infinite_loop();

...because it produces the same results as the above code when run in
isolation. However, if 'var' is volatile, it cannot assume that the value
will remain the same and cannot apply this "optimization". What's more fun
is that debug mode tends to not apply these sorts of optimizations, so your
code hangs in release builds, and when you check the memory location, you
can see that it has been updated. Commence tearing hair out. Then you look
at the assembly and hit your head on the desk. Or something like that. ;)
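
A minimal sketch of the safer pattern, assuming C11 <stdatomic.h> is
available. The names wait_until_zero and max_spins are made up for
illustration; this is not the os_wait_until_zero from the patch. The atomic
load is a real read of the variable on every iteration, so the compiler
cannot hoist it out of the loop the way it may with a plain int:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper: spin until *var reads as zero, or give up after
 * max_spins failed checks. Returns the number of failed checks. */
static unsigned
wait_until_zero(atomic_int *var, unsigned max_spins)
{
   unsigned spins = 0;

   assert(var != NULL);

   /* Each atomic_load_explicit() re-reads *var; it cannot be folded into
    * a single check in front of an infinite loop. */
   while (atomic_load_explicit(var, memory_order_acquire) != 0) {
      if (++spins >= max_spins)
         break;
   }
   return spins;
}
```

The max_spins bound is only there so the sketch terminates when run on a
single thread. A volatile-qualified read would also stop the hoisting, but it
gives no ordering guarantees between threads.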

Patrick
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const

2015-10-23 Thread Patrick Baggett
On Fri, Oct 23, 2015 at 10:55 AM, Eduardo Lima Mitev 
wrote:

> When both fadd and fmul instructions have at least one operand that is a
> constant and it is only used once, the total number of instructions can
> be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd); because
> the constants will be progagated as immediate operands of fmul and fadd.
>
> This patch detects these situations and prevents fusing fmul+fadd into
> ffma.
>
> Shader-db results on i965 Haswell:
>
> total instructions in shared programs: 6235835 -> 6225895 (-0.16%)
> instructions in affected programs: 1124094 -> 1114154 (-0.88%)
> total loops in shared programs:1979 -> 1979 (0.00%)
> helped:7612
> HURT:  843
> GAINED:4
> LOST:  0
> ---
>  .../drivers/dri/i965/brw_nir_opt_peephole_ffma.c   | 31
> ++
>  1 file changed, 31 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> b/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> index a8448e7..c7fc15a 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> @@ -133,6 +133,28 @@ get_mul_for_src(nir_alu_src *src, int num_components,
> return alu;
>  }
>
> +/**
> + * Given a list of (at least two) nir_alu_src's, tells if any of them is a
> + * constant value and is used only once.
> + */
> +static bool
> +any_alu_src_is_a_constant(nir_alu_src srcs[])
> +{
> +   for (unsigned i = 0; i < 2; i++) {
> +  if (srcs[i].src.ssa->parent_instr->type ==
> nir_instr_type_load_const) {
> + nir_load_const_instr *load_const =
> +nir_instr_as_load_const (srcs[i].src.ssa->parent_instr);
> +
> + if (list_is_single(&load_const->def.uses) &&
> + list_empty(&load_const->def.if_uses)) {
> +return true;
> + }
> +  }
> +   }
> +
> +   return false;
> +}
> +
>

The comment above this function reads "Given a list of (at least two)
nir_alu_src's...", but the function checks exactly two. Was it your
intention to support lists with size > 2?
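
If supporting longer lists was the intention, one option is to pass the count
explicitly. A rough sketch, using a stand-in struct (fake_src) instead of the
real nir_alu_src, and a made-up function name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for nir_alu_src; the real struct looks nothing like this. */
struct fake_src {
   bool is_load_const;   /* produced by a load_const instruction? */
   unsigned num_uses;    /* how many times the value is used */
};

/* Returns true if any of the first num_srcs sources is a constant that
 * is used exactly once. */
static bool
any_src_is_single_use_const(const struct fake_src *srcs, size_t num_srcs)
{
   assert(srcs != NULL || num_srcs == 0);

   for (size_t i = 0; i < num_srcs; i++) {
      if (srcs[i].is_load_const && srcs[i].num_uses == 1)
         return true;
   }
   return false;
}
```

With the count as a parameter, the comment and the loop bound can no longer
drift apart.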


>  static bool
>  brw_nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>  {
> @@ -183,6 +205,15 @@ brw_nir_opt_peephole_ffma_block(nir_block *block,
> void *void_state)
>mul_src[0] = mul->src[0].src.ssa;
>mul_src[1] = mul->src[1].src.ssa;
>
> +  /* If any of the operands of the fmul and any of the fadd is a
> constant,
> +   * we bypass because it will be more efficient as the constants
> will be
> +   * propagated as operands, potentially saving two load_const
> instructions.
> +   */
> +  if (any_alu_src_is_a_constant(mul->src) &&
> +  any_alu_src_is_a_constant(add->src)) {
> + continue;
> +  }
> +
>if (abs) {
>   for (unsigned i = 0; i < 2; i++) {
>  nir_alu_instr *abs = nir_alu_instr_create(state->mem_ctx,
> --
> 2.5.3
>


Re: [Mesa-dev] [PATCH 3/6] nir: Turn -(b2f(a) + b2f(b) >= 0 into !(a || b).

2016-08-10 Thread Patrick Baggett
> >
> > For now, this patch is
> >
> > Reviewed-by: Ian Romanick 
>

I had a hard time parsing the title: "Turn -(b2f(a) + b2f(b) >= 0 into
!(a || b)"  at first, until I saw the replacement instructions. You're
missing a ')' on the commit line. :)
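
For what it's worth, with the parenthesis restored the identity is easy to
check by brute force. A small sketch, assuming b2f maps false to 0.0 and true
to 1.0 (the helper names here are mine, not NIR's):

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed semantics of b2f: false -> 0.0, true -> 1.0. */
static float
b2f(bool b)
{
   return b ? 1.0f : 0.0f;
}

static bool
original_expr(bool a, bool b)
{
   return -(b2f(a) + b2f(b)) >= 0.0f;
}

static bool
replacement_expr(bool a, bool b)
{
   return !(a || b);
}

/* Compare both forms over all four input combinations. */
static void
check_identity(void)
{
   for (int a = 0; a < 2; a++) {
      for (int b = 0; b < 2; b++)
         assert(original_expr(a, b) == replacement_expr(a, b));
   }
}
```

The sum is zero only when both inputs are false, and the negated sum is
non-negative only when the sum is zero.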


Re: [Mesa-dev] [PATCH 1/4] nv50: add target->hasDualIssueing()

2016-08-13 Thread Patrick Baggett
On Sat, Aug 13, 2016 at 10:43 AM, Tobias Klausmann
 wrote:
>
>
>
> On 13.08.2016 12:02, Karol Herbst wrote:
>>
>> Signed-off-by: Karol Herbst 
>> ---
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_target.h| 1 +
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 7 ++-
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h   | 1 +
>>   3 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h 
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
>> index 4a701f7..485ca16 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
>> @@ -222,6 +222,7 @@ public:
>>const Value *) const = 0;
>>// whether @insn can be issued together with @next (order matters)
>> +   virtual bool hasDualIssueing() const { return false; }
>>  virtual bool canDualIssue(const Instruction *insn,
>>const Instruction *next) const { return 
>> false; }
>>  virtual int getLatency(const Instruction *) const { return 1; }
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp 
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> index 04ac288..faf2121 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> @@ -605,12 +605,17 @@ int TargetNVC0::getThroughput(const Instruction *i) 
>> const
>>  }
>>   }
>>   +bool TargetNVC0::hasDualIssueing() const

The correct spelling is "issuing". English can be so silly at times...

>> +{
>> +   return getChipset() >= 0xe4;
>> +}
>> +
>>   bool TargetNVC0::canDualIssue(const Instruction *a, const Instruction *b) 
>> const
>>   {
>>  const OpClass clA = operationClass[a->op];
>>  const OpClass clB = operationClass[b->op];
>>   -   if (getChipset() >= 0xe4) {
>> +   if (hasDualIssueing()) {
>> // not texturing
>> // not if the 2nd instruction isn't necessarily executed
>> if (clA == OPCLASS_TEXTURE || clA == OPCLASS_FLOW)
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h 
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h
>> index 7d11cd9..3d55da7 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h
>> @@ -57,6 +57,7 @@ public:
>>  virtual bool isPostMultiplySupported(operation, float, int& e) const;
>>  virtual bool mayPredicate(const Instruction *, const Value *) const;
>>   +   virtual bool hasDualIssueing() const;
>>  virtual bool canDualIssue(const Instruction *, const Instruction *) 
>> const;
>>  virtual int getLatency(const Instruction *) const;
>>  virtual int getThroughput(const Instruction *) const;
>
>
> Reviewed-by: Tobias Klausmann 


Re: [Mesa-dev] [PATCH 7/9] glsl: Make foreach macros usable from C by adding struct keyword.

2014-06-10 Thread Patrick Baggett
>
>
> Yep, no new warnings.
>
> I tried a little test program
> % cat t.cpp
> class asdf {
> int x;
> };
>
> void f() {
> asdf a;
> struct asdf b;
> class asdf c;
> }
>
C++ never ceases to amaze.


> and I can't make it generate warnings (other than unused variables)
> regardless of whether I define asdf as a class or a struct.


Re: [Mesa-dev] [PATCH 1/2] util: Add util_memcpy_cpu_to_le()

2014-07-15 Thread Patrick Baggett
On Tue, Jul 15, 2014 at 11:19 AM, Tom Stellard 
wrote:

> ---
>  src/gallium/auxiliary/util/u_math.h  | 22 ++
>  src/gallium/drivers/radeonsi/si_shader.c |  8 +---
>  2 files changed, 23 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_math.h
> b/src/gallium/auxiliary/util/u_math.h
> index b9ed197..cd3cf04 100644
> --- a/src/gallium/auxiliary/util/u_math.h
> +++ b/src/gallium/auxiliary/util/u_math.h
> @@ -812,6 +812,28 @@ util_bswap16(uint16_t n)
>(n << 8);
>  }
>
> +static INLINE void*
> +util_memcpy_cpu_to_le(void *dest, void *src, size_t n)
> +{
> +#ifdef PIPE_ARCH_BIG_ENDIAN
> +   size_t i, e;
> +   for (i = 0, e = n % 8; i < e; i++) {
> +   char *d = (char*)dest;
> +   char *s = (char*)src;
> +   d[i] = s[e - i - 1];
> +   }
> +   dest += i;
> +   n -= i;
> +   for (i = 0, e = n / 8; i < e; i++) {
> +   uint64_t *d = (uint64_t*)dest;
> +   uint64_t *s = (uint64_t*)src;
> +   d[i] = util_bswap64(s[e - i - 1]);
> +   }
>

Doesn't this reverse all of the bytes (as if it were a list) without
preserving word boundaries? e.g.

|a, b, c, d | e, f, g, h | i, j, k, l | m, n, o, p | ->
|p, o, n, m | l, k, j, i | h, g, f, e | d, c, b, a |

The old code did something like this, didn't it?:
|a, b, c, d | e, f, g, h | i, j, k, l | m, n, o, p | ->
|d, c, b, a | h, g, f, e | l, k, j, i | p, o, n, m |

I don't know which is correct, but it does seem like a behavior change. Or
am I misreading the code?
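
To make the two behaviors concrete, here is a small sketch of both on a plain
byte buffer. The function names are mine, and neither is the code from the
patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverse the whole buffer as one long list of bytes. */
static void
reverse_all_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
   for (size_t i = 0; i < n; i++)
      dst[i] = src[n - i - 1];
}

/* Reverse the bytes inside each 4-byte word, keeping the word order. */
static void
swap_each_word32(unsigned char *dst, const unsigned char *src, size_t n)
{
   assert(n % 4 == 0);

   for (size_t w = 0; w < n; w += 4) {
      for (size_t b = 0; b < 4; b++)
         dst[w + b] = src[w + 3 - b];
   }
}
```

Fed the 16 bytes "abcdefghijklmnop", the first gives the first diagram above
and the second gives what I believe the old loop did.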

> +   return dest;
> +#else
> +   return memcpy(dest, src, n);
> +#endif
> +}
>
>  /**
>   * Clamp X to [MIN, MAX].
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c
> b/src/gallium/drivers/radeonsi/si_shader.c
> index f0650f4..6f0504b 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -2559,13 +2559,7 @@ int si_compile_llvm(struct si_context *sctx, struct
> si_pipe_shader *shader,
> }
>
> ptr = (uint32_t*)sctx->b.ws->buffer_map(shader->bo->cs_buf,
> sctx->b.rings.gfx.cs, PIPE_TRANSFER_WRITE);
> -   if (SI_BIG_ENDIAN) {
> -   for (i = 0; i < binary.code_size / 4; ++i) {
> -   ptr[i] =
> util_cpu_to_le32((*(uint32_t*)(binary.code + i*4)));
> -   }
> -   } else {
> -   memcpy(ptr, binary.code, binary.code_size);
> -   }
> +   util_memcpy_cpu_to_le(ptr, binary.code, binary.code_size);
> sctx->b.ws->buffer_unmap(shader->bo->cs_buf);
>
> free(binary.code);
> --
> 1.8.1.5
>


Re: [Mesa-dev] [PATCH 1/3] util: Add util_memcpy_cpu_to_le32() v2

2014-07-18 Thread Patrick Baggett
On Fri, Jul 18, 2014 at 2:10 PM, Tom Stellard 
wrote:

> v2:
>   - Preserve word boundaries.
> ---
>  src/gallium/auxiliary/util/u_math.h | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/src/gallium/auxiliary/util/u_math.h
> b/src/gallium/auxiliary/util/u_math.h
> index b9ed197..5de181a 100644
> --- a/src/gallium/auxiliary/util/u_math.h
> +++ b/src/gallium/auxiliary/util/u_math.h
> @@ -812,6 +812,23 @@ util_bswap16(uint16_t n)
>(n << 8);
>  }
>
> +static INLINE void*
> +util_memcpy_cpu_to_le32(void *dest, void *src, size_t n)
>

I don't know where Mesa is with C99 standards, but if you are utilizing C99
keywords, I think "restrict" would help here to show that the two pointers
do not overlap. I'm not sure if you have to mark 'd' and 's' as restrict to get
the benefit if they are initialized by a typecast, but it probably wouldn't
be a bad idea.

This may be a no-go with C++ however.
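
Roughly what I have in mind, as a C99-only sketch. copy_bswap32 and bswap32
are made-up names, and unlike the patch this version always swaps so the
effect is visible on any host. Standard C++ has no restrict keyword, only
compiler extensions such as __restrict, which is why a shared header would
need a macro:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable 32-bit byte swap, standing in for util_bswap32. */
static uint32_t
bswap32(uint32_t v)
{
   return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
          ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical variant: restrict on both the parameters and the typed
 * locals promises the compiler that dest and src never overlap. */
static void *
copy_bswap32(void *restrict dest, const void *restrict src, size_t n)
{
   uint32_t *restrict d = dest;
   const uint32_t *restrict s = src;

   assert(n % 4 == 0);

   for (size_t i = 0; i < n / 4; i++)
      d[i] = bswap32(s[i]);
   return dest;
}
```

Whether the qualifier on the locals buys anything beyond the one on the
parameters is something I would check in the generated assembly.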


> +{
> +#ifdef PIPE_ARCH_BIG_ENDIAN
> +   size_t i, e;
> +   asset(n % 4 == 0);
> +
> +   for (i = 0, e = n / 4; i < e; i++) {
> +   uint32_t *d = (uint32_t*)dest;
> +   uint32_t *s = (uint32_t*)src;
> +   d[i] = util_bswap32(s[i]);
> +   }
> +   return dest;
> +#else
> +   return memcpy(dest, src, n);
> +#endif
> +}
>
>  /**
>   * Clamp X to [MIN, MAX].
> --
> 1.8.1.5
>


Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Patrick Baggett
I've been hanging on this list for a while, and this isn't the first time
this has been suggested. The general thing that is repeated is basically
this: if you make an API (e.g. OpenGL) that supports S3TC without a
license, you're in trouble, even if it is a passthrough to the hardware,
which also required a license to produce in the first place. I think the
assumption most people make is that if the hardware vendor paid a license
to implement S3TC in an ASIC, then surely simply passing through data is
OK. After all, it is being done without any knowledge of the algorithm,
etc. From a common sense standpoint, I would agree.
However, the note in the S3TC extension itself[1] mentions explicitly
to be wary of such assumptions in the "IP Status" section, and notes that *a
license for one API is not a license for another*. This implies that for an
API to make use of S3TC, it requires a license, which Mesa in general, does
not have, while a hardware vendor might. All of this is theoretical as far
as I've read; I don't think anyone has legally challenged this for open
source drivers and posted the results on this mailing list -- most have
simply stayed away from it. I think the patent was granted in
1999, so at least in the USA, hopefully we don't have too many more years
of this garbage.

Patrick

[1] http://www.opengl.org/registry/specs/EXT/texture_compression_s3tc.txt


On Tue, Aug 13, 2013 at 1:53 PM, Uwe Schmidt <
simon.schm...@cs-systemberatung.de> wrote:

> Hi,
>
> I have read about the issue of implementing the S3TC Extension in Mesa:
> http://dri.freedesktop.org/wiki/S3TC/
>
> As I understood, the problem is, that encoding and decoding S3TC in
> software is covered by patents, while passing S3TC compressed data to the
> GPU is still ok.
>
> AS NOW:
>
> If "force_s3tc_enable" is enabled in Mesa3D, uploading a S3TC encoded
> texture works if format==internalFormat is true. If format!=internalFormat
> is true, it would fail (as i know).
>
> SO MY PROPOSAL:
>
> If 'format' is one of the S3TC types, and format!=internalFormat is true,
> then set internalFormat:=format.
>
> Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't,
> set internalFormat:=format (or any other format, Mesa3D can encode).
>
>


Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Patrick Baggett
> Erm... I'm wondering... why does the S3TC issue come up every few
> months out of its grave and haunt the list (and your nerves) ?
>
>
I think it is because the issue looks deceptively simple. Hardware is
hardware, right? ASICs do the decompression, not software. Surely blindly
copying bits from one device to another *can't* be patent infringement.
Surely, right? :\

Patrick


Re: [Mesa-dev] [PATCH 2/2] radeonsi: pad IBs to a multiple of 8 DWs

2013-09-06 Thread Patrick Baggett
> Any reason for this complicated logic, instead of simply:
>
> while (cs->cdw & 0x7)
> cs->buf[cs->cdw++] = 0x8000;
>
>
Ah, that is eloquently terse; I'm going to have to remember that.
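
For my own notes, the idiom in isolation. This is a sketch with a stand-in
struct (fake_cs) and a placeholder PAD_DWORD value, not the radeonsi command
stream or its padding dword:

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_CS_SIZE 64
#define PAD_DWORD 0u   /* placeholder; the quoted snippet writes a
                        * driver-specific value */

/* Stand-in for the command stream: a buffer plus a dword count. */
struct fake_cs {
   uint32_t buf[FAKE_CS_SIZE];
   unsigned cdw;
};

static void
pad_to_multiple_of_8(struct fake_cs *cs)
{
   /* cdw & 0x7 is cdw % 8, so the loop stops once cdw is a multiple of 8
    * and does nothing if it already is one. */
   while (cs->cdw & 0x7) {
      assert(cs->cdw < FAKE_CS_SIZE);
      cs->buf[cs->cdw++] = PAD_DWORD;
   }
}
```

The mask trick only works because 8 is a power of two.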

Patrick


> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast |  Debian, X and DRI developer
>


Re: [Mesa-dev] [PATCH] meta: Allocate texture before initializing texture coordinates

2013-02-22 Thread Patrick Baggett
On Fri, Feb 22, 2013 at 2:23 PM, Ian Romanick  wrote:

> On 02/15/2013 11:20 AM, Anuj Phogat wrote:
>
>> tex->Sright and tex->Ttop are initialized during texture allocation.
>> This fixes depth buffer blitting failures in khronos conformance tests
>> when run on desktop GL 3.0.
>>
>> Note: This is a candidate for stable branches.
>>
>> Signed-off-by: Anuj Phogat 
>>
>
> Reviewed-by: Ian Romanick 
>
> I think there is a lot of room for other improvements in this code.
> Like... why are we doing glReadPixels into malloc memory, then handing that
> same pointer to glTexImage2D.  We should (at least for desktop and GLES3)
> use a PBO.

>> ---
>>   src/mesa/drivers/common/meta.c |   17 -
>>   1 files changed, 8 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/mesa/drivers/common/meta.c
>> b/src/mesa/drivers/common/meta.c
>> index 4e32b50..29a209e 100644
>> --- a/src/mesa/drivers/common/meta.c
>> +++ b/src/mesa/drivers/common/meta.c
>> @@ -1910,6 +1910,14 @@ _mesa_meta_BlitFramebuffer(struct gl_context
>> *ctx,
>> GLuint *tmp = malloc(srcW * srcH * sizeof(GLuint));
>>
>> if (tmp) {
>> +
>> + newTex = alloc_texture(depthTex, srcW, srcH,
>> GL_DEPTH_COMPONENT);
>>
>
Are out of memory conditions handled in alloc_texture?


>> + _mesa_ReadPixels(srcX, srcY, srcW, srcH, GL_DEPTH_COMPONENT,
>> +  GL_UNSIGNED_INT, tmp);
>> + setup_drawpix_texture(ctx, depthTex, newTex, GL_DEPTH_COMPONENT,
>> +   srcW, srcH, GL_DEPTH_COMPONENT,
>> +   GL_UNSIGNED_INT, tmp);
>> +
>>/* texcoords (after texture allocation!) */
>>{
>>   verts[0].s = 0.0F;
>> @@ -1928,15 +1936,6 @@ _mesa_meta_BlitFramebuffer(struct gl_context
>> *ctx,
>>if (!blit->DepthFP)
>>   init_blit_depth_pixels(ctx);
>>
>> - /* maybe change tex format here */
>> - newTex = alloc_texture(depthTex, srcW, srcH,
>> GL_DEPTH_COMPONENT);
>> -
>> - _mesa_ReadPixels(srcX, srcY, srcW, srcH,
>> -  GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, tmp);
>> -
>> - setup_drawpix_texture(ctx, depthTex, newTex,
>> GL_DEPTH_COMPONENT, srcW, srcH,
>> -   GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, tmp);
>> -
>>_mesa_BindProgramARB(GL_FRAGMENT_PROGRAM_ARB,
>> blit->DepthFP);
>>_mesa_set_enable(ctx, GL_FRAGMENT_PROGRAM_ARB, GL_TRUE);
>>_mesa_ColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
>>
>>


Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb -> argb special case in fast_read_rgba_pixels_memcpy

2013-03-11 Thread Patrick Baggett
On Mon, Mar 11, 2013 at 9:56 AM, Jose Fonseca  wrote:

> I'm surprised this is is faster.
>
> In particular, for big things we'll be touching memory twice.
>
> Did you measure the speed up?
>
> Jose
>

I'm sorry to be dull, but is there an SSE2 implementation of this somewhere
for x86 / x64 CPUs?

Patrick


>
> - Original Message -
> > ---
> >  src/mesa/main/readpix.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
> > index 349b0bc..0f5c84c 100644
> > --- a/src/mesa/main/readpix.c
> > +++ b/src/mesa/main/readpix.c
> > @@ -285,11 +285,12 @@ fast_read_rgba_pixels_memcpy( struct gl_context
> *ctx,
> >}
> > } else if (copy_xrgb) {
> >/* convert xrgb -> argb */
> > +  int alphaOffset = texelBytes - 1;
> >for (j = 0; j < height; j++) {
> > - GLuint *dst4 = (GLuint *) dst, *map4 = (GLuint *) map;
> > + memcpy(dst, map, width * texelBytes);
> >   int i;
> >   for (i = 0; i < width; i++) {
> > -dst4[i] = map4[i] | 0xff00;  /* set A=0xff */
> > +dst[i * texelBytes + alphaOffset] = 0xff;  /* set A=0xff */
> >   }
> >   dst += dstStride;
> >   map += stride;
> > --
> > 1.8.1.5
> >


Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb -> argb special case in fast_read_rgba_pixels_memcpy

2013-03-11 Thread Patrick Baggett
On Mon, Mar 11, 2013 at 1:30 PM, Jose Fonseca  wrote:

> - Original Message -
> > On 03/11/2013 07:56 AM, Jose Fonseca wrote:
> > > I'm surprised this is is faster.
> > >
> > > In particular, for big things we'll be touching memory twice.
> > >
> > > Did you measure the speed up?
> >
> > The second hit is cache-hot, so it may not be too expensive.
>
> Yes, but the size in question is 1900x1200, ie, 9MB, which will trash
> L1-L2 caches, and won't even fit on the L3 cache of several processors.
>
> I'm afraid we'd be optimizing some cases at expense of others.
>
> I think that at very least we should do this in 16KB/32KB or so chunks to
> avoid trashing the lower level caches.
>
> > I suspect
> > memcpy is optimized to fill the cache in a more efficient manner than
> > the old loop.  Since the old loop did a read and a bit-wise or, it's
> > also possible the compiler generated some really dumb code.  We'd have
> > to look at the assembly output to know.
> >
> > As Patrick suggests, there's probably an SSE2 method to do this even
> > faster.  That may be worth investigating.
>
> An SSE2 is quite easy with intrinsics:
>
>   __m128i pixels = _mm_loadu_si128((const __m128i *)src); // could use
> _mm_load_si128 with some checks
>   pixels = _mm_or_si128(pixels, _mm_set1_epi32(0xff00));
>   _mm_storeu_si128((__m128i *)dst, pixels);
>   src += sizeof(__m128i) / sizeof *src;
>   dst += sizeof(__m128i) / sizeof *dst;
>
> the hard part is the runtime check for sse2 support...
>
>
At least for x86-64, there is no runtime check required, as SSE2 is part of
the baseline instruction set. The mesa/x86 folder already contains runtime
CPU feature detection; I was just browsing it.
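
A sketch of the build-time variant, assuming the alpha byte is the top byte
of each 32-bit pixel (ALPHA_MASK below is my assumption) and using made-up
names (set_alpha_row, HAVE_SSE2_AT_BUILD). The SSE2 path is compiled in only
when the compiler says the target has it, which is always the case on x86-64;
everything else falls through to the scalar loop:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(__x86_64__) || defined(_M_X64)
#  include <emmintrin.h>
#  define HAVE_SSE2_AT_BUILD 1
#else
#  define HAVE_SSE2_AT_BUILD 0
#endif

#define ALPHA_MASK 0xff000000u   /* assumption: A lives in the top byte */

/* Copy n pixels from src to dst, forcing alpha to 0xff in each one. */
static void
set_alpha_row(uint32_t *dst, const uint32_t *src, size_t n)
{
   size_t i = 0;

   assert(dst != NULL && src != NULL);

#if HAVE_SSE2_AT_BUILD
   const __m128i mask = _mm_set1_epi32((int)ALPHA_MASK);

   /* Four pixels per iteration; unaligned loads and stores, so there is
    * no alignment requirement on either pointer. */
   for (; i + 4 <= n; i += 4) {
      __m128i px = _mm_loadu_si128((const __m128i *)(src + i));
      _mm_storeu_si128((__m128i *)(dst + i), _mm_or_si128(px, mask));
   }
#endif

   /* Scalar tail, and the whole row when SSE2 is not available. */
   for (; i < n; i++)
      dst[i] = src[i] | ALPHA_MASK;
}
```

A 32-bit x86 build would still need the runtime detection to pick between the
two paths.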

Patrick


Re: [Mesa-dev] [PATCH 2/3] One definition of C99 inline/__func__ to rule them all.

2013-03-12 Thread Patrick Baggett
On Tue, Mar 12, 2013 at 3:39 PM,  wrote:

> From: José Fonseca 
>
> We were in four already...
> ---
>  include/c99_compat.h  |  105
> +
>  src/egl/main/eglcompiler.h|   44 ++
>  src/gallium/include/pipe/p_compiler.h |   74 ++-
>  src/mapi/mapi/u_compiler.h|   26 ++--
>  src/mesa/main/compiler.h  |   56 ++
>  5 files changed, 125 insertions(+), 180 deletions(-)
>  create mode 100644 include/c99_compat.h
>
> diff --git a/include/c99_compat.h b/include/c99_compat.h
> new file mode 100644
> index 000..39f958f
> --- /dev/null
> +++ b/include/c99_compat.h
> @@ -0,0 +1,105 @@
>
> +/**
> + *
> + * Copyright 2007-2013 VMware, Inc.
> + * All Rights Reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, including
> + * without limitation the rights to use, copy, modify, merge, publish,
> + * distribute, sub license, and/or sell copies of the Software, and to
> + * permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> + * IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
> CONTRACT,
> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> + *
> +
> **/
> +
> +#ifndef _C99_COMPAT_H_
> +#define _C99_COMPAT_H_
> +
> +
> +/*
> + * C99 inline keyword
> + */
> +#ifndef inline
> +#  ifdef __cplusplus
> + /* C++ supports inline keyword */
> +#  elif defined(__GNUC__)
> +#define inline __inline__
> +#  elif defined(_MSC_VER)
> +#define inline __inline
> +#  elif defined(__ICL)
> +#define inline __inline
> +#  elif defined(__INTEL_COMPILER)
> + /* Intel compiler supports inline keyword */
> +#  elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
> +#define inline __inline
> +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
>

Solaris Studio supports __inline and __inline__


> + /* C99 supports inline keyword */
> +#  elif (__STDC_VERSION__ >= 199901L)
> + /* C99 supports inline keyword */
> +#  else
> +#define inline
> +#  endif
> +#endif
>


The order of the checks will not work as expected. Intel's compiler will
define __GNUC__, and so will clang. The check for __GNUC__ has to be the
last one.
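
To illustrate the ordering, a small sketch that only reports which branch was
taken. COMPILER_NAME and compiler_name are made up for the example; the point
is just that the compilers which also define __GNUC__ are tested before it:

```c
#include <assert.h>
#include <string.h>

/* Most specific checks first: Intel and clang also define __GNUC__, so
 * testing __GNUC__ earlier would swallow them. */
#if defined(__INTEL_COMPILER) || defined(__ICL)
#  define COMPILER_NAME "intel"
#elif defined(__clang__)
#  define COMPILER_NAME "clang"
#elif defined(_MSC_VER)
#  define COMPILER_NAME "msvc"
#elif defined(__GNUC__)
#  define COMPILER_NAME "gcc-compatible"
#else
#  define COMPILER_NAME "unknown"
#endif

/* Returns the name of the branch selected at compile time. */
static const char *
compiler_name(void)
{
   assert(strlen(COMPILER_NAME) > 0);
   return COMPILER_NAME;
}
```

The same ordering would apply to each of the inline, restrict and __func__
blocks in the header.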



> +
> +
> +/*
> + * C99 restrict keyword
> + *
> + * See also:
> + * -
> http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
> + */
> +#ifndef restrict
> +#  if (__STDC_VERSION__ >= 199901L)
> + /* C99 */
> +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
> + /* C99 */
>

Solaris Studio supports "_Restrict" when not in C99 mode as well.

#define restrict _Restrict


> +#  elif defined(__GNUC__)
> +#define restrict __restrict__
> +#  elif defined(_MSC_VER)
> +#define restrict __restrict
> +#  else
> +#define restrict /* */
> +#  endif
> +#endif
> +
> +
> +/*
> + * C99 __func__ macro
> + */
> +#ifndef __func__
> +#  if (__STDC_VERSION__ >= 199901L)
> + /* C99 */
> +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
> + /* C99 */
>

Solaris Studio supports __FUNCTION__ when not in C99 mode.


> +#  elif defined(__GNUC__)
> +#if __GNUC__ >= 2
> +#  define __func__ __FUNCTION__
> +#else
> +#  define __func__ ""
> +#endif
> +#  elif defined(_MSC_VER)
> +#if _MSC_VER >= 1300
> +#  define __func__ __FUNCTION__
> +#else
> +#  define __func__ ""
> +#endif
> +#  else
> +#define __func__ ""
> +#  endif
> +#endif
> +
> +
> +#endif /* _C99_COMPAT_H_ */
> diff --git a/src/egl/main/eglcompiler.h b/src/egl/main/eglcompiler.h
> index 9823693..2499172 100644
> --- a/src/egl/main/eglcompiler.h
> +++ b/src/egl/main/eglcompiler.h
> @@ -31,6 +31,9 @@
>  #define EGLCOMPILER_INCLUDED
>
>
> +#include "c99_compat.h" /* inline, __func__, etc. */
> +
> +
>  /**
>   * Get standard integer types
>   */
> @@ -62,30 +65,7 @@
>  #endif
>
>
> -/**
> - * Function inlining
> - */
> -#ifndef inline
> -#  ifdef __cplusplus
> - /* C++ supports inline keyword */
> -#  elif defined(__GNUC__)
> -#define inline __inline__
> -#  elif defined(_MSC_VER)
> 

[Mesa-dev] Testing optimizer

2013-12-17 Thread Patrick Baggett
Hi all,

Is there a way to see the machine code that is generated by the GLSL
compiler for all GPU instruction sets? For example, I would like to know if
the optimizer optimizes certain (equivalent) constructs (or not), and avoid
them if possible. I know there is a lot to optimization on GPUs that I
don't know, but I'd still like to get some ballpark estimates. For example,
I'm curious whether:

//let p1, p2, p3 be vec2 uniforms

vec4(p1, 0, 0) + vec4(p2, 0, 0) + vec4(p3, 0, 1)

produces identical machine code as:

vec4(p1+p2+p3, 0, 1);

for all architectures supported by Mesa.
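
To be clear about what I mean by equivalent, here is the same arithmetic
written out in C with stand-in vec2/vec4 structs. This only shows that the
two forms compute the same value; it says nothing about what any driver's
compiler emits:

```c
#include <assert.h>

struct vec2 { float x, y; };
struct vec4 { float x, y, z, w; };

/* Mimics the GLSL constructor vec4(p, z, w). */
static struct vec4
vec4_from(struct vec2 p, float z, float w)
{
   struct vec4 v = { p.x, p.y, z, w };
   return v;
}

static struct vec4
vec4_add(struct vec4 a, struct vec4 b)
{
   struct vec4 v = { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };
   return v;
}

/* vec4(p1, 0, 0) + vec4(p2, 0, 0) + vec4(p3, 0, 1) */
static struct vec4
widen_then_sum(struct vec2 p1, struct vec2 p2, struct vec2 p3)
{
   return vec4_add(vec4_add(vec4_from(p1, 0.0f, 0.0f),
                            vec4_from(p2, 0.0f, 0.0f)),
                   vec4_from(p3, 0.0f, 1.0f));
}

/* vec4(p1 + p2 + p3, 0, 1) */
static struct vec4
sum_then_widen(struct vec2 p1, struct vec2 p2, struct vec2 p3)
{
   struct vec2 s = { p1.x + p2.x + p3.x, p1.y + p2.y + p3.y };
   return vec4_from(s, 0.0f, 1.0f);
}

/* Assert that both forms agree component by component. */
static void
check_equivalence(struct vec2 p1, struct vec2 p2, struct vec2 p3)
{
   struct vec4 a = widen_then_sum(p1, p2, p3);
   struct vec4 b = sum_then_widen(p1, p2, p3);

   assert(a.x == b.x && a.y == b.y && a.z == b.z && a.w == b.w);
}
```

The first form does three vec4 additions where the second needs only two vec2
additions, which is the kind of difference I would like to see (or not see) in
the generated code.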


Re: [Mesa-dev] Testing optimizer

2013-12-17 Thread Patrick Baggett
On Tue, Dec 17, 2013 at 10:59 AM, Paul Berry wrote:

> On 17 December 2013 08:46, Tom Stellard  wrote:
>
>> On Tue, Dec 17, 2013 at 09:57:31AM -0600, Patrick Baggett wrote:
>> > Hi all,
>> >
>> > Is there a way to see the machine code that is generated by the GLSL
>> > compiler for all GPU instruction sets? For example, I would like to
>> know if
>> > the optimizer optimizes certain (equivalent) constructs (or not), and
>> avoid
>> > them if possible. I know there is a lot to optimization on GPUs that I
>> > don't know, but I'd still like to get some ballpark estimates. For
>> example,
>> > I'm curious whether:
>>
>> Each driver has its own environment variable for dumping machine code.
>>
>> llvmpipe: GALLIVM_DEBUG=asm (I think you need to build mesa
>>  with --enable-debug for this to work)
>> r300g: RADEON_DEBUG=fp,vp
>> r600g, radeonsi: R600_DEBUG=ps,vs
>>
>> I'm not sure what the other drivers use.
>>
>> -Tom
>>
>
> I believe every driver also supports MESA_GLSL=dump, which prints out the
> IR both before and after linking (you'll want to look at the version after
> linking to see what optimizations have been applied, since some
> optimizations happen at link time).  Looking at the IR rather than the
> machine code is more likely to give you the information you need, since
> Mesa performs the same IR-level optimizations on all architectures, whereas
> the optimizations that happen at machine code level are vastly different
> from one driver to the next.
>
>
I do want to see both, actually. For example, if a driver implements a
specific optimization (machine code level) and another driver clearly does
not, then that would be considered "interesting" to me.



> Another thing which might be useful to you is Aras Pranckevičius's
> "glsl-optimizer" project (https://github.com/aras-p/glsl-optimizer),
> which performs Mesa's IR-level optimizations on a shader and then
> translates it from IR back to GLSL.
>
> Paul
>
>
Thanks to everyone for the great tips!


>
>> >
>> > //let p1, p2, p3 be vec2 uniforms
>> >
>> > vec4(p1, 0, 0) + vec4(p2, 0, 0) + vec4(p3, 0, 1)
>> >
>> > produces identical machine code as:
>> >
>> > vec4(p1+p2+p3, 0, 1);
>> >
>> > for all architectures supported by Mesa.
>>


Re: [Mesa-dev] [PATCH 7/9] egl: Don't attempt to redefine stdint.h types with VS 2013.

2014-05-02 Thread Patrick Baggett
On Fri, May 2, 2014 at 10:11 AM,  wrote:

> From: José Fonseca 
>
> Just include stdint.h.
> ---
>  src/egl/main/eglcompiler.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/egl/main/eglcompiler.h b/src/egl/main/eglcompiler.h
> index 53dab54..5ea83d6 100644
> --- a/src/egl/main/eglcompiler.h
> +++ b/src/egl/main/eglcompiler.h
> @@ -37,7 +37,8 @@
>  /**
>   * Get standard integer types
>   */
> -#if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L)
> +#if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) || \
> +(defined(_MSC_VER) && _MSC_VER >= 1600)
>

VS 2010 is where support for <stdint.h> begins. This can be verified by
a quick Google search.



>  #  include <stdint.h>
>  #elif defined(_MSC_VER)
> typedef __int8 int8_t;
> --
> 1.9.1
>


Re: [Mesa-dev] [PATCH 11/21] glsl: Store ir_variable::ir_type in 8 bits instead of 32

2014-05-28 Thread Patrick Baggett
On Wed, May 28, 2014 at 2:17 PM, Ian Romanick  wrote:

> On 05/27/2014 08:28 PM, Matt Turner wrote:
> > On Tue, May 27, 2014 at 7:49 PM, Ian Romanick 
> wrote:
> >> From: Ian Romanick 
> >>
> >> No change in the peak ir_variable memory usage in a trimmed apitrace of
> >> dota2 on 64-bit.
> >>
> >> No change in the peak ir_variable memory usage in a trimmed apitrace of
> >> dota2 on 32-bit.
> >>
> >> Signed-off-by: Ian Romanick 
> >> ---
> >>  src/glsl/ir.h | 5 +++--
> >>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/src/glsl/ir.h b/src/glsl/ir.h
> >> index 7faee74..bc02f6e 100644
> >> --- a/src/glsl/ir.h
> >> +++ b/src/glsl/ir.h
> >> @@ -92,12 +92,13 @@ enum ir_node_type {
> >>   */
> >>  class ir_instruction : public exec_node {
> >>  private:
> >> -   enum ir_node_type ir_type;
> >> +   uint8_t ir_type;
> >>
> >>  public:
> >> inline enum ir_node_type get_ir_type() const
> >> {
> >> -  return this->ir_type;
> >> +  STATIC_ASSERT(ir_type_max < 256);
> >> +  return (enum ir_node_type) this->ir_type;
> >> }
> >>
> >> /**
> >> --
> >> 1.8.1.4
> >
> > Instead of doing this, you can mark the enum type with the PACKED
> > attribute. I did this in a similar change in i965 already. See
> > http://lists.freedesktop.org/archives/mesa-dev/2014-February/054643.html
> >
> > This way we still get enum type checking and warnings out of switch
> > statements and such.
>
> Hmm... that would mean that patch 10 wouldn't strictly be necessary.
> The disadvantage is that the next patch would need (right?) some changes
> for MSVC, especially on 32-bit.  I think it would need to be
>
> #if sizeof(ir_node_type) < sizeof(void *)
>

I don't think the preprocessor can evaluate sizeof().
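
#if is evaluated before the compiler knows anything about types, so sizeof
can't appear there. A rough sketch of one way to get the same padding
computed at compile time instead, assuming C11 is available; the cut-down
enum and the struct name padded_node are made up for illustration and are
not Mesa's:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real enum; only here so the sketch compiles. */
enum ir_node_type { ir_type_unset, ir_type_variable, ir_type_max };

/* sizeof is fine in an integer constant expression, just not in #if.
 * An enum constant gives a compile-time value usable as an array size. */
enum {
   PADDING_BYTES = sizeof(enum ir_node_type) < sizeof(void *)
                      ? sizeof(void *) - sizeof(enum ir_node_type)
                      : sizeof(void *)
};

struct padded_node {
   enum ir_node_type type;
   uint8_t padding[PADDING_BYTES];
};

/* These are checked by the compiler proper, so sizes are visible (C11). */
static_assert(ir_type_max < 256, "ir_node_type must fit in 8 bits");
static_assert(PADDING_BYTES > 0, "padding array must not be empty");
```

The "GCC did us wrong" #error could become another static_assert in the
same style, though I haven't tried that against MSVC.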


> # define PADDING_BYTES (sizeof(void *) - sizeof(ir_node_type))
> #else
> # define PADDING_BYTES sizeof(void *)
> #  if (__GNUC__ >= 3)
> #error "GCC did us wrong."
> #  endif
> #endif
>
>   uint8_t padding[PADDING_BYTES];
>
> Seems a little sketchy, but might still be better... hmm...


Re: [Mesa-dev] [PATCH 03/17] swrast: Factor out texture slice counting.

2013-04-22 Thread Patrick Baggett
On Mon, Apr 22, 2013 at 11:14 AM, Eric Anholt  wrote:

> This function is going to get used a lot more in upcoming patches.
> ---
>  src/mesa/swrast/s_texture.c |   16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/swrast/s_texture.c b/src/mesa/swrast/s_texture.c
> index 51048be..36a90dd 100644
> --- a/src/mesa/swrast/s_texture.c
> +++ b/src/mesa/swrast/s_texture.c
> @@ -58,6 +58,14 @@ _swrast_delete_texture_image(struct gl_context *ctx,
> _mesa_delete_texture_image(ctx, texImage);
>  }
>
> +static unsigned int
> +texture_slices(struct gl_texture_image *texImage)
> +{
> +   if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY)
> +  return texImage->Height;
> +   else
> +  return texImage->Depth;
> +}
>
>
I think you can const-qualify 'texImage'.
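
Something like this is what I mean, as a sketch only. The structs below are
cut-down stand-ins so it compiles on its own (the real ones have many more
fields), and the GL enum values are copied in by hand:

```c
/* Values as defined in the GL headers; repeated here to stay self-contained. */
#define GL_TEXTURE_3D       0x806F
#define GL_TEXTURE_1D_ARRAY 0x8C18

/* Hypothetical, trimmed versions of the Mesa structs. */
struct gl_texture_object {
   unsigned Target;
};

struct gl_texture_image {
   struct gl_texture_object *TexObject;
   unsigned Height;
   unsigned Depth;
};

/* The function only reads through texImage, so the pointer can be const. */
static unsigned int
texture_slices(const struct gl_texture_image *texImage)
{
   if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY)
      return texImage->Height;
   else
      return texImage->Depth;
}
```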


>  /**
>   * Called via ctx->Driver.AllocTextureImageBuffer()
> @@ -83,11 +91,11 @@ _swrast_alloc_texture_image_buffer(struct gl_context
> *ctx,
>  * We allocate the array for 1D/2D textures too in order to avoid
> special-
>  * case code in the texstore routines.
>  */
> -   swImg->ImageOffsets = malloc(texImage->Depth * sizeof(GLuint));
> +   swImg->ImageOffsets = malloc(texture_slices(texImage) *
> sizeof(GLuint));
> if (!swImg->ImageOffsets)
>return GL_FALSE;
>
> -   for (i = 0; i < texImage->Depth; i++) {
> +   for (i = 0; i < texture_slices(texImage); i++) {
>swImg->ImageOffsets[i] = i * texImage->Width * texImage->Height;
> }
>
> @@ -209,20 +217,20 @@ _swrast_map_teximage(struct gl_context *ctx,
>
> map = swImage->Buffer;
>
> +   assert(slice < texture_slices(texImage));
> +
> if (texImage->TexObject->Target == GL_TEXTURE_3D ||
> texImage->TexObject->Target == GL_TEXTURE_2D_ARRAY) {
>GLuint sliceSize = _mesa_format_image_size(texImage->TexFormat,
>   texImage->Width,
>   texImage->Height,
>   1);
> -  assert(slice < texImage->Depth);
>map += slice * sliceSize;
> } else if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY) {
>GLuint sliceSize = _mesa_format_image_size(texImage->TexFormat,
>   texImage->Width,
>   1,
>   1);
> -  assert(slice < texImage->Height);
>map += slice * sliceSize;
> }
>
> --
> 1.7.10.4
>


Re: [Mesa-dev] glxgears performance higher with software renderer compared to h/w drivers

2013-05-06 Thread Patrick Baggett
I don't think glxgears is the best benchmark for what is a "typical" OpenGL
load (if there is a "typical"). The 60 FPS with your hardware driver sounds
suspiciously like the refresh rate of your screen; perhaps it is
synchronized with the vertical retrace? Since I'm assuming you want to find
the fastest driver, why not try a free and open source game like openarena
to give you a better idea of how they actually perform?


Patrick


On Mon, May 6, 2013 at 9:33 AM, Divick Kishore wrote:

> Hi,
>  I am trying to build s/w only mesa driver. It seems that the
> performance of software only renderer (compiled with
> --with-driver=xlib) is higher than that of h/w drivers. Could someone
> please help me understand what is causing this or if it is expected?
>
> I see that dri based s/w renderer is also slower than xlib/swrast
> driver. So how does dri based s/w rendering work and why is it slower
> than xlib/swrast driver?
>
> I presume that --with-driver=xlib builds s/w only renderer. Please
> correct me if I am wrong.
>
> ./configure -build=x86_64-linux-gnu --with-driver=dri
> --with-dri-drivers="i915 swrast"
>
> --with-dri-driverdir=/home/divick/work/mesa/mesa-8.0.5/build/dri/x86_64-linux-gnu/
>
> --with-dri-searchpath='/home/divick/work/mesa/mesa-8.0.5/build/dri/x86_64-linux-gnu/'
> --enable-glx-tls --enable-shared-glapi --enable-texture-float
> --enable-xa --enable-driglx-direct --with-egl-platforms="x11 drm"
> --enable-gallium-llvm --with-gallium-drivers="swrast i915"
> --enable-gles1 --enable-gles2 --enable-openvg --enable-gallium-egl
> --disable-glu CFLAGS="-Wall -g -O2" CXXFLAGS="-Wall -g -O2"
>
> with LIBGL_ALWAYS_SOFTWARE=1
> glxgears reports:
>
> GL_RENDERER   = Software Rasterizer
> GL_VERSION= 2.1 Mesa 8.0.5
> GL_VENDOR = Mesa Project
>
> fps: ~ 490 fps
>
> Without LIBGL_ALWAYS_SOFTWARE set:
>
> GL_RENDERER   = Mesa DRI Intel(R) Sandybridge Mobile
> GL_VERSION= 3.0 Mesa 8.0.5
> GL_VENDOR = Tungsten Graphics, Inc
>
> fps: ~ 60
>
> When compiled with configure options
>
>  --build=x86_64-linux-gnu --disable-egl --with-gallium-drivers=
> --with-driver=xlib --disable-egl CFLAGS="-Wall -g -O2" CXXFLAGS="-Wall
> -g -O2"
>
> glxgears reports:
>
> GL_RENDERER   = Mesa X11
> GL_VERSION= 2.1 Mesa 8.0.5
> GL_VENDOR = Brian Paul
>
> fps: ~1600
>
> With drivers installed on system and with LIBGL_ALWAYS_SOFTWARE=1:
>
> GL_RENDERER   = Gallium 0.4 on llvmpipe (LLVM 0x209)
> GL_VERSION= 2.1 Mesa 8.0.5
> GL_VENDOR = VMware, Inc.
>
> fps: ~ 1130
>
> Thanks & Regards,
> Divick


Re: [Mesa-dev] No configs available with xlib based egl

2013-05-07 Thread Patrick Baggett
Perhaps 16-bit color isn't supported? Maybe try other color bits or set
R/G/B individually and see what happens. Also, Mesa ships the source for an
eglinfo tool, which can probably tell you a whole lot more.


Patrick


On Tue, May 7, 2013 at 7:56 AM, Divick Kishore wrote:

> Hi,
> I have compiled mesa with the following options:
>
> .././configure --prefix=~/lib/mesa/swrast/ --build=x86_64-linux-gnu
> --with-gallium-drivers= --with-driver=xlib --enable-egl --enable-gles1
> --enable-gles2 --with-egl-platforms="x11" CFLAGS="-Wall -g -O2"
> CXXFLAGS="-Wall -g -O2"
>
> but when I run a sample app with the following egl config, it returns 0
> configs.
>
> EGLint attr[] = {   // some attributes to set up our egl-interface
>   EGL_BUFFER_SIZE, 16,
>   EGL_RENDERABLE_TYPE,
>   EGL_OPENGL_ES2_BIT,
>   EGL_NONE
>};
>
>EGLConfig  ecfg;
>EGLint num_config;
>if ( !eglChooseConfig( egl_display, attr, &ecfg, 1, &num_config ) ) {
>   cerr << "Failed to choose config (eglError: " << eglGetError()
> << ")" << endl;
>   return 1;
>}
>
>
> The code above prints 'Failed to choose config'.
>
> While the same code works fine when I compile with:
>
> ../../configure --prefix=~/lib/mesa/dri --build=x86_64-linux-gnu
> --with-driver=dri --with-dri-drivers="swrast"
> --with-dri-driverdir=~/lib/mesa/dri/
> --with-dri-searchpath='~/lib/mesa/dri' --enable-glx-tls --enable-xa
> --enable-driglx-direct --with-egl-platforms="x11"
> --enable-gallium-llvm=yes --with-gallium-drivers="swrast"
> --enable-gles1 --enable-gles2 --enable-gallium-egl --disable-glu
> CFLAGS="-Wall -g -O2" CXXFLAGS="-Wall -g -O2"
>
> Could someone please suggest what could be causing this?
>
> Thanks & Regards,
> Divick


Re: [Mesa-dev] [PATCH 4/5] radeonsi/compute: Pass kernel arguments in a buffer

2013-05-24 Thread Patrick Baggett
The only difference I could see is that the old code passed &cb->buffer
(which maybe points to a value?) directly into u_upload_data(), whereas the
new code passes &cb->buffer as the rbuffer parameter of
r600_upload_const_buffer(), which does *rbuffer = NULL before anything
else. That effectively erases any previous pointer, so if u_upload_data()
examines *rbuffer, it may see something different than before. I don't
know if that matters, though.
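
To illustrate the kind of difference I mean, here is a toy sketch. The
upload() function below is hypothetical and is NOT the real u_upload_data();
it just assumes an uploader that looks at the incoming pointer, which I
haven't verified is what the real one does:

```c
#include <stddef.h>

struct buf {
   int id;
};

static struct buf pool[2] = { { 1 }, { 2 } };

/* Hypothetical uploader: keeps the caller's buffer if there is one,
 * otherwise hands out a fresh one. */
static void upload(struct buf **out)
{
   if (*out == NULL)
      *out = &pool[1];
}

/* Old style: the caller's pointer goes straight through. */
static void upload_direct(struct buf **out)
{
   upload(out);
}

/* New style: the pointer is cleared first, as in the wrapper in this patch. */
static void upload_via_wrapper(struct buf **out)
{
   *out = NULL;
   upload(out);
}
```

With an uploader like that, the direct call keeps the old buffer and the
wrapper always ends up with a new one.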

Patrick


On Fri, May 24, 2013 at 1:07 PM, Tom Stellard  wrote:

> From: Tom Stellard 
>
> ---
>  src/gallium/drivers/radeonsi/r600_buffer.c  | 31
> +
>  src/gallium/drivers/radeonsi/radeonsi_compute.c | 26 ++---
>  src/gallium/drivers/radeonsi/si_state.c | 29
> +++
>  3 files changed, 51 insertions(+), 35 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c
> b/src/gallium/drivers/radeonsi/r600_buffer.c
> index cdf9988..87763c3 100644
> --- a/src/gallium/drivers/radeonsi/r600_buffer.c
> +++ b/src/gallium/drivers/radeonsi/r600_buffer.c
> @@ -25,6 +25,8 @@
>   *  Corbin Simpson 
>   */
>
> +#include 
> +
>  #include "pipe/p_screen.h"
>  #include "util/u_format.h"
>  #include "util/u_math.h"
> @@ -168,3 +170,32 @@ void r600_upload_index_buffer(struct r600_context
> *rctx,
> u_upload_data(rctx->uploader, 0, count * ib->index_size,
>   ib->user_buffer, &ib->offset, &ib->buffer);
>  }
> +
> +void r600_upload_const_buffer(struct r600_context *rctx, struct
> si_resource **rbuffer,
> +   const uint8_t *ptr, unsigned size,
> +   uint32_t *const_offset)
> +{
> +   *rbuffer = NULL;
> +
> +   if (R600_BIG_ENDIAN) {
> +   uint32_t *tmpPtr;
> +   unsigned i;
> +
> +   if (!(tmpPtr = malloc(size))) {
> +   R600_ERR("Failed to allocate BE swap buffer.\n");
> +   return;
> +   }
> +
> +   for (i = 0; i < size / 4; ++i) {
> +   tmpPtr[i] = bswap_32(((uint32_t *)ptr)[i]);
> +   }
> +
> +   u_upload_data(rctx->uploader, 0, size, tmpPtr,
> const_offset,
> +   (struct pipe_resource**)rbuffer);
> +
> +   free(tmpPtr);
> +   } else {
> +   u_upload_data(rctx->uploader, 0, size, ptr, const_offset,
> +   (struct pipe_resource**)rbuffer);
> +   }
> +}
> diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c
> b/src/gallium/drivers/radeonsi/radeonsi_compute.c
> index 3fb6eb1..035076d 100644
> --- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
> +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
> @@ -91,8 +91,11 @@ static void radeonsi_launch_grid(
> struct r600_context *rctx = (struct r600_context*)ctx;
> struct si_pipe_compute *program = rctx->cs_shader_state.program;
> struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
> +   struct si_resource *input_buffer;
> +   uint32_t input_offset = 0;
> +   uint64_t input_va;
> uint64_t shader_va;
> -   unsigned arg_user_sgpr_count;
> +   unsigned arg_user_sgpr_count = 2;
> unsigned i;
> struct si_pipe_shader *shader = &program->kernels[pc];
>
> @@ -109,21 +112,16 @@ static void radeonsi_launch_grid(
> si_pm4_inval_shader_cache(pm4);
> si_cmd_surface_sync(pm4, pm4->cp_coher_cntl);
>
> -   arg_user_sgpr_count = program->input_size / 4;
> -   if (program->input_size % 4 != 0) {
> -   arg_user_sgpr_count++;
> -   }
> +   /* Upload the input data */
> +   r600_upload_const_buffer(rctx, &input_buffer, input,
> +   program->input_size,
> &input_offset);
> +   input_va = r600_resource_va(ctx->screen, (struct
> pipe_resource*)input_buffer);
> +   input_va += input_offset;
>
> -   /* XXX: We should store arguments in memory if we run out of user
> sgprs.
> -*/
> -   assert(arg_user_sgpr_count < 16);
> +   si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ);
>
> -   for (i = 0; i < arg_user_sgpr_count; i++) {
> -   uint32_t *args = (uint32_t*)input;
> -   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 +
> -   (i * 4),
> -   args[i]);
> -   }
> +   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va);
> +   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4,
> S_008F04_BASE_ADDRESS_HI (input_va >> 32) | S_008F04_STRIDE(0));
>
> si_pm4_set_reg(pm4, R_00B810_COMPUTE_START_X, 0);
> si_pm4_set_reg(pm4, R_00B814_COMPUTE_START_Y, 0);
> diff --git a/src/gallium/drivers/radeonsi/si_state.c
> b/src/gallium/drivers/radeonsi/si_state.c
> index dec535c..1e94f7e 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c

Re: [Mesa-dev] forking shared intel directory?

2013-06-21 Thread Patrick Baggett
On Fri, Jun 21, 2013 at 1:29 PM, Eric Anholt  wrote:

> Long ago, when porting FBO and memory management support to i965, I
> merged a bunch of code between the i915 and i965 drivers and put it in
> the intel directory.  I think it served us well for a long time, as both
> drivers got improvements from shared work on that code.  But since then,
> we've talked several times about splitting things back apart (since we
> break i915 much more often than we improve it), so I spent yesterday and
> today looking at what the impact would be.
>
>
I'm not a developer, but I like to keep up with the drivers that I have
hardware for. Please take my opinions with a grain of salt.

When you say you break i915 more than you improve it, do you mean to say
that it is difficult to improve !i915 without breaking i915 and therefore
to improve development speed, it should be forked OR that i915 doesn't
receive enough testing / have maintainers who can resolve the issues and so
it burdens other developers to fix i915 and hence slows development?

The reason I ask is because if it is #2, then it sounds like you should be
looking for someone to volunteer as the official i915 maintainer [and if
none, then fork], but if it is #1, then maintainer or not, it will slow
down your efforts.




> LOC counts (wc -l):
>
>             intel/   i915/   i965/   total
> master:     14751    13458   61109   89318
> fork-i915:      0    24322   74978   99300
>
> We duplicate ~10000 lines of code, but i915 drops ~4000 lines of code
> from its build and i965 drops ~1000.
>
> context size:
>             i915     i965
> master:     99512    101456
> fork-i915:  99384    100824
>
> There's a bunch of cleanup I haven't done in the branch, like moving
> brw_vtbl.c contents to sensible places, or nuking the intel vs brw split
> that doesn't make any sense any more.
>
> I'm ambivalent about the change.  If the code growth from splitting was
> <7000 lines or so, I'd be happy, but this feels pretty big.  On the
> other hand, the cleanups feel good to me.  I don't know how other
> developers feel.  There's a branch up at fork-i915 of my tree.  If
> people are excited about doing this and I get a bunch of acks for the
> two "copy the code to my directory" commits, I'll do those two then
> start sending out the non-copying changes for review.  If people don't
> like it, I won't be hurt.
>


Re: [Mesa-dev] forking shared intel directory?

2013-06-21 Thread Patrick Baggett
On Fri, Jun 21, 2013 at 3:53 PM, Kenneth Graunke wrote:

> On 06/21/2013 01:25 PM, Patrick Baggett wrote:
>
>> I'm not a developer, but I like to keep up with the drivers that I have
>> hardware for. Please take my opinions with a grain of salt.
>>
>> When you say you break i915 more than you improve it, do you mean to say
>> that it is difficult to improve !i915 without breaking i915 and
>> therefore to improve development speed, it should be forked OR that i915
>> doesn't receive enough testing / have maintainers who can resolve the
>> issues and so it burdens other developers to fix i915 and hence slows
>> development?
>>
>> The reason I ask if because if it is #2, then it sounds like you should
>> be looking for someone to volunteer as the official i915 maintainer [and
>> if none, then fork], but if it is #1, then maintainer or not, it will
>> slow down your efforts.
>>
>
> Mostly the former...i915c already supports everything the hardware can do,
> while we're continually adding new features to i965+ (well, mostly gen6+).
>  Things like HiZ, fast color clears, and ETC texture compression support
> affect the common miptree code, but they do nothing for i915 class
> hardware...there's only a potential downside of accidental breakage.
>
> The latter is true as well.  Unfortunately, community work is hampered by
> the fact that Intel hasn't released public documentation for i915 class
> hardware.  From time to time we've tried to find and motivate the right
> people to make that happen, but it hasn't yet.  Most people in the
> community are also more interested in working on the i915g driver.
>
>
Ah, thanks for the explanation, though I guess it doesn't do a whole, whole
lot to answer Eric's question.

On a side note: I was interested in the i915g driver, but I couldn't find
any documentation for it other than some architectural information about
the GPU's pipeline. I'm glad I wasn't just lacking the Google-fu. :\


Re: [Mesa-dev] forking shared intel directory?

2013-06-21 Thread Patrick Baggett
> The latter is true as well.  Unfortunately, community work is hampered by
>> the fact that Intel hasn't released public documentation for i915 class
>> hardware.  From time to time we've tried to find and motivate the right
>> people to make that happen, but it hasn't yet.  Most people in the
>> community are also more interested in working on the i915g driver.
>>
>>
> Ah, thanks for the explanation, though I guess it doesn't do a whole,
> whole lot to answer Eric's question.
>
>
That is to say, hearing that there isn't just a lack of maintainer or just
lack of ease for new development doesn't make either option seem better to
me, but you all know what's best here. Thanks for the info!


Re: [Mesa-dev] [PATCH] r300g: add program name check for BSD

2013-06-26 Thread Patrick Baggett
On Wed, Jun 26, 2013 at 2:11 AM, Jonathan Gray  wrote:

> program_invocation_short_name is glibc specific.  Provide an
> alternative using getprogname(), which can be found on *BSD and OS X.
>
> Signed-off-by: Jonathan Gray 
> ---
>  src/gallium/drivers/r300/r300_chipset.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git src/gallium/drivers/r300/r300_chipset.c
> src/gallium/drivers/r300/r300_chipset.c
> index 11061ed..7f51ccb 100644
> --- src/gallium/drivers/r300/r300_chipset.c
> +++ src/gallium/drivers/r300/r300_chipset.c
> @@ -30,6 +30,14 @@
>  #include 
>  #include 
>
> +#undef GET_PROGRAM_NAME
> +#ifdef __GLIBC__
> +#  define GET_PROGRAM_NAME() program_invocation_short_name
>

I think you are missing parentheses on the end of
program_invocation_short_name


> +#else /* *BSD and OS X */
> +#  include 
> +#  define GET_PROGRAM_NAME() getprogname()
> +#endif
> +
>  /* r300_chipset: A file all to itself for deducing the various properties
> of
>   * Radeons. */
>
> @@ -49,7 +57,7 @@ static void r300_apply_hyperz_blacklist(struct
> r300_capabilities* caps)
>  int i;
>
>  for (i = 0; i < Elements(list); i++) {
> -if (strcmp(list[i], program_invocation_short_name) == 0) {
> +if (strcmp(list[i], GET_PROGRAM_NAME()) == 0) {
>  caps->zmask_ram = 0;
>  caps->hiz_ram = 0;
>  break;
> --
> 1.8.3.1
>


Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions

2014-02-05 Thread Patrick Baggett
My understanding is that this is like having MAP_UNSYNCHRONIZED on at all
times, even when the buffer isn't "mapped", because it is always mapped
(into memory). Is that correct, Jose?

Patrick


On Wed, Feb 5, 2014 at 11:53 AM, Grigori Goronzy  wrote:

> On 05.02.2014 18:08, Jose Fonseca wrote:
>
>> I honestly hope that GL_AMD_pinned_memory doesn't become popular. It
>> would have been alright if it wasn't for this bit in
>> http://www.opengl.org/registry/specs/AMD/pinned_memory.txt which says:
>>
>>  2) Can the application still use the buffer using the CPU address?
>>
>>  RESOLVED: YES. However, this access would be completely
>>  non synchronized to the OpenGL pipeline, unless explicit
>>  synchronization is being used (for example, through glFinish or
>> by using
>>  sync objects).
>>
>> And I'm imagining apps which are streaming vertex data doing precisely
>> just that...
>>
>>
> I don't understand your concern, this is exactly the same behavior
> GL_MAP_UNSYNCHRONIZED_BIT has, and apps are supposedly using that properly.
> How does apitrace handle it?
>
> Grigori
>


Re: [Mesa-dev] [PATCH 4/8] radeonsi: Use util_cpu_to_le32() instead of bswap32() on big-endian systems

2014-02-20 Thread Patrick Baggett
FWIW, memcpy() vs a for() loop has different semantics with respect to
address alignment. I don't know how much it will matter, but last time I
was reading assembly output, copying int[] via for() loop didn't produce a
codepath for 16-byte aligned addresses (allowing for SSE streaming) while
memcpy() has a lot of such logic. This won't matter much unless you have
lots to copy, and of course, compiler optimizations can change, so maybe
this situation has changed.
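
For reference, these are the two shapes I'm comparing, as a minimal sketch.
The function names are made up, and what the compiler actually emits for
either one depends on the compiler version and flags:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Element-by-element copy; codegen is entirely up to the compiler. */
static void copy_loop(uint32_t *dst, const uint32_t *src, size_t count)
{
   size_t i;

   for (i = 0; i < count; ++i)
      dst[i] = src[i];
}

/* Library copy; the libc routine gets to choose its own strategy based
 * on the size and alignment it sees at run time. */
static void copy_memcpy(uint32_t *dst, const uint32_t *src, size_t count)
{
   memcpy(dst, src, count * sizeof(*dst));
}
```

Both produce the same bytes in dst; the only question is how fast they get
there.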

Patrick


On Thu, Feb 20, 2014 at 8:11 PM, Michel Dänzer  wrote:

> On Don, 2014-02-20 at 10:21 -0800, Tom Stellard wrote:
> >
> > diff --git a/src/gallium/drivers/radeonsi/si_shader.c
> b/src/gallium/drivers/radeonsi/si_shader.c
> > index 54270cd..9b04e6b 100644
> > --- a/src/gallium/drivers/radeonsi/si_shader.c
> > +++ b/src/gallium/drivers/radeonsi/si_shader.c
> > @@ -2335,7 +2335,7 @@ int si_compile_llvm(struct si_context *sctx,
> struct si_pipe_shader *shader,
> >   ptr = (uint32_t*)sctx->b.ws->buffer_map(shader->bo->cs_buf,
> sctx->b.rings.gfx.cs, PIPE_TRANSFER_WRITE);
> >   if (0 /*SI_BIG_ENDIAN*/) {
> >   for (i = 0; i < binary.code_size / 4; ++i) {
> > - ptr[i] = util_bswap32(*(uint32_t*)(binary.code +
> i*4));
> > + ptr[i] =
> util_cpu_to_le32((*(uint32_t*)(binary.code + i*4)));
> >   }
> >   } else {
> >   memcpy(ptr, binary.code, binary.code_size);
>
> We could get rid of the separate *_ENDIAN paths using util_cpu_to_le*().
>
> Either way, the non-clover patches are
>
> Reviewed-by: Michel Dänzer 
>
>
> --
> Earthling Michel Dänzer|  http://www.amd.com
> Libre software enthusiast  |Mesa and X developer
>


Re: [Mesa-dev] [PATCH] util: Unbreak usage of assert()/debug_assert() inside expressions.

2014-12-12 Thread Patrick Baggett
On Fri, Dec 12, 2014 at 10:17 AM, Roland Scheidegger 
wrote:

> Am 12.12.2014 um 15:09 schrieb Jose Fonseca:
> > From: José Fonseca 
> >
> > f0ba7d897d1c22202531acb70f134f2edc30557d made debug_assert()/assert()
> > unsafe for expressions, but only now with u_atomic.h started to rely on
> > them for Windows this became an issue.
> >
> > This fixes non-debug builds with MSVC.
> > ---
> >  src/gallium/auxiliary/util/u_debug.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/gallium/auxiliary/util/u_debug.h
> b/src/gallium/auxiliary/util/u_debug.h
> > index badd5e2..4c22fdf 100644
> > --- a/src/gallium/auxiliary/util/u_debug.h
> > +++ b/src/gallium/auxiliary/util/u_debug.h
> > @@ -185,7 +185,7 @@ void _debug_assert_fail(const char *expr,
> >  #ifdef DEBUG
> >  #define debug_assert(expr) ((expr) ? (void)0 :
> _debug_assert_fail(#expr, __FILE__, __LINE__, __FUNCTION__))
> >  #else
> > -#define debug_assert(expr) do { } while (0 && (expr))
> > +#define debug_assert(expr) (void)(0 && (expr))
> >  #endif
> >
> >
> >
>

Just for my own education, can someone explain what the need is for
`debug_assert()` to have any expansion of `expr` at all? Rather, what
breaks with something like:

  #define debug_assert(expr) ((void)0)
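
To make the question concrete, here are the two variants side by side as a
small sketch. The _patch/_empty macro names and the bump() helper are made
up for this mail. As far as I can tell, neither evaluates expr and both can
sit inside a larger expression; the difference is that the patch's version
still has the compiler parse and type-check expr:

```c
static int calls;

/* Side-effecting helper so that any evaluation would be visible. */
static int bump(void)
{
   return ++calls;
}

/* Variant from the patch: expr is still compiled, but the 0 && short-circuit
 * means it is never evaluated. */
#define debug_assert_patch(expr) (void)(0 && (expr))

/* Variant I'm asking about: expr disappears entirely. */
#define debug_assert_empty(expr) ((void)0)

static int run_both(void)
{
   calls = 0;
   debug_assert_patch(bump() == 1);
   debug_assert_empty(bump() == 1);
   /* Both are expressions, so they also work with the comma operator. */
   return (debug_assert_patch(bump()), debug_assert_empty(bump()), calls);
}
```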


Re: [Mesa-dev] [Bug 27512] Illegal instruction _mesa_x86_64_transform_points4_general

2016-01-05 Thread Patrick Baggett
Given that there is a _mesa_3dnow_transform_points4_2d in the x86-64 asm
(using MMX/3DNow! is deprecated in x86-64), it appears that this code was
copy-pasted. I wrote a quick patch to change prefetch[w] to prefetcht1,
which is more or less the equivalent in SSE. However, I'm not actually sure
those prefetches really benefit the code since they appear to be monotonic
addresses and hinting only 16 bytes ahead (a cache line is almost always at
least 32 bytes) -- maybe that sort of testing is for another day.


Re: [Mesa-dev] [PATCH] i965/tiled_memcpy: don't unconditionally use __builtin_bswap32

2016-04-19 Thread Patrick Baggett
On Mon, Apr 18, 2016 at 9:31 PM, Jonathan Gray  wrote:

> Use the defines Mesa configure sets to indicate presence of the bswap32
> builtins.  This lets i965 work on OpenBSD again after the changes that
> were made in 0a5d8d9af42fd77fce1492d55f958da97816961a.
>
> Signed-off-by: Jonathan Gray 
> ---
>  src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> index a549854..c888e46 100644
> --- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> +++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> @@ -64,6 +64,19 @@ ror(uint32_t n, uint32_t d)
> return (n >> d) | (n << (32 - d));
>  }
>
> +static inline uint32_t
> +bswap32(uint32_t n)
> +{
> +#if defined(HAVE___BUILTIN_BSWAP32)
> +   return __builtin_bswap32(n);
> +#else
> +   return (n >> 24) |
> +  ((n >> 8) & 0x0000ff00) |
> +  ((n << 8) & 0x00ff0000) |
> +  (n << 24);
> +#endif
> +}
>

If I recall, GCC recognizes an open-coded byte swapping function and will
replace it with the BSWAP instruction. I'm about 99% sure it is not
necessary to use __builtin_bswap32() to have the benefits of using BSWAP.
While I understand that you're trying to fix the use of
__builtin_bswap32(), I don't think it is really necessary to continue to
use it in your wrapper function. I'm not sure about -O0 though... anyways,
maybe it isn't worth looking too hard into, but you might be able to drop
some of the ugly #if defined() stuff.
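
In other words, something like the following with no #if at all. This is
just a sketch (the function name is mine), and whether it really becomes a
single BSWAP depends on the compiler and optimization level, so it is worth
checking the assembly:

```c
#include <stdint.h>

/* Plain C byte swap: each byte is shifted into its mirrored position. */
static inline uint32_t
bswap32_open_coded(uint32_t n)
{
   return (n >> 24) |
          ((n >> 8) & 0x0000ff00) |
          ((n << 8) & 0x00ff0000) |
          (n << 24);
}
```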



> +
>  /**
>   * Copy RGBA to BGRA - swap R and B.
>   */
> @@ -76,7 +89,7 @@ rgba8_copy(void *dst, const void *src, size_t bytes)
> assert(bytes % 4 == 0);
>
> while (bytes >= 4) {
> -  *d = ror(__builtin_bswap32(*s), 8);
> +  *d = ror(bswap32(*s), 8);
>d += 1;
>s += 1;
>bytes -= 4;
> --
> 2.8.1
>


Re: [Mesa-dev] [PATCH 2/3][RFC v2] mesa/main/x86: Add sse2 streaming clamping

2014-11-04 Thread Patrick Baggett
On Tue, Nov 4, 2014 at 6:05 AM, Juha-Pekka Heikkila <
juhapekka.heikk...@gmail.com> wrote:

> Signed-off-by: Juha-Pekka Heikkila 
> ---
>  src/mesa/Makefile.am  |   8 +++
>  src/mesa/main/x86/sse2_clamping.c | 103
> ++
>  src/mesa/main/x86/sse2_clamping.h |  49 ++
>  3 files changed, 160 insertions(+)
>  create mode 100644 src/mesa/main/x86/sse2_clamping.c
>  create mode 100644 src/mesa/main/x86/sse2_clamping.h
>
> diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
> index e71bccb..5d3c6f5 100644
> --- a/src/mesa/Makefile.am
> +++ b/src/mesa/Makefile.am
> @@ -111,6 +111,10 @@ if SSE41_SUPPORTED
>  ARCH_LIBS += libmesa_sse41.la
>  endif
>
> +if SSE2_SUPPORTED
> +ARCH_LIBS += libmesa_sse2.la
> +endif
> +
>  MESA_ASM_FILES_FOR_ARCH =
>
>  if HAVE_X86_ASM
> @@ -154,6 +158,10 @@ libmesa_sse41_la_SOURCES = \
> main/streaming-load-memcpy.c
>  libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1
>
> +libmesa_sse2_la_SOURCES = \
> +   main/x86/sse2_clamping.c
> +libmesa_sse2_la_CFLAGS = $(AM_CFLAGS) -msse2
> +
>  pkgconfigdir = $(libdir)/pkgconfig
>  pkgconfig_DATA = gl.pc
>
> diff --git a/src/mesa/main/x86/sse2_clamping.c
> b/src/mesa/main/x86/sse2_clamping.c
> new file mode 100644
> index 0000000..7df1c85
> --- /dev/null
> +++ b/src/mesa/main/x86/sse2_clamping.c
> @@ -0,0 +1,103 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Authors:
> + *Juha-Pekka Heikkila 
> + *
> + */
> +
> +#ifdef __SSE2__
> +#include "main/macros.h"
> +#include "main/x86/sse2_clamping.h"
> +#include 
> +
> +/**
> + * Clamp four float values to [min,max]
> + */
> +static inline void
> +_mesa_clamp_float_rgba(GLfloat src[4], GLfloat result[4], const float min,
> +   const float max)
> +{
> +   __m128  operand, minval, maxval;
> +
> +   operand = _mm_loadu_ps(src);
> +   minval = _mm_set1_ps(min);
> +   maxval = _mm_set1_ps(max);
> +   operand = _mm_max_ps(operand, minval);
> +   operand = _mm_min_ps(operand, maxval);
> +   _mm_storeu_ps(result, operand);
> +}
> +
> +
> +/* Clamp n amount float rgba pixels to [min,max] using SSE2
>

Conceptually, _mesa_streaming_clamp_float_rgba() is clamping a contiguous
array of floats to some min/max value. The fact that they are pixels is
somewhat incidental when looking at it from a stream perspective. It looks
like the code is more or less just operating on n*4 floats. Given that, a
more efficient implementation would check alignment and then use aligned
loads and streaming stores. It doesn't really matter if you straddle pixel
boundaries as long as each float is operated on. I'm not sure how much
effort you want to put into this though. :)
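
To make that concrete, here is a minimal sketch, not a drop-in replacement for the patch. clamp_float_stream() is a hypothetical name, and the sketch assumes SSE2 is enabled at compile time (it falls back to a plain scalar loop otherwise). It treats the pixels as one flat run of floats, uses aligned loads and streaming stores only when both pointers are 16-byte aligned, and handles any tail with scalar code.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: clamp `count` floats from src into dst.
 * Note: the vector and scalar paths may treat NaN inputs differently.
 */
static void
clamp_float_stream(const float *src, float *dst, size_t count,
                   float min, float max)
{
   size_t i = 0;

#if defined(__SSE2__)
   /* Take the vector path only if both pointers are 16-byte aligned. */
   if ((((uintptr_t)src | (uintptr_t)dst) & 15) == 0) {
      const __m128 minval = _mm_set1_ps(min);
      const __m128 maxval = _mm_set1_ps(max);

      for (; i + 4 <= count; i += 4) {
         __m128 v = _mm_load_ps(src + i);               /* aligned load */
         v = _mm_min_ps(_mm_max_ps(v, minval), maxval);
         _mm_stream_ps(dst + i, v);                     /* streaming store */
      }
      _mm_sfence();   /* order the streaming stores before later accesses */
   }
#endif

   /* Scalar path: unaligned input, non-SSE2 builds, and any leftover tail. */
   for (; i < count; i++) {
      const float v = src[i];
      dst[i] = v < min ? min : (v > max ? max : v);
   }
}
```

A caller with n RGBA pixels would pass &rgba_src[0][0], &rgba_dst[0][0] and n * 4 as the count. Whether this actually beats the per-pixel unaligned version would need measuring.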


> + */
> +void
> +_mesa_streaming_clamp_float_rgba(const GLuint n, GLfloat rgba_src[][4],
> + GLfloat rgba_dst[][4], const GLfloat min,
> + const GLfloat max)
> +{
> +   int i;
> +
> +   for (i = 0; i < n; i++) {
> +  _mesa_clamp_float_rgba(rgba_src[i], rgba_dst[i], min, max);
> +   }
> +}
> +
> +
> +/* Clamp n amount float rgba pixels to [min,max] using SSE2 and apply
> + * scaling and mapping to components.
> + *
> + * this replace handling of [RGBA] channels:
> + * rgba_temp[RCOMP] = CLAMP(rgba[i][RCOMP], 0.0F, 1.0F);
> + * rgba[i][RCOMP] = rMap[F_TO_I(rgba_temp[RCOMP] * scale[RCOMP])];
> + */
> +void
> +_mesa_clamp_float_rgba_scale_and_map(const GLuint n, GLfloat
> rgba_src[][4],
> + GLfloat rgba_dst[][4], const GLfloat
> min,
> + const GLfloat max,
> + const GLfloat scale[4],
> + const GLfloat* rMap, const GLfloat*
> gMap,
> + const GLfloat

Re: [Mesa-dev] [PATCH v5] gallium/auxiliary: add inc and dec alternative with return (v3)

2014-11-17 Thread Patrick Baggett
On Mon, Nov 17, 2014 at 12:20 PM, Axel Davy  wrote:

> From: Christoph Bumiller 
>
> At this moment we use only zero or positive values.
>
> v2: Implement it for also for Solaris, MSVC assembly
> and enable for other combinations.
>
> v3: Replace MSVC assembly by assert + warning during compilation
>
> Signed-off-by: David Heidelberg 
> ---
>  src/gallium/auxiliary/util/u_atomic.h | 72
> +++
>  1 file changed, 72 insertions(+)
>
> diff --git a/src/gallium/auxiliary/util/u_atomic.h
> b/src/gallium/auxiliary/util/u_atomic.h
> index 2f2b42b..9279073 100644
> --- a/src/gallium/auxiliary/util/u_atomic.h
> +++ b/src/gallium/auxiliary/util/u_atomic.h
> @@ -69,6 +69,18 @@ p_atomic_dec(int32_t *v)
>  }
>
>  static INLINE int32_t
> +p_atomic_inc_return(int32_t *v)
> +{
> +   return __sync_add_and_fetch(v, 1);
> +}
> +
> +static INLINE int32_t
> +p_atomic_dec_return(int32_t *v)
> +{
> +   return __sync_sub_and_fetch(v, 1);
> +}
> +
> +static INLINE int32_t
>  p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
>  {
> return __sync_val_compare_and_swap(v, old, _new);
> @@ -116,6 +128,18 @@ p_atomic_dec(int32_t *v)
>  }
>
>  static INLINE int32_t
> +p_atomic_inc_return(int32_t *v)
> +{
> +   return __sync_add_and_fetch(v, 1);
> +}
> +
> +static INLINE int32_t
> +p_atomic_dec_return(int32_t *v)
> +{
> +   return __sync_sub_and_fetch(v, 1);
> +}
> +
> +static INLINE int32_t
>  p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
>  {
> return __sync_val_compare_and_swap(v, old, _new);
> @@ -161,6 +185,18 @@ p_atomic_dec(int32_t *v)
>  }
>
>  static INLINE int32_t
> +p_atomic_inc_return(int32_t *v)
> +{
> +   return __sync_add_and_fetch(v, 1);
> +}
> +
> +static INLINE int32_t
> +p_atomic_dec_return(int32_t *v)
> +{
> +   return __sync_sub_and_fetch(v, 1);
> +}
> +
> +static INLINE int32_t
>  p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
>  {
> return __sync_val_compare_and_swap(v, old, _new);
> @@ -186,6 +222,8 @@ p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
>  #define p_atomic_dec_zero(_v) ((boolean) --(*(_v)))
>  #define p_atomic_inc(_v) ((void) (*(_v))++)
>  #define p_atomic_dec(_v) ((void) (*(_v))--)
> +#define p_atomic_inc_return(_v) ((*(_v))++)
> +#define p_atomic_dec_return(_v) ((*(_v))--)
>  #define p_atomic_cmpxchg(_v, old, _new) (*(_v) == old ? *(_v) = (_new) :
> *(_v))
>
>  #endif
> @@ -197,6 +235,8 @@ p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
>
>  #define PIPE_ATOMIC "MSVC x86 assembly"
>
> +#include 
> +
>  #ifdef __cplusplus
>  extern "C" {
>  #endif
> @@ -236,6 +276,24 @@ p_atomic_dec(int32_t *v)
> }
>  }
>
> +#pragma message ( "Warning: p_atomic_dec_return and p_atomic_inc_return
> unimplemented for PIPE_ATOMIC_ASM_MSVC_X86" )
> +
> +static INLINE int32_t
> +p_atomic_inc_return(int32_t *v)
> +{
> +   (void) v;
> +   assert(0);
> +   return 0;
> +}
>

Why isn't _InterlockedIncrement() used here? It is used for the void
functions. If you read the definition of _InterlockedIncrement() it returns
the new value -- isn't that what is needed?
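
Roughly what I have in mind, as a sketch only. The example_* names are made up for illustration, and the non-MSVC branch is there just so the snippet builds elsewhere. Both _InterlockedIncrement() and _InterlockedDecrement() return the resulting value, which matches what __sync_add_and_fetch() and __sync_sub_and_fetch() give you on gcc.

```c
#include <stdint.h>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Hypothetical names; the real patch would use p_atomic_inc_return()
 * and p_atomic_dec_return().
 */
static inline int32_t
example_atomic_inc_return(int32_t *v)
{
#if defined(_MSC_VER)
   /* Returns the incremented value. */
   return _InterlockedIncrement((long *)v);
#else
   return __sync_add_and_fetch(v, 1);
#endif
}

static inline int32_t
example_atomic_dec_return(int32_t *v)
{
#if defined(_MSC_VER)
   /* Returns the decremented value. */
   return _InterlockedDecrement((long *)v);
#else
   return __sync_sub_and_fetch(v, 1);
#endif
}
```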


> +
> +static INLINE int32_t
> +p_atomic_dec_return(int32_t *v)
> +{
> +   (void) v;
> +   assert(0);
> +   return 0;
> +}
>

Similarly here.


> +
>  static INLINE int32_t
>  p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
>  {
> @@ -288,6 +346,12 @@ p_atomic_inc(int32_t *v)
> _InterlockedIncrement((long *)v);
>  }
>
> +static INLINE int32_t
> +p_atomic_inc_return(int32_t *v)
> +{
> +   return _InterlockedIncrement((long *)v);
> +}
> +
>  static INLINE void
>  p_atomic_dec(int32_t *v)
>  {
> @@ -295,6 +359,12 @@ p_atomic_dec(int32_t *v)
>  }
>
>  static INLINE int32_t
> +p_atomic_dec_return(int32_t *v)
> +{
> +   return _InterlockedDecrement((long *)v);
> +}
> +
> +static INLINE int32_t
>  p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
>  {
> return _InterlockedCompareExchange((long *)v, _new, old);
> @@ -329,6 +399,8 @@ p_atomic_dec_zero(int32_t *v)
>
>  #define p_atomic_inc(_v) atomic_inc_32((uint32_t *) _v)
>  #define p_atomic_dec(_v) atomic_dec_32((uint32_t *) _v)
> +#define p_atomic_inc_return(_v) atomic_inc_32_nv((uint32_t *) _v)
> +#define p_atomic_dec_return(_v) atomic_dec_32_nv((uint32_t *) _v)
>
>  #define p_atomic_cmpxchg(_v, _old, _new) \
> atomic_cas_32( (uint32_t *) _v, (uint32_t) _old, (uint32_t) _new)
> --
> 2.1.0
>


Re: [Mesa-dev] [PATCH v5] gallium/auxiliary: add inc and dec alternative with return (v3)

2014-11-17 Thread Patrick Baggett
>
>
> Looking at u_atomic.h there is a section that uses
> PIPE_ATOMIC_ASM_MSVC_X86 and has explicit assembly, and there's a
> section that uses PIPE_ATOMIC_MSVC_INTRINSIC and has intrinsics. No
> clue whatsoever what the difference between them is, but presumably it
> doesn't exist solely for the purpose of annoying developers...
>

I can't think of a good reason; I would be interested in knowing why. Last
time I checked, MSVC is terrible at optimizing around __asm{} blocks and if
I recall, only x86 (i.e. 32-bit) supports inline assembly. This is a bit
off-topic though...


>
>   -ilia
>


Re: [Mesa-dev] [PATCH 09/29] mesa: Add _mesa_swap2_copy and _mesa_swap4_copy

2014-11-19 Thread Patrick Baggett
On Tue, Nov 18, 2014 at 3:23 AM, Iago Toral Quiroga 
wrote:

> We have _mesa_swap{2,4} but these do in-place byte-swapping only. The new
> functions receive an extra parameter so we can swap bytes on a source
> input array and store the results in a (possibly different) destination
> array.
>
>
If this is being split into an "in-place" and "different pointers" version,
I think using the "restrict" keyword would be useful here to improve the
performance. Then, the in-place one cannot be implemented as copy(p,p,n),
but the code isn't overly complicated.
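
A sketch of the split I mean, assuming a C99 compiler. The example_* names are hypothetical and not the names in the patch. The copy version promises the compiler that dst and src do not overlap, so the in-place version has to be its own loop rather than copy(p, p, n).

```c
#include <stddef.h>
#include <stdint.h>

/* Copy variant: dst and src must not overlap, which is what "restrict"
 * tells the compiler.
 */
static void
example_swap2_copy(uint16_t *restrict dst, const uint16_t *restrict src,
                   size_t n)
{
   size_t i;
   for (i = 0; i < n; i++)
      dst[i] = (uint16_t)((src[i] >> 8) | (src[i] << 8));
}

/* In-place variant: a separate loop, because calling the restrict
 * version with dst == src would break its no-aliasing promise.
 */
static void
example_swap2_inplace(uint16_t *p, size_t n)
{
   size_t i;
   for (i = 0; i < n; i++)
      p[i] = (uint16_t)((p[i] >> 8) | (p[i] << 8));
}
```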



> This is useful to implement byte-swapping in pixel uploads, since in this
> case we need to swap bytes on the src data which is owned by the
> application so we can't do an in-place byte swap.
> ---
>  src/mesa/main/image.c | 25 +
>  src/mesa/main/image.h | 10 --
>  2 files changed, 25 insertions(+), 10 deletions(-)
>
> diff --git a/src/mesa/main/image.c b/src/mesa/main/image.c
> index 4ea5f04..9ad97c5 100644
> --- a/src/mesa/main/image.c
> +++ b/src/mesa/main/image.c
> @@ -41,36 +41,45 @@
>
>
>  /**
> - * Flip the order of the 2 bytes in each word in the given array.
> + * Flip the order of the 2 bytes in each word in the given array (src) and
> + * store the result in another array (dst). For in-place byte-swapping
> this
> + * function can be called with the same array for src and dst.
>   *
> - * \param p array.
> + * \param dst the array where byte-swapped data will be stored.
> + * \param src the array with the source data we want to byte-swap.
>   * \param n number of words.
>   */
>  void
> -_mesa_swap2( GLushort *p, GLuint n )
> +_mesa_swap2_copy( GLushort *dst, GLushort *src, GLuint n )
>  {
> GLuint i;
> for (i = 0; i < n; i++) {
> -  p[i] = (p[i] >> 8) | ((p[i] << 8) & 0xff00);
> +  dst[i] = (src[i] >> 8) | ((src[i] << 8) & 0xff00);
> }
>  }
>
>
>
>  /*
> - * Flip the order of the 4 bytes in each word in the given array.
> + * Flip the order of the 4 bytes in each word in the given array (src) and
> + * store the result in another array (dst). For in-place byte-swapping
> this
> + * function can be called with the same array for src and dst.
> + *
> + * \param dst the array where byte-swapped data will be stored.
> + * \param src the array with the source data we want to byte-swap.
> + * \param n number of words.
>   */
>  void
> -_mesa_swap4( GLuint *p, GLuint n )
> +_mesa_swap4_copy( GLuint *dst, GLuint *src, GLuint n )
>  {
> GLuint i, a, b;
> for (i = 0; i < n; i++) {
> -  b = p[i];
> +  b = src[i];
>a =  (b >> 24)
> | ((b >> 8) & 0xff00)
> | ((b << 8) & 0xff0000)
> | ((b << 24) & 0xff000000);
> -  p[i] = a;
> +  dst[i] = a;
> }
>  }
>
> diff --git a/src/mesa/main/image.h b/src/mesa/main/image.h
> index abd84bf..79c6e68 100644
> --- a/src/mesa/main/image.h
> +++ b/src/mesa/main/image.h
> @@ -33,10 +33,16 @@ struct gl_context;
>  struct gl_pixelstore_attrib;
>
>  extern void
> -_mesa_swap2( GLushort *p, GLuint n );
> +_mesa_swap2_copy( GLushort *dst, GLushort *src, GLuint n );
>
>  extern void
> -_mesa_swap4( GLuint *p, GLuint n );
> +_mesa_swap4_copy( GLuint *dst, GLuint *src, GLuint n );
> +
> +static inline void
> +_mesa_swap2( GLushort *p, GLuint n ) { _mesa_swap2_copy(p, p, n); }
> +
> +static inline void
> +_mesa_swap4( GLuint *p, GLuint n ) { _mesa_swap4_copy(p, p, n); }
>
>  extern GLintptr
>  _mesa_image_offset( GLuint dimensions,
> --
> 1.9.1
>


Re: [Mesa-dev] [PATCH 09/29] mesa: Add _mesa_swap2_copy and _mesa_swap4_copy

2014-11-20 Thread Patrick Baggett
>
>
>>
> The restrict keyword is a C99 thing and I don't think it's supported in
> MSVC so that would be a problem.  If it won't build with MSVC then it's a
> non-starter.  If MSVC can handle "restrict", then I don't know that I care
> much either way about 2 functions or 4
>
>
MSVC uses "__restrict" which functions identically -- but if there doesn't
already exist a #define around this "MSVC-ism", then I guess it may be more
work than Iago was really signing up for. But it does exist.
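
For what it's worth, the wrapper could look something like this. EXAMPLE_RESTRICT and example_swap4_copy() are made-up names for illustration; I have not checked whether Mesa already has an equivalent macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portability macro for the "restrict" qualifier. */
#if defined(_MSC_VER)
#  define EXAMPLE_RESTRICT __restrict
#elif defined(__GNUC__)
#  define EXAMPLE_RESTRICT __restrict__
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define EXAMPLE_RESTRICT restrict
#else
#  define EXAMPLE_RESTRICT   /* unknown compiler: expand to nothing */
#endif

/* Byte-swap n 32-bit words from src into a non-overlapping dst. */
static void
example_swap4_copy(uint32_t *EXAMPLE_RESTRICT dst,
                   const uint32_t *EXAMPLE_RESTRICT src, size_t n)
{
   size_t i;
   for (i = 0; i < n; i++) {
      const uint32_t b = src[i];
      dst[i] = (b >> 24)
             | ((b >> 8) & 0xff00u)
             | ((b << 8) & 0xff0000u)
             | (b << 24);
   }
}
```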


Re: [Mesa-dev] [PATCH 08/10] glsl: fix new gcc6 warnings

2016-02-17 Thread Patrick Baggett
On Wed, Feb 17, 2016 at 3:35 PM, Rob Clark  wrote:
> src/compiler/glsl/lower_discard_flow.cpp:79:1: warning: ‘ir_visitor_status 
> {anonymous}::lower_discard_flow_visitor::visit_enter(ir_loop_jump*)’ defined 
> but not used [-Wunused-function]
>  lower_discard_flow_visitor::visit_enter(ir_loop_jump *ir)
>  ^~
>
> The base class method that was intended to be overridden was
> 'visit(ir_loop_jump *ir)', not visit_enter().
>
Has there been a discussion about using the "override" keyword
(C++11)? It sounds like it could catch bugs like this, and if hidden
behind a #define, act as a no-op when C++11 is not supported. Although
obviously the new gcc6 warning is effectively doing much the same
thing...


> Signed-off-by: Rob Clark 
> ---
>  src/compiler/glsl/lower_discard_flow.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/glsl/lower_discard_flow.cpp 
> b/src/compiler/glsl/lower_discard_flow.cpp
> index 9d0a56b..9e3a7c0 100644
> --- a/src/compiler/glsl/lower_discard_flow.cpp
> +++ b/src/compiler/glsl/lower_discard_flow.cpp
> @@ -62,8 +62,8 @@ public:
> {
> }
>
> +   ir_visitor_status visit(ir_loop_jump *ir);
> ir_visitor_status visit_enter(ir_discard *ir);
> -   ir_visitor_status visit_enter(ir_loop_jump *ir);
> ir_visitor_status visit_enter(ir_loop *ir);
> ir_visitor_status visit_enter(ir_function_signature *ir);
>
> @@ -76,7 +76,7 @@ public:
>  } /* anonymous namespace */
>
>  ir_visitor_status
> -lower_discard_flow_visitor::visit_enter(ir_loop_jump *ir)
> +lower_discard_flow_visitor::visit(ir_loop_jump *ir)
>  {
> if (ir->mode != ir_loop_jump::jump_continue)
>return visit_continue;
> --
> 2.5.0
>


Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-10 Thread Patrick Baggett
On Thu, Mar 10, 2016 at 12:25 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Sandy Bridge / Ivy Bridge / Haswell
> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
> instructions in affected programs: 564 -> 558 (-1.06%)
> helped: 6
> HURT: 0
>
> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
> cycles in affected programs: 9768 -> 9582 (-1.90%)
> helped: 12
> HURT: 0
>
> Broadwell / Skylake
> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
> instructions in affected programs: 626 -> 619 (-1.12%)
> helped: 7
> HURT: 0
>
> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
> cycles in affected programs: 9378 -> 9192 (-1.98%)
> helped: 12
> HURT: 0
>
> G45 and Ironlake showed no change.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
> b/src/compiler/nir/nir_opt_algebraic.py
> index 4db8f84..1442ce8 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -108,6 +108,11 @@ optimizations = [
> # inot(a)
> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>
> +   # 0.0 < fabs(a)
> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
I think this is wrong. Because >= 0.0 can mean that fabs(a) == 0.0 for
some a, you can't say then fabs(a) != 0.0.

Then, the counter-example is when a = 0.0

1) 0.0 != fabs(0.0)
2) 0.0 != 0.0

> +   # 0.0 != a




> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
> +
> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
> (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
> (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
> --
> 2.5.0
>


Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-10 Thread Patrick Baggett
On Thu, Mar 10, 2016 at 3:08 PM, Patrick Baggett
 wrote:
> On Thu, Mar 10, 2016 at 12:25 PM, Ian Romanick  wrote:
>> From: Ian Romanick 
>>
>> Sandy Bridge / Ivy Bridge / Haswell
>> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
>> instructions in affected programs: 564 -> 558 (-1.06%)
>> helped: 6
>> HURT: 0
>>
>> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
>> cycles in affected programs: 9768 -> 9582 (-1.90%)
>> helped: 12
>> HURT: 0
>>
>> Broadwell / Skylake
>> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
>> instructions in affected programs: 626 -> 619 (-1.12%)
>> helped: 7
>> HURT: 0
>>
>> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
>> cycles in affected programs: 9378 -> 9192 (-1.98%)
>> helped: 12
>> HURT: 0
>>
>> G45 and Ironlake showed no change.
>>
>> Signed-off-by: Ian Romanick 
>> ---
>>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
>> b/src/compiler/nir/nir_opt_algebraic.py
>> index 4db8f84..1442ce8 100644
>> --- a/src/compiler/nir/nir_opt_algebraic.py
>> +++ b/src/compiler/nir/nir_opt_algebraic.py
>> @@ -108,6 +108,11 @@ optimizations = [
>> # inot(a)
>> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>>
>> +   # 0.0 < fabs(a)
>> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
> I think this is wrong. Because >= 0.0 can mean that fabs(a) == 0.0 for
> some a, you can't say then fabs(a) != 0.0.
>
> Then, the counter-example is when a = 0.0
>
> 1) 0.0 != fabs(0.0)
> 2) 0.0 != 0.0
>
Rather, I mean the comment is wrong, but the conclusion that:
0 < fabs(a) <-> a != 0.0
is correct. You can just build a truth table or just observe that when
a == 0, 0 < 0 is false, and
when a != 0.0, fabs(a) will be > 0, so 0 < fabs(a) will be always true.
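
The "truth table" can be spot-checked in plain C along these lines. This is only a sketch over a handful of sample values, with made-up example_* names, and it tests C's float semantics rather than NIR's. One caveat I am not addressing above: with IEEE 754 semantics in C, a NaN input makes 0 < fabs(a) false while a != 0 is true, so NaN is deliberately left out of the samples.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Left side of the rewrite rule: 0.0 < fabs(a). */
static bool
example_lhs(float a)
{
   return 0.0f < fabsf(a);
}

/* Right side of the rewrite rule: a != 0.0. */
static bool
example_rhs(float a)
{
   return a != 0.0f;
}

/* Returns true if both sides agree for every non-NaN sample value. */
static bool
example_rule_holds_for_samples(void)
{
   const float samples[] = {
      0.0f, -0.0f, 1.0f, -1.0f, 1e-38f, -3.5f, INFINITY, -INFINITY
   };
   size_t i;

   for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
      if (example_lhs(samples[i]) != example_rhs(samples[i]))
         return false;
   }
   return true;
}
```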



>> +   # 0.0 != a
>
>
>
>
>> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
>> +
>> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>> (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
>> (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>> --
>> 2.5.0
>>


Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-11 Thread Patrick Baggett
On Fri, Mar 11, 2016 at 10:21 AM, Ian Romanick  wrote:
> On 03/10/2016 01:24 PM, Patrick Baggett wrote:
>> On Thu, Mar 10, 2016 at 3:08 PM, Patrick Baggett
>>  wrote:
>>> On Thu, Mar 10, 2016 at 12:25 PM, Ian Romanick  wrote:
>>>> From: Ian Romanick 
>>>>
>>>> Sandy Bridge / Ivy Bridge / Haswell
>>>> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
>>>> instructions in affected programs: 564 -> 558 (-1.06%)
>>>> helped: 6
>>>> HURT: 0
>>>>
>>>> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
>>>> cycles in affected programs: 9768 -> 9582 (-1.90%)
>>>> helped: 12
>>>> HURT: 0
>>>>
>>>> Broadwell / Skylake
>>>> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
>>>> instructions in affected programs: 626 -> 619 (-1.12%)
>>>> helped: 7
>>>> HURT: 0
>>>>
>>>> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
>>>> cycles in affected programs: 9378 -> 9192 (-1.98%)
>>>> helped: 12
>>>> HURT: 0
>>>>
>>>> G45 and Ironlake showed no change.
>>>>
>>>> Signed-off-by: Ian Romanick 
>>>> ---
>>>>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>>>>  1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
>>>> b/src/compiler/nir/nir_opt_algebraic.py
>>>> index 4db8f84..1442ce8 100644
>>>> --- a/src/compiler/nir/nir_opt_algebraic.py
>>>> +++ b/src/compiler/nir/nir_opt_algebraic.py
>>>> @@ -108,6 +108,11 @@ optimizations = [
>>>> # inot(a)
>>>> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>>>>
>>>> +   # 0.0 < fabs(a)
>>>> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
>>> I think this is wrong. Because >= 0.0 can mean that fabs(a) == 0.0 for
>>> some a, you can't say then fabs(a) != 0.0.
>>>
>>> Then, the counter-example is when a = 0.0
>>>
>>> 1) 0.0 != fabs(0.0)
>>> 2) 0.0 != 0.0
>>>
>> Rather, I mean the comment is wrong, but the conclusion that:
>> 0 < fabs(a) <-> a != 0.0
>> is correct. You can just build a truth table or just observe that when
>> a == 0, 0 < 0 is false, and
>> when a != 0.0, fabs(a) will be > 0, so 0 < fabs(a) will be always true.
>
> How about if I change it to
>
># 0.0 != fabs(a)  Since fabs(a) >= 0, 0 <= fabs(a) must be true
>
> I think it's trivial to see how to get from "0 < fabs(a)" to "0 !=
> fabs(a)" based on that.
Yeah, I think what gave me pause when I read it was "0.0 != fabs(a)",
because that's not a general mathematical truth unless qualified by "a
!= 0.0". I don't have any particularly strong feelings about the
wording. I personally didn't reason about it using (in)equalities at
all. My logic was mostly based on domain analysis of the expression:
let p(a) := 0 < fabs(a)
p(0) <-> false
p(a) <-> true, for any other value of a
therefore p(a) <-> true when a != 0.0
therefore p(a) <-> a != 0

It's up to you.

>
>>>> +   # 0.0 != a
>>>> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
>>>> +
>>>> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>>>> (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
>>>> (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>>>> --
>>>> 2.5.0
>>>>


Re: [Mesa-dev] [PATCH] nir: Use double-precision pow() when bit_size is 64, powf() otherwise

2016-03-28 Thread Patrick Baggett
> What are the rules in C when you compare a double
> variable with a single constant?
>
> void foo(double d)
> {
> /* Does d get converted to single, or does 0.0f get converted to
>  * double?
>  */
> if (d == 0.0f)
> printf("zero\n");
> }

The 0.0f is converted to a double. One site [1] has a likely looking
reference. :) Sadly, I don't know how to check the C spec directly (I
think that it is not free).

[1] https://www.eskimo.com/~scs/cclass/int/sx4cb.html


Re: [Mesa-dev] [PATCH] nir: Use double-precision pow() when bit_size is 64, powf() otherwise

2016-03-28 Thread Patrick Baggett
On Mon, Mar 28, 2016 at 1:58 PM, Patrick Baggett
 wrote:
>> What are the rules in C when you compare a double
>> variable with a single constant?
>>
>> void foo(double d)
>> {
>> /* Does d get converted to single, or does 0.0f get converted to
>>  * double?
>>  */
>> if (d == 0.0f)
>> printf("zero\n");
>> }
>
> The 0.0f is converted to a double. One site [1] has a likely looking
> reference. :) Sadly, I don't know how to check the C spec directly (I
> think that it is not free).
>
> [1] https://www.eskimo.com/~scs/cclass/int/sx4cb.html

Never mind, the spec is available... found the link via Wikipedia.

6.3.1.8 Usual arithmetic conversions, paragraph 1:

Otherwise, if the corresponding real type of either operand is double,
the other operand is converted, without change of type domain, to a
type whose corresponding real type is double.

So yes, 100% sure that it is promoted to a double.
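
If anyone wants to see it rather than read it, here is a small sketch (C11 for _Generic; all example_* names are made up). It uses 2^24 + 1, which is exactly representable as a double but not as a float, so the result of the comparison depends on which operand gets converted.

```c
#include <stdbool.h>

/* Evaluates to 1 if the expression has type double, 0 otherwise. */
#define EXAMPLE_IS_DOUBLE(expr) _Generic((expr), double: 1, default: 0)

/* The float operand is converted to double, so this compares
 * 16777217.0 with 16777216.0 and the values differ.
 */
static bool
example_compare_mixed(void)
{
   const double d = 16777217.0;   /* 2^24 + 1 */
   const float f = 16777216.0f;   /* 2^24 */
   return d == f;
}

/* For contrast: if d were narrowed to float first, it would round to
 * 2^24 and the comparison would succeed.
 */
static bool
example_compare_narrowed(void)
{
   const double d = 16777217.0;
   const float f = 16777216.0f;
   return (float)d == f;
}
```

The first function returning false while the second returns true is what you would expect if the float is the operand being converted.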


Re: [Mesa-dev] Discussion: C++11 std::future in Mesa

2016-06-01 Thread Patrick Baggett
>
>
> No. Shader compilation can only be asynchronous if it's far enough
> from a draw call and the app doesn't query its status. If it's next to
> a draw call, multithreading is useless. Completely useless.
>

I don't know a lot about the shader compilation/linking process, so
I'm just asking this for my own benefit.

I read that the optimizations take a long time. Is it possible to
create a sort of -O0 version of the shader while the real version is
generated by some thread pool? Or would there be some shaders that
would just fail to run unless optimization took place (and the
developers count on that)?

> We need to get below 33 ms for all shaders needed to be compiled to
> render a frame. If there are 10 VS and 10 PS, one shader must be
> compiled within 1.65 ms on average. I don't see where your random
> guess meets that goal.
>
> Marek


Re: [Mesa-dev] Patchwork review process (efficiency) questions

2016-06-03 Thread Patrick Baggett
> I will point out a couple notes/observations:
>
> Kernel (drm/dri-devel), xorg, and other related projects use the same
> process, and a lot of us do (or at least at some point have) been
> active in 2 or more of these.
>
> Also, I have seen/used some other processes (gerrit, github pulls,
> etc).. and IMO on those projects the review process ended up being a
> lot more rubber-stamping and less thorough review of the changes.
> There is some value in not making things too "push-button"..

What are people's opinions on patchwork? I'm a regular reader but not
a contributor. I find the interface appealing and overall not too
difficult to see recently submitted patches. Is it slower
(workflow-wise)/less convenient to use than email? Or are there
certain use-cases that just don't work?

-- Patrick


Re: [Mesa-dev] [PATCH] mesa: Make TexSubImage check negative dimensions sooner.

2016-06-08 Thread Patrick Baggett
Sorry, didn't CC mesa-dev, trying again...

On Wed, Jun 8, 2016 at 4:11 PM, Kenneth Graunke  wrote:
> Two dEQP tests expect INVALID_VALUE errors for negative width/height
> parameters, but get INVALID_OPERATION because they haven't actually
> created a destination image.  This is arguably not a bug in Mesa, as
> there's no specified ordering of error conditions.
>
> However, it's also really easy to make the tests pass, and there's
> no real harm in doing these checks earlier.
>
> Fixes:
> dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_width_height
> dEQP-GLES31.functional.debug.negative_coverage.get_error.texture.texsubimage3d_neg_width_height
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/main/teximage.c | 68 
> ++--
>  1 file changed, 49 insertions(+), 19 deletions(-)
>
> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> index 58b7f27..d4f8278 100644
> --- a/src/mesa/main/teximage.c
> +++ b/src/mesa/main/teximage.c
> @@ -1102,6 +1102,32 @@ _mesa_legal_texture_dimensions(struct gl_context *ctx, 
> GLenum target,
> }
>  }
>
> +static bool
> +error_check_subtexture_negative_dimensions(struct gl_context *ctx,
> +   GLuint dims,
> +   GLsizei subWidth,
> +   GLsizei subHeight,
> +   GLsizei subDepth,
> +   const char *func)
> +{
> +   /* Check size */
> +   if (subWidth < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(width=%d)", func, subWidth);
> +  return true;
> +   }
> +
> +   if (dims > 1 && subHeight < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(height=%d)", func, subHeight);
> +  return true;
> +   }
> +
> +   if (dims > 2 && subDepth < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(depth=%d)", func, subDepth);
> +  return true;
> +   }
> +

What do you think of a structure like:

switch(dims) {
case 3:
   if(subDepth < 0) {
      ...
   }
   /* fall through */
case 2:
   if(subHeight < 0) {
      ...
   }
   /* fall through */
default:
   if(subWidth < 0) {
      ...
   }
}
return false; /* no negative dimension found */

I think this would reduce the overall number of expressions to check.
If you just want to check whether any are < 0, you can OR the sign
bits:


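unsigned result = 0; /* unsigned, so the sign-bit masks are well defined */
switch(dims) {
case 3: result |= (unsigned)subDepth & (1u << 31); /* fall through */
case 2: result |= (unsigned)subHeight & (1u << 31); /* fall through */
default: result |= (unsigned)subWidth & (1u << 31);
}
return result != 0;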

...then later call that function to generate a more detailed error
message about specifically which dimension was negative.
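
Put together, the two-step idea could look like the sketch below. The example_* names are hypothetical, and the code assumes the dimensions are 32-bit signed integers. The fast check only answers "is anything negative?", and the slower function is called afterwards to name the offending dimension.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fast path: OR the relevant dimensions together and test the sign bit. */
static bool
example_any_negative(unsigned dims, int32_t w, int32_t h, int32_t d)
{
   uint32_t bits = 0;

   switch (dims) {
   case 3:
      bits |= (uint32_t)d;
      /* fall through */
   case 2:
      bits |= (uint32_t)h;
      /* fall through */
   default:
      bits |= (uint32_t)w;
   }
   return (bits >> 31) != 0;
}

/* Slow path: name the first negative dimension, or NULL if there is none. */
static const char *
example_first_negative(unsigned dims, int32_t w, int32_t h, int32_t d)
{
   if (!example_any_negative(dims, w, h, d))
      return NULL;
   if (w < 0)
      return "width";
   if (dims > 1 && h < 0)
      return "height";
   return "depth";
}
```

Whether this is measurably faster than three separate compares is not something I have tested.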

> +   return false;
> +}
>
>  /**
>   * Do error checking of xoffset, yoffset, zoffset, width, height and depth
> @@ -1119,25 +1145,6 @@ error_check_subtexture_dimensions(struct gl_context 
> *ctx, GLuint dims,
> const GLenum target = destImage->TexObject->Target;
> GLuint bw, bh, bd;
>
> -   /* Check size */
> -   if (subWidth < 0) {
> -  _mesa_error(ctx, GL_INVALID_VALUE,
> -  "%s(width=%d)", func, subWidth);
> -  return GL_TRUE;
> -   }
> -
> -   if (dims > 1 && subHeight < 0) {
> -  _mesa_error(ctx, GL_INVALID_VALUE,
> -  "%s(height=%d)", func, subHeight);
> -  return GL_TRUE;
> -   }
> -
> -   if (dims > 2 && subDepth < 0) {
> -  _mesa_error(ctx, GL_INVALID_VALUE,
> -  "%s(depth=%d)", func, subDepth);
> -  return GL_TRUE;
> -   }
> -
> /* check xoffset and width */
> if (xoffset < - (GLint) destImage->Border) {
>_mesa_error(ctx, GL_INVALID_VALUE, "%s(xoffset)", func);
> @@ -2104,6 +2111,12 @@ texsubimage_error_check(struct gl_context *ctx, GLuint 
> dimensions,
>return GL_TRUE;
> }
>
> +   if (error_check_subtexture_negative_dimensions(ctx, dimensions,
> +  width, height, depth,
> +  callerName)) {
> +  return GL_TRUE;
> +   }
> +
> texImage = _mesa_select_tex_image(texObj, target, level);
> if (!texImage) {
>/* non-existant texture level */
> @@ -2140,6 +2153,12 @@ texsubimage_error_check(struct gl_context *ctx, GLuint 
> dimensions,
>return GL_TRUE;
> }
>
> +   if (error_check_subtexture_negative_dimensions(ctx, dimensions,
> +  width, height, depth,
> +  callerName)) {
> +  return GL_TRUE;
> +   }
> +
> if (error_check_subtexture_dimensions(ctx, dimensions,
>   texImage, xoffset, yoffset, zoffset,
>   width, height, depth, callerName)) {
> @@ -2497,6 +2516,11 @@ copytexsubimage_error_check(struct gl_context *ctx, 
> GLuint dimensions,
>return GL_TRUE;
>

Re: [Mesa-dev] [PATCH 11/11] glsl: Optimize X / X == 1

2014-08-07 Thread Patrick Baggett
Would this conform to the GLSL spec if X had a runtime value of 0? It seems
unsafe to replace X / X with 1 without a runtime test... maybe the GLSL spec
allows such optimizations.
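
To illustrate the concern in C terms only (this says nothing about what GLSL requires): assuming IEEE 754 floats and no fast-math flags, 0.0f / 0.0f is NaN, not 1. The example_* names below are made up, and the volatile is there just to keep the compiler from folding the division.

```c
#include <stdbool.h>

/* Divide a value by itself at runtime. */
static float
example_self_divide(float x)
{
   volatile float v = x;   /* volatile: prevent compile-time folding */
   return v / v;
}

/* NaN is the only float value that compares unequal to itself. */
static bool
example_is_nan(float r)
{
   return r != r;
}
```

So example_self_divide(2.0f) gives 1.0f, while example_self_divide(0.0f) gives NaN under those assumptions.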


On Thu, Aug 7, 2014 at 3:51 PM,  wrote:

> From: Thomas Helland 
>
> Shows no changes for shader-db.
>
> Signed-off-by: Thomas Helland 
> ---
>  src/glsl/opt_algebraic.cpp | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
> index 21bf332..a49752d 100644
> --- a/src/glsl/opt_algebraic.cpp
> +++ b/src/glsl/opt_algebraic.cpp
> @@ -513,6 +513,8 @@ ir_algebraic_visitor::handle_expression(ir_expression
> *ir)
>}
>if (is_vec_one(op_const[1]))
>  return ir->operands[0];
> +  if(ir->operands[0]->equals(ir->operands[1]))
> + return new(mem_ctx) ir_constant(1.0f, 1);
>break;
>
> case ir_binop_dot:
> --
> 2.0.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] use of likey() / unlikely() macros

2013-01-17 Thread Patrick Baggett
On Thu, Jan 17, 2013 at 10:37 AM, Brian Paul  wrote:

>
> In compiler.h we define the likely(), unlikely() macros which wrap GCC's
> __builtin_expect().  But we only use them in a handful of places.
>
> It seems to me that an obvious place to possibly use these would be for GL
> error testing.  For example, in glDrawArrays():
>
>if (unlikely(count <= 0)) {
>   _mesa_error();
>}
>
> Plus, in some of the glBegin/End per-vertex calls such as
> glVertexAttrib3fARB() where we error test the index parameter.
>
> I guess the key question is how much might we gain from this.  I don't
> really have a good feel for the value at this level.  In a tight inner
> loop, sure, but the GL error checking is pretty high-level code.
>
>
This is basically a micro-optimization, to be honest. Not that
micro-optimization is "bad", but while it should "improve" performance, it
would take a lot for that to show up on profiles. In the case of error
checking at the start of a function, you might be lucky to save a few
cycles -- virtually unnoticeable.


> I haven't found much on the web about performance gains from
> __builtin_expect().  Anyone?
>
>
I read a few hearsay posts, but this one comes with actual numbers:

http://blog.man7.org/2012/10/how-much-do-builtinexpect-likely-and.html

Long story short: if you're wrong, slower; if you're right, marginal
improvement.

Its use is for changing the ordering of jumps from gcc's default, which
assumes linear execution. For example, code like this:
---
if(A == NULL) //not likely
   return ERR_NULL;

if(B >= MAX) //not likely
   return ERR_MAX;

if(C < MIN) //not likely
   return ERR_MIN;

doStuff();
---

generates jumps around the return statements, so in the normal case, you're
making a jump, which can mean you have a delay and possibly refetch
instructions. If you didn't jump, then the CPU would have the "then" part
already loaded in the icache. The "optimal" ordering then is:

if(A != NULL) {
   if(B < MAX) {
      if(C >= MIN) {
         doStuff();
      }
      else return ERR_MIN;
   }
   else return ERR_MAX;
}
else return ERR_NULL;

---
In the common case then, the code does not branch, but executes a linear
stream of instructions. On modern x86 CPUs, this matters very little,
except for maybe a few in-order CPUs (maybe Intel Atom?). You're probably a
lot more likely to get some improvements from non-x86 where branch
prediction is weaker or unavailable and/or the CPU is in-order. ARM and
older SPARC CPUs come to mind. Also, some architectures allow you to encode
a branch prediction hint inside of the branch itself, e.g. IA64's
"br.call.sptk.many" Branch / Call / Static Predict Taken / Many Times,
which gcc can take advantage of. Still overall, this is well within the
realm of micro-optimization.
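
For reference, a minimal sketch of how the hints are usually spelled with gcc. The macro definitions follow the common __builtin_expect() idiom; checked_call(), the error codes and the MIN_VAL/MAX_VAL limits are hypothetical names for illustration only, not anything from Mesa.

```c
#include <stddef.h>

#if defined(__GNUC__)
#  define likely(x)   __builtin_expect(!!(x), 1)
#  define unlikely(x) __builtin_expect(!!(x), 0)
#else
#  define likely(x)   (x)   /* no hint available: plain condition */
#  define unlikely(x) (x)
#endif

/* Hypothetical error codes and limits, for illustration only. */
enum { OK = 0, ERR_NULL = -1, ERR_MAX = -2, ERR_MIN = -3 };
#define MIN_VAL 0
#define MAX_VAL 100

/* The source keeps the readable "early return" form; the hints let the
 * compiler lay out the error paths away from the straight-line path. */
static int
checked_call(const void *a, int b, int c)
{
   if (unlikely(a == NULL))
      return ERR_NULL;
   if (unlikely(b >= MAX_VAL))
      return ERR_MAX;
   if (unlikely(c < MIN_VAL))
      return ERR_MIN;
   return OK;   /* stands in for doStuff() */
}
```

The point of the macros is that you don't have to hand-nest the conditions as in the second listing; the behaviour is identical with or without the hints, only the generated branch layout may change.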

Patrick


Re: [Mesa-dev] [PATCH] mesa: re-implement unpacking of DEPTH_COMPONENT32F

2011-11-22 Thread Patrick Baggett
On Tue, Nov 22, 2011 at 2:07 PM, Marek Olšák  wrote:

> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43122
> ---
>  src/mesa/main/format_unpack.c |   10 ++
>  1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/src/mesa/main/format_unpack.c b/src/mesa/main/format_unpack.c
> index 6e2ce7a..52f224a 100644
> --- a/src/mesa/main/format_unpack.c
> +++ b/src/mesa/main/format_unpack.c
> @@ -1751,6 +1751,13 @@ unpack_float_z_Z32(GLuint n, const void *src,
> GLfloat *dst)
>  }
>
>  static void
> +unpack_float_z_Z32F(GLuint n, const void *src, GLfloat *dst)
> +{
> +   const GLfloat *s = ((const GLfloat *) src);
> +   memcpy(dst, s, n * sizeof(float));
> +}
>

Why bother typecasting here in a separate variable 's'?
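
Something like the following is what I had in mind -- a sketch only, with the GL typedefs replaced by local stand-ins so it compiles on its own. memcpy() takes const void * already, so the intermediate pointer isn't needed:

```c
#include <string.h>

/* Local stand-ins for the GL typedefs so the sketch is self-contained. */
typedef unsigned int GLuint;
typedef float GLfloat;

static void
unpack_float_z_Z32F(GLuint n, const void *src, GLfloat *dst)
{
   /* src is already a const void *, which is what memcpy() expects */
   memcpy(dst, src, n * sizeof(GLfloat));
}
```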



> +
> +static void
>  unpack_float_z_Z32X24S8(GLuint n, const void *src, GLfloat *dst)
>  {
>const GLfloat *s = ((const GLfloat *) src);
> @@ -1783,6 +1790,9 @@ _mesa_unpack_float_z_row(gl_format format, GLuint n,
>case MESA_FORMAT_Z32:
>   unpack = unpack_float_z_Z32;
>   break;
> +   case MESA_FORMAT_Z32_FLOAT:
> +  unpack = unpack_float_z_Z32F;
> +  break;
>case MESA_FORMAT_Z32_FLOAT_X24S8:
>   unpack = unpack_float_z_Z32X24S8;
>   break;
> --
> 1.7.5.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH] gallium/i965g: hide that utterly broken driver better

2011-11-28 Thread Patrick Baggett
On Mon, Nov 28, 2011 at 3:32 PM, Daniel Vetter wrote:

> And warn loudly in case people want to use it. Too many tester report
> gpu hangs on irc and we rootcause this ...
>
> Signed-Off-by: Daniel Vetter 
> ---
>  configure.ac |9 -
>  1 files changed, 8 insertions(+), 1 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 8885a6d..4dee3ad 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -658,7 +658,7 @@ GALLIUM_DRIVERS_DEFAULT="r300,r600,swrast"
>  AC_ARG_WITH([gallium-drivers],
> [AS_HELP_STRING([--with-gallium-drivers@<:@=DIRS...@:>@],
> [comma delimited Gallium drivers list, e.g.
> -"i915,i965,nouveau,r300,r600,svga,swrast"
> +"i915,nouveau,r300,r600,svga,swrast"
> @<:@default=r300,r600,swrast@:>@])],
> [with_gallium_drivers="$withval"],
> [with_gallium_drivers="$GALLIUM_DRIVERS_DEFAULT"])
> @@ -2007,10 +2007,17 @@ if echo "$SRC_DIRS" | grep 'gallium' >/dev/null
> 2>&1; then
> echo "Winsys dirs: $GALLIUM_WINSYS_DIRS"
> echo "Driver dirs: $GALLIUM_DRIVERS_DIRS"
> echo "Trackers dirs:   $GALLIUM_STATE_TRACKERS_DIRS"
> +   if echo "$GALLIUM_DRIVERS_DIRS" | grep i965 > /dev/null 2>&1; then
> +  echo
> +  echo "WARNING: enabling i965 gallium driver"
> +  echo "the i965g driver is currently utterly broken, only
> for adventurours developers"
>

I think the word is "adventurous".


> +  echo
> +   fi
>  else
> echo "Gallium: no"
>  fi
>
> +
>  dnl Libraries
>  echo ""
>  echo "Shared libs: $enable_shared"
> --
> 1.7.7.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] Mesa (master): st/xorg: fix build without LLVM

2011-10-13 Thread Patrick Baggett
Well, the trivial answer is that Win32 uses some C/C++ runtime provided by
Microsoft, usually something like MSVCR90.DLL (v9.0) etc. Solaris uses
libC.so, for example. As far as I know, only systems where the GNU C/C++
compiler is the main system compiler (and generally therefore the GNU C++
runtime) use anything named "libstdc++". So I'd expect Free/Net/OpenBSD +
Linux to use that naming and probably not much else. On other commercial
UNIXes, if it does exist, it is just for compatibility with C++ programs
compiled using g++.

Patrick

2011/10/13 Marcin Slusarz 

> On Thu, Oct 13, 2011 at 07:54:32PM +0200, Michel Dänzer wrote:
> > On Don, 2011-10-13 at 10:03 -0700, Marcin Slusarz wrote:
> > > Module: Mesa
> > > Branch: master
> > > Commit: 349e4db99e938f8ee8826b0d27e490c66a1e8356
> > > URL:
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=349e4db99e938f8ee8826b0d27e490c66a1e8356
> > >
> > > Author: Marcin Slusarz 
> > > Date:   Thu Oct 13 18:44:40 2011 +0200
> > >
> > > st/xorg: fix build without LLVM
> > >
> > > ---
> > >
> > >  src/gallium/targets/Makefile.xorg |2 ++
> > >  1 files changed, 2 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/src/gallium/targets/Makefile.xorg
> b/src/gallium/targets/Makefile.xorg
> > > index 9269375..c96eded 100644
> > > --- a/src/gallium/targets/Makefile.xorg
> > > +++ b/src/gallium/targets/Makefile.xorg
> > > @@ -33,6 +33,8 @@ LD = $(CXX)
> > >  LDFLAGS += $(LLVM_LDFLAGS)
> > >  USE_CXX=1
> > >  DRIVER_LINKS += $(LLVM_LIBS) -lm -ldl
> > > +else
> > > +LDFLAGS += -lstdc++
> > >  endif
> >
> > This is wrong. Use g++ for linking libstdc++, gcc [...] -lstdc++ doesn't
> > work everywhere.
>
> It wasn't my invention - I mimicked other targets (with partial exception
> of dri).
> Why gcc -lstdc++ doesn't work everywhere?
>
> ---
> From: Marcin Slusarz 
> Subject: [PATCH] gallium/targets: use g++ for linking
>
> As pointed by Michel Dänzer, gcc -lstdc++ "doesn't work everywhere",
> because ...
> Use g++ for linking and remove redundant LDFLAGS += -lstdc++.
> ---
>  src/gallium/targets/Makefile.dri   |2 --
>  src/gallium/targets/Makefile.va|4 +---
>  src/gallium/targets/Makefile.vdpau |4 +---
>  src/gallium/targets/Makefile.xorg  |5 +
>  src/gallium/targets/Makefile.xvmc  |4 +---
>  5 files changed, 4 insertions(+), 15 deletions(-)
>
> diff --git a/src/gallium/targets/Makefile.dri
> b/src/gallium/targets/Makefile.dri
> index 857ebfe..a26b3ee 100644
> --- a/src/gallium/targets/Makefile.dri
> +++ b/src/gallium/targets/Makefile.dri
> @@ -4,8 +4,6 @@
>  ifeq ($(MESA_LLVM),1)
>  LDFLAGS += $(LLVM_LDFLAGS)
>  DRIVER_EXTRAS = $(LLVM_LIBS)
> -else
> -LDFLAGS += -lstdc++
>  endif
>
>  MESA_MODULES = \
> diff --git a/src/gallium/targets/Makefile.va
> b/src/gallium/targets/Makefile.va
> index 7ced430..b6ee595 100644
> --- a/src/gallium/targets/Makefile.va
> +++ b/src/gallium/targets/Makefile.va
> @@ -17,8 +17,6 @@ STATE_TRACKER_LIB =
> $(TOP)/src/gallium/state_trackers/va/libvatracker.a
>  ifeq ($(MESA_LLVM),1)
>  LDFLAGS += $(LLVM_LDFLAGS)
>  DRIVER_EXTRAS = $(LLVM_LIBS)
> -else
> -LDFLAGS += -lstdc++
>  endif
>
>  # XXX: Hack, VA public funcs aren't exported
> @@ -39,7 +37,7 @@ OBJECTS = $(C_SOURCES:.c=.o) \
>  default: depend symlinks $(TOP)/$(LIB_DIR)/gallium/$(LIBNAME)
>
>  $(TOP)/$(LIB_DIR)/gallium/$(LIBNAME): $(OBJECTS) $(PIPE_DRIVERS)
> $(STATE_TRACKER_LIB) $(TOP)/$(LIB_DIR)/gallium Makefile
> -   $(MKLIB) -o $(LIBBASENAME) -linker '$(CC)' -ldflags '$(LDFLAGS)' \
> +   $(MKLIB) -o $(LIBBASENAME) -linker '$(CXX)' -ldflags '$(LDFLAGS)' \
>-major $(VA_MAJOR) -minor $(VA_MINOR) $(MKLIB_OPTIONS) \
>-install $(TOP)/$(LIB_DIR)/gallium \
>$(OBJECTS) $(STATE_TRACKER_LIB) $(PIPE_DRIVERS) $(LIBS)
> $(DRIVER_EXTRAS)
> diff --git a/src/gallium/targets/Makefile.vdpau
> b/src/gallium/targets/Makefile.vdpau
> index c634915..f6b89ad 100644
> --- a/src/gallium/targets/Makefile.vdpau
> +++ b/src/gallium/targets/Makefile.vdpau
> @@ -17,8 +17,6 @@ STATE_TRACKER_LIB =
> $(TOP)/src/gallium/state_trackers/vdpau/libvdpautracker.a
>  ifeq ($(MESA_LLVM),1)
>  LDFLAGS += $(LLVM_LDFLAGS)
>  DRIVER_EXTRAS = $(LLVM_LIBS)
> -else
> -LDFLAGS += -lstdc++
>  endif
>
>  # XXX: Hack, VDPAU public funcs aren't exported if we link to
> libvdpautracker.a :(
> @@ -39,7 +37,7 @@ OBJECTS = $(C_SOURCES:.c=.o) \
>  default: depend symlinks $(TOP)/$(LIB_DIR)/gallium/$(LIBNAME)
>
>  $(TOP)/$(LIB_DIR)/gallium/$(LIBNAME): $(OBJECTS) $(PIPE_DRIVERS)
> $(STATE_TRACKER_LIB) $(TOP)/$(LIB_DIR)/gallium Makefile
> -   $(MKLIB) -o $(LIBBASENAME) -linker '$(CC)' -ldflags '$(LDFLAGS)' \
> +   $(MKLIB) -o $(LIBBASENAME) -linker '$(CXX)' -ldflags '$(LDFLAGS)' \
>-major $(VDPAU_MAJOR) -minor $(VDPAU_MINOR) $(MKLIB_OPTIONS)
> \
>-install $(TOP)/$(LIB_DIR)/gallium \
>$(OBJECTS) $(STATE_TRACKER_LIB) $(PIPE_DRIVERS) $(LIBS)
> $(DRIVER_EXTRAS)
> diff --git a/src/gallium/targets/Makefile.xorg
> b/src/gallium

Re: [Mesa-dev] [PATCH] mesa: loosen small matrix determinant check

2012-07-30 Thread Patrick Baggett
On Mon, Jul 30, 2012 at 4:31 AM, Pekka Paalanen  wrote:

> On Tue, 24 Jul 2012 11:31:59 -0600
> Brian Paul  wrote:
>
> > When computing a matrix inverse, if the determinant is too small we
> could hit
> > a divide by zero.  There's a check to prevent this (we basically give up
> on
> > computing the inverse and return the identity matrix.)  This patches
> loosens
> > this test to fix a lighting bug reported by Lars Henning Wendt.
> >
> > NOTE: This is a candidate for the 8.0 branch.
> > ---
> >  src/mesa/math/m_matrix.c |2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c
> > index 02aedba..ef377ee 100644
> > --- a/src/mesa/math/m_matrix.c
> > +++ b/src/mesa/math/m_matrix.c
> > @@ -513,7 +513,7 @@ static GLboolean invert_matrix_3d_general( GLmatrix
> *mat )
> >
> > det = pos + neg;
> >
> > -   if (det*det < 1e-25)
> > +   if (det < 1e-25)
> >return GL_FALSE;
> >
> > det = 1.0F / det;
>
> Hi,
>
> just a fly-by question; doesn't that break if determinant is negative?
> I.e. reflection transformations.
>
> Yeah, I think you need a fabsf() there.
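
A minimal sketch of the check with fabsf(), assuming det is a float as in the quoted hunk. The helper name and its return convention are made up for illustration and are not the m_matrix.c code:

```c
#include <math.h>

/* Hypothetical helper: returns 0 if the determinant is too close to zero
 * to invert, otherwise stores 1/det and returns 1. */
static int
invert_det(float det, float *inv_det)
{
   /* Compare the magnitude, so negative determinants (e.g. reflections)
    * are not rejected. */
   if (fabsf(det) < 1e-25f)
      return 0;

   *inv_det = 1.0f / det;
   return 1;
}
```

With the plain "det < 1e-25" test, a determinant of -2.0 would be treated as singular; with fabsf() it inverts to -0.5 as expected.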


[Mesa-dev] Thanks To All!

2011-05-02 Thread Patrick Baggett
I just wanted to say "thanks!" to everyone who has been taking part in
Mesa3D. I have an R500-based card and it is good to know that it still
functions on Linux even after ATI/AMD decided it was too old to support.
Not only that, it still receives improvements from Mesa. I even hear
whispers that those cards might function on Power architecture systems, and
I can't help but find myself impressed. Good job to you all and keep up
the good work.

Patrick Baggett


[Mesa-dev] r128 status

2011-05-04 Thread Patrick Baggett
All,

Is the ATI Rage128 DRI (r128) driver still supported? Does anyone happen to
know of the status?

Patrick


Re: [Mesa-dev] [PATCH] mesa: silence some compilation warnings.

2011-05-12 Thread Patrick Baggett
I would be wary of assuming you can typecast long -> pointer, or pointer ->
long. On 64-bit Windows, sizeof(int) == sizeof(long) == 4 but sizeof(void*)
== 8. On 64-bit Linux (gcc), sizeof(int) == 4, sizeof(long) == sizeof(void*)
== 8. It would be better to use <stdint.h> with uintptr_t -- it was designed
to solve exactly this problem. If you insist on using long, why not use long
long (C99), which is 64 bits on both platforms?
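
A minimal sketch of what I mean, using intptr_t (the signed sibling of uintptr_t) since a file descriptor is a signed int. The helper names are hypothetical and not part of the EGL code; the round trip relies on the usual implementation-defined integer/pointer conversion that mainstream compilers provide:

```c
#include <stdint.h>

/* Hypothetical helpers: stash a small integer in a pointer-sized slot and
 * get it back. intptr_t is as wide as a pointer on LP64 and LLP64 alike,
 * so nothing is truncated on the way through. */
static void *
fd_to_handle(int fd)
{
   return (void *) (intptr_t) fd;
}

static int
handle_to_fd(void *handle)
{
   return (int) (intptr_t) handle;
}
```

The same double cast, e.g. (GLuint) (uintptr_t) buffer, avoids the size-mismatch warning without assuming anything about sizeof(long).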



On Thu, May 12, 2011 at 3:49 AM, zhigang gong wrote:

> glu.h: typedef void (GLAPIENTRYP _GLUfuncptr)(); causes the following
>   warning: function declaration isn't a prototype.
> egl:   When convert a (void *) to a int type, it's better to
>   convert to long firstly, otherwise in 64 bit envirnonment, it
>   causes compilation warning.
> ---
>  include/GL/glu.h|2 +-
>  src/egl/drivers/dri2/egl_dri2.c |4 ++--
>  src/egl/drivers/dri2/platform_drm.c |4 ++--
>  src/egl/drivers/dri2/platform_x11.c |2 +-
>  src/egl/main/eglapi.c   |2 +-
>  5 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/include/GL/glu.h b/include/GL/glu.h
> index cd967ac..ba2228d 100644
> --- a/include/GL/glu.h
> +++ b/include/GL/glu.h
> @@ -284,7 +284,7 @@ typedef GLUtesselator GLUtriangulatorObj;
>  #define GLU_TESS_MAX_COORD 1.0e150
>
>  /* Internal convenience typedefs */
> -typedef void (GLAPIENTRYP _GLUfuncptr)();
> +typedef void (GLAPIENTRYP _GLUfuncptr)(void);
>
>  GLAPI void GLAPIENTRY gluBeginCurve (GLUnurbs* nurb);
>  GLAPI void GLAPIENTRY gluBeginPolygon (GLUtesselator* tess);
> diff --git a/src/egl/drivers/dri2/egl_dri2.c
> b/src/egl/drivers/dri2/egl_dri2.c
> index afab679..f5f5ac3 100644
> --- a/src/egl/drivers/dri2/egl_dri2.c
> +++ b/src/egl/drivers/dri2/egl_dri2.c
> @@ -835,7 +835,7 @@ dri2_create_image_khr_renderbuffer(_EGLDisplay
> *disp, _EGLContext *ctx,
>struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
>struct dri2_egl_context *dri2_ctx = dri2_egl_context(ctx);
>struct dri2_egl_image *dri2_img;
> -   GLuint renderbuffer = (GLuint) buffer;
> +   GLuint renderbuffer =  (unsigned long) buffer;
>
>if (renderbuffer == 0) {
>   _eglError(EGL_BAD_PARAMETER, "dri2_create_image_khr");
> @@ -870,7 +870,7 @@ dri2_create_image_mesa_drm_buffer(_EGLDisplay
> *disp, _EGLContext *ctx,
>
>(void) ctx;
>
> -   name = (EGLint) buffer;
> +   name = (unsigned long) buffer;
>
>err = _eglParseImageAttribList(&attrs, disp, attr_list);
>if (err != EGL_SUCCESS)
> diff --git a/src/egl/drivers/dri2/platform_drm.c
> b/src/egl/drivers/dri2/platform_drm.c
> index 68912e3..cea8418 100644
> --- a/src/egl/drivers/dri2/platform_drm.c
> +++ b/src/egl/drivers/dri2/platform_drm.c
> @@ -596,7 +596,7 @@ dri2_get_device_name(int fd)
>   goto out;
>}
>
> -   device_name = udev_device_get_devnode(device);
> +   device_name = (char*)udev_device_get_devnode(device);
>if (!device_name)
>   goto out;
>device_name = strdup(device_name);
> @@ -690,7 +690,7 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
>memset(dri2_dpy, 0, sizeof *dri2_dpy);
>
>disp->DriverData = (void *) dri2_dpy;
> -   dri2_dpy->fd = (int) disp->PlatformDisplay;
> +   dri2_dpy->fd = (long) disp->PlatformDisplay;
>
>dri2_dpy->driver_name = dri2_get_driver_for_fd(dri2_dpy->fd);
>if (dri2_dpy->driver_name == NULL)
> diff --git a/src/egl/drivers/dri2/platform_x11.c
> b/src/egl/drivers/dri2/platform_x11.c
> index 5d4ac6a..90136f4 100644
> --- a/src/egl/drivers/dri2/platform_x11.c
> +++ b/src/egl/drivers/dri2/platform_x11.c
> @@ -784,7 +784,7 @@ dri2_create_image_khr_pixmap(_EGLDisplay *disp,
> _EGLContext *ctx,
>
>(void) ctx;
>
> -   drawable = (xcb_drawable_t) buffer;
> +   drawable = (xcb_drawable_t) (long)buffer;
>xcb_dri2_create_drawable (dri2_dpy->conn, drawable);
>attachments[0] = XCB_DRI2_ATTACHMENT_BUFFER_FRONT_LEFT;
>buffers_cookie =
> diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
> index 336ec23..9063752 100644
> --- a/src/egl/main/eglapi.c
> +++ b/src/egl/main/eglapi.c
> @@ -1168,7 +1168,7 @@ eglQueryModeStringMESA(EGLDisplay dpy, EGLModeMESA
> mode)
>  EGLDisplay EGLAPIENTRY
>  eglGetDRMDisplayMESA(int fd)
>  {
> -   _EGLDisplay *dpy = _eglFindDisplay(_EGL_PLATFORM_DRM, (void *) fd);
> +   _EGLDisplay *dpy = _eglFindDisplay(_EGL_PLATFORM_DRM, (void *)
> (long)fd);
>return _eglGetDisplayHandle(dpy);
>  }
>
> --
> 1.7.3.1
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH v2 2/2] xorg/nouveau: blacklist all pre NV30 cards

2011-06-05 Thread Patrick Baggett
Wasn't nouveau targeted to provide HW acceleration for old cards like the
TNT2, or has that idea been killed?

Patrick

On Sun, Jun 5, 2011 at 2:06 PM, Marcin Slusarz wrote:

> On Tue, May 17, 2011 at 12:20:14AM +0200, Marcin Slusarz wrote:
> > Bail out early in probe, so other driver can take control of the card.
> > Doing it in screen_create would be too late.
> > ---
> >  src/gallium/targets/xorg-nouveau/nouveau_xorg.c |   44
> ++-
> >  1 files changed, 35 insertions(+), 9 deletions(-)
>
> ping
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] is it possible to dynamic load OSMesa?

2011-07-15 Thread Patrick Baggett
If libOSMesa.so is a separate library, then isn't libGL.so too? You're calling
glGetIntegerv() from libGL.so but not from libOSMesa.so -- try doing
dlsym("glGetIntegerv") and removing libGL.so from the link line.
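
Roughly like this -- a sketch only. load_symbol() and load_get_integerv() are hypothetical helpers, the function-pointer typedef uses plain C types as stand-ins for GLenum/GLint, and I'm assuming libOSMesa.so exports glGetIntegerv itself:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: open a library and look up one symbol in it.
 * Returns NULL if either step fails. The handle is deliberately never
 * closed, since the library has to stay loaded while the symbol is used. */
static void *
load_symbol(const char *library, const char *symbol)
{
   void *handle = dlopen(library, RTLD_NOW | RTLD_GLOBAL);
   if (handle == NULL)
      return NULL;
   return dlsym(handle, symbol);
}

/* Stand-in for the glGetIntegerv signature (GLenum pname, GLint *params). */
typedef void (*GetIntegervFn)(unsigned int pname, int *params);

/* Fetch glGetIntegerv from the same library the context came from,
 * instead of letting the linker resolve it against libGL.so. */
static GetIntegervFn
load_get_integerv(void)
{
   return (GetIntegervFn) load_symbol("libOSMesa.so", "glGetIntegerv");
}
```

You would then call the returned pointer in place of glGetIntegerv(), after checking it for NULL. On older glibc you may need -ldl on the link line.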

Patrick

On Fri, Jul 15, 2011 at 2:41 PM, Paul Gotzel  wrote:

> Hello,
>
> I've downloaded the latest 7.10.3 and I need to be able to dynamically load
> OSMesa.  Is this possible?  I've tried to use dlopen and dlsym to load the
> functions and all the OSMesa calls return success but when I make the gl
> calls I get:
>
> GL User Error: glGetIntegerv called without a rendering context
> GL User Error: glGetIntegerv called without a rendering context
> GL User Error: glGetIntegerv called without a rendering context
>
> Any help would be appreciated.
>
> Thanks,
> Paul
>
> My sample program is as follows.  I compile it with the same flags as the
> rest of the demo programs without linking to OSMesa.
>
> static void *
> loadOSMesa()
> {
>   return dlopen("libOSMesa.so", RTLD_DEEPBIND | RTLD_NOW | RTLD_GLOBAL);
> }
>
> static OSMesaContext
> dynOSMesaCreateContext()
> {
>   typedef OSMesaContext (*CreateContextProto)( GLenum , GLint , GLint ,
> GLint , OSMesaContext );
>   static void *createPfunc = NULL;
>   CreateContextProto createContext;
>   if (createPfunc == NULL)
>   {
> void *handle = loadOSMesa();
> if (handle)
> {
>   createPfunc = dlsym(handle, "OSMesaCreateContextExt");
> }
>   }
>
>   if (createPfunc)
>   {
> createContext = (CreateContextProto)(createPfunc);
> return (*createContext)(GL_RGBA, 16, 0, 0, NULL);
>   }
>   return 0;
> }
>
> static GLboolean
> dynOSMesaMakeCurrent(OSMesaContext cid, void * win, GLenum type, GLsizei w,
> GLsizei h)
> {
>   typedef GLboolean (*MakeCurrentProto)(OSMesaContext, void *, GLenum,
> GLsizei, GLsizei);
>   static void *currentPfunc = NULL;
>   MakeCurrentProto makeCurrent;
>   if (currentPfunc == NULL)
>   {
> void *handle = loadOSMesa();
> if (handle)
> {
>   currentPfunc = dlsym(handle, "OSMesaMakeCurrent");
> }
>   }
>   if (currentPfunc)
>   {
> makeCurrent = (MakeCurrentProto)(currentPfunc);
> return (*makeCurrent)(cid, win, type, w, h);
>   }
>   return GL_FALSE;
> }
>
> int
> main(int argc, char *argv[])
> {
>OSMesaContext ctx;
>void *buffer;
>
>ctx = dynOSMesaCreateContext();
>if (!ctx) {
>   printf("OSMesaCreateContext failed!\n");
>   return 0;
>}
>
>int Width = 100;
>int Height = 100;
>
>/* Allocate the image buffer */
>buffer = malloc( Width * Height * 4 * sizeof(GLubyte) );
>if (!buffer) {
>   printf("Alloc image buffer failed!\n");
>   return 0;
>}
>
>/* Bind the buffer to the context and make it current */
>if (!dynOSMesaMakeCurrent( ctx, buffer, GL_UNSIGNED_BYTE, Width, Height
> )) {
>   printf("OSMesaMakeCurrent failed!\n");
>   return 0;
>}
>
>
>{
>   int z, s, a;
>   glGetIntegerv(GL_DEPTH_BITS, &z);
>   glGetIntegerv(GL_STENCIL_BITS, &s);
>   glGetIntegerv(GL_ACCUM_RED_BITS, &a);
>   printf("Depth=%d Stencil=%d Accum=%d\n", z, s, a);
>}
>
>return 0;
> }
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>


Re: [Mesa-dev] rationale for GLubyte pointers for strings?

2011-07-19 Thread Patrick Baggett
SGI invented OpenGL and offered it first on their IRIX platform. SGI's
MIPSpro compiler has the "char" datatype as unsigned by default, so the
compiler would likely complain if assigning a GLbyte pointer to an
[unsigned] character pointer. Thus, to do something like

char* ext = glGetString(GL_VENDOR);

doesn't require a cast on IRIX, while the same code would require a cast
using other compilers due to the aforementioned problem.

Patrick


On Tue, Jul 19, 2011 at 1:44 PM, Allen Akin  wrote:

> On Tue, Jul 19, 2011 at 12:20:54PM -0600, tom fogal wrote:
> | glGetString and gluErrorString, plus maybe some other functions, return
> | GLubyte pointers instead of simply character pointers...
> | What's the rationale here?
>
> I agree, it's odd.  I don't remember the rationale, but my best guess is
> that it papered over some compatibility issue with another language
> binding (probably Fortran).  I suppose there's a very slight possibility
> that it sprang from a compatibility issue with Cray.
>
> Allen
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH] Implement HW accelerated GL_SELECT

2011-08-02 Thread Patrick Baggett
Might want to fix the copyright message at the top of source file. ;)

On Tue, Aug 2, 2011 at 8:37 AM, Micael Dias  wrote:

> ---
>  src/mesa/main/mtypes.h   |7 +
>  src/mesa/state_tracker/st_cb_feedback.c  |   21 +-
>  src/mesa/state_tracker/st_draw.h |   17 +
>  src/mesa/state_tracker/st_draw_select_emul.c |  463
> ++
>  src/mesa/SConscript  |1 +
>  src/mesa/sources.mak |1 +
>  6 files changed, 505 insertions(+), 5 deletions(-)
>  create mode 100644 src/mesa/state_tracker/st_draw_select_emul.c
>
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index b881183..10222d8 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1721,6 +1721,13 @@ struct gl_selection
>GLboolean HitFlag;  /**< hit flag */
>GLfloat HitMinZ;/**< minimum hit depth */
>GLfloat HitMaxZ;/**< maximum hit depth */
> +   struct gl_selection_emul /* data related to hw accelerated GL_SELECT */
> +   {
> +  GLboolean hw_unsupported;
> +  struct gl_framebuffer *fbo;
> +  GLuint renderBuffer_depth;
> +  GLuint renderBuffer_color;
> +   } emul;
>  };
>
>
> diff --git a/src/mesa/state_tracker/st_cb_feedback.c
> b/src/mesa/state_tracker/st_cb_feedback.c
> index 9b85a39..9382895 100644
> --- a/src/mesa/state_tracker/st_cb_feedback.c
> +++ b/src/mesa/state_tracker/st_cb_feedback.c
> @@ -276,17 +276,28 @@ st_RenderMode(struct gl_context *ctx, GLenum newMode
> )
>  {
>struct st_context *st = st_context(ctx);
>struct draw_context *draw = st->draw;
> +   bool hw_acc_path = _mesa_getenv("MESA_HW_SELECT") &&
> !ctx->Select.emul.hw_unsupported;
>
>if (newMode == GL_RENDER) {
>   /* restore normal VBO draw function */
>   vbo_set_draw_func(ctx, st_draw_vbo);
>}
>else if (newMode == GL_SELECT) {
> -  if (!st->selection_stage)
> - st->selection_stage = draw_glselect_stage(ctx, draw);
> -  draw_set_rasterize_stage(draw, st->selection_stage);
> -  /* Plug in new vbo draw function */
> -  vbo_set_draw_func(ctx, st_feedback_draw_vbo);
> +  if (hw_acc_path) {
> + if (st_select_emul_begin(ctx)) {
> +vbo_set_draw_func(ctx, st_select_draw_func);
> + }
> + else {
> +hw_acc_path = false;
> + }
> +  }
> +  if (!hw_acc_path) {
> + if (!st->selection_stage)
> +st->selection_stage = draw_glselect_stage(ctx, draw);
> + draw_set_rasterize_stage(draw, st->selection_stage);
> + /* Plug in new vbo draw function */
> + vbo_set_draw_func(ctx, st_feedback_draw_vbo);
> +  }
>}
>else {
>   if (!st->feedback_stage)
> diff --git a/src/mesa/state_tracker/st_draw.h
> b/src/mesa/state_tracker/st_draw.h
> index a7b50ce..d27e321 100644
> --- a/src/mesa/state_tracker/st_draw.h
> +++ b/src/mesa/state_tracker/st_draw.h
> @@ -87,5 +87,22 @@ pointer_to_offset(const void *ptr)
>return (unsigned) (((unsigned long) ptr) & 0xUL);
>  }
>
> +/* Functions used by the hw accelerated GL_SELECT emulator
> + */
> +extern bool
> +st_select_emul_begin(struct gl_context *ctx);
> +
> +extern void
> +st_select_emul_end(struct gl_context *ctx);
> +
> +extern void
> +st_select_draw_func(struct gl_context *ctx,
> +const struct gl_client_array **arrays,
> +const struct _mesa_prim *prims,
> +GLuint nr_prims,
> +const struct _mesa_index_buffer *ib,
> +GLboolean index_bounds_valid,
> +GLuint min_index,
> +GLuint max_index);
>
>  #endif
> diff --git a/src/mesa/state_tracker/st_draw_select_emul.c
> b/src/mesa/state_tracker/st_draw_select_emul.c
> new file mode 100644
> index 000..78065dd
> --- /dev/null
> +++ b/src/mesa/state_tracker/st_draw_select_emul.c
> @@ -0,0 +1,463 @@
>
> +/**
> + *
> + * Copyright .
> + * All Rights Reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, including
> + * without limitation the rights to use, copy, modify, merge, publish,
> + * distribute, sub license, and/or sell copies of the Software, and to
> + * permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> + * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS 

Re: [Mesa-dev] GPL'd vl_mpeg12_bitstream.c

2011-08-12 Thread Patrick Baggett
Why not ask the original author to relicense?

2011/8/12 Marek Olšák 

> 2011/8/12 Christian König :
> > Am Freitag, den 12.08.2011, 10:49 -0400 schrieb Younes Manton:
> >> Sorry, by incompatible I didn't mean that you couldn't use them
> >> together, but that one is more restrictive than the other. Like the
> >> discussion you quoted states, if you combine MIT and GPL you have to
> >> satisfy both of them, which means you have to satisfy the GPL. I
> >> personally don't care that much, but unfortunately with the way
> >> gallium is built it affects more than just VDPAU.
> >>
> >> Every driver in lib/gallium includes that code, including swrast_dri
> >> (softpipe), r600_dri, etc, and libGL loads those drivers. If you build
> >> with the swrast config instead of DRI I believe galllium libGL
> >> statically links with softpipe, so basically my understanding is that
> >> anyone linking with gallium libGL (both swrast and DRI configs) has to
> >> satisfy the GPL now.
> > A crap, your right. I've forgotten that GPL has even a problem when code
> > is just linked in, compared to being used.
> >
> >> Maybe someone else who is more familiar with these sorts of things can
> >> comment and confirm that this is accurate and whether or not it's a
> >> problem.
> > I already asked around in my AMD team, and the general answer was: Oh
> > fuck I've no idea, please don't give me a headache. I could asked around
> > a bit more, but I don't think we get a definitive answer before xmas.
> >
> > As a short term solution we could compile that code conditionally, and
> > only enable it when the VDPAU state tracker is enabled. But as the long
> > term solution the code just needs a rewrite, beside having a license
> > problem, it is just not very optimal. The original code is something
> > like a decade old, and is using a whole bunch of quirks which are not
> > useful by today’s standards (not including the sign in mv tables for
> > example). ffmpegs/libavs implementation for example is something like
> > halve the size and even faster, but uses more memory for table lookups.
> > But that code is also dual licensed under the GPL/LGPL.
> >
> > Using LGPL code instead could also be a solution, because very important
> > parts of Mesa (the GLSL parser for example) is already licensed under
> > that, but I'm also not an expert with that also.
>
> Even though the GLSL parser is licensed under LGPL (because Bison is),
> there is a special exception that we may license it under whatever
> licence we want if we don't make software that does exactly what Bison
> does. So the whole GLSL compiler is actually licensed under the MIT
> license. There was one LGPL dependency (talloc), but Intel has paid
> special attention to get rid of that. My recollection is nobody wanted
> LGPL or GPL code in Mesa.
>
> Marek
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] DEATH to old drivers!

2011-08-24 Thread Patrick Baggett
My Voodoo3 3500 AGP just wept.

On Wed, Aug 24, 2011 at 4:36 PM, Eric Anholt  wrote:

> On Wed, 24 Aug 2011 12:11:32 -0700, Ian Romanick 
> wrote:
> > -BEGIN PGP SIGNED MESSAGE-
> > Hash: SHA1
> >
> > I'd like to propose giving the ax to a bunch of old, unmaintained
> > drivers.  I've been doing a bunch of refactoring and reworking of core
> > Mesa code, and these drivers have been causing me problems for a number
> > of reasons.
>
> Acked!
>


Re: [Mesa-dev] Four questions about DRI1 drivers

2012-03-01 Thread Patrick Baggett
Now I'm curious. Is it the case that every DRI1 driver *could be* a DRI2
driver with enough effort? Not talking about emulating hardware features.

Patrick

On Thu, Mar 1, 2012 at 1:46 PM, Dave Airlie  wrote:

> On Thu, Mar 1, 2012 at 7:25 PM, Connor Behan 
> wrote:
> > On 01/03/12 01:36 AM, Dave Airlie wrote:
> >>
> >> You can still build r128_dri.so from Mesa 7.11 and it will work with
> >> later Mesa libGLs fine. You just can't build it from Mesa 8.0 source
> >> anymore.
> >
> > Really? Even if no one updates r128 to stay compatible with new libGLs
> > and no one updating libGL gives a second thought as to whether that
> > update will break r128? I thought the whole point of removing DRI1
> > drivers is that most of you are too pressured to keep that promise. If
> > the plan really is to update libGL carefully so that DRI1 drivers will
> > always work with it, then it seems like their removal does nothing but
> > save a few MB of space on the git server.
>
> That's the plan; some distros have to keep shipping older drivers, but
> also want to ship newer drivers.
>
> The libGL -> driver interface is a lot more standard than the internal
> mesa<->driver interfaces, and the two are not the same thing.
>
> Removing the drivers allowed major simplification of mesa internal
> interfaces not the GL->driver interface.
>
> It doesn't save any space on the git server since git holds all the
> history ever.
>
> Dave.


Re: [Mesa-dev] IROUND() issue

2012-05-18 Thread Patrick Baggett
On Fri, May 18, 2012 at 11:28 AM, Brian Paul  wrote:

> On 05/18/2012 10:11 AM, Jose Fonseca wrote:
>
>>
>>
>> - Original Message -
>>
>>>
>>> A while back I noticed that the piglit roundmode-pixelstore and
>>> roundmode-getinteger tests pass on my 64-bit Fedora system but fail
>>> on a 32-bit Ubuntu system.  Both glGetIntegerv() and glPixelStoref()
>>> use the IROUND() function to convert floats to ints.
>>>
>>> The implementation of IROUND() that uses the x86 fistp instruction is
>>> protected with:
>>>
>>> #if defined(USE_X86_ASM) && defined(__GNUC__) && defined(__i386__)
>>>
>>>
>>> but that evaluates to 0 on x86-64 (neither USE_X86_ASM nor __i386__
>>> are defined) so we use the C fallback:
>>>
>>> #define IROUND(f)  ((int) (((f) >= 0.0F) ? ((f) + 0.5F) : ((f) - 0.5F)))
>>>
>>> The C version of IROUND() does what we want for the piglit tests but
>>> not the x86 version.  I think the default x86 rounding mode is
>>> FE_UPWARD so that explains the failures.
>>>
>>>
>>> So I think I'd like to do the following:
>>>
>>> 1. Enable the x86 fistp-based functions in imports.h for x86-64.
>>>
>>
>> It's illegal/inefficient to use x87 on x86-64. We should use the
>> appropriate SSE intrinsic instead.
>>
>
The instruction is "cvtss2si". Even if you use SSE here, you depend on the
rounding mode in the MXCSR register, which means you'll have to set that,
because some applications change this mode to use a faster or more precise
rounding mode. It's the same problem that you have with "fistp".
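
For illustration, a minimal sketch of what pinning the mode around
"cvtss2si" could look like with the SSE intrinsics from <xmmintrin.h>. The
names iround_nearest, IROUND_C and HAVE_SSE_IROUND are made up for this
example and are not Mesa code. On builds without SSE it falls back to the C
macro quoted in this thread, which rounds ties away from zero rather than to
even, so the two paths only agree away from exact .5 values.

```c
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE_IROUND 1   /* hypothetical guard, not a Mesa define */
#endif

/* Portable fallback from the thread: rounds half away from zero and does
 * not consult any rounding-mode state. */
#define IROUND_C(f) ((int) (((f) >= 0.0F) ? ((f) + 0.5F) : ((f) - 0.5F)))

/* Hypothetical helper: round to nearest no matter what mode the
 * application left in MXCSR, then put the application's mode back. */
static int iround_nearest(float f)
{
#ifdef HAVE_SSE_IROUND
    unsigned int saved = _MM_GET_ROUNDING_MODE();
    int result;

    _MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);
    result = _mm_cvtss_si32(_mm_set_ss(f));  /* compiles to cvtss2si */
    _MM_SET_ROUNDING_MODE(saved);
    return result;
#else
    return IROUND_C(f);
#endif
}
```

The save/restore on every call is exactly the cost being discussed: it buys
a predictable result at the price of two MXCSR writes per conversion.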


>
>>> 2. Rename IROUND() to IROUND_FAST() and define it as float->int
>>> conversion by whatever method is fastest.
>>>
>>> 3. Define IROUND() as round to nearest int.  For the x86 fistp
>>> implementation this would involve setting/restoring the rounding
>>> mode.
>>>
>>
If I recall, the FPU generally runs with a rounding mode other than
"truncate" by default, so float -> int conversions that involve
truncation (a C cast) usually require changing the rounding mode *to
truncation*. This was such a problem that SSE3 added "fisttp", which is "FP
integer store with truncation". I guess, though, that if the default rounding
mode causes problems, there isn't much that can be done but change it each
time.

Patrick


Re: [Mesa-dev] Mesa (master): Use signbit() in IS_NEGATIVE and DIFFERENT_SIGNS

2012-09-24 Thread Patrick Baggett
The signbit() on that MSDN page is Concurrency::precise_math::signbit(), and
it exists only as of the VS 2012 runtimes. That is an awfully high bar for
such a simple function.
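
For compilers without a usable signbit(), one fallback is to read the sign
bit directly, much like the USE_IEEE path in the quoted patch. This is only
a sketch: the names float_signbit and different_signs_sketch are
hypothetical, and it assumes float is a 32-bit IEEE-754 single.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 if the sign bit of x is set (including -0.0f), else 0.
 * memcpy is used instead of a union or pointer cast to stay within the
 * aliasing rules. Assumes a 32-bit IEEE-754 float. */
static int float_signbit(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof bits);
    return (int) (bits >> 31);
}

/* Return 1 if x and y have different sign bits, else 0. */
static int different_signs_sketch(float x, float y)
{
    return float_signbit(x) != float_signbit(y);
}
```

Unlike a plain `x < 0.0F` test, this reports negative zero as negative,
which matches what signbit() does.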



On Mon, Sep 24, 2012 at 1:43 PM, Matt Turner  wrote:

> On Mon, Sep 24, 2012 at 11:02 AM, Brian Paul  wrote:
> > On 09/24/2012 10:49 AM, Matt Turner wrote:
> >>
> >> Module: Mesa
> >> Branch: master
> >> Commit: 0f3ba405eada72e1ab4371948315b28608903927
> >> URL:
> >> http://cgit.freedesktop.org/mesa/mesa/commit/?id=0f3ba405eada72e1ab4371948315b28608903927
> >>
> >> Author: Matt Turner
> >> Date:   Fri Sep 14 16:04:40 2012 -0700
> >>
> >> Use signbit() in IS_NEGATIVE and DIFFERENT_SIGNS
> >>
> >> signbit() appears to be available everywhere (even MSVC according to
> >> MSDN), so let's use it instead of open-coding some messy and confusing
> >> bit twiddling macros.
> >>
> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54805
> >> Reviewed-by: Paul Berry
> >> Suggested-by: Ian Romanick
> >>
> >> ---
> >>
> >>   configure.ac   |7 +++
> >>   src/mesa/main/macros.h |   21 ++---
> >>   2 files changed, 9 insertions(+), 19 deletions(-)
> >>
> >> diff --git a/configure.ac b/configure.ac
> >> index 4193496..cb65467 100644
> >> --- a/configure.ac
> >> +++ b/configure.ac
> >> @@ -499,6 +499,13 @@ AC_SUBST([DLOPEN_LIBS])
> >>   dnl See if posix_memalign is available
> >>   AC_CHECK_FUNC([posix_memalign], [DEFINES="$DEFINES -DHAVE_POSIX_MEMALIGN"])
> >>
> >> +dnl signbit() is a macro in glibc's math.h, so AC_CHECK_FUNC fails. To handle
> >> +dnl this, use AC_CHECK_DECLS and fallback to AC_CHECK_FUNC in case it fails.
> >> +AC_CHECK_DECLS([signbit],[],
> >> +   AC_CHECK_FUNC([signbit],[],
> >> + AC_MSG_ERROR([could not find signbit()])),
> >> +   [#include <math.h>])
> >> +
> >>   dnl SELinux awareness.
> >>   AC_ARG_ENABLE([selinux],
> >>   [AS_HELP_STRING([--enable-selinux],
> >> diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
> >> index 04d59d7..7b7fd1b 100644
> >> --- a/src/mesa/main/macros.h
> >> +++ b/src/mesa/main/macros.h
> >> @@ -693,31 +693,14 @@ NORMALIZE_3FV(GLfloat v[3])
> >>   static inline GLboolean
> >>   IS_NEGATIVE(float x)
> >>   {
> >> -#if defined(USE_IEEE)
> >> -   fi_type fi;
> >> -   fi.f = x;
> >> -   return fi.i < 0;
> >> -#else
> >> -   return x < 0.0F;
> >> -#endif
> >> +   return signbit(x) != 0;
> >>   }
> >>
> >> -
> >>   /** Test two floats have opposite signs */
> >>   static inline GLboolean
> >>   DIFFERENT_SIGNS(GLfloat x, GLfloat y)
> >>   {
> >> -#if defined(USE_IEEE)
> >> -   fi_type xfi, yfi;
> >> -   xfi.f = x;
> >> -   yfi.f = y;
> >> -   return !!((xfi.i ^ yfi.i) & (1u << 31));
> >> -#else
> >> -   /* Could just use (x*y<0) except for the flatshading requirements.
> >> -    * Maybe there's a better way?
> >> -    */
> >> -   return ((x) * (y) <= 0.0F && (x) - (y) != 0.0F);
> >> -#endif
> >> +   return signbit(x) != signbit(y);
> >>   }
> >>
> >>
> >
> > Looks like we don't have signbit() on Windows.  We build with scons
> > there so the autoconf check isn't applicable.  I'll post a patch in a bit.
> >
> > -Brian
>
> MSDN claims that Windows does have signbit():
> http://msdn.microsoft.com/en-us/library/hh308342.aspx


Re: [Mesa-dev] R600 tiling halves the frame rate

2012-10-30 Thread Patrick Baggett
Is your screen refresh rate 70 Hz? Because if so, that means that it's
syncing to the vblank on Mesa, and not doing so on the proprietary one.

Patrick

On Mon, Oct 29, 2012 at 8:24 PM, Tzvetan Mikov  wrote:

> On 10/28/2012 12:56 PM, Tzvetan Mikov wrote:
>
>> On 10/28/2012 04:26 AM, Marek Olšák wrote:
>> No, there is no X11 at all. I am running my tests on a very bare system
>> with EGL only, hoping to minimize the test surface and isolate any
>> interferences.
>>
>> I will try it though (it will also enable me to compare against the
>> proprietary drivers as a baseline, I guess).
>>
>
> This is not directly related to tiling, but I installed the proprietary
> drivers on the same hardware, and I am getting about 3X the performance.
> (From 70 FPS to 225 FPS in 1920x1200 on a HD6460).
>
> Is it known what the main reason is for such a dramatic performance
> difference between the Mesa R600 driver and proprietary driver? This is a
> very simple test app rendering two textured rectangles on screen, so I am
> guessing the difference must be due to something fundamental.
>
>
> regards,
> Tzvetan


[Mesa-dev] GL 3.1 on Radeon HD 4670?

2012-10-31 Thread Patrick Baggett
Hi all,

I've got a really weird duck of a system: an Itanium2 system running Linux
3.7.0-rc3 with the newest libdrm and mesa git from yesterday. I configured
it with --enable-texture-float and the radeon DRI driver. When I use
glxinfo, I see that it is Mesa 9.1-devel but only OpenGL 3.0. Is that
because my version of glxinfo doesn't create the appropriate context? Is
there an updated version of glxinfo that does? Or a flag that I should pass
to only consider core contexts?

Thanks for making all of this open source and possible too; I probably have
one of the only ia64 systems with GL >= 3.0 in the world!

Patrick


Re: [Mesa-dev] GL 3.1 on Radeon HD 4670?

2012-10-31 Thread Patrick Baggett
DOH. I'm sorry, I read that Mesa supported GL 3.1 and somehow I generalized
that to all drivers. Thanks for that TODO list. I guess I need to start
reading about the R700 architecture...

Patrick

On Wed, Oct 31, 2012 at 1:28 PM, Alex Deucher  wrote:

> On Wed, Oct 31, 2012 at 1:11 PM, Patrick Baggett
>  wrote:
> > Hi all,
> >
> > I've got a really weird duck of a system: an Itanium2 system running Linux
> > 3.7.0-rc3 with the newest libdrm and mesa git from yesterday. I configured
> > it with --enable-texture-float and the radeon DRI driver. When I use
> > glxinfo, I see that it is Mesa 9.1-devel but only OpenGL 3.0. Is that
> > because my version of glxinfo doesn't create the appropriate context? Is
> > there an updated version of glxinfo that does? Or a flag that I should
> > pass to only consider core contexts?
> >
>
> The open source r600g driver only supports GL 3.0 at the moment.  See
> this document to see what's still missing:
> http://cgit.freedesktop.org/mesa/mesa/tree/docs/GL3.txt
>
> Alex
>


Re: [Mesa-dev] Mesa (d3d1x): d3d1x: add new Direct3D 10/11 COM state tracker for Gallium

2010-09-22 Thread Patrick Baggett

 On 9/21/2010 9:37 PM, James McKenzie wrote:
 I just read a comment that you made in the Mesa mailing list and am 
seriously concerned about it:


It only attempts to prevent reverse engineering, disassembly and 
decompilation, and does not grant
you distribution rights under copyright law in the case that you 
distribute Microsoft code to run

on non-Windows platform or license it under a copyleft license.

Please keep in mind that Microsoft has been 'tightening' their EULA 
agreements.  Before you go out

and do something like what you expressed here:

"I just looked at the d3d11TokenizedProgramFormat.h header because the 
documentation on MSDN says

that the shader bytecode format is documented in that file"

bear in mind that Microsoft does track what is being done with their code. If 
you step over the line, they
will 'nail' you.  They give absolutely NO warning. One user decided to 
work with WMP 10 to get it
to work under RedHat.  He was given a court order to 'cease and 
desist' all work in this manner.


This is why the Wine project and its predecessors Project Odin and 
the WineOS/2 project have always
kept a hands-off, no-peeking-'under the hood' policy.  No FOSS 
project wants a visit from any Justice
Department, any Copyright office and lastly no visit from the Thugs 
Who Work for Microsoft.


Again, if you feel that what you did is justified, then you are so.  
Also, keep in mind that your code
can be blocked in countries that strictly enforce the Microsoft EULAs 
(the United States is one of them.)


Very respectfully,

James McKenzie
I'm not a lawyer, and I don't play one on TV.  However, I have been 
around for the US versus IBM and US versus Microsoft cases

The US Government lost both of them.




I hate hate hate to ask, but doesn't this precedent 
http://en.wikipedia.org/wiki/Sega_v._Accolade cover it? It seems like 
it'd be hard to argue that this examining of copyrighted material wasn't 
done for the sake of interoperability and that there were other means of 
figuring out what these tokens might be. This probably isn't the place 
to have such a discussion, but I'd bet $20 that this would be upheld, 
you know, if Microsoft didn't appeal the case until the defendant ran 
out of money. Such is America. :\


Patrick


Re: [Mesa-dev] [PATCH] os: add spinlocks

2010-12-15 Thread Patrick Baggett
UP = Uniprocessor system, (S)MP = (Symmetric) multiprocessor system.
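
The distinction matters for the question above: on a UP system a thread that
spins is burning the only CPU, so the lock holder cannot make progress until
the spinner is preempted. A minimal sketch of the kind of lock being
discussed, using C11 <stdatomic.h>. This is not the code from the patch, and
the spinlock_sketch names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock built on a C11 atomic_flag. */
typedef struct {
    atomic_flag held;
} spinlock_sketch;

#define SPINLOCK_SKETCH_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns true if the lock was acquired. */
static bool spin_trylock_sketch(spinlock_sketch *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is acquired. Only sensible when the holder can
 * run on another CPU and the critical section is very short. */
static void spin_lock_sketch(spinlock_sketch *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* spin */
}

/* Release the lock. */
static void spin_unlock_sketch(spinlock_sketch *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

One way to stay UP-friendly, if it is needed, would be to use the trylock
form and fall back to an ordinary mutex or a yield when it fails.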

On Wed, Dec 15, 2010 at 2:23 AM, Marek Olšák  wrote:

> On Tue, Dec 14, 2010 at 8:10 PM, Thomas Hellstrom 
> wrote:
>
>> Hmm,
>>
>> for the uninformed, where do we need to use spinlocks in gallium and how
>> do
>> we avoid using them on an UP system?
>>
>
> I plan to use spinlocks to guard very simple code like the macro
> remove_from_list, which might be, under some circumstances, called too
> often. Entering and leaving a mutex is quite visible in callgrind.
>
> What does UP stand for?
>
> Marek
>


Re: [Mesa-dev] Truncated extensions string

2011-03-11 Thread Patrick Baggett
I feel like there is some kind of underlying lesson that we, OpenGL app
programmers, should be getting out of this...

What about a pseudo-database of app -> extension list rather than by year?
Surely Quake3 makes use of no more than 10 or so extensions. I'd imagine the
same holds true for other old games as well. A simple "strings" on their
binary could figure that out...
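
Roughly what I have in mind, as a sketch only. The table contents, the
struct and the function name are all made up for illustration, and how the
application name would be obtained is left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: an application name and the short extension
 * string that application is known to look for. */
struct app_ext_entry {
    const char *app_name;
    const char *extensions;
};

/* Example data only; a real table would be filled in per game. */
static const struct app_ext_entry app_ext_table[] = {
    { "example_old_game", "GL_ARB_multitexture GL_EXT_texture_env_add" },
};

/* Return the tailored extension string for app_name, or full_list when
 * the application is not in the table. */
static const char *
extensions_for_app(const char *app_name, const char *full_list)
{
    size_t i;

    for (i = 0; i < sizeof(app_ext_table) / sizeof(app_ext_table[0]); i++) {
        if (strcmp(app_ext_table[i].app_name, app_name) == 0)
            return app_ext_table[i].extensions;
    }
    return full_list;
}
```

Applications that are not listed still get the full string, so newer builds
of a game would only be limited if someone adds them to the table.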

On Fri, Mar 11, 2011 at 2:14 PM, Kenneth Graunke wrote:

> On Friday, March 11, 2011 10:46:31 AM José Fonseca wrote:
> > On Fri, 2011-03-11 at 09:04 -0800, Eric Anholt wrote:
> > > On Fri, 11 Mar 2011 10:33:13 +, José Fonseca wrote:
> > > > The problem from
> > > >
> > > >
> > > > http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12493.html
> > > >
> > > > is back, and now a bit worse -- it causes Quake3 arena demo to crash
> > > > (at least the windows version). The full version works fine. I'm not
> > > > sure what other applications are hit by this. See the above thread
> > > > for more background.
> > > >
> > > >
> > > > There are two major approaches:
> > > >
> > > > 1) sort extensions chronologically instead of alphabetically. See
> > > > attached patch for that
> > > >
> > > >   - for those who prefer to see extensions sorted alphabetically in
> > > >
> > > > glxinfo, we could modify glxinfo to sort then before displaying
> > > >
> > > > 2) detect broken applications (i.e., by process name), and only sort
> > > > extensions strings chronologically then
> > > >
> > > > Personally I think that varying behavior based on process name is a
> > > > ugly and brittle hack, so I'd prefer 1), but I just want to put this
> > > > on my back above all, so whatever works is also fine by me.
> > >
> > > If this is just a hack for one broken application, and we think that
> > > building in a workaround for this particular broken application is
> > > important (I don't), I still prefer an obvious hack for that broken
> > > application like feeding it a tiny extension string that it cares
> > > about, instead of reordering the extension list.
> >
> > There are many versions of Quake3 out there, some fixed, others not, and
> > others enhanced. This means a tiny string would prevent any Quake3
> > application from finding newer extensions. So I think that if we go for
> > the application name detection then we should present the whole
> > extension string sorted chronologically, instead of giving a tiny
> > string.
> >
> > Jose
>
> I agree with José - it's not one broken application, it's a number of old,
> sometimes closed-source games that we can't change.
>
> I'm not sure how changing the sorting solves the problem, anyway - the
> amount of data returned would still overflow the buffer, possibly wreaking
> havoc.  I'd rather avoid that.
>
> Ian and I talked about this a year ago, and the solution I believe we came
> up with was to use a driconf option or environment variable:
>
> If MESA_MAX_EXTENSION_YEAR=2006, then glGetString would only return
> extensions created in 2006 or earlier.  The rationale is that if a game
> came out in 2006, it won't know about any extensions from 2007 anyway, so
> advertising them is useless.  The fixed-size buffer is also almost
> certainly large enough to handle this cut-down list of extensions.
>
> This should be trivial to do now that you already have the years for each
> extension...just store them in the table, rather than in comments, and
> check before listing an extension.
>
> A driconf option is nice because it allows this to be overridden in .drirc
> on a per-app basis, rather than having to set an environment variable.  It
> might be a bit more work though.
>
> --Kenneth


Re: [Mesa-dev] Naked DXTn support via ARB_texture_compression?

2011-03-20 Thread Patrick Baggett
Offhand, anyone know when these patents expire?

Patrick