This adds seamless sampling for cubemap boundaries if requested.
The corner case averaging is messy but seems like it should be spec
compliant.
The face direction stuff is also a bit messy, I've no idea if that could
or should be simpler, or even if all my directions are fully correct!
Signed-of
https://bugs.freedesktop.org/show_bug.cgi?id=58137
Priority: medium
Bug ID: 58137
Assignee: mesa-dev@lists.freedesktop.org
Summary: [r300g, r600g] corruption on 0 A.D. game with postproc
effects enabled
Severity: normal
C
On 2012-12-11 00:47, Ian Romanick wrote:
>> [ 760.187261] [drm:radeon_cs_ib_chunk] *ERROR* Invalid command
>> stream ! [ 760.192898] radeon :01:00.0:
>> evergreen_cs_track_validate_stencil:602 stencil read bo base
>> 4148500480 not aligned wi
We already have the Mesa version in the version string, isn't that enough
to detect Mesa?
---
src/mesa/state_tracker/st_cb_strings.c |5 +
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/src/mesa/state_tracker/st_cb_strings.c
b/src/mesa/state_tracker/st_cb_strings.c
index 213
This will break apps that expect the current string to tweak their behavior. Changing
it now will cause pain to app developers and ourselves, and I honestly don't
see the good of it.
For good or bad we have these strings. So I'd prefer we focused on making our
drivers rock solid so that app developers do
On 11 December 2012 13:57, Marek Olšák wrote:
> We already have the Mesa version in the version string, isn't that enough
> to detect Mesa?
In theory, although the vendor string would IMO be the expected place for that.
___
mesa-dev mailing list
mesa-dev
On 11.12.2012 10:52, Dave Airlie wrote:
> This adds seamless sampling for cubemap boundaries if requested.
>
> The corner case averaging is messy but seems like it should be spec
> compliant.
>
> The face direction stuff is also a bit messy, I've no idea if that could
> or should be simpler, or
Just a few minor things below.
On 12/11/2012 02:52 AM, Dave Airlie wrote:
This adds seamless sampling for cubemap boundaries if requested.
The corner case averaging is messy but seems like it should be spec
compliant.
The face direction stuff is also a bit messy, I've no idea if that could
or
From: Tom Stellard
Every call to _cl_program::build() was erasing the binaries and logs for
every device associated with the program. This is incorrect because
it is possible to build a program for only a subset of devices and so
any device not being built should not have this information erased
From: Tom Stellard
---
src/gallium/state_trackers/clover/api/program.cpp | 7 +++-
.../state_trackers/clover/core/compiler.hpp| 12 +-
src/gallium/state_trackers/clover/core/program.cpp | 12 --
src/gallium/state_trackers/clover/core/program.hpp | 3 +-
.../state_trackers/clov
On 12/10/2012 03:28 PM, Matt Turner wrote:
The ES 3 spec says that the minimum allowable value is 2^24-1, but the
GL 4.3 and ARB_ES3_compatibility specs require 2^32-1, so return 2^32-1.
Fixes es3conform's element_index_uint_constants test.
---
src/mesa/main/context.c |3 +++
src
The Align parameter is a power-of-two exponent, so 16 results in 64K
alignment. In addition, even 16-byte alignment doesn't
make any sense, so just remove it.
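The arithmetic behind the 64K figure can be checked with a short Python sketch. It assumes, as the message states, that the parameter is the base-2 logarithm of the byte alignment; the function name `alignment_bytes` is hypothetical and not part of the LLVM code.

```python
def alignment_bytes(align_log2: int) -> int:
    """Byte alignment implied by a log2-encoded alignment value (assumed encoding)."""
    return 1 << align_log2


# An Align value of 16 means 2**16 bytes, i.e. 64K alignment.
print(alignment_bytes(16))
# A 16-byte alignment would instead need an Align value of 4.
print(alignment_bytes(4))
```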
Signed-off-by: Christian König
---
lib/Target/AMDGPU/AMDILISelLowering.cpp |1 -
1 file changed, 1 deletion(-)
diff --git a/lib/Target/
Signed-off-by: Christian König
---
lib/Target/AMDGPU/AMDGPUMCInstLower.cpp| 10 --
lib/Target/AMDGPU/AMDGPUMCInstLower.h |5 -
.../AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp | 10 +-
lib/Target/AMDGPU/MCTargetDesc/SIMCCodeEmitter.cpp |6
They seem to work fine.
Signed-off-by: Christian König
---
lib/Target/AMDGPU/SIInstructions.td |8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/lib/Target/AMDGPU/SIInstructions.td
b/lib/Target/AMDGPU/SIInstructions.td
index ea8de91..008652f 100644
--- a/lib/Target/
Branch if we have enough instructions so that it makes sense.
Also remove branches if they don't make sense.
Signed-off-by: Christian König
---
lib/Target/AMDGPU/SILowerControlFlow.cpp | 49 ++
1 file changed, 49 insertions(+)
diff --git a/lib/Target/AMDGPU/SILower
Unlike SGPRs, VGPRs don't need to be aligned.
Signed-off-by: Christian König
---
lib/Target/AMDGPU/SIRegisterInfo.td | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/lib/Target/AMDGPU/SIRegisterInfo.td
b/lib/Target/AMDGPU/SIRegisterInfo.td
index e52311a..c3f13
Could you commit and push it to master?
On Wed, 5 December 2012, 09:31:48, you wrote:
> On Tue, Dec 4, 2012 at 12:50 PM, Tobias Droste wrote:
> > Anyone interested? ;-)
> >
> > I would just push it, but I don't have the rights to do so.
>
> Looks reasonable to me.
>
> Reviewed-by: Alex Deuc
On Tue, Dec 11, 2012 at 12:49 PM, Tobias Droste wrote:
> Could you commit and push it to master?
Done. Thanks!
Alex
>
> On Wed, 5 December 2012, 09:31:48, you wrote:
>> On Tue, Dec 4, 2012 at 12:50 PM, Tobias Droste wrote:
>> > Anyone interested? ;-)
>> >
>> > I would just push it, but I d
Tom Stellard writes:
> From: Tom Stellard
>
> Every call to _cl_program::build() was erasing the binaries and logs for
> every device associated with the program. This is incorrect because
> it is possible to build a program for only a subset of devices and so
> any device not being built shoul
Some shaders experience resets more than others, which skews the numbers
reported. Attempt to correct for this by linearly scaling according to
the number of resets that happen.
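One way to read the linear scaling, as a Python sketch rather than the driver's actual C++ code: if only some invocations of a shader survive to write a timing result, the measured total is scaled up by the ratio of all invocations to surviving ones. The function and parameter names below are hypothetical.

```python
def scale_for_resets(measured_cycles: int, written: int, resets: int) -> float:
    """Estimate total cycles when `resets` invocations lost their timing data.

    Assumes lost invocations cost, on average, the same as the `written` ones.
    """
    if written == 0:
        # Nothing was measured, so there is nothing to scale.
        return 0.0
    return measured_cycles * (written + resets) / written


# 8 invocations recorded 1000 cycles in total; 2 more were lost to resets.
print(scale_for_resets(1000, written=8, resets=2))
```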
Note that this will not be accurate if invocations of shaders have varying
times and longer invocations are more likely to r
Sometimes I've got a patch for a performance optimization that's not
showing a statistically significant performance difference on reported
FPS, but still seems like a good idea because it ought to reduce time
spent in the shader. If I can see the total number of cycles spent in
the shader stage b
I'm about to emit other kinds of writes besides time deltas, and it
turns out with the frequency of resets, we couldn't really use the old
time delta write() function more than once in a shader.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 53 +---
src/mesa/drivers/dr
Kenneth Graunke writes:
> On 12/07/2012 02:58 PM, Eric Anholt wrote:
>> This came from an idea by Ben Segovia. 16-wide pixel shaders are very
>> important for latency hiding on i965, so we want to try really hard to
>> get them. If scheduling an instruction makes some set of instructions
>> ava
On 12/10/2012 03:51 PM, Eric Anholt wrote:
Marek Olšák writes:
There are 2 ways. I prefer the former:
GALLIUM_MSAA=n
__GL_FSAA_MODE=n
Tested with ETQW, which doesn't support MSAA on Linux. This is
the only way to get MSAA there.
This sounds like something that would be nice to add as
On 12/10/2012 02:28 PM, Matt Turner wrote:
The ES 3 spec says that the minimum allowable value is 2^24-1, but the
GL 4.3 and ARB_ES3_compatibility specs require 2^32-1, so return 2^32-1.
Fixes es3conform's element_index_uint_constants test.
---
src/mesa/main/context.c |3 +++
src
On Tue, Dec 11, 2012 at 11:00 AM, Ian Romanick wrote:
> On 12/10/2012 02:28 PM, Matt Turner wrote:
>>
>> The ES 3 spec says that the minimum allowable value is 2^24-1, but the
>> GL 4.3 and ARB_ES3_compatibility specs require 2^32-1, so return 2^32-1.
>>
>> Fixes es3conform's element_index_uint_co
On 12/10/2012 04:06 PM, Jordan Justen wrote:
On Mon, Dec 10, 2012 at 2:28 PM, Matt Turner wrote:
@@ -966,6 +973,15 @@ find_value(const char *func, GLenum pname, void **p, union
value *v)
int api;
api = ctx->API;
+ /* We index into the table_set[] list of per-API hash tables using
On 12/10/2012 02:28 PM, Matt Turner wrote:
Fixes the transform_feedback2_init_defaults test from es3conform.
The ES 3 spec lists these as TRANSFORM_FEEDBACK_PAUSED and
TRANSFORM_FEEDBACK_ACTIVE.
---
src/mesa/main/get.c |8 +++-
src/mesa/main/get_hash_params.py | 10
On 12/10/2012 02:28 PM, Matt Turner wrote:
From GL/GLES/GL_CORE and GLES2 -> GL/GL_CORE/GLES2.
Yes, we really were exposing ES2_compatibility queries on ES 1.
---
src/mesa/main/get_hash_params.py | 16 ++--
1 files changed, 6 insertions(+), 10 deletions(-)
diff --git a/src/mes
On Tue, Dec 11, 2012 at 11:08 AM, Ian Romanick wrote:
> On 12/10/2012 02:28 PM, Matt Turner wrote:
>>
>> Fixes the transform_feedback2_init_defaults test from es3conform.
>>
>> The ES 3 spec lists these as TRANSFORM_FEEDBACK_PAUSED and
>> TRANSFORM_FEEDBACK_ACTIVE.
>> ---
>> src/mesa/main/get.c
On 12/10/2012 02:28 PM, Matt Turner wrote:
Other than the comments on 6, 11, and 12, the series is
Reviewed-by: Ian Romanick
---
src/mesa/main/get.c | 16
src/mesa/main/get_hash_generator.py |8 +++-
src/mesa/main/mtypes.h |1 +
On Tue, Dec 11, 2012 at 11:12 AM, Ian Romanick wrote:
> On 12/10/2012 02:28 PM, Matt Turner wrote:
>>
>> From GL/GLES/GL_CORE and GLES2 -> GL/GL_CORE/GLES2.
>>
>> Yes, we really were exposing ES2_compatibility queries on ES 1.
>> ---
>> src/mesa/main/get_hash_params.py | 16 ++--
>
On 12/11/2012 11:14 AM, Matt Turner wrote:
On Tue, Dec 11, 2012 at 11:08 AM, Ian Romanick wrote:
On 12/10/2012 02:28 PM, Matt Turner wrote:
Fixes the transform_feedback2_init_defaults test from es3conform.
The ES 3 spec lists these as TRANSFORM_FEEDBACK_PAUSED and
TRANSFORM_FEEDBACK_ACTIVE.
I'm not familiar enough with the existing code to feel comfortable
reviewing it, but I've run it through a full piglit test run (using
tests/all.tests w/ GL/GLX enabled) without noticing any issues.
Also, Reaction Quake 3 performance went up by ~25% as a result of this
series on my Radeon 6850
On Tue, Dec 11, 2012 at 8:24 PM, Aaron Watry wrote:
> I'm not familiar enough with the existing code to feel comfortable reviewing
> it, but I've run it through a full piglit test run (using tests/all.tests w/
> GL/GLX enabled) without noticing any issues.
>
> Also, Reaction Quake 3 performance we
On Mon, Dec 10, 2012 at 3:47 PM, Marek Olšák wrote:
> u_upload_mgr suballocates memory from a large buffer and maps the allocated
> range (unsynchronized), which is perfect for short-lived staging buffers.
>
> This reduces the number of relocations sent to the kernel.
Series looks good to me.
Rev
This is redundant since we're calling draw_bind_fragment_shader()
which already does a flush.
v2: the redundant flush in llvmpipe_set_constant_buffer() has
already been removed by commit 3427466e6dbbb8db7c1ecda6b3859ca1cc5827a3
---
src/gallium/drivers/llvmpipe/lp_state_fs.c |2 --
1 files cha
On 12/11/2012 11:04 AM, Ian Romanick wrote:
On 12/10/2012 04:06 PM, Jordan Justen wrote:
On Mon, Dec 10, 2012 at 2:28 PM, Matt Turner wrote:
@@ -966,6 +973,15 @@ find_value(const char *func, GLenum pname, void
**p, union value *v)
int api;
api = ctx->API;
+ /* We index into the ta
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 21 +
1 file changed, 21 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index f428a83..c520364 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i96
---
src/mesa/drivers/dri/i965/brw_fs.cpp |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index c520364..62800b1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_
The compute-to-mrf code is really twitchy, and it's hard to construct
GLSL testcases for it. This unit test is also really hard to work with
(for example, if your instruction is removed by dead code elimination,
you end up inspecting something irrelevant), but I did use it for
debugging some of th
The way our visitor works, scalar expression/swizzle results that get
stored in channels other than .x will have an intermediate MOV from
their result in the .x channel to the real .y (or whatever) channel, and
similarly for vec2/vec3 results.
By knowing how to adjust DP4-type instructions for opt
No statistically significant performance difference on glbenchmark 2.7
(n=60). It reduces cycles spent in the vertex shader by 3.3% +/- 0.8%
(n=5), but that's only about .3% of all cycles spent according to the
fixed shader_time.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 129 +
This patch series adds varying packing to Mesa, so that we can handle
varyings composed of things other than vec4's without using up extra
varying components.
For the initial implementation I've chosen a strategy that operates
exclusively at the GLSL IR level, so that it doesn't require the
cooper
This patch modifies the clip distance lowering pass so that the new
symbol it generates (gl_ClipDistanceMESA) is added to the shader's
symbol table.
This will allow a later patch to modify the linker so that it finds
transform feedback varyings using the symbol table rather than having
to iterate t
Previously, link_invalidate_variable_locations() was only called
during assign_attribute_or_color_locations() and
assign_varying_locations(). This meant that in the corner case when
there was only a vertex shader, and varyings were being captured by
transform feedback, link_invalidate_variable_loc
Previously, the linker used a value of -1 in ir_variable::location to
denote a generic input or output of the shader that had not yet been
matched up to a variable in another pipeline stage.
This patch introduces a new ir_variable field,
is_unmatched_generic_inout, for that purpose.
In future pat
Currently, the location of each varying is recorded in ir_variable as
a multiple of the size of a vec4. In order to pack varyings, we need
to be able to record, e.g. that a vec2 is stored in the second half of
a varying slot rather than the first half.
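To illustrate the addressing problem, here is a small Python sketch that treats a location as a vec4 slot number plus a starting component within that slot. The names `slot` and `first_component` are hypothetical and are not taken from the patch.

```python
COMPONENTS_PER_SLOT = 4  # one varying slot holds a vec4


def component_index(slot: int, first_component: int) -> int:
    """Flatten (slot, starting component) into a single component index."""
    return slot * COMPONENTS_PER_SLOT + first_component


def occupied_components(slot: int, first_component: int, width: int) -> list:
    """Component indices covered by a vector of `width` components."""
    start = component_index(slot, first_component)
    return list(range(start, start + width))


# A vec2 in the first half of slot 3, and another in the second half.
print(occupied_components(3, 0, 2))
print(occupied_components(3, 2, 2))
# Converting a flat index back to (slot, component).
print(divmod(14, COMPONENTS_PER_SLOT))
```

A slot number alone cannot tell the two vec2 placements apart, which is why a finer-grained record is needed.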
This patch introduces a field ir_variable::l
This patch subdivides the loop that assigns varying locations into two
phases: one phase to match up varyings between shader stages (and
assign them varying locations), and a second phase to record the
varying assignments for use by transform feedback.
This paves the way for varying packing, which
This patch further subdivides the loop that assigns varying locations
into two phases: one phase to match up the varyings between shader
stages, and one phase to assign them varying locations.
In between the two phases the matched varyings are stored in a new
data structure called varying_matches.
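The two-phase split can be sketched in Python as follows. This is only an illustration of the idea, not the linker's C++ code: matching by name, one location per varying, and the `Match` record are all assumptions.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """One matched varying (hypothetical stand-in for an entry in varying_matches)."""
    name: str
    producer_index: int
    consumer_index: int


def match_varyings(producer_outputs, consumer_inputs):
    """Phase 1: pair each producer output with the consumer input of the same name."""
    consumer_pos = {name: i for i, name in enumerate(consumer_inputs)}
    return [
        Match(name, i, consumer_pos[name])
        for i, name in enumerate(producer_outputs)
        if name in consumer_pos
    ]


def assign_locations(matches, first_location=0):
    """Phase 2: hand out consecutive locations to the stored matches."""
    return {m.name: first_location + n for n, m in enumerate(matches)}


matches = match_varyings(["color", "normal", "unused"], ["normal", "color"])
print(assign_locations(matches))
```

Keeping the matches in a list between the two phases leaves room for an extra step, such as reordering, before any location is assigned.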
This patch paves the way for varying packing by adding a sorting step
before varying assignment, which sorts the varyings into an order that
increases the likelihood of being able to find an efficient packing.
First, varyings are sorted into "packing classes" by considering
attributes that can't b
This lowering pass generates GLSL code that manually packs varyings
into vec4 slots, for the benefit of back-ends that don't support
packed varyings natively.
No functional change--the lowering pass is not yet used.
---
src/glsl/Makefile.sources | 1 +
src/glsl/ir_optimization.h
This patch implements varying packing within varyings that are
composed of multiple vectors of size less than 4 (e.g. arrays of
vec2's, or matrices with height less than 4).
Previously, such varyings used up a full 4-wide varying slot for each
constituent vector, meaning that some of the component
This patch implements varying packing between varyings.
Previously, each varying occupied components 0 through N-1 of its
assigned varying slot, so there was no way to pack two varyings into
the same slot. For example, if the varyings were a float, a vec2, a
vec3, and another vec2, they would be
https://bugs.freedesktop.org/show_bug.cgi?id=42516
Brian Paul changed:
What       | Removed | Added
Status     | NEW     | RESOLVED
Resolution | ---
Previously, if the client program didn't specify a stride when setting
up a vertex attribute, we used _mesa_sizeof_type() to compute the size
of the type, and multiplied it by the number of components.
This didn't work for the 2_10_10_10 formats, since _mesa_sizeof_type()
returns -1 for those type
> For the initial implementation I've chosen a strategy that operates
> exclusively at the GLSL IR level, so that it doesn't require the
> cooperation of the driver back-ends.
Wouldn't this negatively affect performance of some GPUs?
Not sure if relevant for Mesa, but e.g. on PowerVR SGX it's re