[Mesa-dev] [Bug 91020] Mesa's demo / tools won't compile since EGL changes
https://bugs.freedesktop.org/show_bug.cgi?id=91020 Bug ID: 91020 Summary: Mesa's demo / tools won't compile since EGL changes Product: Mesa Version: git Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: Demos Assignee: mesa-dev@lists.freedesktop.org Reporter: m...@fireburn.co.uk QA Contact: mesa-dev@lists.freedesktop.org Created attachment 116577 --> https://bugs.freedesktop.org/attachment.cgi?id=116577&action=edit Build logs Compile fails with EGL_SCREEN_BIT_MESA -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: add GL_PROGRAM_PIPELINE support in KHR_debug calls
On Wed, 2015-06-17 at 23:02 -0400, Ilia Mirkin wrote: > This was apparently missed when KHR_debug or ARB_sso support was added. SSO was still missing when I added KHR_debug, but this should have been picked up by the object label piglit test I wrote. However the test was broken, it failed to fail :( I've sent a patch to fix this: http://lists.freedesktop.org/archives/piglit/2015-June/016281.html Your patch looks good to me and passes the fix piglit test. Reviewed-by: Timothy Arceri > Add label support to pipeline objects just like all the other > debug-related objects. > > Signed-off-by: Ilia Mirkin > Cc: "10.5 10.6" > --- > src/mesa/main/mtypes.h | 2 ++ > src/mesa/main/objectlabel.c | 10 -- > src/mesa/main/pipelineobj.c | 21 +++-- > src/mesa/main/pipelineobj.h | 3 +++ > 4 files changed, 24 insertions(+), 12 deletions(-) > > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h > index ffa7f0c..983b9dc 100644 > --- a/src/mesa/main/mtypes.h > +++ b/src/mesa/main/mtypes.h > @@ -2815,6 +2815,8 @@ struct gl_pipeline_object > > mtx_t Mutex; > > + GLchar *Label; /**< GL_KHR_debug */ > + > /** > * Programs used for rendering > * > diff --git a/src/mesa/main/objectlabel.c b/src/mesa/main/objectlabel.c > index aecb5b1..5626054 100644 > --- a/src/mesa/main/objectlabel.c > +++ b/src/mesa/main/objectlabel.c > @@ -30,6 +30,7 @@ > #include "enums.h" > #include "fbobject.h" > #include "objectlabel.h" > +#include "pipelineobj.h" > #include "queryobj.h" > #include "samplerobj.h" > #include "shaderobj.h" > @@ -214,8 +215,13 @@ get_label_pointer(struct gl_context *ctx, GLenum > identifier, GLuint name, >} >break; > case GL_PROGRAM_PIPELINE: > - /* requires GL 4.2 */ > - goto invalid_enum; > + { > + struct gl_pipeline_object *pipe = > +_mesa_lookup_pipeline_object(ctx, name); > + if (pipe) > +labelPtr = &pipe->Label; > + } > + break; > default: >goto invalid_enum; > } > diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c > index b4795ff..279ae20 100644 > --- a/src/mesa/main/pipelineobj.c > +++ b/src/mesa/main/pipelineobj.c > @@ -65,6 +65,7 @@ _mesa_delete_pipeline_object(struct gl_context *ctx, > > _mesa_reference_shader_program(ctx, &obj->ActiveProgram, NULL); > mtx_destroy(&obj->Mutex); > + free(obj->Label); > ralloc_free(obj); > } > > @@ -136,8 +137,8 @@ _mesa_free_pipeline_data(struct gl_context *ctx) > * a non-existent ID. The spec defines ID 0 as being technically > * non-existent. > */ > -static inline struct gl_pipeline_object * > -lookup_pipeline_object(struct gl_context *ctx, GLuint id) > +struct gl_pipeline_object * > +_mesa_lookup_pipeline_object(struct gl_context *ctx, GLuint id) > { > if (id == 0) >return NULL; > @@ -225,7 +226,7 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield > stages, GLuint program) > { > GET_CURRENT_CONTEXT(ctx); > > - struct gl_pipeline_object *pipe = lookup_pipeline_object(ctx, pipeline); > + struct gl_pipeline_object *pipe = _mesa_lookup_pipeline_object(ctx, > pipeline); > struct gl_shader_program *shProg = NULL; > GLbitfield any_valid_stages; > > @@ -337,7 +338,7 @@ _mesa_ActiveShaderProgram(GLuint pipeline, GLuint program) > { > GET_CURRENT_CONTEXT(ctx); > struct gl_shader_program *shProg = NULL; > - struct gl_pipeline_object *pipe = lookup_pipeline_object(ctx, pipeline); > + struct gl_pipeline_object *pipe = _mesa_lookup_pipeline_object(ctx, > pipeline); > > if (program != 0) { >shProg = _mesa_lookup_shader_program_err(ctx, program, > @@ -399,7 +400,7 @@ _mesa_BindProgramPipeline(GLuint pipeline) > */ > if (pipeline) { >/* non-default pipeline object */ > - newObj = lookup_pipeline_object(ctx, pipeline); > + newObj = _mesa_lookup_pipeline_object(ctx, pipeline); >if (!newObj) { > _mesa_error(ctx, GL_INVALID_OPERATION, > "glBindProgramPipeline(non-gen name)"); > @@ -468,7 +469,7 @@ _mesa_DeleteProgramPipelines(GLsizei n, const GLuint > *pipelines) > > for (i = 0; i < n; i++) { >struct gl_pipeline_object *obj = > - lookup_pipeline_object(ctx, pipelines[i]); > + _mesa_lookup_pipeline_object(ctx, pipelines[i]); > >if (obj) { > assert(obj->Name == pipelines[i]); > @@ -568,7 +569,7 @@ _mesa_IsProgramPipeline(GLuint pipeline) > { > GET_CURRENT_CONTEXT(ctx); > > - struct gl_pipeline_object *obj = lookup_pipeline_object(ctx, pipeline); > + struct gl_pipeline_object *obj = _mesa_lookup_pipeline_object(ctx, > pipeline); > if (obj == NULL) >return GL_FALSE; > > @@ -582,7 +583,7 @@ void GLAPIENTRY > _mesa_GetProgramPipelineiv(GLuint pipeline, GLenum pname, GLint *params) > { > GET_CURRENT_CONTEXT(ctx); > - struct gl_pipeline_object *pipe = lookup_pipeli
[Mesa-dev] [PATCH] clover: Implement image attribute getters
Image attributes are passed to the kernel as hidden parameters after the image attribute itself. An llvm pass replaces the getter builtins to the appropriate parameters. --- src/gallium/state_trackers/clover/core/kernel.cpp | 26 +++ src/gallium/state_trackers/clover/core/kernel.hpp | 13 ++-- src/gallium/state_trackers/clover/core/memory.cpp | 2 +- .../state_trackers/clover/llvm/invocation.cpp | 81 +- 4 files changed, 116 insertions(+), 6 deletions(-) diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp index 0756f06..291c799 100644 --- a/src/gallium/state_trackers/clover/core/kernel.cpp +++ b/src/gallium/state_trackers/clover/core/kernel.cpp @@ -185,6 +185,13 @@ kernel::exec_context::bind(intrusive_ptr _q, } } + // Bind image attribute args. + for (const auto& arg: kern._args) { + if (auto img_arg = dynamic_cast(arg.get())) { + img_arg->bind_attributes(*this); + } + } + // Create a new compute state if anything changed. if (!st || q != _q || cs.req_local_mem != mem_local || @@ -465,6 +472,25 @@ kernel::constant_argument::unbind(exec_context &ctx) { } void +kernel::image_argument::bind_attributes(exec_context &ctx) { + cl_image_format format = img->format(); + cl_uint attributes[] = { + static_cast(img->width()), + static_cast(img->height()), + static_cast(img->depth()), + format.image_channel_data_type, + format.image_channel_order}; + for (unsigned i = 0; i < 5; ++i) { + auto v = bytes(attributes[i]); + + extend(v, module::argument::zero_ext, sizeof(cl_uint)); + byteswap(v, ctx.q->device().endianness()); + align(ctx.input, sizeof(cl_uint)); + insert(ctx.input, v); + } +} + +void kernel::image_rd_argument::set(size_t size, const void *value) { if (size != sizeof(cl_mem)) throw error(CL_INVALID_ARG_SIZE); diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp b/src/gallium/state_trackers/clover/core/kernel.hpp index d6432a4..8c15b2f 100644 --- a/src/gallium/state_trackers/clover/core/kernel.hpp +++ b/src/gallium/state_trackers/clover/core/kernel.hpp @@ -190,7 +190,14 @@ namespace clover { pipe_surface *st; }; - class image_rd_argument : public argument { + class image_argument : public argument { + public: + void bind_attributes(exec_context &ctx); + protected: + image *img; + }; + + class image_rd_argument : public image_argument { public: virtual void set(size_t size, const void *value); virtual void bind(exec_context &ctx, @@ -198,11 +205,10 @@ namespace clover { virtual void unbind(exec_context &ctx); private: - image *img; pipe_sampler_view *st; }; - class image_wr_argument : public argument { + class image_wr_argument : public image_argument { public: virtual void set(size_t size, const void *value); virtual void bind(exec_context &ctx, @@ -210,7 +216,6 @@ namespace clover { virtual void unbind(exec_context &ctx); private: - image *img; pipe_surface *st; }; diff --git a/src/gallium/state_trackers/clover/core/memory.cpp b/src/gallium/state_trackers/clover/core/memory.cpp index 055336a..b852e68 100644 --- a/src/gallium/state_trackers/clover/core/memory.cpp +++ b/src/gallium/state_trackers/clover/core/memory.cpp @@ -189,7 +189,7 @@ image2d::image2d(clover::context &ctx, cl_mem_flags flags, const cl_image_format *format, size_t width, size_t height, size_t row_pitch, void *host_ptr) : - image(ctx, flags, format, width, height, 0, + image(ctx, flags, format, width, height, 1, row_pitch, 0, height * row_pitch, host_ptr) { } diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp index 9b91fee..a33d450 100644 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp @@ -80,6 +80,7 @@ using namespace clover; namespace { + #if 0 void build_binary(const std::string &source, const std::string &target, @@ -340,17 +341,65 @@ namespace { PM.run(*mod); } + const llvm::MDNode * + get_kernel_metadata(const llvm::Function *kernel_func) { + auto mod = kernel_func->getParent(); + auto kernels_node = mod->getNamedMetadata("opencl.kernels"); + if (!kernels_node) { + return nullptr; + } + + const llvm::MDNode *kernel_node = nullptr; + for (unsigned i = 0; i < kernels_node->getNumOperands(); ++i) { +#if HAVE_LLVM >= 0x0306 + auto func = llvm::mdconst::dyn_extract( +#else + auto func = llvm::dyn_cast( +#endif + kernels_node->getOperand(i)->getOp
Re: [Mesa-dev] [PATCH] nir: add helper to get # of src/dest components
On Thu, Jun 18, 2015 at 1:23 AM, Connor Abbott wrote: > On Wed, Jun 17, 2015 at 12:02 PM, Rob Clark wrote: >> On Wed, Jun 17, 2015 at 2:27 PM, Connor Abbott wrote: >>> So, as is, this patch isn't quite correct. When I originally wrote >>> NIR, the idea was that the size of each instruction would be explicit >>> -- that is, each instruction has it's own size, and the size of >>> registers/SSA values was merely a hint to say "by the way, you >>> actually need this many components based on all the things that use >>> this value/register." In other words, you could smash num_components >>> to 4 for everything, and it would still work. Then, we could just >>> shrink num_components on demand as we got rid of uses of things. >>> That's why we have functions like nir_tex_instr_src_size(), >>> nir_tex_instr_dest_size(), and nir_ssa_alu_instr_src_components() that >>> seem like they're doing the same thing that these functions -- in most >>> cases, you actually want those functions instead of these. Maybe we >>> were figuring out the register/value # of components a few times, and >>> perhaps some of times it was broken, but it seems like adding these >>> functions would only add to the confusion. >>> >>> Now, in hindsight, it seems like that might not be the best idea. In >>> other words, I can see how it would make sense to turn >>> nir_tex_instr_src_size() etc. into assertions in the validator that >>> check that nir_(src|dest)_num_components() equals what you would >>> expect, and it's probably a good idea. But I don't want people to use >>> these functions without knowing what they're doing until we do that >>> cleanup. >> >> hmm, fortunately I hadn't pushed my loops branch yet, since still need >> to work out how to make variables/arrays work w/ >1 block (since in >> nir-land variables are not ssa).. >> >> (really I want phi's for variables too.. the way I turn arrays into >> fanin/collect fanout/split works on the backend for dealing with >> arrays in ssa form (other than making instruction graph large) but the >> way I go from nir to ir3 currently assumes I get nir phi's everywhere >> I need an ir3 phi) > > Right... we explicitly decided not to support SSA form for arrays in > NIR, since it seemed like a pretty bad idea. SSA form assumes that > inserting copies for the things you're SSA-ifying is relatively > inexpensive, and both SSA-based register allocation and algorithms for > converting out of SSA make no guarantees about not inserting > potentially unnecessary copies. This is a good compromise for smaller > things, but not for your array of (say) 64 things where inserting an > extra copy is rather disastrous. You can do it if you want and shoot > yourself in the foot, but NIR isn't going to help you there -- we > won't write a to-SSA pass for something which doesn't make much sense > in the first place. > in ir3 my solution is to add sufficient dependencies between instructions so the array accesses don't get re-ordered and they all collapse down to a single name per array element/slot >> >> Anyways, maybe I'll just move the helpers into ir3 for now until the >> is_packed stuff is purged.. > > is_packed? what does that have to do with it? I suspect that there > might be some places where you're using this function instead of the > others I mentioned, and you actually want to use the latter, although > I haven't seen the code so I can't be sure. But of course, there's > always the option of actually going and fixing it :) because I was in a hurry and didn't actually read your original reply (and assumed it was continuation of discussion about nuking is_packed from irc) So the place where I was using it is actually for phi instructions (and actually just for an assert at the moment), so those fxns you mention don't do me much good. Somehow, I think it is still a good idea to un-open-code the figuring out of # of src/dst components. Sure it makes sense to have validate code ensure that those are same as nir_tex_instr_src_size(), nir_tex_instr_dest_size(), and nir_ssa_alu_instr_src_components() give. But that doesn't preclude adding these two functions. BR, -R > >> >> BR, >> -R >> >> >>> >>> On Mon, Jun 8, 2015 at 12:45 PM, Rob Clark wrote: From: Rob Clark I need something like this in a couple places. And didn't see anything like it anywhere. Signed-off-by: Rob Clark --- v2: Added similar helper for nir_src, and cleaned up a few places that open coded this. There are a couple left (such as validate_alu_src()) but that handle is_packed differently so I thought it best to leave them as-is. src/glsl/nir/nir.h | 18 ++ src/glsl/nir/nir_from_ssa.c | 10 ++ src/glsl/nir/nir_validate.c | 4 +--- 3 files changed, 21 insertions(+), 11 deletions(-) diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index 697d37e..06bbb0c 100644 --- a/src/glsl/nir/nir.h +++ b
Re: [Mesa-dev] [Mesa-stable] [PATCH] egl/x11: Set version of swrastLoader to 2
On 16 June 2015 at 16:39, Ian Romanick wrote: > Gak... I thought we fixed all of those. :( > Ops seems like I've missed it. Good news is that this is the last one :-) > Reviewed-by: Ian Romanick > > On 06/15/2015 08:08 PM, Boyan Ding wrote: >> which it actually implements instead of the newest version defined in >> dri_interface.h >> >> Cc: "10.5 10.6" >> Signed-off-by: Boyan Ding Reviewed-by: Emil Velikov and committed to master. Thanks ! Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH] egl/x11: Remove duplicate call to dri2_x11_add_configs_for_visuals
Hi Boyan, On 13 June 2015 at 08:33, Boyan Ding wrote: > The call to dri2_x11_add_configs_for_visuals (previously > dri2_add_configs_for_visuals) was moved downwards in commit f8c5b8a1, > but appeared again in its original position after its rename in > d019cd81. Remove it. > I believe you're bang on the spot here. The latter commit mentions only about the renaming, so it seems that the hunk got back in as the patch was rebased. Adding Chad to the Cc list, just in case we've missed something :-) Fwiw the patch is Reviewed-by: Emil Velikov -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/6] mesa: don't rebind constant buffers after every state change if GS is active
Strange I was under the impression that there are apps that make use of GS, albeit not too many. On the perf side - I was thinking about the hardware (i.e. regardless if the driver does extra state-tracking or not) - would there be the optimisation mentioned, would there be a "stall" in the pipeline, due to the "new" values being flushed/fetched/etc. Now that I think about it, only a few of the HW guys may know the answer on this one, so don't bother with this. Thanks Emil On 16 June 2015 at 20:56, Marek Olšák wrote: > There are probably 0 apps using GS, so the answer is 0. > > The hardware doesn't ignore anything. It only does what it's told to do. > > The radeonsi driver doesn't check if the state change is redundant or not. > > Marek > > On Tue, Jun 16, 2015 at 10:13 PM, Emil Velikov > wrote: >> Hi Marek, >> >> Out of curiosity: >> Any rough idea of how much of a perf. improvement this might bring ? >> Would the hardware ignore the newly (re)bound const. bufs, when the >> values are unchanged ? >> >> Thanks >> Emil >> ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] configure.ac: rename LLVM_VERSION_PATCH to avoid conflict with llvm-config.h
On 16 June 2015 at 20:18, Tom Stellard wrote: > On Tue, Jun 16, 2015 at 08:07:57PM +0100, Emil Velikov wrote: >> On 13 June 2015 at 19:16, Marek Olšák wrote: >> > From: Marek Olšák >> > >> > --- >> > configure.ac | 2 +- >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> > >> > diff --git a/configure.ac b/configure.ac >> > index 34d1ac9..e6d947e 100644 >> > --- a/configure.ac >> > +++ b/configure.ac >> > @@ -1929,7 +1929,7 @@ if test "x$enable_gallium_llvm" = xyes; then >> > LLVM_COMPONENTS="${LLVM_COMPONENTS} all-targets ipo linker >> > instrumentation" >> > LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader option >> > objcarcopts profiledata" >> > fi >> > -DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT >> > -DLLVM_VERSION_PATCH=$LLVM_VERSION_PATCH" >> > +DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT >> > -DMESA_LLVM_VERSION_PATCH=$LLVM_VERSION_PATCH" >> >> Silly questions: >> If LLVM already sets LLVM_VERSION_PATCH shouldn't we be using it, >> rather than setting our own ? Perhaps we can drop the define >> altogether, considering that we're not using it ? >> > > Depending on the version of llvm and the build system that was used > LLVM_VERSION_PATH may or not be defined, Is there a version where this is resolved, for all build systems ? > so we can't rely on it. > Currently none of code does, thus my question about dropping it altogether. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/14] meta: Fix transfer operations check in meta pbo path for readpixels
On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote: > Without this patch, arb_color_buffer_float-readpixels test fails, when > forced to use meta pbo path. > > Signed-off-by: Anuj Phogat > Cc: > --- > src/mesa/drivers/common/meta_tex_subimage.c | 10 ++ > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/drivers/common/meta_tex_subimage.c > b/src/mesa/drivers/common/meta_tex_subimage.c > index d2474f5..00364f8 100644 > --- a/src/mesa/drivers/common/meta_tex_subimage.c > +++ b/src/mesa/drivers/common/meta_tex_subimage.c > @@ -273,12 +273,14 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, > GLuint dims, > format == GL_COLOR_INDEX) >return false; > > - if (ctx->_ImageTransferState) > - return false; > - > - That test uses glReadPixels so it should call this with tex_image set to NULL and it should flow through the if you have below. The call to _mesa_get_readpixels_transfer_ops that you add below looks like it does what we want for a pixel read from a framebuffer instead of simply checking ctx->_ImageTransferState directly. I suppose this is what fixes the test, right? The patch also removes the ctx->_ImageTransferState check for the case where we are reading from a real texture (tex_image != NULL), that seems unrelated to fixing arb_color_buffer_float-readpixels... Looking at the texture read code from getteximage.c it seems like this should be fine since that file does not seem to use that field for anything either, so I guess the check might not valid in this case. I think it would be nice if you updated the changelog to explain these things. Iago > + /* Don't use meta path for readpixels in below conditions. */ > if (!tex_image) { >rb = ctx->ReadBuffer->_ColorReadBuffer; > + > + if (_mesa_get_readpixels_transfer_ops(ctx, rb->Format, format, > +type, GL_FALSE)) > + return false; > + >if (_mesa_need_rgb_to_luminance_conversion(rb->Format, format)) > return false; > } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] configure.ac: rename LLVM_VERSION_PATCH to avoid conflict with llvm-config.h
We'll need to check the patch version for tessellation and amdgpu. Marek On Thu, Jun 18, 2015 at 2:22 PM, Emil Velikov wrote: > On 16 June 2015 at 20:18, Tom Stellard wrote: >> On Tue, Jun 16, 2015 at 08:07:57PM +0100, Emil Velikov wrote: >>> On 13 June 2015 at 19:16, Marek Olšák wrote: >>> > From: Marek Olšák >>> > >>> > --- >>> > configure.ac | 2 +- >>> > 1 file changed, 1 insertion(+), 1 deletion(-) >>> > >>> > diff --git a/configure.ac b/configure.ac >>> > index 34d1ac9..e6d947e 100644 >>> > --- a/configure.ac >>> > +++ b/configure.ac >>> > @@ -1929,7 +1929,7 @@ if test "x$enable_gallium_llvm" = xyes; then >>> > LLVM_COMPONENTS="${LLVM_COMPONENTS} all-targets ipo linker >>> > instrumentation" >>> > LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader option >>> > objcarcopts profiledata" >>> > fi >>> > -DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT >>> > -DLLVM_VERSION_PATCH=$LLVM_VERSION_PATCH" >>> > +DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT >>> > -DMESA_LLVM_VERSION_PATCH=$LLVM_VERSION_PATCH" >>> >>> Silly questions: >>> If LLVM already sets LLVM_VERSION_PATCH shouldn't we be using it, >>> rather than setting our own ? Perhaps we can drop the define >>> altogether, considering that we're not using it ? >>> >> >> Depending on the version of llvm and the build system that was used >> LLVM_VERSION_PATH may or not be defined, > Is there a version where this is resolved, for all build systems ? > >> so we can't rely on it. >> > Currently none of code does, thus my question about dropping it altogether. > > -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/14] mesa: Fix conditions to test signed, unsigned integer format
On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote: > Signed-off-by: Anuj Phogat > Cc: > --- > src/mesa/main/readpix.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c > index caa2648..a9416ef 100644 > --- a/src/mesa/main/readpix.c > +++ b/src/mesa/main/readpix.c > @@ -160,10 +160,12 @@ _mesa_readpixels_needs_slow_path(const struct > gl_context *ctx, GLenum format, >srcType = _mesa_get_format_datatype(rb->Format); > >if ((srcType == GL_INT && > + _mesa_is_enum_format_integer(format) && > (type == GL_UNSIGNED_INT || > type == GL_UNSIGNED_SHORT || > type == GL_UNSIGNED_BYTE)) || >(srcType == GL_UNSIGNED_INT && > + _mesa_is_enum_format_integer(format) && > (type == GL_INT || > type == GL_SHORT || > type == GL_BYTE))) { As far as I understand this code we are trying to see if we can use memcpy to directly copy the contents of the framebuffer to the destination buffer. In that case, as long as the src/dst types have different sign we can't just use memcpy, right? In fact it looks like we might need to expand the checks to include the cases where srcType is GL_(UNSIGNED_)SHORT and GL_(UNSIGNED_)BYTE as well. That said, I think this code is not necessary with the call to _mesa_format_matches_format_and_type that we do immediately after this, since that will check that the framebuffer format exactly matches the destination format anyway, which is a much tighter check than this. In fact, a quick piglit run without these checks does not seem to break any tests on i965. Gallium uses these two functions in a slightly different way in st_cb_readpixels.c though, so I wonder if their use case requires these checks to exist in this function anyway. Iago ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] EGL: Add pbuffer support for drm platform
On Thu 11 Jun 2015, Ying Liu wrote: > Add pbuffer support for drm platform, because some customers are still using > this feature. > > Signed-off-by: Ying Liu > --- > src/egl/drivers/dri2/egl_dri2.c | 2 +- > src/egl/drivers/dri2/platform_drm.c | 18 ++ > 2 files changed, 15 insertions(+), 5 deletions(-) Ying, I wrote a simple test for this patch that calls glClear on the pbuffer. The test crashes deep in the Intel driver. The test code is attached. Below is the backtrace and crashing line of code. I applied your patch to 6b0378e483ba53359545ac8b774dbdd81c2fab3f. file: src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 450+> if (irb->mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_NO_MCS) 451| clear_type = REP_CLEAR; 452| #0 0x74f344db in brw_meta_fast_clear (brw=0x77f97040, fb=0xdbbe70, buffers=2, partial_clear=false) at brw_meta_fast_clear.c:450 #1 0x74ebf2f2 in brw_clear (ctx=0x77f97040, mask=2) at brw_clear.c:247 #2 0x74ad0baa in _mesa_Clear (mask=16384) at main/clear.c:224 #3 0x00400e4c in main () at examples/play-gbm-pbuffer.c:69 The miptree pointer 'mt' is NULL. That suggests that Mesa never allocated buffer storage for the pbuffer. If you want to continue with this patch, this is what you need to: - Fix the crash. I suspect that you need to create a gbm_bo for the pbuffer in dri2_drm_create_pbuffer_surface() and bind it to the egl_dri2_surface or DRI2Drawable, similar to how platform_x11.c allocates a pixmap for pbuffer surfaces. - Submit a Piglit test for gbm pbuffers to pig...@lists.freedesktop.org. The test should go somewhere under the 'tests/egl' directory. You may want to start with my test code, but add color probing to it. After the glClear, call glReadPixels and then verify the pixels have the correct color. That would a be a very simple test, and I think it's sufficient for this patch. BUT... I don't understand how this customer could possibly use GBM pbuffers, except very badly. Whatever the customer is doing with pbuffers, there is better way with better performance and less bugs. Does your customer actually render to the pbuffer? Or does your customer create the pbuffer only for the sake of providing a surface to eglMakeCurrent, and then never actually use it? If the customer is using the pbuffer as a dummy surface for eglMakeCurrent, then there is a better alternative. In nearly all EGL implementations (the closed source ones too), it is legal to call eglMakeCurrent() *without* a surface. That is, you can do this: eglMakeCurrent(dpy, EGL_NO_SURFACE, EGL_NO_SURFACE, context); // Chad's little gbm pbuffer test. // file: try-gbm-pbuffer.c // #define _POSIX_C_SOURCE 200809L #include #include #include #include #include #include #include #include #include #include int main(void) { bool ok; int fd = open("/dev/dri/renderD128", O_RDWR | O_CLOEXEC); assert(fd >= 0); struct gbm_device *gbm_dev = gbm_create_device(fd); assert(gbm_dev); EGLDisplay dpy = eglGetDisplay(gbm_dev); assert(dpy); EGLint major, minor; ok = eglInitialize(dpy, &major, &minor); assert(ok); EGLint config_attrs[] = { EGL_SURFACE_TYPE, EGL_PBUFFER_BIT, EGL_RENDERABLE_TYPE,EGL_OPENGL_ES2_BIT, EGL_BUFFER_SIZE,32, EGL_RED_SIZE, 8, EGL_GREEN_SIZE, 8, EGL_BLUE_SIZE, 8, EGL_ALPHA_SIZE, 8, EGL_NONE, }; EGLConfig config = 0; EGLint num_configs = 0; eglChooseConfig(dpy, config_attrs, &config, 1, &num_configs); assert(num_configs >= 1); assert(config); EGLSurface surf = eglCreatePbufferSurface(dpy, config, NULL); assert(surf); EGLint ctx_attrs[] = { EGL_CONTEXT_MAJOR_VERSION_KHR, 3, EGL_CONTEXT_MINOR_VERSION_KHR, 0, EGL_NONE, }; EGLContext ctx = eglCreateContext(dpy, config, NULL, ctx_attrs); assert(ctx); ok = eglMakeCurrent(dpy, surf, surf, ctx); assert(ok); glClearColor(1, 0, 0, 1); glClear(GL_COLOR_BUFFER_BIT); eglSwapBuffers(dpy, surf); glFinish(); return 0; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] llvmpipe: Truncate the binned constants to max const buffer size.
Tested with Ilia Mirkin's gzdoom.trace and "arb_uniform_buffer_object-maxuniformblocksize fsexceed" piglit test without my earlier fix to fail linkage when UBO exceeds GL_MAX_UNIFORM_BLOCK_SIZE. --- src/gallium/auxiliary/gallivm/lp_bld_limits.h | 6 +- src/gallium/drivers/llvmpipe/lp_setup.c | 5 - 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_limits.h b/src/gallium/auxiliary/gallivm/lp_bld_limits.h index 49064fe..db50351 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_limits.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_limits.h @@ -51,8 +51,12 @@ #define LP_MAX_TGSI_PREDS 16 +#define LP_MAX_TGSI_CONSTS 4096 + #define LP_MAX_TGSI_CONST_BUFFERS 16 +#define LP_MAX_TGSI_CONST_BUFFER_SIZE (LP_MAX_TGSI_CONSTS * sizeof(float[4])) + /* * For quick access we cache registers in statically * allocated arrays. Here we define the maximum size @@ -100,7 +104,7 @@ gallivm_get_shader_param(enum pipe_shader_cap param) case PIPE_SHADER_CAP_MAX_OUTPUTS: return 32; case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE: - return sizeof(float[4]) * 4096; + return LP_MAX_TGSI_CONST_BUFFER_SIZE; case PIPE_SHADER_CAP_MAX_CONST_BUFFERS: return PIPE_MAX_CONSTANT_BUFFERS; case PIPE_SHADER_CAP_MAX_TEMPS: diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c b/src/gallium/drivers/llvmpipe/lp_setup.c index 56292c6..4c8167a 100644 --- a/src/gallium/drivers/llvmpipe/lp_setup.c +++ b/src/gallium/drivers/llvmpipe/lp_setup.c @@ -1069,10 +1069,13 @@ try_update_scene_state( struct lp_setup_context *setup ) if (setup->dirty & LP_SETUP_NEW_CONSTANTS) { for (i = 0; i < Elements(setup->constants); ++i) { struct pipe_resource *buffer = setup->constants[i].current.buffer; - const unsigned current_size = setup->constants[i].current.buffer_size; + const unsigned current_size = MIN2(setup->constants[i].current.buffer_size, +LP_MAX_TGSI_CONST_BUFFER_SIZE); const ubyte *current_data = NULL; int num_constants; + STATIC_ASSERT(DATA_BLOCK_SIZE >= LP_MAX_TGSI_CONST_BUFFER_SIZE); + if (buffer) { /* resource buffer */ current_data = (ubyte *) llvmpipe_resource_data(buffer); -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: add helper to get # of src/dest components
On Thu, Jun 18, 2015 at 4:48 AM, Rob Clark wrote: > On Thu, Jun 18, 2015 at 1:23 AM, Connor Abbott wrote: >> On Wed, Jun 17, 2015 at 12:02 PM, Rob Clark wrote: >>> On Wed, Jun 17, 2015 at 2:27 PM, Connor Abbott wrote: So, as is, this patch isn't quite correct. When I originally wrote NIR, the idea was that the size of each instruction would be explicit -- that is, each instruction has it's own size, and the size of registers/SSA values was merely a hint to say "by the way, you actually need this many components based on all the things that use this value/register." In other words, you could smash num_components to 4 for everything, and it would still work. Then, we could just shrink num_components on demand as we got rid of uses of things. That's why we have functions like nir_tex_instr_src_size(), nir_tex_instr_dest_size(), and nir_ssa_alu_instr_src_components() that seem like they're doing the same thing that these functions -- in most cases, you actually want those functions instead of these. Maybe we were figuring out the register/value # of components a few times, and perhaps some of times it was broken, but it seems like adding these functions would only add to the confusion. Now, in hindsight, it seems like that might not be the best idea. In other words, I can see how it would make sense to turn nir_tex_instr_src_size() etc. into assertions in the validator that check that nir_(src|dest)_num_components() equals what you would expect, and it's probably a good idea. But I don't want people to use these functions without knowing what they're doing until we do that cleanup. >>> >>> hmm, fortunately I hadn't pushed my loops branch yet, since still need >>> to work out how to make variables/arrays work w/ >1 block (since in >>> nir-land variables are not ssa).. >>> >>> (really I want phi's for variables too.. the way I turn arrays into >>> fanin/collect fanout/split works on the backend for dealing with >>> arrays in ssa form (other than making instruction graph large) but the >>> way I go from nir to ir3 currently assumes I get nir phi's everywhere >>> I need an ir3 phi) >> >> Right... we explicitly decided not to support SSA form for arrays in >> NIR, since it seemed like a pretty bad idea. SSA form assumes that >> inserting copies for the things you're SSA-ifying is relatively >> inexpensive, and both SSA-based register allocation and algorithms for >> converting out of SSA make no guarantees about not inserting >> potentially unnecessary copies. This is a good compromise for smaller >> things, but not for your array of (say) 64 things where inserting an >> extra copy is rather disastrous. You can do it if you want and shoot >> yourself in the foot, but NIR isn't going to help you there -- we >> won't write a to-SSA pass for something which doesn't make much sense >> in the first place. >> > > in ir3 my solution is to add sufficient dependencies between > instructions so the array accesses don't get re-ordered and they all > collapse down to a single name per array element/slot It's not about getting reordered, it's about interference. The problem is that as soon as you do basically any optimization at all (even copy propagation), you can wind up with a situation where the sources and destination of a phi node interfere with each other and you have to insert extra mov's. And even if you keep everything exactly the same, with an SSA-based register allocator, there's always the chance that it'll screw up anyways and allocate something over your array and force you to insert a mov. You could force the array to be allocated to a single hardware register, but then it's not really an SSA value anymore -- it's a hardware register, and you can't treat it like an SSA value anymore in your allocator, and so adding phi nodes and whatnot for it in your IR doesn't make much sense. > >>> >>> Anyways, maybe I'll just move the helpers into ir3 for now until the >>> is_packed stuff is purged.. >> >> is_packed? what does that have to do with it? I suspect that there >> might be some places where you're using this function instead of the >> others I mentioned, and you actually want to use the latter, although >> I haven't seen the code so I can't be sure. But of course, there's >> always the option of actually going and fixing it :) > > because I was in a hurry and didn't actually read your original reply > (and assumed it was continuation of discussion about nuking is_packed > from irc) > > So the place where I was using it is actually for phi instructions > (and actually just for an assert at the moment), so those fxns you > mention don't do me much good. Ok... for phi nodes, we already assert that the number of source components equals the number of dest components, so you can just use dest.ssa.num_components. You won't ever get a non-SSA thing in a phi node anyways (that only happens internally
Re: [Mesa-dev] [PATCH] llvmpipe: Truncate the binned constants to max const buffer size.
Am 18.06.2015 um 16:52 schrieb Jose Fonseca: > Tested with Ilia Mirkin's gzdoom.trace and > "arb_uniform_buffer_object-maxuniformblocksize fsexceed" piglit test > without my earlier fix to fail linkage when UBO exceeds > GL_MAX_UNIFORM_BLOCK_SIZE. > --- > src/gallium/auxiliary/gallivm/lp_bld_limits.h | 6 +- > src/gallium/drivers/llvmpipe/lp_setup.c | 5 - > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_limits.h > b/src/gallium/auxiliary/gallivm/lp_bld_limits.h > index 49064fe..db50351 100644 > --- a/src/gallium/auxiliary/gallivm/lp_bld_limits.h > +++ b/src/gallium/auxiliary/gallivm/lp_bld_limits.h > @@ -51,8 +51,12 @@ > > #define LP_MAX_TGSI_PREDS 16 > > +#define LP_MAX_TGSI_CONSTS 4096 > + > #define LP_MAX_TGSI_CONST_BUFFERS 16 > > +#define LP_MAX_TGSI_CONST_BUFFER_SIZE (LP_MAX_TGSI_CONSTS * sizeof(float[4])) > + > /* > * For quick access we cache registers in statically > * allocated arrays. Here we define the maximum size > @@ -100,7 +104,7 @@ gallivm_get_shader_param(enum pipe_shader_cap param) > case PIPE_SHADER_CAP_MAX_OUTPUTS: >return 32; > case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE: > - return sizeof(float[4]) * 4096; > + return LP_MAX_TGSI_CONST_BUFFER_SIZE; > case PIPE_SHADER_CAP_MAX_CONST_BUFFERS: >return PIPE_MAX_CONSTANT_BUFFERS; > case PIPE_SHADER_CAP_MAX_TEMPS: > diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c > b/src/gallium/drivers/llvmpipe/lp_setup.c > index 56292c6..4c8167a 100644 > --- a/src/gallium/drivers/llvmpipe/lp_setup.c > +++ b/src/gallium/drivers/llvmpipe/lp_setup.c > @@ -1069,10 +1069,13 @@ try_update_scene_state( struct lp_setup_context > *setup ) > if (setup->dirty & LP_SETUP_NEW_CONSTANTS) { >for (i = 0; i < Elements(setup->constants); ++i) { > struct pipe_resource *buffer = setup->constants[i].current.buffer; > - const unsigned current_size = > setup->constants[i].current.buffer_size; > + const unsigned current_size = > MIN2(setup->constants[i].current.buffer_size, > +LP_MAX_TGSI_CONST_BUFFER_SIZE); > const ubyte *current_data = NULL; > int num_constants; > > + STATIC_ASSERT(DATA_BLOCK_SIZE >= LP_MAX_TGSI_CONST_BUFFER_SIZE); > + > if (buffer) { > /* resource buffer */ > current_data = (ubyte *) llvmpipe_resource_data(buffer); > Reviewed-by: Roland Scheidegger ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] EGL: Add pbuffer support for drm platform
Hi, Chad, Thank you so much for reviewing my patch. I am in vacation and sabbatical right now. I will talk with our customer about your suggestion. I will fix the patch when I come back. Ying -Original Message- From: Versace, Chad Sent: Thursday, June 18, 2015 7:34 AM To: Liu, Ying2 Cc: mesa-dev@lists.freedesktop.org; Marek Olšák Subject: Re: [Mesa-dev] [PATCH] EGL: Add pbuffer support for drm platform On Thu 11 Jun 2015, Ying Liu wrote: > Add pbuffer support for drm platform, because some customers are still using > this feature. > > Signed-off-by: Ying Liu > --- > src/egl/drivers/dri2/egl_dri2.c | 2 +- > src/egl/drivers/dri2/platform_drm.c | 18 ++ > 2 files changed, 15 insertions(+), 5 deletions(-) Ying, I wrote a simple test for this patch that calls glClear on the pbuffer. The test crashes deep in the Intel driver. The test code is attached. Below is the backtrace and crashing line of code. I applied your patch to 6b0378e483ba53359545ac8b774dbdd81c2fab3f. file: src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 450+> if (irb->mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_NO_MCS) 451| clear_type = REP_CLEAR; 452| #0 0x74f344db in brw_meta_fast_clear (brw=0x77f97040, fb=0xdbbe70, buffers=2, partial_clear=false) at brw_meta_fast_clear.c:450 #1 0x74ebf2f2 in brw_clear (ctx=0x77f97040, mask=2) at brw_clear.c:247 #2 0x74ad0baa in _mesa_Clear (mask=16384) at main/clear.c:224 #3 0x00400e4c in main () at examples/play-gbm-pbuffer.c:69 The miptree pointer 'mt' is NULL. That suggests that Mesa never allocated buffer storage for the pbuffer. If you want to continue with this patch, this is what you need to: - Fix the crash. I suspect that you need to create a gbm_bo for the pbuffer in dri2_drm_create_pbuffer_surface() and bind it to the egl_dri2_surface or DRI2Drawable, similar to how platform_x11.c allocates a pixmap for pbuffer surfaces. - Submit a Piglit test for gbm pbuffers to pig...@lists.freedesktop.org. The test should go somewhere under the 'tests/egl' directory. You may want to start with my test code, but add color probing to it. After the glClear, call glReadPixels and then verify the pixels have the correct color. That would a be a very simple test, and I think it's sufficient for this patch. BUT... I don't understand how this customer could possibly use GBM pbuffers, except very badly. Whatever the customer is doing with pbuffers, there is better way with better performance and less bugs. Does your customer actually render to the pbuffer? Or does your customer create the pbuffer only for the sake of providing a surface to eglMakeCurrent, and then never actually use it? If the customer is using the pbuffer as a dummy surface for eglMakeCurrent, then there is a better alternative. In nearly all EGL implementations (the closed source ones too), it is legal to call eglMakeCurrent() *without* a surface. That is, you can do this: eglMakeCurrent(dpy, EGL_NO_SURFACE, EGL_NO_SURFACE, context); // Chad's little gbm pbuffer test. // file: try-gbm-pbuffer.c // #define _POSIX_C_SOURCE 200809L #include #include #include #include #include #include #include #include #include #include int main(void) { bool ok; int fd = open("/dev/dri/renderD128", O_RDWR | O_CLOEXEC); assert(fd >= 0); struct gbm_device *gbm_dev = gbm_create_device(fd); assert(gbm_dev); EGLDisplay dpy = eglGetDisplay(gbm_dev); assert(dpy); EGLint major, minor; ok = eglInitialize(dpy, &major, &minor); assert(ok); EGLint config_attrs[] = { EGL_SURFACE_TYPE, EGL_PBUFFER_BIT, EGL_RENDERABLE_TYPE,EGL_OPENGL_ES2_BIT, EGL_BUFFER_SIZE,32, EGL_RED_SIZE, 8, EGL_GREEN_SIZE, 8, EGL_BLUE_SIZE, 8, EGL_ALPHA_SIZE, 8, EGL_NONE, }; EGLConfig config = 0; EGLint num_configs = 0; eglChooseConfig(dpy, config_attrs, &config, 1, &num_configs); assert(num_configs >= 1); assert(config); EGLSurface surf = eglCreatePbufferSurface(dpy, config, NULL); assert(surf); EGLint ctx_attrs[] = { EGL_CONTEXT_MAJOR_VERSION_KHR, 3, EGL_CONTEXT_MINOR_VERSION_KHR, 0, EGL_NONE, }; EGLContext ctx = eglCreateContext(dpy, config, NULL, ctx_attrs); assert(ctx); ok = eglMakeCurrent(dpy, surf, surf, ctx); assert(ok); glClearColor(1, 0, 0, 1); glClear(GL_COLOR_BUFFER_BIT); eglSwapBuffers(dpy, surf); glFinish(); return 0; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/14] mesa: Fix conditions to test signed, unsigned integer format
On Thu, Jun 18, 2015 at 7:09 AM, Iago Toral wrote: > On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote: >> Signed-off-by: Anuj Phogat >> Cc: >> --- >> src/mesa/main/readpix.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c >> index caa2648..a9416ef 100644 >> --- a/src/mesa/main/readpix.c >> +++ b/src/mesa/main/readpix.c >> @@ -160,10 +160,12 @@ _mesa_readpixels_needs_slow_path(const struct >> gl_context *ctx, GLenum format, >>srcType = _mesa_get_format_datatype(rb->Format); >> >>if ((srcType == GL_INT && >> + _mesa_is_enum_format_integer(format) && >> (type == GL_UNSIGNED_INT || >> type == GL_UNSIGNED_SHORT || >> type == GL_UNSIGNED_BYTE)) || >>(srcType == GL_UNSIGNED_INT && >> + _mesa_is_enum_format_integer(format) && >> (type == GL_INT || >> type == GL_SHORT || >> type == GL_BYTE))) { > > As far as I understand this code we are trying to see if we can use > memcpy to directly copy the contents of the framebuffer to the > destination buffer. In that case, as long as the src/dst types have > different sign we can't just use memcpy, right? In fact it looks like we > might need to expand the checks to include the cases where srcType is > GL_(UNSIGNED_)SHORT and GL_(UNSIGNED_)BYTE as well. > srcType returned by _mesa_get_format_datatype() is one of: GL_UNSIGNED_NORMALIZED GL_SIGNED_NORMALIZED GL_UNSIGNED_INT GL_INT GL_FLOAT So, the suggested checks for srcType are not required. > That said, I think this code is not necessary with the call to > _mesa_format_matches_format_and_type that we do immediately after this, > since that will check that the framebuffer format exactly matches the > destination format anyway, which is a much tighter check than this. In > fact, a quick piglit run without these checks does not seem to break any > tests on i965. Gallium uses these two functions in a slightly different > way in st_cb_readpixels.c though, so I wonder if their use case requires > these checks to exist in this function anyway. > > Iago > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: add helper to get # of src/dest components
On Thu, Jun 18, 2015 at 11:01 AM, Connor Abbott wrote: (really I want phi's for variables too.. the way I turn arrays into fanin/collect fanout/split works on the backend for dealing with arrays in ssa form (other than making instruction graph large) but the way I go from nir to ir3 currently assumes I get nir phi's everywhere I need an ir3 phi) >>> >>> Right... we explicitly decided not to support SSA form for arrays in >>> NIR, since it seemed like a pretty bad idea. SSA form assumes that >>> inserting copies for the things you're SSA-ifying is relatively >>> inexpensive, and both SSA-based register allocation and algorithms for >>> converting out of SSA make no guarantees about not inserting >>> potentially unnecessary copies. This is a good compromise for smaller >>> things, but not for your array of (say) 64 things where inserting an >>> extra copy is rather disastrous. You can do it if you want and shoot >>> yourself in the foot, but NIR isn't going to help you there -- we >>> won't write a to-SSA pass for something which doesn't make much sense >>> in the first place. >>> >> >> in ir3 my solution is to add sufficient dependencies between >> instructions so the array accesses don't get re-ordered and they all >> collapse down to a single name per array element/slot > > It's not about getting reordered, it's about interference. The problem > is that as soon as you do basically any optimization at all (even copy > propagation), you can wind up with a situation where the sources and > destination of a phi node interfere with each other and you have to > insert extra mov's. And even if you keep everything exactly the same, > with an SSA-based register allocator, there's always the chance that > it'll screw up anyways and allocate something over your array and > force you to insert a mov. You could force the array to be allocated > to a single hardware register, but then it's not really an SSA value > anymore -- it's a hardware register, and you can't treat it like an > SSA value anymore in your allocator, and so adding phi nodes and > whatnot for it in your IR doesn't make much sense. But the point I'm trying to make is that I need the links from src to dest that I get in SSA form for *scheduling* (for example, to know how many delay slots are needed between two instructions). For things like if/else, I would need to consider number of cycles since either possible assigner. For everything else, the phi nodes (in ir3, not in nir) give me this. Arrays are not special (since they are allocated in registers) when it comes to cycle counts. BR, -R ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: add helper to get # of src/dest components
On Thu, Jun 18, 2015 at 12:42 PM, Rob Clark wrote: > On Thu, Jun 18, 2015 at 11:01 AM, Connor Abbott wrote: > (really I want phi's for variables too.. the way I turn arrays into > fanin/collect fanout/split works on the backend for dealing with > arrays in ssa form (other than making instruction graph large) but the > way I go from nir to ir3 currently assumes I get nir phi's everywhere > I need an ir3 phi) Right... we explicitly decided not to support SSA form for arrays in NIR, since it seemed like a pretty bad idea. SSA form assumes that inserting copies for the things you're SSA-ifying is relatively inexpensive, and both SSA-based register allocation and algorithms for converting out of SSA make no guarantees about not inserting potentially unnecessary copies. This is a good compromise for smaller things, but not for your array of (say) 64 things where inserting an extra copy is rather disastrous. You can do it if you want and shoot yourself in the foot, but NIR isn't going to help you there -- we won't write a to-SSA pass for something which doesn't make much sense in the first place. >>> >>> in ir3 my solution is to add sufficient dependencies between >>> instructions so the array accesses don't get re-ordered and they all >>> collapse down to a single name per array element/slot >> >> It's not about getting reordered, it's about interference. The problem >> is that as soon as you do basically any optimization at all (even copy >> propagation), you can wind up with a situation where the sources and >> destination of a phi node interfere with each other and you have to >> insert extra mov's. And even if you keep everything exactly the same, >> with an SSA-based register allocator, there's always the chance that >> it'll screw up anyways and allocate something over your array and >> force you to insert a mov. You could force the array to be allocated >> to a single hardware register, but then it's not really an SSA value >> anymore -- it's a hardware register, and you can't treat it like an >> SSA value anymore in your allocator, and so adding phi nodes and >> whatnot for it in your IR doesn't make much sense. > > > But the point I'm trying to make is that I need the links from src to > dest that I get in SSA form for *scheduling* (for example, to know how > many delay slots are needed between two instructions). For things > like if/else, I would need to consider number of cycles since either > possible assigner. For everything else, the phi nodes (in ir3, not in > nir) give me this. Arrays are not special (since they are allocated > in registers) when it comes to cycle counts. (or rather, instructions that write to arrays are not special...) > > BR, > -R ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: add helper to get # of src/dest components
On Thu, Jun 18, 2015 at 9:42 AM, Rob Clark wrote: > On Thu, Jun 18, 2015 at 11:01 AM, Connor Abbott wrote: > (really I want phi's for variables too.. the way I turn arrays into > fanin/collect fanout/split works on the backend for dealing with > arrays in ssa form (other than making instruction graph large) but the > way I go from nir to ir3 currently assumes I get nir phi's everywhere > I need an ir3 phi) Right... we explicitly decided not to support SSA form for arrays in NIR, since it seemed like a pretty bad idea. SSA form assumes that inserting copies for the things you're SSA-ifying is relatively inexpensive, and both SSA-based register allocation and algorithms for converting out of SSA make no guarantees about not inserting potentially unnecessary copies. This is a good compromise for smaller things, but not for your array of (say) 64 things where inserting an extra copy is rather disastrous. You can do it if you want and shoot yourself in the foot, but NIR isn't going to help you there -- we won't write a to-SSA pass for something which doesn't make much sense in the first place. >>> >>> in ir3 my solution is to add sufficient dependencies between >>> instructions so the array accesses don't get re-ordered and they all >>> collapse down to a single name per array element/slot >> >> It's not about getting reordered, it's about interference. The problem >> is that as soon as you do basically any optimization at all (even copy >> propagation), you can wind up with a situation where the sources and >> destination of a phi node interfere with each other and you have to >> insert extra mov's. And even if you keep everything exactly the same, >> with an SSA-based register allocator, there's always the chance that >> it'll screw up anyways and allocate something over your array and >> force you to insert a mov. You could force the array to be allocated >> to a single hardware register, but then it's not really an SSA value >> anymore -- it's a hardware register, and you can't treat it like an >> SSA value anymore in your allocator, and so adding phi nodes and >> whatnot for it in your IR doesn't make much sense. > > > But the point I'm trying to make is that I need the links from src to > dest that I get in SSA form for *scheduling* (for example, to know how > many delay slots are needed between two instructions). For things > like if/else, I would need to consider number of cycles since either > possible assigner. For everything else, the phi nodes (in ir3, not in > nir) give me this. Arrays are not special (since they are allocated > in registers) when it comes to cycle counts. > > BR, > -R Except that they still are special, and you need to account for that when you set up scheduling dependencies for them. For example, imagine that you have an array A accessed in a loop: while (...) { ... = A[i]; A[i] = ...; } if you lower the array to SSA, this will give you something like: while (...) { A_1 = phi(A_0, A_2); ... = A_1[i]; A_2 = Update(A_1, i, ...); /* makes a copy with the i'th element changed */ } and when you set up scheduling dependencies, you'll miss the false write-after-read dependency between the access and the update, meaning you could end up with: while (...) { A_1 = phi(A_0, A_2); A_2 = Update(A_1, i, ...); ... = A_1[i]; } and now the number of instructions in your shader has exploded since you have to insert a copy somewhere. You could add all the false dependencies by yourself, and force it into the same register, but by that point you've already given up all the advantages that SSA has to offer and inserting phi nodes is a rather pointless exercise. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] New stable-branch 10.5 candidate pushed
Hello list, The candidate for the Mesa 10.5.8 is now available. Currently we have: - 11 queued - 10 nominated (outstanding) - and 0 rejected (obsolete) patches The present queue consist of a few nouveau and i965 patches, along with couple libEGL hit bug fixes and a build fix for people using llvm/clang. Take a look at section "Mesa stable queue" for more information. Testing --- The following results are against piglit 305ecc3ac89. Changes - classic i965(snb) --- None. Changes - swrast classic None. Changes - gallium softpipe -- None. Changes - gallium llvmpipe (LLVM 3.6) - None. Testing reports/general approval Any testing reports (or general approval of the state of the branch) will be greatly appreciated. Trivial merge conflicts --- commit bb00457f49177d8d43417855f843887de3148e99 Author: Jason Ekstrand i965/fs: Don't let the EOT send message interfere with the MRF hack (cherry picked from commit 86e5afbfee5492235cab1a7be4ea49ac02be1644) The plan is to have 10.5.8 this Friday(19th of June). If you have any questions or comments that you would like to share before the release, please go ahead. Cheers, Emil Mesa stable queue - Nominated (10) == Boyan Ding (2): egl/x11: Remove duplicate call to dri2_x11_add_configs_for_visuals i915: Add XRGB format to intel_screen_make_configs Brian Paul (1): configure: don't try to build gallium DRI drivers if --disable-dri is set Ilia Mirkin (1): glsl: add version checks to conditionals for builtin variable enablement mesa: add GL_PROGRAM_PIPELINE support in KHR_debug calls Mario Kleiner(1): nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads Tapani Pälli (2): glsl: Allow dynamic sampler array indexing with GLSL ES < 3.00 glsl: validate sampler array indexing for 'constant-index-expression' Tom Stellard (3): clover: Call clBuildProgram() notification function when build completes v2 gallium/drivers: Add threadsafe wrappers for pipe_context and pipe_screen clover: Use threadsafe wrappers for pipe_screen and pipe_context Queued (11) === Ben Widawsky (1): i965: Disable compaction for EOT send messages Boyan Ding (1): egl/x11: Set version of swrastLoader to 2 Emil Velikov (1): docs: Add sha256sums for the 10.5.7 release Erik Faye-Lund (1): mesa: build xmlconfig to a separate static library Francisco Jerez (1): i965: Don't compact instructions with unmapped bits. Ilia Mirkin (3): nvc0/ir: fix collection of first uses for texture barrier insertion nv50,nvc0: clamp uniform size to 64k nvc0/ir: can't have a join on a load with an indirect source Jason Ekstrand (1): i965/fs: Don't let the EOT send message interfere with the MRF hack Marek Olšák (1): egl: fix setting context flags Roland Scheidegger (1): draw: (trivial) fix NULL pointer dereference ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91020] Mesa's demo / tools won't compile since EGL changes
https://bugs.freedesktop.org/show_bug.cgi?id=91020 --- Comment #1 from Emil Velikov --- Mesa commit 7a58262e58d removed all the EGL_MESA_screen_surface definitions as it was never implemented (according to it). Seems like mesa-demos doesn't have the necessary compile guards. Something like the following should fix things #if EGL_MESA_screen_surface // code using the extension #endif Feel free to wrap up a patch and send it to mesa-dev ML. Thanks Emil -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] configure.ac: rename LLVM_VERSION_PATCH to avoid conflict with llvm-config.h
On 18 June 2015 at 13:32, Marek Olšák wrote: > We'll need to check the patch version for tessellation and amdgpu. > Did not see that one. Thanks ! -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] darwin: Suppress type conversion warnings for GLhandleARB
On 18 June 2015 at 06:53, Julien Isorce wrote: > From: Jon TURNEY > > On darwin, GLhandleARB is defined as a void *, not the unsigned int it is on > linux. > > For the moment, apply a cast to supress the warning > > Possibly this is safe, as for the mesa software renderer the shader program > handle is not a real pointer, but a integer handle > > Probably this is not the right thing to do, and we should pay closer attention > to how the GLhandlerARB type is used. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66346 Imho one's goal should not be to find the quickest solution to shut warnings up, but to understand them. As mentioned by Jeremy Huddleston in comment 11, this approach only masks the issue. There was a lengthy discussion(s) on the topic a while back and one of the objections on the proposed fixes (from Ian iirc) was that it will break the libglapi ABI. As Jeremy did not see that as a concern perhaps we can revive them and make sure that the ABI change does not happen in the non-Mac world. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: add helper to get # of src/dest components
On Thu, Jun 18, 2015 at 1:27 PM, Connor Abbott wrote: > On Thu, Jun 18, 2015 at 9:42 AM, Rob Clark wrote: >> On Thu, Jun 18, 2015 at 11:01 AM, Connor Abbott wrote: >> (really I want phi's for variables too.. the way I turn arrays into >> fanin/collect fanout/split works on the backend for dealing with >> arrays in ssa form (other than making instruction graph large) but the >> way I go from nir to ir3 currently assumes I get nir phi's everywhere >> I need an ir3 phi) > > Right... we explicitly decided not to support SSA form for arrays in > NIR, since it seemed like a pretty bad idea. SSA form assumes that > inserting copies for the things you're SSA-ifying is relatively > inexpensive, and both SSA-based register allocation and algorithms for > converting out of SSA make no guarantees about not inserting > potentially unnecessary copies. This is a good compromise for smaller > things, but not for your array of (say) 64 things where inserting an > extra copy is rather disastrous. You can do it if you want and shoot > yourself in the foot, but NIR isn't going to help you there -- we > won't write a to-SSA pass for something which doesn't make much sense > in the first place. > in ir3 my solution is to add sufficient dependencies between instructions so the array accesses don't get re-ordered and they all collapse down to a single name per array element/slot >>> >>> It's not about getting reordered, it's about interference. The problem >>> is that as soon as you do basically any optimization at all (even copy >>> propagation), you can wind up with a situation where the sources and >>> destination of a phi node interfere with each other and you have to >>> insert extra mov's. And even if you keep everything exactly the same, >>> with an SSA-based register allocator, there's always the chance that >>> it'll screw up anyways and allocate something over your array and >>> force you to insert a mov. You could force the array to be allocated >>> to a single hardware register, but then it's not really an SSA value >>> anymore -- it's a hardware register, and you can't treat it like an >>> SSA value anymore in your allocator, and so adding phi nodes and >>> whatnot for it in your IR doesn't make much sense. >> >> >> But the point I'm trying to make is that I need the links from src to >> dest that I get in SSA form for *scheduling* (for example, to know how >> many delay slots are needed between two instructions). For things >> like if/else, I would need to consider number of cycles since either >> possible assigner. For everything else, the phi nodes (in ir3, not in >> nir) give me this. Arrays are not special (since they are allocated >> in registers) when it comes to cycle counts. >> >> BR, >> -R > > Except that they still are special, and you need to account for that > when you set up scheduling dependencies for them. For example, imagine > that you have an array A accessed in a loop: > > while (...) { > ... = A[i]; > A[i] = ...; > } > > if you lower the array to SSA, this will give you something like: > > while (...) { > A_1 = phi(A_0, A_2); > ... = A_1[i]; > A_2 = Update(A_1, i, ...); /* makes a copy with the i'th element changed > */ > } > > and when you set up scheduling dependencies, you'll miss the false > write-after-read dependency between the access and the update, meaning > you could end up with: > > while (...) { > A_1 = phi(A_0, A_2); > A_2 = Update(A_1, i, ...); > ... = A_1[i]; > } > > and now the number of instructions in your shader has exploded since > you have to insert a copy somewhere. You could add all the false > dependencies by yourself, and force it into the same register, but by > that point you've already given up all the advantages that SSA has to > offer and inserting phi nodes is a rather pointless exercise. except, like I said, for the purpose of scheduling realizing that there are two dependency paths to consider for figuring out the required number of delay slots.. That said, having to re-invent the to-ssa pass in my backend is something I was hoping to avoid.. so I'm thinking of other alternatives. But currently the depth calculations (used for scheduling), dead code elimination (used if nothing else to clean things up when generating binning-pass shader), scheduling, and even to some degree RA, depend on this use-def graph between instructions. BR, -R ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/5] applegl: Provide requirements of _SET_DrawBuffers
On 18 June 2015 at 06:53, Julien Isorce wrote: > From: Jon TURNEY > > _SET_DrawBuffers requires driDispatchRemapTable, > so we need to link with libmesa for remap.c. > libmesa requires the C++ linker. > > Also need to arrange to call _mesa_init_remap_table() > to initialize the remap table. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90311 > Signed-off-by: Jon TURNEY > --- > src/glx/Makefile.am | 5 - > src/glx/apple/apple_glapi.c | 3 +++ > 2 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am > index 6e50e09..de24496 100644 > --- a/src/glx/Makefile.am > +++ b/src/glx/Makefile.am > @@ -139,7 +139,10 @@ libglx_la_SOURCES += \ > applegl_glx.c > > SUBDIRS += apple > -libglx_la_LIBADD += $(builddir)/apple/libappleglx.la > +libglx_la_LIBADD += \ > + $(builddir)/apple/libappleglx.la \ > + $(top_builddir)/src/mesa/libmesa.la > +nodist_EXTRA_lib@GL_LIB@_la_SOURCES = dummy.cpp Pulling hunks of mesa into libGL does not sounds like a reasonable thing. There is a reason why Jon had that XXX: in the original commit message. Cheers, Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/5] egl/dri2: load libglapi.0.dylib on osx
On 18 June 2015 at 06:53, Julien Isorce wrote: > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90903 > Signed-off-by: Julien Isorce > --- > src/egl/drivers/dri2/egl_dri2.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c > index a1cbd43..90b9648 100644 > --- a/src/egl/drivers/dri2/egl_dri2.c > +++ b/src/egl/drivers/dri2/egl_dri2.c > @@ -2354,6 +2354,8 @@ dri2_load(_EGLDriver *drv) > #ifdef HAVE_SHARED_GLAPI > #ifdef HAVE_ANDROID_PLATFORM > const char *libname = "libglapi.so"; > +#elif defined(__APPLE__) > + const char *libname = "libglapi.0.dylib"; > #else Is this enough to get dri modules (suspecting only swrast atm) working with EGL ? Note that currently libEGL already explicitly links against libglapi, so imho we can just drop the whole thing. Haven't really looked but do we use glFlush() as a workaround or it's something required by the EGL standard ? Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] egl: use unix defines on osx with clang
On 18 June 2015 at 06:53, Julien Isorce wrote: > CC egl_dri2.lo > include/EGL/eglplatform.h:135:2: > error: "Platform not recognized" > include/EGL/eglplatform.h:140:9: > error: unknown type name 'EGLNativeDisplayType' > typedef EGLNativeDisplayType NativeDisplayType; > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90249 > Signed-off-by: Julien Isorce > --- > include/EGL/eglplatform.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/include/EGL/eglplatform.h b/include/EGL/eglplatform.h > index 7802542..b376e64 100644 > --- a/include/EGL/eglplatform.h > +++ b/include/EGL/eglplatform.h > @@ -77,7 +77,7 @@ typedef HDC EGLNativeDisplayType; > typedef HBITMAP EGLNativePixmapType; > typedef HWNDEGLNativeWindowType; > > -#elif defined(__APPLE__) || defined(__WINSCW__) || defined(__SYMBIAN32__) > /* Symbian */ > +#elif defined(__WINSCW__) || defined(__SYMBIAN32__) /* Symbian */ > > typedef int EGLNativeDisplayType; > typedef void *EGLNativeWindowType; > @@ -105,7 +105,7 @@ typedef struct ANativeWindow* > EGLNativeWindowType; > typedef struct egl_native_pixmap_t* EGLNativePixmapType; > typedef void* EGLNativeDisplayType; > > -#elif defined(__unix__) > +#elif defined(__unix__) || defined(__APPLE__) > > #if defined(MESA_EGL_NO_X11_HEADERS) > > -- > 1.9.5 (Apple Git-50.3) > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] egl: use unix defines on osx with clang
On 18 June 2015 at 19:29, Emil Velikov wrote: Sorry about that. Unintentionally hit send ;-\ > On 18 June 2015 at 06:53, Julien Isorce wrote: >> CC egl_dri2.lo >> include/EGL/eglplatform.h:135:2: >> error: "Platform not recognized" >> include/EGL/eglplatform.h:140:9: >> error: unknown type name 'EGLNativeDisplayType' >> typedef EGLNativeDisplayType NativeDisplayType; >> You should not longer see this message. Did you try building things, with the updated eglplatform.h ? >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90249 >> Signed-off-by: Julien Isorce >> --- >> include/EGL/eglplatform.h | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/include/EGL/eglplatform.h b/include/EGL/eglplatform.h >> index 7802542..b376e64 100644 >> --- a/include/EGL/eglplatform.h >> +++ b/include/EGL/eglplatform.h >> @@ -77,7 +77,7 @@ typedef HDC EGLNativeDisplayType; >> typedef HBITMAP EGLNativePixmapType; >> typedef HWNDEGLNativeWindowType; >> >> -#elif defined(__APPLE__) || defined(__WINSCW__) || defined(__SYMBIAN32__) >> /* Symbian */ >> +#elif defined(__WINSCW__) || defined(__SYMBIAN32__) /* Symbian */ >> The above "defined(__APPLE__))" comes from Khronos, so if it's wrong perhaps it should be reported to them ? Afaict with current mesa this patch is not needed, and things will just work. Cheers, Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: add helper to get # of src/dest components
On Thu, Jun 18, 2015 at 11:19 AM, Rob Clark wrote: > On Thu, Jun 18, 2015 at 1:27 PM, Connor Abbott wrote: >> On Thu, Jun 18, 2015 at 9:42 AM, Rob Clark wrote: >>> On Thu, Jun 18, 2015 at 11:01 AM, Connor Abbott wrote: >>> (really I want phi's for variables too.. the way I turn arrays into >>> fanin/collect fanout/split works on the backend for dealing with >>> arrays in ssa form (other than making instruction graph large) but the >>> way I go from nir to ir3 currently assumes I get nir phi's everywhere >>> I need an ir3 phi) >> >> Right... we explicitly decided not to support SSA form for arrays in >> NIR, since it seemed like a pretty bad idea. SSA form assumes that >> inserting copies for the things you're SSA-ifying is relatively >> inexpensive, and both SSA-based register allocation and algorithms for >> converting out of SSA make no guarantees about not inserting >> potentially unnecessary copies. This is a good compromise for smaller >> things, but not for your array of (say) 64 things where inserting an >> extra copy is rather disastrous. You can do it if you want and shoot >> yourself in the foot, but NIR isn't going to help you there -- we >> won't write a to-SSA pass for something which doesn't make much sense >> in the first place. >> > > in ir3 my solution is to add sufficient dependencies between > instructions so the array accesses don't get re-ordered and they all > collapse down to a single name per array element/slot It's not about getting reordered, it's about interference. The problem is that as soon as you do basically any optimization at all (even copy propagation), you can wind up with a situation where the sources and destination of a phi node interfere with each other and you have to insert extra mov's. And even if you keep everything exactly the same, with an SSA-based register allocator, there's always the chance that it'll screw up anyways and allocate something over your array and force you to insert a mov. You could force the array to be allocated to a single hardware register, but then it's not really an SSA value anymore -- it's a hardware register, and you can't treat it like an SSA value anymore in your allocator, and so adding phi nodes and whatnot for it in your IR doesn't make much sense. >>> >>> >>> But the point I'm trying to make is that I need the links from src to >>> dest that I get in SSA form for *scheduling* (for example, to know how >>> many delay slots are needed between two instructions). For things >>> like if/else, I would need to consider number of cycles since either >>> possible assigner. For everything else, the phi nodes (in ir3, not in >>> nir) give me this. Arrays are not special (since they are allocated >>> in registers) when it comes to cycle counts. >>> >>> BR, >>> -R >> >> Except that they still are special, and you need to account for that >> when you set up scheduling dependencies for them. For example, imagine >> that you have an array A accessed in a loop: >> >> while (...) { >> ... = A[i]; >> A[i] = ...; >> } >> >> if you lower the array to SSA, this will give you something like: >> >> while (...) { >> A_1 = phi(A_0, A_2); >> ... = A_1[i]; >> A_2 = Update(A_1, i, ...); /* makes a copy with the i'th element changed >> */ >> } >> >> and when you set up scheduling dependencies, you'll miss the false >> write-after-read dependency between the access and the update, meaning >> you could end up with: >> >> while (...) { >> A_1 = phi(A_0, A_2); >> A_2 = Update(A_1, i, ...); >> ... = A_1[i]; >> } >> >> and now the number of instructions in your shader has exploded since >> you have to insert a copy somewhere. You could add all the false >> dependencies by yourself, and force it into the same register, but by >> that point you've already given up all the advantages that SSA has to >> offer and inserting phi nodes is a rather pointless exercise. > > except, like I said, for the purpose of scheduling realizing that > there are two dependency paths to consider for figuring out the > required number of delay slots.. No, there aren't. There's the dependency between the read and the write (in my example), which serializes things and makes it one path. In other words, the stuff before the write must happen before the write, even if in SSA those are two different values. > > That said, having to re-invent the to-ssa pass in my backend is > something I was hoping to avoid.. so I'm thinking of other > alternatives. But currently the depth calculations (used for > scheduling), dead code elimination (used if nothing else to clean > things up when generating binning-pass shader), scheduling, and even > to some degree RA, depend on this use-def graph between instructions. What I'm saying is that for arrays, the use-def graph isn't enough. At least the scheduler has to
Re: [Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads
On 2015-06-17 22:57:27, Iago Toral wrote: > On Wed, 2015-06-17 at 17:20 -0700, Jordan Justen wrote: > > I wanted to question whether this was required, based on this text > > from the extension spec: > > > > "The ability to write to buffer objects creates the potential for > > multiple independent shader invocations to read and write the same > > underlying memory. The same issue exists with the > > ARB_shader_image_load_store extension provided in OpenGL 4.2, which > > can write to texture objects and buffers. In both cases, the > > specification makes few guarantees related to the relative order of > > memory reads and writes performed by the shader invocations." > > > > But I'm not sure if we can reconcile CSE with 'memoryBarrier' and > > 'barrier'. curro, any thoughts from image load/store? > > I think the problem is within the same thread, that text above talks > about multiple invocations reading from and writing to the same > location, but within the same invocation, the order of reads and writes > must be preserved: > > "Buffer variable memory reads and writes within a single shader > invocation are processed in order. However, the order of reads and > writes performed in one invocation relative to those performed by > another invocation is largely undefined." > > For example, if X is a shader storage buffer variable and we have code > like this with just one invocation: > > ssbo_store(X, 1); > a = ssbo_load(X) + 1 // a = 2 > ssbo_store(X, 2); > b = ssbo_load(X) + 1; // b = 3 > > CSE could mess it up like this: > > ssbo_store(X, 1); > tmp = ssbo_load(X) + 1 // tmp = 2 > a = tmp; > ssbo_store(X, 2); > b = tmp; > > which would be incorrect. I think I wrote this patch after seeing > something like this happening. The CSE pass clearly states that it does > not support write variables after all. > > Also, notice the same would apply if there are multiple invocations but > the shader code used something like gl_VertexID or gl_FragCoord to make > each invocation read from/write to a different address within the SSBO > buffer (I imagine this is the usual way to operate with SSBOs). In these > cases, even if we have multiple invocations, keeping the relative order > of reads and writes within each one is necessary. Ok. Reviewed-by: Jordan Justen > > > > On 2015-06-03 00:01:13, Iago Toral Quiroga wrote: > > > SSBOs are read/write and this CSE pass only handles read-only variables. > > > --- > > > src/glsl/opt_cse.cpp | 33 - > > > 1 file changed, 32 insertions(+), 1 deletion(-) > > > > > > diff --git a/src/glsl/opt_cse.cpp b/src/glsl/opt_cse.cpp > > > index 4b8e9a0..a05ab46 100644 > > > --- a/src/glsl/opt_cse.cpp > > > +++ b/src/glsl/opt_cse.cpp > > > @@ -245,6 +245,28 @@ contains_rvalue(ir_rvalue *haystack, ir_rvalue > > > *needle) > > > } > > > > > > static bool > > > +expression_contains_ssbo_load(ir_expression *expr) > > > +{ > > > + if (expr->operation == ir_binop_ssbo_load) > > > + return true; > > > + > > > + for (unsigned i = 0; i < expr->get_num_operands(); i++) { > > > + ir_rvalue *op = expr->operands[i]; > > > + if (op->ir_type == ir_type_expression && > > > + expression_contains_ssbo_load(op->as_expression())) { > > > + return true; > > > + } else if (op->ir_type == ir_type_swizzle) { > > > + ir_swizzle *swizzle = op->as_swizzle(); > > > + ir_expression *val = swizzle->val->as_expression(); > > > + if (val && expression_contains_ssbo_load(val)) > > > +return true; > > > + } > > > + } > > > + > > > + return false; > > > +} > > > + > > > +static bool > > > is_cse_candidate(ir_rvalue *ir) > > > { > > > /* Our temporary variable assignment generation isn't ready to handle > > > @@ -260,7 +282,16 @@ is_cse_candidate(ir_rvalue *ir) > > > * to variable-index array dereferences at some point. > > > */ > > > switch (ir->ir_type) { > > > - case ir_type_expression: > > > + case ir_type_expression: { > > > + /* Skip expressions involving SSBO loads, since these operate on > > > + * read-write variables, meaning that the same ssbo_load > > > expression > > > + * may return a different value if the underlying buffer storage > > > + * is written in between. > > > + */ > > > + if (expression_contains_ssbo_load(ir->as_expression())) > > > +return false; > > > + } > > > + break; > > > case ir_type_texture: > > >break; > > > default: > > > -- > > > 1.9.1 > > > > > > ___ > > > mesa-dev mailing list > > > mesa-dev@lists.freedesktop.org > > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC v2 14/15] i965: refactor miptree alignment calculation code
The next revision will be all code refactoring. I moved the spec references into the new top-level function intel_get_texture_alignment_unit because some of the alignment calculations occurring in the top-level function reference the spec. Since the intel_vertical_texture_alignment_unit and intel_horizontal_texture_alignment_unit are called from within the top-level (and only within it), it would be assumed that the references also apply to these functions. On Tue, Jun 9, 2015 at 12:13 PM, Ian Romanick wrote: > On 06/01/2015 10:13 AM, Nanley Chery wrote: >> From: Nanley Chery >> >> - Remove redundant checks and comments by grouping our calculations for >> align_w and align_h wherever possible. >> - Don't pass more parameters than necessary. >> - Minor code simplifications. >> >> Signed-off-by: Nanley Chery >> --- >> src/mesa/drivers/dri/i965/brw_tex_layout.c | 181 >> + >> 1 file changed, 83 insertions(+), 98 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c >> b/src/mesa/drivers/dri/i965/brw_tex_layout.c >> index 5aadd00..a54b77d 100644 >> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c >> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c >> @@ -40,16 +40,9 @@ >> #define FILE_DEBUG_FLAG DEBUG_MIPTREE >> >> static unsigned int >> -intel_horizontal_texture_alignment_unit(struct brw_context *brw, >> -struct intel_mipmap_tree *mt) >> +intel_horizontal_texture_alignment_unit(int gen, const struct >> intel_mipmap_tree *mt) >> { >> /** >> -* From the "Alignment Unit Size" section of various specs, namely: >> -* - Gen3 Spec: "Memory Data Formats" Volume, Section 1.20.1.4 >> -* - i965 and G45 PRMs: Volume 1, Section 6.17.3.4. >> -* - Ironlake and Sandybridge PRMs: Volume 1, Part 1, Section 7.18.3.4 >> -* - BSpec (for Ivybridge and slight variations in separate stencil) >> -* > > Since the rest of the spec quotation remains, why remove this part? > > I also agree with Anuj's comments about splitting this into multiple > patches. > >> * >> +--+ >> * || alignment unit width >> ("i") | >> * | Surface Property >> |-| >> @@ -67,47 +60,19 @@ intel_horizontal_texture_alignment_unit(struct >> brw_context *brw, >> * On IVB+, non-special cases can be overridden by setting the >> SURFACE_STATE >> * "Surface Horizontal Alignment" field to HALIGN_4 or HALIGN_8. >> */ >> -if (_mesa_is_format_compressed(mt->format)) { >> - /* The hardware alignment requirements for compressed textures >> -* happen to match the block boundaries. >> -*/ >> - unsigned int i, j; >> - _mesa_get_format_block_size(mt->format, &i, &j); >> - >> - /* On Gen9+ we can pick our own alignment for compressed textures but >> it >> - * has to be a multiple of the block size. The minimum alignment we >> can >> - * pick is 4 so we effectively have to align to 4 times the block >> - * size >> - */ >> - if (brw->gen >= 9) >> - return i * 4; >> - else >> - return i; >> -} >> - >> - if (mt->format == MESA_FORMAT_S_UINT8) >> - return 8; >> - >> - if (brw->gen >= 7 && mt->format == MESA_FORMAT_Z_UNORM16) >> + if (gen >= 7 && mt->format == MESA_FORMAT_Z_UNORM16) >>return 8; >> >> - if (brw->gen == 8 && mt->mcs_mt && mt->num_samples <= 1) >> + if (gen == 8 && mt->mcs_mt && mt->num_samples <= 1) >>return 16; >> >> return 4; >> } >> >> static unsigned int >> -intel_vertical_texture_alignment_unit(struct brw_context *brw, >> - mesa_format format, bool multisampled) >> +intel_vertical_texture_alignment_unit(int gen, const struct >> intel_mipmap_tree *mt) >> { >> /** >> -* From the "Alignment Unit Size" section of various specs, namely: >> -* - Gen3 Spec: "Memory Data Formats" Volume, Section 1.20.1.4 >> -* - i965 and G45 PRMs: Volume 1, Section 6.17.3.4. >> -* - Ironlake and Sandybridge PRMs: Volume 1, Part 1, Section 7.18.3.4 >> -* - BSpec (for Ivybridge and slight variations in separate stencil) >> -* >> * >> +--+ >> * || alignment unit height >> ("j") | >> * | Surface Property >> |-| >> @@ -124,34 +89,25 @@ intel_vertical_texture_alignment_unit(struct >> brw_context *brw, >> * Where "*" means either VALIGN_2 or VALIGN_4 depending on the setting >> of >> * the SURFACE_STATE "Surface Vertical Alignment" field. >> */ >> - if (_mesa_is_format_compressed(format)) { >> - unsigned int i, j; >> - _mesa_get_
Re: [Mesa-dev] [PATCH 5/5] osx: fix asm support on darwin
On 18 June 2015 at 06:53, Julien Isorce wrote: > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90908 > Signed-off-by: Julien Isorce > --- > configure.ac | 2 +- > src/mesa/x86-64/xform4.S | 53 > +--- > src/mesa/x86/assyntax.h | 2 +- > 3 files changed, 30 insertions(+), 27 deletions(-) > > diff --git a/configure.ac b/configure.ac > index ae6d83d..6d699d5 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -615,7 +615,7 @@ if test "x$enable_asm" = xyes; then > ;; > x86_64|amd64) > case "$host_os" in > -linux* | *freebsd* | dragonfly* | *netbsd* | openbsd*) > +linux* | *freebsd* | dragonfly* | *netbsd* | openbsd* | darwin*) > asm_arch=x86_64 > ;; > esac > diff --git a/src/mesa/x86-64/xform4.S b/src/mesa/x86-64/xform4.S > index c185f62..17eb7fa 100644 > --- a/src/mesa/x86-64/xform4.S > +++ b/src/mesa/x86-64/xform4.S > @@ -24,14 +24,15 @@ > > #ifdef USE_X86_64_ASM > > +#include "x86/assyntax.h" > #include "matypes.h" > > .text > > .align 16 > -.globl _mesa_x86_64_cpuid > -.hidden _mesa_x86_64_cpuid > -_mesa_x86_64_cpuid: > +GLOBL GLNAME(_mesa_x86_64_cpuid) > +HIDDEN(_mesa_x86_64_cpuid) > +GLNAME(_mesa_x86_64_cpuid): > pushq %rbx > movl(%rdi), %eax > movl8(%rdi), %ecx > @@ -46,9 +47,9 @@ _mesa_x86_64_cpuid: > ret > > .align 16 > -.globl _mesa_x86_64_transform_points4_general > -.hidden _mesa_x86_64_transform_points4_general > -_mesa_x86_64_transform_points4_general: > +GLOBL GLNAME(_mesa_x86_64_transform_points4_general) > +HIDDEN(_mesa_x86_64_transform_points4_general) > +GLNAME(_mesa_x86_64_transform_points4_general): > /* > * rdi = dest > * rsi = matrix > @@ -105,8 +106,10 @@ p4_general_loop: > p4_general_done: > .byte 0xf3 > ret > - > + > +#if defined (__ELF__) && defined (__linux__) > .section .rodata > +#endif > > .align 16 > p4_constants: > @@ -122,13 +125,13 @@ p4_constants: > > .text > .align 16 > -.globl _mesa_x86_64_transform_points4_3d > -.hidden _mesa_x86_64_transform_points4_3d > +GLOBL GLNAME(_mesa_x86_64_transform_points4_3d) > +HIDDEN(_mesa_x86_64_transform_points4_3d) > /* > * this is slower than _mesa_x86_64_transform_points4_general > * because it ensures that the last matrix row (or is it column?) is 0,0,0,1 > */ > -_mesa_x86_64_transform_points4_3d: > +GLNAME(_mesa_x86_64_transform_points4_3d): > > leaq p4_constants(%rip), %rax > > @@ -194,9 +197,9 @@ p4_3d_done: > > > .align 16 > -.globl _mesa_x86_64_transform_points4_identity > -.hidden _mesa_x86_64_transform_points4_identity > -_mesa_x86_64_transform_points4_identity: > +GLOBL GLNAME(_mesa_x86_64_transform_points4_identity) > +HIDDEN(_mesa_x86_64_transform_points4_identitiy) > +GLNAME(_mesa_x86_64_transform_points4_identity): > > movl V4F_COUNT(%rdx), %ecx /* count */ > movzbl V4F_STRIDE(%rdx), %eax /* stride */ > @@ -223,9 +226,9 @@ p4_identity_done: > > > .align 16 > -.globl _mesa_3dnow_transform_points4_3d_no_rot > -.hidden _mesa_3dnow_transform_points4_3d_no_rot > -_mesa_3dnow_transform_points4_3d_no_rot: > +GLOBL GLNAME(_mesa_3dnow_transform_points4_3d_no_rot) > +HIDDEN(_mesa_3dnow_transform_points4_3d_no_rot) > +GLNAME(_mesa_3dnow_transform_points4_3d_no_rot): > > movl V4F_COUNT(%rdx), %ecx /* count */ > movzbl V4F_STRIDE(%rdx), %eax /* stride */ > @@ -288,9 +291,9 @@ p4_3d_no_rot_done: > > > .align 16 > -.globl _mesa_3dnow_transform_points4_perspective > -.hidden _mesa_3dnow_transform_points4_perspective > -_mesa_3dnow_transform_points4_perspective: > +GLOBL GLNAME(_mesa_3dnow_transform_points4_perspective) > +HIDDEN(_mesa_3dnow_transform_points4_perspective) > +GLNAME(_mesa_3dnow_transform_points4_perspective): > > movl V4F_COUNT(%rdx), %ecx /* count */ > movzbl V4F_STRIDE(%rdx), %eax /* stride */ > @@ -355,9 +358,9 @@ p4_perspective_done: > ret > > .align 16 > -.globl _mesa_3dnow_transform_points4_2d_no_rot > -.hidden _mesa_3dnow_transform_points4_2d_no_rot > -_mesa_3dnow_transform_points4_2d_no_rot: > +GLOBL GLNAME(_mesa_3dnow_transform_points4_2d_no_rot) > +HIDDEN(_mesa_3dnow_transform_points4_2d_no_rot) > +GLNAME(_mesa_3dnow_transform_points4_2d_no_rot): > > movl V4F_COUNT(%rdx), %ecx /* count */ > movzbl V4F_STRIDE(%rdx), %eax /* stride */ > @@ -411,9 +414,9 @@ p4_2d_no_rot_done: > > > .align 16 > -.globl _mesa_3dnow_transform_points4_2d > -.hidden _mesa_3dnow_transform_points4_2d > -_mesa_3dnow_transform_points4_2d: > +GLOBL GLNAME(_mesa_3dnow_transform_points4_2d) > +HIDDEN(_mesa_3dnow_transform_points4_2d) > +GLNAME(_mesa_3dnow_transform_points4_2d): > > movl V4F_COUNT(%rdx), %ecx /* count */ > movzbl V4F_STRIDE(%rdx), %eax /* stride */ > diff --git a/src/mesa/x86/assyntax.h b/src/mesa/x86/assyntax.h > index 67867bd..8ba7e50 100644 > --- a/src/mesa/x86/a
Re: [Mesa-dev] [PATCH 0/4] CLOEXEC fix-ups
On 17 June 2015 at 17:28, Derek Foreman wrote: > This series catches a couple of places where we forget to set CLOEXEC on > file descriptors, and makes a helper function for the necessarily ugly > way we have to open fds to make sure CLOEXEC is set. > > Derek Foreman (4): > egl/drm: Duplicate fd with F_DUPFD_CLOEXEC to prevent leak > loader: Rename drm_open_device() to loader_open_device() and share it > glx: Use loader_open_device() helper > egl: Use the loader_open_device() helper to do open with CLOEXEC > > src/egl/drivers/dri2/platform_drm.c | 6 +++--- > src/egl/drivers/dri2/platform_surfaceless.c | 11 +-- > src/egl/drivers/dri2/platform_wayland.c | 11 +-- > src/egl/drivers/dri2/platform_x11.c | 12 ++-- > src/glx/dri2_glx.c | 10 +- > src/loader/loader.c | 6 +++--- > src/loader/loader.h | 3 +++ > 7 files changed, 14 insertions(+), 45 deletions(-) Series looks great. Thanks for the help Derek. Reviewed-by: Emil Velikov Will give it a couple of days, and barring any objections I'll push this to master. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] EGL: Add pbuffer support for drm platform
On 18 June 2015 at 17:11, Liu, Ying2 wrote: > Hi, Chad, > > Thank you so much for reviewing my patch. I am in vacation and sabbatical > right now. I will talk with our customer about your suggestion. I will fix > the patch when I come back. > Enjoy your sabbatical Ying ! As you come back, please look closer at in dri2_drm_create_surface(). Matt did mention that the !dri2_dpy->dri2 case hasn't been updated, albeit used by swrast. So as-is things will end up pretty bad. Cheers, Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: add helper to get # of src/dest components
On Thu, Jun 18, 2015 at 2:34 PM, Connor Abbott wrote: > On Thu, Jun 18, 2015 at 11:19 AM, Rob Clark wrote: >> On Thu, Jun 18, 2015 at 1:27 PM, Connor Abbott wrote: >>> On Thu, Jun 18, 2015 at 9:42 AM, Rob Clark wrote: On Thu, Jun 18, 2015 at 11:01 AM, Connor Abbott wrote: (really I want phi's for variables too.. the way I turn arrays into fanin/collect fanout/split works on the backend for dealing with arrays in ssa form (other than making instruction graph large) but the way I go from nir to ir3 currently assumes I get nir phi's everywhere I need an ir3 phi) >>> >>> Right... we explicitly decided not to support SSA form for arrays in >>> NIR, since it seemed like a pretty bad idea. SSA form assumes that >>> inserting copies for the things you're SSA-ifying is relatively >>> inexpensive, and both SSA-based register allocation and algorithms for >>> converting out of SSA make no guarantees about not inserting >>> potentially unnecessary copies. This is a good compromise for smaller >>> things, but not for your array of (say) 64 things where inserting an >>> extra copy is rather disastrous. You can do it if you want and shoot >>> yourself in the foot, but NIR isn't going to help you there -- we >>> won't write a to-SSA pass for something which doesn't make much sense >>> in the first place. >>> >> >> in ir3 my solution is to add sufficient dependencies between >> instructions so the array accesses don't get re-ordered and they all >> collapse down to a single name per array element/slot > > It's not about getting reordered, it's about interference. The problem > is that as soon as you do basically any optimization at all (even copy > propagation), you can wind up with a situation where the sources and > destination of a phi node interfere with each other and you have to > insert extra mov's. And even if you keep everything exactly the same, > with an SSA-based register allocator, there's always the chance that > it'll screw up anyways and allocate something over your array and > force you to insert a mov. You could force the array to be allocated > to a single hardware register, but then it's not really an SSA value > anymore -- it's a hardware register, and you can't treat it like an > SSA value anymore in your allocator, and so adding phi nodes and > whatnot for it in your IR doesn't make much sense. But the point I'm trying to make is that I need the links from src to dest that I get in SSA form for *scheduling* (for example, to know how many delay slots are needed between two instructions). For things like if/else, I would need to consider number of cycles since either possible assigner. For everything else, the phi nodes (in ir3, not in nir) give me this. Arrays are not special (since they are allocated in registers) when it comes to cycle counts. BR, -R >>> >>> Except that they still are special, and you need to account for that >>> when you set up scheduling dependencies for them. For example, imagine >>> that you have an array A accessed in a loop: >>> >>> while (...) { >>> ... = A[i]; >>> A[i] = ...; >>> } >>> >>> if you lower the array to SSA, this will give you something like: >>> >>> while (...) { >>> A_1 = phi(A_0, A_2); >>> ... = A_1[i]; >>> A_2 = Update(A_1, i, ...); /* makes a copy with the i'th element >>> changed */ >>> } >>> >>> and when you set up scheduling dependencies, you'll miss the false >>> write-after-read dependency between the access and the update, meaning >>> you could end up with: >>> >>> while (...) { >>> A_1 = phi(A_0, A_2); >>> A_2 = Update(A_1, i, ...); >>> ... = A_1[i]; >>> } >>> >>> and now the number of instructions in your shader has exploded since >>> you have to insert a copy somewhere. You could add all the false >>> dependencies by yourself, and force it into the same register, but by >>> that point you've already given up all the advantages that SSA has to >>> offer and inserting phi nodes is a rather pointless exercise. >> >> except, like I said, for the purpose of scheduling realizing that >> there are two dependency paths to consider for figuring out the >> required number of delay slots.. > > No, there aren't. There's the dependency between the read and the > write (in my example), which serializes things and makes it one path. > In other words, the stuff before the write must happen before the > write, even if in SSA those are two different values. that wasn't really the case I was talking about.. but rather: if (...) { ... arr[i] = a + b; } else { arr[i] = c + d; } ... = arr[i]; >> >> That said, having to re-invent the to-ssa pass in my backend is >> something I was hoping to avoid.. so I'm thinking of other >> alternatives. But currently the depth calculations (
Re: [Mesa-dev] [PATCH 1/5] darwin: Suppress type conversion warnings for GLhandleARB
Hi Emil, Do you mean adding the cast only if __APPLE__ is defined ? Actually when reading comment #2 we should get the build error for src/mesa/main/uniform_query.cpp too but it is not the case anymore. So with git blame I can see that the build error has been fixed by commit d96ed5c0 for the file uniform_query.cpp only. But can we apply the same for _mesa_BindAttribLocation in shader_query.cpp ? I can see in the spec that non-ARB version of glBindAttribLocation 's first param is GLuint https://www.opengl.org/sdk/docs/man3/xhtml/glBindAttribLocation.xml (gles2: https://www.khronos.org/opengles/sdk/docs/man/xhtml/glBindAttribLocation.xml ) What is the purpose of src/mapi/glapi/gen/ARB_robustness.xml ? Thx Julien On 18 June 2015 at 19:16, Emil Velikov wrote: > On 18 June 2015 at 06:53, Julien Isorce wrote: > > From: Jon TURNEY > > > > On darwin, GLhandleARB is defined as a void *, not the unsigned int it > is on > > linux. > > > > For the moment, apply a cast to supress the warning > > > > Possibly this is safe, as for the mesa software renderer the shader > program > > handle is not a real pointer, but a integer handle > > > > Probably this is not the right thing to do, and we should pay closer > attention > > to how the GLhandlerARB type is used. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66346 > Imho one's goal should not be to find the quickest solution to shut > warnings up, but to understand them. > As mentioned by Jeremy Huddleston in comment 11, this approach only > masks the issue. > > There was a lengthy discussion(s) on the topic a while back and one of > the objections on the proposed fixes (from Ian iirc) was that it will > break the libglapi ABI. As Jeremy did not see that as a concern > perhaps we can revive them and make sure that the ABI change does not > happen in the non-Mac world. > > -Emil > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: move gl_NumSamples enablement with other ARB_sample_shading vars
gl_NumSamples should only be enabled when ARB_sample_shading is enabled, and only in the fragment shader. Move to the generate_fs_special_vars function, next to the other ARB_sample_shading-provided variables. Signed-off-by: Ilia Mirkin --- src/glsl/builtin_variables.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp index c52b252..d2acbac 100644 --- a/src/glsl/builtin_variables.cpp +++ b/src/glsl/builtin_variables.cpp @@ -764,7 +764,6 @@ builtin_variable_generator::generate_constants() void builtin_variable_generator::generate_uniforms() { - add_uniform(int_t, "gl_NumSamples"); add_uniform(type("gl_DepthRangeParameters"), "gl_DepthRange"); add_uniform(array(vec4_t, VERT_ATTRIB_MAX), "gl_CurrentAttribVertMESA"); add_uniform(array(vec4_t, VARYING_SLOT_MAX), "gl_CurrentAttribFragMESA"); @@ -947,6 +946,7 @@ builtin_variable_generator::generate_fs_special_vars() } if (state->is_version(400, 0) || state->ARB_sample_shading_enable) { + add_uniform(int_t, "gl_NumSamples"); add_system_value(SYSTEM_VALUE_SAMPLE_ID, int_t, "gl_SampleID"); add_system_value(SYSTEM_VALUE_SAMPLE_POS, vec2_t, "gl_SamplePosition"); /* From the ARB_sample_shading specification: -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/5] applegl: Provide requirements of _SET_DrawBuffers
Sorry for removing the XXX line. Original message is here: https://bugs.freedesktop.org/attachment.cgi?id=115539 At the time src/glx/apple/apple_glapi.c has been developed this patch was not needed I guess so I wonder which change made the regression. Cheers Julien On 18 June 2015 at 19:23, Emil Velikov wrote: > On 18 June 2015 at 06:53, Julien Isorce wrote: > > From: Jon TURNEY > > > > _SET_DrawBuffers requires driDispatchRemapTable, > > so we need to link with libmesa for remap.c. > > libmesa requires the C++ linker. > > > > Also need to arrange to call _mesa_init_remap_table() > > to initialize the remap table. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90311 > > Signed-off-by: Jon TURNEY > > --- > > src/glx/Makefile.am | 5 - > > src/glx/apple/apple_glapi.c | 3 +++ > > 2 files changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am > > index 6e50e09..de24496 100644 > > --- a/src/glx/Makefile.am > > +++ b/src/glx/Makefile.am > > @@ -139,7 +139,10 @@ libglx_la_SOURCES += \ > > applegl_glx.c > > > > SUBDIRS += apple > > -libglx_la_LIBADD += $(builddir)/apple/libappleglx.la > > +libglx_la_LIBADD += \ > > + $(builddir)/apple/libappleglx.la \ > > + $(top_builddir)/src/mesa/libmesa.la > > +nodist_EXTRA_lib@GL_LIB@_la_SOURCES = dummy.cpp > Pulling hunks of mesa into libGL does not sounds like a reasonable > thing. There is a reason why Jon had that XXX: in the original commit > message. > > Cheers, > Emil > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: move gl_NumSamples enablement with other ARB_sample_shading vars
Hmmm actually looking at the GLSL 4.50 doc, it looks like it just has a list of all uniforms available in one place, unlike the ARB_sample_shading spec which says it should go into fs only. From ARB_sample_shading: Add the following prototype to the list of built-in uniforms accessible from a fragment shader: uniform int gl_NumSamples; From GLSL 4.50 though: As an aid to accessing OpenGL processing state, the following uniform variables are built into the OpenGL Shading Language: ... uniform int gl_NumSamples; So I guess I should leave this in generate_uniforms but just add the enable conditions around it being added? -ilia On Thu, Jun 18, 2015 at 6:57 PM, Ilia Mirkin wrote: > gl_NumSamples should only be enabled when ARB_sample_shading is enabled, > and only in the fragment shader. Move to the generate_fs_special_vars > function, next to the other ARB_sample_shading-provided variables. > > Signed-off-by: Ilia Mirkin > --- > src/glsl/builtin_variables.cpp | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp > index c52b252..d2acbac 100644 > --- a/src/glsl/builtin_variables.cpp > +++ b/src/glsl/builtin_variables.cpp > @@ -764,7 +764,6 @@ builtin_variable_generator::generate_constants() > void > builtin_variable_generator::generate_uniforms() > { > - add_uniform(int_t, "gl_NumSamples"); > add_uniform(type("gl_DepthRangeParameters"), "gl_DepthRange"); > add_uniform(array(vec4_t, VERT_ATTRIB_MAX), "gl_CurrentAttribVertMESA"); > add_uniform(array(vec4_t, VARYING_SLOT_MAX), "gl_CurrentAttribFragMESA"); > @@ -947,6 +946,7 @@ builtin_variable_generator::generate_fs_special_vars() > } > > if (state->is_version(400, 0) || state->ARB_sample_shading_enable) { > + add_uniform(int_t, "gl_NumSamples"); >add_system_value(SYSTEM_VALUE_SAMPLE_ID, int_t, "gl_SampleID"); >add_system_value(SYSTEM_VALUE_SAMPLE_POS, vec2_t, "gl_SamplePosition"); >/* From the ARB_sample_shading specification: > -- > 2.3.6 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] glsl: guard gl_NumSamples enablement on ARB_sample_shading
gl_NumSamples should only be enabled when ARB_sample_shading is enabled. Signed-off-by: Ilia Mirkin --- src/glsl/builtin_variables.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp index c52b252..a765d35 100644 --- a/src/glsl/builtin_variables.cpp +++ b/src/glsl/builtin_variables.cpp @@ -764,7 +764,8 @@ builtin_variable_generator::generate_constants() void builtin_variable_generator::generate_uniforms() { - add_uniform(int_t, "gl_NumSamples"); + if (state->is_version(400, 0) || state->ARB_sample_shading_enable) + add_uniform(int_t, "gl_NumSamples"); add_uniform(type("gl_DepthRangeParameters"), "gl_DepthRange"); add_uniform(array(vec4_t, VERT_ATTRIB_MAX), "gl_CurrentAttribVertMESA"); add_uniform(array(vec4_t, VARYING_SLOT_MAX), "gl_CurrentAttribFragMESA"); -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 90903] egl_dri2.c:dri2_load fails to load libglapi on osx
https://bugs.freedesktop.org/show_bug.cgi?id=90903 --- Comment #1 from Julien Isorce --- Created attachment 116583 --> https://bugs.freedesktop.org/attachment.cgi?id=116583&action=edit es2gears_x11 and es2_info with llvmpipe on osx Running gears with egl/gles2/x11 with llvmpipe software driver on osx. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/5] egl/dri2: load libglapi.0.dylib on osx
On 18 June 2015 at 19:29, Emil Velikov wrote: > On 18 June 2015 at 06:53, Julien Isorce wrote: > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90903 > > Signed-off-by: Julien Isorce > > --- > > src/egl/drivers/dri2/egl_dri2.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/src/egl/drivers/dri2/egl_dri2.c > b/src/egl/drivers/dri2/egl_dri2.c > > index a1cbd43..90b9648 100644 > > --- a/src/egl/drivers/dri2/egl_dri2.c > > +++ b/src/egl/drivers/dri2/egl_dri2.c > > @@ -2354,6 +2354,8 @@ dri2_load(_EGLDriver *drv) > > #ifdef HAVE_SHARED_GLAPI > > #ifdef HAVE_ANDROID_PLATFORM > > const char *libname = "libglapi.so"; > > +#elif defined(__APPLE__) > > + const char *libname = "libglapi.0.dylib"; > > #else > Is this enough to get dri modules (suspecting only swrast atm) working > with EGL ? > Well at least es2_info and es2gears_x11 are working (with both GALLIUM_DRIVER=softpipe and llvmpipe). I have attached a screenshot: https://bugs.freedesktop.org/attachment.cgi?id=116583 > > Note that currently libEGL already explicitly links against libglapi, > so imho we can just drop the whole thing. Haven't really looked but do > we use glFlush() as a workaround or it's something required by the EGL > standard ? > No idea but without the patch you get: EGL_LOG_LEVEL=debug es2_info libEGL debug: Native platform type: x11 (autodetected) libEGL debug: added egl_dri2 to module array libEGL warning: DRI2: failed to find _glapi_get_proc_address libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize Error: eglInitialize() failed Cheers Julien > > Thanks > Emil > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Add missing braces around if-statement.
Fixes a performance problem caused by commit b639ed2f. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90895 --- src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c index c0c8dfa..49f2e3e 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c @@ -339,12 +339,13 @@ is_color_fast_clear_compatible(struct brw_context *brw, mesa_format format, const union gl_color_union *color) { - if (_mesa_is_format_integer_color(format)) + if (_mesa_is_format_integer_color(format)) { if (brw->gen >= 8) { perf_debug("Integer fast clear not enabled for (%s)", _mesa_get_format_name(format)); } return false; + } for (int i = 0; i < 4; i++) { if (color->f[i] != 0.0 && color->f[i] != 1.0 && -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] egl: use unix defines on osx with clang
On 18 June 2015 at 19:33, Emil Velikov wrote: > On 18 June 2015 at 19:29, Emil Velikov wrote: > Sorry about that. Unintentionally hit send ;-\ > > > On 18 June 2015 at 06:53, Julien Isorce wrote: > >> CC egl_dri2.lo > >> include/EGL/eglplatform.h:135:2: > >> error: "Platform not recognized" > >> include/EGL/eglplatform.h:140:9: > >> error: unknown type name 'EGLNativeDisplayType' > >> typedef EGLNativeDisplayType NativeDisplayType; > >> > You should not longer see this message. Did you try building things, > with the updated eglplatform.h ? > You are right I forgot to update the new error message with current eglplatform.h. > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90249 > >> Signed-off-by: Julien Isorce > >> --- > >> include/EGL/eglplatform.h | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/include/EGL/eglplatform.h b/include/EGL/eglplatform.h > >> index 7802542..b376e64 100644 > >> --- a/include/EGL/eglplatform.h > >> +++ b/include/EGL/eglplatform.h > >> @@ -77,7 +77,7 @@ typedef HDC EGLNativeDisplayType; > >> typedef HBITMAP EGLNativePixmapType; > >> typedef HWNDEGLNativeWindowType; > >> > >> -#elif defined(__APPLE__) || defined(__WINSCW__) || > defined(__SYMBIAN32__) /* Symbian */ > >> +#elif defined(__WINSCW__) || defined(__SYMBIAN32__) /* Symbian */ > >> > The above "defined(__APPLE__))" comes from Khronos, so if it's wrong > perhaps it should be reported to them ? Afaict with current mesa this > patch is not needed, and things will just work. > I remember that with the latest eglplatform.h I had to manually include #include / #include where they are used (because not hitting their inclusion by eglplatform.h) But then build was complaining about converting int to Display*, and void* to Pixmap/Window. A naive cast could silenced the build error but then it was crashing at runtime. I'll have another try. Cheers Julien > > Cheers, > Emil > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Add missing braces around if-statement.
Oh, how silly :) Reviewed-by: Chris Forbes - Chris On Fri, Jun 19, 2015 at 11:19 AM, Matt Turner wrote: > Fixes a performance problem caused by commit b639ed2f. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90895 > --- > src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > index c0c8dfa..49f2e3e 100644 > --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > @@ -339,12 +339,13 @@ is_color_fast_clear_compatible(struct brw_context *brw, > mesa_format format, > const union gl_color_union *color) > { > - if (_mesa_is_format_integer_color(format)) > + if (_mesa_is_format_integer_color(format)) { >if (brw->gen >= 8) { > perf_debug("Integer fast clear not enabled for (%s)", > _mesa_get_format_name(format)); >} >return false; > + } > > for (int i = 0; i < 4; i++) { >if (color->f[i] != 0.0 && color->f[i] != 1.0 && > -- > 2.3.6 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Add missing braces around if-statement.
Wow... Reviewed-by: Jason Ekstrand On Thu, Jun 18, 2015 at 4:19 PM, Matt Turner wrote: > Fixes a performance problem caused by commit b639ed2f. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90895 > --- > src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > index c0c8dfa..49f2e3e 100644 > --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > @@ -339,12 +339,13 @@ is_color_fast_clear_compatible(struct brw_context *brw, > mesa_format format, > const union gl_color_union *color) > { > - if (_mesa_is_format_integer_color(format)) > + if (_mesa_is_format_integer_color(format)) { >if (brw->gen >= 8) { > perf_debug("Integer fast clear not enabled for (%s)", > _mesa_get_format_name(format)); >} >return false; > + } > > for (int i = 0; i < 4; i++) { >if (color->f[i] != 0.0 && color->f[i] != 1.0 && > -- > 2.3.6 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Add missing braces around if-statement.
On Thu, Jun 18, 2015 at 04:19:36PM -0700, Matt Turner wrote: > Fixes a performance problem caused by commit b639ed2f. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90895 Ken spotted this in review. /me hides Reviewed-by: Ben Widawsky > --- > src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > index c0c8dfa..49f2e3e 100644 > --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > @@ -339,12 +339,13 @@ is_color_fast_clear_compatible(struct brw_context *brw, > mesa_format format, > const union gl_color_union *color) > { > - if (_mesa_is_format_integer_color(format)) > + if (_mesa_is_format_integer_color(format)) { >if (brw->gen >= 8) { > perf_debug("Integer fast clear not enabled for (%s)", > _mesa_get_format_name(format)); >} >return false; > + } > > for (int i = 0; i < 4; i++) { >if (color->f[i] != 0.0 && color->f[i] != 1.0 && > -- > 2.3.6 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] New stable-branch 10.5 candidate pushed
We should also pull in Chris' 3-patch drawbuffers series if it applies: http://lists.freedesktop.org/archives/mesa-dev/2015-June/085851.html On Thu, Jun 18, 2015 at 11:30 AM, Emil Velikov wrote: > Hello list, > > The candidate for the Mesa 10.5.8 is now available. Currently we have: > - 11 queued > - 10 nominated (outstanding) > - and 0 rejected (obsolete) patches > > The present queue consist of a few nouveau and i965 patches, along with > couple libEGL hit bug fixes and a build fix for people using llvm/clang. > > Take a look at section "Mesa stable queue" for more information. > > Testing > --- > The following results are against piglit 305ecc3ac89. > > > Changes - classic i965(snb) > --- > None. > > > Changes - swrast classic > > None. > > > Changes - gallium softpipe > -- > None. > > > Changes - gallium llvmpipe (LLVM 3.6) > - > None. > > > Testing reports/general approval > > Any testing reports (or general approval of the state of the branch) > will be greatly appreciated. > > > Trivial merge conflicts > --- > commit bb00457f49177d8d43417855f843887de3148e99 > Author: Jason Ekstrand > > i965/fs: Don't let the EOT send message interfere with the MRF hack > > (cherry picked from commit 86e5afbfee5492235cab1a7be4ea49ac02be1644) > > > > The plan is to have 10.5.8 this Friday(19th of June). > > If you have any questions or comments that you would like to share > before the release, please go ahead. > > > Cheers, > Emil > > > Mesa stable queue > - > > Nominated (10) > == > > Boyan Ding (2): > egl/x11: Remove duplicate call to dri2_x11_add_configs_for_visuals > i915: Add XRGB format to intel_screen_make_configs > > Brian Paul (1): > configure: don't try to build gallium DRI drivers if --disable-dri is > set > > Ilia Mirkin (1): > glsl: add version checks to conditionals for builtin variable enablement > mesa: add GL_PROGRAM_PIPELINE support in KHR_debug calls > > Mario Kleiner(1): > nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads > > Tapani Pälli (2): > glsl: Allow dynamic sampler array indexing with GLSL ES < 3.00 > glsl: validate sampler array indexing for 'constant-index-expression' > > Tom Stellard (3): > clover: Call clBuildProgram() notification function when build > completes v2 > gallium/drivers: Add threadsafe wrappers for pipe_context and > pipe_screen > clover: Use threadsafe wrappers for pipe_screen and pipe_context > > > Queued (11) > === > > Ben Widawsky (1): > i965: Disable compaction for EOT send messages > > Boyan Ding (1): > egl/x11: Set version of swrastLoader to 2 > > Emil Velikov (1): > docs: Add sha256sums for the 10.5.7 release > > Erik Faye-Lund (1): > mesa: build xmlconfig to a separate static library > > Francisco Jerez (1): > i965: Don't compact instructions with unmapped bits. > > Ilia Mirkin (3): > nvc0/ir: fix collection of first uses for texture barrier insertion > nv50,nvc0: clamp uniform size to 64k > nvc0/ir: can't have a join on a load with an indirect source > > Jason Ekstrand (1): > i965/fs: Don't let the EOT send message interfere with the MRF hack > > Marek Olšák (1): > egl: fix setting context flags > > Roland Scheidegger (1): > draw: (trivial) fix NULL pointer dereference > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] osx: fix asm support on darwin
On 18 June 2015 at 19:46, Emil Velikov wrote: > On 18 June 2015 at 06:53, Julien Isorce wrote: > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90908 > > Signed-off-by: Julien Isorce > > --- > > configure.ac | 2 +- > > src/mesa/x86-64/xform4.S | 53 > +--- > > src/mesa/x86/assyntax.h | 2 +- > > 3 files changed, 30 insertions(+), 27 deletions(-) > > > > diff --git a/configure.ac b/configure.ac > > index ae6d83d..6d699d5 100644 > > --- a/configure.ac > > +++ b/configure.ac > > @@ -615,7 +615,7 @@ if test "x$enable_asm" = xyes; then > > ;; > > x86_64|amd64) > > case "$host_os" in > > -linux* | *freebsd* | dragonfly* | *netbsd* | openbsd*) > > +linux* | *freebsd* | dragonfly* | *netbsd* | openbsd* | darwin*) > > asm_arch=x86_64 > > ;; > > esac > > diff --git a/src/mesa/x86-64/xform4.S b/src/mesa/x86-64/xform4.S > > index c185f62..17eb7fa 100644 > > --- a/src/mesa/x86-64/xform4.S > > +++ b/src/mesa/x86-64/xform4.S > > @@ -24,14 +24,15 @@ > > > > #ifdef USE_X86_64_ASM > > > > +#include "x86/assyntax.h" > > #include "matypes.h" > > > > .text > > > > .align 16 > > -.globl _mesa_x86_64_cpuid > > -.hidden _mesa_x86_64_cpuid > > -_mesa_x86_64_cpuid: > > +GLOBL GLNAME(_mesa_x86_64_cpuid) > > +HIDDEN(_mesa_x86_64_cpuid) > > +GLNAME(_mesa_x86_64_cpuid): > > pushq %rbx > > movl(%rdi), %eax > > movl8(%rdi), %ecx > > @@ -46,9 +47,9 @@ _mesa_x86_64_cpuid: > > ret > > > > .align 16 > > -.globl _mesa_x86_64_transform_points4_general > > -.hidden _mesa_x86_64_transform_points4_general > > -_mesa_x86_64_transform_points4_general: > > +GLOBL GLNAME(_mesa_x86_64_transform_points4_general) > > +HIDDEN(_mesa_x86_64_transform_points4_general) > > +GLNAME(_mesa_x86_64_transform_points4_general): > > /* > > * rdi = dest > > * rsi = matrix > > @@ -105,8 +106,10 @@ p4_general_loop: > > p4_general_done: > > .byte 0xf3 > > ret > > - > > + > > +#if defined (__ELF__) && defined (__linux__) > > .section .rodata > > +#endif > > > > .align 16 > > p4_constants: > > @@ -122,13 +125,13 @@ p4_constants: > > > > .text > > .align 16 > > -.globl _mesa_x86_64_transform_points4_3d > > -.hidden _mesa_x86_64_transform_points4_3d > > +GLOBL GLNAME(_mesa_x86_64_transform_points4_3d) > > +HIDDEN(_mesa_x86_64_transform_points4_3d) > > /* > > * this is slower than _mesa_x86_64_transform_points4_general > > * because it ensures that the last matrix row (or is it column?) is > 0,0,0,1 > > */ > > -_mesa_x86_64_transform_points4_3d: > > +GLNAME(_mesa_x86_64_transform_points4_3d): > > > > leaq p4_constants(%rip), %rax > > > > @@ -194,9 +197,9 @@ p4_3d_done: > > > > > > .align 16 > > -.globl _mesa_x86_64_transform_points4_identity > > -.hidden _mesa_x86_64_transform_points4_identity > > -_mesa_x86_64_transform_points4_identity: > > +GLOBL GLNAME(_mesa_x86_64_transform_points4_identity) > > +HIDDEN(_mesa_x86_64_transform_points4_identitiy) > > +GLNAME(_mesa_x86_64_transform_points4_identity): > > > > movl V4F_COUNT(%rdx), %ecx /* count */ > > movzbl V4F_STRIDE(%rdx), %eax /* stride */ > > @@ -223,9 +226,9 @@ p4_identity_done: > > > > > > .align 16 > > -.globl _mesa_3dnow_transform_points4_3d_no_rot > > -.hidden _mesa_3dnow_transform_points4_3d_no_rot > > -_mesa_3dnow_transform_points4_3d_no_rot: > > +GLOBL GLNAME(_mesa_3dnow_transform_points4_3d_no_rot) > > +HIDDEN(_mesa_3dnow_transform_points4_3d_no_rot) > > +GLNAME(_mesa_3dnow_transform_points4_3d_no_rot): > > > > movl V4F_COUNT(%rdx), %ecx /* count */ > > movzbl V4F_STRIDE(%rdx), %eax /* stride */ > > @@ -288,9 +291,9 @@ p4_3d_no_rot_done: > > > > > > .align 16 > > -.globl _mesa_3dnow_transform_points4_perspective > > -.hidden _mesa_3dnow_transform_points4_perspective > > -_mesa_3dnow_transform_points4_perspective: > > +GLOBL GLNAME(_mesa_3dnow_transform_points4_perspective) > > +HIDDEN(_mesa_3dnow_transform_points4_perspective) > > +GLNAME(_mesa_3dnow_transform_points4_perspective): > > > > movl V4F_COUNT(%rdx), %ecx /* count */ > > movzbl V4F_STRIDE(%rdx), %eax /* stride */ > > @@ -355,9 +358,9 @@ p4_perspective_done: > > ret > > > > .align 16 > > -.globl _mesa_3dnow_transform_points4_2d_no_rot > > -.hidden _mesa_3dnow_transform_points4_2d_no_rot > > -_mesa_3dnow_transform_points4_2d_no_rot: > > +GLOBL GLNAME(_mesa_3dnow_transform_points4_2d_no_rot) > > +HIDDEN(_mesa_3dnow_transform_points4_2d_no_rot) > > +GLNAME(_mesa_3dnow_transform_points4_2d_no_rot): > > > > movl V4F_COUNT(%rdx), %ecx /* count */ > > movzbl V4F_STRIDE(%rdx), %eax /* stride */ > > @@ -411,9 +414,9 @@ p4_2d_no_rot_done: > > > > > > .align 16 > > -.globl _mesa_3dnow_transform_points4_2d > > -.hidden _mesa_3dnow_transform_points4_2d > > -_mesa_3dnow_transform_points4_2d: > > +GLOBL GLNAME(_mesa_3dno
Re: [Mesa-dev] [PATCH] i965: Add missing braces around if-statement.
On Thursday, June 18, 2015 04:28:25 PM Ben Widawsky wrote: > > On Thu, Jun 18, 2015 at 04:19:36PM -0700, Matt Turner wrote: > > Fixes a performance problem caused by commit b639ed2f. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90895 > > Ken spotted this in review. > /me hides > > Reviewed-by: Ben Widawsky Scratch one mystery! Thanks Matt. Reviewed-by: Kenneth Graunke > > > --- > > src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > > b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > > index c0c8dfa..49f2e3e 100644 > > --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > > +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > > @@ -339,12 +339,13 @@ is_color_fast_clear_compatible(struct brw_context > > *brw, > > mesa_format format, > > const union gl_color_union *color) > > { > > - if (_mesa_is_format_integer_color(format)) > > + if (_mesa_is_format_integer_color(format)) { > >if (brw->gen >= 8) { > > perf_debug("Integer fast clear not enabled for (%s)", > > _mesa_get_format_name(format)); > >} > >return false; > > + } > > > > for (int i = 0; i < 4; i++) { > >if (color->f[i] != 0.0 && color->f[i] != 1.0 && > signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91020] Mesa's demo / tools won't compile since EGL changes
https://bugs.freedesktop.org/show_bug.cgi?id=91020 Mike Lothian changed: What|Removed |Added CC||m...@fireburn.co.uk --- Comment #2 from Mike Lothian --- I'm afraid that's beyond me - I've tried compiling without the egl utilities and now get: -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91020] Mesa's demo / tools won't compile since EGL changes
https://bugs.freedesktop.org/show_bug.cgi?id=91020 Mike Lothian changed: What|Removed |Added Attachment #116585|Buid log without EGL|Build log without EGL description|| -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91020] Mesa's demo / tools won't compile since EGL changes
https://bugs.freedesktop.org/show_bug.cgi?id=91020 --- Comment #3 from Mike Lothian --- Created attachment 116585 --> https://bugs.freedesktop.org/attachment.cgi?id=116585&action=edit Buid log without EGL -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] glsl: guard gl_NumSamples enablement on ARB_sample_shading
On Thu, Jun 18, 2015 at 4:08 PM, Ilia Mirkin wrote: > gl_NumSamples should only be enabled when ARB_sample_shading is enabled. > > Signed-off-by: Ilia Mirkin > --- > src/glsl/builtin_variables.cpp | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp > index c52b252..a765d35 100644 > --- a/src/glsl/builtin_variables.cpp > +++ b/src/glsl/builtin_variables.cpp > @@ -764,7 +764,8 @@ builtin_variable_generator::generate_constants() > void > builtin_variable_generator::generate_uniforms() > { > - add_uniform(int_t, "gl_NumSamples"); > + if (state->is_version(400, 0) || state->ARB_sample_shading_enable) > + add_uniform(int_t, "gl_NumSamples"); > add_uniform(type("gl_DepthRangeParameters"), "gl_DepthRange"); > add_uniform(array(vec4_t, VERT_ATTRIB_MAX), "gl_CurrentAttribVertMESA"); > add_uniform(array(vec4_t, VARYING_SLOT_MAX), "gl_CurrentAttribFragMESA"); > -- > 2.3.6 > Reviewed-by: Anuj Phogat ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/4] i965/gen9: Don't use encrypted MOCS
On Wednesday, June 17, 2015 03:50:13 PM Ben Widawsky wrote: > On gen9+ MOCS is an index into a table. It is 7 bits, and AFAICT, bit 0 is for > doing encrypted reads. > > I don't recall how I decided to do this for BXT. I don't know this patch was > ever needed, since it seems nothing is broken today on SKL. Furthermore, this > patch may no longer be needed because of the ongoing changes with MOCS setup. > It > is what is being used/tested, so it's included in the series. > > The chosen values are the old values left shifted. That was also an arbitrary > choice. > > Cc: Francisco Jerez > Signed-off-by: Ben Widawsky > --- > src/mesa/drivers/dri/i965/brw_defines.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index bfcc442..5358edc 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -2495,8 +2495,8 @@ enum brw_wm_barycentric_interp_mode { > * cache settings. We still use only either write-back or write-through; and > * rely on the documented default values. > */ > -#define SKL_MOCS_WB 9 > -#define SKL_MOCS_WT 5 > +#define SKL_MOCS_WB 0x12 > +#define SKL_MOCS_WT 0xa Yeah, it looks like Kristian made these defines the indices into the table, but may have missed that the MOCS field puts that table index in [6:1] and bit 0 is something else. So shifting left by 1 seems like a good plan. Perhaps write it as #define SKL_MOCS_WB (0b000101 << 1) #define SKL_MOCS_WT (0b001001 << 1) so the index value is written like it is in the documentation, and the shift 1 indicates moving it into the right place for MOCS? Either way, Reviewed-by: Kenneth Graunke Incidentally...the WT value (index 5) appears to skip eLLC - the target cache is 01b = "LLC only". That doesn't seem desirable. We probably want index 6 instead (0b000110 << 1) which uses both LLC and eLLC. That said, we shouldn't ever be using WT in the driver - we want to use the PTE value. (krh even added a FINISHME comment to that effect.) I think a proper value for that would be: #define SKL_MOCS_PTE (0b10 << 1) (Default: 0b10, LeCC = 0x00 - use cacheability controls from page table / ... TC = LLC/eLLC allowed) We could either fix the _WT define or just delete it. > > #define MEDIA_VFE_STATE 0x7000 > /* GEN7 DW2, GEN8+ DW3 */ > signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 04/17] i965/fs: Explicitly set the exec_size on the add(32) in interpolation setup
Soon we will start using the builder to explicitly set all the execution sizes. We could make a 32-wide builder, but the builder asserts that we never grow it which is usually a reasonable assumption. Sinc this one instruction is a bit of an odd-ball, we just set the exec_size explicitly. --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 4770838..b00825e 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1357,10 +1357,11 @@ fs_visitor::emit_interpolation_setup_gen6() */ fs_reg int_pixel_xy(GRF, alloc.allocate(dispatch_width / 8), BRW_REGISTER_TYPE_UW, dispatch_width * 2); - abld.exec_all() - .ADD(int_pixel_xy, - fs_reg(stride(suboffset(g1_uw, 4), 1, 4, 0)), - fs_reg(brw_imm_v(0x11001010))); + fs_inst *add = abld.exec_all() + .ADD(int_pixel_xy, + fs_reg(stride(suboffset(g1_uw, 4), 1, 4, 0)), + fs_reg(brw_imm_v(0x11001010))); + add->exec_size = dispatch_width * 2; this->pixel_x = vgrf(glsl_type::float_type); this->pixel_y = vgrf(glsl_type::float_type); -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 02/17] i965/fs: Fix fs_inst::regs_read() for uniform pull constant loads
Previously, fs_inst::regs_read() fell back to depending on the register width for the second source. This isn't really correct since it isn't a SIMD8 value at all, but a SIMD4x2 value. This commit changes it to explicitly be always one register. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 37b6d0d..ce56657 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -763,6 +763,12 @@ fs_inst::regs_read(int arg) const return exec_size / 4; break; + case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7: + /* The second argument is a single SIMD4x2 register */ + if (arg == 1) + return 1; + break; + default: if (is_tex() && arg == 0 && src[0].file == GRF) return mlen; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 05/17] i965/fs: Set the builder group for emitting FB-write stencil/AA alpha
--- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index b00825e..8a43ec8 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1528,7 +1528,7 @@ fs_visitor::emit_single_fb_write(const fs_builder &bld, if (payload.aa_dest_stencil_reg) { sources[length] = fs_reg(GRF, alloc.allocate(1)); - bld.exec_all().annotate("FB write stencil/AA alpha") + bld.group(8, 0).exec_all().annotate("FB write stencil/AA alpha") .MOV(sources[length], fs_reg(brw_vec8_grf(payload.aa_dest_stencil_reg, 0))); length++; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 01/17] i965/fs: Use a switch statement in fs_inst::regs_read()
This makes things a little simpler, more efficient, and quite a bit more readable. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 45 ++-- 1 file changed, 23 insertions(+), 22 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 5563c5a..37b6d0d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -744,28 +744,29 @@ fs_inst::is_partial_write() const int fs_inst::regs_read(int arg) const { - if (is_tex() && arg == 0 && src[0].file == GRF) { - return mlen; - } else if (opcode == FS_OPCODE_FB_WRITE && arg == 0) { - return mlen; - } else if (opcode == SHADER_OPCODE_URB_WRITE_SIMD8 && arg == 0) { - return mlen; - } else if (opcode == SHADER_OPCODE_UNTYPED_ATOMIC && arg == 0) { - return mlen; - } else if (opcode == SHADER_OPCODE_UNTYPED_SURFACE_READ && arg == 0) { - return mlen; - } else if (opcode == SHADER_OPCODE_UNTYPED_SURFACE_WRITE && arg == 0) { - return mlen; - } else if (opcode == SHADER_OPCODE_TYPED_ATOMIC && arg == 0) { - return mlen; - } else if (opcode == SHADER_OPCODE_TYPED_SURFACE_READ && arg == 0) { - return mlen; - } else if (opcode == SHADER_OPCODE_TYPED_SURFACE_WRITE && arg == 0) { - return mlen; - } else if (opcode == FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET && arg == 0) { - return mlen; - } else if (opcode == FS_OPCODE_LINTERP && arg == 0) { - return exec_size / 4; + switch (opcode) { + case FS_OPCODE_FB_WRITE: + case SHADER_OPCODE_URB_WRITE_SIMD8: + case SHADER_OPCODE_UNTYPED_ATOMIC: + case SHADER_OPCODE_UNTYPED_SURFACE_READ: + case SHADER_OPCODE_UNTYPED_SURFACE_WRITE: + case SHADER_OPCODE_TYPED_ATOMIC: + case SHADER_OPCODE_TYPED_SURFACE_READ: + case SHADER_OPCODE_TYPED_SURFACE_WRITE: + case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET: + if (arg == 0) + return mlen; + break; + + case FS_OPCODE_LINTERP: + if (arg == 0) + return exec_size / 4; + break; + + default: + if (is_tex() && arg == 0 && src[0].file == GRF) + return mlen; + break; } switch (src[arg].file) { -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/17] i965/fs: Report the right value in fs_inst::regs_read() for PIXEL_X/Y
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index ce56657..4f98d63 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -769,6 +769,12 @@ fs_inst::regs_read(int arg) const return 1; break; + case FS_OPCODE_PIXEL_X: + case FS_OPCODE_PIXEL_Y: + if (arg == 0) + return 2; + break; + default: if (is_tex() && arg == 0 && src[0].file == GRF) return mlen; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/17] i965/fs: Remove fs_inst constructors that don't take an explicit exec_size
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 30 ++ src/mesa/drivers/dri/i965/brw_fs_builder.h | 2 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 -- src/mesa/drivers/dri/i965/brw_ir_fs.h | 9 + 4 files changed, 8 insertions(+), 39 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 740b51d..61235d7 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -126,9 +126,9 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size) init(opcode, exec_size, reg_undef, NULL, 0); } -fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst) +fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst) { - init(opcode, 0, dst, NULL, 0); + init(opcode, exec_size, dst, NULL, 0); } fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst, @@ -138,12 +138,6 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst, init(opcode, exec_size, dst, src, 1); } -fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0) -{ - const fs_reg src[1] = { src0 }; - init(opcode, 0, dst, src, 1); -} - fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst, const fs_reg &src0, const fs_reg &src1) { @@ -151,13 +145,6 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst, init(opcode, exec_size, dst, src, 2); } -fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0, - const fs_reg &src1) -{ - const fs_reg src[2] = { src0, src1 }; - init(opcode, 0, dst, src, 2); -} - fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst, const fs_reg &src0, const fs_reg &src1, const fs_reg &src2) { @@ -165,19 +152,6 @@ fs_inst::fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst, init(opcode, exec_size, dst, src, 3); } -fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0, - const fs_reg &src1, const fs_reg &src2) -{ - const fs_reg src[3] = { src0, src1, src2 }; - init(opcode, 0, dst, src, 3); -} - -fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, - const fs_reg src[], unsigned sources) -{ - init(opcode, 0, dst, src, sources); -} - fs_inst::fs_inst(enum opcode opcode, uint8_t exec_width, const fs_reg &dst, const fs_reg src[], unsigned sources) { diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h b/src/mesa/drivers/dri/i965/brw_fs_builder.h index 594e252..74fd2c9 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_builder.h +++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h @@ -281,7 +281,7 @@ namespace brw { instruction * emit(enum opcode opcode, const dst_reg &dst) const { - return emit(instruction(opcode, dst)); + return emit(instruction(opcode, dst.width, dst)); } /** diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 0ede634..9941116 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -109,7 +109,8 @@ fs_visitor::nir_setup_inputs(nir_shader *shader) if (var->data.location == VARYING_SLOT_POS) { reg = *emit_fragcoord_interpolation(var->data.pixel_center_integer, var->data.origin_upper_left); -emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, input, reg), 0xF); +emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(), + input, reg), 0xF); } else { emit_general_interpolation(input, var->name, var->type, (glsl_interp_qualifier) var->data.interpolation, @@ -1743,7 +1744,8 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr) fs_reg dest = get_nir_dest(instr->dest); dest.type = this->result.type; unsigned num_components = nir_tex_instr_dest_size(instr); - emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, dest, this->result), + emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(), + dest, this->result), (1 << num_components) - 1); } diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h index 10033ca..4e62efa 100644 --- a/src/mesa/drivers/dri/i965/brw_ir_fs.h +++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h @@ -159,20 +159,13 @@ public: fs_inst(); fs_inst(enum opcode opcode, uint8_t exec_size); - fs_inst(enum opcode opcode, const fs_reg &dst); + fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst); fs_inst(enum opcode opcode, uint8_t exec_size, const fs_reg &dst, const fs_reg &src0); - fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0);
[Mesa-dev] [PATCH 08/17] i965/fs: Make better use of the builder in shader_time
Previously, we were just depending on register widths to ensure that various things were exec_size of 1 etc. Now, we do so explicitly using the builder. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index c13ac7d..740b51d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -557,7 +557,7 @@ fs_visitor::get_timestamp(const fs_builder &bld) /* We want to read the 3 fields we care about even if it's not enabled in * the dispatch. */ - bld.exec_all().MOV(dst, ts); + bld.group(4, 0).exec_all().MOV(dst, ts); /* The caller wants the low 32 bits of the timestamp. Since it's running * at the GPU clock rate of ~1.2ghz, it will roll over every ~3 seconds, @@ -637,17 +637,19 @@ fs_visitor::emit_shader_time_end() start.negate = true; fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 1); diff.set_smear(0); - ibld.ADD(diff, start, shader_end_time); + + const fs_builder cbld = ibld.group(1, 0); + cbld.group(1, 0).ADD(diff, start, shader_end_time); /* If there were no instructions between the two timestamp gets, the diff * is 2 cycles. Remove that overhead, so I can forget about that when * trying to determine the time taken for single instructions. */ - ibld.ADD(diff, diff, fs_reg(-2u)); - SHADER_TIME_ADD(ibld, type, diff); - SHADER_TIME_ADD(ibld, written_type, fs_reg(1u)); + cbld.ADD(diff, diff, fs_reg(-2u)); + SHADER_TIME_ADD(cbld, type, diff); + SHADER_TIME_ADD(cbld, written_type, fs_reg(1u)); ibld.emit(BRW_OPCODE_ELSE); - SHADER_TIME_ADD(ibld, reset_type, fs_reg(1u)); + SHADER_TIME_ADD(cbld, reset_type, fs_reg(1u)); ibld.emit(BRW_OPCODE_ENDIF); } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 12/17] i965/fs: Remove exec_size guessing from fs_inst::init()
Now that all of the non-explicit constructors are gone, we don't need to guess anymore. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 22 -- 1 file changed, 22 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index cff27e7..d9b7f75 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -68,28 +68,6 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const fs_reg &dst, assert(dst.file != IMM && dst.file != UNIFORM); - /* If exec_size == 0, try to guess it from the registers. Since all -* manner of things may use hardware registers, we first try to guess -* based on GRF registers. If this fails, we will go ahead and take the -* width from the destination register. -*/ - if (this->exec_size == 0) { - if (dst.file == GRF) { - this->exec_size = dst.width; - } else { - for (unsigned i = 0; i < sources; ++i) { -if (src[i].file != GRF && src[i].file != ATTR) - continue; - -if (this->exec_size <= 1) - this->exec_size = src[i].width; -assert(src[i].width == 1 || src[i].width == this->exec_size); - } - } - - if (this->exec_size == 0 && dst.file != BAD_FILE) - this->exec_size = dst.width; - } assert(this->exec_size != 0); this->conditional_mod = BRW_CONDITIONAL_NONE; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 14/17] i965/fs_builder: Use dispatch_width instead of reg.width for offset and half
--- src/mesa/drivers/dri/i965/brw_fs_builder.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h b/src/mesa/drivers/dri/i965/brw_fs_builder.h index 7d3c8ab..58519d7 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_builder.h +++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h @@ -161,7 +161,7 @@ namespace brw { case MRF: case ATTR: return byte_offset(reg, - delta * MAX2(reg.width * reg.stride, 1) * + delta * dispatch_width() * reg.stride * type_sz(reg.type)); case UNIFORM: reg.reg_offset += delta; @@ -185,9 +185,9 @@ namespace brw { case GRF: case MRF: -assert(reg.width == 16); -reg.width = 8; -return horiz_offset(reg, 8 * idx); +assert(dispatch_width() == 16); +reg.width = dispatch_width() / 2; +return horiz_offset(reg, (dispatch_width() / 2) * idx); case ATTR: case HW_REG: -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 15/17] i965/fs: Use exec_size instead of dst.width for computing component size
There are a variety of places where we use dst.width / 8 to compute the size of a single logical channel. Instead, we should be using exec_size. --- src/mesa/drivers/dri/i965/brw_fs.cpp| 6 +++--- src/mesa/drivers/dri/i965/brw_fs_cse.cpp| 2 +- src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp | 2 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp| 2 +- src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 4 ++-- 5 files changed, 8 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index b889432..b30463a 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -2317,12 +2317,12 @@ fs_visitor::opt_register_renaming() if (depth == 0 && inst->dst.file == GRF && - alloc.sizes[inst->dst.reg] == inst->dst.width / 8 && + alloc.sizes[inst->dst.reg] == inst->exec_size / 8 && !inst->is_partial_write()) { if (remap[dst] == -1) { remap[dst] = dst; } else { -remap[dst] = alloc.allocate(inst->dst.width / 8); +remap[dst] = alloc.allocate(inst->exec_size / 8); inst->dst.reg = remap[dst]; progress = true; } @@ -2453,7 +2453,7 @@ fs_visitor::compute_to_mrf() /* Things returning more than one register would need us to * understand coalescing out more than one MOV at a time. */ -if (scan_inst->regs_written > scan_inst->dst.width / 8) +if (scan_inst->regs_written > scan_inst->exec_size / 8) break; /* SEND instructions can't have MRF as a destination. */ diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp index 5ea66c5..55ed352 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp @@ -179,7 +179,7 @@ static void create_copy_instr(const fs_builder &bld, fs_inst *inst, fs_reg src, bool negate) { int written = inst->regs_written; - int dst_width = inst->dst.width / 8; + int dst_width = inst->exec_size / 8; const fs_builder ubld = bld.group(inst->exec_size, inst->force_sechalf) .exec_all(inst->force_writemask_all); fs_inst *copy; diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp index 2ad7079..149c0f0 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp @@ -196,7 +196,7 @@ fs_visitor::register_coalesce() continue; } reg_to_offset[offset] = inst->dst.reg_offset; - if (inst->src[0].width == 16) + if (inst->exec_size == 16) reg_to_offset[offset + 1] = inst->dst.reg_offset + 1; mov[offset] = inst; channels_remaining -= inst->regs_written; diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 68cd454..9291d39 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -912,7 +912,7 @@ fs_visitor::emit_texture(ir_texture_opcode op, bld.emit(SHADER_OPCODE_INT_QUOTIENT, fixed_depth, depth, fs_reg(6)); fs_reg *fixed_payload = ralloc_array(mem_ctx, fs_reg, inst->regs_written); - int components = inst->regs_written / (dst.width / 8); + int components = inst->regs_written / (inst->exec_size / 8); for (int i = 0; i < components; i++) { if (i == 2) { fixed_payload[i] = fixed_depth; diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp index ee0add5..b49961f 100644 --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp @@ -1314,8 +1314,8 @@ fs_instruction_scheduler::choose_instruction_to_schedule() * single-result send is probably actually reducing register * pressure. */ - if (inst->regs_written <= inst->dst.width / 8 && - chosen_inst->regs_written > chosen_inst->dst.width / 8) { + if (inst->regs_written <= inst->exec_size / 8 && + chosen_inst->regs_written > chosen_inst->exec_size / 8) { chosen = n; continue; } else if (inst->regs_written > chosen_inst->regs_written) { -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 10/17] i965/fs: Use exec_size for determining regs read/written and partial writes
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 61235d7..cff27e7 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -101,7 +101,7 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const fs_reg &dst, case MRF: case ATTR: this->regs_written = - DIV_ROUND_UP(MAX2(dst.width * dst.stride, 1) * type_sz(dst.type), 32); + DIV_ROUND_UP(MAX2(exec_size * dst.stride, 1) * type_sz(dst.type), 32); break; case BAD_FILE: this->regs_written = 0; @@ -718,7 +718,7 @@ bool fs_inst::is_partial_write() const { return ((this->predicate && this->opcode != BRW_OPCODE_SEL) || - (this->dst.width * type_sz(this->dst.type)) < 32 || + (this->exec_size * type_sz(this->dst.type)) < 32 || !this->dst.is_contiguous()); } @@ -772,8 +772,8 @@ fs_inst::regs_read(int arg) const if (src[arg].stride == 0) { return 1; } else { - int size = src[arg].width * src[arg].stride * type_sz(src[arg].type); - return (size + 31) / 32; + int size = this->exec_size * src[arg].stride * type_sz(src[arg].type); + return DIV_ROUND_UP(size, 32); } case MRF: unreachable("MRF registers are not allowed as sources"); -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 13/17] i965/fs: Use the builder dispatch width instead of dst.width for pull constants
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index d9b7f75..b889432 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -188,7 +188,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld, bld.ADD(vec4_offset, varying_offset, fs_reg(const_offset & ~3)); int scale = 1; - if (devinfo->gen == 4 && dst.width == 8) { + if (devinfo->gen == 4 && bld.dispatch_width() == 8) { /* Pre-gen5, we can either use a SIMD8 message that requires (header, * u, v, r) as parameters, or we can just use the SIMD16 message * consisting of (header, u). We choose the second, at the cost of a @@ -204,9 +204,9 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld, op = FS_OPCODE_VARYING_PULL_CONSTANT_LOAD; assert(dst.width % 8 == 0); - int regs_written = 4 * (dst.width / 8) * scale; + int regs_written = 4 * (bld.dispatch_width() / 8) * scale; fs_reg vec4_result = fs_reg(GRF, alloc.allocate(regs_written), - dst.type, dst.width); + dst.type, bld.dispatch_width()); fs_inst *inst = bld.emit(op, vec4_result, surf_index, vec4_offset); inst->regs_written = regs_written; @@ -216,7 +216,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld, if (devinfo->gen == 4) inst->mlen = 3; else - inst->mlen = 1 + dispatch_width / 8; + inst->mlen = 1 + bld.dispatch_width() / 8; } bld.MOV(dst, bld.offset(vec4_result, (const_offset & 3) * scale)); -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 07/17] i965/fs: Move offset() and half() to the fs_builder
We want to move these into the builder so that they know the current builder's dispatch width. This will be needed by a later commit. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 52 ++ src/mesa/drivers/dri/i965/brw_fs_builder.h | 46 + src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 2 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 60 +-- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 149 ++- src/mesa/drivers/dri/i965/brw_ir_fs.h| 51 - 6 files changed, 182 insertions(+), 178 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 4f98d63..c13ac7d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -267,7 +267,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld, inst->mlen = 1 + dispatch_width / 8; } - bld.MOV(dst, offset(vec4_result, (const_offset & 3) * scale)); + bld.MOV(dst, bld.offset(vec4_result, (const_offset & 3) * scale)); } /** @@ -361,7 +361,12 @@ fs_inst::is_copy_payload(const brw::simple_allocator &grf_alloc) const reg.width = this->src[i].width; if (!this->src[i].equals(reg)) return false; - reg = ::offset(reg, 1); + + if (i < this->header_size) { + reg.reg_offset += 1; + } else { + reg.reg_offset += this->exec_size / 8; + } } return true; @@ -963,7 +968,7 @@ fs_visitor::emit_fragcoord_interpolation(bool pixel_center_integer, } else { bld.ADD(wpos, this->pixel_x, fs_reg(0.5f)); } - wpos = offset(wpos, 1); + wpos = bld.offset(wpos, 1); /* gl_FragCoord.y */ if (!flip && pixel_center_integer) { @@ -979,7 +984,7 @@ fs_visitor::emit_fragcoord_interpolation(bool pixel_center_integer, bld.ADD(wpos, pixel_y, fs_reg(offset)); } - wpos = offset(wpos, 1); + wpos = bld.offset(wpos, 1); /* gl_FragCoord.z */ if (devinfo->gen >= 6) { @@ -989,7 +994,7 @@ fs_visitor::emit_fragcoord_interpolation(bool pixel_center_integer, this->delta_xy[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC], interp_reg(VARYING_SLOT_POS, 2)); } - wpos = offset(wpos, 1); + wpos = bld.offset(wpos, 1); /* gl_FragCoord.w: Already set up in emit_interpolation */ bld.MOV(wpos, this->wpos_w); @@ -1072,7 +1077,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, const char *name, /* If there's no incoming setup data for this slot, don't * emit interpolation for it. */ - attr = offset(attr, type->vector_elements); + attr = bld.offset(attr, type->vector_elements); location++; continue; } @@ -1087,7 +1092,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, const char *name, interp = suboffset(interp, 3); interp.type = attr.type; bld.emit(FS_OPCODE_CINTERP, attr, fs_reg(interp)); - attr = offset(attr, 1); + attr = bld.offset(attr, 1); } } else { /* Smooth/noperspective interpolation case. */ @@ -1125,7 +1130,7 @@ fs_visitor::emit_general_interpolation(fs_reg attr, const char *name, if (devinfo->gen < 6 && interpolation_mode == INTERP_QUALIFIER_SMOOTH) { bld.MUL(attr, attr, this->pixel_w); } - attr = offset(attr, 1); + attr = bld.offset(attr, 1); } } @@ -1227,19 +1232,19 @@ fs_visitor::emit_samplepos_setup() if (dispatch_width == 8) { abld.MOV(int_sample_x, fs_reg(sample_pos_reg)); } else { - abld.half(0).MOV(half(int_sample_x, 0), fs_reg(sample_pos_reg)); - abld.half(1).MOV(half(int_sample_x, 1), + abld.half(0).MOV(abld.half(int_sample_x, 0), fs_reg(sample_pos_reg)); + abld.half(1).MOV(abld.half(int_sample_x, 1), fs_reg(suboffset(sample_pos_reg, 16))); } /* Compute gl_SamplePosition.x */ compute_sample_position(pos, int_sample_x); - pos = offset(pos, 1); + pos = abld.offset(pos, 1); if (dispatch_width == 8) { abld.MOV(int_sample_y, fs_reg(suboffset(sample_pos_reg, 1))); } else { - abld.half(0).MOV(half(int_sample_y, 0), + abld.half(0).MOV(abld.half(int_sample_y, 0), fs_reg(suboffset(sample_pos_reg, 1))); - abld.half(1).MOV(half(int_sample_y, 1), + abld.half(1).MOV(abld.half(int_sample_y, 1), fs_reg(suboffset(sample_pos_reg, 17))); } /* Compute gl_SamplePosition.y */ @@ -3018,10 +3023,6 @@ fs_visitor::lower_load_payload() assert(inst->dst.file == MRF || inst->dst.file == GRF); assert(inst->saturate == false); - - const fs_builder ibld = bld.group(inst->exec_size, inst->force_sechalf) - .exec_all(inst->force_writemask_all) -
[Mesa-dev] [PATCH 11/17] i965/fs_builder: Use the dispatch width for setting exec sizes
Previously we used dst.width but the two *should* be the same. --- src/mesa/drivers/dri/i965/brw_fs_builder.h | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h b/src/mesa/drivers/dri/i965/brw_fs_builder.h index 74fd2c9..7d3c8ab 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_builder.h +++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h @@ -281,7 +281,7 @@ namespace brw { instruction * emit(enum opcode opcode, const dst_reg &dst) const { - return emit(instruction(opcode, dst.width, dst)); + return emit(instruction(opcode, dispatch_width(), dst)); } /** @@ -299,11 +299,11 @@ namespace brw { case SHADER_OPCODE_SIN: case SHADER_OPCODE_COS: return fix_math_instruction( - emit(instruction(opcode, dst.width, dst, + emit(instruction(opcode, dispatch_width(), dst, fix_math_operand(src0; default: -return emit(instruction(opcode, dst.width, dst, src0)); +return emit(instruction(opcode, dispatch_width(), dst, src0)); } } @@ -319,12 +319,12 @@ namespace brw { case SHADER_OPCODE_INT_QUOTIENT: case SHADER_OPCODE_INT_REMAINDER: return fix_math_instruction( - emit(instruction(opcode, dst.width, dst, + emit(instruction(opcode, dispatch_width(), dst, fix_math_operand(src0), fix_math_operand(src1; default: -return emit(instruction(opcode, dst.width, dst, src0, src1)); +return emit(instruction(opcode, dispatch_width(), dst, src0, src1)); } } @@ -341,13 +341,14 @@ namespace brw { case BRW_OPCODE_BFI2: case BRW_OPCODE_MAD: case BRW_OPCODE_LRP: -return emit(instruction(opcode, dst.width, dst, +return emit(instruction(opcode, dispatch_width(), dst, fix_3src_operand(src0), fix_3src_operand(src1), fix_3src_operand(src2))); default: -return emit(instruction(opcode, dst.width, dst, src0, src1, src2)); +return emit(instruction(opcode, dispatch_width(), dst, +src0, src1, src2)); } } @@ -563,7 +564,8 @@ namespace brw { { assert(dst.width % 8 == 0); instruction *inst = emit(instruction(SHADER_OPCODE_LOAD_PAYLOAD, - dst.width, dst, src, sources)); + dispatch_width(), dst, + src, sources)); inst->header_size = header_size; for (unsigned i = 0; i < header_size; i++) @@ -574,7 +576,7 @@ namespace brw { for (unsigned i = header_size; i < sources; ++i) assert(src[i].file != GRF || src[i].width == dst.width); - inst->regs_written += (sources - header_size) * (dst.width / 8); + inst->regs_written += (sources - header_size) * (dispatch_width() / 8); return inst; } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/17] i965/blorp: Explicitly set execution sizes for new'd instructions
This doesn't affect instructions allocated using the builder. --- src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp index c1b7609..f655a0c 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp @@ -72,7 +72,7 @@ brw_blorp_eu_emitter::emit_kill_if_outside_rect(const struct brw_reg &x, emit_cmp(BRW_CONDITIONAL_L, x, dst_x1)->predicate = BRW_PREDICATE_NORMAL; emit_cmp(BRW_CONDITIONAL_L, y, dst_y1)->predicate = BRW_PREDICATE_NORMAL; - fs_inst *inst = new (mem_ctx) fs_inst(BRW_OPCODE_AND, g1, f0, g1); + fs_inst *inst = new (mem_ctx) fs_inst(BRW_OPCODE_AND, 16, g1, f0, g1); inst->force_writemask_all = true; insts.push_tail(inst); } @@ -83,7 +83,7 @@ brw_blorp_eu_emitter::emit_texture_lookup(const struct brw_reg &dst, unsigned base_mrf, unsigned msg_length) { - fs_inst *inst = new (mem_ctx) fs_inst(op, dst, brw_message_reg(base_mrf), + fs_inst *inst = new (mem_ctx) fs_inst(op, 16, dst, brw_message_reg(base_mrf), fs_reg(0u)); inst->base_mrf = base_mrf; @@ -118,7 +118,8 @@ brw_blorp_eu_emitter::emit_combine(enum opcode combine_opcode, { assert(combine_opcode == BRW_OPCODE_ADD || combine_opcode == BRW_OPCODE_AVG); - insts.push_tail(new (mem_ctx) fs_inst(combine_opcode, dst, src_1, src_2)); + insts.push_tail(new (mem_ctx) fs_inst(combine_opcode, 16, dst, + src_1, src_2)); } fs_inst * @@ -126,7 +127,7 @@ brw_blorp_eu_emitter::emit_cmp(enum brw_conditional_mod op, const struct brw_reg &x, const struct brw_reg &y) { - fs_inst *cmp = new (mem_ctx) fs_inst(BRW_OPCODE_CMP, + fs_inst *cmp = new (mem_ctx) fs_inst(BRW_OPCODE_CMP, 16, vec16(brw_null_reg()), x, y); cmp->conditional_mod = op; insts.push_tail(cmp); -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 17/17] i965/fs: Remove the width field from fs_reg
As of now, the width field is no longer used for anything. The width field "seemed like a good idea at the time" but is actually entirely redundant with the instruction's execution size. Initially, it gave us the ability to easily set the instructions execution size based entirely on register widths. With the builder, we can easiliy set the sizes explicitly and the width field doesn't have as much purpose. At this point, it's just redundant information that can get out of sync so it really needs to go. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 62 -- src/mesa/drivers/dri/i965/brw_fs_builder.h | 21 ++-- .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 -- src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 6 +-- src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 4 +- .../drivers/dri/i965/brw_fs_register_coalesce.cpp | 1 - src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26 - src/mesa/drivers/dri/i965/brw_ir_fs.h | 13 + 8 files changed, 30 insertions(+), 107 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index b30463a..3be81ca 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -203,10 +203,8 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld, else op = FS_OPCODE_VARYING_PULL_CONSTANT_LOAD; - assert(dst.width % 8 == 0); int regs_written = 4 * (bld.dispatch_width() / 8) * scale; - fs_reg vec4_result = fs_reg(GRF, alloc.allocate(regs_written), - dst.type, bld.dispatch_width()); + fs_reg vec4_result = fs_reg(GRF, alloc.allocate(regs_written), dst.type); fs_inst *inst = bld.emit(op, vec4_result, surf_index, vec4_offset); inst->regs_written = regs_written; @@ -310,7 +308,6 @@ fs_inst::is_copy_payload(const brw::simple_allocator &grf_alloc) const for (int i = 0; i < this->sources; i++) { reg.type = this->src[i].type; - reg.width = this->src[i].width; if (!this->src[i].equals(reg)) return false; @@ -366,7 +363,6 @@ fs_reg::fs_reg(float f) this->file = IMM; this->type = BRW_REGISTER_TYPE_F; this->fixed_hw_reg.dw1.f = f; - this->width = 1; } /** Immediate value constructor. */ @@ -376,7 +372,6 @@ fs_reg::fs_reg(int32_t i) this->file = IMM; this->type = BRW_REGISTER_TYPE_D; this->fixed_hw_reg.dw1.d = i; - this->width = 1; } /** Immediate value constructor. */ @@ -386,7 +381,6 @@ fs_reg::fs_reg(uint32_t u) this->file = IMM; this->type = BRW_REGISTER_TYPE_UD; this->fixed_hw_reg.dw1.ud = u; - this->width = 1; } /** Vector float immediate value constructor. */ @@ -417,7 +411,6 @@ fs_reg::fs_reg(struct brw_reg fixed_hw_reg) this->file = HW_REG; this->fixed_hw_reg = fixed_hw_reg; this->type = fixed_hw_reg.type; - this->width = 1 << fixed_hw_reg.width; } bool @@ -432,7 +425,6 @@ fs_reg::equals(const fs_reg &r) const abs == r.abs && !reladdr && !r.reladdr && memcmp(&fixed_hw_reg, &r.fixed_hw_reg, sizeof(fixed_hw_reg)) == 0 && - width == r.width && stride == r.stride); } @@ -504,7 +496,7 @@ fs_visitor::get_timestamp(const fs_builder &bld) 0), BRW_REGISTER_TYPE_UD)); - fs_reg dst = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 4); + fs_reg dst = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD); /* We want to read the 3 fields we care about even if it's not enabled in * the dispatch. @@ -587,7 +579,7 @@ fs_visitor::emit_shader_time_end() fs_reg start = shader_start_time; start.negate = true; - fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 1); + fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD); diff.set_smear(0); const fs_builder cbld = ibld.group(1, 0); @@ -846,7 +838,7 @@ fs_visitor::vgrf(const glsl_type *const type) { int reg_width = dispatch_width / 8; return fs_reg(GRF, alloc.allocate(type_size(type) * reg_width), - brw_type_for_base_type(type), dispatch_width); + brw_type_for_base_type(type)); } /** Fixed HW reg constructor. */ @@ -856,14 +848,6 @@ fs_reg::fs_reg(enum register_file file, int reg) this->file = file; this->reg = reg; this->type = BRW_REGISTER_TYPE_F; - - switch (file) { - case UNIFORM: - this->width = 1; - break; - default: - this->width = 8; - } } /** Fixed HW reg constructor. */ @@ -873,25 +857,6 @@ fs_reg::fs_reg(enum register_file file, int reg, enum brw_reg_type type) this->file = file; this->reg = reg; this->type = type; - - switch (file) { - case UNIFORM: - this->width = 1; - break; - default: - this->width = 8; - } -} - -/** Fixed HW reg constructor. */ -fs_reg::fs_reg(enum registe
[Mesa-dev] [PATCH 16/17] i965/fs_generator: Use inst->exec_size for determining hardware reg widths
--- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 8eb3ace..2b66acf 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -48,7 +48,7 @@ static uint32_t brw_file_from_reg(fs_reg *reg) } static struct brw_reg -brw_reg_from_fs_reg(fs_reg *reg) +brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg) { struct brw_reg brw_reg; @@ -57,10 +57,10 @@ brw_reg_from_fs_reg(fs_reg *reg) case MRF: if (reg->stride == 0) { brw_reg = brw_vec1_reg(brw_file_from_reg(reg), reg->reg, 0); - } else if (reg->width < 8) { + } else if (inst->exec_size < 8) { brw_reg = brw_vec8_reg(brw_file_from_reg(reg), reg->reg, 0); - brw_reg = stride(brw_reg, reg->width * reg->stride, - reg->width, reg->stride); + brw_reg = stride(brw_reg, inst->exec_size * reg->stride, + inst->exec_size, reg->stride); } else { /* From the Haswell PRM: * @@ -413,7 +413,7 @@ fs_generator::generate_blorp_fb_write(fs_inst *inst) brw_fb_WRITE(p, 16 /* dispatch_width */, brw_message_reg(inst->base_mrf), -brw_reg_from_fs_reg(&inst->src[0]), +brw_reg_from_fs_reg(inst, &inst->src[0]), BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE, inst->target, inst->mlen, @@ -1562,7 +1562,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width) annotate(p->devinfo, &annotation, cfg, inst, p->next_insn_offset); for (unsigned int i = 0; i < inst->sources; i++) { -src[i] = brw_reg_from_fs_reg(&inst->src[i]); +src[i] = brw_reg_from_fs_reg(inst, &inst->src[i]); /* The accumulator result appears to get used for the * conditional modifier generation. When negating a UD @@ -1574,7 +1574,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width) inst->src[i].type != BRW_REGISTER_TYPE_UD || !inst->src[i].negate); } - dst = brw_reg_from_fs_reg(&inst->dst); + dst = brw_reg_from_fs_reg(inst, &inst->dst); brw_set_default_predicate_control(p, inst->predicate); brw_set_default_predicate_inverse(p, inst->predicate_inverse); -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/gen9: Implement Push Constant Buffer workaround
On Wed, Jun 3, 2015 at 9:35 PM, Ben Widawsky wrote: > This implements a workaround (exact excerpt as a comment in the code). The > docs > specify [clearly, after you struggle for a while] that the offset isn't > relative > to state base. This actually makes sense. > > Buffer #0 is meant to be used for normal uniforms. > Buffer #1 is typically used for gather constants when using RS. > Buffer #1-#3 could be used to push a bunch of UBO data which would just be > somewhere in memory, and not relative to the dynamic state. > > NOTE: I've moved away from the ternary operator for the new gen9 conditions. > Admittedly it's probably not great to do this, but I really want to fix this > all > up in the subsequent patch and doing it here makes that diff a lot nicer. I > want > to split out the gen8/9 code to make the function a bit more readable, but to > keep this easily cherry-pickable I am doing this fix first. If we decide not > to > merge the cleanup patch then I can revisit this. > > Anuj ran this on his SKL and said there were no fixes on regressions. There is > some hope it fixes BXT issues. > > Cc: Imre Deak > Cc: Neil Roberts > Cc: Anuj Phogat > Signed-off-by: Ben Widawsky > --- > src/mesa/drivers/dri/i965/gen7_vs_state.c | 48 > ++- > 1 file changed, 41 insertions(+), 7 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/gen7_vs_state.c > b/src/mesa/drivers/dri/i965/gen7_vs_state.c > index 278b3ec..4b17d06 100644 > --- a/src/mesa/drivers/dri/i965/gen7_vs_state.c > +++ b/src/mesa/drivers/dri/i965/gen7_vs_state.c > @@ -43,18 +43,52 @@ gen7_upload_constant_state(struct brw_context *brw, > int dwords = brw->gen >= 8 ? 11 : 7; > BEGIN_BATCH(dwords); > OUT_BATCH(opcode << 16 | (dwords - 2)); > - OUT_BATCH(active ? stage_state->push_const_size : 0); > - OUT_BATCH(0); > + > + /* Workaround for SKL+ (we use option #2 until we have a need for more > +* constant buffers). This comes from the documentation for > 3DSTATE_CONSTANT_* > +* > +* The driver must ensure The following case does not occur without a > flush > +* to the 3D engine: 3DSTATE_CONSTANT_* with buffer 3 read length equal to > +* zero committed followed by a 3DSTATE_CONSTANT_* with buffer 0 read > length > +* not equal to zero committed. Possible ways to avoid this condition > +* include: > +* 1. always force buffer 3 to have a non zero read length > +* 2. always force buffer 0 to a zero read length > +*/ > + if (brw->gen >= 9 && active) { > + OUT_BATCH(0); > + OUT_BATCH(stage_state->push_const_size); > + } else { > + OUT_BATCH(active ? stage_state->push_const_size : 0); > + OUT_BATCH(0); > + } > /* Pointer to the constant buffer. Covered by the set of state flags > * from gen6_prepare_wm_contants > */ > - OUT_BATCH(active ? (stage_state->push_const_offset | mocs) : 0); > - OUT_BATCH(0); > - OUT_BATCH(0); > - OUT_BATCH(0); > - if (brw->gen >= 8) { > + if (brw->gen >= 9 && active) { > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + /* XXX: When using buffers other than 0, you need to specify the > + * graphics virtual address regardless of INSPM/debug bits INSTPM > + */ > + OUT_RELOC64(brw->batch.bo, I915_GEM_DOMAIN_RENDER, 0, > + stage_state->push_const_offset); >OUT_BATCH(0); >OUT_BATCH(0); > + } else if (brw->gen>= 8) { > + OUT_BATCH(active ? (stage_state->push_const_offset | mocs) : 0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + } else { > + OUT_BATCH(active ? (stage_state->push_const_offset | mocs) : 0); > + OUT_BATCH(0); >OUT_BATCH(0); >OUT_BATCH(0); > } > -- > 2.4.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev Verified with the spec. LGTM. Reviewed-by: Anuj Phogat ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa_tags: dri/common no longer exists
--- scripts/tags/mesa_tags.sh | 4 1 file changed, 4 deletions(-) diff --git a/scripts/tags/mesa_tags.sh b/scripts/tags/mesa_tags.sh index 4404b92..c8e2098 100755 --- a/scripts/tags/mesa_tags.sh +++ b/scripts/tags/mesa_tags.sh @@ -3,13 +3,9 @@ rm cscope.* rm tags git ls-files src/mesa/drivers/dri/i965 >| cscope.files -git ls-files src/mesa/drivers/dri/common >> cscope.files git ls-files src/mesa/main >> cscope.files git ls-files include/GL >> cscope.files git ls-files src/util >> cscope.files ctags -L cscope.files --langmap=C++:.C.h.c.cpp.hpp --languages=C++ --c++-kinds=+p --fields=+iaS --extra=+q cscope -bkqu #rm cscope.files - - - -- 2.4.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/gen8: Use HALIGN_16 for single sample mcs buffers
The original code meant to do this, but was only checking num_samples == 1 to figure out if a surface was fast clear capable. However, we can allocate single sample miptrees with num_samples == 0 (when it's an internally created buffer). This fixes a bunch of the piglit tests on gen8. Other gens should have been fine. Here is the order of events that allowed this to slip through: t0: I wrote halign patches and tested them. These alignment assertions are for gen8 fast clear surfaces, basically. t1: I pushed bogus perf patch which made fast clears never happen t2: Reworked halign patches based on Chad's feedback and introduced the bug this patch fixes. t2.5: I tested reworked patches, but assertion wasn't hit because of t1. t3. Matt fixed issue in t1 which made fast clears happen here: commit 22af95af8316f2888a3935cdf774ff0997b3dd42 Author: Matt Turner Date: Thu Jun 18 16:14:50 2015 -0700 i965: Add missing braces around if-statement. This logic should match that of the v1 of my halign patch series. Cc: Kenneth Graunke Cc: Matt Turner Reported-by: Kenneth Graunke Signed-off-by: Ben Widawsky --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 80c52f2..6aa969a 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -501,7 +501,7 @@ intel_miptree_create_layout(struct brw_context *brw, * 6 | ? |? */ if (intel_miptree_is_fast_clear_capable(brw, mt)) { - if (brw->gen >= 9 || (brw->gen == 8 && num_samples == 1)) + if (brw->gen >= 9 || (brw->gen == 8 && num_samples <= 1)) layout_flags |= MIPTREE_LAYOUT_FORCE_HALIGN16; } else if (brw->gen >= 9 && num_samples > 1) { layout_flags |= MIPTREE_LAYOUT_FORCE_HALIGN16; -- 2.4.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa_tags: dri/common no longer exists
Sorry, ignore this. I had script fail. Real patch coming up. On Thu, Jun 18, 2015 at 06:44:35PM -0700, Ben Widawsky wrote: > --- > scripts/tags/mesa_tags.sh | 4 > 1 file changed, 4 deletions(-) > > diff --git a/scripts/tags/mesa_tags.sh b/scripts/tags/mesa_tags.sh > index 4404b92..c8e2098 100755 > --- a/scripts/tags/mesa_tags.sh > +++ b/scripts/tags/mesa_tags.sh > @@ -3,13 +3,9 @@ > rm cscope.* > rm tags > git ls-files src/mesa/drivers/dri/i965 >| cscope.files > -git ls-files src/mesa/drivers/dri/common >> cscope.files > git ls-files src/mesa/main >> cscope.files > git ls-files include/GL >> cscope.files > git ls-files src/util >> cscope.files > ctags -L cscope.files --langmap=C++:.C.h.c.cpp.hpp --languages=C++ > --c++-kinds=+p --fields=+iaS --extra=+q > cscope -bkqu > #rm cscope.files > - > - > - > -- > 2.4.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] docs: update developer info
Just link directly to the piglit repo the old link has outdated information. Add note about updating patchwork when sending patch revisions. --- docs/devinfo.html | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/devinfo.html b/docs/devinfo.html index 0da18b9..6779ab4 100644 --- a/docs/devinfo.html +++ b/docs/devinfo.html @@ -244,7 +244,7 @@ to update the tests themselves. Whenever possible and applicable, test the patch with -http://people.freedesktop.org/~nh/piglit/";>Piglit to +http://cgit.freedesktop.org/piglit";>Piglit to check for regressions. @@ -266,6 +266,12 @@ re-sending the whole series). Using --in-reply-to makes it harder for reviewers to accidentally review old patches. + +When submitting follow-up patches you should also login to +https://patchwork.freedesktop.org";>patchwork and change the +state of your old patches to Superseded. + + Reviewing Patches -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] darwin: Suppress type conversion warnings for GLhandleARB
On 06/17/2015 10:53 PM, Julien Isorce wrote: > From: Jon TURNEY > > On darwin, GLhandleARB is defined as a void *, not the unsigned int it is on > linux. > > For the moment, apply a cast to supress the warning > > Possibly this is safe, as for the mesa software renderer the shader program > handle is not a real pointer, but a integer handle > > Probably this is not the right thing to do, and we should pay closer attention > to how the GLhandlerARB type is used. In Mesa, glBindAttribLocation (which takes GLuint) and glBindAttribLocationARB (which takes GLhandleARB) are *the same function*. The same applies to pretty much all the other GLhandleARB functions. Disentangling them would require an ABI break on Linux. That just isn't going to happen... ...unless we decide to have a flag day and do a major amount of clean up on the libGL <-> driver interface. Before that is even contemplated, we need to finish the rewrite / refactor of the glapi generator scripts and get moved over to the Khronos XML. Dylan has been working on this, but I don't know how much progress he has made lately. > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66346 > Signed-off-by: Jon TURNEY > --- > src/mesa/main/shader_query.cpp | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp > index a6246a3..e3a1213 100644 > --- a/src/mesa/main/shader_query.cpp > +++ b/src/mesa/main/shader_query.cpp > @@ -69,7 +69,7 @@ _mesa_BindAttribLocation(GLhandleARB program, GLuint index, > GET_CURRENT_CONTEXT(ctx); > > struct gl_shader_program *const shProg = > - _mesa_lookup_shader_program_err(ctx, program, "glBindAttribLocation"); > + _mesa_lookup_shader_program_err(ctx, (uintptr_t)program, > "glBindAttribLocation"); > if (!shProg) >return; > > @@ -137,7 +137,7 @@ _mesa_GetActiveAttrib(GLhandleARB program, GLuint > desired_index, >return; > } > > - shProg = _mesa_lookup_shader_program_err(ctx, program, > "glGetActiveAttrib"); > + shProg = _mesa_lookup_shader_program_err(ctx, (uintptr_t)program, > "glGetActiveAttrib"); > if (!shProg) >return; > > @@ -251,7 +251,7 @@ _mesa_GetAttribLocation(GLhandleARB program, const > GLcharARB * name) > { > GET_CURRENT_CONTEXT(ctx); > struct gl_shader_program *const shProg = > - _mesa_lookup_shader_program_err(ctx, program, "glGetAttribLocation"); > + _mesa_lookup_shader_program_err(ctx, (uintptr_t)program, > "glGetAttribLocation"); > > if (!shProg) { >return -1; > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] docs: update developer info
On Thu, Jun 18, 2015 at 8:14 PM, Timothy Arceri wrote: > Just link directly to the piglit repo the old link has outdated information. > > Add note about updating patchwork when sending patch revisions. > --- > docs/devinfo.html | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/docs/devinfo.html b/docs/devinfo.html > index 0da18b9..6779ab4 100644 > --- a/docs/devinfo.html > +++ b/docs/devinfo.html > @@ -244,7 +244,7 @@ to update the tests themselves. > > > Whenever possible and applicable, test the patch with > -http://people.freedesktop.org/~nh/piglit/";>Piglit to > +http://cgit.freedesktop.org/piglit";>Piglit to > check for regressions. > > > @@ -266,6 +266,12 @@ re-sending the whole series). Using --in-reply-to makes > it harder for reviewers to accidentally review old patches. > > > + > +When submitting follow-up patches you should also login to > +https://patchwork.freedesktop.org";>patchwork and change the > +state of your old patches to Superseded. > + > + > Reviewing Patches > > > -- > 2.1.0 Thanks! Can't believe we're still seeing piglit links to ~nh/ these days. Acked-by: Matt Turner Feel free to commit. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] docs: update developer info
On Friday, June 19, 2015 01:14:07 PM Timothy Arceri wrote: > Just link directly to the piglit repo the old link has outdated information. > > Add note about updating patchwork when sending patch revisions. > --- > docs/devinfo.html | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/docs/devinfo.html b/docs/devinfo.html > index 0da18b9..6779ab4 100644 > --- a/docs/devinfo.html > +++ b/docs/devinfo.html > @@ -244,7 +244,7 @@ to update the tests themselves. > > > Whenever possible and applicable, test the patch with > -http://people.freedesktop.org/~nh/piglit/";>Piglit to > +http://cgit.freedesktop.org/piglit";>Piglit to > check for regressions. > > > @@ -266,6 +266,12 @@ re-sending the whole series). Using --in-reply-to makes > it harder for reviewers to accidentally review old patches. > > > + > +When submitting follow-up patches you should also login to > +https://patchwork.freedesktop.org";>patchwork and change the > +state of your old patches to Superseded. > + > + > Reviewing Patches > > > I might link to http://piglit.freedesktop.org/ instead - it's the actual Piglit website. (There's not much more than the git link, though - either are definitely better than linking to ~nh!) signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/14] mesa: Fix conditions to test signed, unsigned integer format
On Thu, 2015-06-18 at 09:19 -0700, Anuj Phogat wrote: > On Thu, Jun 18, 2015 at 7:09 AM, Iago Toral wrote: > > On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote: > >> Signed-off-by: Anuj Phogat > >> Cc: > >> --- > >> src/mesa/main/readpix.c | 2 ++ > >> 1 file changed, 2 insertions(+) > >> > >> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c > >> index caa2648..a9416ef 100644 > >> --- a/src/mesa/main/readpix.c > >> +++ b/src/mesa/main/readpix.c > >> @@ -160,10 +160,12 @@ _mesa_readpixels_needs_slow_path(const struct > >> gl_context *ctx, GLenum format, > >>srcType = _mesa_get_format_datatype(rb->Format); > >> > >>if ((srcType == GL_INT && > >> + _mesa_is_enum_format_integer(format) && > >> (type == GL_UNSIGNED_INT || > >> type == GL_UNSIGNED_SHORT || > >> type == GL_UNSIGNED_BYTE)) || > >>(srcType == GL_UNSIGNED_INT && > >> + _mesa_is_enum_format_integer(format) && > >> (type == GL_INT || > >> type == GL_SHORT || > >> type == GL_BYTE))) { > > > > As far as I understand this code we are trying to see if we can use > > memcpy to directly copy the contents of the framebuffer to the > > destination buffer. In that case, as long as the src/dst types have > > different sign we can't just use memcpy, right? In fact it looks like we > > might need to expand the checks to include the cases where srcType is > > GL_(UNSIGNED_)SHORT and GL_(UNSIGNED_)BYTE as well. > > > srcType returned by _mesa_get_format_datatype() is one of: > GL_UNSIGNED_NORMALIZED > GL_SIGNED_NORMALIZED > GL_UNSIGNED_INT > GL_INT > GL_FLOAT > So, the suggested checks for srcType are not required. Oh, right, although I think that does not invalidate my point: can we memcpy from a GL_UNSIGNED_NORMALIZED to a format with type GL_FLOAT or GL_SIGNED_NORMALIZED? It does not look like these checks here are thorough. In any case, that's beyond the point of your patch. Talking specifically about your patch: can we memcpy, for example, from a _signed_ integer format like MESA_FORMAT_R_SINT8 to an _unsigned_ format (integer or not)? I don't think we can, in which case your patch would not look correct to me. Also, as I said in my previous e-mail, I feel like these checks here do not add anything, at least when this function is called from readpixels_can_use_memcpy, since even if we return true here, we will later call _mesa_format_matches_format_and_type and that would check that the formats match anyway before going through the memcpy path. Even the other use of this function in Gallium would call _mesa_format_matches_format_and_type before it calls this... That's why I think we probably want to remove these checks from this function and rely on _mesa_format_matches_format_and_type exclusively to check allowed formats and types. > > That said, I think this code is not necessary with the call to > > _mesa_format_matches_format_and_type that we do immediately after this, > > since that will check that the framebuffer format exactly matches the > > destination format anyway, which is a much tighter check than this. In > > fact, a quick piglit run without these checks does not seem to break any > > tests on i965. Gallium uses these two functions in a slightly different > > way in st_cb_readpixels.c though, so I wonder if their use case requires > > these checks to exist in this function anyway. > > > > Iago > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev