Re: [Mesa-dev] [PATCH 1/2] radeonsi: add FMASK texture binding slots and resource setup (v2)
On Fri, 2013-08-16 at 03:29 +0200, Marek Olšák wrote:
> v2: bind FMASK textures to shader resource slots 16..31

This series is

Reviewed-by: Michel Dänzer

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radeonsi: Don't leave gaps between position exports from vertex shader
On Fri, 2013-08-09 at 23:41 +0200, Laurent Carlier wrote:
> On Friday 9 August 2013 at 18:50:20, Michel Dänzer wrote:
> > From: Michel Dänzer
> >
> > Exporting position 2/3 (clip distances) but not position 1 (point size)
> > causes geometry corruption for some reason.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974
> >
> > Cc: mesa-sta...@lists.freedesktop.org
> > Signed-off-by: Michel Dänzer
>
> Tested with Dota2 and L4D2, and it fixes the rendering

Thanks for testing. Did you overlook the corruption you reported in
https://bugs.freedesktop.org/show_bug.cgi?id=68162 when you tested that
series, or was it not there? I've been looking into that as well, but
haven't found the problem yet.

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
[Mesa-dev] [PATCH 2/5] vl: rename enum pipe_video_codec to pipe_video_format
From: Christian König

Signed-off-by: Christian König
---
 src/gallium/auxiliary/util/u_video.h               | 12 ++++++------
 src/gallium/auxiliary/vl/vl_decoder.c              |  4 ++--
 src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c     |  2 +-
 src/gallium/auxiliary/vl/vl_mpeg12_decoder.c       |  4 ++--
 src/gallium/drivers/nouveau/nouveau_video.c        |  2 +-
 src/gallium/drivers/nouveau/nouveau_vp3_video.c    | 30 +++++++++++++++---------------
 src/gallium/drivers/nouveau/nouveau_vp3_video.h    |  2 +-
 .../drivers/nouveau/nouveau_vp3_video_bsp.c        | 10 +++++-----
 src/gallium/drivers/nouveau/nouveau_vp3_video_vp.c | 12 ++++++------
 src/gallium/drivers/nv50/nv84_video.c              |  8 ++++----
 src/gallium/drivers/nv50/nv98_video.c              |  8 ++++----
 src/gallium/drivers/nv50/nv98_video_bsp.c          |  8 ++++----
 src/gallium/drivers/nv50/nv98_video_ppp.c          | 12 ++++++------
 src/gallium/drivers/nv50/nv98_video_vp.c           |  8 ++++----
 src/gallium/drivers/nvc0/nvc0_video.c              |  8 ++++----
 src/gallium/drivers/nvc0/nvc0_video_bsp.c          |  6 +++---
 src/gallium/drivers/nvc0/nvc0_video_ppp.c          | 12 ++++++------
 src/gallium/drivers/nvc0/nvc0_video_vp.c           |  8 ++++----
 src/gallium/drivers/r600/r600_uvd.c                |  6 +++---
 src/gallium/drivers/radeon/radeon_uvd.c            | 38 +++++++++++++++++++-------------------
 src/gallium/include/pipe/p_video_enums.h           | 18 +++++++++---------
 src/gallium/include/pipe/p_video_state.h           |  4 ++--
 src/gallium/state_trackers/vdpau/decode.c          |  8 ++++----
 src/gallium/state_trackers/xvmc/surface.c          |  2 +-
 24 files changed, 116 insertions(+), 116 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_video.h b/src/gallium/auxiliary/util/u_video.h
index e575947..276e460 100644
--- a/src/gallium/auxiliary/util/u_video.h
+++ b/src/gallium/auxiliary/util/u_video.h
@@ -39,7 +39,7 @@ extern "C" {
 #include "pipe/p_compiler.h"
 #include "util/u_debug.h"
 
-static INLINE enum pipe_video_codec
+static INLINE enum pipe_video_format
 u_reduce_video_profile(enum pipe_video_profile profile)
 {
    switch (profile)
@@ -47,24 +47,24 @@ u_reduce_video_profile(enum pipe_video_profile profile)
       case PIPE_VIDEO_PROFILE_MPEG1:
       case PIPE_VIDEO_PROFILE_MPEG2_SIMPLE:
       case PIPE_VIDEO_PROFILE_MPEG2_MAIN:
-         return PIPE_VIDEO_CODEC_MPEG12;
+         return PIPE_VIDEO_FORMAT_MPEG12;
 
       case PIPE_VIDEO_PROFILE_MPEG4_SIMPLE:
       case PIPE_VIDEO_PROFILE_MPEG4_ADVANCED_SIMPLE:
-         return PIPE_VIDEO_CODEC_MPEG4;
+         return PIPE_VIDEO_FORMAT_MPEG4;
 
       case PIPE_VIDEO_PROFILE_VC1_SIMPLE:
       case PIPE_VIDEO_PROFILE_VC1_MAIN:
       case PIPE_VIDEO_PROFILE_VC1_ADVANCED:
-         return PIPE_VIDEO_CODEC_VC1;
+         return PIPE_VIDEO_FORMAT_VC1;
 
       case PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE:
       case PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN:
       case PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH:
-         return PIPE_VIDEO_CODEC_MPEG4_AVC;
+         return PIPE_VIDEO_FORMAT_MPEG4_AVC;
 
       default:
-         return PIPE_VIDEO_CODEC_UNKNOWN;
+         return PIPE_VIDEO_FORMAT_UNKNOWN;
    }
 }
 
diff --git a/src/gallium/auxiliary/vl/vl_decoder.c b/src/gallium/auxiliary/vl/vl_decoder.c
index dcbb77c..60e0ce9 100644
--- a/src/gallium/auxiliary/vl/vl_decoder.c
+++ b/src/gallium/auxiliary/vl/vl_decoder.c
@@ -37,7 +37,7 @@ vl_profile_supported(struct pipe_screen *screen, enum pipe_video_profile profile
 {
    assert(screen);
    switch (u_reduce_video_profile(profile)) {
-   case PIPE_VIDEO_CODEC_MPEG12:
+   case PIPE_VIDEO_FORMAT_MPEG12:
       return true;
    default:
       return false;
@@ -82,7 +82,7 @@ vl_create_decoder(struct pipe_context *pipe,
    temp.height = pot_buffers ? util_next_power_of_two(height) : align(height, VL_MACROBLOCK_HEIGHT);
 
    switch (u_reduce_video_profile(temp.profile)) {
-   case PIPE_VIDEO_CODEC_MPEG12:
+   case PIPE_VIDEO_FORMAT_MPEG12:
       return vl_create_mpeg12_decoder(pipe, &temp);
 
    default:
diff --git a/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c b/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
index 81199e4..d8c5311 100644
--- a/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
+++ b/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
@@ -814,7 +814,7 @@ decode_slice(struct vl_mpg12_bs *bs, struct pipe_video_buffer *target)
    signed x = -1;
 
    memset(&mb, 0, sizeof(mb));
-   mb.base.codec = PIPE_VIDEO_CODEC_MPEG12;
+   mb.base.codec = PIPE_VIDEO_FORMAT_MPEG12;
    mb.y = vl_vlc_get_uimsbf(&bs->vlc, 8) - 1;
    mb.blocks = dct_blocks;
 
diff --git a/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c b/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c
index 48661cf..9349b5e 100644
--- a/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c
+++ b/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c
@@ -582,7 +582,7 @@ vl_mpeg12_decode_macroblock(struct pipe_video_decoder *decoder,
    unsigned i, j, mv_weights[2];
 
    assert(
[Mesa-dev] [PATCH 4/5] vl: add entrypoint to get_video_param
From: Christian König

Signed-off-by: Christian König
---
 src/gallium/auxiliary/vl/vl_decoder.c           |  4 +++-
 src/gallium/auxiliary/vl/vl_decoder.h           |  3 ++-
 src/gallium/auxiliary/vl/vl_video_buffer.c      |  1 +
 src/gallium/drivers/ilo/ilo_screen.c            |  3 ++-
 src/gallium/drivers/nouveau/nouveau_video.c     |  3 ++-
 src/gallium/drivers/nouveau/nouveau_vp3_video.c |  1 +
 src/gallium/drivers/nouveau/nouveau_vp3_video.h |  1 +
 src/gallium/drivers/nv50/nv50_context.h         |  1 +
 src/gallium/drivers/nv50/nv84_video.c           |  1 +
 src/gallium/drivers/r300/r300_screen.c          |  3 ++-
 src/gallium/drivers/r600/r600_pipe.c            |  3 ++-
 src/gallium/drivers/r600/r600_pipe.h            |  1 +
 src/gallium/drivers/r600/r600_uvd.c             |  3 ++-
 src/gallium/drivers/radeon/radeon_uvd.c         |  1 +
 src/gallium/drivers/radeon/radeon_uvd.h         |  1 +
 src/gallium/drivers/radeonsi/radeonsi_pipe.c    |  3 ++-
 src/gallium/drivers/softpipe/sp_screen.c        |  3 ++-
 src/gallium/include/pipe/p_screen.h             |  1 +
 src/gallium/state_trackers/vdpau/decode.c       | 13 +
 src/gallium/state_trackers/vdpau/mixer.c        |  4 ++--
 src/gallium/state_trackers/vdpau/query.c        | 18 --
 src/gallium/state_trackers/vdpau/surface.c      |  2 ++
 src/gallium/state_trackers/xvmc/subpicture.c    |  1 +
 src/gallium/state_trackers/xvmc/surface.c       |  6 --
 24 files changed, 58 insertions(+), 23 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_decoder.c b/src/gallium/auxiliary/vl/vl_decoder.c
index f19834f..b325b8c 100644
--- a/src/gallium/auxiliary/vl/vl_decoder.c
+++ b/src/gallium/auxiliary/vl/vl_decoder.c
@@ -33,7 +33,8 @@
 #include "vl_mpeg12_decoder.h"
 
 bool
-vl_profile_supported(struct pipe_screen *screen, enum pipe_video_profile profile)
+vl_profile_supported(struct pipe_screen *screen, enum pipe_video_profile profile,
+                     enum pipe_video_entrypoint entrypoint)
 {
    assert(screen);
    switch (u_reduce_video_profile(profile)) {
@@ -74,6 +75,7 @@ vl_create_decoder(struct pipe_context *pipe,
    (
       pipe->screen,
       templat->profile,
+      templat->entrypoint,
       PIPE_VIDEO_CAP_NPOT_TEXTURES
    );
 
diff --git a/src/gallium/auxiliary/vl/vl_decoder.h b/src/gallium/auxiliary/vl/vl_decoder.h
index 124315f..0c216df 100644
--- a/src/gallium/auxiliary/vl/vl_decoder.h
+++ b/src/gallium/auxiliary/vl/vl_decoder.h
@@ -35,7 +35,8 @@
  * check if a given profile is supported with shader based decoding
  */
 bool
-vl_profile_supported(struct pipe_screen *screen, enum pipe_video_profile profile);
+vl_profile_supported(struct pipe_screen *screen, enum pipe_video_profile profile,
+                     enum pipe_video_entrypoint entrypoint);
 
 /**
  * get the maximum supported level for the given profile with shader based decoding
diff --git a/src/gallium/auxiliary/vl/vl_video_buffer.c b/src/gallium/auxiliary/vl/vl_video_buffer.c
index 16c7649..d81c181 100644
--- a/src/gallium/auxiliary/vl/vl_video_buffer.c
+++ b/src/gallium/auxiliary/vl/vl_video_buffer.c
@@ -406,6 +406,7 @@ vl_video_buffer_create(struct pipe_context *pipe,
    (
       pipe->screen,
       PIPE_VIDEO_PROFILE_UNKNOWN,
+      PIPE_VIDEO_ENTRYPOINT_UNKNOWN,
       PIPE_VIDEO_CAP_NPOT_TEXTURES
    );
 
diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c
index 5f97226..9f3235c 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -152,11 +152,12 @@ ilo_get_shader_param(struct pipe_screen *screen, unsigned shader,
 static int
 ilo_get_video_param(struct pipe_screen *screen,
                     enum pipe_video_profile profile,
+                    enum pipe_video_entrypoint entrypoint,
                     enum pipe_video_cap param)
 {
    switch (param) {
    case PIPE_VIDEO_CAP_SUPPORTED:
-      return vl_profile_supported(screen, profile);
+      return vl_profile_supported(screen, profile, entrypoint);
    case PIPE_VIDEO_CAP_NPOT_TEXTURES:
       return 1;
    case PIPE_VIDEO_CAP_MAX_WIDTH:
diff --git a/src/gallium/drivers/nouveau/nouveau_video.c b/src/gallium/drivers/nouveau/nouveau_video.c
index 67b6739..8e08cab 100644
--- a/src/gallium/drivers/nouveau/nouveau_video.c
+++ b/src/gallium/drivers/nouveau/nouveau_video.c
@@ -834,11 +834,12 @@ error:
 static int
 nouveau_screen_get_video_param(struct pipe_screen *pscreen,
                                enum pipe_video_profile profile,
+                               enum pipe_video_entrypoint entrypoint,
                                enum pipe_video_cap param)
 {
    switch (param) {
    case PIPE_VIDEO_CAP_SUPPORTED:
-      return vl_profile_supported(pscreen, profile);
+      return vl_profile_supported(pscreen, profile, entrypoint);
    case PIPE_VIDEO_CAP_NPOT_TEXTURES:
       return 1;
    case PIPE_VIDEO_CAP_MAX_WIDTH:
diff --git a/src/g
[Mesa-dev] [PATCH 5/5] vl: add entrypoint to is_video_format_supported
From: Christian König

Signed-off-by: Christian König
---
 src/gallium/auxiliary/vl/vl_video_buffer.c      | 3 ++-
 src/gallium/auxiliary/vl/vl_video_buffer.h      | 3 ++-
 src/gallium/drivers/ilo/ilo_format.c            | 5 +++--
 src/gallium/drivers/nouveau/nouveau_vp3_video.c | 5 +++--
 src/gallium/drivers/nouveau/nouveau_vp3_video.h | 3 ++-
 src/gallium/drivers/nv50/nv50_context.h         | 3 ++-
 src/gallium/drivers/nv50/nv84_video.c           | 5 +++--
 src/gallium/drivers/radeon/radeon_uvd.c         | 3 ++-
 src/gallium/drivers/radeon/radeon_uvd.h         | 3 ++-
 src/gallium/include/pipe/p_screen.h             | 3 ++-
 src/gallium/state_trackers/vdpau/decode.c       | 3 ++-
 src/gallium/state_trackers/vdpau/query.c        | 6 ++++--
 12 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_video_buffer.c b/src/gallium/auxiliary/vl/vl_video_buffer.c
index d81c181..f0ba389 100644
--- a/src/gallium/auxiliary/vl/vl_video_buffer.c
+++ b/src/gallium/auxiliary/vl/vl_video_buffer.c
@@ -147,7 +147,8 @@ vl_video_buffer_surface_format(enum pipe_format format)
 boolean
 vl_video_buffer_is_format_supported(struct pipe_screen *screen,
                                     enum pipe_format format,
-                                    enum pipe_video_profile profile)
+                                    enum pipe_video_profile profile,
+                                    enum pipe_video_entrypoint entrypoint)
 {
    const enum pipe_format *resource_formats;
    unsigned i;
diff --git a/src/gallium/auxiliary/vl/vl_video_buffer.h b/src/gallium/auxiliary/vl/vl_video_buffer.h
index e92e270..b936a37 100644
--- a/src/gallium/auxiliary/vl/vl_video_buffer.h
+++ b/src/gallium/auxiliary/vl/vl_video_buffer.h
@@ -73,7 +73,8 @@ vl_video_buffer_max_size(struct pipe_screen *screen);
 boolean
 vl_video_buffer_is_format_supported(struct pipe_screen *screen,
                                     enum pipe_format format,
-                                    enum pipe_video_profile profile);
+                                    enum pipe_video_profile profile,
+                                    enum pipe_video_entrypoint entrypoint);
 
 /*
  * set the associated data for the given video buffer
diff --git a/src/gallium/drivers/ilo/ilo_format.c b/src/gallium/drivers/ilo/ilo_format.c
index 65fb820..40b5ffa 100644
--- a/src/gallium/drivers/ilo/ilo_format.c
+++ b/src/gallium/drivers/ilo/ilo_format.c
@@ -671,9 +671,10 @@ ilo_is_format_supported(struct pipe_screen *screen,
 static boolean
 ilo_is_video_format_supported(struct pipe_screen *screen,
                               enum pipe_format format,
-                              enum pipe_video_profile profile)
+                              enum pipe_video_profile profile,
+                              enum pipe_video_entrypoint entrypoint)
 {
-   return vl_video_buffer_is_format_supported(screen, format, profile);
+   return vl_video_buffer_is_format_supported(screen, format, profile, entrypoint);
 }
 
 /**
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.c b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
index 1659c5f..07ce016 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.c
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
@@ -416,10 +416,11 @@ nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
 boolean
 nouveau_vp3_screen_video_supported(struct pipe_screen *screen,
                                    enum pipe_format format,
-                                   enum pipe_video_profile profile)
+                                   enum pipe_video_profile profile,
+                                   enum pipe_video_entrypoint entrypoint)
 {
    if (profile != PIPE_VIDEO_PROFILE_UNKNOWN)
       return format == PIPE_FORMAT_NV12;
 
-   return vl_video_buffer_is_format_supported(screen, format, profile);
+   return vl_video_buffer_is_format_supported(screen, format, profile, entrypoint);
 }
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.h b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
index 8aedfd0..0193ed0 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.h
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
@@ -226,4 +226,5 @@ nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
 boolean
 nouveau_vp3_screen_video_supported(struct pipe_screen *screen,
                                    enum pipe_format format,
-                                   enum pipe_video_profile profile);
+                                   enum pipe_video_profile profile,
+                                   enum pipe_video_entrypoint entrypoint);
diff --git a/src/gallium/drivers/nv50/nv50_context.h b/src/gallium/drivers/nv50/nv50_context.h
index 3600f70..ed4ef24 100644
--- a/src/gallium/drivers/nv50/nv50_context.h
+++ b/src/gallium/drivers/nv50/nv50_context.h
@@ -307,7 +307,8 @@ nv84_screen_get_video_param(struct pipe_screen *pscreen,
 boolean
 nv84_sc
Re: [Mesa-dev] [PATCH 6/6] i965: Make the VS binding table as small as possible.
On 14 August 2013 21:07, Kenneth Graunke wrote:
> For some reason, we didn't use this information even though the VS
> backend has computed it (albeit poorly) for ages.
>
> Signed-off-by: Kenneth Graunke

For some reason I can't get this series to apply cleanly. Can you post a
branch somewhere?

> ---
>  src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
> index 2c2d713..4577e76 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
> @@ -145,7 +145,8 @@ brw_vs_upload_binding_table(struct brw_context *brw)
>     /* CACHE_NEW_VS_PROG: Skip making a binding table if we don't use textures or
>      * pull constants.
>      */
> -   if (brw->vs.prog_data->base.binding_table_size == 0) {
> +   const unsigned entries = brw->vs.prog_data->base.binding_table_size;
> +   if (entries == 0) {
>        if (brw->vs.bind_bo_offset != 0) {
>           brw->state.dirty.brw |= BRW_NEW_VS_BINDING_TABLE;
>           brw->vs.bind_bo_offset = 0;
> @@ -157,11 +158,11 @@ brw_vs_upload_binding_table(struct brw_context *brw)
>      * space for the binding table.
>      */
>     bind = brw_state_batch(brw, AUB_TRACE_BINDING_TABLE,
> -                          sizeof(uint32_t) * BRW_MAX_VS_SURFACES,
> +                          sizeof(uint32_t) * entries,
>                            32, &brw->vs.bind_bo_offset);
>
>     /* BRW_NEW_SURFACES and BRW_NEW_VS_CONSTBUF */
> -   for (i = 0; i < BRW_MAX_VS_SURFACES; i++) {
> +   for (i = 0; i < entries; i++) {
>        bind[i] = brw->vs.surf_offset[i];
>     }
>
> --
> 1.8.3.4
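
Editorial sketch of the idea in the quoted patch: size the binding table by the number of entries the compiled shader can reference, instead of always reserving the maximum. This is a standalone illustration, not driver code. `VsState`, `kMaxVsSurfaces` and `upload_binding_table` are hypothetical names, and the value 32 is an assumed stand-in for BRW_MAX_VS_SURFACES.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed stand-in for BRW_MAX_VS_SURFACES; the real value lives in the driver.
constexpr unsigned kMaxVsSurfaces = 32;

// Hypothetical, simplified state: per-surface offsets plus the table size
// computed at shader compile time (highest used surface index + 1).
struct VsState {
    uint32_t surf_offset[kMaxVsSurfaces] = {};
    unsigned binding_table_size = 0;
};

// Copy only the entries the shader can reference. An empty result
// corresponds to the "skip making a binding table" path in the patch.
std::vector<uint32_t> upload_binding_table(const VsState &vs)
{
    const unsigned entries = vs.binding_table_size;
    assert(entries <= kMaxVsSurfaces);

    std::vector<uint32_t> bind(entries);
    for (unsigned i = 0; i < entries; i++)
        bind[i] = vs.surf_offset[i];
    return bind;
}
```

With two surfaces in use, the table holds two entries rather than `kMaxVsSurfaces`, which is the space saving the commit subject refers to.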
Re: [Mesa-dev] [PATCH 6/6] i965: Make the VS binding table as small as possible.
On 16 August 2013 05:35, Paul Berry wrote:
> On 14 August 2013 21:07, Kenneth Graunke wrote:
>
>> For some reason, we didn't use this information even though the VS
>> backend has computed it (albeit poorly) for ages.
>>
>> Signed-off-by: Kenneth Graunke
>>
>
> For some reason I can't get this series to apply cleanly. Can you post a
> branch somewhere?
>

Never mind, I sorted it out. Checking out a453eb6, then applying series
"[PATCH 00/10] i965: Separate VS/FS sampler tables." did the trick.
Re: [Mesa-dev] [PATCH 1/6] i965/fs: Use SURF_INDEX_DRAW() when emitting FB writes.
On 14 August 2013 21:07, Kenneth Graunke wrote:
> SURF_INDEX_DRAW is the identity function, and it's unlikely that it will
> change, but we may as well use it for documentation's sake.
>
> Signed-off-by: Kenneth Graunke

The comment above the declaration of SURF_INDEX_DRAW (brw_context.h) says
that it's never used:

 * Note that nothing actually uses the SURF_INDEX_DRAW macro, so it has to be
 * the identity function or things will break. We do want to keep draw buffers
 * first so we can use headerless render target writes for RT 0.

Would you mind updating the comment in this commit? With that fixed, this
patch is:

Reviewed-by: Paul Berry
Re: [Mesa-dev] [PATCH 2/6] i965/fs: Track the maximum surface index used in brw_wm_prog_data.
On 14 August 2013 21:07, Kenneth Graunke wrote:
> This allows us to determine how small we can make the binding table.
>
> Since it depends entirely on the shader program, we can just compute
> it once at compile time, rather than at binding table emit time (which
> happens during drawing).
>
> Signed-off-by: Kenneth Graunke
> ---
>  src/mesa/drivers/dri/i965/brw_context.h   |  2 ++
>  src/mesa/drivers/dri/i965/brw_fs.h        |  2 ++
>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 23 +++++++++++++++++++++++
>  3 files changed, 27 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index ff0a65c..380fe08 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -305,6 +305,8 @@ struct brw_wm_prog_data {
>     GLuint reg_blocks_16;
>     GLuint total_scratch;
>
> +   unsigned max_surface_index;
> +

I'm bothered by the off-by-one inconsistency of using max_surface_index
here, but using binding_table_size over in brw_vec4_prog_data (see patch
5). Could we change this to binding_table_size, and update
fs_generator::mark_surface_used() to do:

   prog_data->binding_table_size = MAX2(prog_data->binding_table_size, surf_index + 1);

Then it would be consistent with vec4_generator::mark_surface_used().

With that changed, this patch is:

Reviewed-by: Paul Berry

>     GLuint nr_params; /**< number of float params/constants */
>     GLuint nr_pull_params;
>     bool dual_src_blend;
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index 7feb2b6..9d240b5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -569,6 +569,8 @@ private:
>                                    struct brw_reg offset,
>                                    struct brw_reg value);
>
> +   void mark_surface_used(unsigned surf_index);
> +
>     void patch_discard_jumps_to_fb_writes();
>
>     struct brw_context *brw;
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> index b90cf0f..41dacff 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> @@ -59,6 +59,15 @@ fs_generator::~fs_generator()
>  }
>
>  void
> +fs_generator::mark_surface_used(unsigned surf_index)
> +{
> +   assert(surf_index < BRW_MAX_WM_SURFACES);
> +
> +   if (surf_index > c->prog_data.max_surface_index)
> +      c->prog_data.max_surface_index = surf_index;
> +}
> +
> +void
>  fs_generator::patch_discard_jumps_to_fb_writes()
>  {
>     if (brw->gen < 6 || this->discard_halt_patches.is_empty())
> @@ -175,6 +184,8 @@ fs_generator::generate_fb_write(fs_inst *inst)
>                  0,
>                  eot,
>                  inst->header_present);
> +
> +   mark_surface_used(SURF_INDEX_DRAW(inst->target));
>  }
>
>  /* Computes the integer pixel x,y values from the origin.
> @@ -519,6 +530,8 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
>                inst->header_present,
>                simd_mode,
>                return_format);
> +
> +   mark_surface_used(SURF_INDEX_TEXTURE(inst->sampler));
>  }
>
>
> @@ -648,6 +661,8 @@ fs_generator::generate_uniform_pull_constant_load(fs_inst *inst,
>
>     brw_oword_block_read(p, dst, brw_message_reg(inst->base_mrf),
>                          read_offset, surf_index);
> +
> +   mark_surface_used(surf_index);
>  }
>
>  void
> @@ -688,6 +703,8 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
>                            false, /* no header */
>                            BRW_SAMPLER_SIMD_MODE_SIMD4X2,
>                            0);
> +
> +   mark_surface_used(surf_index);
>  }
>
>  void
> @@ -753,6 +770,8 @@ fs_generator::generate_varying_pull_constant_load(fs_inst *inst,
>                inst->header_present,
>                simd_mode,
>                return_format);
> +
> +   mark_surface_used(surf_index);
>  }
>
>  void
> @@ -795,6 +814,8 @@ fs_generator::generate_varying_pull_constant_load_gen7(fs_inst *inst,
>                            false, /* no header */
>                            simd_mode,
>                            0);
> +
> +   mark_surface_used(surf_index);
>  }
>
>  /**
> @@ -1040,6 +1061,8 @@ fs_generator::generate_shader_time_add(fs_inst *inst,
>     brw_MOV(p, payload_value, value);
>     brw_shader_time_add(p, payload, SURF_INDEX_WM_SHADER_TIME);
>     brw_pop_insn_state(p);
> +
> +   mark_surface_used(SURF_INDEX_WM_SHADER_TIME);
>  }
>
>  void
> --
> 1.8.3.4
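
Editorial sketch of the two bookkeeping conventions discussed in the review above: tracking the maximum surface index (as in the posted patch) versus tracking the binding table size (as the review suggests, using "index + 1"). `ProgData` here is a hypothetical stand-in rather than the driver's struct, and `std::max` plays the role of MAX2.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in holding both conventions side by side.
struct ProgData {
    unsigned max_surface_index = 0;   // convention in the patch as posted
    unsigned binding_table_size = 0;  // convention suggested in the review
};

// Record that the shader references surface `surf_index`.
void mark_surface_used(ProgData &pd, unsigned surf_index)
{
    // "Maximum index" form: 0 can mean "only surface 0" or "nothing yet".
    pd.max_surface_index = std::max(pd.max_surface_index, surf_index);

    // "Size" form: one past the highest used index, so 0 means "no surfaces".
    pd.binding_table_size = std::max(pd.binding_table_size, surf_index + 1);
}
```

One practical difference, offered as an observation rather than something stated in the thread: after marking only surface 0, the "maximum index" field is still 0, the same as its initial value, while the "size" field becomes 1. The size form therefore distinguishes an empty table from a one-entry table without extra state.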
Re: [Mesa-dev] [PATCH 6/6] i965: Make the VS binding table as small as possible.
On 14 August 2013 21:07, Kenneth Graunke wrote:
> For some reason, we didn't use this information even though the VS
> backend has computed it (albeit poorly) for ages.
>
> Signed-off-by: Kenneth Graunke

This series, and the 10 patch series it's based on, are some nice reorg
work. Thanks for doing this.

I made comments on patches 1 and 2. The remainder are:

Reviewed-by: Paul Berry
Re: [Mesa-dev] radeonsi: LLVM r187139 broke some piglit tests
On Thu, 2013-08-15 at 13:50 -0700, Tom Stellard wrote:
> On Thu, Aug 15, 2013 at 07:50:10PM +0200, Michel Dänzer wrote:
> > On Thu, 2013-08-15 at 09:16 -0700, Tom Stellard wrote:
> > > On Thu, Aug 15, 2013 at 08:22:39AM -0700, Tom Stellard wrote:
> > > > On Thu, Aug 15, 2013 at 11:55:36AM +0200, Michel Dänzer wrote:
> > > > > On Fri, 2013-08-02 at 17:58 +0200, Michel Dänzer wrote:
> > > > > > On Wed, 2013-07-31 at 08:42 -0700, Tom Stellard wrote:
> > > > > > > On Wed, Jul 31, 2013 at 01:04:01PM +0200, Michel Dänzer wrote:
> > > > > > > >
> > > > > > > > LLVM revision 187139 ('Allocate local registers in order for optimal
> > > > > > > > coloring.') broke some derivative related piglit tests with the radeonsi
> > > > > > > > driver.
> > > > > > > >
> > > > > > > > I'm attaching a diff between the bad and good generated code (as printed
> > > > > > > > with RADEON_DUMP_SHADERS=1) for the glsl-derivs test. The only
> > > > > > > > difference I can see is in which registers are used in which order.
> > > > > > > >
> > > > > > > > I wonder if we might be missing S_WAITCNT after DS_READ/WRITE
> > > > > > > > instructions in some cases, but I haven't spotted any candidates for
> > > > > > > > that in the bad code which aren't there in the good code as well. Can
> > > > > > > > anyone else spot something I've missed?
> > > > > > >
> > > > > > > Shouldn't we be using the S_BARRIER instruction to keep the
> > > > > > > threads in sync?
> > > > > >
> > > > > > Doesn't seem to help unfortunately, but thanks for the good
> > > > > > suggestion.
> > > > >
> > > > > I found one thing going wrong: DS_WRITE_B32 ends up using a VGPR
> > > > > register number instead of the $gds operand for encoding the GDS field
> > > > > (the asm output from llc even shows the VGPR name). If the VGPR number
> > > > > happens to be odd (i.e. to have the least significant bit set), the
> > > > > shader ends up writing to GDS instead of LDS.
> > > >
> > > > Ouch, that's a pretty bad bug.
> > > >
> > > > > But I have no idea why this is happening, or how to fix it. :(
> > > >
> > > > I can take a look at it.
> > >
> > > The attached patch should fix the problem, can you test?
> >
> > Thanks for finding my silly mistake.
> >
> > However, I'd like to preserve the ability to use these instructions for
> > GDS access, and the logic in SIInsertWaits::getHwCounts() only really
> > makes sense for SMRD anyway.
> >
> > How about this patch instead? It fixes the piglit regressions that
> > prompted me to start this thread.

[...]

> > diff --git a/lib/Target/R600/SIInsertWaits.cpp b/lib/Target/R600/SIInsertWaits.cpp
> > index ba202e3..200e064 100644
> > --- a/lib/Target/R600/SIInsertWaits.cpp
> > +++ b/lib/Target/R600/SIInsertWaits.cpp
> > @@ -134,14 +134,20 @@ Counters SIInsertWaits::getHwCounts(MachineInstr &MI) {
> >    // LGKM may uses larger values
> >    if (TSFlags & SIInstrFlags::LGKM_CNT) {
> >
> > -    MachineOperand &Op = MI.getOperand(0);
> > -    if (!Op.isReg())
> > -      Op = MI.getOperand(1);
> > -    assert(Op.isReg() && "First LGKM operand must be a register!");
> > -
> > -    unsigned Reg = Op.getReg();
> > -    unsigned Size = TRI->getMinimalPhysRegClass(Reg)->getSize();
> > -    Result.Named.LGKM = Size > 4 ? 2 : 1;
> > +    if (MI.getNumOperands() == 3) {
>
> We should add a TSFlag for SMRD like we do for MIMG and add a helper
> function isSMRD to SIInstrInfo and use it here. The number of operands
> for instructions tends to change from time to time.

Like this?

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer

From c1d6b0f3d9cfcf2255257fdc87c748a46f799935 Mon Sep 17 00:00:00 2001
From: Michel Dänzer
Date: Thu, 15 Aug 2013 19:43:02 +0200
Subject: [PATCH v2] R600/SI: Fix broken encoding of DS_WRITE_B32

The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD
instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused
it to corrupt the encoding of that by clobbering the first operand with
the second one.

Undo that damage and only apply the SMRD logic to that.

Fixes some derivative related piglit regressions with radeonsi.

Signed-off-by: Michel Dänzer
---

v2: Use SIInstrFlags to test for SMRD instructions.

 lib/Target/R600/SIDefines.h       |  3 ++-
 lib/Target/R600/SIInsertWaits.cpp | 21 +++++++++++++--------
 lib/Target/R600/SIInstrFormats.td |  3 +++
 lib/Target/R600/SIInstrInfo.cpp   |  4 ++++
 lib/Target/R600/SIInstrInfo.h     |  1 +
 test/CodeGen/R600/local-memory.ll |  4 ++--
 6 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/lib/Target/R600/SIDefines.h b/lib/Target/R600/SIDefines.h
index 572ed6a..f5445a
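
Editorial sketch of the failure mode the commit message describes ("clobbering the first operand with the second one"). In the removed lines, `Op` is a C++ reference, so `Op = MI.getOperand(1)` does not rebind it; it copies operand 1 over operand 0. The types below (`Operand`, `Instr`) are hypothetical miniatures, not LLVM classes, and the pointer-based variant only illustrates a lookup without side effects. The actual v2 patch instead restricts the logic to SMRD instructions via an instruction flag.

```cpp
#include <cassert>

// Hypothetical miniature of an instruction operand.
struct Operand {
    bool is_reg;
    unsigned value;
};

// Hypothetical miniature of an instruction with two operands.
struct Instr {
    Operand ops[2];
};

// Same shape as the removed code: `op` is a reference, so the assignment
// below copies operand 1 into operand 0 and modifies the instruction.
unsigned first_reg_buggy(Instr &mi)
{
    Operand &op = mi.ops[0];
    if (!op.is_reg)
        op = mi.ops[1];   // overwrites mi.ops[0]
    return op.value;
}

// A pointer can be re-seated, so the instruction is left untouched.
unsigned first_reg_readonly(const Instr &mi)
{
    const Operand *op = &mi.ops[0];
    if (!op->is_reg)
        op = &mi.ops[1];
    return op->value;
}
```

Both functions return the same value, which is probably why the problem was easy to miss: only the later encoding of the instruction reveals that operand 0 was changed.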
Re: [Mesa-dev] radeonsi: LLVM r187139 broke some piglit tests
On Fri, Aug 16, 2013 at 03:36:38PM +0200, Michel Dänzer wrote:
> On Thu, 2013-08-15 at 13:50 -0700, Tom Stellard wrote:
> > On Thu, Aug 15, 2013 at 07:50:10PM +0200, Michel Dänzer wrote:
> > > On Thu, 2013-08-15 at 09:16 -0700, Tom Stellard wrote:
> > > > On Thu, Aug 15, 2013 at 08:22:39AM -0700, Tom Stellard wrote:
> > > > > On Thu, Aug 15, 2013 at 11:55:36AM +0200, Michel Dänzer wrote:
> > > > > > On Fri, 2013-08-02 at 17:58 +0200, Michel Dänzer wrote:
> > > > > > > On Wed, 2013-07-31 at 08:42 -0700, Tom Stellard wrote:
> > > > > > > > On Wed, Jul 31, 2013 at 01:04:01PM +0200, Michel Dänzer wrote:
> > > > > > > > >
> > > > > > > > > LLVM revision 187139 ('Allocate local registers in order for optimal
> > > > > > > > > coloring.') broke some derivative related piglit tests with the radeonsi
> > > > > > > > > driver.
> > > > > > > > >
> > > > > > > > > I'm attaching a diff between the bad and good generated code (as printed
> > > > > > > > > with RADEON_DUMP_SHADERS=1) for the glsl-derivs test. The only
> > > > > > > > > difference I can see is in which registers are used in which order.
> > > > > > > > >
> > > > > > > > > I wonder if we might be missing S_WAITCNT after DS_READ/WRITE
> > > > > > > > > instructions in some cases, but I haven't spotted any candidates for
> > > > > > > > > that in the bad code which aren't there in the good code as well. Can
> > > > > > > > > anyone else spot something I've missed?
> > > > > > > >
> > > > > > > > Shouldn't we be using the S_BARRIER instruction to keep the
> > > > > > > > threads in sync?
> > > > > > >
> > > > > > > Doesn't seem to help unfortunately, but thanks for the good
> > > > > > > suggestion.
> > > > > >
> > > > > > I found one thing going wrong: DS_WRITE_B32 ends up using a VGPR
> > > > > > register number instead of the $gds operand for encoding the GDS field
> > > > > > (the asm output from llc even shows the VGPR name). If the VGPR number
> > > > > > happens to be odd (i.e. to have the least significant bit set), the
> > > > > > shader ends up writing to GDS instead of LDS.
> > > > >
> > > > > Ouch, that's a pretty bad bug.
> > > > >
> > > > > > But I have no idea why this is happening, or how to fix it. :(
> > > > >
> > > > > I can take a look at it.
> > > >
> > > > The attached patch should fix the problem, can you test?
> > >
> > > Thanks for finding my silly mistake.
> > >
> > > However, I'd like to preserve the ability to use these instructions for
> > > GDS access, and the logic in SIInsertWaits::getHwCounts() only really
> > > makes sense for SMRD anyway.
> > >
> > > How about this patch instead? It fixes the piglit regressions that
> > > prompted me to start this thread.
>
> [...]
>
> > > diff --git a/lib/Target/R600/SIInsertWaits.cpp b/lib/Target/R600/SIInsertWaits.cpp
> > > index ba202e3..200e064 100644
> > > --- a/lib/Target/R600/SIInsertWaits.cpp
> > > +++ b/lib/Target/R600/SIInsertWaits.cpp
> > > @@ -134,14 +134,20 @@ Counters SIInsertWaits::getHwCounts(MachineInstr &MI) {
> > >    // LGKM may uses larger values
> > >    if (TSFlags & SIInstrFlags::LGKM_CNT) {
> > >
> > > -    MachineOperand &Op = MI.getOperand(0);
> > > -    if (!Op.isReg())
> > > -      Op = MI.getOperand(1);
> > > -    assert(Op.isReg() && "First LGKM operand must be a register!");
> > > -
> > > -    unsigned Reg = Op.getReg();
> > > -    unsigned Size = TRI->getMinimalPhysRegClass(Reg)->getSize();
> > > -    Result.Named.LGKM = Size > 4 ? 2 : 1;
> > > +    if (MI.getNumOperands() == 3) {
> >
> > We should add a TSFlag for SMRD like we do for MIMG and add a helper
> > function isSMRD to SIInstrInfo and use it here. The number of operands
> > for instructions tends to change from time to time.
>
> Like this?
>
>
> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Debian, X and DRI developer

> From c1d6b0f3d9cfcf2255257fdc87c748a46f799935 Mon Sep 17 00:00:00 2001
> From: Michel Dänzer
> Date: Thu, 15 Aug 2013 19:43:02 +0200
> Subject: [PATCH v2] R600/SI: Fix broken encoding of DS_WRITE_B32
>
> The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD
> instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused
> it to corrupt the encoding of that by clobbering the first operand with
> the second one.
>
> Undo that damage and only apply the SMRD logic to that.
>
> Fixes some derivative related piglit regressions with radeonsi.
>
> Signed-off-by: Michel Dänzer

Reviewed-by: Tom Stellard

> ---
>
> v2: Use SIInstrFlags to test for SMRD instructions.
>
>  lib/Target/R600/SIDefines.h | 3 ++-
>  lib/Target/R600/SIInsertWaits.cpp
Re: [Mesa-dev] [PATCH V3 3/3] i965/blorp: Add support for single sample scaled blit with bilinear filter
On 14 August 2013 18:28, Anuj Phogat wrote: > Currently, single-sample scaled blits with a GL_LINEAR filter fall > back to the meta path. This patch removes that limitation in the BLORP engine > and implements single-sample scaled blits with a bilinear filter. > No piglit or gles3 regressions are observed with this patch on Ivybridge. > > V2: Use the "sample" message to utilize the linear filtering functionality > built into the hardware. > V3: Define a bool variable (bilinear_filter) to handle the conditions > for GL_LINEAR blits. > Thanks, Anuj. Reviewed-by: Paul Berry
[Mesa-dev] [PATCH] R600/SI: Add pattern for xor of i1
From: Michel Dänzer Fixes two recent piglit regressions with radeonsi. Signed-off-by: Michel Dänzer --- lib/Target/R600/SIInstructions.td | 4 +++- test/CodeGen/R600/xor.ll | 17 + 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td index 4eb3566..436a2cd 100644 --- a/lib/Target/R600/SIInstructions.td +++ b/lib/Target/R600/SIInstructions.td @@ -1144,7 +1144,9 @@ def : Pat < (S_OR_B64 $src0, $src1) >; def S_XOR_B32 : SOP2_32 <0x0012, "S_XOR_B32", []>; -def S_XOR_B64 : SOP2_64 <0x0013, "S_XOR_B64", []>; +def S_XOR_B64 : SOP2_64 <0x0013, "S_XOR_B64", + [(set i1:$dst, (xor i1:$src0, i1:$src1))] +>; def S_ANDN2_B32 : SOP2_32 <0x0014, "S_ANDN2_B32", []>; def S_ANDN2_B64 : SOP2_64 <0x0015, "S_ANDN2_B64", []>; def S_ORN2_B32 : SOP2_32 <0x0016, "S_ORN2_B32", []>; diff --git a/test/CodeGen/R600/xor.ll b/test/CodeGen/R600/xor.ll index f52729d..84d4cd4 100644 --- a/test/CodeGen/R600/xor.ll +++ b/test/CodeGen/R600/xor.ll @@ -37,3 +37,20 @@ define void @xor_v4i32(<4 x i32> addrspace(1)* %out, <4 x i32> addrspace(1)* %in store <4 x i32> %result, <4 x i32> addrspace(1)* %out ret void } + +;EG-CHECK: @xor_i1 +;EG-CHECK: XOR_INT {{\*? *}}T{{[0-9]+\.[XYZW], PV\.[XYZW], PV\.[XYZW]}} + +;SI-CHECK: @xor_i1 +;SI-CHECK: S_XOR_B64 {{SGPR[0-9]+_SGPR[0-9]+, SGPR[0-9]+_SGPR[0-9]+, SGPR[0-9]+_SGPR[0-9]+}} + +define void @xor_i1(float addrspace(1)* %out, float addrspace(1)* %in0, float addrspace(1)* %in1) { + %a = load float addrspace(1) * %in0 + %b = load float addrspace(1) * %in1 + %acmp = fcmp oge float %a, 0.00e+00 + %bcmp = fcmp oge float %b, 0.00e+00 + %xor = xor i1 %acmp, %bcmp + %result = select i1 %xor, float %a, float %b + store float %result, float addrspace(1)* %out + ret void +} -- 1.8.4.rc2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] nouveau: xvmc on nv43
On Fri, Aug 16, 2013 at 5:40 AM, Pali Rohár wrote: > Hello Ilia, > > I was your last commit which fixing xvmc support for nv30 hw in mesa git tree. > Maybe you can help me. I have graphics card nvidia geforce 6600 gt (nv43 chip) > According to wiki page http://nouveau.freedesktop.org/wiki/FeatureMatrix/ xvmc > support for nv43 is already done. When I start xvmcinfo it print: FTR, an individual with a NV43 AGP had trouble with it. See http://nouveau.freedesktop.org/wiki/VideoAcceleration/ for a few more details. Note that if you're using a recent kernel, you need 3.11-rc4 or later (nouveau/master is fine too, of course), as support got broken at some point. > > $ ./xvmcinfo Huh, never heard of that. No gentoo ebuild either. > > Xv version 2.2 > XvMC version 1.1 > > screen number 0 >info for adaptor 0 [NV40 texture adapter] > number of XvMC surface types: 2 > > info about surface 0: > max_width=2048 > max_height=2048 > subpicture_max_width=2048 > subpicture_max_height=2048 > chroma_format: > XVMC_CHROMA_FORMAT_420 > mc_type: > format : MPEG2 > accelaration start from : IDCT > flags: > XVMC_BACKEND_SUBPICTURE XVMC_SUBPICTURE_INDEPENDENT_SCALING > > info about surface 1: > max_width=2048 > max_height=2048 > subpicture_max_width=2048 > subpicture_max_height=2048 > chroma_format: > XVMC_CHROMA_FORMAT_422 > mc_type: > format : MPEG2 > accelaration start from : IDCT > flags: > XVMC_BACKEND_SUBPICTURE XVMC_SUBPICTURE_INDEPENDENT_SCALING > >info for adaptor 1 [NV40 high quality adapter] > number of XvMC surface types: 0 > >info for adaptor 2 [NV Video Blitter] > number of XvMC surface types: 0 This actually doesn't (necessarily) have anything to do with reality. It's reported entirely by X, which has little to do with actual XvMC operation. > > So some xvmc support is there (via nouveau xvmc library). But when I tried to > use mpeg2play_accel testing application (or mplayer) it crash. Here is gdb > backtrace from coredump file: > > (gdb) bt > #0 0x in ?? 
() > #1 0x7fca300cd345 in XvMCCreateContext (dpy=0xbefd00, port=63, > surface_type_id=842094169, width=720, height=480, flags=1, context=0x60dee0) > at context.c:248 > #2 0x0040941f in init_display () at display.c:270 > #3 0x00402f0c in initdecoder () at mpeg2dec.c:211 > #4 0x00402b57 in main (argc=2, argv=0x7fff94c89f10) at mpeg2dec.c:121 > (gdb) bt full > #0 0x in ?? () > No symbol table info available. > #1 0x7fca300cd345 in XvMCCreateContext (dpy=0xbefd00, port=63, > surface_type_id=842094169, width=720, height=480, flags=1, context=0x60dee0) > at context.c:248 > found_port = true > scrn = 0 > chroma_format = 1 > mc_type = 65538 > surface_flags = 6 > subpic_max_w = 2048 > subpic_max_h = 2048 > ret = 0 > vscreen = 0xbf13f0 > pipe = 0xc191e0 > context_priv = 0xbfead0 > csc = {{0, 0, 0, 0}, {-2.02570359e-26, 4.59163468e-41, 5.89217978e-39, > 0}, {-2.02575536e-26, 4.59163468e-41, 8.44951942e-10, 4.5842078e-41}} > #2 0x0040941f in init_display () at display.c:270 > surface_type_id = 842094169 > result = 0 > i = 4204800 > color = 0 > root = 652 > #3 0x00402f0c in initdecoder () at mpeg2dec.c:211 > i = 640 > blk_cnt_tab = {6, 8, 12} > #4 0x00402b57 in main (argc=2, argv=0x7fff94c89f10) at mpeg2dec.c:121 > first = 1 > framenum = 0 > runtime = 0 > tstart = {tv_sec = 140735689563912, tv_usec = 4236512} > tstop = {tv_sec = 0, tv_usec = 4204800} > > It looks like that in mesa code is calling pipe->create_video_decoder(...) but > create_video_decoder is NULL and then it crash. 
> > (gdb) print *pipe > $1 = {screen = 0xbffa50, priv = 0xbf13f0, draw = 0x0, destroy = 0x7fca300d00a0 > , draw_vbo = 0x7fca300da250 , > render_condition = 0x7fca300dd760 , > create_query = 0x7fca300dd510 , > destroy_query = 0x7fca300dd500 , begin_query = > 0x7fca300dd890 , end_query = 0x7fca300dd670 > , > get_query_result = 0x7fca300dd410 , create_blend_state = > 0x7fca300d47f0 , > bind_blend_state = 0x7fca300d3da0 , > delete_blend_state = 0x7fca300d4160 , > create_sampler_state = 0x7fca300d69f0 , > bind_fragment_sampler_states = 0x7fca300d72a0 > , > bind_vertex_sampler_states = 0x7fca300d79a0 > , bind_geometry_sampler_states = 0, > bind_compute_sampler_states = 0, > delete_sampler_state = 0x7fca300d69e0 , > create_rasterizer_state = 0x7fca300d44a0 , > bind_rasterizer_state = 0x7fca300d3db0 , > delete_rasterizer_state = 0x7fca300d4150 , > create_depth_stencil_alpha_state = 0x7fca300d4170
Re: [Mesa-dev] [PATCH] R600/SI: Add pattern for xor of i1
On Fri, Aug 16, 2013 at 04:04:37PM +0200, Michel Dänzer wrote: > From: Michel Dänzer > > Fixes two recent piglit regressions with radeonsi. > > Signed-off-by: Michel Dänzer Reviewed-by: Tom Stellard > --- > lib/Target/R600/SIInstructions.td | 4 +++- > test/CodeGen/R600/xor.ll | 17 + > 2 files changed, 20 insertions(+), 1 deletion(-) > > diff --git a/lib/Target/R600/SIInstructions.td > b/lib/Target/R600/SIInstructions.td > index 4eb3566..436a2cd 100644 > --- a/lib/Target/R600/SIInstructions.td > +++ b/lib/Target/R600/SIInstructions.td > @@ -1144,7 +1144,9 @@ def : Pat < >(S_OR_B64 $src0, $src1) > >; > def S_XOR_B32 : SOP2_32 <0x0012, "S_XOR_B32", []>; > -def S_XOR_B64 : SOP2_64 <0x0013, "S_XOR_B64", []>; > +def S_XOR_B64 : SOP2_64 <0x0013, "S_XOR_B64", > + [(set i1:$dst, (xor i1:$src0, i1:$src1))] > +>; > def S_ANDN2_B32 : SOP2_32 <0x0014, "S_ANDN2_B32", []>; > def S_ANDN2_B64 : SOP2_64 <0x0015, "S_ANDN2_B64", []>; > def S_ORN2_B32 : SOP2_32 <0x0016, "S_ORN2_B32", []>; > diff --git a/test/CodeGen/R600/xor.ll b/test/CodeGen/R600/xor.ll > index f52729d..84d4cd4 100644 > --- a/test/CodeGen/R600/xor.ll > +++ b/test/CodeGen/R600/xor.ll > @@ -37,3 +37,20 @@ define void @xor_v4i32(<4 x i32> addrspace(1)* %out, <4 x > i32> addrspace(1)* %in >store <4 x i32> %result, <4 x i32> addrspace(1)* %out >ret void > } > + > +;EG-CHECK: @xor_i1 > +;EG-CHECK: XOR_INT {{\*? 
*}}T{{[0-9]+\.[XYZW], PV\.[XYZW], PV\.[XYZW]}} > + > +;SI-CHECK: @xor_i1 > +;SI-CHECK: S_XOR_B64 {{SGPR[0-9]+_SGPR[0-9]+, SGPR[0-9]+_SGPR[0-9]+, > SGPR[0-9]+_SGPR[0-9]+}} > + > +define void @xor_i1(float addrspace(1)* %out, float addrspace(1)* %in0, > float addrspace(1)* %in1) { > + %a = load float addrspace(1) * %in0 > + %b = load float addrspace(1) * %in1 > + %acmp = fcmp oge float %a, 0.00e+00 > + %bcmp = fcmp oge float %b, 0.00e+00 > + %xor = xor i1 %acmp, %bcmp > + %result = select i1 %xor, float %a, float %b > + store float %result, float addrspace(1)* %out > + ret void > +} > -- > 1.8.4.rc2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Don't copy propagate bitcasts with source modifiers.
On 15 August 2013 16:19, Matt Turner wrote: > Previously, copy propagation would cause bitcast_f2u(abs(float)) to > be performed in a single step, but the application of source modifiers > (abs, neg) happens after type conversion, leading to incorrect results. > > That is, for bitcast_f2u(abs(float)) we would in fact generate code to > do abs(bitcast_f2u(float)). > > For example, whereas bitcast_f2u(abs(float)) might result in a register > argument such as >(abs)g2.2<0,1,0>UD > > v2: Set interfered = true and break in register_coalesce instead of > returning false. > Reviewed-by: Paul Berry
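To see why the order matters, here is a small sketch in plain C rather than i965 code. It assumes that an abs modifier on an integer-typed source behaves like a two's-complement integer absolute value; that is a modelling assumption, and the helper names (float_abs, abs_then_bitcast, bitcast_then_abs, demo_bitcast_order) are invented for this example. Only bitcast_f2u mirrors a name from the patch description.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as an unsigned integer. */
static uint32_t bitcast_f2u(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Simple float absolute value, written out to avoid needing libm. */
static float float_abs(float f)
{
    return f < 0.0f ? -f : f;
}

/* What the shader asked for: abs on the float, then reinterpret. */
static uint32_t abs_then_bitcast(float f)
{
    return bitcast_f2u(float_abs(f));
}

/* What the fused form computes under the assumption above:
 * reinterpret first, then take an integer absolute value. */
static uint32_t bitcast_then_abs(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);
    return (uint32_t)(i < 0 ? -(int64_t)i : (int64_t)i);
}

static void demo_bitcast_order(void)
{
    /* -1.0f has the bit pattern 0xBF800000. */
    assert(abs_then_bitcast(-1.0f) == 0x3F800000u); /* bits of +1.0f */
    assert(bitcast_then_abs(-1.0f) == 0x40800000u); /* bits of 4.0f */

    /* Non-negative inputs hide the problem: both orders agree. */
    assert(abs_then_bitcast(1.0f) == bitcast_then_abs(1.0f));
}
```

The two orders only disagree for negative inputs, so a bug like this can slip past tests that use positive values.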
Re: [Mesa-dev] [PATCH 1/2] i965: Emit MOVs for neg/abs.
On Thu, Aug 15, 2013 at 7:38 PM, Ian Romanick wrote: > On 08/12/2013 01:18 PM, Matt Turner wrote: >> >> Necessary to avoid combining a bitcast and a modifier into a single >> operation. Otherwise if safe, the MOV should be removed by >> copy-propagation or register coalescing. > > > Has that been verified with shaderdb? Yes, there are only four changes -- four shaders in the Cave that do something like mov a.w, -b.x which we now generate an extra instruction for, because copy-propagation bails out for non-XYZW swizzles. Seems acceptable, given the limitation of copy prop.
Re: [Mesa-dev] nouveau: xvmc on nv43
On Fri, Aug 16, 2013 at 7:34 AM, Ilia Mirkin wrote: > On Fri, Aug 16, 2013 at 5:40 AM, Pali Rohár wrote: >> Hello Ilia, >> >> I was your last commit which fixing xvmc support for nv30 hw in mesa git >> tree. >> Maybe you can help me. I have graphics card nvidia geforce 6600 gt (nv43 >> chip) >> According to wiki page http://nouveau.freedesktop.org/wiki/FeatureMatrix/ >> xvmc >> support for nv43 is already done. When I start xvmcinfo it print: > > FTR, an individual with a NV43 AGP had trouble with it. See > http://nouveau.freedesktop.org/wiki/VideoAcceleration/ for a few more > details. Note that if you're using a recent kernel, you need 3.11-rc4 > or later (nouveau/master is fine too, of course), as support got > broken at some point. > >> >> $ ./xvmcinfo > > Huh, never heard of that. No gentoo ebuild either. It's available here: http://www.mythtv.org/wiki/XvMC#Checking_your_installation Maybe we (X.Org) should package it or add it to libXvMC if it's useful. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radeonsi: Don't leave gaps between position exports from vertex shader
On Friday 16 August 2013 10:59:13 Michel Dänzer wrote: > On Fri, 2013-08-09 at 23:41 +0200, Laurent Carlier wrote: > > On Friday 9 August 2013 18:50:20 Michel Dänzer wrote: > > > From: Michel Dänzer > > > > > > Exporting position 2/3 (clip distances) but not position 1 (point size) > > > causes geometry corruption for some reason. > > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974 > > > > > > Cc: mesa-sta...@lists.freedesktop.org > > > Signed-off-by: Michel Dänzer > > > > Tested with Dota2 and L4D2, and it fixes the rendering > > Thanks for testing. Did you overlook the corruption you reported in > https://bugs.freedesktop.org/show_bug.cgi?id=68162 when you tested that > series, or was it not there? I've been looking into that as well, but > haven't found the problem yet. The problem was already there, because I noticed similar corruption months ago with an R600g card (HD6870) with r600-llvm-compiler enabled. This problem was difficult to notice on radeonsi before your fix because of the broken rendering. -- Laurent Carlier ArchLinux Developer http://www.archlinux.org
Re: [Mesa-dev] [PATCH 1/2] i965: Emit MOVs for neg/abs.
On 08/16/2013 08:40 AM, Matt Turner wrote: > On Thu, Aug 15, 2013 at 7:38 PM, Ian Romanick wrote: >> On 08/12/2013 01:18 PM, Matt Turner wrote: >>> >>> Necessary to avoid combining a bitcast and a modifier into a single >>> operation. Otherwise if safe, the MOV should be removed by >>> copy-propagation or register coalescing. >> >> Has that been verified with shaderdb? > > Yes, there are only four changes -- four shaders in the Cave that do > something like mov a.w, -b.x which we now generate an extra instruction for, > because copy-propagation bails out for non-XYZW swizzles. Seems acceptable, > given the limitation of copy prop. With that added, in some form, to the commit message, Reviewed-by: Ian Romanick
Re: [Mesa-dev] [PATCH 1/3] i965: Move GL_APPLE_object_purgeable functionality into a new file.
On 08/14/2013 12:06 PM, Kenneth Graunke wrote: GL_APPLE_object_purgeable creates a mechanism for marking OpenGL objects as "purgeable" so they can be thrown away when system resources become scarce. It specifically applies to buffer objects, textures, and renderbuffers. The intel_buffer_objects.c file provides core functionality for GL buffer objects, such as MapBufferRange and CopyBufferSubData. Having texture and renderbuffer functionality in that file is a bit strange. The 2010 copyright on the new file is because Chris Wilson first added this code in January 2010 (commit 755915fa). Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/brw_context.h | 3 + src/mesa/drivers/dri/i965/brw_object_purgeable.c | 173 +++ src/mesa/drivers/dri/i965/intel_buffer_objects.c | 136 -- 4 files changed, 177 insertions(+), 136 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/brw_object_purgeable.c diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources index ac8487b..c92573f 100644 --- a/src/mesa/drivers/dri/i965/Makefile.sources +++ b/src/mesa/drivers/dri/i965/Makefile.sources @@ -65,6 +65,7 @@ i965_FILES = \ brw_interpolation_map.c \ brw_lower_texture_gradients.cpp \ brw_misc_state.c \ + brw_object_purgeable.c \ brw_program.c \ brw_primitive_restart.c \ brw_queryobj.c \ diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 00dd2b4..e788d14 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1305,6 +1305,9 @@ void brw_get_depthstencil_tile_masks(struct intel_mipmap_tree *depth_mt, void brw_workaround_depthstencil_alignment(struct brw_context *brw, GLbitfield clear_mask); +/* brw_object_purgeable.c */ +void brw_init_object_purgeable_functions(struct dd_function_table *functions); + /*== * brw_queryobj.c */ diff --git a/src/mesa/drivers/dri/i965/brw_object_purgeable.c 
b/src/mesa/drivers/dri/i965/brw_object_purgeable.c new file mode 100644 index 000..4630416 --- /dev/null +++ b/src/mesa/drivers/dri/i965/brw_object_purgeable.c @@ -0,0 +1,173 @@ +/* + * Copyright © 2010 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +/** + * @file brw_object_purgeable.c + * + * The driver implementation of the GL_APPLE_object_purgeable extension. + */ + +#include "main/imports.h" +#include "main/mtypes.h" +#include "main/macros.h" +#include "main/bufferobj.h" + +#include "brw_context.h" +#include "intel_buffer_objects.h" +#include "intel_fbo.h" +#include "intel_mipmap_tree.h" + +static GLenum +intel_buffer_purgeable(drm_intel_bo *buffer) +{ + int retained = 0; + + if (buffer != NULL) + retained = drm_intel_bo_madvise(buffer, I915_MADV_DONTNEED); + + return retained ? 
GL_VOLATILE_APPLE : GL_RELEASED_APPLE; +} + +static GLenum +intel_buffer_object_purgeable(struct gl_context * ctx, + struct gl_buffer_object *obj, + GLenum option) +{ + struct intel_buffer_object *intel_obj = intel_buffer_object(obj); + + if (intel_obj->buffer != NULL) + return intel_buffer_purgeable(intel_obj->buffer); + + if (option == GL_RELEASED_APPLE) { + return GL_RELEASED_APPLE; + } else { + /* XXX Create the buffer and madvise(MADV_DONTNEED)? */ + struct brw_context *brw = brw_context(ctx); + drm_intel_bo *bo = intel_bufferobj_buffer(brw, intel_obj, INTEL_READ); + + return intel_buffer_purgeable(bo); + } +} + +static GLenum +intel_texture_object_purgeable(struct gl_context * ctx, + struct
Re: [Mesa-dev] [PATCH 2/3] i965: Split intel_upload code out into a separate file.
On 08/14/2013 12:06 PM, Kenneth Graunke wrote: This code upload performs batched uploads via a BO. By moving it out to a separate file, intel_buffer_objects.c only provides the core buffer object functionality. Signed-off-by: Kenneth Graunke Patch 2 and 3 are Reviewed-by: Ian Romanick --- src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/intel_buffer_objects.c | 133 - src/mesa/drivers/dri/i965/intel_upload.c | 177 +++ 3 files changed, 178 insertions(+), 133 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/intel_upload.c diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources index c92573f..60cd6e0 100644 --- a/src/mesa/drivers/dri/i965/Makefile.sources +++ b/src/mesa/drivers/dri/i965/Makefile.sources @@ -26,6 +26,7 @@ i965_FILES = \ intel_tex_image.c \ intel_tex_subimage.c \ intel_tex_validate.c \ + intel_upload.c \ brw_blorp.cpp \ brw_blorp_blit.cpp \ brw_blorp_clear.cpp \ diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c b/src/mesa/drivers/dri/i965/intel_buffer_objects.c index 740913b..81c72fa 100644 --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c +++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c @@ -412,139 +412,6 @@ intel_bufferobj_buffer(struct brw_context *brw, return intel_obj->buffer; } -#define INTEL_UPLOAD_SIZE (64*1024) - -void -intel_upload_finish(struct brw_context *brw) -{ - if (!brw->upload.bo) - return; - - if (brw->upload.buffer_len) { - drm_intel_bo_subdata(brw->upload.bo, - brw->upload.buffer_offset, - brw->upload.buffer_len, - brw->upload.buffer); - brw->upload.buffer_len = 0; - } - - drm_intel_bo_unreference(brw->upload.bo); - brw->upload.bo = NULL; -} - -static void wrap_buffers(struct brw_context *brw, GLuint size) -{ - intel_upload_finish(brw); - - if (size < INTEL_UPLOAD_SIZE) - size = INTEL_UPLOAD_SIZE; - - brw->upload.bo = drm_intel_bo_alloc(brw->bufmgr, "upload", size, 0); - brw->upload.offset = 0; -} - -void intel_upload_data(struct 
brw_context *brw, - const void *ptr, GLuint size, GLuint align, - drm_intel_bo **return_bo, - GLuint *return_offset) -{ - GLuint base, delta; - - base = (brw->upload.offset + align - 1) / align * align; - if (brw->upload.bo == NULL || base + size > brw->upload.bo->size) { - wrap_buffers(brw, size); - base = 0; - } - - drm_intel_bo_reference(brw->upload.bo); - *return_bo = brw->upload.bo; - *return_offset = base; - - delta = base - brw->upload.offset; - if (brw->upload.buffer_len && - brw->upload.buffer_len + delta + size > sizeof(brw->upload.buffer)) - { - drm_intel_bo_subdata(brw->upload.bo, - brw->upload.buffer_offset, - brw->upload.buffer_len, - brw->upload.buffer); - brw->upload.buffer_len = 0; - } - - if (size < sizeof(brw->upload.buffer)) - { - if (brw->upload.buffer_len == 0) -brw->upload.buffer_offset = base; - else -brw->upload.buffer_len += delta; - - memcpy(brw->upload.buffer + brw->upload.buffer_len, ptr, size); - brw->upload.buffer_len += size; - } - else - { - drm_intel_bo_subdata(brw->upload.bo, base, size, ptr); - } - - brw->upload.offset = base + size; -} - -void *intel_upload_map(struct brw_context *brw, GLuint size, GLuint align) -{ - GLuint base, delta; - char *ptr; - - base = (brw->upload.offset + align - 1) / align * align; - if (brw->upload.bo == NULL || base + size > brw->upload.bo->size) { - wrap_buffers(brw, size); - base = 0; - } - - delta = base - brw->upload.offset; - if (brw->upload.buffer_len && - brw->upload.buffer_len + delta + size > sizeof(brw->upload.buffer)) - { - drm_intel_bo_subdata(brw->upload.bo, - brw->upload.buffer_offset, - brw->upload.buffer_len, - brw->upload.buffer); - brw->upload.buffer_len = 0; - } - - if (size <= sizeof(brw->upload.buffer)) { - if (brw->upload.buffer_len == 0) -brw->upload.buffer_offset = base; - else -brw->upload.buffer_len += delta; - - ptr = brw->upload.buffer + brw->upload.buffer_len; - brw->upload.buffer_len += size; - } else - ptr = malloc(size); - - return ptr; -} - -void 
intel_upload_unmap(struct brw_context *brw, - const void *ptr, GLuint size, GLuint align, - drm_intel_bo **return_bo, - GLuint *return_offset) -{ - GLuint base; - - base = (brw->upload.offset + align - 1) / align * align; - if (size > sizeof(brw->upload.buffer)) { -
Re: [Mesa-dev] [PATCH 1/2] i965: Add Gen6 depth stall flushes before disabling depth in BLORP.
Series is Reviewed-by: Ian Romanick On 08/13/2013 12:07 PM, Kenneth Graunke wrote: We emit these before configuring depth in the normal path, or actually using the depth buffer in BLORP - we just failed to emit them when disabling depth altogether. On Sandybridge, this also requires the post_sync_nonzero flush. Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/gen6_blorp.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp b/src/mesa/drivers/dri/i965/gen6_blorp.cpp index a4a9081..129c113 100644 --- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp +++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp @@ -914,6 +914,9 @@ static void gen6_blorp_emit_depth_disable(struct brw_context *brw, const brw_blorp_params *params) { + intel_emit_post_sync_nonzero_flush(brw); + intel_emit_depth_stall_flushes(brw); + BEGIN_BATCH(7); OUT_BATCH(_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2)); OUT_BATCH((BRW_DEPTHFORMAT_D32_FLOAT << 18) | ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] st/vdpau: drop unnecessary variable prof
Signed-off-by: Emil Velikov --- src/gallium/state_trackers/vdpau/mixer.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/src/gallium/state_trackers/vdpau/mixer.c b/src/gallium/state_trackers/vdpau/mixer.c index 26db5c8..8c20d05 100644 --- a/src/gallium/state_trackers/vdpau/mixer.c +++ b/src/gallium/state_trackers/vdpau/mixer.c @@ -50,7 +50,6 @@ vlVdpVideoMixerCreate(VdpDevice device, VdpStatus ret; struct pipe_screen *screen; unsigned max_width, max_height, i; - enum pipe_video_profile prof = PIPE_VIDEO_PROFILE_UNKNOWN; vlVdpDevice *dev = vlGetDataHTAB(device); if (!dev) @@ -132,8 +131,8 @@ vlVdpVideoMixerCreate(VdpDevice device, VDPAU_MSG(VDPAU_WARN, "[VDPAU] Max layers > 4 not supported\n", vmixer->max_layers); goto no_params; } - max_width = screen->get_video_param(screen, prof, PIPE_VIDEO_CAP_MAX_WIDTH); - max_height = screen->get_video_param(screen, prof, PIPE_VIDEO_CAP_MAX_HEIGHT); + max_width = screen->get_video_param(screen, PIPE_VIDEO_PROFILE_UNKNOWN, PIPE_VIDEO_CAP_MAX_WIDTH); + max_height = screen->get_video_param(screen, PIPE_VIDEO_PROFILE_UNKNOWN, PIPE_VIDEO_CAP_MAX_HEIGHT); if (vmixer->video_width < 48 || vmixer->video_width > max_width) { VDPAU_MSG(VDPAU_WARN, "[VDPAU] 48 < %u < %u not valid for width\n", vmixer->video_width, max_width); -- 1.8.3.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] segfault in pstip_bind_sampler_states
On 08/12/2013 10:29 AM, Brian Paul wrote: > On 08/09/2013 01:50 PM, Kevin H. Hobbs wrote: >> (gdb) print pstip >> $1 = (struct pstip_stage *) 0xff66331aff66331a >> >> I don't think my actual RAM goes that high. > > That looks suspect since the low and high halves of the address are the > same. > I believe pstip->state gets it's funky value in null_sw_create () at null_sw_winsys.c:146 My heavily edited gdb session follows : $ gdb /home/kevin/kitware/VTK_OSMesa_Build/bin/vtkpython (gdb) run lots of options Program received signal SIGSEGV, Segmentation fault. pstip_bind_sampler_states (pipe=, num=0, sampler=0x0) at draw/draw_pipe_pstipple.c:713 713 pstip->state.samplers[i] = NULL; (gdb) bt #0 pstip_bind_sampler_states (pipe=, num=0, sampler=0x0) at draw/draw_pipe_pstipple.c:713 #1 0x7fffdf75839c in cso_release_all (ctx=ctx@entry=0x15eebe0) at cso_cache/cso_context.c:307 (gdb) break cso_context.c:307 (gdb) run lots of options (gdb) step pstip_bind_sampler_states (pipe=0x13d4960, num=0, sampler=0x0) at draw/draw_pipe_pstipple.c:706 706 { (gdb) next 711memcpy(pstip->state.samplers, sampler, num * sizeof(void *)); # oops not a debug build (gdb) print pstip $1 = (struct pstip_stage *) 0x1360d30 (gdb) print pstip->state $2 = {samplers = {0x7fffdf863b80 ,... (gdb) print &(pstip->state) $3 = (struct {...} *) 0x1360db0 (gdb) print &(pstip->state.samplers) $4 = (void *(*)[16]) 0x1360db0 # duh (gdb) watch *0x1360db0 Hardware watchpoint 2: *0x1360db0 (gdb) run lots of options Hardware watchpoint 2: *0x1360db0 Old value = New value = 3369 vtkCellLinks::InsertCellReference (this=0x1359ce0, ptId=252, pos=4, cellId=3369) at /home/kevin/kitware/VTK/Common/DataModel/vtkCellLinks.h:159 159 } (gdb) continue Continuing. 
Hardware watchpoint 2: *0x1360db0 Old value = 3369 New value = 0 __libc_calloc (n=, elem_size=) at malloc.c:3286 3286if (nclears > 8) { (gdb) bt #0 __libc_calloc (n=, elem_size=) at malloc.c:3286 #1 0x7fffdf863c23 in null_sw_create () at null_sw_winsys.c:135 #2 0x7fffdf595656 in osmesa_create_screen () at target.c:43 #3 0x7fffdf887b93 in get_st_manager () at osmesa.c:151 #4 0x7fffdf888263 in OSMesaCreateContextExt (format=6408, depthBits=24, stencilBits=, accumBits=0, sharelist=) at osmesa.c:557 #5 0x7fffdcc02a10 in vtkOSOpenGLRenderWindow::CreateOffScreenWindow (this=0xebc270, width=150, height=150) at /home/kevin/kitware/VTK/Rendering/OpenGL/vtkOSOpenGLRenderWindow.cxx:188 ... (gdb) continue Continuing. Hardware watchpoint 2: *0x1360db0 Old value = 0 New value = -544851072 null_sw_create () at null_sw_winsys.c:146 146winsys->displaytarget_display = null_sw_displaytarget_display; (gdb) bt #0 null_sw_create () at null_sw_winsys.c:146 #1 0x7fffdf595656 in osmesa_create_screen () at target.c:43 #2 0x7fffdf887b93 in get_st_manager () at osmesa.c:151 #3 0x7fffdf888263 in OSMesaCreateContextExt (format=6408, depthBits=24, stencilBits=, accumBits=0, sharelist=) at osmesa.c:557 #4 0x7fffdcc02a10 in vtkOSOpenGLRenderWindow::CreateOffScreenWindow (this=0xebc270, width=150, height=150) at /home/kevin/kitware/VTK/Rendering/OpenGL/vtkOSOpenGLRenderWindow.cxx:188 (gdb) continue Breakpoint 1, cso_release_all (ctx=ctx@entry=0x15eebe0) at cso_cache/cso_context.c:307 307 ctx->pipe->bind_fragment_sampler_states( ctx->pipe, 0, NULL ); (gdb) continue Continuing. Program received signal SIGSEGV, Segmentation fault. pstip_bind_sampler_states (pipe=, num=0, sampler=0x0) at draw/draw_pipe_pstipple.c:713 713 pstip->state.samplers[i] = NULL; signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radeonsi/compute: Let the state tracker do all the flushing
From: Tom Stellard It shouldn't be necessary to call radeon_winsys::cs_flush() from radeonsi_launch_grid(), because the state tracker is responsible for flushing the pipeline at the appropriate time. The current behavior is also wrong, because radeonsi_launch_grid() submits packets to the compute ring, but when the state tracker calls pipe->flush() everything is submitted to the graphics ring. This has the potential to create a race condition. The downside of removing this flush is that the compute dispatch packets will be sent to the graphics ring rather than the compute ring. In the future we will need to come up with a way to detect 'compute' command streams and submit them to the appropriate ring. --- src/gallium/drivers/radeonsi/radeonsi_compute.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c b/src/gallium/drivers/radeonsi/radeonsi_compute.c index 41c72c5..10309ba 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_compute.c +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c @@ -230,9 +230,6 @@ static void radeonsi_launch_grid( } #endif - rctx->ws->cs_flush(rctx->cs, RADEON_FLUSH_COMPUTE, 0); - rctx->ws->buffer_wait(shader->bo->buf, 0); - FREE(pm4); FREE(kernel_args); } -- 1.8.1.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/6] i965/fs: Track the maximum surface index used in brw_wm_prog_data.
On Friday, August 16, 2013 06:11:25 AM Paul Berry wrote: > On 14 August 2013 21:07, Kenneth Graunke wrote: > > This allows us to determine how small we can make the binding table. > > > > Since it depends entirely on the shader program, we can just compute > > it once at compile time, rather than at binding table emit time (which > > happens during drawing). > > > > Signed-off-by: Kenneth Graunke > > --- > > > > src/mesa/drivers/dri/i965/brw_context.h | 2 ++ > > src/mesa/drivers/dri/i965/brw_fs.h| 2 ++ > > src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 23 +++ > > 3 files changed, 27 insertions(+) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > > b/src/mesa/drivers/dri/i965/brw_context.h > > index ff0a65c..380fe08 100644 > > --- a/src/mesa/drivers/dri/i965/brw_context.h > > +++ b/src/mesa/drivers/dri/i965/brw_context.h > > @@ -305,6 +305,8 @@ struct brw_wm_prog_data { > > > > GLuint reg_blocks_16; > > GLuint total_scratch; > > > > + unsigned max_surface_index; > > + > > I'm bothered by the off-by-one inconsistency of using max_surface_index > here, but using binding_table_size over in brw_vec4_prog_data (see patch > 5). Could we change this to binding_table_size, and update > fs_generator::mark_surface_used() to do: > > prog_data->binding_table_size = MAX2(prog_data->binding_table_size, > surf_index + 1); > > Then it would be consistent with vec4_generator::mark_surface_used(). > > With that changed, this patch is: > > Reviewed-by: Paul Berry Sure. That would be better. I wrote this code first, and didn't think about the case where there were 0 surfaces. For the FS, it doesn't matter much since you always have at least one render target at surface index 0. For the VS, surface index 0 is the pull constant buffer, which is very optional. I'll make them consistent. --Ken ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
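For reference, the size-based bookkeeping Paul suggests can be sketched as below. This is a standalone illustration, not the actual i965 code: struct prog_data_sketch and demo_binding_table_size are invented names, and MAX2 is defined locally so that the snippet compiles on its own.

```c
#include <assert.h>

/* Local stand-in for the usual MAX2 helper macro. */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical, cut-down stand-in for the real prog_data structs. */
struct prog_data_sketch {
    unsigned binding_table_size; /* number of entries, 0 if none used */
};

/* Record that surf_index is referenced by the program. */
static void mark_surface_used(struct prog_data_sketch *prog_data,
                              unsigned surf_index)
{
    prog_data->binding_table_size =
        MAX2(prog_data->binding_table_size, surf_index + 1);
}

static void demo_binding_table_size(void)
{
    struct prog_data_sketch pd = { 0 };

    /* No surfaces used yet: a size of 0 says so unambiguously. */
    assert(pd.binding_table_size == 0);

    mark_surface_used(&pd, 0); /* surface 0 needs a table of 1 entry */
    assert(pd.binding_table_size == 1);

    mark_surface_used(&pd, 4); /* highest index so far: 5 entries */
    mark_surface_used(&pd, 2); /* lower index: size stays at 5 */
    assert(pd.binding_table_size == 5);
}
```

Tracking a size instead of a maximum index avoids the ambiguity Ken mentions: with a max index, 0 could mean either "only surface 0 is used" or "no surfaces at all".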
Re: [Mesa-dev] [PATCH] clover: Fix linkage of libOpenCL
On Wed, Aug 07, 2013 at 05:48:48PM +0200, Niels Ole Salscheider wrote:
> Clover needs the option component of llvm.
>

Pushed, thanks!

-Tom

> Signed-off-by: Niels Ole Salscheider
> ---
>  configure.ac | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/configure.ac b/configure.ac
> index 62d06e0..0dcd2a5 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1617,6 +1617,10 @@ if test "x$enable_gallium_llvm" = xyes; then
>      if $LLVM_CONFIG --components | grep -qw 'irreader'; then
>          LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader"
>      fi
> +    # LLVM 3.4 requires Option
> +    if $LLVM_CONFIG --components | grep -qw 'option'; then
> +        LLVM_COMPONENTS="${LLVM_COMPONENTS} option"
> +    fi
>  fi
>  DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT"
>  MESA_LLVM=1
> --
> 1.7.11.7
[Mesa-dev] [PATCH] gallivm: do clamping of border color correctly for all formats
From: Roland Scheidegger Turns out it is actually very complicated to figure out what a format really is wrt range, as using channel information for determining unorm/snorm etc. doesn't work for a bunch of cases - namely compressed, subsampled, other. Also while here add clamping for uint/sint as well - d3d10 doesn't actually need this (can only use ld with these formats hence no border) and we could do this outside the shader for GL easily (due to the fixed texture/sampler relation), but do it here too just so I can forget about it. --- src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 162 ++--- 1 file changed, 144 insertions(+), 18 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c index 76de006..f61c6c5 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c @@ -42,6 +42,7 @@ #include "util/u_math.h" #include "util/u_format.h" #include "util/u_cpu_detect.h" +#include "util/u_format_rgb9e5.h" #include "lp_bld_debug.h" #include "lp_bld_type.h" #include "lp_bld_const.h" @@ -206,33 +207,158 @@ lp_build_sample_texel_soa(struct lp_build_sample_context *bld, lp_build_const_int32(bld->gallivm, chan)); LLVMValueRef border_chan_vec = lp_build_broadcast_scalar(&bld->float_vec_bld, border_chan); +LLVMValueRef min_clamp = NULL; +LLVMValueRef max_clamp = NULL; if (!bld->texel_type.floating) { border_chan_vec = LLVMBuildBitCast(builder, border_chan_vec, bld->texel_bld.vec_type, ""); } -else { - /* -* For normalized format need to clamp border color (technically -* probably should also quantize the data). Really sucks doing this -* here but can't avoid at least for now since this is part of -* sampler state and texture format is part of sampler_view state. -*/ +/* + * For normalized format need to clamp border color (technically + * probably should also quantize the data). 
Really sucks doing this + * here but can't avoid at least for now since this is part of + * sampler state and texture format is part of sampler_view state. + * (Could definitely do it outside per-sample loop but llvm should + * take care of that.) + * GL also expects clamping for uint/sint formats too so + * do that as well (d3d10 can't end up here with uint/sint since it + * only supports them with ld). + */ +if (format_desc->layout == UTIL_FORMAT_LAYOUT_PLAIN) { unsigned chan_type = format_desc->channel[chan_s].type; unsigned chan_norm = format_desc->channel[chan_s].normalized; - if (chan_type == UTIL_FORMAT_TYPE_SIGNED && chan_norm) { - LLVMValueRef clamp_min; - clamp_min = lp_build_const_vec(bld->gallivm, bld->texel_type, -1.0F); - border_chan_vec = lp_build_clamp(&bld->texel_bld, border_chan_vec, - clamp_min, - bld->texel_bld.one); + unsigned pure_int = format_desc->channel[chan_s].pure_integer; + if (chan_type == UTIL_FORMAT_TYPE_SIGNED) { + if (chan_norm) { + min_clamp = lp_build_const_vec(bld->gallivm, bld->texel_type, -1.0F); + max_clamp = bld->texel_bld.one; + } + else if (pure_int) { + /* + * Border color was stored as int, hence need min/max clamp + * only if chan has less than 32 bits.. + */ + unsigned chan_size = format_desc->channel[chan_s].size; + if (chan_size < 32) { +min_clamp = lp_build_const_int_vec(bld->gallivm, bld->texel_type, + 0 - (1 << (chan_size - 1))); +max_clamp = lp_build_const_int_vec(bld->gallivm, bld->texel_type, + (1 << (chan_size - 1)) - 1); + } + } + /* TODO: no idea about non-pure, non-normalized! */ } - else if (chan_type == UTIL_FORMAT_TYPE_UNSIGNED && chan_norm) { - border_chan_vec = lp_build_clamp(&bld->texel_bld, border_chan_vec, - bld->texel_bld.zero, - bld->texel_bld.one); + else if (chan_typ
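The clamp range for signed pure-integer channels in the patch above depends only on the channel width in bits. As a rough scalar illustration (plain C rather than LLVM IR generation, with hypothetical helper names that are not gallivm functions), the bounds and the clamp could look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: representable range of a signed integer channel
 * that is chan_size bits wide, for widths below 32 bits.
 * E.g. 8 bits gives [-128, 127].
 */
static void
sint_channel_range(unsigned chan_size, int32_t *min_out, int32_t *max_out)
{
   assert(chan_size >= 1 && chan_size < 32);
   *max_out = (int32_t)((UINT32_C(1) << (chan_size - 1)) - 1);
   *min_out = -*max_out - 1;
}

/* Clamp a border color component that was stored as a 32-bit int to the
 * range of the channel. A full 32-bit channel needs no clamping, which
 * mirrors the "only if chan has less than 32 bits" comment in the patch.
 */
static int32_t
clamp_sint_border(int32_t value, unsigned chan_size)
{
   int32_t lo, hi;

   if (chan_size >= 32)
      return value;

   sint_channel_range(chan_size, &lo, &hi);
   if (value < lo)
      return lo;
   if (value > hi)
      return hi;
   return value;
}
```

The sketch takes the width in bits as its argument, so the two bounds match the 0 - (1 << (chan_size - 1)) and (1 << (chan_size - 1)) - 1 expressions in the diff.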
[Mesa-dev] [Bug 68209] New: piglit glean vertArrayBGRA regression
https://bugs.freedesktop.org/show_bug.cgi?id=68209

Priority: medium
Bug ID: 68209
Keywords: regression
CC: fred...@kde.org
Assignee: mesa-dev@lists.freedesktop.org
Summary: piglit glean vertArrayBGRA regression
Severity: normal
Classification: Unclassified
OS: Linux (All)
Reporter: v...@freedesktop.org
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Mesa core
Product: Mesa

mesa: aafb0f9e06a0e08ebb38a92ce2090739d380df71 (master)

$ ./bin/glean -t vertArrayBGRA --quick
Mesa: User error: GL_INVALID_OPERATION in glColorPointer(size=GL_BGRA and type=GL_FLOAT)
vertArrayBGRA: Error: glColorPointer(size=GL_BGRA, type=GL_FLOAT) did not generate expected error.
vertArrayBGRA: FAIL rgba8, db, z24, s8, accrgba16, win+pmap, id 33

0e7a61a29f883c63a5439ac16eddeba3aaf4 is the first bad commit
commit 0e7a61a29f883c63a5439ac16eddeba3aaf4
Author: Fredrik Höglund
Date: Fri Apr 12 17:36:06 2013 +0200

    mesa: Update the BGRA vertex array error handling

    The error code was changed from INVALID_VALUE to INVALID_OPERATION in
    OpenGL 3.3.

    We should also generate an error when size is BGRA and normalized
    is FALSE.

    Reviewed-by: Kenneth Graunke

:04 04 c8835c9734b2a932e4b0c627e578ad9e3f098896 0c3b681983edb9700474957f91d656b66e95a7c9 M src
bisect run success

--
You are receiving this mail because:
You are the assignee for the bug.
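To make the rule from the bisected commit message concrete, here is a small sketch of the updated BGRA check. It is an assumption-laden illustration, not Mesa's validation code: bgra_array_error is a hypothetical helper, the enum values are the standard OpenGL ones defined locally, and the packed 2_10_10_10 types that may also be paired with BGRA are left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard OpenGL enum values, defined locally so no GL headers are needed. */
#define GL_NO_ERROR           0
#define GL_INVALID_VALUE      0x0501
#define GL_INVALID_OPERATION  0x0502
#define GL_UNSIGNED_BYTE      0x1401
#define GL_FLOAT              0x1406
#define GL_BGRA               0x80E1

/* Hypothetical helper modelling only the BGRA-specific rules after the
 * change: size == GL_BGRA requires an unsigned byte type and normalized
 * data, and violations report GL_INVALID_OPERATION.
 * Assumption: packed 2_10_10_10 types are not modelled here.
 */
static unsigned
bgra_array_error(int size, unsigned type, bool normalized)
{
   if (size != GL_BGRA) {
      /* Plain component counts; the BGRA rules do not apply. */
      assert(size >= 1 && size <= 4);
      return GL_NO_ERROR;
   }

   if (type != GL_UNSIGNED_BYTE || !normalized)
      return GL_INVALID_OPERATION;

   return GL_NO_ERROR;
}
```

Under this sketch the combination from the report, size=GL_BGRA with type=GL_FLOAT, yields GL_INVALID_OPERATION, which would fail a test that still expects the earlier GL_INVALID_VALUE.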
[Mesa-dev] [Bug 68209] piglit glean vertArrayBGRA regression
https://bugs.freedesktop.org/show_bug.cgi?id=68209

Ian Romanick changed:

What       |Removed |Added
Status     |NEW     |RESOLVED
Resolution |---     |NOTOURBUG

--- Comment #1 from Ian Romanick ---
There are patches for the piglit test pending. Mesa was changed to generate the correct OpenGL 3.3 error, and piglit just hasn't caught up yet. See also bug #67925.
[Mesa-dev] [Bug 68209] piglit glean vertArrayBGRA regression
https://bugs.freedesktop.org/show_bug.cgi?id=68209

--- Comment #2 from Kenneth Graunke ---
I don't believe there are patches pending for the Glean test, though. It needs to be fixed too.
Re: [Mesa-dev] [PATCH 00/10] i965: Separate VS/FS sampler tables.
On 08/14/2013 06:55 PM, Kenneth Graunke wrote:
> Currently, i965 uploads a single SAMPLER_STATE table shared across all
> shader stages (VS, FS). This series splits it out, uploading a unique
> table for each stage.
>
> I think this may actually fix some bugs with vertex texturing: Piglit's
> fragment-and-vertex-texturing uses two textures (unit 0 and unit 1), one
> in each shader. vs->SamplersUsed and fs->SamplersUsed both only have one
> bit set (bit 0), but vs->SamplerUnits and fs->SamplerUnits map them
> differently (units 0 and 1). The existing code would select
> fs->SamplerUnits[0], ignoring vs->SamplerUnits[0]. If the two textures
> had, say, different wrap modes, this would probably illustrate the
> problem.
>
> It also just seems like a good idea. The border color code in particular
> is much nicer after this change, as it's not directly tied to brw->wm any
> longer (even though textures can be used in all shader stages).
>
> No observed Piglit changes on Ivybridge.

Could we try to create a piglit test to tickle the issue identified above?
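The conflict described in the cover letter can be modelled in a few lines. This is only a sketch under stated assumptions: struct stage_sketch is a hypothetical reduction of the per-stage program info, the helper names are invented, and which stage maps to unit 0 versus unit 1 is assumed (the mail only says the mappings differ).

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SAMPLERS_SKETCH 16

/* Hypothetical, stripped-down stand-in for one shader stage. */
struct stage_sketch {
   uint32_t SamplersUsed;                       /* bitmask of sampler slots in use */
   uint8_t  SamplerUnits[MAX_SAMPLERS_SKETCH];  /* sampler slot -> texture unit */
};

/* Shared table: one entry per slot, so when both stages use slot s only
 * one mapping can win. Here the fragment shader wins, as the cover
 * letter says the existing code does.
 */
static unsigned
shared_table_unit(const struct stage_sketch *vs,
                  const struct stage_sketch *fs, unsigned s)
{
   assert(s < MAX_SAMPLERS_SKETCH);
   return (fs->SamplersUsed & (1u << s)) ? fs->SamplerUnits[s]
                                         : vs->SamplerUnits[s];
}

/* Per-stage table: each stage resolves slot s through its own mapping. */
static unsigned
per_stage_unit(const struct stage_sketch *stage, unsigned s)
{
   assert(s < MAX_SAMPLERS_SKETCH);
   return stage->SamplerUnits[s];
}
```

With vs mapping slot 0 to unit 0 and fs mapping slot 0 to unit 1, the shared lookup returns unit 1 for both stages, so the vertex shader would sample with the wrong unit's sampler state. The per-stage lookup returns 0 and 1 respectively.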
Re: [Mesa-dev] [PATCH 01/10] i965: Split sampler count variable to be per-stage.
On 08/14/2013 06:55 PM, Kenneth Graunke wrote: Currently, we only have a single sampler state table shared among all stages, so we just copy wm.sampler_count into vs.sampler_count. In the future, each shader stage will have its own SAMPLER_STATE table, at which point we'll need these separate sampler counts. Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_context.h | 5 - src/mesa/drivers/dri/i965/brw_vs_state.c | 4 ++-- src/mesa/drivers/dri/i965/brw_wm_sampler_state.c | 12 +++- src/mesa/drivers/dri/i965/brw_wm_state.c | 6 +++--- src/mesa/drivers/dri/i965/gen6_vs_state.c| 2 +- src/mesa/drivers/dri/i965/gen6_wm_state.c| 2 +- src/mesa/drivers/dri/i965/gen7_sampler_state.c | 12 +++- src/mesa/drivers/dri/i965/gen7_vs_state.c| 2 +- src/mesa/drivers/dri/i965/gen7_wm_state.c| 2 +- 9 files changed, 27 insertions(+), 20 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 74e38f1..63136b1 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1065,7 +1065,6 @@ struct brw_context /** SAMPLER_STATE count and offset */ struct { - GLuint count; uint32_t offset; } sampler; @@ -1109,6 +1108,8 @@ struct brw_context uint32_t bind_bo_offset; uint32_t surf_offset[BRW_MAX_VS_SURFACES]; + + uint32_t sampler_count; } vs; There's a lot of commonality between these structures. If BRW_MAX_VS_SURFACES, BRW_MAX_GS_SURFACES, and BRW_MAX_WM_SURFACES are the same, we should make this common stuff common. Yeah? 
struct { @@ -1182,6 +1183,8 @@ struct brw_context uint32_t bind_bo_offset; uint32_t surf_offset[BRW_MAX_WM_SURFACES]; + uint32_t sampler_count; + struct { struct ra_regs *regs; diff --git a/src/mesa/drivers/dri/i965/brw_vs_state.c b/src/mesa/drivers/dri/i965/brw_vs_state.c index ddaf914..13aabac 100644 --- a/src/mesa/drivers/dri/i965/brw_vs_state.c +++ b/src/mesa/drivers/dri/i965/brw_vs_state.c @@ -142,7 +142,7 @@ brw_upload_vs_unit(struct brw_context *brw) vs->vs5.sampler_count = 0; /* hardware requirement */ else { /* CACHE_NEW_SAMPLER */ - vs->vs5.sampler_count = (brw->sampler.count + 3) / 4; + vs->vs5.sampler_count = (brw->vs.sampler_count + 3) / 4; } @@ -155,7 +155,7 @@ brw_upload_vs_unit(struct brw_context *brw) /* Set the sampler state pointer, and its reloc */ - if (brw->sampler.count) { + if (brw->vs.sampler_count) { vs->vs5.sampler_state_pointer = (brw->batch.bo->offset + brw->sampler.offset) >> 5; drm_intel_bo_emit_reloc(brw->batch.bo, diff --git a/src/mesa/drivers/dri/i965/brw_wm_sampler_state.c b/src/mesa/drivers/dri/i965/brw_wm_sampler_state.c index 5457671..40a6d5b 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_sampler_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_sampler_state.c @@ -377,17 +377,19 @@ brw_upload_samplers(struct brw_context *brw) /* ARB programs use the texture unit number as the sampler index, so we * need to find the highest unit used. A bit-count will not work. */ - brw->sampler.count = _mesa_fls(SamplersUsed); + brw->wm.sampler_count = _mesa_fls(SamplersUsed); + /* Currently we only use one sampler state table. Mirror the count. 
*/ + brw->vs.sampler_count = brw->wm.sampler_count; - if (brw->sampler.count == 0) + if (brw->wm.sampler_count == 0) return; samplers = brw_state_batch(brw, AUB_TRACE_SAMPLER_STATE, - brw->sampler.count * sizeof(*samplers), + brw->wm.sampler_count * sizeof(*samplers), 32, &brw->sampler.offset); - memset(samplers, 0, brw->sampler.count * sizeof(*samplers)); + memset(samplers, 0, brw->wm.sampler_count * sizeof(*samplers)); - for (unsigned s = 0; s < brw->sampler.count; s++) { + for (unsigned s = 0; s < brw->wm.sampler_count; s++) { if (SamplersUsed & (1 << s)) { const unsigned unit = (fs->SamplersUsed & (1 << s)) ? fs->SamplerUnits[s] : vs->SamplerUnits[s]; diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c b/src/mesa/drivers/dri/i965/brw_wm_state.c index 631f351..106d628 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_state.c @@ -144,10 +144,10 @@ brw_upload_wm_unit(struct brw_context *brw) wm->wm4.sampler_count = 0; /* hardware requirement */ else { /* CACHE_NEW_SAMPLER */ - wm->wm4.sampler_count = (brw->sampler.count + 1) / 4; + wm->wm4.sampler_count = (brw->wm.sampler_count + 1) / 4; } - if (brw->sampler.count) { + if (brw->wm.sampler_count) { /* reloc */ wm->wm4.sampler_state_pointer = (brw->batch.bo->offset + brw->sampler.offse
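The comment in brw_upload_samplers() notes that the count must come from the highest used sampler index, not from a bit count. A small sketch of the difference, using local stand-ins (fls_sketch is not _mesa_fls itself, and the other helper names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Find-last-set: position of the highest set bit plus one, or 0 for an
 * empty mask. Local stand-in for the _mesa_fls() call in the patch.
 */
static unsigned
fls_sketch(uint32_t mask)
{
   unsigned n = 0;
   while (mask) {
      n++;
      mask >>= 1;
   }
   return n;
}

/* Number of set bits in the mask. */
static unsigned
popcount_sketch(uint32_t mask)
{
   unsigned n = 0;
   while (mask) {
      n += mask & 1u;
      mask >>= 1;
   }
   return n;
}

/* Entries needed in a sampler table that is indexed by sampler number.
 * Gaps in the mask still occupy slots, so the highest index decides.
 */
static unsigned
sampler_table_entries(uint32_t samplers_used)
{
   unsigned count = fls_sketch(samplers_used);
   /* The table can never be shorter than the number of samplers in use. */
   assert(count >= popcount_sketch(samplers_used));
   return count;
}

/* Round a sampler count up to groups of four, as in the
 * (count + 3) / 4 expression in the VS unit state upload.
 */
static unsigned
sampler_groups_of_four(unsigned sampler_count)
{
   return (sampler_count + 3) / 4;
}
```

For a mask of 0x5 (samplers 0 and 2 in use) a bit count gives 2, but the table needs 3 entries so that index 2 is valid.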
[Mesa-dev] [Bug 62647] Wrong rendering of Dota 2 on Wine (apitrace attached) - Intel IVB HD4000
https://bugs.freedesktop.org/show_bug.cgi?id=62647

Vedran Rodic changed:

What |Removed |Added
CC   |        |auke-jan.h@intel.com

--- Comment #32 from Vedran Rodic ---
*** Bug 67877 has been marked as a duplicate of this bug. ***

--
You are receiving this mail because:
You are on the CC list for the bug.