Re: [Mesa-dev] [PATCH] glsl/linker: check same name is not used in block and outside
Reviewed-by: Tapani Pälli

On 30.01.2018 17:11, Juan A. Suarez Romero wrote:
According to the OpenGL GLSL 3.20 spec, section 4.3.9:

  "It is a link-time error if any particular shader interface contains:
   - two different blocks, each having no instance name, and each
     having a member of the same name, or
   - a variable outside a block, and a block with no instance name,
     where the variable has the same name as a member in the block."

This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the
same name in unnamed block and outside") that covered this case, but did
not take into account that precision qualifiers are ignored when comparing
blocks with no instance name.

With this commit, the original tests
KHR-GL*.shaders.uniform_block.common.name_matching remain fixed, and the
dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision
regression, which was introduced by the previous commit, is fixed as well.

Fixes: 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777
CC: Mark Janes
CC: "18.0"
Signed-off-by: Juan A. Suarez Romero
---
 src/compiler/glsl/linker.cpp | 54 +---
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index ce101935b01..65b22fdba8a 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -,29 +,6 @@ cross_validate_globals(struct gl_shader_program *prog,
             return;
          }
 
-         /* In OpenGL GLSL 4.20 spec, section 4.3.9, page 57:
-          *
-          *   "It is a link-time error if any particular shader interface
-          *    contains:
-          *
-          *    - two different blocks, each having no instance name, and each
-          *      having a member of the same name, or
-          *
-          *    - a variable outside a block, and a block with no instance name,
-          *      where the variable has the same name as a member in the block."
-          */
-         if (var->data.mode == existing->data.mode &&
-             var->get_interface_type() != existing->get_interface_type()) {
-            linker_error(prog, "declarations for %s `%s` are in "
-                         "%s and %s\n",
-                         mode_string(var), var->name,
-                         existing->get_interface_type() ?
-                           existing->get_interface_type()->name : "outside a block",
-                         var->get_interface_type() ?
-                           var->get_interface_type()->name : "outside a block");
-
-            return;
-         }
          /* Only in GLSL ES 3.10, the precision qualifier should not match
           * between block members defined in matched block names within a
           * shader interface.
@@ -1155,6 +1132,37 @@ cross_validate_globals(struct gl_shader_program *prog,
                             mode_string(var), var->name);
             }
          }
+
+         /* In OpenGL GLSL 3.20 spec, section 4.3.9:
+          *
+          *   "It is a link-time error if any particular shader interface
+          *    contains:
+          *
+          *    - two different blocks, each having no instance name, and each
+          *      having a member of the same name, or
+          *
+          *    - a variable outside a block, and a block with no instance name,
+          *      where the variable has the same name as a member in the block."
+          */
+         if (var->get_interface_type() != existing->get_interface_type()) {
+            if (!var->get_interface_type() || !existing->get_interface_type()) {
+               linker_error(prog, "declarations for %s `%s` are inside block "
+                                  "`%s` and outside a block",
+                            mode_string(var), var->name,
+                            var->get_interface_type()
+                               ? var->get_interface_type()->name
+                               : existing->get_interface_type()->name);
+               return;
+            } else if (strcmp(var->get_interface_type()->name,
+                              existing->get_interface_type()->name) != 0) {
+               linker_error(prog, "declarations for %s `%s` are inside blocks "
+                                  "`%s` and `%s`",
+                            mode_string(var), var->name,
+                            existing->get_interface_type()->name,
+                            var->get_interface_type()->name);
+               return;
+            }
+         }
       } else
          variables->add_variable(var);
    }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
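For readers following the thread, the decision the patch makes can be sketched in plain C. This is only an illustration under stated assumptions: `struct decl`, `enum conflict` and `check_name_conflict` are hypothetical names, not Mesa API, and each declaration is reduced to the name of its enclosing block (NULL when it sits outside any block). As in the patch, two blocks are compared by name rather than by type identity, so blocks that differ only in member precision are not reported.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a linker variable. block_name is the
 * name of the enclosing block without an instance name, or NULL when the
 * variable is declared outside any block. */
struct decl {
   const char *name;
   const char *block_name;
};

enum conflict { NO_CONFLICT, BLOCK_VS_OUTSIDE, DIFFERENT_BLOCKS };

/* Both declarations are assumed to have the same name and to belong to the
 * same shader interface; only the block membership is examined here. */
static enum conflict
check_name_conflict(const struct decl *var, const struct decl *existing)
{
   assert(var != NULL && existing != NULL);

   if (var->block_name == NULL && existing->block_name == NULL)
      return NO_CONFLICT;

   /* One declaration is inside a block, the other one is not. */
   if (var->block_name == NULL || existing->block_name == NULL)
      return BLOCK_VS_OUTSIDE;

   /* Compare blocks by name so that a pure precision mismatch between two
    * declarations of the same block is not treated as a conflict. */
   if (strcmp(var->block_name, existing->block_name) != 0)
      return DIFFERENT_BLOCKS;

   return NO_CONFLICT;
}
```

The two non-zero results correspond to the two linker_error() messages added by the patch.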
Re: [Mesa-dev] misc pahole repacking
Reviewed-by: Tapani Pälli

(I verified the 1st one and I trust you on the 2nd one.)

BTW I witnessed pahole crashing when processing the visit() methods of the
ir_print_visitor class; did you experience that? My pahole version is 1.9,
and it dies in /lib64/libdwarves.so.1 after some prints like:

--- 8< ---
die__process_unit: DW_TAG_restrict_type (0x37) @ <0x122be84> not handled!
die__process_unit: DW_TAG_unspecified_type (0x3b) @ <0x1230c68> not handled!
die__process_unit: DW_TAG_restrict_type (0x37) @ <0x12312d3> not handled!
die__process_unit: DW_TAG_unspecified_type (0x3b) @ <0x1231bc4> not handled!
die__process_unit: DW_TAG_restrict_type (0x37) @ <0x12340bb> not handled!
die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x123f984> not handled!
die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x1242348> not handled!
die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x1242398> not handled!
die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x12423e8> not handled!
die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x1242572> not handled!
die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x12425c2> not handled!
die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x12425ea> not handled!
die__process_function: DW_TAG_rvalue_reference_type (0x42) @ <0x1247929> not handled!

On 31.01.2018 01:41, Dave Airlie wrote:
This month's Dave hasn't got enough sleep to do real work, let's repack
some structs.

The format descriptions one is quite good though, it reduces the radv
binary data segment.

Dave.
Re: [Mesa-dev] Odd input issue with mesa 18-rcX
On Tue, Jan 30, 2018 at 12:25 PM, Michel Dänzer wrote:
> On 2018-01-29 10:34 AM, Ian Kumlien wrote:
>> Hi, I know that this is not exactly the right place but I'm kinda
>> dumbfounded...
>>
>> I run wayland with mesa on a radeon rx 480 graphics card - but it
>> stopped working a while ago (been following mesa with git-snapshot
>> builds) fully thinking that it would work itself out for the 18-rc:s...
>>
>> The issue is that input works, ie I can see and move the mouse... but
>> clicking does *nothing* - switching to older mesa makes it work again -
>> really odd.
>>
>> I run gentoo, rebuilt all x11-modules as you should... I have also
>> verified that it *does* work with plain X11
>
> See https://bugs.freedesktop.org/show_bug.cgi?id=104808 .

ARGH! Sorry, I did search through the bug database and couldn't find it =(

Please add it to the FAQ on the mesa homepage, if possible - could be
causing issues for other apps as well...

Back to git snapshots of mesa for me ;)
Re: [Mesa-dev] [PATCH v2 5/7] disk cache: initialize cache path and index only when used
On 2018-01-30 23:17:04, Tapani Pälli wrote: > This patch makes disk_cache initialize path and index lazily so > that we can utilize disk_cache without a path using callback > functionality introduced by next patch. > > v2: unmap mmap and destroy queue only if index_mmap exists > > Signed-off-by: Tapani Pälli > --- > src/util/disk_cache.c | 127 > +++--- > 1 file changed, 78 insertions(+), 49 deletions(-) > > diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c > index 2884d3c9c1..9fa264440b 100644 > --- a/src/util/disk_cache.c > +++ b/src/util/disk_cache.c > @@ -77,6 +77,7 @@ > struct disk_cache { > /* The path to the cache directory. */ > char *path; > + bool path_init_failed; > > /* Thread queue for compressing and writing cache entries to disk */ > struct util_queue cache_queue; > @@ -178,37 +179,20 @@ concatenate_and_mkdir(void *ctx, const char *path, > const char *name) >return NULL; > } > > -#define DRV_KEY_CPY(_dst, _src, _src_size) \ > -do { \ > - memcpy(_dst, _src, _src_size); \ > - _dst += _src_size; \ > -} while (0); > - > -struct disk_cache * > -disk_cache_create(const char *gpu_name, const char *timestamp, > - uint64_t driver_flags) > +static bool > +disk_cache_path_init(struct disk_cache *cache) > { > - void *local; > - struct disk_cache *cache = NULL; > - char *path, *max_size_str; > - uint64_t max_size; > + void *local = NULL; > + char *path; > int fd = -1; > struct stat sb; > size_t size; > > - /* If running as a users other than the real user disable cache */ > - if (geteuid() != getuid()) > - return NULL; > - > /* A ralloc context for transient data during this invocation. */ > local = ralloc_context(NULL); > if (local == NULL) >goto fail; > > - /* At user request, disable shader cache entirely. 
*/ > - if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", false)) > - goto fail; > - > /* Determine path for cache based on the first defined name as follows: > * > * $MESA_GLSL_CACHE_DIR > @@ -273,10 +257,6 @@ disk_cache_create(const char *gpu_name, const char > *timestamp, > goto fail; > } > > - cache = ralloc(NULL, struct disk_cache); > - if (cache == NULL) > - goto fail; > - > cache->path = ralloc_strdup(cache, path); > if (cache->path == NULL) >goto fail; > @@ -325,6 +305,58 @@ disk_cache_create(const char *gpu_name, const char > *timestamp, > cache->size = (uint64_t *) cache->index_mmap; > cache->stored_keys = cache->index_mmap + sizeof(uint64_t); > > + /* 1 thread was chosen because we don't really care about getting things > +* to disk quickly just that it's not blocking other tasks. > +* > +* The queue will resize automatically when it's full, so adding new jobs > +* doesn't stall. > +*/ > + util_queue_init(&cache->cache_queue, "disk_cache", 32, 1, > + UTIL_QUEUE_INIT_RESIZE_IF_FULL | > + UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY); > + > + ralloc_free(local); > + > + return true; > + > + fail: > + if (fd != -1) > + close(fd); > + > + if (local) > + ralloc_free(local); > + > + cache->path_init_failed = true; > + > + return false; > +} > + > +#define DRV_KEY_CPY(_dst, _src, _src_size) \ > +do { \ > + memcpy(_dst, _src, _src_size); \ > + _dst += _src_size; \ > +} while (0); > + > +struct disk_cache * > +disk_cache_create(const char *gpu_name, const char *timestamp, > + uint64_t driver_flags) > +{ > + struct disk_cache *cache = NULL; > + char *max_size_str; > + uint64_t max_size; > + > + /* If running as a users other than the real user disable cache */ > + if (geteuid() != getuid()) > + return NULL; > + > + /* At user request, disable shader cache entirely. 
*/ > + if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", false)) > + return NULL; > + > + cache = rzalloc(NULL, struct disk_cache); > + if (cache == NULL) > + return NULL; > + > max_size = 0; > > max_size_str = getenv("MESA_GLSL_CACHE_MAX_SIZE"); > @@ -360,16 +392,6 @@ disk_cache_create(const char *gpu_name, const char > *timestamp, > > cache->max_size = max_size; > > - /* 1 thread was chosen because we don't really care about getting things > -* to disk quickly just that it's not blocking other tasks. > -* > -* The queue will resize automatically when it's full, so adding new jobs > -* doesn't stall. > -*/ > - util_queue_init(&cache->cache_queue, "disk_cache", 32, 1, > - UTIL_QUEUE_INIT_RESIZE_IF_FULL | > - UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY); > - > uint8_t cache_version = CACHE_VERSION; > size_t cv_size = sizeof(cache_version); > cache->drive
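The lazy-initialization pattern this patch introduces can be sketched in isolation as follows. This is a minimal sketch, not the Mesa code: `struct lazy_cache`, `lazy_cache_path_init` and `lazy_cache_ensure_path` are hypothetical names, the real path discovery is replaced by a `candidate` argument, and `init_calls` exists only to make the behaviour observable. The point is the guard: a failed initialization is remembered in `path_init_failed` so it is never retried, and a successful one runs only once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced cache object: only the fields needed for the guard. */
struct lazy_cache {
   const char *path;       /* NULL until the first successful init */
   bool path_init_failed;  /* set once so a failed init is not retried */
   int init_calls;         /* instrumentation for this sketch only */
};

/* Stand-in for the expensive path/index setup. A NULL candidate simulates a
 * failure such as an unwritable cache directory. */
static bool
lazy_cache_path_init(struct lazy_cache *cache, const char *candidate)
{
   cache->init_calls++;

   if (candidate == NULL) {
      cache->path_init_failed = true;
      return false;
   }

   cache->path = candidate;
   return true;
}

/* Called at the top of every operation that needs the on-disk path. */
static bool
lazy_cache_ensure_path(struct lazy_cache *cache, const char *candidate)
{
   assert(cache != NULL);

   if (cache->path_init_failed)
      return false;

   if (cache->path == NULL && !lazy_cache_path_init(cache, candidate))
      return false;

   return true;
}
```

With this shape, a cache object can exist and serve callback-based users without ever touching the filesystem, which is what the next patch in the series relies on.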
Re: [Mesa-dev] [PATCH v3 6/7] disk cache: add callback functionality
On 2018-01-30 23:17:05, Tapani Pälli wrote: > v2: add disk_cache_has_key, disk_cache_put_key support > using blob cache (Nicolai, Jordan) > > v3: rename set_cb as put_cb to match existing naming (Timothy) > > Signed-off-by: Tapani Pälli > --- > src/util/disk_cache.c | 49 + > src/util/disk_cache.h | 19 +++ > 2 files changed, 68 insertions(+) > > diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c > index 9fa264440b..5af0346c7a 100644 > --- a/src/util/disk_cache.c > +++ b/src/util/disk_cache.c > @@ -101,6 +101,9 @@ struct disk_cache { > /* Driver cache keys. */ > uint8_t *driver_keys_blob; > size_t driver_keys_blob_size; > + > + disk_cache_put_cb blob_put_cb; > + disk_cache_get_cb blob_get_cb; > }; > > struct disk_cache_put_job { > @@ -1012,6 +1015,11 @@ disk_cache_put(struct disk_cache *cache, const > cache_key key, > const void *data, size_t size, > struct cache_item_metadata *cache_item_metadata) > { > + if (cache->blob_put_cb) { > + cache->blob_put_cb(key, CACHE_KEY_SIZE, data, size); > + return; > + } > + > struct disk_cache_put_job *dc_job = >create_put_job(cache, key, data, size, cache_item_metadata); > > @@ -1079,6 +1087,29 @@ disk_cache_get(struct disk_cache *cache, const > cache_key key, size_t *size) > if (size) >*size = 0; > > + if (cache->blob_get_cb) { > +/* This is what Android EGL defines as the maxValueSize in egl_cache_t > + * class implementation. > + */ > +#define MAX_BLOB_SIZE 64 * 1024 What about 'const signed long max_blob_size = 64 * 1024;' instead? 
Reviewed-by: Jordan Justen > + void *blob = malloc(MAX_BLOB_SIZE); > + if (!blob) > + return NULL; > + > + signed long bytes = > + cache->blob_get_cb(key, CACHE_KEY_SIZE, blob, MAX_BLOB_SIZE); > + > + if (!bytes) { > + free(blob); > + return NULL; > + } > + > + if (size) > + *size = bytes; > + return blob; > +#undef MAX_BLOB_SIZE > + } > + > filename = get_cache_file(cache, key); > if (filename == NULL) >goto fail; > @@ -1194,6 +1225,11 @@ disk_cache_put_key(struct disk_cache *cache, const > cache_key key) > int i = CPU_TO_LE32(*key_chunk) & CACHE_INDEX_KEY_MASK; > unsigned char *entry; > > + if (cache->blob_put_cb) { > + cache->blob_put_cb(key, CACHE_KEY_SIZE, key_chunk, sizeof(uint32_t)); > + return; > + } > + > if (!cache->path) >return; > > @@ -1216,6 +1252,11 @@ disk_cache_has_key(struct disk_cache *cache, const > cache_key key) > int i = CPU_TO_LE32(*key_chunk) & CACHE_INDEX_KEY_MASK; > unsigned char *entry; > > + if (cache->blob_get_cb) { > + uint32_t blob; > + return cache->blob_get_cb(key, CACHE_KEY_SIZE, &blob, > sizeof(uint32_t)); > + } > + > /* Initialize path if not initialized yet. 
*/ > if (cache->path_init_failed || > (!cache->path && !disk_cache_path_init(cache))) > @@ -1239,4 +1280,12 @@ disk_cache_compute_key(struct disk_cache *cache, const > void *data, size_t size, > _mesa_sha1_final(&ctx, key); > } > > +void > +disk_cache_set_callbacks(struct disk_cache *cache, disk_cache_put_cb put, > + disk_cache_get_cb get) > +{ > + cache->blob_put_cb = put; > + cache->blob_get_cb = get; > +} > + > #endif /* ENABLE_SHADER_CACHE */ > diff --git a/src/util/disk_cache.h b/src/util/disk_cache.h > index 488b297ead..f84840fb5c 100644 > --- a/src/util/disk_cache.h > +++ b/src/util/disk_cache.h > @@ -50,6 +50,14 @@ typedef uint8_t cache_key[CACHE_KEY_SIZE]; > #define CACHE_ITEM_TYPE_UNKNOWN 0x0 > #define CACHE_ITEM_TYPE_GLSL 0x1 > > +typedef void > +(*disk_cache_put_cb) (const void *key, signed long keySize, > + const void *value, signed long valueSize); > + > +typedef signed long > +(*disk_cache_get_cb) (const void *key, signed long keySize, > + void *value, signed long valueSize); > + > struct cache_item_metadata { > /** > * The cache item type. This could be used to identify a GLSL cache item, > @@ -207,6 +215,10 @@ void > disk_cache_compute_key(struct disk_cache *cache, const void *data, size_t > size, > cache_key key); > > +void > +disk_cache_set_callbacks(struct disk_cache *cache, disk_cache_put_cb put, > + disk_cache_get_cb get); > + > #else > > static inline struct disk_cache * > @@ -260,6 +272,13 @@ disk_cache_compute_key(struct disk_cache *cache, const > void *data, size_t size, > return; > } > > +static inline void > +disk_cache_set_callbacks(struct disk_cache *cache, disk_cache_put_cb put, > + disk_cache_get_cb get) > +{ > + return; > +} > + > #endif /* ENABLE_SHADER_CACHE */ > > #ifdef __cplusplus > -- > 2.13.6 > ___ mesa-dev mailing list mes
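To make the callback flow easier to follow, here is a small self-contained sketch of a put/get pair with the same signatures as the `disk_cache_put_cb` and `disk_cache_get_cb` typedefs in the patch, backed by a one-entry in-memory store. Everything else is an assumption for illustration: `slot_put`, `slot_get` and `cache_get` are hypothetical names, the single-slot store is not how any real blob cache works, and the miss test uses `bytes <= 0` rather than the patch's `!bytes`. The buffer limit is written as a `const signed long`, as suggested in the review comment above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*put_cb)(const void *key, signed long key_size,
                       const void *value, signed long value_size);
typedef signed long (*get_cb)(const void *key, signed long key_size,
                              void *value, signed long value_size);

/* Hypothetical one-entry store standing in for an application blob cache. */
#define SLOT_KEY_MAX 20
#define SLOT_VALUE_MAX 64
static unsigned char slot_key[SLOT_KEY_MAX];
static unsigned char slot_value[SLOT_VALUE_MAX];
static signed long slot_key_size;
static signed long slot_value_size;

static void
slot_put(const void *key, signed long key_size,
         const void *value, signed long value_size)
{
   if (key_size <= 0 || key_size > SLOT_KEY_MAX ||
       value_size <= 0 || value_size > SLOT_VALUE_MAX)
      return;

   memcpy(slot_key, key, (size_t)key_size);
   memcpy(slot_value, value, (size_t)value_size);
   slot_key_size = key_size;
   slot_value_size = value_size;
}

/* Returns the number of bytes written to value, or 0 on a miss. */
static signed long
slot_get(const void *key, signed long key_size,
         void *value, signed long value_size)
{
   if (slot_key_size == 0 || key_size != slot_key_size ||
       memcmp(key, slot_key, (size_t)key_size) != 0)
      return 0;

   if (slot_value_size > value_size)
      return 0;

   memcpy(value, slot_value, (size_t)slot_value_size);
   return slot_value_size;
}

/* Fetch through the callback into a freshly allocated buffer that the
 * caller must free(). Returns NULL on a miss or allocation failure. */
static void *
cache_get(get_cb get, const void *key, signed long key_size, size_t *size)
{
   const signed long max_blob_size = 64 * 1024;
   void *blob = malloc((size_t)max_blob_size);

   assert(get != NULL);
   if (!blob)
      return NULL;

   signed long bytes = get(key, key_size, blob, max_blob_size);
   if (bytes <= 0) {
      free(blob);
      return NULL;
   }

   if (size)
      *size = (size_t)bytes;
   return blob;
}
```

A caller would register `slot_put` and `slot_get` once and then go through `cache_get` for lookups, freeing the returned buffer when done.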
Re: [Mesa-dev] [PATCH v2 7/7] i965: add __DRI2_BLOB support and set cache functions
On 2018-01-30 23:17:06, Tapani Pälli wrote:
> v2: adjust to change that moved cache from ctx to screen
>
> Signed-off-by: Tapani Pälli
> ---
>  src/mesa/drivers/dri/i965/intel_screen.c | 21 +
>  1 file changed, 21 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
> index e1e520bc89..2c445af1e4 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -36,6 +36,7 @@
>  #include "main/version.h"
>  #include "swrast/s_renderbuffer.h"
>  #include "util/ralloc.h"
> +#include "util/disk_cache.h"
>  #include "brw_defines.h"
>  #include "brw_state.h"
>  #include "compiler/nir/nir.h"
> @@ -1494,6 +1495,19 @@ brw_query_renderer_string(__DRIscreen *dri_screen,
>     return -1;
>  }
>
> +static void
> +brw_set_cache_funcs(__DRIscreen *dri_screen,
> +                    __DRIblobCacheSet set, __DRIblobCacheGet get)
> +{
> +   const struct intel_screen *const screen =
> +      (struct intel_screen *) dri_screen->driverPrivate;
> +
> +   if (!screen->disk_cache)
> +      return;

Could this cause us to fail tests if the disk cache is not enabled? For
example, if they test setting the functions to NULL, or set multiple
times?

-Jordan

> +
> +   disk_cache_set_callbacks(screen->disk_cache, set, get);
> +}
> +
>  static const __DRI2rendererQueryExtension intelRendererQueryExtension = {
>     .base = { __DRI2_RENDERER_QUERY, 1 },
>
> @@ -1505,6 +1519,11 @@ static const __DRIrobustnessExtension dri2Robustness = {
>     .base = { __DRI2_ROBUSTNESS, 1 }
>  };
>
> +static const __DRI2blobExtension intelBlobExtension = {
> +   .base = { __DRI2_BLOB, 1 },
> +   .set_cache_funcs = brw_set_cache_funcs
> +};
> +
>  static const __DRIextension *screenExtensions[] = {
>      &intelTexBufferExtension.base,
>      &intelFenceExtension.base,
> @@ -1513,6 +1532,7 @@ static const __DRIextension *screenExtensions[] = {
>      &intelRendererQueryExtension.base,
>      &dri2ConfigQueryExtension.base,
>      &dri2NoErrorExtension.base,
> +    &intelBlobExtension.base,
>      NULL
>  };
>
> @@ -1525,6 +1545,7 @@ static const __DRIextension *intelRobustScreenExtensions[] = {
>      &dri2ConfigQueryExtension.base,
>      &dri2Robustness.base,
>      &dri2NoErrorExtension.base,
> +    &intelBlobExtension.base,
>      NULL
>  };
>
> --
> 2.13.6
Re: [Mesa-dev] [PATCH v2] meson: generate translations for driconf
On Tuesday, 30 January 2018, 00:34:18 CET, Dylan Baker wrote:
> Quoting Marc Dietrich (2018-01-27 06:36:51)
>
> > Hi Dylan,
> >
> > On Thursday, 25 January 2018, 20:32:23 CET, Dylan Baker wrote:
> > > Currently meson implements the same logic as SCons for translations,
> > > namely it doesn't do them. This patch changes meson to use logic more
> > > like autotools, and generate translations. To do this we have to go
> > > behind meson's back a bit, and wrap the whole thing up in a single
> > > python script.
> > >
> > > Meson has a module for gettext translations, but it assumes that one
> > > will have a pot file, and then .mo translations will be generated and
> > > checked into the source repo (or generated by hand using custom ninja
> > > targets before building), mesa assumes that the targets will be
> > > regenerated on each invocation of make or ninja. I think this can be
> > > fixed upstream, but in the mean time this adds support for using
> > > translations.
> >
> > I have some patch sitting in my local tree which also addresses this
> > problem. It is a bit shorter and doesn't require an external script. I
> > initially tried to solve this by adding some custom targets mostly in
> > order to learn some meson. Unfortunately, I didn't get far. I also tried
> > the i18n module, but as you said, there are still some features missing.
> >
> > Nevertheless, here is my solution using run_command instead of an
> > external script. The advantage may be better maintainability:
> >
> > diff --git a/src/util/xmlpool/meson.build b/src/util/xmlpool/meson.build
> > index 97693fac8c..91f2b025f6 100644
> > --- a/src/util/xmlpool/meson.build
> > +++ b/src/util/xmlpool/meson.build
> > @@ -18,11 +18,36 @@
> >  # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> >  # SOFTWARE.
> >
> > +langs = ['ca', 'es', 'de', 'nl', 'sv', 'fr']
> > +deps = []
> > +pot = 'xmlpool.pot'
> > +out = meson.current_build_dir()
> > +in = meson.current_source_dir()
> > +
> > +xmlpool_pot = custom_target(
> > +  pot,
> > +  build_by_default : true,
> > +  input : 't_options.h',
> > +  output : pot,
> > +  command : ['xgettext', '-LC', '--from-code=utf-8', '-o', '@OUTPUT@', '@INPUT@'],
> > +)
> > +
> > +foreach l : langs
> > +  po = l+'.po'
> > +  mo = '@0@/LC_MESSAGES/options.mo'.format(l)
> > +  message('Merge new strings @0@ into @1@'.format(po, pot))
> > +  run_command('msgmerge', '-o', join_paths(out, po), join_paths(in, po), pot)
> > +  message('Updating (@0@) @1@ from @2@.'.format(l, mo, po))
> > +  run_command('mkdir', '-p', join_paths(out, l, 'LC_MESSAGES'))
> > +  run_command('msgfmt', '-o', join_paths(out, mo), po)
> > +  deps += po
> > +endforeach
> > +
>
> Hi Marc,
>
> I'm not a huge fan of this, it adds three unix specific dependencies, and
> every time ninja is run we'll call mkdir and msgfmt multiple times.

Yes, I understand that run_command is not portable. It is better to use
Python functions, even if that makes things more complicated.

> I really like the idea of fixing the i18n module to return custom targets
> instead of run targets, but that's going to take some work and won't come
> until a newer version of meson. Maybe the thing to do is to just rely on
> distros manually updating the mo files.

I have to admit that I don't understand how these translations work. I
wonder why mesa doesn't just ship /locale//LC_MESSAGES/.mo files as many
others do (and which would also be supported by meson). Instead, these
files get magically included in the binaries. So a second option would be
to rewrite this code to make use of external .mo files.

Marc
Re: [Mesa-dev] [PATCH 4/4] radv: repack some structures post pipeline rework.
Series is: Reviewed-by: Samuel Pitoiset On 01/31/2018 12:41 AM, Dave Airlie wrote: From: Dave Airlie Bas's pipeline rework made me relook at the struct packing: radv_cmd_state: 984->968 radv_cmd_buffer: 2910->2896 radv_image: 1008->1000 radv_pipeline: 1640->1632 Signed-off-by: Dave Airlie --- src/amd/vulkan/radv_private.h | 25 ++--- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h index e856f5f9b07..46a2e02612e 100644 --- a/src/amd/vulkan/radv_private.h +++ b/src/amd/vulkan/radv_private.h @@ -886,12 +886,14 @@ struct radv_attachment_state { struct radv_cmd_state { /* Vertex descriptors */ + enum radv_cmd_flush_bits flush_bits; bool vb_prefetch_dirty; + bool push_descriptors_dirty; + bool predicating; + uint64_t vb_va; unsigned vb_size; - bool push_descriptors_dirty; - bool predicating; uint32_t dirty; struct radv_pipeline *pipeline; @@ -915,7 +917,6 @@ struct radv_cmd_state { int32_t last_primitive_reset_en; uint32_t last_primitive_reset_index; - enum radv_cmd_flush_bits flush_bits; unsigned active_occlusion_queries; floatoffset_scale; uint32_t descriptors_dirty; @@ -962,14 +963,16 @@ struct radv_cmd_buffer { VkCommandBufferUsageFlagsusage_flags; VkCommandBufferLevel level; enum radv_cmd_buffer_status status; + uint32_t queue_family_index; + VkShaderStageFlags push_constant_stages; + VkResult record_result; + struct radeon_winsys_cs *cs; struct radv_cmd_state state; struct radv_vertex_binding vertex_bindings[MAX_VBS]; - uint32_t queue_family_index; uint8_t push_constants[MAX_PUSH_CONSTANTS_SIZE]; uint32_t dynamic_buffers[4 * MAX_DYNAMIC_BUFFERS]; - VkShaderStageFlags push_constant_stages; struct radv_push_descriptor_set push_descriptors; struct radv_descriptor_set meta_push_descriptors; struct radv_descriptor_set *descriptors[MAX_SETS]; @@ -983,12 +986,11 @@ struct radv_cmd_buffer { bool tess_rings_needed; bool sample_positions_needed; - VkResult record_result; int ring_offsets_idx; /* just 
used for verification */ uint32_t gfx9_fence_offset; - struct radeon_winsys_bo *gfx9_fence_bo; uint32_t gfx9_fence_idx; + struct radeon_winsys_bo *gfx9_fence_bo; }; struct radv_image; @@ -1171,11 +1173,11 @@ struct radv_pipeline { struct radv_pipeline_layout * layout; + VkShaderStageFlags active_stages; bool needs_data_cache; bool need_indirect_descriptor_sets; struct radv_shader_variant * shaders[MESA_SHADER_STAGES]; struct radv_shader_variant *gs_copy_shader; - VkShaderStageFlags active_stages; struct radeon_winsys_cs cs; @@ -1189,13 +1191,13 @@ struct radv_pipeline { struct radv_multisample_state ms; uint32_t spi_baryc_cntl; bool prim_restart_enable; + bool can_use_guardband; + uint8_t vtx_emit_num; unsigned esgs_ring_size; unsigned gsvs_ring_size; uint32_t vtx_base_sgpr; struct radv_ia_multi_vgt_param_helpers ia_multi_vgt_param; - uint8_t vtx_emit_num; struct radv_prim_vertex_count prim_vertex_count; - bool can_use_guardband; uint32_t needed_dynamic_state; } graphics; }; @@ -1301,13 +1303,14 @@ struct radv_image { unsigned queue_family_mask; bool exclusive; bool shareable; + bool tc_compatible_htile; /* Set when bound */ struct radeon_winsys_bo *bo; VkDeviceSize offset; uint64_t dcc_offset; uint64_t htile_offset; - bool tc_compatible_htile; + struct radeon_surf surface; struct radv_fmask_info fmask; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freede
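The effect of this kind of repacking can be reproduced with a toy struct. The sketch below is an illustration only: `state_before` and `state_after` are hypothetical, borrowing a few field names from `radv_cmd_state` but not its real layout, and the exact sizes depend on the target ABI, so none are quoted here. Grouping the small `bool` members together ahead of the 8-byte member removes the alignment padding that the interleaved order forces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Interleaved order: each bool next to a wider member tends to be followed
 * by padding so that the wider member stays naturally aligned. */
struct state_before {
   bool     vb_prefetch_dirty;
   uint64_t vb_va;
   unsigned vb_size;
   bool     push_descriptors_dirty;
   bool     predicating;
   uint32_t dirty;
};

/* Same members with the bools grouped, as in the reordering above. */
struct state_after {
   bool     vb_prefetch_dirty;
   bool     push_descriptors_dirty;
   bool     predicating;
   uint64_t vb_va;
   unsigned vb_size;
   uint32_t dirty;
};

/* Reordering must never make the struct larger on the build target. */
static_assert(sizeof(struct state_after) <= sizeof(struct state_before),
              "repacking should not grow the struct");

/* Bytes saved per instance on the current ABI (may be 0 on some targets). */
static size_t
bytes_saved(void)
{
   return sizeof(struct state_before) - sizeof(struct state_after);
}
```

Tools such as pahole report the same information, including the individual holes, directly from debug info, which is how the numbers in the commit message were presumably obtained.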
Re: [Mesa-dev] [PATCH] docs/features: mark EXT_semaphore(_fd) as DONE
You also need to update docs/relnotes.

On 01/31/2018 03:46 AM, Andres Rodriguez wrote:
Support for these extensions is available in radeonsi.

Signed-off-by: Andres Rodriguez
---
 docs/features.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index 2e110d9994..1672460a2f 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -316,8 +316,8 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GL_EXT_memory_object                                  DONE (radeonsi)
   GL_EXT_memory_object_fd                               DONE (radeonsi)
   GL_EXT_memory_object_win32                            not started
-  GL_EXT_semaphore                                      not started
-  GL_EXT_semaphore_fd                                   not started
+  GL_EXT_semaphore                                      DONE (radeonsi)
+  GL_EXT_semaphore_fd                                   DONE (radeonsi)
   GL_EXT_semaphore_win32                                not started
   GL_KHR_blend_equation_advanced_coherent               DONE (i965/gen9+)
   GL_KHR_texture_compression_astc_hdr                   DONE (i965/bxt)
[Mesa-dev] [PATCH] ac/nir: fix emission of ffract for 64-bit
Signed-off-by: Samuel Pitoiset
---
 src/amd/common/ac_nir_to_llvm.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index fd5989389b..04fe51935a 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1383,15 +1383,24 @@ static LLVMValueRef emit_isign(struct ac_llvm_context *ctx,
 }
 
 static LLVMValueRef emit_ffract(struct ac_llvm_context *ctx,
-				LLVMValueRef src0)
+				LLVMValueRef src0, unsigned bitsize)
 {
-	const char *intr = "llvm.floor.f32";
+	LLVMTypeRef type;
+	char *intr;
+
+	if (bitsize == 32) {
+		intr = "llvm.floor.f32";
+		type = ctx->f32;
+	} else {
+		intr = "llvm.floor.f64";
+		type = ctx->f64;
+	}
+
 	LLVMValueRef fsrc0 = ac_to_float(ctx, src0);
 	LLVMValueRef params[] = {
 		fsrc0,
 	};
-	LLVMValueRef floor = ac_build_intrinsic(ctx, intr,
-						 ctx->f32, params, 1,
+	LLVMValueRef floor = ac_build_intrinsic(ctx, intr, type, params, 1,
 						 AC_FUNC_ATTR_READNONE);
 	return LLVMBuildFSub(ctx->builder, fsrc0, floor, "");
 }
@@ -1845,7 +1854,7 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
 		                              ac_to_float_type(&ctx->ac, def_type),src[0]);
 		break;
 	case nir_op_ffract:
-		result = emit_ffract(&ctx->ac, src[0]);
+		result = emit_ffract(&ctx->ac, src[0], instr->dest.dest.ssa.bit_size);
 		break;
 	case nir_op_fsin:
 		result = emit_intrin_1f_param(&ctx->ac, "llvm.sin",
@@ -4026,8 +4035,8 @@ static LLVMValueRef load_sample_pos(struct ac_nir_context *ctx)
 {
 	LLVMValueRef values[2];
 
-	values[0] = emit_ffract(&ctx->ac, ctx->abi->frag_pos[0]);
-	values[1] = emit_ffract(&ctx->ac, ctx->abi->frag_pos[1]);
+	values[0] = emit_ffract(&ctx->ac, ctx->abi->frag_pos[0], 32);
+	values[1] = emit_ffract(&ctx->ac, ctx->abi->frag_pos[1], 32);
 	return ac_build_gather_values(&ctx->ac, values, 2);
 }
 
-- 
2.16.1
[Mesa-dev] [PATCH] radv: do not dump meta shader stats
That's quite useless and that pollutes the output. Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_pipeline.c | 30 +- src/amd/vulkan/radv_shader.h | 9 + 2 files changed, 18 insertions(+), 21 deletions(-) diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c index 785c216b4a..6547637338 100644 --- a/src/amd/vulkan/radv_pipeline.c +++ b/src/amd/vulkan/radv_pipeline.c @@ -110,18 +110,6 @@ void radv_DestroyPipeline( radv_pipeline_destroy(device, pipeline, pAllocator); } -static void radv_dump_pipeline_stats(struct radv_device *device, struct radv_pipeline *pipeline) -{ - int i; - - for (i = 0; i < MESA_SHADER_STAGES; i++) { - if (!pipeline->shaders[i]) - continue; - - radv_shader_dump_stats(device, pipeline->shaders[i], i, stderr); - } -} - static uint32_t get_hash_flags(struct radv_device *device) { uint32_t hash_flags = 0; @@ -1861,8 +1849,15 @@ void radv_create_shaders(struct radv_pipeline *pipeline, for (int i = 0; i < MESA_SHADER_STAGES; ++i) { free(codes[i]); - if (modules[i] && !pipeline->device->keep_shader_info) - ralloc_free(nir[i]); + if (modules[i]) { + if (!pipeline->device->keep_shader_info) + ralloc_free(nir[i]); + + if (radv_can_dump_shader_stats(device, modules[i])) + radv_shader_dump_stats(device, + pipeline->shaders[i], + i, stderr); + } } if (fs_m.nir) @@ -3233,10 +3228,6 @@ radv_pipeline_init(struct radv_pipeline *pipeline, pipeline->graphics.vtx_emit_num = 2; } - if (device->instance->debug_flags & RADV_DEBUG_DUMP_SHADER_STATS) { - radv_dump_pipeline_stats(device, pipeline); - } - result = radv_pipeline_scratch_init(device, pipeline); radv_pipeline_generate_pm4(pipeline, pCreateInfo, extra, &blend, &tess, &gs, prim, gs_out); @@ -3400,9 +3391,6 @@ static VkResult radv_compute_pipeline_create( *pPipeline = radv_pipeline_to_handle(pipeline); - if (device->instance->debug_flags & RADV_DEBUG_DUMP_SHADER_STATS) { - radv_dump_pipeline_stats(device, pipeline); - } return VK_SUCCESS; } diff --git a/src/amd/vulkan/radv_shader.h 
b/src/amd/vulkan/radv_shader.h index f6486863f8..b07f8a89e7 100644 --- a/src/amd/vulkan/radv_shader.h +++ b/src/amd/vulkan/radv_shader.h @@ -122,4 +122,13 @@ radv_can_dump_shader(struct radv_device *device, module && !module->nir; } +static inline bool +radv_can_dump_shader_stats(struct radv_device *device, + struct radv_shader_module *module) +{ + /* Only dump non-meta shader stats. */ + return device->instance->debug_flags & RADV_DEBUG_DUMP_SHADER_STATS && + module && !module->nir; +} + #endif -- 2.16.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Revert "mesa: add missing RGB9_E5 format in _mesa_base_fbo_format"
Reviewed-by: Juan A. Suarez

On Fri, 2018-01-26 at 12:10 +0100, Antia Puentes wrote:
> This reverts commit 513c2263cbff45edb105c7b46e58f316e06746ab.
>
> _mesa_base_fbo_format is used to validate the internalformat
> passed to RenderbufferStorage, about which the OpenGL 4.6 spec says:
>
> "An INVALID_ENUM error is generated if internalformat is not one of the
> color-renderable, depth-renderable, or stencil-renderable formats defined
> in section 9.4."
>
> The RGB9_E5 format is not renderable, as stated in the same specification
> (Bug 9338).
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104794
>
> Cc: Juan A. Suarez Romero
> Cc: Kenneth Graunke
> ---
>  src/mesa/main/fbobject.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
> index c72204e11a0..d23916d1ad7 100644
> --- a/src/mesa/main/fbobject.c
> +++ b/src/mesa/main/fbobject.c
> @@ -1976,9 +1976,6 @@ _mesa_base_fbo_format(const struct gl_context *ctx, GLenum internalFormat)
>               ctx->Extensions.ARB_texture_float) ||
>              _mesa_is_gles3(ctx) /* EXT_color_buffer_float */ )
>           ? GL_RGBA : 0;
> -   case GL_RGB9_E5:
> -      return (_mesa_is_desktop_gl(ctx) && ctx->Extensions.EXT_texture_shared_exponent)
> -         ? GL_RGB: 0;
>     case GL_ALPHA16F_ARB:
>     case GL_ALPHA32F_ARB:
>        return ctx->API == API_OPENGL_COMPAT &&
Re: [Mesa-dev] [PATCH] ac/nir: fix emission of ffract for 64-bit
Reviewed-by: Bas Nieuwenhuizen On Wed, Jan 31, 2018 at 11:23 AM, Samuel Pitoiset wrote: > Signed-off-by: Samuel Pitoiset > --- > src/amd/common/ac_nir_to_llvm.c | 23 --- > 1 file changed, 16 insertions(+), 7 deletions(-) > > diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_ > llvm.c > index fd5989389b..04fe51935a 100644 > --- a/src/amd/common/ac_nir_to_llvm.c > +++ b/src/amd/common/ac_nir_to_llvm.c > @@ -1383,15 +1383,24 @@ static LLVMValueRef emit_isign(struct > ac_llvm_context *ctx, > } > > static LLVMValueRef emit_ffract(struct ac_llvm_context *ctx, > - LLVMValueRef src0) > + LLVMValueRef src0, unsigned bitsize) > { > - const char *intr = "llvm.floor.f32"; > + LLVMTypeRef type; > + char *intr; > + > + if (bitsize == 32) { > + intr = "llvm.floor.f32"; > + type = ctx->f32; > + } else { > + intr = "llvm.floor.f64"; > + type = ctx->f64; > + } > + > LLVMValueRef fsrc0 = ac_to_float(ctx, src0); > LLVMValueRef params[] = { > fsrc0, > }; > - LLVMValueRef floor = ac_build_intrinsic(ctx, intr, > - ctx->f32, params, 1, > + LLVMValueRef floor = ac_build_intrinsic(ctx, intr, type, params, 1, > AC_FUNC_ATTR_READNONE); > return LLVMBuildFSub(ctx->builder, fsrc0, floor, ""); > } > @@ -1845,7 +1854,7 @@ static void visit_alu(struct ac_nir_context *ctx, > const nir_alu_instr *instr) > ac_to_float_type(&ctx->ac, > def_type),src[0]); > break; > case nir_op_ffract: > - result = emit_ffract(&ctx->ac, src[0]); > + result = emit_ffract(&ctx->ac, src[0], > instr->dest.dest.ssa.bit_size); > break; > case nir_op_fsin: > result = emit_intrin_1f_param(&ctx->ac, "llvm.sin", > @@ -4026,8 +4035,8 @@ static LLVMValueRef load_sample_pos(struct > ac_nir_context *ctx) > { > LLVMValueRef values[2]; > > - values[0] = emit_ffract(&ctx->ac, ctx->abi->frag_pos[0]); > - values[1] = emit_ffract(&ctx->ac, ctx->abi->frag_pos[1]); > + values[0] = emit_ffract(&ctx->ac, ctx->abi->frag_pos[0], 32); > + values[1] = emit_ffract(&ctx->ac, ctx->abi->frag_pos[1], 32); > return 
ac_build_gather_values(&ctx->ac, values, 2); > } > > -- > 2.16.1
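For readers following the thread, a minimal Python sketch of what the patch is about: fract(x) is x - floor(x), and the floor intrinsic has to match the operand width. The function names below are hypothetical and only mirror the shape of emit_ffract(); this is not the LLVM-emitting code.

```python
import math


def floor_intrinsic(bitsize):
    """Pick the floor intrinsic name for a float of the given width.

    Mirrors the branch added to emit_ffract(): 32 selects the f32
    intrinsic, anything else is treated as 64-bit.
    """
    if bitsize == 32:
        return "llvm.floor.f32"
    return "llvm.floor.f64"


def ffract(x):
    """Reference semantics of ffract: x minus its floor."""
    return x - math.floor(x)


print(floor_intrinsic(32))
print(floor_intrinsic(64))
print(ffract(2.75))
print(ffract(-0.25))
```

Note that for negative inputs the result is still in [0, 1), because floor rounds towards minus infinity.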
Re: [Mesa-dev] [PATCH libdrm] meson: fix libdrm_nouveau pkgconfig include directories
On Thursday, 2018-01-25 16:14:45 -0800, Dylan Baker wrote: > Signed-off-by: Dylan Baker Reviewed-by: Eric Engestrom > --- > > I have tested building every mesa driver against this (with and without udev!) > so I'm pretty sure that this is the last pkgbuild problem. > > I'm sure I'll be sad in a day or two... > > nouveau/meson.build | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/nouveau/meson.build b/nouveau/meson.build > index bfecf84b..f031cd63 100644 > --- a/nouveau/meson.build > +++ b/nouveau/meson.build > @@ -45,7 +45,7 @@ install_headers( > pkg.generate( >name : 'libdrm_nouveau', >libraries : libdrm_nouveau, > - subdirs : ['.', 'nouveau'], > + subdirs : ['.', 'libdrm', 'libdrm/nouveau'], >version : meson.project_version(), >requires_private : 'libdrm', >description : 'Userspace interface to nouveau kernel DRM services', > -- > 2.16.0 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radv: Don't expose VK_KHX_multiview on android.
deqp does not allow any KHX extensions, and since deqp is included in android-cts, android does not allow any khx extensions. So disable VK_KHX_multiview on android. --- src/amd/vulkan/radv_extensions.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_extensions.py index ab34c01cb6..e6c6e63627 100644 --- a/src/amd/vulkan/radv_extensions.py +++ b/src/amd/vulkan/radv_extensions.py @@ -81,7 +81,7 @@ EXTENSIONS = [ Extension('VK_KHR_wayland_surface', 6, 'VK_USE_PLATFORM_WAYLAND_KHR'), Extension('VK_KHR_xcb_surface', 6, 'VK_USE_PLATFORM_XCB_KHR'), Extension('VK_KHR_xlib_surface', 6, 'VK_USE_PLATFORM_XLIB_KHR'), -Extension('VK_KHX_multiview', 1, True), +Extension('VK_KHX_multiview', 1, '!ANDROID'), Extension('VK_EXT_debug_report', 9, True), Extension('VK_EXT_discard_rectangles',1, True), Extension('VK_EXT_external_memory_dma_buf', 1, True), -- 2.16.0.rc1.238.g530d649a79-goog ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
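A rough sketch of how an enable field that is either a Python bool or a string such as '!ANDROID' could be turned into text for generated C. The helper is hypothetical and not copied from radv_extensions.py; it also assumes the generated C sees ANDROID as a true/false constant.

```python
def enable_to_c_expr(enable):
    """Turn an Extension 'enable' value into a C expression string.

    Hypothetical helper: bools become 'true'/'false', strings are
    assumed to already be valid C expressions and pass through.
    """
    if isinstance(enable, bool):
        return "true" if enable else "false"
    return enable


for name, enable in [
    ("VK_KHR_xcb_surface", "VK_USE_PLATFORM_XCB_KHR"),
    ("VK_KHX_multiview", "!ANDROID"),
    ("VK_EXT_debug_report", True),
]:
    print("%s -> %s" % (name, enable_to_c_expr(enable)))
```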
Re: [Mesa-dev] [PATCH] radv: Don't expose VK_KHX_multiview on android.
Reviewed-by: Samuel Pitoiset On 01/31/2018 12:31 PM, Bas Nieuwenhuizen wrote: deqp does not allow any KHX extensions, and since deqp is included in android-cts, android does not allow any khx extensions. So disable VK_KHX_multiview on android. --- src/amd/vulkan/radv_extensions.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_extensions.py index ab34c01cb6..e6c6e63627 100644 --- a/src/amd/vulkan/radv_extensions.py +++ b/src/amd/vulkan/radv_extensions.py @@ -81,7 +81,7 @@ EXTENSIONS = [ Extension('VK_KHR_wayland_surface', 6, 'VK_USE_PLATFORM_WAYLAND_KHR'), Extension('VK_KHR_xcb_surface', 6, 'VK_USE_PLATFORM_XCB_KHR'), Extension('VK_KHR_xlib_surface', 6, 'VK_USE_PLATFORM_XLIB_KHR'), -Extension('VK_KHX_multiview', 1, True), +Extension('VK_KHX_multiview', 1, '!ANDROID'), Extension('VK_EXT_debug_report', 9, True), Extension('VK_EXT_discard_rectangles',1, True), Extension('VK_EXT_external_memory_dma_buf', 1, True), ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] renderonly: fix dumb BO allocation for non 32bpp formats
On 30 January 2018 at 14:22, Lucas Stach wrote:
> Take into account the resource format, instead of applying a hardcoded
> 32bpp. This not only over-allocates 16bpp formats, but also results in
> a wrong stride being filled into the handle.

Bikeshed: just use util_format_get_blocksizebits()?
util_format_get_blocksize() internally just divides
util_format_get_blocksizebits() by 8, so this is redundant.

With that:
Reviewed-by: Daniel Stone

Cheers,
Daniel
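To make the over-allocation concrete, a small sketch of the pitch arithmetic. The function name is hypothetical, and the sketch ignores any extra alignment the kernel driver applies when it creates the dumb buffer.

```python
def unaligned_pitch(width, bits_per_pixel):
    """Bytes per row for a linear buffer, before any driver alignment."""
    return width * bits_per_pixel // 8


width = 640
# Using the format's real size for a 16bpp format
print(unaligned_pitch(width, 16))
# Hardcoding 32bpp for the same buffer doubles the row size
print(unaligned_pitch(width, 32))
```

With a hardcoded 32bpp, a 16bpp resource gets rows twice as long as it needs, and that larger value is what ends up reported as the stride.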
Re: [Mesa-dev] [PATCH mesa 1/2] meson: centralise the libdrm versions information
On 30 January 2018 at 21:31, Dylan Baker wrote: > Quoting Emil Velikov (2018-01-30 10:43:06) >> On 29 January 2018 at 18:57, Dylan Baker wrote: >> > Quoting Eric Engestrom (2018-01-29 10:15:50) >> >> The big comment is taken from the equivalent block in configure.ac >> >> >> >> Signed-off-by: Eric Engestrom >> >> --- >> >> meson.build | 30 >> >> + >> >> src/gallium/targets/d3dadapter9/meson.build | 2 +- >> >> src/mesa/drivers/dri/meson.build| 2 +- >> >> 3 files changed, 24 insertions(+), 10 deletions(-) >> >> >> >> diff --git a/meson.build b/meson.build >> >> index 0a00798c2a5093ec803b..6d7a8e976ff6ad002d9a 100644 >> >> --- a/meson.build >> >> +++ b/meson.build >> >> @@ -41,6 +41,20 @@ pre_args = [ >> >> >> >> '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa";', >> >> ] >> >> >> >> +# The idea is that libdrm is distributed as one cohesive package, even >> >> +# though it is composed of multiple libraries. However some drivers >> >> +# may have different version requirements than others. This list >> >> +# codifies which drivers need which version of libdrm. Any libdrm >> >> +# version dependencies in non-driver-specific code should be reflected >> >> +# in the first entry. >> >> +libdrm_version = '2.4.75' >> >> +libdrm_amdgpu_version= '2.4.89' >> >> +libdrm_etnaviv_version = '2.4.82' >> >> +libdrm_freedreno_version = '2.4.82' >> >> +libdrm_intel_version = '2.4.75' >> >> +libdrm_nouveau_version = '2.4.66' >> >> +libdrm_radeon_version= '2.4.71' >> > >> > Is there any reason we can't just make these (for example): >> > libdrm_radeon_version= '>= 2.4.71' >> > >> > Since that avoids all of the format calls? >> > >> Is there particular reason why meson doesn't allow plain >> concatenation, and one must go through the format dance? >> Off the top of my head, I think that most higher level programming >> languages (including python) have it, making for clearer and more >> obvious code. 
>> >> That aside: >> A huge +1 from me on the idea, although the libdrm_foo checks should >> become libdrm && libdrm_foo. >> See commit 2b4eaabff01a3a8ea0c4742ac481492092c1ab4f. >> >> Thanks >> Emil > > I'm confused by that commit. pkg-config is supposed to handle this, > libdrm_intel > (for example) has `Requires : libdrm` in it, so when you generate libs you get > `-ldrm_intel -ldrm`. Why do we need to check libdrm as well? If it's just that > we need to make sure that the version matches we should fix the pkg-config > files > in libdrm to set `Requires : libdrm >= version`. Or am I missing something? > Only libdrm_intel has Requires: libdrm. Everyone else has the 'correct' Requires.Private Thus adding a version check won't be enough. Personally the commit feels like a workaround but Dave and Ilia wanted it, so we went ahead. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/4] radv: reduce vk_format_descriptions memory usage.
On 30 January 2018 at 23:41, Dave Airlie wrote:
> From: Dave Airlie
>
> This repacks to reduce the usage from 72->64 bytes, but also
> makes the descriptions static const so they don't just stay in
> the same unit.
>
> Signed-off-by: Dave Airlie
> ---
>  src/amd/vulkan/vk_format.h        | 3 ++-
>  src/amd/vulkan/vk_format_table.py | 4 ++--
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/vk_format.h b/src/amd/vulkan/vk_format.h
> index 43265ed3d97..7e2c1e8ae3f 100644
> --- a/src/amd/vulkan/vk_format.h
> +++ b/src/amd/vulkan/vk_format.h
> @@ -117,11 +117,12 @@ struct vk_format_channel_description {
>  struct vk_format_description
>  {
>     VkFormat format;
> +   enum vk_format_layout layout;
> +
>     const char *name;
>     const char *short_name;
>
>     struct vk_format_block block;
> -   enum vk_format_layout layout;
>
>     unsigned nr_channels:3;
>     unsigned is_array:1;
> diff --git a/src/amd/vulkan/vk_format_table.py b/src/amd/vulkan/vk_format_table.py
> index 36352b108d0..df08ab17970 100644
> --- a/src/amd/vulkan/vk_format_table.py
> +++ b/src/amd/vulkan/vk_format_table.py
> @@ -125,13 +125,13 @@ def write_format_table(formats):
>      print " },"
>
>      for format in formats:
> -        print 'const struct vk_format_description'
> +        print 'static const struct vk_format_description'
>          print 'vk_format_%s_description = {' % (format.short_name(),)
>          print " %s," % (format.name,)
> +        print " %s," % (layout_map(format.layout),)

It'll be nice to make this a designated initializer.
It will spare similar changes in the future, plus make it easier to
search through ;-)

-Emil
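The effect of moving a 4-byte member next to another 4-byte member can be checked with ctypes. This is a simplified stand-in, not the real vk_format_description: the struct names, the single "flags" word replacing the bitfields, and the omitted trailing members are all assumptions made for illustration.

```python
import ctypes


class Block(ctypes.Structure):
    # Stand-in for struct vk_format_block: three unsigned ints
    _fields_ = [("width", ctypes.c_uint),
                ("height", ctypes.c_uint),
                ("bits", ctypes.c_uint)]


class Before(ctypes.Structure):
    # 'layout' sits after 'block'; 'format' is followed by a pointer
    _fields_ = [("format", ctypes.c_int),
                ("name", ctypes.c_char_p),
                ("short_name", ctypes.c_char_p),
                ("block", Block),
                ("layout", ctypes.c_int),
                ("flags", ctypes.c_uint)]


class After(ctypes.Structure):
    # 'layout' moved up so it fills the hole after 'format'
    _fields_ = [("format", ctypes.c_int),
                ("layout", ctypes.c_int),
                ("name", ctypes.c_char_p),
                ("short_name", ctypes.c_char_p),
                ("block", Block),
                ("flags", ctypes.c_uint)]


print(ctypes.sizeof(Before), ctypes.sizeof(After))
```

On an ABI with 8-byte, 8-aligned pointers the second size comes out smaller, because the padding after the first int is reused; on ABIs with 4-byte pointers the two sizes are the same.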
Re: [Mesa-dev] [PATCH mesa 1/2] meson: centralise the libdrm versions information
On Tuesday, 2018-01-30 13:31:09 -0800, Dylan Baker wrote: > Quoting Emil Velikov (2018-01-30 10:43:06) > > On 29 January 2018 at 18:57, Dylan Baker wrote: > > > Quoting Eric Engestrom (2018-01-29 10:15:50) > > >> The big comment is taken from the equivalent block in configure.ac > > >> > > >> Signed-off-by: Eric Engestrom > > >> --- > > >> meson.build | 30 > > >> + > > >> src/gallium/targets/d3dadapter9/meson.build | 2 +- > > >> src/mesa/drivers/dri/meson.build| 2 +- > > >> 3 files changed, 24 insertions(+), 10 deletions(-) > > >> > > >> diff --git a/meson.build b/meson.build > > >> index 0a00798c2a5093ec803b..6d7a8e976ff6ad002d9a 100644 > > >> --- a/meson.build > > >> +++ b/meson.build > > >> @@ -41,6 +41,20 @@ pre_args = [ > > >> > > >> '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa";', > > >> ] > > >> > > >> +# The idea is that libdrm is distributed as one cohesive package, even > > >> +# though it is composed of multiple libraries. However some drivers > > >> +# may have different version requirements than others. This list > > >> +# codifies which drivers need which version of libdrm. Any libdrm > > >> +# version dependencies in non-driver-specific code should be reflected > > >> +# in the first entry. > > >> +libdrm_version = '2.4.75' > > >> +libdrm_amdgpu_version= '2.4.89' > > >> +libdrm_etnaviv_version = '2.4.82' > > >> +libdrm_freedreno_version = '2.4.82' > > >> +libdrm_intel_version = '2.4.75' > > >> +libdrm_nouveau_version = '2.4.66' > > >> +libdrm_radeon_version= '2.4.71' > > > > > > Is there any reason we can't just make these (for example): > > > libdrm_radeon_version= '>= 2.4.71' > > > > > > Since that avoids all of the format calls? > > > > > Is there particular reason why meson doesn't allow plain > > concatenation, and one must go through the format dance? > > Off the top of my head, I think that most higher level programming > > languages (including python) have it, making for clearer and more > > obvious code. 
I'm an idiot, meson supports `'foo' + 'bar'`; I'll send a v2 in a minute. > > > > That aside: > > A huge +1 from me on the idea, although the libdrm_foo checks should > > become libdrm && libdrm_foo. > > See commit 2b4eaabff01a3a8ea0c4742ac481492092c1ab4f. > > > > Thanks > > Emil > > I'm confused by that commit. pkg-config is supposed to handle this, > libdrm_intel > (for example) has `Requires : libdrm` in it, so when you generate libs you get > `-ldrm_intel -ldrm`. Why do we need to check libdrm as well? If it's just that > we need to make sure that the version matches we should fix the pkg-config > files > in libdrm to set `Requires : libdrm >= version`. Or am I missing something? I must say I'm confused as well: specific drivers should depend on the version of libdrm_$drv they need, and the generic code all drivers use depends on the version of libdrm it needs; this should cover all the cases. Quoting the comment in the code (I think you wrote it Emil?): > The idea is that libdrm is distributed as one cohesive package, even > though it is composed of multiple libraries. This means that libdrm_$drv 2.4.99 comes with libdrm 2.4.99, so there is no need to check both versions as they will always be identical, right? Are there distributions that provide separate libdrm_$drv and libdrm packages? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH mesa v2 1/2] meson: centralise the libdrm versions information
The big comment is taken from the equivalent block in configure.ac Signed-off-by: Eric Engestrom --- meson.build | 30 + src/gallium/targets/d3dadapter9/meson.build | 2 +- src/mesa/drivers/dri/meson.build| 2 +- 3 files changed, 24 insertions(+), 10 deletions(-) diff --git a/meson.build b/meson.build index 80ea60ffa7d915654a89..30c2198d77dde383d7ac 100644 --- a/meson.build +++ b/meson.build @@ -41,6 +41,20 @@ pre_args = [ '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa";', ] +# The idea is that libdrm is distributed as one cohesive package, even +# though it is composed of multiple libraries. However some drivers +# may have different version requirements than others. This list +# codifies which drivers need which version of libdrm. Any libdrm +# version dependencies in non-driver-specific code should be reflected +# in the first entry. +libdrm_version = '2.4.75' +libdrm_amdgpu_version= '2.4.89' +libdrm_etnaviv_version = '2.4.82' +libdrm_freedreno_version = '2.4.82' +libdrm_intel_version = '2.4.75' +libdrm_nouveau_version = '2.4.66' +libdrm_radeon_version= '2.4.71' + with_vulkan_icd_dir = get_option('vulkan-icd-dir') with_tests = get_option('build-tests') with_valgrind = get_option('valgrind') @@ -199,7 +213,7 @@ endif dep_libdrm_intel = [] if with_dri_i915 or with_gallium_i915 - dep_libdrm_intel = dependency('libdrm_intel', version : '>= 2.4.75') + dep_libdrm_intel = dependency('libdrm_intel', version : '>= ' + libdrm_intel_version) endif system_has_kms_drm = ['openbsd', 'netbsd', 'freebsd', 'dragonfly', 'linux'].contains(host_machine.system()) @@ -922,7 +936,7 @@ else endif with_gallium_drisw_kms = false -dep_libdrm = dependency('libdrm', version : '>= 2.4.75', +dep_libdrm = dependency('libdrm', version : '>= ' + libdrm_version, required : with_dri2 or with_dri3) if dep_libdrm.found() pre_args += '-DHAVE_LIBDRM' @@ -957,20 +971,20 @@ dep_libdrm_nouveau = [] dep_libdrm_etnaviv = [] dep_libdrm_freedreno = [] if with_amd_vk or 
with_gallium_radeonsi - dep_libdrm_amdgpu = dependency('libdrm_amdgpu', version : '>= 2.4.89') + dep_libdrm_amdgpu = dependency('libdrm_amdgpu', version : '>= ' + libdrm_amdgpu_version) endif if (with_gallium_radeonsi or with_dri_r100 or with_dri_r200 or with_gallium_r300 or with_gallium_r600) - dep_libdrm_radeon = dependency('libdrm_radeon', version : '>= 2.4.71') + dep_libdrm_radeon = dependency('libdrm_radeon', version : '>= ' + libdrm_radeon_version) endif if with_gallium_nouveau or with_dri_nouveau - dep_libdrm_nouveau = dependency('libdrm_nouveau', version : '>= 2.4.66') + dep_libdrm_nouveau = dependency('libdrm_nouveau', version : '>= ' + libdrm_nouveau_version) endif if with_gallium_etnaviv - dep_libdrm_etnaviv = dependency('libdrm_etnaviv', version : '>= 2.4.82') + dep_libdrm_etnaviv = dependency('libdrm_etnaviv', version : '>= ' + libdrm_etnaviv_version) endif if with_gallium_freedreno - dep_libdrm_freedreno = dependency('libdrm_freedreno', version : '>= 2.4.89') + dep_libdrm_freedreno = dependency('libdrm_freedreno', version : '>= ' + libdrm_freedreno_version) endif llvm_modules = ['bitwriter', 'engine', 'mcdisassembler', 'mcjit'] @@ -1203,7 +1217,7 @@ gl_priv_reqs = [ 'x11', 'xext', 'xdamage >= 1.1', 'xfixes', 'x11-xcb', 'xcb', 'xcb-glx >= 1.8.1'] if dep_libdrm.found() - gl_priv_reqs += 'libdrm >= 2.4.75' + gl_priv_reqs += 'libdrm >= ' + libdrm_version endif if dep_xxf86vm != [] and dep_xxf86vm.found() gl_priv_reqs += 'xxf86vm' diff --git a/src/gallium/targets/d3dadapter9/meson.build b/src/gallium/targets/d3dadapter9/meson.build index 5476e80e70cf9e2dba5a..498737d1edbf39b3bea2 100644 --- a/src/gallium/targets/d3dadapter9/meson.build +++ b/src/gallium/targets/d3dadapter9/meson.build @@ -78,5 +78,5 @@ pkg.generate( name : 'd3d', description : 'Native D3D driver modules', version : '.'.join(nine_version), - requires_private : 'libdrm >= 2.4.75', + requires_private : 'libdrm >= ' + libdrm_version, ) diff --git a/src/mesa/drivers/dri/meson.build 
b/src/mesa/drivers/dri/meson.build index 87021fba885ab148988d..2a2757577828598489c9 100644 --- a/src/mesa/drivers/dri/meson.build +++ b/src/mesa/drivers/dri/meson.build @@ -69,7 +69,7 @@ endif if with_dri dri_req_private = [] if dep_libdrm.found() -dri_req_private = ['libdrm >= 2.4.75'] # FIXME: don't hardcode this +dri_req_private += 'libdrm >= ' + libdrm_version endif pkg.generate( -- Cheers, Eric ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
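The same centralisation idea, sketched in Python rather than meson for illustration only: one table of minimum versions, and every constraint string built from it by plain concatenation. The dict and helper names are hypothetical; the version numbers are the ones listed in the patch.

```python
# Minimum versions, kept in one place (values taken from the patch above)
libdrm_versions = {
    "libdrm": "2.4.75",
    "libdrm_amdgpu": "2.4.89",
    "libdrm_etnaviv": "2.4.82",
    "libdrm_freedreno": "2.4.82",
    "libdrm_intel": "2.4.75",
    "libdrm_nouveau": "2.4.66",
    "libdrm_radeon": "2.4.71",
}


def version_constraint(package):
    """Build the '>= X.Y.Z' string used in dependency checks."""
    return ">= " + libdrm_versions[package]


def pkgconfig_requirement(package):
    """Build a 'name >= X.Y.Z' entry for a Requires.private list."""
    return package + " " + version_constraint(package)


print(version_constraint("libdrm_intel"))
print(pkgconfig_requirement("libdrm"))
```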
[Mesa-dev] [PATCH mesa v2 2/2] meson: move dep_libdrm_intel with the other dep_libdrm_*
Signed-off-by: Eric Engestrom --- meson.build | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/meson.build b/meson.build index 30c2198d77dde383d7ac..a98410133d462028d670 100644 --- a/meson.build +++ b/meson.build @@ -211,11 +211,6 @@ if with_gallium_pl111 and not with_gallium_vc4 error('pl111 driver requires vc4 driver') endif -dep_libdrm_intel = [] -if with_dri_i915 or with_gallium_i915 - dep_libdrm_intel = dependency('libdrm_intel', version : '>= ' + libdrm_intel_version) -endif - system_has_kms_drm = ['openbsd', 'netbsd', 'freebsd', 'dragonfly', 'linux'].contains(host_machine.system()) if host_machine.system() == 'darwin' @@ -970,6 +965,10 @@ dep_libdrm_radeon = [] dep_libdrm_nouveau = [] dep_libdrm_etnaviv = [] dep_libdrm_freedreno = [] +dep_libdrm_intel = [] +if with_dri_i915 or with_gallium_i915 + dep_libdrm_intel = dependency('libdrm_intel', version : '>= ' + libdrm_intel_version) +endif if with_amd_vk or with_gallium_radeonsi dep_libdrm_amdgpu = dependency('libdrm_amdgpu', version : '>= ' + libdrm_amdgpu_version) endif -- Cheers, Eric ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/4] radv: reduce vk_format_descriptions memory usage.
On Wednesday, 2018-01-31 11:44:11 +, Emil Velikov wrote:
> On 30 January 2018 at 23:41, Dave Airlie wrote:
> > From: Dave Airlie
> >
> > This repacks to reduce the usage from 72->64 bytes, but also
> > makes the descriptions static const so they don't just stay in
> > the same unit.
> >
> > Signed-off-by: Dave Airlie
> > ---
> >  src/amd/vulkan/vk_format.h        | 3 ++-
> >  src/amd/vulkan/vk_format_table.py | 4 ++--
> >  2 files changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/amd/vulkan/vk_format.h b/src/amd/vulkan/vk_format.h
> > index 43265ed3d97..7e2c1e8ae3f 100644
> > --- a/src/amd/vulkan/vk_format.h
> > +++ b/src/amd/vulkan/vk_format.h
> > @@ -117,11 +117,12 @@ struct vk_format_channel_description {
> >  struct vk_format_description
> >  {
> >     VkFormat format;
> > +   enum vk_format_layout layout;
> > +
> >     const char *name;
> >     const char *short_name;
> >
> >     struct vk_format_block block;
> > -   enum vk_format_layout layout;
> >
> >     unsigned nr_channels:3;
> >     unsigned is_array:1;
> > diff --git a/src/amd/vulkan/vk_format_table.py b/src/amd/vulkan/vk_format_table.py
> > index 36352b108d0..df08ab17970 100644
> > --- a/src/amd/vulkan/vk_format_table.py
> > +++ b/src/amd/vulkan/vk_format_table.py
> > @@ -125,13 +125,13 @@ def write_format_table(formats):
> >      print " },"
> >
> >      for format in formats:
> > -        print 'const struct vk_format_description'
> > +        print 'static const struct vk_format_description'
> >          print 'vk_format_%s_description = {' % (format.short_name(),)
> >          print " %s," % (format.name,)
> > +        print " %s," % (layout_map(format.layout),)
>
> It'll be nice to make this a designated initializer.
> It will spare similar changes in the future, plus make it easier to
> search through ;-)

I had the same thought, and just did that there as well as in gallium;
I'll send the patches in a minute.
[Mesa-dev] [PATCH mesa 2/2] gallium/util: use designated initialisers in formats table
Signed-off-by: Eric Engestrom --- src/gallium/auxiliary/util/u_format_table.py | 98 ++-- 1 file changed, 50 insertions(+), 48 deletions(-) diff --git a/src/gallium/auxiliary/util/u_format_table.py b/src/gallium/auxiliary/util/u_format_table.py index a09ae53cbc8966ba5bb1..3da5011840243c4812a7 100644 --- a/src/gallium/auxiliary/util/u_format_table.py +++ b/src/gallium/auxiliary/util/u_format_table.py @@ -125,77 +125,79 @@ def do_swizzle_array(channels, swizzles): for format in formats: print 'const struct util_format_description' print 'util_format_%s_description = {' % (format.short_name(),) -print " %s," % (format.name,) -print " \"%s\"," % (format.name,) -print " \"%s\"," % (format.short_name(),) -print " {%u, %u, %u},\t/* block */" % (format.block_width, format.block_height, format.block_size()) -print " %s," % (layout_map(format.layout),) -print " %u,\t/* nr_channels */" % (format.nr_channels(),) -print " %s,\t/* is_array */" % (bool_map(format.is_array()),) -print " %s,\t/* is_bitmask */" % (bool_map(format.is_bitmask()),) -print " %s,\t/* is_mixed */" % (bool_map(format.is_mixed()),) +print " .format = %s," % (format.name,) +print " .name = \"%s\"," % (format.name,) +print " .short_name = \"%s\"," % (format.short_name(),) +print " .block = {%u, %u, %u}," % (format.block_width, format.block_height, format.block_size()) +print " .layout = %s," % (layout_map(format.layout),) +print " .nr_channels = %u," % (format.nr_channels(),) +print " .is_array = %s," % (bool_map(format.is_array()),) +print " .is_bitmask = %s," % (bool_map(format.is_bitmask()),) +print " .is_mixed = %s," % (bool_map(format.is_mixed()),) +print " .channel = " u_format_pack.print_channels(format, do_channel_array) +print " .swizzle = " u_format_pack.print_channels(format, do_swizzle_array) -print " %s," % (colorspace_map(format.colorspace),) +print " .colorspace = %s," % (colorspace_map(format.colorspace),) access = True if format.layout in ('bptc', 'astc'): access = False if format.layout == 
'etc' and format.short_name() != 'etc1_rgb8': access = False if format.colorspace != ZS and not format.is_pure_color() and access: -print " &util_format_%s_unpack_rgba_8unorm," % format.short_name() -print " &util_format_%s_pack_rgba_8unorm," % format.short_name() +print " .unpack_rgba_8unorm = &util_format_%s_unpack_rgba_8unorm," % format.short_name() +print " .pack_rgba_8unorm = &util_format_%s_pack_rgba_8unorm," % format.short_name() if format.layout == 's3tc' or format.layout == 'rgtc': -print " &util_format_%s_fetch_rgba_8unorm," % format.short_name() +print " .fetch_rgba_8unorm = &util_format_%s_fetch_rgba_8unorm," % format.short_name() else: -print " NULL, /* fetch_rgba_8unorm */" -print " &util_format_%s_unpack_rgba_float," % format.short_name() -print " &util_format_%s_pack_rgba_float," % format.short_name() -print " &util_format_%s_fetch_rgba_float," % format.short_name() +print " .fetch_rgba_8unorm = NULL," +print " .unpack_rgba_float = &util_format_%s_unpack_rgba_float," % format.short_name() +print " .pack_rgba_float = &util_format_%s_pack_rgba_float," % format.short_name() +print " .fetch_rgba_float = &util_format_%s_fetch_rgba_float," % format.short_name() else: -print " NULL, /* unpack_rgba_8unorm */" -print " NULL, /* pack_rgba_8unorm */" -print " NULL, /* fetch_rgba_8unorm */" -print " NULL, /* unpack_rgba_float */" -print " NULL, /* pack_rgba_float */" -print " NULL, /* fetch_rgba_float */" +print " .unpack_rgba_8unorm = NULL," +print " .pack_rgba_8unorm = NULL," +print " .fetch_rgba_8unorm = NULL," +print " .unpack_rgba_float = NULL," +print " .pack_rgba_float = NULL," +print " .fetch_rgba_float = NULL," if format.has_depth(): -print " &util_format_%s_unpack_z_32unorm," % format.short_name() -print " &util_format_%s_pack_z_32unorm," % format.short_name() -print " &util_format_%s_unpack_z_float," % format.short_name() -print " &util_format_%s_pack_z_float," % format.short_name() +print " .unpack_z_32unorm = &util_format_%s_unpack_z_32unorm," % 
format.short_name() +print " .pack_z_32unorm = &util_format_%s_pack_z_32unorm," % format.short_name() +pr
[Mesa-dev] [PATCH mesa 1/2] radv: use designated initialisers in formats table
Signed-off-by: Eric Engestrom --- src/amd/vulkan/vk_format_table.py | 22 -- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/src/amd/vulkan/vk_format_table.py b/src/amd/vulkan/vk_format_table.py index 36352b108d0b5220a901..f903e21f697dc42981ed 100644 --- a/src/amd/vulkan/vk_format_table.py +++ b/src/amd/vulkan/vk_format_table.py @@ -127,18 +127,20 @@ def do_swizzle_array(channels, swizzles): for format in formats: print 'const struct vk_format_description' print 'vk_format_%s_description = {' % (format.short_name(),) -print " %s," % (format.name,) -print " \"%s\"," % (format.name,) -print " \"%s\"," % (format.short_name(),) -print " {%u, %u, %u},\t/* block */" % (format.block_width, format.block_height, format.block_size()) -print " %s," % (layout_map(format.layout),) -print " %u,\t/* nr_channels */" % (format.nr_channels(),) -print " %s,\t/* is_array */" % (bool_map(format.is_array()),) -print " %s,\t/* is_bitmask */" % (bool_map(format.is_bitmask()),) -print " %s,\t/* is_mixed */" % (bool_map(format.is_mixed()),) +print " .format = %s," % (format.name,) +print " .name = \"%s\"," % (format.name,) +print " .short_name = \"%s\"," % (format.short_name(),) +print " .block = {%u, %u, %u}," % (format.block_width, format.block_height, format.block_size()) +print " .layout = %s," % (layout_map(format.layout),) +print " .nr_channels = %u," % (format.nr_channels(),) +print " .is_array = %s," % (bool_map(format.is_array()),) +print " .is_bitmask = %s," % (bool_map(format.is_bitmask()),) +print " .is_mixed = %s," % (bool_map(format.is_mixed()),) +print " .channel = " print_channels(format, do_channel_array) +print " .swizzle = " print_channels(format, do_swizzle_array) -print " %s," % (colorspace_map(format.colorspace),) +print " .colorspace = %s," % (colorspace_map(format.colorspace),) print "};" print -- Cheers, Eric ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
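A compact Python 3 sketch of the generator pattern these two patches move to: each value is emitted as `.member = value`, so the output no longer depends on the member order in the C struct. The function name and the sample values are illustrative assumptions, not taken from vk_format_table.py.

```python
def emit_description(short_name, fields):
    """Return C source for one format description using designated initialisers.

    'fields' is a list of (member, value) pairs; both are emitted verbatim.
    """
    lines = [
        "const struct vk_format_description",
        "vk_format_%s_description = {" % short_name,
    ]
    for member, value in fields:
        lines.append("   .%s = %s," % (member, value))
    lines.append("};")
    return "\n".join(lines)


# Sample input; the values are placeholders for illustration
text = emit_description("r8_unorm", [
    ("format", "VK_FORMAT_R8_UNORM"),
    ("name", '"VK_FORMAT_R8_UNORM"'),
    ("layout", "VK_FORMAT_LAYOUT_PLAIN"),
    ("nr_channels", "1"),
])
print(text)
```

With this shape, a struct repack like the one in patch 3/4 of the other series needs no change to the generator, and each member can be found by searching for its name.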
[Mesa-dev] [Bug 7391] Memory leak detected in _mesa_HashInsert with cairo-dock
https://bugs.freedesktop.org/show_bug.cgi?id=7391

yanhua <78666...@qq.com> changed:

 What | Removed | Added
------+---------+-----------------
 CC   |         | 78666...@qq.com

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH v2 2/7] egl: add support for EGL_ANDROID_blob_cache
On 31 January 2018 at 07:17, Tapani Pälli wrote: > v2: cleanup, move callbacks to _egl_display struct (Emil Velikov) > adapt to earlier ctx->screen changes > > Signed-off-by: Tapani Pälli > --- > src/egl/drivers/dri2/egl_dri2.c | 25 + > src/egl/drivers/dri2/egl_dri2.h | 1 + > src/egl/main/eglapi.c | 30 ++ > src/egl/main/eglapi.h | 4 > src/egl/main/egldisplay.h | 4 > src/egl/main/eglentrypoint.h| 1 + > 6 files changed, 65 insertions(+) > > diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c > index d5a4f72e86..a54f8a4d96 100644 > --- a/src/egl/drivers/dri2/egl_dri2.c > +++ b/src/egl/drivers/dri2/egl_dri2.c > @@ -458,6 +458,7 @@ static const struct dri2_extension_match > optional_core_extensions[] = { > { __DRI2_INTEROP, 1, offsetof(struct dri2_egl_display, interop) }, > { __DRI_IMAGE, 1, offsetof(struct dri2_egl_display, image) }, > { __DRI2_FLUSH_CONTROL, 1, offsetof(struct dri2_egl_display, > flush_control) }, > + { __DRI2_BLOB, 1, offsetof(struct dri2_egl_display, blob) }, > { NULL, 0, 0 } > }; > > @@ -727,6 +728,9 @@ dri2_setup_screen(_EGLDisplay *disp) >} > } > > + if (dri2_dpy->blob) > + disp->Extensions.ANDROID_blob_cache = EGL_TRUE; > + > disp->Extensions.KHR_reusable_sync = EGL_TRUE; > > if (dri2_dpy->image) { > @@ -3016,6 +3020,26 @@ dri2_dup_native_fence_fd(_EGLDriver *drv, _EGLDisplay > *dpy, _EGLSync *sync) > return dup(sync->SyncFd); > } > > +static void > +dri2_set_blob_cache_funcs(_EGLDriver *drv, _EGLDisplay *dpy, > + EGLSetBlobFuncANDROID set, > + EGLGetBlobFuncANDROID get) > +{ > + struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy); > + > + /* No blob support. */ > + if (!dri2_dpy->blob) > + return; > + Should never(tm) happen. As in: the extension won't be advertised and thus applications shouldn't use the func. pointer they get from eglGetProcAddress. If we'd want to catch such abuse it ought to be in eglapi.c. > + /* No functions to set. 
*/ > + if (!dpy->BlobCacheSet) > + return; > + Not needed - single caller that errors out if the pointer is NULL. > + dri2_dpy->blob->set_cache_funcs(dri2_dpy->dri_screen, > + dpy->BlobCacheSet, > + dpy->BlobCacheGet); > +} > + > static EGLint > dri2_client_wait_sync(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync *sync, >EGLint flags, EGLTime timeout) > @@ -3234,6 +3258,7 @@ _eglBuiltInDriver(void) > dri2_drv->API.GLInteropQueryDeviceInfo = dri2_interop_query_device_info; > dri2_drv->API.GLInteropExportObject = dri2_interop_export_object; > dri2_drv->API.DupNativeFenceFDANDROID = dri2_dup_native_fence_fd; > + dri2_drv->API.SetBlobCacheFuncsANDROID = dri2_set_blob_cache_funcs; > > dri2_drv->Name = "DRI2"; > > diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h > index cc76c73eab..c49156fbb6 100644 > --- a/src/egl/drivers/dri2/egl_dri2.h > +++ b/src/egl/drivers/dri2/egl_dri2.h > @@ -171,6 +171,7 @@ struct dri2_egl_display > const __DRInoErrorExtension*no_error; > const __DRI2configQueryExtension *config; > const __DRI2fenceExtension *fence; > + const __DRI2blobExtension *blob; > const __DRI2rendererQueryExtension *rendererQuery; > const __DRI2interopExtension *interop; > int fd; > diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c > index 5110688f2d..f2ba260060 100644 > --- a/src/egl/main/eglapi.c > +++ b/src/egl/main/eglapi.c > @@ -476,6 +476,7 @@ _eglCreateExtensionsString(_EGLDisplay *dpy) > char *exts = dpy->ExtensionsString; > > /* Please keep these sorted alphabetically. 
*/ > + _EGL_CHECK_EXTENSION(ANDROID_blob_cache); > _EGL_CHECK_EXTENSION(ANDROID_framebuffer_target); > _EGL_CHECK_EXTENSION(ANDROID_image_native_buffer); > _EGL_CHECK_EXTENSION(ANDROID_native_fence_sync); > @@ -2522,6 +2523,35 @@ eglQueryDmaBufModifiersEXT(EGLDisplay dpy, EGLint > format, EGLint max_modifiers, > RETURN_EGL_EVAL(disp, ret); > } > > +static void EGLAPIENTRY > +eglSetBlobCacheFuncsANDROID(EGLDisplay *dpy, EGLSetBlobFuncANDROID set, > +EGLGetBlobFuncANDROID get) > +{ > + _EGLDisplay *disp = _eglLockDisplay(dpy); > + _EGLDriver *drv = _eglCheckDisplay(disp, __func__); > + This is the only EGL API which has no return type. Hence we cannot use the _EGL_FUNC_START/_EGL_CHECK_DISPLAY macros ;-( We'd want the _eglSetFuncName() call (from the former macro) though. I'd add a very small comment + [sort of] inline the macros. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/7] glsl/tests: move utility functions in cache_test
On 31 January 2018 at 07:17, Tapani Pälli wrote:
> Patch moves functions higher so that we can utilize them from
> test_disk_cache_create which is modified by next patch.
>
> Signed-off-by: Tapani Pälli
> ---

I realise you're just moving stuff, so feel free to ignore the nits.

>  src/compiler/glsl/tests/cache_test.c | 70 ++--
>  1 file changed, 35 insertions(+), 35 deletions(-)
>
> diff --git a/src/compiler/glsl/tests/cache_test.c b/src/compiler/glsl/tests/cache_test.c
> index 75319f1160..dd11fd5944 100644
> --- a/src/compiler/glsl/tests/cache_test.c
> +++ b/src/compiler/glsl/tests/cache_test.c
> @@ -147,6 +147,41 @@ check_directories_created(const char *cache_dir)
>     expect_true(sub_dirs_created, "create sub dirs");
>  }
>
> +static bool
> +does_cache_contain(struct disk_cache *cache, const cache_key key)
> +{
> +   void *result;
> +
> +   result = disk_cache_get(cache, key, NULL);
> +
> +   if (result) {
> +      free(result);
> +      return true;
> +   }
> +
> +   return false;
> +}

Nit:

   void *result = disk_cache_get(cache, key, NULL);
   if (!result)
      return false;

   free(result);

-Emil
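For reference, the early-return shape suggested in the nit can be sketched in isolation as follows. This is only an illustration under stated assumptions: `mini_cache` and `mini_cache_get` are hypothetical stand-ins for `struct disk_cache` and `disk_cache_get`, not Mesa's API, and the only behaviour assumed is that a hit returns a malloc'd buffer that the caller must free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct disk_cache: holds at most one entry. */
struct mini_cache {
   const char *key;    /* NULL when the cache is empty */
   const char *value;
};

/* Hypothetical stand-in for disk_cache_get(): returns a malloc'd copy of
 * the stored value on a hit, or NULL on a miss. The caller owns the copy.
 */
static void *
mini_cache_get(const struct mini_cache *cache, const char *key)
{
   assert(cache != NULL && key != NULL);

   if (!cache->key || strcmp(cache->key, key) != 0)
      return NULL;

   size_t len = strlen(cache->value) + 1;
   void *copy = malloc(len);
   if (copy)
      memcpy(copy, cache->value, len);
   return copy;
}

/* Early-return variant: bail out on a miss, otherwise free and report. */
static bool
mini_cache_contains(const struct mini_cache *cache, const char *key)
{
   void *result = mini_cache_get(cache, key);
   if (!result)
      return false;

   free(result);
   return true;
}
```

The miss case leaves the function first, so the hit path needs no extra nesting and the single free() sits next to the final return.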
Re: [Mesa-dev] [PATCH] radv: do not dump meta shader stats
Reviewed-by: Bas Nieuwenhuizen On Wed, Jan 31, 2018 at 11:40 AM, Samuel Pitoiset wrote: > That's quite useless and that pollutes the output. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_pipeline.c | 30 +- > src/amd/vulkan/radv_shader.h | 9 + > 2 files changed, 18 insertions(+), 21 deletions(-) > > diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_ > pipeline.c > index 785c216b4a..6547637338 100644 > --- a/src/amd/vulkan/radv_pipeline.c > +++ b/src/amd/vulkan/radv_pipeline.c > @@ -110,18 +110,6 @@ void radv_DestroyPipeline( > radv_pipeline_destroy(device, pipeline, pAllocator); > } > > -static void radv_dump_pipeline_stats(struct radv_device *device, struct > radv_pipeline *pipeline) > -{ > - int i; > - > - for (i = 0; i < MESA_SHADER_STAGES; i++) { > - if (!pipeline->shaders[i]) > - continue; > - > - radv_shader_dump_stats(device, pipeline->shaders[i], i, > stderr); > - } > -} > - > static uint32_t get_hash_flags(struct radv_device *device) > { > uint32_t hash_flags = 0; > @@ -1861,8 +1849,15 @@ void radv_create_shaders(struct radv_pipeline > *pipeline, > > for (int i = 0; i < MESA_SHADER_STAGES; ++i) { > free(codes[i]); > - if (modules[i] && !pipeline->device->keep_shader_info) > - ralloc_free(nir[i]); > + if (modules[i]) { > + if (!pipeline->device->keep_shader_info) > + ralloc_free(nir[i]); > + > + if (radv_can_dump_shader_stats(device, > modules[i])) > + radv_shader_dump_stats(device, > + > pipeline->shaders[i], > + i, stderr); > + } > } > > if (fs_m.nir) > @@ -3233,10 +3228,6 @@ radv_pipeline_init(struct radv_pipeline *pipeline, > pipeline->graphics.vtx_emit_num = 2; > } > > - if (device->instance->debug_flags & RADV_DEBUG_DUMP_SHADER_STATS) { > - radv_dump_pipeline_stats(device, pipeline); > - } > - > result = radv_pipeline_scratch_init(device, pipeline); > radv_pipeline_generate_pm4(pipeline, pCreateInfo, extra, &blend, > &tess, &gs, prim, gs_out); > > @@ -3400,9 +3391,6 @@ static VkResult radv_compute_pipeline_create( > > 
*pPipeline = radv_pipeline_to_handle(pipeline); > > - if (device->instance->debug_flags & RADV_DEBUG_DUMP_SHADER_STATS) { > - radv_dump_pipeline_stats(device, pipeline); > - } > return VK_SUCCESS; > } > > diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h > index f6486863f8..b07f8a89e7 100644 > --- a/src/amd/vulkan/radv_shader.h > +++ b/src/amd/vulkan/radv_shader.h > @@ -122,4 +122,13 @@ radv_can_dump_shader(struct radv_device *device, >module && !module->nir; > } > > +static inline bool > +radv_can_dump_shader_stats(struct radv_device *device, > + struct radv_shader_module *module) > +{ > + /* Only dump non-meta shader stats. */ > + return device->instance->debug_flags & > RADV_DEBUG_DUMP_SHADER_STATS && > + module && !module->nir; > +} > + > #endif > -- > 2.16.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 7/7] i965: add __DRI2_BLOB support and set cache functions
On 31.01.2018 11:27, Jordan Justen wrote:
> On 2018-01-30 23:17:06, Tapani Pälli wrote:
>> v2: adjust to change that moved cache from ctx to screen
>>
>> Signed-off-by: Tapani Pälli
>> ---
>>  src/mesa/drivers/dri/i965/intel_screen.c | 21 +
>>  1 file changed, 21 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
>> index e1e520bc89..2c445af1e4 100644
>> --- a/src/mesa/drivers/dri/i965/intel_screen.c
>> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
>> @@ -36,6 +36,7 @@
>>  #include "main/version.h"
>>  #include "swrast/s_renderbuffer.h"
>>  #include "util/ralloc.h"
>> +#include "util/disk_cache.h"
>>  #include "brw_defines.h"
>>  #include "brw_state.h"
>>  #include "compiler/nir/nir.h"
>> @@ -1494,6 +1495,19 @@ brw_query_renderer_string(__DRIscreen *dri_screen,
>>     return -1;
>>  }
>>
>> +static void
>> +brw_set_cache_funcs(__DRIscreen *dri_screen,
>> +                    __DRIblobCacheSet set, __DRIblobCacheGet get)
>> +{
>> +   const struct intel_screen *const screen =
>> +      (struct intel_screen *) dri_screen->driverPrivate;
>> +
>> +   if (!screen->disk_cache)
>> +      return;
>
> Could this cause us to fail tests if the disk cache is not enabled?
> For example, if they test setting the functions to NULL, or set
> multiple times?

No because those are handled already at EGL level. My Piglit API level
test passes even if disk cache is not created.

> -Jordan
>
>> +
>> +   disk_cache_set_callbacks(screen->disk_cache, set, get);
>> +}
>> +
>>  static const __DRI2rendererQueryExtension intelRendererQueryExtension = {
>>     .base = { __DRI2_RENDERER_QUERY, 1 },
>> @@ -1505,6 +1519,11 @@ static const __DRIrobustnessExtension dri2Robustness = {
>>     .base = { __DRI2_ROBUSTNESS, 1 }
>>  };
>>
>> +static const __DRI2blobExtension intelBlobExtension = {
>> +   .base = { __DRI2_BLOB, 1 },
>> +   .set_cache_funcs = brw_set_cache_funcs
>> +};
>> +
>>  static const __DRIextension *screenExtensions[] = {
>>      &intelTexBufferExtension.base,
>>      &intelFenceExtension.base,
>> @@ -1513,6 +1532,7 @@ static const __DRIextension *screenExtensions[] = {
>>      &intelRendererQueryExtension.base,
>>      &dri2ConfigQueryExtension.base,
>>      &dri2NoErrorExtension.base,
>> +    &intelBlobExtension.base,
>>      NULL
>>  };
>>
>> @@ -1525,6 +1545,7 @@ static const __DRIextension *intelRobustScreenExtensions[] = {
>>      &dri2ConfigQueryExtension.base,
>>      &dri2Robustness.base,
>>      &dri2NoErrorExtension.base,
>> +    &intelBlobExtension.base,
>>      NULL
>>  };
>> --
>> 2.13.6
Re: [Mesa-dev] [PATCH v3 6/7] disk cache: add callback functionality
On 31.01.2018 11:18, Jordan Justen wrote: On 2018-01-30 23:17:05, Tapani Pälli wrote: v2: add disk_cache_has_key, disk_cache_put_key support using blob cache (Nicolai, Jordan) v3: rename set_cb as put_cb to match existing naming (Timothy) Signed-off-by: Tapani Pälli --- src/util/disk_cache.c | 49 + src/util/disk_cache.h | 19 +++ 2 files changed, 68 insertions(+) diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c index 9fa264440b..5af0346c7a 100644 --- a/src/util/disk_cache.c +++ b/src/util/disk_cache.c @@ -101,6 +101,9 @@ struct disk_cache { /* Driver cache keys. */ uint8_t *driver_keys_blob; size_t driver_keys_blob_size; + + disk_cache_put_cb blob_put_cb; + disk_cache_get_cb blob_get_cb; }; struct disk_cache_put_job { @@ -1012,6 +1015,11 @@ disk_cache_put(struct disk_cache *cache, const cache_key key, const void *data, size_t size, struct cache_item_metadata *cache_item_metadata) { + if (cache->blob_put_cb) { + cache->blob_put_cb(key, CACHE_KEY_SIZE, data, size); + return; + } + struct disk_cache_put_job *dc_job = create_put_job(cache, key, data, size, cache_item_metadata); @@ -1079,6 +1087,29 @@ disk_cache_get(struct disk_cache *cache, const cache_key key, size_t *size) if (size) *size = 0; + if (cache->blob_get_cb) { +/* This is what Android EGL defines as the maxValueSize in egl_cache_t + * class implementation. + */ +#define MAX_BLOB_SIZE 64 * 1024 What about 'const signed long max_blob_size = 64 * 1024;' instead? Yes, this is fine. I think earlier series I used it in 2 places and that's why it was left like this. 
Reviewed-by: Jordan Justen + void *blob = malloc(MAX_BLOB_SIZE); + if (!blob) + return NULL; + + signed long bytes = + cache->blob_get_cb(key, CACHE_KEY_SIZE, blob, MAX_BLOB_SIZE); + + if (!bytes) { + free(blob); + return NULL; + } + + if (size) + *size = bytes; + return blob; +#undef MAX_BLOB_SIZE + } + filename = get_cache_file(cache, key); if (filename == NULL) goto fail; @@ -1194,6 +1225,11 @@ disk_cache_put_key(struct disk_cache *cache, const cache_key key) int i = CPU_TO_LE32(*key_chunk) & CACHE_INDEX_KEY_MASK; unsigned char *entry; + if (cache->blob_put_cb) { + cache->blob_put_cb(key, CACHE_KEY_SIZE, key_chunk, sizeof(uint32_t)); + return; + } + if (!cache->path) return; @@ -1216,6 +1252,11 @@ disk_cache_has_key(struct disk_cache *cache, const cache_key key) int i = CPU_TO_LE32(*key_chunk) & CACHE_INDEX_KEY_MASK; unsigned char *entry; + if (cache->blob_get_cb) { + uint32_t blob; + return cache->blob_get_cb(key, CACHE_KEY_SIZE, &blob, sizeof(uint32_t)); + } + /* Initialize path if not initialized yet. 
*/ if (cache->path_init_failed || (!cache->path && !disk_cache_path_init(cache))) @@ -1239,4 +1280,12 @@ disk_cache_compute_key(struct disk_cache *cache, const void *data, size_t size, _mesa_sha1_final(&ctx, key); } +void +disk_cache_set_callbacks(struct disk_cache *cache, disk_cache_put_cb put, + disk_cache_get_cb get) +{ + cache->blob_put_cb = put; + cache->blob_get_cb = get; +} + #endif /* ENABLE_SHADER_CACHE */ diff --git a/src/util/disk_cache.h b/src/util/disk_cache.h index 488b297ead..f84840fb5c 100644 --- a/src/util/disk_cache.h +++ b/src/util/disk_cache.h @@ -50,6 +50,14 @@ typedef uint8_t cache_key[CACHE_KEY_SIZE]; #define CACHE_ITEM_TYPE_UNKNOWN 0x0 #define CACHE_ITEM_TYPE_GLSL 0x1 +typedef void +(*disk_cache_put_cb) (const void *key, signed long keySize, + const void *value, signed long valueSize); + +typedef signed long +(*disk_cache_get_cb) (const void *key, signed long keySize, + void *value, signed long valueSize); + struct cache_item_metadata { /** * The cache item type. This could be used to identify a GLSL cache item, @@ -207,6 +215,10 @@ void disk_cache_compute_key(struct disk_cache *cache, const void *data, size_t size, cache_key key); +void +disk_cache_set_callbacks(struct disk_cache *cache, disk_cache_put_cb put, + disk_cache_get_cb get); + #else static inline struct disk_cache * @@ -260,6 +272,13 @@ disk_cache_compute_key(struct disk_cache *cache, const void *data, size_t size, return; } +static inline void +disk_cache_set_callbacks(struct disk_cache *cache, disk_cache_put_cb put, + disk_cache_get_cb get) +{ + return; +} + #endif /* ENABLE_SHADER_CACHE */ #ifdef __cplusplus -- 2.13.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
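As a rough, self-contained sketch of the callback path discussed in this patch: the callback typedefs below mirror the ones the patch adds, but everything else is hypothetical and only for illustration (`cb_cache`, the single-slot `slot_put`/`slot_get` store standing in for an external blob cache, and the local `KEY_SIZE` and `SLOT_CAPACITY` constants). The get path uses a typed constant rather than a macro, as suggested in review. It also treats a negative size or a size larger than the buffer as a miss, which is an extra assumption of this sketch and not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Local assumptions for the sketch, not Mesa definitions. */
enum { KEY_SIZE = 20, SLOT_CAPACITY = 64 };

/* Callback shapes mirroring the typedefs added by the patch. */
typedef void (*put_cb)(const void *key, signed long key_size,
                       const void *value, signed long value_size);
typedef signed long (*get_cb)(const void *key, signed long key_size,
                              void *value, signed long value_size);

/* Hypothetical minimal cache object: only the two callbacks. */
struct cb_cache {
   put_cb blob_put_cb;
   get_cb blob_get_cb;
};

/* Hypothetical single-slot store standing in for an external blob cache. */
static unsigned char slot_key[KEY_SIZE];
static unsigned char slot_value[SLOT_CAPACITY];
static signed long slot_value_size;

static void
slot_put(const void *key, signed long key_size,
         const void *value, signed long value_size)
{
   if (key_size != KEY_SIZE || value_size <= 0 || value_size > SLOT_CAPACITY)
      return;

   memcpy(slot_key, key, KEY_SIZE);
   memcpy(slot_value, value, (size_t) value_size);
   slot_value_size = value_size;
}

static signed long
slot_get(const void *key, signed long key_size,
         void *value, signed long value_size)
{
   if (slot_value_size == 0 || key_size != KEY_SIZE ||
       memcmp(slot_key, key, KEY_SIZE) != 0)
      return 0;                        /* miss */

   if (slot_value_size <= value_size)
      memcpy(value, slot_value, (size_t) slot_value_size);
   return slot_value_size;
}

/* Put: hand the data to the callback when one is installed. */
static void
cb_cache_put(const struct cb_cache *cache, const unsigned char *key,
             const void *data, size_t size)
{
   assert(cache != NULL);
   if (cache->blob_put_cb)
      cache->blob_put_cb(key, KEY_SIZE, data, (signed long) size);
}

/* Get: returns a malloc'd buffer on a hit (caller frees), NULL on a miss. */
static void *
cb_cache_get(const struct cb_cache *cache, const unsigned char *key,
             size_t *size)
{
   /* Typed constant instead of a #define/#undef pair. */
   const signed long max_blob_size = 64 * 1024;

   assert(cache != NULL && cache->blob_get_cb != NULL);

   if (size)
      *size = 0;

   void *blob = malloc((size_t) max_blob_size);
   if (!blob)
      return NULL;

   signed long bytes = cache->blob_get_cb(key, KEY_SIZE, blob, max_blob_size);
   if (bytes <= 0 || bytes > max_blob_size) {
      free(blob);
      return NULL;
   }

   if (size)
      *size = (size_t) bytes;
   return blob;
}
```

A block-scoped constant keeps the limit visible only where it is used and avoids the unparenthesised `64 * 1024` macro expansion.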
Re: [Mesa-dev] [PATCH v2 2/7] egl: add support for EGL_ANDROID_blob_cache
On 31.01.2018 13:58, Emil Velikov wrote: On 31 January 2018 at 07:17, Tapani Pälli wrote: v2: cleanup, move callbacks to _egl_display struct (Emil Velikov) adapt to earlier ctx->screen changes Signed-off-by: Tapani Pälli --- src/egl/drivers/dri2/egl_dri2.c | 25 + src/egl/drivers/dri2/egl_dri2.h | 1 + src/egl/main/eglapi.c | 30 ++ src/egl/main/eglapi.h | 4 src/egl/main/egldisplay.h | 4 src/egl/main/eglentrypoint.h| 1 + 6 files changed, 65 insertions(+) diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index d5a4f72e86..a54f8a4d96 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -458,6 +458,7 @@ static const struct dri2_extension_match optional_core_extensions[] = { { __DRI2_INTEROP, 1, offsetof(struct dri2_egl_display, interop) }, { __DRI_IMAGE, 1, offsetof(struct dri2_egl_display, image) }, { __DRI2_FLUSH_CONTROL, 1, offsetof(struct dri2_egl_display, flush_control) }, + { __DRI2_BLOB, 1, offsetof(struct dri2_egl_display, blob) }, { NULL, 0, 0 } }; @@ -727,6 +728,9 @@ dri2_setup_screen(_EGLDisplay *disp) } } + if (dri2_dpy->blob) + disp->Extensions.ANDROID_blob_cache = EGL_TRUE; + disp->Extensions.KHR_reusable_sync = EGL_TRUE; if (dri2_dpy->image) { @@ -3016,6 +3020,26 @@ dri2_dup_native_fence_fd(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync *sync) return dup(sync->SyncFd); } +static void +dri2_set_blob_cache_funcs(_EGLDriver *drv, _EGLDisplay *dpy, + EGLSetBlobFuncANDROID set, + EGLGetBlobFuncANDROID get) +{ + struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy); + + /* No blob support. */ + if (!dri2_dpy->blob) + return; + Should never(tm) happen. As in: the extension won't be advertised and thus applications shouldn't use the func. pointer they get from eglGetProcAddress. If we'd want to catch such abuse it ought to be in eglapi.c. Fair enough, it should be unnecessary. 
I had it because I haven't seen actual apps care too much about extension string but in this case it's fine because it's Android EGL layer that utilizes this, not them buggy apps. + /* No functions to set. */ + if (!dpy->BlobCacheSet) + return; + Not needed - single caller that errors out if the pointer is NULL. Will remove + dri2_dpy->blob->set_cache_funcs(dri2_dpy->dri_screen, + dpy->BlobCacheSet, + dpy->BlobCacheGet); +} + static EGLint dri2_client_wait_sync(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync *sync, EGLint flags, EGLTime timeout) @@ -3234,6 +3258,7 @@ _eglBuiltInDriver(void) dri2_drv->API.GLInteropQueryDeviceInfo = dri2_interop_query_device_info; dri2_drv->API.GLInteropExportObject = dri2_interop_export_object; dri2_drv->API.DupNativeFenceFDANDROID = dri2_dup_native_fence_fd; + dri2_drv->API.SetBlobCacheFuncsANDROID = dri2_set_blob_cache_funcs; dri2_drv->Name = "DRI2"; diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h index cc76c73eab..c49156fbb6 100644 --- a/src/egl/drivers/dri2/egl_dri2.h +++ b/src/egl/drivers/dri2/egl_dri2.h @@ -171,6 +171,7 @@ struct dri2_egl_display const __DRInoErrorExtension*no_error; const __DRI2configQueryExtension *config; const __DRI2fenceExtension *fence; + const __DRI2blobExtension *blob; const __DRI2rendererQueryExtension *rendererQuery; const __DRI2interopExtension *interop; int fd; diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c index 5110688f2d..f2ba260060 100644 --- a/src/egl/main/eglapi.c +++ b/src/egl/main/eglapi.c @@ -476,6 +476,7 @@ _eglCreateExtensionsString(_EGLDisplay *dpy) char *exts = dpy->ExtensionsString; /* Please keep these sorted alphabetically. 
*/ + _EGL_CHECK_EXTENSION(ANDROID_blob_cache); _EGL_CHECK_EXTENSION(ANDROID_framebuffer_target); _EGL_CHECK_EXTENSION(ANDROID_image_native_buffer); _EGL_CHECK_EXTENSION(ANDROID_native_fence_sync); @@ -2522,6 +2523,35 @@ eglQueryDmaBufModifiersEXT(EGLDisplay dpy, EGLint format, EGLint max_modifiers, RETURN_EGL_EVAL(disp, ret); } +static void EGLAPIENTRY +eglSetBlobCacheFuncsANDROID(EGLDisplay *dpy, EGLSetBlobFuncANDROID set, +EGLGetBlobFuncANDROID get) +{ + _EGLDisplay *disp = _eglLockDisplay(dpy); + _EGLDriver *drv = _eglCheckDisplay(disp, __func__); + This is the only EGL API which has no return type. Hence we cannot use the _EGL_FUNC_START/_EGL_CHECK_DISPLAY macros ;-( We'd want the _eglSetFuncName() call (from the former macro) though. I'd add a very small comment + [sort of] inline the macros. True, I forgot _eglSetFuncName which can be useful for EGL_KHR_debug when _eglError cases are hit. I will add this.
Re: [Mesa-dev] [PATCH] mesa: fix broken glGet*(GL_POLYGON_MODE) query
Reviewed-by: Marek Olšák

Marek

On Wed, Jan 31, 2018 at 3:35 AM, Brian Paul wrote:
> This reverts part of the patch which introduced the GLenum16 change.
> Fixes a conform regression found by Roland.
>
> Fixes: f96a69f916aed405 ("mesa: replace GLenum with GLenum16 in common structures (v4)")
> ---
>  src/mesa/main/get_hash_params.py | 2 +-
>  src/mesa/main/mtypes.h           | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
> index 7cd195c..b61b16b 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -743,7 +743,7 @@ descriptor=[
>    [ "PIXEL_MAP_R_TO_R_SIZE", "CONTEXT_INT(PixelMaps.RtoR.Size), NO_EXTRA" ],
>    [ "PIXEL_MAP_S_TO_S_SIZE", "CONTEXT_INT(PixelMaps.StoS.Size), NO_EXTRA" ],
>    [ "POINT_SIZE_GRANULARITY", "CONTEXT_FLOAT(Const.PointSizeGranularity), NO_EXTRA" ],
> -  [ "POLYGON_MODE", "CONTEXT_ENUM16(Polygon.FrontMode), NO_EXTRA" ],
> +  [ "POLYGON_MODE", "CONTEXT_ENUM2(Polygon.FrontMode), NO_EXTRA" ],
>    [ "POLYGON_OFFSET_BIAS_EXT", "CONTEXT_FLOAT(Polygon.OffsetUnits), NO_EXTRA" ],
>    [ "POLYGON_OFFSET_POINT", "CONTEXT_BOOL(Polygon.OffsetPoint), NO_EXTRA" ],
>    [ "POLYGON_OFFSET_LINE", "CONTEXT_BOOL(Polygon.OffsetLine), NO_EXTRA" ],
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 35fafa5..f6fa6f4 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -843,8 +843,8 @@ struct gl_point_attrib
>  struct gl_polygon_attrib
>  {
>     GLenum16 FrontFace;     /**< Either GL_CW or GL_CCW */
> -   GLenum16 FrontMode;     /**< Either GL_POINT, GL_LINE or GL_FILL */
> -   GLenum16 BackMode;      /**< Either GL_POINT, GL_LINE or GL_FILL */
> +   GLenum FrontMode;       /**< Either GL_POINT, GL_LINE or GL_FILL */
> +   GLenum BackMode;        /**< Either GL_POINT, GL_LINE or GL_FILL */
>     GLboolean CullFlag;     /**< Culling on/off flag */
>     GLboolean SmoothFlag;   /**< True if GL_POLYGON_SMOOTH is enabled */
>     GLboolean StippleFlag;  /**< True if GL_POLYGON_STIPPLE is enabled */
> --
> 2.7.4
Re: [Mesa-dev] [PATCH v2 5/7] disk cache: initialize cache path and index only when used
On 31 January 2018 at 07:17, Tapani Pälli wrote:
> This patch makes disk_cache initialize path and index lazily so
> that we can utilize disk_cache without a path using callback
> functionality introduced by next patch.
>
> v2: unmap mmap and destroy queue only if index_mmap exists
>
> Signed-off-by: Tapani Pälli
> ---
>  src/util/disk_cache.c | 127 +++---
>  1 file changed, 78 insertions(+), 49 deletions(-)
>

I'd keep the refactor (disk_cache_create -> disk_cache_path_init +
disk_cache_create) and lazy indexing separate patches.
As-is, tracking all the error paths is quite fiddly.

> @@ -999,6 +1015,11 @@ disk_cache_put(struct disk_cache *cache, const cache_key key,
>     struct disk_cache_put_job *dc_job =
>        create_put_job(cache, key, data, size, cache_item_metadata);
>
> +   /* Initialize path if not initialized yet. */
> +   if (cache->path_init_failed ||
> +       (!cache->path && !disk_cache_path_init(cache)))
> +      return;
> +
>     if (dc_job) {
>        util_queue_fence_init(&dc_job->fence);
>        util_queue_add_job(&cache->cache_queue, dc_job, &dc_job->fence,
> @@ -1173,6 +1194,9 @@ disk_cache_put_key(struct disk_cache *cache, const cache_key key)
>     int i = CPU_TO_LE32(*key_chunk) & CACHE_INDEX_KEY_MASK;
>     unsigned char *entry;
>
> +   if (!cache->path)
> +      return;
> +

Any reason why the blurb in disk_cache_put() is missing here?

From cache_test.c POV disk_cache_put_key relied on disk_cache_has_key
being called first, although I'm not sure if that's the most robust
approach.

-Emil
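The lazy-initialisation guard quoted in this review can be sketched on its own as a shared helper, so that every entry point goes through the same check. This is a minimal sketch under assumptions: `lazy_cache`, `lazy_path_init`, and `lazy_ensure_path` are hypothetical names, `simulate_failure` and `init_attempts` exist only to make the behaviour observable, and it is assumed here that a failed attempt sets the sticky `path_init_failed` flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache object with just the lazy-init state. */
struct lazy_cache {
   const char *path;        /* NULL until the path has been set up */
   bool path_init_failed;   /* sticky: set once an attempt has failed */
   bool simulate_failure;   /* test knob for this sketch only */
   int init_attempts;       /* how many times init actually ran */
};

/* Hypothetical stand-in for a path initialiser. */
static bool
lazy_path_init(struct lazy_cache *cache)
{
   cache->init_attempts++;

   if (cache->simulate_failure) {
      cache->path_init_failed = true;
      return false;
   }

   cache->path = "/tmp/lazy-cache";   /* illustrative value */
   return true;
}

/* The guard from the quoted hunk as one helper: returns true when the
 * path is usable, initialising it on first use and never retrying
 * after a failure.
 */
static bool
lazy_ensure_path(struct lazy_cache *cache)
{
   assert(cache != NULL);

   if (cache->path_init_failed ||
       (!cache->path && !lazy_path_init(cache)))
      return false;

   return true;
}
```

With a helper like this, a put-key style entry point could call the same guard as put, instead of relying on some other function having been called first.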
Re: [Mesa-dev] [PATCH 1/4] mesa: don't flag _NEW_COLOR for KHR adv.blend if prog constant doesn't change
On Wed, Jan 31, 2018 at 4:34 AM, Ian Romanick wrote: > On 01/30/2018 07:48 AM, Marek Olšák wrote: >> From: Marek Olšák >> >> This only affects drivers that set DriverFlags.NewBlend. >> --- >> src/mesa/main/blend.c | 6 -- >> src/mesa/main/blend.h | 41 >> +++ >> src/mesa/main/enable.c| 14 + >> src/mesa/program/prog_statevars.c | 3 ++- >> 4 files changed, 49 insertions(+), 15 deletions(-) >> >> diff --git a/src/mesa/main/blend.c b/src/mesa/main/blend.c >> index 6b379f2..ec8e27e 100644 >> --- a/src/mesa/main/blend.c >> +++ b/src/mesa/main/blend.c >> @@ -528,21 +528,22 @@ _mesa_BlendEquation( GLenum mode ) >> >> if (!changed) >>return; >> >> >> if (!legal_simple_blend_equation(ctx, mode) && !advanced_mode) { >>_mesa_error(ctx, GL_INVALID_ENUM, "glBlendEquation"); >>return; >> } >> >> - _mesa_flush_vertices_for_blend_state(ctx); >> + _mesa_flush_vertices_for_blend_adv(ctx, ctx->Color.BlendEnabled, >> + advanced_mode); >> >> for (buf = 0; buf < numBuffers; buf++) { >>ctx->Color.Blend[buf].EquationRGB = mode; >>ctx->Color.Blend[buf].EquationA = mode; >> } >> ctx->Color._BlendEquationPerBuffer = GL_FALSE; >> ctx->Color._AdvancedBlendMode = advanced_mode; >> >> if (ctx->Driver.BlendEquationSeparate) >>ctx->Driver.BlendEquationSeparate(ctx, mode, mode); >> @@ -553,21 +554,22 @@ _mesa_BlendEquation( GLenum mode ) >> * Set blend equation for one color buffer/target. 
>> */ >> static void >> blend_equationi(struct gl_context *ctx, GLuint buf, GLenum mode, >> enum gl_advanced_blend_mode advanced_mode) >> { >> if (ctx->Color.Blend[buf].EquationRGB == mode && >> ctx->Color.Blend[buf].EquationA == mode) >>return; /* no change */ >> >> - _mesa_flush_vertices_for_blend_state(ctx); >> + _mesa_flush_vertices_for_blend_adv(ctx, ctx->Color.BlendEnabled, >> + advanced_mode); >> ctx->Color.Blend[buf].EquationRGB = mode; >> ctx->Color.Blend[buf].EquationA = mode; >> ctx->Color._BlendEquationPerBuffer = GL_TRUE; >> >> if (buf == 0) >>ctx->Color._AdvancedBlendMode = advanced_mode; >> } >> >> >> void GLAPIENTRY >> diff --git a/src/mesa/main/blend.h b/src/mesa/main/blend.h >> index 2454e0c..cba5a98 100644 >> --- a/src/mesa/main/blend.h >> +++ b/src/mesa/main/blend.h >> @@ -147,28 +147,53 @@ extern void >> _mesa_update_clamp_vertex_color(struct gl_context *ctx, >> const struct gl_framebuffer *drawFb); >> >> extern mesa_format >> _mesa_get_render_format(const struct gl_context *ctx, mesa_format format); >> >> extern void >> _mesa_init_color( struct gl_context * ctx ); >> >> >> +static inline unsigned >> +_mesa_get_advanded_blend_sh_constant(GLbitfield blend_enabled, >> + enum gl_advanced_blend_mode mode) >> +{ >> + return blend_enabled ? mode : 0; > > Should this be BLEND_NONE with a return type enum gl_advanced_blend_mode? > >> +} >> + >> +static inline bool >> +_mesa_advanded_blend_sh_constant_changed(struct gl_context *ctx, >> + GLbitfield new_blend_enabled, >> + enum gl_advanced_blend_mode >> new_mode) >> +{ >> + return _mesa_get_advanded_blend_sh_constant(new_blend_enabled, new_mode) >> != >> + _mesa_get_advanded_blend_sh_constant(ctx->Color.BlendEnabled, >> + >> ctx->Color._AdvancedBlendMode); >> +} >> + >> static inline void >> _mesa_flush_vertices_for_blend_state(struct gl_context *ctx) >> { >> - /* The advanced blend mode needs _NEW_COLOR to update the state constant, >> -* so we have to set it. This is inefficient. 
>> -* This should only be done for states that affect the state constant. >> -* It shouldn't be done for other blend states. >> -*/ >> - if (_mesa_has_KHR_blend_equation_advanced(ctx) || >> - !ctx->DriverFlags.NewBlend) { >> + if (!ctx->DriverFlags.NewBlend) { >>FLUSH_VERTICES(ctx, _NEW_COLOR); >> } else { >>FLUSH_VERTICES(ctx, 0); >> + ctx->NewDriverState |= ctx->DriverFlags.NewBlend; >> + } >> +} >> + >> +static inline void >> +_mesa_flush_vertices_for_blend_adv(struct gl_context *ctx, >> + GLbitfield new_blend_enabled, >> + enum gl_advanced_blend_mode new_mode) >> +{ >> + /* The advanced blend mode needs _NEW_COLOR to update the state >> constant. */ >> + if (_mesa_has_KHR_blend_equation_advanced(ctx) && >> + _mesa_advanded_blend_sh_constant_changed(ctx, new_blend_enabled, >> +new_mode)) { >> + FLUSH_VERTICES(ctx, _NEW_COLOR); >> } >> - ctx->NewDriverState |= ctx->DriverFlags.NewBlend; >
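The change-detection idea in the quoted helpers can be sketched in isolation as follows, here with the enum return type raised in review. The names and enum values are hypothetical (`toy_blend_mode` is not Mesa's `gl_advanced_blend_mode`); the only logic assumed is what the quoted patch shows: the constant is the mode while blending is enabled and "none" otherwise, and a flush is needed only when that derived value changes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of an advanced blend mode enum; values are
 * illustrative only.
 */
enum toy_blend_mode {
   TOY_BLEND_NONE = 0,
   TOY_BLEND_MULTIPLY,
   TOY_BLEND_SCREEN,
};

/* Shader-visible constant: the mode only matters while blending is on. */
static enum toy_blend_mode
toy_blend_constant(unsigned blend_enabled, enum toy_blend_mode mode)
{
   return blend_enabled ? mode : TOY_BLEND_NONE;
}

/* True only when the derived constant differs between old and new state. */
static bool
toy_blend_constant_changed(unsigned old_enabled, enum toy_blend_mode old_mode,
                           unsigned new_enabled, enum toy_blend_mode new_mode)
{
   assert(old_mode <= TOY_BLEND_SCREEN && new_mode <= TOY_BLEND_SCREEN);

   return toy_blend_constant(new_enabled, new_mode) !=
          toy_blend_constant(old_enabled, old_mode);
}
```

Changing the mode while blending is disabled leaves the constant at "none", so no state flag would need to be raised in that case.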
Re: [Mesa-dev] [PATCH v2 5/7] disk cache: initialize cache path and index only when used
On 31.01.2018 15:18, Emil Velikov wrote:
> On 31 January 2018 at 07:17, Tapani Pälli wrote:
>> This patch makes disk_cache initialize path and index lazily so
>> that we can utilize disk_cache without a path using callback
>> functionality introduced by next patch.
>>
>> v2: unmap mmap and destroy queue only if index_mmap exists
>>
>> Signed-off-by: Tapani Pälli
>> ---
>>  src/util/disk_cache.c | 127 +++---
>>  1 file changed, 78 insertions(+), 49 deletions(-)
>>
> I'd keep the refactor (disk_cache_create -> disk_cache_path_init +
> disk_cache_create) and lazy indexing separate patches.
> As-is, tracking all the error paths is quite fiddly.
>
>> @@ -999,6 +1015,11 @@ disk_cache_put(struct disk_cache *cache, const cache_key key,
>>     struct disk_cache_put_job *dc_job =
>>        create_put_job(cache, key, data, size, cache_item_metadata);
>>
>> +   /* Initialize path if not initialized yet. */
>> +   if (cache->path_init_failed ||
>> +       (!cache->path && !disk_cache_path_init(cache)))
>> +      return;
>> +
>>     if (dc_job) {
>>        util_queue_fence_init(&dc_job->fence);
>>        util_queue_add_job(&cache->cache_queue, dc_job, &dc_job->fence,
>> @@ -1173,6 +1194,9 @@ disk_cache_put_key(struct disk_cache *cache, const cache_key key)
>>     int i = CPU_TO_LE32(*key_chunk) & CACHE_INDEX_KEY_MASK;
>>     unsigned char *entry;
>>
>> +   if (!cache->path)
>> +      return;
>> +
>
> Any reason why the blurb in disk_cache_put() is missing here?

The cache is created in disk_cache_has_key because the compiler calls
that before disk_cache_put_key.

> From cache_test.c POV disk_cache_put_key relied on disk_cache_has_key
> being called first, although I'm not sure if that's the most robust
> approach.

The unit test calls disk_cache_put directly, which also tries to create
the cache. I'm OK with trying to create the cache here as well, but this
should not happen with either apps or unit tests.

// Tapani
Re: [Mesa-dev] [PATCH 4/7] glsl/tests: changes to test_disk_cache_create test
On 31 January 2018 at 07:17, Tapani Pälli wrote:
> -   /* Before doing anything else, ensure that with
> -    * MESA_GLSL_CACHE_DISABLE set to true, that disk_cache_create returns NULL.
> -    */
> -   setenv("MESA_GLSL_CACHE_DISABLE", "true", 1);
> -   cache = disk_cache_create("test", "make_check", 0);
> -   expect_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DISABLE set");
>

We want to ensure that cache can be disabled. If needed, can we tweak
i965/other places instead?

> -   /* Test with XDG_CACHE_HOME set */
> -   setenv("XDG_CACHE_HOME", CACHE_TEST_TMP "/xdg-cache-home", 1);
> -   cache = disk_cache_create("test", "make_check", 0);
> -   expect_null(cache, "disk_cache_create with XDG_CACHE_HOME set with"
> -               "a non-existing parent directory");
> -
> -   /* Test with MESA_GLSL_CACHE_DIR set */
> -   err = rmrf_local(CACHE_TEST_TMP);
> -   expect_equal(err, 0, "Removing " CACHE_TEST_TMP);
> -
> -   mkdir(CACHE_TEST_TMP, 0755);
> -   cache = disk_cache_create("test", "make_check", 0);
> -   expect_non_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set");
> -

Why did these tests disappear? Might be having a dull moment, but I
cannot see why they won't work. Even with the lazy indexing.

-Emil
Re: [Mesa-dev] [PATCH 4/7] glsl/tests: changes to test_disk_cache_create test
On 31.01.2018 15:41, Emil Velikov wrote:
> On 31 January 2018 at 07:17, Tapani Pälli wrote:
>> -   /* Before doing anything else, ensure that with
>> -    * MESA_GLSL_CACHE_DISABLE set to true, that disk_cache_create returns NULL.
>> -    */
>> -   setenv("MESA_GLSL_CACHE_DISABLE", "true", 1);
>> -   cache = disk_cache_create("test", "make_check", 0);
>> -   expect_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DISABLE set");
>>
> We want to ensure that cache can be disabled. If needed, can we tweak
> i965/other places instead?

Oops yep I think this test can be left as is, cache will be NULL when MESA_GLSL_CACHE_DISABLE is set.

>> -   /* Test with XDG_CACHE_HOME set */
>> -   setenv("XDG_CACHE_HOME", CACHE_TEST_TMP "/xdg-cache-home", 1);
>> -   cache = disk_cache_create("test", "make_check", 0);
>> -   expect_null(cache, "disk_cache_create with XDG_CACHE_HOME set with"
>> -               "a non-existing parent directory");
>> -
>> -   /* Test with MESA_GLSL_CACHE_DIR set */
>> -   err = rmrf_local(CACHE_TEST_TMP);
>> -   expect_equal(err, 0, "Removing " CACHE_TEST_TMP);
>> -
>> -   mkdir(CACHE_TEST_TMP, 0755);
>> -   cache = disk_cache_create("test", "make_check", 0);
>> -   expect_non_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set");
>> -
> Why did these tests disappear? Might be having a dull moment, but I
> cannot see why they won't work. Even with the lazy indexing.

Because the cache struct will be there when you call disk_cache_create whatever the environment variables are, it's just that set/get won't then do anything. Well .. if you really want these, these could be modified to use set/get and then check that NULL is received?

// Tapani
Re: [Mesa-dev] [PATCH 4/7] glsl/tests: changes to test_disk_cache_create test
On 31.01.2018 15:51, Tapani Pälli wrote:
> On 31.01.2018 15:41, Emil Velikov wrote:
>> On 31 January 2018 at 07:17, Tapani Pälli wrote:
>>> -   /* Before doing anything else, ensure that with
>>> -    * MESA_GLSL_CACHE_DISABLE set to true, that disk_cache_create returns NULL.
>>> -    */
>>> -   setenv("MESA_GLSL_CACHE_DISABLE", "true", 1);
>>> -   cache = disk_cache_create("test", "make_check", 0);
>>> -   expect_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DISABLE set");
>>>
>> We want to ensure that cache can be disabled. If needed, can we tweak
>> i965/other places instead?
>
> Oops yep I think this test can be left as is, cache will be NULL when
> MESA_GLSL_CACHE_DISABLE is set.
>
>>> -   /* Test with XDG_CACHE_HOME set */
>>> -   setenv("XDG_CACHE_HOME", CACHE_TEST_TMP "/xdg-cache-home", 1);
>>> -   cache = disk_cache_create("test", "make_check", 0);
>>> -   expect_null(cache, "disk_cache_create with XDG_CACHE_HOME set with"
>>> -               "a non-existing parent directory");
>>> -
>>> -   /* Test with MESA_GLSL_CACHE_DIR set */
>>> -   err = rmrf_local(CACHE_TEST_TMP);
>>> -   expect_equal(err, 0, "Removing " CACHE_TEST_TMP);
>>> -
>>> -   mkdir(CACHE_TEST_TMP, 0755);
>>> -   cache = disk_cache_create("test", "make_check", 0);
>>> -   expect_non_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set");
>>> -
>> Why did these tests disappear? Might be having a dull moment, but I
>> cannot see why they won't work. Even with the lazy indexing.
>
> Because the cache struct will be there when you call disk_cache_create
> whatever the environment variables are, it's just that set/get won't then do
> anything. Well .. if you really want these, these could be modified to use
> set/get and then check that NULL is received?

Just to explain a bit more .. what I did is that I added positive tests (that those env vars can be used to control cache) but removed negative tests that cache would not get generated when path is something impossible.

// Tapani
Re: [Mesa-dev] [PATCH v2 5/7] disk cache: initialize cache path and index only when used
On 31 January 2018 at 13:39, Tapani Pälli wrote: > > > On 31.01.2018 15:18, Emil Velikov wrote: >> >> On 31 January 2018 at 07:17, Tapani Pälli wrote: >>> >>> This patch makes disk_cache initialize path and index lazily so >>> that we can utilize disk_cache without a path using callback >>> functionality introduced by next patch. >>> >>> v2: unmap mmap and destroy queue only if index_mmap exists >>> >>> Signed-off-by: Tapani Pälli >>> --- >>> src/util/disk_cache.c | 127 >>> +++--- >>> 1 file changed, 78 insertions(+), 49 deletions(-) >>> >> I'd keep the refactor (disk_cache_create -> disk_cache_path_init + >> disk_cache_create) and lazy indexing separate patches. >> As-is tracking all the error paths is a quite fiddly. >> >>> @@ -999,6 +1015,11 @@ disk_cache_put(struct disk_cache *cache, const >>> cache_key key, >>> struct disk_cache_put_job *dc_job = >>> create_put_job(cache, key, data, size, cache_item_metadata); >>> >>> + /* Initialize path if not initialized yet. */ >>> + if (cache->path_init_failed || >>> + (!cache->path && !disk_cache_path_init(cache))) >>> + return; >>> + >>> if (dc_job) { >>> util_queue_fence_init(&dc_job->fence); >>> util_queue_add_job(&cache->cache_queue, dc_job, &dc_job->fence, >>> @@ -1173,6 +1194,9 @@ disk_cache_put_key(struct disk_cache *cache, const >>> cache_key key) >>> int i = CPU_TO_LE32(*key_chunk) & CACHE_INDEX_KEY_MASK; >>> unsigned char *entry; >>> >>> + if (!cache->path) >>> + return; >>> + >> >> Any reason why the blurb in disk_cache_put() is missing here? > > > Reason why cache is created in disk_cache_has_key because that is called > before disk_cache_put_key by the compiler. > >> From cache_test.c POV disk_cache_put_key relied on disk_cache_has_key >> being called first, although I'm not sure if that's the most robust >> approach. >> > > Unit test calls disk_cache_put directly that also tries to create the cache. > I'm OK trying to create cache here also but this should not happen either > with apps or unit tests. 
>
Right, in that case I'd add an assert, so it flags up ASAP.

-Emil
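For readers following the thread, the lazy-initialization guard and the suggested assert can be sketched roughly as below. This is only an illustration under stated assumptions: `struct toy_cache`, the `toy_cache_*` functions and the `path_is_usable` knob are hypothetical stand-ins, not the real disk_cache code, and how the assert would sit next to the existing `if (!cache->path) return;` is a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for struct disk_cache. */
struct toy_cache {
   const char *path;        /* NULL until the path has been set up */
   bool path_init_failed;   /* sticky: do not retry after a failure */
   bool path_is_usable;     /* illustration knob: would init succeed? */
   unsigned puts;           /* number of entries accepted */
};

/* Stand-in for disk_cache_path_init(): returns true on success. */
static bool
toy_cache_path_init(struct toy_cache *cache)
{
   if (!cache->path_is_usable) {
      cache->path_init_failed = true;
      return false;
   }
   cache->path = "/tmp/toy-cache";
   return true;
}

/* Same shape as the guard quoted from disk_cache_put(): init on first use. */
static void
toy_cache_put(struct toy_cache *cache)
{
   if (cache->path_init_failed ||
       (!cache->path && !toy_cache_path_init(cache)))
      return;
   cache->puts++;
}

/* has_key also initializes lazily, since callers invoke it first. */
static bool
toy_cache_has_key(struct toy_cache *cache)
{
   if (cache->path_init_failed ||
       (!cache->path && !toy_cache_path_init(cache)))
      return false;
   return cache->puts > 0;
}

/* put_key relies on has_key having run first. The assert flags a violated
 * call order early in debug builds; a failed init still bails out quietly. */
static void
toy_cache_put_key(struct toy_cache *cache)
{
   if (cache->path_init_failed)
      return;
   assert(cache->path != NULL);
   cache->puts++;
}
```

With this shape, calling `toy_cache_put_key()` before `toy_cache_has_key()` trips the assert, while an unusable cache directory makes every operation a silent no-op.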
[Mesa-dev] [PATCH] radv: do not insert shaders in cache when it's disabled
When the application doesn't provide its own pipeline cache,
the driver uses an in-memory cache, but it shouldn't insert
any entries when the cache is explicitly disabled by the user.

Found while running my experimental pipeline-db tool with a ton
of shaders, the memory footprint was just huge, and sometimes
the process was even killed...

Signed-off-by: Samuel Pitoiset
---
 src/amd/vulkan/radv_pipeline_cache.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline_cache.c b/src/amd/vulkan/radv_pipeline_cache.c
index db48895817..7205a3d896 100644
--- a/src/amd/vulkan/radv_pipeline_cache.c
+++ b/src/amd/vulkan/radv_pipeline_cache.c
@@ -241,6 +241,17 @@ radv_pipeline_cache_add_entry(struct radv_pipeline_cache *cache,
 	radv_pipeline_cache_set_entry(cache, entry);
 }
 
+static bool
+radv_is_cache_disabled(struct radv_device *device)
+{
+	/* Pipeline caches can be disabled with RADV_DEBUG=nocache, with
+	 * MESA_GLSL_CACHE_DISABLE=1, and when VK_AMD_shader_info is requested.
+	 */
+	return (device->instance->debug_flags & RADV_DEBUG_NO_CACHE) ||
+	       !device->physical_device->disk_cache ||
+	       device->keep_shader_info;
+}
+
 bool
 radv_create_shader_variants_from_pipeline_cache(struct radv_device *device,
 					        struct radv_pipeline_cache *cache,
@@ -257,11 +268,10 @@ radv_create_shader_variants_from_pipeline_cache(struct radv_device *device,
 
 	entry = radv_pipeline_cache_search_unlocked(cache, sha1);
 	if (!entry) {
-		/* Again, don't cache when we want debug info, since this isn't
-		 * present in the cache. */
-		if (!device->physical_device->disk_cache ||
-		    (device->instance->debug_flags & RADV_DEBUG_NO_CACHE) ||
-		    device->keep_shader_info) {
+		/* Don't cache when we want debug info, since this isn't
+		 * present in the cache.
+		 */
+		if (radv_is_cache_disabled(device)) {
 			pthread_mutex_unlock(&cache->mutex);
 			return false;
 		}
@@ -362,6 +372,15 @@ radv_pipeline_cache_insert_shaders(struct radv_device *device,
 		pthread_mutex_unlock(&cache->mutex);
 		return;
 	}
+
+	/* Don't cache when we want debug info, since this isn't
+	 * present in the cache.
+	 */
+	if (radv_is_cache_disabled(device)) {
+		pthread_mutex_unlock(&cache->mutex);
+		return;
+	}
+
 	size_t size = sizeof(*entry);
 	for (int i = 0; i < MESA_SHADER_STAGES; ++i)
 		if (variants[i])
-- 
2.16.1
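The pattern in this patch (one predicate shared by the lookup and insert paths) can be sketched in isolation as follows. Everything here is an assumption for illustration: `struct toy_device`, `struct toy_pipeline_cache`, `TOY_DEBUG_NO_CACHE` and both functions are hypothetical stand-ins, not the radv structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_DEBUG_NO_CACHE (1u << 0)   /* stand-in for RADV_DEBUG_NO_CACHE */

/* Hypothetical flattened view of the three conditions the patch checks. */
struct toy_device {
   unsigned debug_flags;
   const void *disk_cache;   /* NULL when the disk cache is disabled */
   bool keep_shader_info;
};

struct toy_pipeline_cache {
   unsigned entries;
};

/* Single predicate, so every code path agrees on what "disabled" means. */
static bool
toy_is_cache_disabled(const struct toy_device *dev)
{
   return (dev->debug_flags & TOY_DEBUG_NO_CACHE) ||
          !dev->disk_cache ||
          dev->keep_shader_info;
}

/* Insert path: refuse to grow the in-memory cache when caching is off. */
static bool
toy_cache_insert(const struct toy_device *dev, struct toy_pipeline_cache *cache)
{
   assert(dev != NULL && cache != NULL);
   if (toy_is_cache_disabled(dev))
      return false;
   cache->entries++;
   return true;
}
```

Routing both paths through one helper avoids the situation described in the commit message, where lookups honoured the disable switches but inserts did not.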
Re: [Mesa-dev] [PATCH 4/7] glsl/tests: changes to test_disk_cache_create test
On 31 January 2018 at 13:55, Tapani Pälli wrote: > > > On 31.01.2018 15:51, Tapani Pälli wrote: >> >> >> >> On 31.01.2018 15:41, Emil Velikov wrote: >>> >>> On 31 January 2018 at 07:17, Tapani Pälli wrote: >>> - /* Before doing anything else, ensure that with -* MESA_GLSL_CACHE_DISABLE set to true, that disk_cache_create returns NULL. -*/ - setenv("MESA_GLSL_CACHE_DISABLE", "true", 1); - cache = disk_cache_create("test", "make_check", 0); - expect_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DISABLE set"); >>> We want to ensure that cache can be disabled. If needed, can we tweak >>> i965/other places instead? >> >> >> Oops yep I think this test can be left as is, cache will be NULL when >> MESA_GLSL_CACHE_DISABLE is set. >> >>> - /* Test with XDG_CACHE_HOME set */ - setenv("XDG_CACHE_HOME", CACHE_TEST_TMP "/xdg-cache-home", 1); - cache = disk_cache_create("test", "make_check", 0); - expect_null(cache, "disk_cache_create with XDG_CACHE_HOME set with" - "a non-existing parent directory"); - >>> >>> - /* Test with MESA_GLSL_CACHE_DIR set */ - err = rmrf_local(CACHE_TEST_TMP); - expect_equal(err, 0, "Removing " CACHE_TEST_TMP); - >>> >>> >>> - mkdir(CACHE_TEST_TMP, 0755); - cache = disk_cache_create("test", "make_check", 0); - expect_non_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set"); - >>> >>> >>> Why did these tests disappear? Might be having a dull moment, but I >>> cannot see why they won't work. Even with the lazy indexing. >>> >> >> Because the cache struct will be there when you call disk_cache_create >> whatever the environment variables are, it's just that set/get won't then do >> anything. Well .. if you really want these, these could be modified to use >> set/get and then check that NULL is received? >> > > Just to explain a bit more .. what I did is that I added positive tests > (that those env vars can be used to control cache) but removed negative > tests that cache would not get generated when path is something impossible. > Right. 
Since indexing is delayed [create -> get/set/has, with create returning non-null], we can tweak the negative tests in a similar way.
It might look strange, since we're detecting disk cache creation by trying to fetch/insert data. Yet, we will still be testing that Mesa isn't doing stupid things wrt folder management :-)

-Emil
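A reworked negative test along those lines might look roughly like the sketch below: creation always succeeds, and an unusable directory only shows up when data is stored and fetched. All names (`struct toy_cache`, `toy_cache_create`, `toy_cache_put`, `toy_cache_get`, the `dir_usable` flag) are hypothetical and only model the behaviour discussed in the thread, not the real disk_cache API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cache whose directory is only touched on first use. */
struct toy_cache {
   bool dir_usable;    /* models whether the configured path can be created */
   bool initialized;
   bool has_entry;
   char value[16];
};

/* Creation never fails, mirroring "create returning non-null". */
static struct toy_cache *
toy_cache_create(struct toy_cache *storage, bool dir_usable)
{
   assert(storage != NULL);
   memset(storage, 0, sizeof(*storage));
   storage->dir_usable = dir_usable;
   return storage;
}

static bool
toy_cache_lazy_init(struct toy_cache *cache)
{
   if (!cache->initialized && cache->dir_usable)
      cache->initialized = true;
   return cache->initialized;
}

/* Silently drops the data when the directory cannot be set up. */
static void
toy_cache_put(struct toy_cache *cache, const char *value)
{
   if (!toy_cache_lazy_init(cache))
      return;
   snprintf(cache->value, sizeof(cache->value), "%s", value);
   cache->has_entry = true;
}

/* Returns NULL when nothing was stored or the cache is unusable. */
static const char *
toy_cache_get(struct toy_cache *cache)
{
   if (!toy_cache_lazy_init(cache) || !cache->has_entry)
      return NULL;
   return cache->value;
}
```

A negative test would then create a cache with an impossible path, put a blob, and expect the subsequent get to return NULL.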
Re: [Mesa-dev] [PATCH 12/20] mesa: implement buffer/texture barriers for semaphore signal/wait v2
On 23/01/2018 18:05, Andres Rodriguez wrote: Make sure memory is accessible to the external client, for the specified memory object, before the signal/after the wait. v2: fixed flush order with respect to wait/signal emission Signed-off-by: Andres Rodriguez --- src/mesa/main/dd.h | 14 ++- src/mesa/main/externalobjects.c | 38 +++--- src/mesa/state_tracker/st_cb_semaphoreobjects.c | 53 +++-- 3 files changed, 95 insertions(+), 10 deletions(-) [...] diff --git a/src/mesa/main/externalobjects.c b/src/mesa/main/externalobjects.c index 4fb3ca07a9..c070d7a28d 100644 --- a/src/mesa/main/externalobjects.c +++ b/src/mesa/main/externalobjects.c @@ -23,6 +23,7 @@ #include "macros.h" #include "mtypes.h" +#include "bufferobj.h" #include "context.h" #include "externalobjects.h" #include "teximage.h" @@ -716,7 +717,8 @@ _mesa_WaitSemaphoreEXT(GLuint semaphore, { GET_CURRENT_CONTEXT(ctx); struct gl_semaphore_object *semObj; - + struct gl_buffer_object **bufObjs; + struct gl_texture_object **texObjs; if (!ctx->Extensions.EXT_semaphore) { _mesa_error(ctx, GL_INVALID_OPERATION, "glWaitSemaphoreEXT(unsupported)"); @@ -732,8 +734,20 @@ _mesa_WaitSemaphoreEXT(GLuint semaphore, FLUSH_VERTICES( ctx, 0 ); FLUSH_CURRENT( ctx, 0 ); - /* TODO: memory barriers and layout transitions */ - ctx->Driver.ServerWaitSemaphoreObject(ctx, semObj); + bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers); + for (unsigned i = 0; i < numBufferBarriers; i++) { + bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]); + } + + texObjs = alloca(sizeof(struct gl_texture_object **) * numTextureBarriers); + for (unsigned i = 0; i < numTextureBarriers; i++) { + texObjs[i] = _mesa_lookup_texture(ctx, textures[i]); + } + + ctx->Driver.ServerWaitSemaphoreObject(ctx, semObj, + numBufferBarriers, bufObjs, + numTextureBarriers, texObjs, + srcLayouts); } void GLAPIENTRY @@ -746,6 +760,8 @@ _mesa_SignalSemaphoreEXT(GLuint semaphore, { GET_CURRENT_CONTEXT(ctx); struct gl_semaphore_object *semObj; + struct 
gl_buffer_object **bufObjs; + struct gl_texture_object **texObjs; if (!ctx->Extensions.EXT_semaphore) { _mesa_error(ctx, GL_INVALID_OPERATION, "glSignalSemaphoreEXT(unsupported)"); @@ -761,8 +777,20 @@ _mesa_SignalSemaphoreEXT(GLuint semaphore, FLUSH_VERTICES( ctx, 0 ); FLUSH_CURRENT( ctx, 0 ); - /* TODO: memory barriers and layout transitions */ - ctx->Driver.ServerSignalSemaphoreObject(ctx, semObj); + bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers); + for (unsigned i = 0; i < numBufferBarriers; i++) { + bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]); + } + + texObjs = alloca(sizeof(struct gl_texture_object **) * numTextureBarriers); + for (unsigned i = 0; i < numTextureBarriers; i++) { + texObjs[i] = _mesa_lookup_texture(ctx, textures[i]); + } + + ctx->Driver.ServerSignalSemaphoreObject(ctx, semObj, + numBufferBarriers, bufObjs, + numTextureBarriers, texObjs, + dstLayouts); } [...] This adds a use of alloca(), without a corresponding #include. Patch attached. See also: https://lists.freedesktop.org/archives/mesa-dev/2017-December/180073.html https://lists.freedesktop.org/archives/mesa-dev/2016-July/122346.html From 46a2c9bbd03234120594d50b48cbad73a355d240 Mon Sep 17 00:00:00 2001 From: Jon Turney Date: Wed, 31 Jan 2018 12:46:22 + Subject: [PATCH] Fix use of alloca() without #include MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fix use of alloca() without #include in 29b9bd05 ../src/mesa/main/externalobjects.c:737:14: error: implicit declaration of function ‘alloca’ [-Werror=implicit-function-declaration] Signed-off-by: Jon Turney --- src/mesa/main/externalobjects.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/main/externalobjects.c b/src/mesa/main/externalobjects.c index 463debd268..4648932a9b 100644 --- a/src/mesa/main/externalobjects.c +++ b/src/mesa/main/externalobjects.c @@ -21,6 +21,7 @@ * DEALINGS IN THE SOFTWARE. 
  */
 
+#include "c99_alloca.h"
 #include "macros.h"
 #include "mtypes.h"
 #include "bufferobj.h"
-- 
2.16.1
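As a side note on the lookup loops in the quoted patch, the stack-array pattern can be sketched as below. This is a hedged illustration only: it includes <alloca.h> directly, which assumes a glibc-style platform (Mesa goes through its c99_alloca.h wrapper precisely because that header is not universal), and `struct toy_object`, `toy_lookup` and `toy_count_resolved` are hypothetical names. It also uses `sizeof(*objs)` for the element size; the patch spells the size as `sizeof(struct gl_buffer_object **)`, which is the same number of bytes on common platforms, but the element-based spelling avoids having to repeat the type.

```c
#include <alloca.h>   /* assumption: glibc-style platform */
#include <assert.h>
#include <stddef.h>

/* Hypothetical object type standing in for buffer/texture objects. */
struct toy_object {
   unsigned name;
};

static struct toy_object toy_table[] = { {1}, {2}, {3} };

/* Stand-in for a name lookup; returns NULL for unknown names. */
static struct toy_object *
toy_lookup(unsigned name)
{
   for (size_t i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++) {
      if (toy_table[i].name == name)
         return &toy_table[i];
   }
   return NULL;
}

/* Resolves names into a stack-allocated pointer array and counts the hits.
 * The array lives only until this function returns. */
static unsigned
toy_count_resolved(const unsigned *names, unsigned count)
{
   assert(names != NULL || count == 0);
   if (count == 0)
      return 0;

   struct toy_object **objs = alloca(sizeof(*objs) * count);
   unsigned found = 0;

   for (unsigned i = 0; i < count; i++)
      objs[i] = toy_lookup(names[i]);
   for (unsigned i = 0; i < count; i++) {
      if (objs[i])
         found++;
   }
   return found;
}
```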
Re: [Mesa-dev] [PATCH 4/6] meson: osx doesn't have librt, so don't require it
On 30 January 2018 at 20:27, Dylan Baker wrote:
> Quoting Emil Velikov (2018-01-30 10:56:42)
>> Hi Jon,
>>
>> On 28 January 2018 at 14:24, Jon Turney wrote:
>> > ---
>> >  meson.build | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/meson.build b/meson.build
>> > index 7e194a9f10d..8fdbaa8b8d8 100644
>> > --- a/meson.build
>> > +++ b/meson.build
>> > @@ -935,7 +935,7 @@ elif with_dri_i965 and get_option('shader-cache')
>> >  endif
>> >
>> >  # Determine whether or not the rt library is needed for time functions
>> > -if cc.has_function('clock_gettime')
>> > +if cc.has_function('clock_gettime') or (host_machine.system() == 'darwin')
>>
>> Absolutely no objections against the patch - just a small question.
>> If the meson/autotools check fails, does this mean that the resulting
>> binaries are having unresolved reference wrt said symbol?
>>
>> Thanks
>> Emil
>
> Not for meson, it builds -Wl,--no-undefined (or it's MSVC equivalent) by
> default, so it shouldn't be possible to get unresolved symbols in a binary or
> shared library.
>
Right, the question is: why does the test (has_function) fail?

A few possible solutions come to mind, but only someone with the toolchain handy can confirm.
 - the test is broken (regardless of the meson/autotools/foo implementation)
 - the symbol is indirectly resolved
   The test does not pull libfoo, while the final binary does. Libfoo
   pulls libbar with the latter providing the symbol.
 - the final binary is having unresolved reference
 - all the clock_gettime references get 'magically' removed due to the
   linker garbage collector

From a quick search OSX 10.12 (or so) has clock_gettime. But details are extremely sparse :-(

-Emil
[Mesa-dev] [Bug 104777] Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error
https://bugs.freedesktop.org/show_bug.cgi?id=104777

Anton Sudak changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |anton.su...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] misc pahole repacking
On 01/31/2018 01:48 AM, Tapani Pälli wrote: Reviewed-by: Tapani Pälli (I verified the 1st one and I trust you on the 2nd one.) BTW I witnessed pahole crashing when processing visit() methods of ir_print_visitor class, did you experience that? My pahole version is 1.9, it dies in /lib64/libdwarves.so.1 after some prints like: --- 8< --- die__process_unit: DW_TAG_restrict_type (0x37) @ <0x122be84> not handled! die__process_unit: DW_TAG_unspecified_type (0x3b) @ <0x1230c68> not handled! die__process_unit: DW_TAG_restrict_type (0x37) @ <0x12312d3> not handled! die__process_unit: DW_TAG_unspecified_type (0x3b) @ <0x1231bc4> not handled! die__process_unit: DW_TAG_restrict_type (0x37) @ <0x12340bb> not handled! die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x123f984> not handled! die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x1242348> not handled! die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x1242398> not handled! die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x12423e8> not handled! die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x1242572> not handled! die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x12425c2> not handled! die__process_unit: DW_TAG_rvalue_reference_type (0x42) @ <0x12425ea> not handled! die__process_function: DW_TAG_rvalue_reference_type (0x42) @ <0x1247929> not handled! Me too. 
I have v1.9 as well and it crashes pretty quickly, for example: $ pahole build-llvmpipe/lib/gallium/libGL.so.1.5.0 struct xm_driver { struct pipe_screen * (*create_pipe_screen)(Display *); /* 0 8 */ struct st_api *(*create_st_api)(void); /* 8 8 */ /* size: 16, cachelines: 1, members: 2 */ /* last cacheline: 16 bytes */ }; struct sw_winsys { void (*destroy)(struct sw_winsys *); /* 0 8 */ boolean(*is_displaytarget_format_supported)(struct sw_winsys *, unsigned int, enum pipe_format); /* 8 8 */ struct sw_displaytarget * (*displaytarget_create)(struct sw_winsys *, unsigned int, enum pipe_format, unsigned int, unsigned int, unsigned int, const void *, unsigned int *); /*16 8 */ struct sw_displaytarget * (*displaytarget_from_handle)(struct sw_winsys *, const struct pipe_resource *, struct winsys_handle *, unsigned int *); /*24 8 */ boolean(*displaytarget_get_handle)(struct sw_winsys *, struct sw_displaytarget *, struct winsys_handle *); /*32 8 */ void * (*displaytarget_map)(struct sw_winsys *, struct sw_displaytarget *, unsigned int); /*40 8 */ void (*displaytarget_unmap)(struct sw_winsys *, struct sw_displaytarget *); /*48 8 */ void (*displaytarget_display)(struct sw_winsys *, struct sw_displaytarget *, void *, struct pipe_box *); /*56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ void (*displaytarget_destroy)(struct sw_winsys *, struct sw_displaytarget *); /*64 8 */ /* size: 72, cachelines: 2, members: 9 */ /* last cacheline: 8 bytes */ }; die__process_unit: DW_TAG_restrict_type (0x37) @ <0x3b09> not handled! -Brian On 31.01.2018 01:41, Dave Airlie wrote: This month's Dave hasn't got enough sleep to do real work, lets repack some structs. The format descriptions one is quite good though it reduces the radv binary data segment. Dave. 
Re: [Mesa-dev] [PATCH 4/6] meson: osx doesn't have librt, so don't require it
On 31/01/2018 15:21, Emil Velikov wrote: On 30 January 2018 at 20:27, Dylan Baker wrote: Quoting Emil Velikov (2018-01-30 10:56:42) Hi Jon, On 28 January 2018 at 14:24, Jon Turney wrote: --- meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/meson.build b/meson.build index 7e194a9f10d..8fdbaa8b8d8 100644 --- a/meson.build +++ b/meson.build @@ -935,7 +935,7 @@ elif with_dri_i965 and get_option('shader-cache') endif # Determine whether or not the rt library is needed for time functions -if cc.has_function('clock_gettime') +if cc.has_function('clock_gettime') or (host_machine.system() == 'darwin') Absolutely no objections against the patch - just a small question. If the meson/autotools check fails, does this mean that the resulting binaries are having unresolved reference wrt said symbol? Thanks Emil Not for meson, it builds -Wl,--no-undefined (or it's MSVC equivalent) by default, so it shouldn't be possible to get unresolved symbols in a binary or shared library. Right, the question is why does the test (has_function) fails? Some misunderstanding going on here, I think. It fails if the function isn't present. (We then go on to report the failure as being librt not found, not that we couldn't work out how to find clock_gettime, but perhaps that's a separate issue) A few possible solutions come to mind, but only one with toolchain handy can confirm. - the test is broken (regardless meson/autotools/foo implementation) - the symbol is indirectly resolved The est does not pull libfoo, while the final binary does. Libfoo pulls libbar with the latter providing the symbol. - the final binary is having unresolved reference - all the clock_gettime references get 'magically' removed due to the linker garbage collector From a quick search OSX 10.12 (or so) has clock_gettime. But details are extremely sparse :-( This change (and the check that autoconf is currently doing) is not needed. If we're targeting OSX 10.12 or later, clock_gettime() exists. 
Commit 990bd49f might have made sense once, but I think now that clock_gettime is used in include/c11/threads_posix.h, we're going to fail to build for OSX prior to 10.12, even when not trying to link with librt.

(I also had a patch to provide an implementation of clock_gettime if the target is an earlier OSX version, but I dropped it as probably not very useful)
Re: [Mesa-dev] [PATCH 4/6] meson: osx doesn't have librt, so don't require it
On 31 January 2018 at 16:07, Jon Turney wrote: > On 31/01/2018 15:21, Emil Velikov wrote: >> >> On 30 January 2018 at 20:27, Dylan Baker wrote: >>> >>> Quoting Emil Velikov (2018-01-30 10:56:42) Hi Jon, On 28 January 2018 at 14:24, Jon Turney wrote: > > --- > meson.build | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/meson.build b/meson.build > index 7e194a9f10d..8fdbaa8b8d8 100644 > --- a/meson.build > +++ b/meson.build > @@ -935,7 +935,7 @@ elif with_dri_i965 and get_option('shader-cache') > endif > > # Determine whether or not the rt library is needed for time > functions > -if cc.has_function('clock_gettime') > +if cc.has_function('clock_gettime') or (host_machine.system() == > 'darwin') Absolutely no objections against the patch - just a small question. If the meson/autotools check fails, does this mean that the resulting binaries are having unresolved reference wrt said symbol? Thanks Emil >>> >>> >>> Not for meson, it builds -Wl,--no-undefined (or it's MSVC equivalent) by >>> default, so it shouldn't be possible to get unresolved symbols in a >>> binary or >>> shared library. >>> >> Right, the question is why does the test (has_function) fails? > > > Some misunderstanding going on here, I think. It fails if the function isn't > present. > > (We then go on to report the failure as being librt not found, not that we > couldn't work out how to find clock_gettime, but perhaps that's a separate > issue) > >> A few possible solutions come to mind, but only one with toolchain >> handy can confirm. >> - the test is broken (regardless meson/autotools/foo implementation) >> - the symbol is indirectly resolved >> The est does not pull libfoo, while the final binary does. Libfoo >> pulls libbar with the latter providing the symbol. >> - the final binary is having unresolved reference >> - all the clock_gettime references get 'magically' removed due to the >> linker garbage collector >> >> From a quick search OSX 10.12 (or so) has clock_gettime. 
>> But details are extremely sparse :-(
>
> This change (and the check that autoconf is currently doing) is not needed.
>
> If we're targeting OSX 10.12 or later, clock_gettime() exists.
>
> Commit 990bd49f might have made sense once, but I think now that
> clock_gettime is used in include/c11/threads_posix.h, we're going to fail to
> build for OSX prior to 10.12, even when not trying to link with librt.
>
> (I also had a patch to provide an implementation of clock_gettime if the
> target is an earlier OSX version, but I dropped it as probably not very
> useful)

Right, so in this case:
 - targeting pre 10.12 where API is missing (be that librt or elsewhere)
 - 990bd49f and this patch are off, remove the linkage and rely on the linker GC magic

I think one wants to distinguish between presence and linkage requirements. ATM they're assumed to be the same thing.

-Emil
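For context on presence versus linkage: a configure-time function check roughly amounts to compiling and linking a tiny program that references the symbol. If something like the sketch below links with no extra libraries, the function is present and librt is not needed for it; if it only links with -lrt, it is present but has a linkage requirement. The helper name `toy_read_monotonic` is hypothetical, and the feature-test macro is an assumption about what a strict compiler mode would need.

```c
/* Request the POSIX.1-2001 API before any header is included. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <stddef.h>
#include <time.h>

/* References clock_gettime so the linker has to resolve it.
 * Returns 0 on success, -1 on failure (as clock_gettime does). */
static int
toy_read_monotonic(struct timespec *out)
{
   assert(out != NULL);
   return clock_gettime(CLOCK_MONOTONIC, out);
}
```

On a platform where this fails to compile or link at all, the API is missing outright, which is the pre-10.12 case described above.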
[Mesa-dev] [Bug 104836] Missing library link breaks mesa on Debian/ia64
https://bugs.freedesktop.org/show_bug.cgi?id=104836

--- Comment #3 from Emil Velikov ---
Having played with this a bit - it's a massive yak exercise.
It's not a new bug by any means, so I'll lower it, slightly, on my priority list.

Will CC you on the series when it's finished.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
Re: [Mesa-dev] [PATCH 3/8] mesa: Put materials at the end of the generic block.
On 01/31/2018 12:55 AM, mathias.froehl...@gmx.net wrote: From: Mathias Fröhlich The materials are now moved to the end of the generic attributes block to the range 4-15. Before, the way the position and generic 0 attribute is handled was dependent on the presence and kind of the currently attached vertex program. With this change the way the position attribute and the generic 0 attribute is treated only depends on the enabled flag of those two arrays. This will later help to untangle the update dependencies between enabled arrays and shader inputs. Signed-off-by: Mathias Fröhlich --- src/compiler/shader_enums.h | 7 ++- src/mesa/tnl/t_context.h | 4 ++-- src/mesa/vbo/vbo_exec_array.c | 14 +++--- src/mesa/vbo/vbo_exec_draw.c | 10 +- src/mesa/vbo/vbo_save_draw.c | 8 5 files changed, 24 insertions(+), 19 deletions(-) diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h index aa296adb5a..47a017eebc 100644 --- a/src/compiler/shader_enums.h +++ b/src/compiler/shader_enums.h @@ -127,6 +127,8 @@ const char *gl_vert_attrib_name(gl_vert_attrib attrib); * VERT_ATTRIB_MAT * include the generic shader attributes used to alias * varying material values for the TNL shader programs. + * They are located at the end of the generic attribute + * block not to overlap with the generic 0 attribute. */ #define VERT_ATTRIB_FF(i) (VERT_ATTRIB_POS + (i)) #define VERT_ATTRIB_FF_MAX VERT_ATTRIB_GENERIC0 @@ -137,7 +139,10 @@ const char *gl_vert_attrib_name(gl_vert_attrib attrib); #define VERT_ATTRIB_GENERIC(i) (VERT_ATTRIB_GENERIC0 + (i)) #define VERT_ATTRIB_GENERIC_MAX MAX_VERTEX_GENERIC_ATTRIBS -#define VERT_ATTRIB_MAT(i) VERT_ATTRIB_GENERIC(i) +#define VERT_ATTRIB_MAT_OFFSET \ + (VERT_ATTRIB_GENERIC_MAX - VERT_ATTRIB_MAT_MAX) Instead of VERT_ATTRIB_MAT_OFFSET, maybe VERT_ATTRIB_MAT0? This value is the position of the first material attribute so MAT0 seems more direct and mirrors VERT_ATTRIB_GENERIC0. 
+#define VERT_ATTRIB_MAT(i) \ + VERT_ATTRIB_GENERIC((i) + VERT_ATTRIB_MAT_OFFSET) #define VERT_ATTRIB_MAT_MAX MAT_ATTRIB_MAX /** diff --git a/src/mesa/tnl/t_context.h b/src/mesa/tnl/t_context.h index 48d7ced791..082110c607 100644 --- a/src/mesa/tnl/t_context.h +++ b/src/mesa/tnl/t_context.h @@ -158,8 +158,8 @@ enum { #define _TNL_FIRST_GENERIC _TNL_ATTRIB_GENERIC0 #define _TNL_LAST_GENERIC _TNL_ATTRIB_GENERIC15 -#define _TNL_FIRST_MAT _TNL_ATTRIB_MAT_FRONT_AMBIENT /* GENERIC0 */ -#define _TNL_LAST_MAT_TNL_ATTRIB_MAT_BACK_INDEXES /* GENERIC11 */ +#define _TNL_FIRST_MAT _TNL_ATTRIB_MAT_FRONT_AMBIENT /* GENERIC4 */ +#define _TNL_LAST_MAT_TNL_ATTRIB_MAT_BACK_INDEXES /* GENERIC15 */ /* Number of available texture attributes */ #define _TNL_NUM_TEX 8 diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c index 412b6b669c..28f64e1422 100644 --- a/src/mesa/vbo/vbo_exec_array.c +++ b/src/mesa/vbo/vbo_exec_array.c @@ -335,20 +335,20 @@ recalculate_input_bindings(struct gl_context *ctx) } } - for (i = 0; i < VERT_ATTRIB_MAT_MAX; i++) { - inputs[VERT_ATTRIB_MAT(i)] = -&vbo->currval[VBO_ATTRIB_MAT_FRONT_AMBIENT + i]; - const_inputs |= VERT_BIT_MAT(i); - } - /* Could use just about anything, just to fill in the empty * slots: */ - for (i = VERT_ATTRIB_MAT_MAX; i < VERT_ATTRIB_GENERIC_MAX; i++) { + for (i = 0; i < VERT_ATTRIB_MAT_OFFSET; i++) { inputs[VERT_ATTRIB_GENERIC(i)] = &vbo->currval[VBO_ATTRIB_GENERIC0 + i]; const_inputs |= VERT_BIT_GENERIC(i); } + + for (i = 0; i < VERT_ATTRIB_MAT_MAX; i++) { + inputs[VERT_ATTRIB_MAT(i)] = +&vbo->currval[VBO_ATTRIB_MAT_FRONT_AMBIENT + i]; + const_inputs |= VERT_BIT_MAT(i); + } break; case VP_SHADER: diff --git a/src/mesa/vbo/vbo_exec_draw.c b/src/mesa/vbo/vbo_exec_draw.c index 2b7784694f..b077fbeb6d 100644 --- a/src/mesa/vbo/vbo_exec_draw.c +++ b/src/mesa/vbo/vbo_exec_draw.c @@ -187,16 +187,16 @@ vbo_exec_bind_arrays(struct gl_context *ctx) /* Overlay other active attributes */ switch (get_vp_mode(exec->ctx)) { case 
VP_FF: + for (attr = 0; attr < VERT_ATTRIB_MAT_OFFSET; attr++) { + assert(VERT_ATTRIB_GENERIC(attr) < ARRAY_SIZE(exec->vtx.inputs)); + exec->vtx.inputs[VERT_ATTRIB_GENERIC(attr)] = +&vbo->currval[VBO_ATTRIB_GENERIC0+attr]; + } for (attr = 0; attr < VERT_ATTRIB_MAT_MAX; attr++) { assert(VERT_ATTRIB_MAT(attr) < ARRAY_SIZE(exec->vtx.inputs)); exec->vtx.inputs[VERT_ATTRIB_MAT(attr)] = &vbo->currval[VBO_ATTRIB_MAT_FRONT_AMBIENT+attr]; } - for (attr = VERT_ATTRIB_MAT_MAX; attr < VERT_ATTRIB_GENERIC_MAX; attr++) { - assert(VERT_ATTRIB_GENERIC(
Re: [Mesa-dev] [PATCH 1/8] vbo: Correctly handle attribute offsets in dlist draw.
Hi Matthias, Nice work! It's nice to get rid of some of those attribute loops. The series looks good to me, just assorted nit-picks and comments on a few patches. -Brian On 01/31/2018 12:55 AM, mathias.froehl...@gmx.net wrote: From: Mathias Fröhlich When executing a display list draw, for the offset list to be correct, the offset computation needs to accumulate all attribute size values in order. Specifically, if we are shuffling around the position and generic0 attributes, we may violate the order or if we do not walk the generic vbo attributes we may skip some of the attributes. Even if this is an unlikely usecase we can fix this "use case" by precomputing the offsets on the full attribute list and store the full offset list in the display list node. v2: Formatting fix v3: Rebase Signed-off-by: Mathias Fröhlich --- src/mesa/vbo/vbo_save.h | 1 + src/mesa/vbo/vbo_save_api.c | 19 ++ src/mesa/vbo/vbo_save_draw.c | 47 +++- 3 files changed, 36 insertions(+), 31 deletions(-) diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h index 1dc66a598e..cb0bff2341 100644 --- a/src/mesa/vbo/vbo_save.h +++ b/src/mesa/vbo/vbo_save.h @@ -64,6 +64,7 @@ struct vbo_save_vertex_list { GLbitfield64 enabled; /**< mask of enabled vbo arrays. */ GLubyte attrsz[VBO_ATTRIB_MAX]; GLenum16 attrtype[VBO_ATTRIB_MAX]; + GLuint offsets[VBO_ATTRIB_MAX]; GLuint vertex_size; /**< size in GLfloats */ /* Copy of the final vertex from node->vertex_store->bufferobj. diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c index a0fb62d814..fb51bdb84e 100644 --- a/src/mesa/vbo/vbo_save_api.c +++ b/src/mesa/vbo/vbo_save_api.c @@ -420,6 +420,8 @@ compile_vertex_list(struct gl_context *ctx) { struct vbo_save_context *save = &vbo_context(ctx)->save; struct vbo_save_vertex_list *node; + GLuint offset; + unsigned i; /* Allocate space for this structure in the display list currently * being compiled. 
@@ -443,6 +445,23 @@ compile_vertex_list(struct gl_context *ctx) node->vertex_size = save->vertex_size; node->buffer_offset = (save->buffer_map - save->vertex_store->buffer_map) * sizeof(GLfloat); + if (aligned_vertex_buffer_offset(node)) { + /* The vertex size is an exact multiple of the buffer offset. + * This means that we can use zero-based vertex attribute pointers + * and specify the start of the primitive with the _mesa_prim::start + * field. This results in issuing several draw calls with identical + * vertex attribute information. This can result in fewer state + * changes in drivers. In particular, the Gallium CSO module will + * filter out redundant vertex buffer changes. + */ + offset = 0; + } else { + offset = node->buffer_offset; + } + for (i = 0; i < VBO_ATTRIB_MAX; ++i) { + node->offsets[i] = offset; + offset += node->attrsz[i] * sizeof(GLfloat); + } node->vertex_count = save->vert_count; node->wrap_count = save->copied.nr; node->dangling_attr_ref = save->dangling_attr_ref; diff --git a/src/mesa/vbo/vbo_save_draw.c b/src/mesa/vbo/vbo_save_draw.c index fd0ccc1230..291d99ed9a 100644 --- a/src/mesa/vbo/vbo_save_draw.c +++ b/src/mesa/vbo/vbo_save_draw.c @@ -26,6 +26,7 @@ *Keith Whitwell */ +#include #include "main/glheader.h" #include "main/bufferobj.h" #include "main/context.h" @@ -137,29 +138,10 @@ bind_vertex_list(struct gl_context *ctx, struct vbo_context *vbo = vbo_context(ctx); struct vbo_save_context *save = &vbo->save; struct gl_vertex_array *arrays = save->arrays; - GLuint buffer_offset = node->buffer_offset; const GLubyte *map; GLuint attr; - GLubyte node_attrsz[VBO_ATTRIB_MAX]; /* copy of node->attrsz[] */ - GLenum16 node_attrtype[VBO_ATTRIB_MAX]; /* copy of node->attrtype[] */ GLbitfield varying_inputs = 0x0; - - STATIC_ASSERT(sizeof(node_attrsz) == sizeof(node->attrsz)); - memcpy(node_attrsz, node->attrsz, sizeof(node->attrsz)); - STATIC_ASSERT(sizeof(node_attrtype) == sizeof(node->attrtype)); - memcpy(node_attrtype, node->attrtype, 
sizeof(node->attrtype)); - - if (aligned_vertex_buffer_offset(node)) { - /* The vertex size is an exact multiple of the buffer offset. - * This means that we can use zero-based vertex attribute pointers - * and specify the start of the primitive with the _mesa_prim::start - * field. This results in issuing several draw calls with identical - * vertex attribute information. This can result in fewer state - * changes in drivers. In particular, the Gallium CSO module will - * filter out redundant vertex buffer changes. - */ - buffer_offset = 0; - } + bool generic_from_pos = false; /* Install the default (ie Current) attributes first */ for (attr = 0; attr < VERT_ATTRIB_FF_MAX; attr++
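For readers following the thread, the offset precomputation added to compile_vertex_list() can be sketched in isolation. This is only a minimal stand-alone illustration, not the Mesa code: ATTRIB_MAX, struct vertex_list and compute_offsets() are hypothetical stand-ins for VBO_ATTRIB_MAX, struct vbo_save_vertex_list and the loop in the patch, and the attribute sizes used with it are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for VBO_ATTRIB_MAX; the real value is larger. */
#define ATTRIB_MAX 4

/* Hypothetical, trimmed-down stand-in for struct vbo_save_vertex_list. */
struct vertex_list {
   unsigned char attrsz[ATTRIB_MAX]; /* size of each attribute, in floats */
   unsigned offsets[ATTRIB_MAX];     /* byte offset of each attribute */
};

/*
 * Walk every attribute slot in order and accumulate the sizes, so that
 * no slot is skipped and the result does not depend on which slots are
 * later remapped at draw time.
 */
static void
compute_offsets(struct vertex_list *node, unsigned base_offset)
{
   unsigned offset = base_offset;

   assert(node != NULL);
   for (unsigned i = 0; i < ATTRIB_MAX; ++i) {
      node->offsets[i] = offset;
      offset += node->attrsz[i] * (unsigned)sizeof(float);
   }
}
```

Because the offsets are stored per slot, a draw-time remap (for example position sourced from the generic0 slot) can presumably just pick the offset of whichever slot it reads from, instead of re-deriving it from a partial walk.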
Re: [Mesa-dev] [PATCH 4/8] mesa: Track position/generic0 aliasing in the VAO.
On 01/31/2018 12:55 AM, mathias.froehl...@gmx.net wrote: From: Mathias Fröhlich Since the first material attribute no longer aliases with the generic0 attribute, only aliasing between generic0 and position is left and entirely dependent on the enabled state of the VAO. So introduce a gl_attribute_map_mode in the VAO that is used to track how the position and the generic 0 attribute alias. Provide a static const array that can be used to map from vertex program input indices to VERT_ATTRIB_* indices. The outer dimension of the array is meant to be indexed directly by the new VAO member variable. Also provide methods on the VAO to convert bitmasks of VERT_BIT's from the VAO numbering to the vertex processing inputs numbering. Signed-off-by: Mathias Fröhlich --- src/mesa/main/arrayobj.c | 131 +++ src/mesa/main/arrayobj.h | 64 +++ src/mesa/main/enable.c | 5 ++ src/mesa/main/mtypes.h | 18 +++ src/mesa/main/varray.c | 18 +-- 5 files changed, 232 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c index 7208f4c534..ae414072ab 100644 --- a/src/mesa/main/arrayobj.c +++ b/src/mesa/main/arrayobj.c @@ -54,6 +54,135 @@ #include "util/bitscan.h" +const unsigned char How about GLubyte instead? +_mesa_vao_attribute_map[_ATTRIBUTE_MAP_MODE_MAX][VERT_ATTRIB_MAX] = +{ + /* ATTRIBUTE_MAP_MODE_IDENTITY +* +* Grab vertex processing attribute VERT_ATTRIB_POS from +* the VAO attribute VERT_ATTRIB_POS, and grab vertex processing +* attribute VERT_ATTRIB_GENERIC0 from the VAO attribute +* VERT_ATTRIB_GENERIC0. 
+*/ + { + VERT_ATTRIB_POS, /* VERT_ATTRIB_POS */ + VERT_ATTRIB_NORMAL, /* VERT_ATTRIB_NORMAL */ + VERT_ATTRIB_COLOR0, /* VERT_ATTRIB_COLOR0 */ + VERT_ATTRIB_COLOR1, /* VERT_ATTRIB_COLOR1 */ + VERT_ATTRIB_FOG, /* VERT_ATTRIB_FOG */ + VERT_ATTRIB_COLOR_INDEX, /* VERT_ATTRIB_COLOR_INDEX */ + VERT_ATTRIB_EDGEFLAG,/* VERT_ATTRIB_EDGEFLAG */ + VERT_ATTRIB_TEX0,/* VERT_ATTRIB_TEX0 */ + VERT_ATTRIB_TEX1,/* VERT_ATTRIB_TEX1 */ + VERT_ATTRIB_TEX2,/* VERT_ATTRIB_TEX2 */ + VERT_ATTRIB_TEX3,/* VERT_ATTRIB_TEX3 */ + VERT_ATTRIB_TEX4,/* VERT_ATTRIB_TEX4 */ + VERT_ATTRIB_TEX5,/* VERT_ATTRIB_TEX5 */ + VERT_ATTRIB_TEX6,/* VERT_ATTRIB_TEX6 */ + VERT_ATTRIB_TEX7,/* VERT_ATTRIB_TEX7 */ + VERT_ATTRIB_POINT_SIZE, /* VERT_ATTRIB_POINT_SIZE */ + VERT_ATTRIB_GENERIC0,/* VERT_ATTRIB_GENERIC0 */ + VERT_ATTRIB_GENERIC1,/* VERT_ATTRIB_GENERIC1 */ + VERT_ATTRIB_GENERIC2,/* VERT_ATTRIB_GENERIC2 */ + VERT_ATTRIB_GENERIC3,/* VERT_ATTRIB_GENERIC3 */ + VERT_ATTRIB_GENERIC4,/* VERT_ATTRIB_GENERIC4 */ + VERT_ATTRIB_GENERIC5,/* VERT_ATTRIB_GENERIC5 */ + VERT_ATTRIB_GENERIC6,/* VERT_ATTRIB_GENERIC6 */ + VERT_ATTRIB_GENERIC7,/* VERT_ATTRIB_GENERIC7 */ + VERT_ATTRIB_GENERIC8,/* VERT_ATTRIB_GENERIC8 */ + VERT_ATTRIB_GENERIC9,/* VERT_ATTRIB_GENERIC9 */ + VERT_ATTRIB_GENERIC10, /* VERT_ATTRIB_GENERIC10 */ + VERT_ATTRIB_GENERIC11, /* VERT_ATTRIB_GENERIC11 */ + VERT_ATTRIB_GENERIC12, /* VERT_ATTRIB_GENERIC12 */ + VERT_ATTRIB_GENERIC13, /* VERT_ATTRIB_GENERIC13 */ + VERT_ATTRIB_GENERIC14, /* VERT_ATTRIB_GENERIC14 */ + VERT_ATTRIB_GENERIC15/* VERT_ATTRIB_GENERIC15 */ + }, + + /* ATTRIBUTE_MAP_MODE_POSITION +* +* Grab vertex processing attribute VERT_ATTRIB_POS as well as +* vertex processing attribute VERT_ATTRIB_GENERIC0 from the +* VAO attribute VERT_ATTRIB_POS. 
+*/ + { + VERT_ATTRIB_POS, /* VERT_ATTRIB_POS */ + VERT_ATTRIB_NORMAL, /* VERT_ATTRIB_NORMAL */ + VERT_ATTRIB_COLOR0, /* VERT_ATTRIB_COLOR0 */ + VERT_ATTRIB_COLOR1, /* VERT_ATTRIB_COLOR1 */ + VERT_ATTRIB_FOG, /* VERT_ATTRIB_FOG */ + VERT_ATTRIB_COLOR_INDEX, /* VERT_ATTRIB_COLOR_INDEX */ + VERT_ATTRIB_EDGEFLAG,/* VERT_ATTRIB_EDGEFLAG */ + VERT_ATTRIB_TEX0,/* VERT_ATTRIB_TEX0 */ + VERT_ATTRIB_TEX1,/* VERT_ATTRIB_TEX1 */ + VERT_ATTRIB_TEX2,/* VERT_ATTRIB_TEX2 */ + VERT_ATTRIB_TEX3,/* VERT_ATTRIB_TEX3 */ + VERT_ATTRIB_TEX4,/* VERT_ATTRIB_TEX4 */ + VERT_ATTRIB_TEX5,/* VERT_ATTRIB_TEX5 */ + VERT_ATTRIB_TEX6,/* VERT_ATTRIB_TEX6 */ + VERT_ATTRIB_TEX7,/* VERT_ATTRIB_T
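The idea of a static const table whose outer dimension is indexed by the VAO's map mode can be sketched with a much shorter attribute list. Everything below is a hypothetical reduction for illustration: the enum names, the table and map_attrib() are assumptions and do not match Mesa's identifiers or table contents.

```c
#include <assert.h>

/* Hypothetical, shortened attribute list; VERT_ATTRIB_* is much longer. */
enum attrib { ATTR_POS, ATTR_NORMAL, ATTR_GENERIC0, ATTR_GENERIC1, ATTR_MAX };

/* Hypothetical counterpart of the map mode tracked in the VAO. */
enum map_mode { MAP_MODE_IDENTITY, MAP_MODE_POSITION, MAP_MODE_MAX };

/* Outer index: map mode. Inner index: vertex processing input. */
static const unsigned char attribute_map[MAP_MODE_MAX][ATTR_MAX] = {
   [MAP_MODE_IDENTITY] = {
      [ATTR_POS]      = ATTR_POS,
      [ATTR_NORMAL]   = ATTR_NORMAL,
      [ATTR_GENERIC0] = ATTR_GENERIC0,
      [ATTR_GENERIC1] = ATTR_GENERIC1,
   },
   /* Both POS and GENERIC0 inputs are fetched from the VAO's POS slot. */
   [MAP_MODE_POSITION] = {
      [ATTR_POS]      = ATTR_POS,
      [ATTR_NORMAL]   = ATTR_NORMAL,
      [ATTR_GENERIC0] = ATTR_POS,
      [ATTR_GENERIC1] = ATTR_GENERIC1,
   },
};

/* Look up which VAO attribute feeds a given vertex processing input. */
static enum attrib
map_attrib(enum map_mode mode, enum attrib input)
{
   assert(mode < MAP_MODE_MAX && input < ATTR_MAX);
   return (enum attrib)attribute_map[mode][input];
}
```

Keeping the table const and global means a lookup is a plain array index with no per-context state to initialise.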
Re: [Mesa-dev] [PATCH 5/8] vbo: Use static const VERT_ATTRIB->VBO_ATTRIB maps.
On 01/31/2018 12:55 AM, mathias.froehl...@gmx.net wrote: From: Mathias Fröhlich Instead of each context having its own map instance for this purpose, use a global static const map. Signed-off-by: Mathias Fröhlich --- src/mesa/vbo/vbo_context.c| 23 ++ src/mesa/vbo/vbo_exec.c | 74 +++ src/mesa/vbo/vbo_exec_array.c | 5 ++- src/mesa/vbo/vbo_exec_draw.c | 8 ++--- src/mesa/vbo/vbo_private.h| 15 ++--- src/mesa/vbo/vbo_save_draw.c | 8 ++--- 6 files changed, 98 insertions(+), 35 deletions(-) diff --git a/src/mesa/vbo/vbo_context.c b/src/mesa/vbo/vbo_context.c index 265b73d2db..fe1d0f510a 100644 --- a/src/mesa/vbo/vbo_context.c +++ b/src/mesa/vbo/vbo_context.c @@ -229,27 +229,8 @@ _vbo_CreateContext(struct gl_context *ctx) init_mat_currval(ctx); vbo_set_indirect_draw_func(ctx, vbo_draw_indirect_prims); - /* Build mappings from VERT_ATTRIB -> VBO_ATTRIB depending on type -* of vertex program active. -*/ - { - GLuint i; - - /* make sure all VBO_ATTRIB_ values can fit in an unsigned byte */ - STATIC_ASSERT(VBO_ATTRIB_MAX <= 255); - - /* identity mapping */ - for (i = 0; i < ARRAY_SIZE(vbo->map_vp_none); i++) - vbo->map_vp_none[i] = i; - /* map material attribs to generic slots */ - for (i = 0; i < VERT_ATTRIB_MAT_MAX; i++) - vbo->map_vp_none[VERT_ATTRIB_MAT(i)] -= VBO_ATTRIB_MAT_FRONT_AMBIENT + i; - - for (i = 0; i < ARRAY_SIZE(vbo->map_vp_arb); i++) - vbo->map_vp_arb[i] = i; - } - + /* make sure all VBO_ATTRIB_ values can fit in an unsigned byte */ + STATIC_ASSERT(VBO_ATTRIB_MAX <= 255); /* Hook our functions into exec and compile dispatch tables. These * will pretty much be permanently installed, which means that the diff --git a/src/mesa/vbo/vbo_exec.c b/src/mesa/vbo/vbo_exec.c index 82f204e3dc..987fa84a9b 100644 --- a/src/mesa/vbo/vbo_exec.c +++ b/src/mesa/vbo/vbo_exec.c @@ -32,6 +32,80 @@ #include "main/vtxfmt.h" #include "vbo_private.h" +const unsigned char GLubyte? 
+_vbo_attribute_alias_map[_VP_MODE_MAX][VERT_ATTRIB_MAX] = { + /* VP_FF: */ + { + VBO_ATTRIB_POS, /* VERT_ATTRIB_POS */ + VBO_ATTRIB_NORMAL, /* VERT_ATTRIB_NORMAL */ + VBO_ATTRIB_COLOR0, /* VERT_ATTRIB_COLOR0 */ + VBO_ATTRIB_COLOR1, /* VERT_ATTRIB_COLOR1 */ + VBO_ATTRIB_FOG, /* VERT_ATTRIB_FOG */ + VBO_ATTRIB_COLOR_INDEX, /* VERT_ATTRIB_COLOR_INDEX */ + VBO_ATTRIB_EDGEFLAG,/* VERT_ATTRIB_EDGEFLAG */ + VBO_ATTRIB_TEX0,/* VERT_ATTRIB_TEX0 */ + VBO_ATTRIB_TEX1,/* VERT_ATTRIB_TEX1 */ + VBO_ATTRIB_TEX2,/* VERT_ATTRIB_TEX2 */ + VBO_ATTRIB_TEX3,/* VERT_ATTRIB_TEX3 */ + VBO_ATTRIB_TEX4,/* VERT_ATTRIB_TEX4 */ + VBO_ATTRIB_TEX5,/* VERT_ATTRIB_TEX5 */ + VBO_ATTRIB_TEX6,/* VERT_ATTRIB_TEX6 */ + VBO_ATTRIB_TEX7,/* VERT_ATTRIB_TEX7 */ + VBO_ATTRIB_POINT_SIZE, /* VERT_ATTRIB_POINT_SIZE */ + VBO_ATTRIB_GENERIC0,/* VERT_ATTRIB_GENERIC0 */ + VBO_ATTRIB_GENERIC1,/* VERT_ATTRIB_GENERIC1 */ + VBO_ATTRIB_GENERIC2,/* VERT_ATTRIB_GENERIC2 */ + VBO_ATTRIB_GENERIC3,/* VERT_ATTRIB_GENERIC3 */ + VBO_ATTRIB_MAT_FRONT_AMBIENT, /* VERT_ATTRIB_GENERIC4 */ + VBO_ATTRIB_MAT_BACK_AMBIENT,/* VERT_ATTRIB_GENERIC5 */ + VBO_ATTRIB_MAT_FRONT_DIFFUSE, /* VERT_ATTRIB_GENERIC6 */ + VBO_ATTRIB_MAT_BACK_DIFFUSE,/* VERT_ATTRIB_GENERIC7 */ + VBO_ATTRIB_MAT_FRONT_SPECULAR, /* VERT_ATTRIB_GENERIC8 */ + VBO_ATTRIB_MAT_BACK_SPECULAR, /* VERT_ATTRIB_GENERIC9 */ + VBO_ATTRIB_MAT_FRONT_EMISSION, /* VERT_ATTRIB_GENERIC10 */ + VBO_ATTRIB_MAT_BACK_EMISSION, /* VERT_ATTRIB_GENERIC11 */ + VBO_ATTRIB_MAT_FRONT_SHININESS, /* VERT_ATTRIB_GENERIC12 */ + VBO_ATTRIB_MAT_BACK_SHININESS, /* VERT_ATTRIB_GENERIC13 */ + VBO_ATTRIB_MAT_FRONT_INDEXES, /* VERT_ATTRIB_GENERIC14 */ + VBO_ATTRIB_MAT_BACK_INDEXES /* VERT_ATTRIB_GENERIC15 */ + }, + + /* VP_SHADER: */ + { + VBO_ATTRIB_POS, /* VERT_ATTRIB_POS */ + VBO_ATTRIB_NORMAL, /* VERT_ATTRIB_NORMAL */ + VBO_ATTRIB_COLOR0, /* VERT_ATTRIB_COLOR0 */ + VBO_ATTRIB_COLOR1, /* VERT_ATTRIB_COLOR1 */ + VBO_ATTRIB_FOG, /* VERT_ATTRIB_FOG */ + VBO_ATTRIB_COLOR_INDEX, /* VERT_ATTRIB_COLOR_INDEX */ + 
VBO_ATTRIB_EDGEFLAG,/* VERT_ATTRIB_EDGEFLAG */ + VBO_ATTRIB_TEX0,/* VERT_ATTRIB_TEX0 */ + VBO_ATTRIB_TEX1,/* VERT_ATTRIB_TEX1 */ + VBO_ATTRIB_TEX2,/* VERT_ATTRIB_TEX2 */ + VBO_ATTRIB_TEX3,/* VER
Re: [Mesa-dev] [PATCH 1/8] vbo: Correctly handle attribute offsets in dlist draw.
On 01/31/2018 09:54 AM, Brian Paul wrote:
Hi Matthias,

Nice work! It's nice to get rid of some of those attribute loops. The series looks good to me, just assorted nit-picks and comments on a few patches.

Oh, and it might be good to write a new piglit test or two to exercise the POS/GENERIC0 aliasing stuff. I _think_ we might have one test along those lines now (and I _may_ have even written it), but I can't look right now.

-Brian
[Mesa-dev] [PATCH] docs/features: mark EXT_semaphore(_fd) as DONE v2
Support for these extensions is available in radeonsi.

v2: also updated relnotes

Signed-off-by: Andres Rodriguez
---

Let me know if the formatting for the relnotes is what is expected. I based it on the previous versions.

 docs/features.txt         | 4 ++--
 docs/relnotes/18.1.0.html | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index 2e110d9994..1672460a2f 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -316,8 +316,8 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GL_EXT_memory_object                                  DONE (radeonsi)
   GL_EXT_memory_object_fd                               DONE (radeonsi)
   GL_EXT_memory_object_win32                            not started
-  GL_EXT_semaphore                                      not started
-  GL_EXT_semaphore_fd                                   not started
+  GL_EXT_semaphore                                      DONE (radeonsi)
+  GL_EXT_semaphore_fd                                   DONE (radeonsi)
   GL_EXT_semaphore_win32                                not started
   GL_KHR_blend_equation_advanced_coherent               DONE (i965/gen9+)
   GL_KHR_texture_compression_astc_hdr                   DONE (i965/bxt)
diff --git a/docs/relnotes/18.1.0.html b/docs/relnotes/18.1.0.html
index ddacbb4656..b8a0cd0d02 100644
--- a/docs/relnotes/18.1.0.html
+++ b/docs/relnotes/18.1.0.html
@@ -44,7 +44,8 @@ Note: some of the new features are only available with certain drivers.
-TBD
+GL_EXT_semaphore on radeonsi
+GL_EXT_semaphore_fd on radeonsi
 Bug fixes
--
2.14.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/20] mesa: implement buffer/texture barriers for semaphore signal/wait v2
On 31 January 2018 at 15:20, Jon Turney wrote:
> On 23/01/2018 18:05, Andres Rodriguez wrote:
>>
>> Make sure memory is accessible to the external client, for the specified
>> memory object, before the signal/after the wait.
>>
>> v2: fixed flush order with respect to wait/signal emission
>>
>> Signed-off-by: Andres Rodriguez
>> ---
>>   src/mesa/main/dd.h                              | 14 ++-
>>   src/mesa/main/externalobjects.c                 | 38 +++---
>>   src/mesa/state_tracker/st_cb_semaphoreobjects.c | 53
>> +++--
>>   3 files changed, 95 insertions(+), 10 deletions(-)
>
> [...]
>
>> diff --git a/src/mesa/main/externalobjects.c
>> b/src/mesa/main/externalobjects.c
>> index 4fb3ca07a9..c070d7a28d 100644
>> --- a/src/mesa/main/externalobjects.c
>> +++ b/src/mesa/main/externalobjects.c
>> @@ -23,6 +23,7 @@
>>   #include "macros.h"
>>   #include "mtypes.h"
>> +#include "bufferobj.h"
>>   #include "context.h"
>>   #include "externalobjects.h"
>>   #include "teximage.h"
>> @@ -716,7 +717,8 @@ _mesa_WaitSemaphoreEXT(GLuint semaphore,
>>   {
>>      GET_CURRENT_CONTEXT(ctx);
>>      struct gl_semaphore_object *semObj;
>> -
>> +   struct gl_buffer_object **bufObjs;
>> +   struct gl_texture_object **texObjs;
>>      if (!ctx->Extensions.EXT_semaphore) {
>>         _mesa_error(ctx, GL_INVALID_OPERATION,
>> "glWaitSemaphoreEXT(unsupported)");
>> @@ -732,8 +734,20 @@ _mesa_WaitSemaphoreEXT(GLuint semaphore,
>>      FLUSH_VERTICES( ctx, 0 );
>>      FLUSH_CURRENT( ctx, 0 );
>> -   /* TODO: memory barriers and layout transitions */
>> -   ctx->Driver.ServerWaitSemaphoreObject(ctx, semObj);
>> +   bufObjs = alloca(sizeof(struct gl_buffer_object **) *
>> numBufferBarriers);
>> +   for (unsigned i = 0; i < numBufferBarriers; i++) {
>> +      bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]);
>> +   }
>> +
>> +   texObjs = alloca(sizeof(struct gl_texture_object **) *
>> numTextureBarriers);
>> +   for (unsigned i = 0; i < numTextureBarriers; i++) {
>> +      texObjs[i] = _mesa_lookup_texture(ctx, textures[i]);
>> +   }
>> +
>> +   ctx->Driver.ServerWaitSemaphoreObject(ctx, semObj,
>> +                                         numBufferBarriers, bufObjs,
>> +                                         numTextureBarriers, texObjs,
>> +                                         srcLayouts);
>>   }
>>   void GLAPIENTRY
>> @@ -746,6 +760,8 @@ _mesa_SignalSemaphoreEXT(GLuint semaphore,
>>   {
>>      GET_CURRENT_CONTEXT(ctx);
>>      struct gl_semaphore_object *semObj;
>> +   struct gl_buffer_object **bufObjs;
>> +   struct gl_texture_object **texObjs;
>>      if (!ctx->Extensions.EXT_semaphore) {
>>         _mesa_error(ctx, GL_INVALID_OPERATION,
>> "glSignalSemaphoreEXT(unsupported)");
>> @@ -761,8 +777,20 @@ _mesa_SignalSemaphoreEXT(GLuint semaphore,
>>      FLUSH_VERTICES( ctx, 0 );
>>      FLUSH_CURRENT( ctx, 0 );
>> -   /* TODO: memory barriers and layout transitions */
>> -   ctx->Driver.ServerSignalSemaphoreObject(ctx, semObj);
>> +   bufObjs = alloca(sizeof(struct gl_buffer_object **) *
>> numBufferBarriers);
>> +   for (unsigned i = 0; i < numBufferBarriers; i++) {
>> +      bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]);
>> +   }
>> +
>> +   texObjs = alloca(sizeof(struct gl_texture_object **) *
>> numTextureBarriers);
>> +   for (unsigned i = 0; i < numTextureBarriers; i++) {
>> +      texObjs[i] = _mesa_lookup_texture(ctx, textures[i]);
>> +   }
>> +
>> +   ctx->Driver.ServerSignalSemaphoreObject(ctx, semObj,
>> +                                           numBufferBarriers, bufObjs,
>> +                                           numTextureBarriers, texObjs,
>> +                                           dstLayouts);
>>   }
>
> [...]
>
> This adds a use of alloca(), without a corresponding #include.
>
> Patch attached.
>
> See also:
> https://lists.freedesktop.org/archives/mesa-dev/2017-December/180073.html
> https://lists.freedesktop.org/archives/mesa-dev/2016-July/122346.html
>

We have only a few instances of alloca in-tree.
The num* variables are user provided, so it's very likely to wreak havoc.

I'd stick with malloc, esp. considering the different alloca implementations and header madness.

-Emil
Re: [Mesa-dev] [PATCH 2/4] ac/shader: repack ac_shader_info
Patches 1 & 2:

Reviewed-by: Nicolai Hähnle

On 31.01.2018 00:41, Dave Airlie wrote:
From: Dave Airlie

This reduces the size from 28->24 bytes.

Signed-off-by: Dave Airlie
---
 src/amd/common/ac_shader_info.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_shader_info.h b/src/amd/common/ac_shader_info.h
index 59b749576aa..7283433bfce 100644
--- a/src/amd/common/ac_shader_info.h
+++ b/src/amd/common/ac_shader_info.h
@@ -28,8 +28,8 @@ struct nir_shader;
 struct ac_nir_compiler_options;

 struct ac_shader_info {
-	bool loads_push_constants;
 	uint32_t desc_set_used_mask;
+	bool loads_push_constants;
 	bool needs_multiview_view_index;
 	bool uses_invocation_id;
 	bool uses_prim_id;

--
Learn what the world is really like,
but never forget how it ought to be.
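The saving comes purely from padding. A minimal sketch of the effect, using two hypothetical cut-down structs (info_before and info_after are illustrative assumptions, not the real ac_shader_info, which has more members):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-down layout before the reorder. */
struct info_before {
   bool loads_push_constants;   /* usually followed by padding up to 4 bytes */
   uint32_t desc_set_used_mask; /* needs 4-byte alignment on common ABIs */
   bool needs_multiview_view_index;
   bool uses_invocation_id;
   bool uses_prim_id;
};

/* Same members, with the lone leading bool moved next to the other bools. */
struct info_after {
   uint32_t desc_set_used_mask;
   bool loads_push_constants;
   bool needs_multiview_view_index;
   bool uses_invocation_id;
   bool uses_prim_id;
};

/* Bytes saved by the reorder on whatever ABI this is compiled for. */
static size_t
bytes_saved(void)
{
   /* Grouping the bools cannot add padding to this particular layout. */
   assert(sizeof(struct info_after) <= sizeof(struct info_before));
   return sizeof(struct info_before) - sizeof(struct info_after);
}
```

On ABIs where bool is one byte and uint32_t is 4-byte aligned, the cut-down structs should differ by one alignment unit, which is the same 4 bytes the commit message reports for the full struct.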
Re: [Mesa-dev] [PATCH] docs/features: mark EXT_semaphore(_fd) as DONE v2
Reviewed-by: Samuel Pitoiset

On 01/31/2018 06:04 PM, Andres Rodriguez wrote:
Support for these extensions is available in radeonsi.

v2: also updated relnotes

Signed-off-by: Andres Rodriguez
[Mesa-dev] [PATCH] mesa: remove usage of alloca in externalobjects.c
Don't want an overly large numBufferBarriers/numTextureBarriers to blow up the stack. Suggested-by: Emil Velikov Signed-off-by: Andres Rodriguez --- src/mesa/main/externalobjects.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/externalobjects.c b/src/mesa/main/externalobjects.c index 463debd268..6a248f35a6 100644 --- a/src/mesa/main/externalobjects.c +++ b/src/mesa/main/externalobjects.c @@ -727,34 +727,37 @@ _mesa_WaitSemaphoreEXT(GLuint semaphore, ASSERT_OUTSIDE_BEGIN_END(ctx); semObj = _mesa_lookup_semaphore_object(ctx, semaphore); if (!semObj) return; FLUSH_VERTICES(ctx, 0); FLUSH_CURRENT(ctx, 0); - bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers); + bufObjs = malloc(sizeof(struct gl_buffer_object **) * numBufferBarriers); for (unsigned i = 0; i < numBufferBarriers; i++) { bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]); } - texObjs = alloca(sizeof(struct gl_texture_object **) * numTextureBarriers); + texObjs = malloc(sizeof(struct gl_texture_object **) * numTextureBarriers); for (unsigned i = 0; i < numTextureBarriers; i++) { texObjs[i] = _mesa_lookup_texture(ctx, textures[i]); } ctx->Driver.ServerWaitSemaphoreObject(ctx, semObj, numBufferBarriers, bufObjs, numTextureBarriers, texObjs, srcLayouts); + + free(bufObjs); + free(texObjs); } void GLAPIENTRY _mesa_SignalSemaphoreEXT(GLuint semaphore, GLuint numBufferBarriers, const GLuint *buffers, GLuint numTextureBarriers, const GLuint *textures, const GLenum *dstLayouts) { @@ -770,34 +773,37 @@ _mesa_SignalSemaphoreEXT(GLuint semaphore, ASSERT_OUTSIDE_BEGIN_END(ctx); semObj = _mesa_lookup_semaphore_object(ctx, semaphore); if (!semObj) return; FLUSH_VERTICES(ctx, 0); FLUSH_CURRENT(ctx, 0); - bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers); + bufObjs = malloc(sizeof(struct gl_buffer_object **) * numBufferBarriers); for (unsigned i = 0; i < numBufferBarriers; i++) { bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]); } - 
texObjs = alloca(sizeof(struct gl_texture_object **) * numTextureBarriers); + texObjs = malloc(sizeof(struct gl_texture_object **) * numTextureBarriers); for (unsigned i = 0; i < numTextureBarriers; i++) { texObjs[i] = _mesa_lookup_texture(ctx, textures[i]); } ctx->Driver.ServerSignalSemaphoreObject(ctx, semObj, numBufferBarriers, bufObjs, numTextureBarriers, texObjs, dstLayouts); + + free(bufObjs); + free(texObjs); } void GLAPIENTRY _mesa_ImportMemoryFdEXT(GLuint memory, GLuint64 size, GLenum handleType, GLint fd) { GET_CURRENT_CONTEXT(ctx); -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH mesa] gallium/util: use designated initialisers in formats table
Signed-off-by: Eric Engestrom --- v2: I missed a few lines in v1 for some reason; converted as well --- src/gallium/auxiliary/util/u_format_table.py | 108 ++- 1 file changed, 55 insertions(+), 53 deletions(-) diff --git a/src/gallium/auxiliary/util/u_format_table.py b/src/gallium/auxiliary/util/u_format_table.py index a09ae53cbc8966ba5bb1..f8fd138e0bf40aafa1ce 100644 --- a/src/gallium/auxiliary/util/u_format_table.py +++ b/src/gallium/auxiliary/util/u_format_table.py @@ -125,77 +125,79 @@ def do_swizzle_array(channels, swizzles): for format in formats: print 'const struct util_format_description' print 'util_format_%s_description = {' % (format.short_name(),) -print " %s," % (format.name,) -print " \"%s\"," % (format.name,) -print " \"%s\"," % (format.short_name(),) -print " {%u, %u, %u},\t/* block */" % (format.block_width, format.block_height, format.block_size()) -print " %s," % (layout_map(format.layout),) -print " %u,\t/* nr_channels */" % (format.nr_channels(),) -print " %s,\t/* is_array */" % (bool_map(format.is_array()),) -print " %s,\t/* is_bitmask */" % (bool_map(format.is_bitmask()),) -print " %s,\t/* is_mixed */" % (bool_map(format.is_mixed()),) +print " .format = %s," % (format.name,) +print " .name = \"%s\"," % (format.name,) +print " .short_name = \"%s\"," % (format.short_name(),) +print " .block = {%u, %u, %u}," % (format.block_width, format.block_height, format.block_size()) +print " .layout = %s," % (layout_map(format.layout),) +print " .nr_channels = %u," % (format.nr_channels(),) +print " .is_array = %s," % (bool_map(format.is_array()),) +print " .is_bitmask = %s," % (bool_map(format.is_bitmask()),) +print " .is_mixed = %s," % (bool_map(format.is_mixed()),) +print " .channel = " u_format_pack.print_channels(format, do_channel_array) +print " .swizzle = " u_format_pack.print_channels(format, do_swizzle_array) -print " %s," % (colorspace_map(format.colorspace),) +print " .colorspace = %s," % (colorspace_map(format.colorspace),) access = True if 
format.layout in ('bptc', 'astc'): access = False if format.layout == 'etc' and format.short_name() != 'etc1_rgb8': access = False if format.colorspace != ZS and not format.is_pure_color() and access: -print " &util_format_%s_unpack_rgba_8unorm," % format.short_name() -print " &util_format_%s_pack_rgba_8unorm," % format.short_name() +print " .unpack_rgba_8unorm = &util_format_%s_unpack_rgba_8unorm," % format.short_name() +print " .pack_rgba_8unorm = &util_format_%s_pack_rgba_8unorm," % format.short_name() if format.layout == 's3tc' or format.layout == 'rgtc': -print " &util_format_%s_fetch_rgba_8unorm," % format.short_name() +print " .fetch_rgba_8unorm = &util_format_%s_fetch_rgba_8unorm," % format.short_name() else: -print " NULL, /* fetch_rgba_8unorm */" -print " &util_format_%s_unpack_rgba_float," % format.short_name() -print " &util_format_%s_pack_rgba_float," % format.short_name() -print " &util_format_%s_fetch_rgba_float," % format.short_name() +print " .fetch_rgba_8unorm = NULL," +print " .unpack_rgba_float = &util_format_%s_unpack_rgba_float," % format.short_name() +print " .pack_rgba_float = &util_format_%s_pack_rgba_float," % format.short_name() +print " .fetch_rgba_float = &util_format_%s_fetch_rgba_float," % format.short_name() else: -print " NULL, /* unpack_rgba_8unorm */" -print " NULL, /* pack_rgba_8unorm */" -print " NULL, /* fetch_rgba_8unorm */" -print " NULL, /* unpack_rgba_float */" -print " NULL, /* pack_rgba_float */" -print " NULL, /* fetch_rgba_float */" +print " .unpack_rgba_8unorm = NULL," +print " .pack_rgba_8unorm = NULL," +print " .fetch_rgba_8unorm = NULL," +print " .unpack_rgba_float = NULL," +print " .pack_rgba_float = NULL," +print " .fetch_rgba_float = NULL," if format.has_depth(): -print " &util_format_%s_unpack_z_32unorm," % format.short_name() -print " &util_format_%s_pack_z_32unorm," % format.short_name() -print " &util_format_%s_unpack_z_float," % format.short_name() -print " &util_format_%s_pack_z_float," % 
format.short_name() +print " .unpack_z_32unorm = &util_format_%s_unpack_z_32unorm," % format.short_name() +print " .pack_z_32unorm = &
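The patch only changes what the Python generator prints, so the effect is easiest to see on the generated C. A minimal sketch, assuming a hypothetical and heavily reduced struct format_desc (the real util_format_description has many more members, and "example_format" is a made-up name):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced format description. */
struct format_desc {
   const char *name;
   unsigned nr_channels;
   int is_array;
   void (*fetch)(void); /* optional hook; NULL when unsupported */
};

/* Old style: positional initialiser, meaning carried by comments only. */
static const struct format_desc positional = {
   "example_format",
   4,    /* nr_channels */
   1,    /* is_array */
   NULL, /* fetch */
};

/* New style: each value is tied to its member by name. */
static const struct format_desc designated = {
   .name = "example_format",
   .nr_channels = 4,
   .is_array = 1,
   .fetch = NULL,
};

/* Compare two descriptions member by member. */
static int
same_desc(const struct format_desc *a, const struct format_desc *b)
{
   assert(a != NULL && b != NULL);
   return strcmp(a->name, b->name) == 0 &&
          a->nr_channels == b->nr_channels &&
          a->is_array == b->is_array &&
          a->fetch == b->fetch;
}
```

Both forms produce the same object; the designated form keeps working if members are reordered in the struct definition, and any member left unnamed is zero-initialised.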
Re: [Mesa-dev] [PATCH v2 17/24] anv: Use blorp_ccs_ambiguate instead of fast-clears
On Tue, Jan 30, 2018 at 05:14:39PM -0800, Jason Ekstrand wrote: > On Tue, Jan 30, 2018 at 5:03 PM, Nanley Chery wrote: > > > On Tue, Jan 30, 2018 at 04:25:59PM -0800, Jason Ekstrand wrote: > > > On Tue, Jan 30, 2018 at 2:54 PM, Nanley Chery > > wrote: > > > > > > > On Fri, Jan 19, 2018 at 03:47:34PM -0800, Jason Ekstrand wrote: > > > > > Even though the blorp pass looks a bit on the sketchy side, the end > > > > > result in the Vulkan driver is very nice. Instead of having this > > weird > > > > > case where you do a fast clear and then maybe have to resolve, we > > just > > > > > do the ambiguate and are done with it. The ambiguate does exactly > > what > > > > > we want of setting all the CCS values to 0 which puts it inot the > > > >^ > > > >in > > > > Typo. > > > > Yup. Meant into > > > > > > > pass-through state. > > > > > > > > > > This should also improve performance a bit in certain cases. For > > > > > instance, if we did a transition from UNDEFINED to GENERAL for a > > surface > > > > > that doesn't have CCS enabled all the time, we would end up doing a > > > > > fast-clear and then a full resolve which ends up touching every byte > > in > > > > > the main surface as well as the CCS. With the ambiguate pass, that > > > > > transition only touches the CCS. 
> > > > > --- > > > > > src/intel/vulkan/anv_blorp.c | 5 > > > > > src/intel/vulkan/genX_cmd_buffer.c | 54 > > +- > > > > > > > > > 2 files changed, 17 insertions(+), 42 deletions(-) > > > > > > > > > > diff --git a/src/intel/vulkan/anv_blorp.c > > b/src/intel/vulkan/anv_blorp.c > > > > > index 05efc6d..3698543 100644 > > > > > --- a/src/intel/vulkan/anv_blorp.c > > > > > +++ b/src/intel/vulkan/anv_blorp.c > > > > > @@ -1792,6 +1792,11 @@ anv_image_ccs_op(struct anv_cmd_buffer > > > > *cmd_buffer, > > > > > surf.surf->format, > > isl_to_blorp_fast_clear_op( > > > > ccs_op)); > > > > >break; > > > > > case ISL_AUX_OP_AMBIGUATE: > > > > > + for (uint32_t a = 0; a < layer_count; a++) { > > > > > + const uint32_t layer = base_layer + a; > > > > > + blorp_ccs_ambiguate(&batch, &surf, level, layer); > > > > > + } > > > > > + break; > > > > > default: > > > > >unreachable("Unsupported CCS operation"); > > > > > } > > > > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c > > > > b/src/intel/vulkan/genX_cmd_buffer.c > > > > > index 77fdadf..9e2eba3 100644 > > > > > --- a/src/intel/vulkan/genX_cmd_buffer.c > > > > > +++ b/src/intel/vulkan/genX_cmd_buffer.c > > > > > @@ -486,15 +486,6 @@ init_fast_clear_state_entry(struct > > anv_cmd_buffer > > > > *cmd_buffer, > > > > > uint32_t plane = anv_image_aspect_to_plane(image->aspects, > > aspect); > > > > > enum isl_aux_usage aux_usage = image->planes[plane].aux_usage; > > > > > > > > > > - /* The resolve flag should updated to signify that > > > > fast-clear/compression > > > > > -* data needs to be removed when leaving the undefined layout. > > Such > > > > data > > > > > -* may need to be removed if it would cause accesses to the color > > > > buffer > > > > > -* to return incorrect data. The fast clear data in CCS_D buffers > > > > should > > > > > -* be removed because CCS_D isn't enabled all the time. 
> > > > > -*/ > > > > > - genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, > > > > > - aux_usage == ISL_AUX_USAGE_NONE); > > > > > - > > > > > /* The fast clear value dword(s) will be copied into a surface > > state > > > > object. > > > > > * Ensure that the restrictions of the fields in the dword(s) are > > > > followed. > > > > > * > > > > > @@ -677,10 +668,9 @@ transition_color_buffer(struct anv_cmd_buffer > > > > *cmd_buffer, > > > > >for (unsigned level = base_level; level < last_level_num; > > level++) > > > > > init_fast_clear_state_entry(cmd_buffer, image, aspect, > > level); > > > > > > > > > > - /* Initialize the aux buffers to enable correct rendering. > > This > > > > operation > > > > > - * requires up to two steps: one to rid the aux buffer of data > > > > that may > > > > > - * cause GPU hangs, and another to ensure that writes done > > > > without aux > > > > > - * will be visible to reads done with aux. > > > > > + /* Initialize the aux buffers to enable correct rendering. In > > > > order to > > > > > + * ensure that things such as storage images work correctly, > > aux > > > > buffers > > > > > + * are initialized to the pass-through state. > > > > > > > > Only CCS is initialized to the pass-through state while MCS is > > > > fast-cleared. We may also want to update the comment below since we're > > > > no longer fast-clearing CCS. > > > > > > > > > > Right. I've replaced this
Re: [Mesa-dev] [PATCH] mesa: remove usage of alloca in externalobjects.c
On 31 January 2018 at 17:30, Andres Rodriguez wrote:
> Don't want an overly large numBufferBarriers/numTextureBarriers to blow
> up the stack.
>
> Suggested-by: Emil Velikov
> Signed-off-by: Andres Rodriguez
> ---

Thanks for sorting this, Andres.

>  src/mesa/main/externalobjects.c | 14 ++
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/externalobjects.c b/src/mesa/main/externalobjects.c
> index 463debd268..6a248f35a6 100644
> --- a/src/mesa/main/externalobjects.c
> +++ b/src/mesa/main/externalobjects.c
> @@ -727,34 +727,37 @@ _mesa_WaitSemaphoreEXT(GLuint semaphore,
>
>     ASSERT_OUTSIDE_BEGIN_END(ctx);
>
>     semObj = _mesa_lookup_semaphore_object(ctx, semaphore);
>     if (!semObj)
>        return;
>
>     FLUSH_VERTICES(ctx, 0);
>     FLUSH_CURRENT(ctx, 0);
>
> -   bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers);
> +   bufObjs = malloc(sizeof(struct gl_buffer_object **) * numBufferBarriers);

Throughout the patch, please check for malloc failure and, if it fails,
call _mesa_error(OOM) and free.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: remove usage of alloca in externalobjects.c v2
Don't want an overly large numBufferBarriers/numTextureBarriers to blow up the stack. v2: handle malloc errors Suggested-by: Emil Velikov Signed-off-by: Andres Rodriguez --- src/mesa/main/externalobjects.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/externalobjects.c b/src/mesa/main/externalobjects.c index 463debd268..6a248f35a6 100644 --- a/src/mesa/main/externalobjects.c +++ b/src/mesa/main/externalobjects.c @@ -727,34 +727,37 @@ _mesa_WaitSemaphoreEXT(GLuint semaphore, ASSERT_OUTSIDE_BEGIN_END(ctx); semObj = _mesa_lookup_semaphore_object(ctx, semaphore); if (!semObj) return; FLUSH_VERTICES(ctx, 0); FLUSH_CURRENT(ctx, 0); - bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers); + bufObjs = malloc(sizeof(struct gl_buffer_object **) * numBufferBarriers); for (unsigned i = 0; i < numBufferBarriers; i++) { bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]); } - texObjs = alloca(sizeof(struct gl_texture_object **) * numTextureBarriers); + texObjs = malloc(sizeof(struct gl_texture_object **) * numTextureBarriers); for (unsigned i = 0; i < numTextureBarriers; i++) { texObjs[i] = _mesa_lookup_texture(ctx, textures[i]); } ctx->Driver.ServerWaitSemaphoreObject(ctx, semObj, numBufferBarriers, bufObjs, numTextureBarriers, texObjs, srcLayouts); + + free(bufObjs); + free(texObjs); } void GLAPIENTRY _mesa_SignalSemaphoreEXT(GLuint semaphore, GLuint numBufferBarriers, const GLuint *buffers, GLuint numTextureBarriers, const GLuint *textures, const GLenum *dstLayouts) { @@ -770,34 +773,37 @@ _mesa_SignalSemaphoreEXT(GLuint semaphore, ASSERT_OUTSIDE_BEGIN_END(ctx); semObj = _mesa_lookup_semaphore_object(ctx, semaphore); if (!semObj) return; FLUSH_VERTICES(ctx, 0); FLUSH_CURRENT(ctx, 0); - bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers); + bufObjs = malloc(sizeof(struct gl_buffer_object **) * numBufferBarriers); for (unsigned i = 0; i < numBufferBarriers; i++) { bufObjs[i] = 
_mesa_lookup_bufferobj(ctx, buffers[i]); } - texObjs = alloca(sizeof(struct gl_texture_object **) * numTextureBarriers); + texObjs = malloc(sizeof(struct gl_texture_object **) * numTextureBarriers); for (unsigned i = 0; i < numTextureBarriers; i++) { texObjs[i] = _mesa_lookup_texture(ctx, textures[i]); } ctx->Driver.ServerSignalSemaphoreObject(ctx, semObj, numBufferBarriers, bufObjs, numTextureBarriers, texObjs, dstLayouts); + + free(bufObjs); + free(texObjs); } void GLAPIENTRY _mesa_ImportMemoryFdEXT(GLuint memory, GLuint64 size, GLenum handleType, GLint fd) { GET_CURRENT_CONTEXT(ctx); -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen<8
Ugh, I had to read this change many times to understand it, but I do
think it makes sense. The comments in the code helped a lot too.

Reviewed-by: Rafael Antognolli

On Mon, Jan 29, 2018 at 06:25:30PM +0200, Andres Gomez wrote:
> The emission of vertex attributes corresponding to dvec3 and dvec4
> vertex shader input variables was not correct when the <size> passed
> to the VertexAttribL* commands was <= 2.
>
> In 61a8a55f557 ("i965/gen8: Fix vertex attrib upload for dvec3/4
> shader inputs"), for gen8+ we needed to determine if the attrib was
> dual slot to emit 128 or 256-bit, independently of the VAO size.
>
> Similarly, for gen < 8 we also need to determine whether the attrib is
> dual slot to force the emission of 256-bits through 2 uploads.
>
> Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this
> second upload to fill these unspecified components with zeros, as we
> also do for gen8+.
>
> Fixes the following test on Haswell:
> KHR-GL46.vertex_attrib_binding.basic-inputL-case1
>
> v2: Added more inline comments to explain why we are using
>     ISL_FORMAT_R32_FLOAT and its consequences, as requested by
>     Alejandro and Antía.
>
> Fixes: 75968a668e4 ("i965/gen7: expose OpenGL 4.2 on Haswell when
> supported")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006
> Cc: Alejandro Piñeiro
> Cc: Juan A.
Suarez Romero > Cc: Antia Puentes > Cc: Rafael Antognolli > Cc: Kenneth Graunke > Signed-off-by: Andres Gomez > Reviewed-by: Alejandro Piñeiro > Reviewed-by: Antia Puentes > --- > src/mesa/drivers/dri/i965/genX_state_upload.c | 32 > ++- > 1 file changed, 27 insertions(+), 5 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c > b/src/mesa/drivers/dri/i965/genX_state_upload.c > index aa4d64d08e2..a39a254dacd 100644 > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c > @@ -364,11 +364,15 @@ is_passthru_format(uint32_t format) > } > > UNUSED static int > -uploads_needed(uint32_t format) > +uploads_needed(uint32_t format, > +bool is_dual_slot) > { > if (!is_passthru_format(format)) >return 1; > > + if (is_dual_slot) > + return 2; > + > switch (format) { > case ISL_FORMAT_R64_PASSTHRU: > case ISL_FORMAT_R64G64_PASSTHRU: > @@ -397,11 +401,19 @@ downsize_format_if_needed(uint32_t format, > if (!is_passthru_format(format)) >return format; > > + /* ISL_FORMAT_R64_PASSTHRU and ISL_FORMAT_R64G64_PASSTHRU with an upload > == > +* 1 means that we have been forced to do 2 uploads for a size <= 2. This > +* happens with gen < 8 and dvec3 or dvec4 vertex shader input > +* variables. In those cases, we return ISL_FORMAT_R32_FLOAT as a way of > +* flagging that we want to fill with zeroes this second forced upload. > +*/ > switch (format) { > case ISL_FORMAT_R64_PASSTHRU: > - return ISL_FORMAT_R32G32_FLOAT; > + return !upload ? ISL_FORMAT_R32G32_FLOAT > + : ISL_FORMAT_R32_FLOAT; > case ISL_FORMAT_R64G64_PASSTHRU: > - return ISL_FORMAT_R32G32B32A32_FLOAT; > + return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT > + : ISL_FORMAT_R32_FLOAT; >> case ISL_FORMAT_R64G64B64_PASSTHRU: >return !upload ? 
ISL_FORMAT_R32G32B32A32_FLOAT > : ISL_FORMAT_R32G32_FLOAT; > @@ -420,6 +432,15 @@ static int > upload_format_size(uint32_t upload_format) > { > switch (upload_format) { > + case ISL_FORMAT_R32_FLOAT: > + > + /* downsized_format has returned this one in order to flag that we are > + * performing a second upload which we want to have filled with > + * zeroes. This happens with gen < 8, a size <= 2, and dvec3 or dvec4 > + * vertex shader input variables. > + */ > + > + return 0; > case ISL_FORMAT_R32G32_FLOAT: >return 2; > case ISL_FORMAT_R32G32B32A32_FLOAT: > @@ -517,7 +538,7 @@ genX(emit_vertices)(struct brw_context *brw) >struct brw_vertex_element *input = brw->vb.enabled[i]; >uint32_t format = brw_get_vertex_surface_type(brw, input->glarray); > > - if (uploads_needed(format) > 1) > + if (uploads_needed(format, input->is_dual_slot) > 1) > nr_elements++; > } > #endif > @@ -613,7 +634,8 @@ genX(emit_vertices)(struct brw_context *brw) >uint32_t comp1 = VFCOMP_STORE_SRC; >uint32_t comp2 = VFCOMP_STORE_SRC; >uint32_t comp3 = VFCOMP_STORE_SRC; > - const unsigned num_uploads = GEN_GEN < 8 ? uploads_needed(format) : 1; > + const unsigned num_uploads = GEN_GEN < 8 ? > + uploads_needed(format, input->is_dual_slot) : 1; > > #if GEN_GEN >= 8 >/* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE): > -- > 2.11.0 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/
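For readers who, like the reviewer, needed a few passes over this one: below is a minimal standalone sketch of the upload-count and zero-fill logic as I read it from the diff. Everything prefixed sketch_ or SKETCH_ is a hypothetical stand-in, not the i965 code. The enum does not use the real ISL_FORMAT_* values, and the results for the three- and four-component formats in the non-dual-slot path, as well as the second-upload size for the four-component format, are assumptions because the quoted hunks do not show those cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the ISL_FORMAT_*_PASSTHRU formats. */
enum sketch_format {
   SKETCH_FMT_OTHER,                 /* anything that is not a 64-bit passthrough */
   SKETCH_FMT_R64_PASSTHRU,
   SKETCH_FMT_R64G64_PASSTHRU,
   SKETCH_FMT_R64G64B64_PASSTHRU,
   SKETCH_FMT_R64G64B64A64_PASSTHRU,
};

static bool
sketch_is_passthru(enum sketch_format format)
{
   return format != SKETCH_FMT_OTHER;
}

/* How many vertex-element uploads an attribute needs on gen < 8. */
static int
sketch_uploads_needed(enum sketch_format format, bool is_dual_slot)
{
   if (!sketch_is_passthru(format))
      return 1;

   /* A dvec3/dvec4 shader input always takes two slots, even when the
    * application supplied only one or two components.
    */
   if (is_dual_slot)
      return 2;

   switch (format) {
   case SKETCH_FMT_R64_PASSTHRU:
   case SKETCH_FMT_R64G64_PASSTHRU:
      return 1;
   default:
      return 2;   /* assumption: the 3- and 4-component formats need two */
   }
}

/* Number of 32-bit components fetched from the vertex buffer by upload 0
 * or 1. A result of 0 models the ISL_FORMAT_R32_FLOAT flag in the patch:
 * the second upload fetches nothing and is filled with zeroes.
 */
static int
sketch_upload_size(enum sketch_format format, int upload)
{
   assert(upload == 0 || upload == 1);

   switch (format) {
   case SKETCH_FMT_R64_PASSTHRU:
      return upload == 0 ? 2 : 0;
   case SKETCH_FMT_R64G64_PASSTHRU:
      return upload == 0 ? 4 : 0;
   case SKETCH_FMT_R64G64B64_PASSTHRU:
      return upload == 0 ? 4 : 2;
   case SKETCH_FMT_R64G64B64A64_PASSTHRU:
      return 4;   /* assumption: not shown in the quoted hunk */
   default:
      return 0;   /* non-passthrough formats are not modelled here */
   }
}
```

The point of the sketch is the second early return: a dual-slot input forces two uploads regardless of the format, and the size of 0 on the second upload is what tells the emitter to store zeroes instead of reading past the supplied data.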
Re: [Mesa-dev] [PATCH] mesa: remove usage of alloca in externalobjects.c v2
On 2018-01-31 01:25 PM, Andres Rodriguez wrote: Don't want an overly large numBufferBarriers/numTextureBarriers to blow up the stack. v2: handle malloc errors Someone forgot to update his patch correctly before sending it out... Suggested-by: Emil Velikov Signed-off-by: Andres Rodriguez --- src/mesa/main/externalobjects.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/externalobjects.c b/src/mesa/main/externalobjects.c index 463debd268..6a248f35a6 100644 --- a/src/mesa/main/externalobjects.c +++ b/src/mesa/main/externalobjects.c @@ -727,34 +727,37 @@ _mesa_WaitSemaphoreEXT(GLuint semaphore, ASSERT_OUTSIDE_BEGIN_END(ctx); semObj = _mesa_lookup_semaphore_object(ctx, semaphore); if (!semObj) return; FLUSH_VERTICES(ctx, 0); FLUSH_CURRENT(ctx, 0); - bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers); + bufObjs = malloc(sizeof(struct gl_buffer_object **) * numBufferBarriers); for (unsigned i = 0; i < numBufferBarriers; i++) { bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]); } - texObjs = alloca(sizeof(struct gl_texture_object **) * numTextureBarriers); + texObjs = malloc(sizeof(struct gl_texture_object **) * numTextureBarriers); for (unsigned i = 0; i < numTextureBarriers; i++) { texObjs[i] = _mesa_lookup_texture(ctx, textures[i]); } ctx->Driver.ServerWaitSemaphoreObject(ctx, semObj, numBufferBarriers, bufObjs, numTextureBarriers, texObjs, srcLayouts); + + free(bufObjs); + free(texObjs); } void GLAPIENTRY _mesa_SignalSemaphoreEXT(GLuint semaphore, GLuint numBufferBarriers, const GLuint *buffers, GLuint numTextureBarriers, const GLuint *textures, const GLenum *dstLayouts) { @@ -770,34 +773,37 @@ _mesa_SignalSemaphoreEXT(GLuint semaphore, ASSERT_OUTSIDE_BEGIN_END(ctx); semObj = _mesa_lookup_semaphore_object(ctx, semaphore); if (!semObj) return; FLUSH_VERTICES(ctx, 0); FLUSH_CURRENT(ctx, 0); - bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers); + bufObjs = malloc(sizeof(struct 
gl_buffer_object **) * numBufferBarriers); for (unsigned i = 0; i < numBufferBarriers; i++) { bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]); } - texObjs = alloca(sizeof(struct gl_texture_object **) * numTextureBarriers); + texObjs = malloc(sizeof(struct gl_texture_object **) * numTextureBarriers); for (unsigned i = 0; i < numTextureBarriers; i++) { texObjs[i] = _mesa_lookup_texture(ctx, textures[i]); } ctx->Driver.ServerSignalSemaphoreObject(ctx, semObj, numBufferBarriers, bufObjs, numTextureBarriers, texObjs, dstLayouts); + + free(bufObjs); + free(texObjs); } void GLAPIENTRY _mesa_ImportMemoryFdEXT(GLuint memory, GLuint64 size, GLenum handleType, GLint fd) { GET_CURRENT_CONTEXT(ctx); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: remove usage of alloca in externalobjects.c v3
Don't want an overly large numBufferBarriers/numTextureBarriers to blow up the stack. v2: handle malloc errors v3: fix patch Suggested-by: Emil Velikov Signed-off-by: Andres Rodriguez --- src/mesa/main/externalobjects.c | 48 +++-- 1 file changed, 42 insertions(+), 6 deletions(-) diff --git a/src/mesa/main/externalobjects.c b/src/mesa/main/externalobjects.c index 463debd268..a28d6dba6f 100644 --- a/src/mesa/main/externalobjects.c +++ b/src/mesa/main/externalobjects.c @@ -713,91 +713,127 @@ _mesa_WaitSemaphoreEXT(GLuint semaphore, const GLuint *buffers, GLuint numTextureBarriers, const GLuint *textures, const GLenum *srcLayouts) { GET_CURRENT_CONTEXT(ctx); struct gl_semaphore_object *semObj; struct gl_buffer_object **bufObjs; struct gl_texture_object **texObjs; + const char *func = "glWaitSemaphoreEXT"; + if (!ctx->Extensions.EXT_semaphore) { - _mesa_error(ctx, GL_INVALID_OPERATION, "glWaitSemaphoreEXT(unsupported)"); + _mesa_error(ctx, GL_INVALID_OPERATION, "%s(unsupported)", func); return; } ASSERT_OUTSIDE_BEGIN_END(ctx); semObj = _mesa_lookup_semaphore_object(ctx, semaphore); if (!semObj) return; FLUSH_VERTICES(ctx, 0); FLUSH_CURRENT(ctx, 0); - bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers); + bufObjs = malloc(sizeof(struct gl_buffer_object **) * numBufferBarriers); + if (!bufObjs) { + _mesa_error(ctx, GL_OUT_OF_MEMORY, "%s(numBufferBarriers=%u)", + func, numBufferBarriers); + goto end; + } + for (unsigned i = 0; i < numBufferBarriers; i++) { bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]); } - texObjs = alloca(sizeof(struct gl_texture_object **) * numTextureBarriers); + texObjs = malloc(sizeof(struct gl_texture_object **) * numTextureBarriers); + if (!texObjs) { + _mesa_error(ctx, GL_OUT_OF_MEMORY, "%s(numTextureBarriers=%u)", + func, numTextureBarriers); + goto end; + } + for (unsigned i = 0; i < numTextureBarriers; i++) { texObjs[i] = _mesa_lookup_texture(ctx, textures[i]); } ctx->Driver.ServerWaitSemaphoreObject(ctx, semObj, 
numBufferBarriers, bufObjs, numTextureBarriers, texObjs, srcLayouts); + +end: + free(bufObjs); + free(texObjs); } void GLAPIENTRY _mesa_SignalSemaphoreEXT(GLuint semaphore, GLuint numBufferBarriers, const GLuint *buffers, GLuint numTextureBarriers, const GLuint *textures, const GLenum *dstLayouts) { GET_CURRENT_CONTEXT(ctx); struct gl_semaphore_object *semObj; struct gl_buffer_object **bufObjs; struct gl_texture_object **texObjs; + const char *func = "glSignalSemaphoreEXT"; + if (!ctx->Extensions.EXT_semaphore) { - _mesa_error(ctx, GL_INVALID_OPERATION, "glSignalSemaphoreEXT(unsupported)"); + _mesa_error(ctx, GL_INVALID_OPERATION, "%s(unsupported)", func); return; } ASSERT_OUTSIDE_BEGIN_END(ctx); semObj = _mesa_lookup_semaphore_object(ctx, semaphore); if (!semObj) return; FLUSH_VERTICES(ctx, 0); FLUSH_CURRENT(ctx, 0); - bufObjs = alloca(sizeof(struct gl_buffer_object **) * numBufferBarriers); + bufObjs = malloc(sizeof(struct gl_buffer_object **) * numBufferBarriers); + if (!bufObjs) { + _mesa_error(ctx, GL_OUT_OF_MEMORY, "%s(numBufferBarriers=%u)", + func, numBufferBarriers); + goto end; + } + for (unsigned i = 0; i < numBufferBarriers; i++) { bufObjs[i] = _mesa_lookup_bufferobj(ctx, buffers[i]); } - texObjs = alloca(sizeof(struct gl_texture_object **) * numTextureBarriers); + texObjs = malloc(sizeof(struct gl_texture_object **) * numTextureBarriers); + if (!texObjs) { + _mesa_error(ctx, GL_OUT_OF_MEMORY, "%s(numTextureBarriers=%u)", + func, numTextureBarriers); + goto end; + } + for (unsigned i = 0; i < numTextureBarriers; i++) { texObjs[i] = _mesa_lookup_texture(ctx, textures[i]); } ctx->Driver.ServerSignalSemaphoreObject(ctx, semObj, numBufferBarriers, bufObjs, numTextureBarriers, texObjs, dstLayouts); + +end: + free(bufObjs); + free(texObjs); } void GLAPIENTRY _mesa_ImportMemoryFdEXT(GLuint memory, GLuint64 size, GLenum handleType, GLint fd) { GET_CURRENT_CONTEXT(ctx); -- 2.14.1 ___ mesa-dev mailing list mesa-dev@
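As a reading aid for the v3 approach, here is a minimal standalone sketch of the "malloc, check, goto a shared cleanup label" pattern. It is not the Mesa code: sketch_object, sketch_lookup and sketch_resolve are hypothetical stand-ins for the GL object types and lookup helpers. The sketch makes two choices that the posted diff does not visibly make and that may be worth checking in review: both pointers start out as NULL so the cleanup label can free a pointer that was never allocated, and a NULL return is only treated as failure when the requested count is non-zero, since malloc(0) is allowed to return NULL.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for gl_buffer_object / gl_texture_object. */
struct sketch_object {
   unsigned name;
};

static struct sketch_object sketch_table[] = { {1}, {2}, {3}, {4} };

/* Hypothetical stand-in for _mesa_lookup_bufferobj() and friends. */
static struct sketch_object *
sketch_lookup(unsigned name)
{
   for (size_t i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++) {
      if (sketch_table[i].name == name)
         return &sketch_table[i];
   }
   return NULL;
}

/* Resolves two lists of names into heap-allocated pointer arrays.
 * Returns the number of names that resolved, or -1 if an allocation failed.
 */
static int
sketch_resolve(unsigned num_a, const unsigned *names_a,
               unsigned num_b, const unsigned *names_b)
{
   /* Start as NULL so that free() at the end label is always safe. */
   struct sketch_object **objs_a = NULL;
   struct sketch_object **objs_b = NULL;
   int resolved = -1;

   objs_a = malloc(sizeof(*objs_a) * num_a);
   if (!objs_a && num_a > 0)
      goto end;

   objs_b = malloc(sizeof(*objs_b) * num_b);
   if (!objs_b && num_b > 0)
      goto end;

   resolved = 0;
   for (unsigned i = 0; i < num_a; i++) {
      objs_a[i] = sketch_lookup(names_a[i]);
      resolved += objs_a[i] != NULL;
   }
   for (unsigned i = 0; i < num_b; i++) {
      objs_b[i] = sketch_lookup(names_b[i]);
      resolved += objs_b[i] != NULL;
   }

end:
   free(objs_a);   /* free(NULL) is a no-op */
   free(objs_b);
   return resolved;
}
```

Using sizeof(*objs_a) for the element size keeps the allocation tied to the pointer's own type, which avoids having to spell out the right level of indirection by hand.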
[Mesa-dev] [PATCH 1/2] nir: add nir_opt_shrink_load pass
This is a very simple pass that just shrinks load_push_constant
intrinsics when some components are unused. For now, it can just
shrink vec4 to vec3, vec3 to vec2 and so on.

Signed-off-by: Samuel Pitoiset
---
 src/compiler/Makefile.sources                   |  1 +
 src/compiler/nir/meson.build                    |  1 +
 src/compiler/nir/nir.h                          |  2 +
 src/compiler/nir/nir_opt_shrink_load_constant.c | 67 +
 4 files changed, 71 insertions(+)
 create mode 100644 src/compiler/nir/nir_opt_shrink_load_constant.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index d3f746f5f9..84dbc26d19 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -268,6 +268,7 @@ NIR_FILES = \
 	nir/nir_opt_move_comparisons.c \
 	nir/nir_opt_peephole_select.c \
 	nir/nir_opt_remove_phis.c \
+	nir/nir_opt_shrink_load.c \
 	nir/nir_opt_trivial_continues.c \
 	nir/nir_opt_undef.c \
 	nir/nir_phi_builder.c \
diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build
index b5f27ad667..859a0c1e62 100644
--- a/src/compiler/nir/meson.build
+++ b/src/compiler/nir/meson.build
@@ -162,6 +162,7 @@ files_libnir = files(
   'nir_opt_move_comparisons.c',
   'nir_opt_peephole_select.c',
   'nir_opt_remove_phis.c',
+  'nir_opt_shrink_load.c',
   'nir_opt_trivial_continues.c',
   'nir_opt_undef.c',
   'nir_phi_builder.c',
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 9ab2769e06..5ea8c9926b 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2770,6 +2770,8 @@ bool nir_opt_peephole_select(nir_shader *shader, unsigned limit);
 
 bool nir_opt_remove_phis(nir_shader *shader);
 
+bool nir_opt_shrink_load(nir_shader *shader);
+
 bool nir_opt_trivial_continues(nir_shader *shader);
 
 bool nir_opt_undef(nir_shader *shader);
diff --git a/src/compiler/nir/nir_opt_shrink_load_constant.c b/src/compiler/nir/nir_opt_shrink_load_constant.c
new file mode 100644
index 00..f97b7f9b67
--- /dev/null
+++ b/src/compiler/nir/nir_opt_shrink_load_constant.c
@@ -0,0 +1,67 @@
+/*
+ * Copyright © 2018 Valve Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+
+static bool
+opt_shrink_load(nir_intrinsic_instr *instr)
+{
+   bool progress = false;
+
+   if (instr->intrinsic == nir_intrinsic_load_push_constant) {
+      unsigned mask = nir_ssa_def_components_read(&instr->dest.ssa);
+
+      if (instr->num_components > util_last_bit(mask)) {
+         instr->num_components = util_last_bit(mask);
+         instr->dest.ssa.num_components = instr->num_components;
+         progress = true;
+      }
+   }
+
+   return progress;
+}
+
+bool
+nir_opt_shrink_load(nir_shader *shader)
+{
+   bool progress = false;
+
+   nir_foreach_function(function, shader) {
+      if (!function->impl)
+         continue;
+
+      nir_foreach_block(block, function->impl) {
+         nir_foreach_instr(instr, block) {
+            if (instr->type != nir_instr_type_intrinsic)
+               continue;
+
+            progress |= opt_shrink_load(nir_instr_as_intrinsic(instr));
+         }
+      }
+
+      nir_metadata_preserve(function->impl, nir_metadata_block_index |
+                                            nir_metadata_dominance);
+   }
+
+   return progress;
+}
-- 
2.16.1
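To make the shrinking rule concrete, here is a small standalone sketch of the arithmetic the pass relies on. It does not use NIR: sketch_last_bit is a stand-in that I am assuming matches util_last_bit (1-based index of the highest set bit, 0 for an empty mask), and sketch_shrunk_components is a hypothetical helper that only computes the new component count from a "components read" mask.

```c
#include <assert.h>

/* Stand-in for util_last_bit(): 1-based position of the highest set bit,
 * or 0 when no bit is set.
 */
static unsigned
sketch_last_bit(unsigned mask)
{
   unsigned n = 0;

   while (mask) {
      n++;
      mask >>= 1;
   }
   return n;
}

/* Given the current vector width and the mask of components that are
 * actually read (bit 0 = x, bit 1 = y, ...), return the width the load
 * can be shrunk to. Only trailing components can be dropped.
 */
static unsigned
sketch_shrunk_components(unsigned num_components, unsigned read_mask)
{
   assert(num_components >= 1 && num_components <= 4);

   unsigned needed = sketch_last_bit(read_mask);

   return needed < num_components ? needed : num_components;
}
```

With a vec4 load, a read mask of 0x3 (x and y) shrinks to 2 components, while 0x5 (x and z) only shrinks to 3 because y sits below the highest component read. A mask of 0 yields 0, so a load whose result is never read would need to be handled by whoever consumes the result (dead-code elimination would normally have removed it).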
[Mesa-dev] [PATCH 2/2] radv: run nir_opt_shrink_load
LLVM can't shrink loads.

Totals from affected shaders:
SGPRS: 62528 -> 59955 (-4.11 %)
VGPRS: 44708 -> 44616 (-0.21 %)
Spilled SGPRs: 16 -> 8 (-50.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1355528 -> 1355172 (-0.03 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 11710 -> 11670 (-0.34 %)
Wait states: 0 -> 0 (0.00 %)

This reduces SGPR spilling in MadMax, and it also reduces the number of
SGPRs in DOW3 and F12017. The number of waves slightly decreases in F1
but I don't see any performance changes after benchmarking it. Talos and
Serious Sam are not affected because they don't use any push constants.

Note that we could just do the same optimization directly in
visit_load_push_constant(), but I think it's better to move that pass
into common code in case other drivers want to use it.

Signed-off-by: Samuel Pitoiset
---
 src/amd/vulkan/radv_shader.c | 1 +
 .../nir/{nir_opt_shrink_load_constant.c => nir_opt_shrink_load.c} | 0
 2 files changed, 1 insertion(+)
 rename src/compiler/nir/{nir_opt_shrink_load_constant.c => nir_opt_shrink_load.c} (100%)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index af094e6220..ad68873055 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -147,6 +147,7 @@ radv_optimize_nir(struct nir_shader *shader)
                 if (shader->options->max_unroll_iterations) {
                         NIR_PASS(progress, shader, nir_opt_loop_unroll, 0);
                 }
+                NIR_PASS(progress, shader, nir_opt_shrink_load);
         } while (progress);
 }
 
diff --git a/src/compiler/nir/nir_opt_shrink_load_constant.c b/src/compiler/nir/nir_opt_shrink_load.c
similarity index 100%
rename from src/compiler/nir/nir_opt_shrink_load_constant.c
rename to src/compiler/nir/nir_opt_shrink_load.c
-- 
2.16.1
[Mesa-dev] [PATCH] i965: Bump official kernel requirement to Linux v3.9.
In commit 3f353342a6b6744773c26ed66b12afed42bd57af (present in 17.3.0)
we started unconditionally using I915_EXEC_NO_RELOC, which was
introduced in Linux v3.9. ChromeOS kernel 3.8 has backported this, so
it should work too.

Running on older kernels would likely result in every single batch
being rejected by the kernel, which is pretty catastrophic. Yet, it
appears that nobody noticed. So, let's just bump the official
requirement and move forward ever so slowly.

Fixes: 3f353342a6b ("i965: Use I915_EXEC_NO_RELOC")
---
 src/mesa/drivers/dri/i965/intel_screen.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index e1e520bc899..8c78b73b640 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1779,8 +1779,8 @@ intel_init_bufmgr(struct intel_screen *screen)
       return false;
    }
 
-   if (!intel_get_boolean(screen, I915_PARAM_HAS_WAIT_TIMEOUT)) {
-      fprintf(stderr, "[%s: %u] Kernel 3.6 required.\n", __func__, __LINE__);
+   if (!intel_get_boolean(screen, I915_PARAM_HAS_EXEC_NO_RELOC)) {
+      fprintf(stderr, "[%s: %u] Kernel 3.9 required.\n", __func__, __LINE__);
       return false;
    }
 
-- 
2.15.1
[Mesa-dev] [PATCH] mesa: don't flag _NEW_COLOR for KHR adv.blend if prog constant doesn't change
From: Marek Olšák This only affects drivers that set DriverFlags.NewBlend. v2: - fix typo advanded -> advanced - return "enum gl_advanced_blend_mode" from _mesa_get_advanced_blend_sh_constant - don't call FLUSH_VERTICES twice --- src/mesa/main/blend.c | 6 -- src/mesa/main/blend.h | 43 +++ src/mesa/main/enable.c| 14 + src/mesa/program/prog_statevars.c | 3 ++- 4 files changed, 51 insertions(+), 15 deletions(-) diff --git a/src/mesa/main/blend.c b/src/mesa/main/blend.c index 6b379f2..ec8e27e 100644 --- a/src/mesa/main/blend.c +++ b/src/mesa/main/blend.c @@ -528,21 +528,22 @@ _mesa_BlendEquation( GLenum mode ) if (!changed) return; if (!legal_simple_blend_equation(ctx, mode) && !advanced_mode) { _mesa_error(ctx, GL_INVALID_ENUM, "glBlendEquation"); return; } - _mesa_flush_vertices_for_blend_state(ctx); + _mesa_flush_vertices_for_blend_adv(ctx, ctx->Color.BlendEnabled, + advanced_mode); for (buf = 0; buf < numBuffers; buf++) { ctx->Color.Blend[buf].EquationRGB = mode; ctx->Color.Blend[buf].EquationA = mode; } ctx->Color._BlendEquationPerBuffer = GL_FALSE; ctx->Color._AdvancedBlendMode = advanced_mode; if (ctx->Driver.BlendEquationSeparate) ctx->Driver.BlendEquationSeparate(ctx, mode, mode); @@ -553,21 +554,22 @@ _mesa_BlendEquation( GLenum mode ) * Set blend equation for one color buffer/target. 
*/ static void blend_equationi(struct gl_context *ctx, GLuint buf, GLenum mode, enum gl_advanced_blend_mode advanced_mode) { if (ctx->Color.Blend[buf].EquationRGB == mode && ctx->Color.Blend[buf].EquationA == mode) return; /* no change */ - _mesa_flush_vertices_for_blend_state(ctx); + _mesa_flush_vertices_for_blend_adv(ctx, ctx->Color.BlendEnabled, + advanced_mode); ctx->Color.Blend[buf].EquationRGB = mode; ctx->Color.Blend[buf].EquationA = mode; ctx->Color._BlendEquationPerBuffer = GL_TRUE; if (buf == 0) ctx->Color._AdvancedBlendMode = advanced_mode; } void GLAPIENTRY diff --git a/src/mesa/main/blend.h b/src/mesa/main/blend.h index 2454e0c..c95bc57 100644 --- a/src/mesa/main/blend.h +++ b/src/mesa/main/blend.h @@ -147,28 +147,55 @@ extern void _mesa_update_clamp_vertex_color(struct gl_context *ctx, const struct gl_framebuffer *drawFb); extern mesa_format _mesa_get_render_format(const struct gl_context *ctx, mesa_format format); extern void _mesa_init_color( struct gl_context * ctx ); +static inline enum gl_advanced_blend_mode +_mesa_get_advanced_blend_sh_constant(GLbitfield blend_enabled, + enum gl_advanced_blend_mode mode) +{ + return blend_enabled ? mode : BLEND_NONE; +} + +static inline bool +_mesa_advanded_blend_sh_constant_changed(struct gl_context *ctx, + GLbitfield new_blend_enabled, + enum gl_advanced_blend_mode new_mode) +{ + return _mesa_get_advanced_blend_sh_constant(new_blend_enabled, new_mode) != + _mesa_get_advanced_blend_sh_constant(ctx->Color.BlendEnabled, + ctx->Color._AdvancedBlendMode); +} + static inline void _mesa_flush_vertices_for_blend_state(struct gl_context *ctx) { - /* The advanced blend mode needs _NEW_COLOR to update the state constant, -* so we have to set it. This is inefficient. -* This should only be done for states that affect the state constant. -* It shouldn't be done for other blend states. 
-*/ - if (_mesa_has_KHR_blend_equation_advanced(ctx) || - !ctx->DriverFlags.NewBlend) { + if (!ctx->DriverFlags.NewBlend) { FLUSH_VERTICES(ctx, _NEW_COLOR); } else { FLUSH_VERTICES(ctx, 0); + ctx->NewDriverState |= ctx->DriverFlags.NewBlend; + } +} + +static inline void +_mesa_flush_vertices_for_blend_adv(struct gl_context *ctx, + GLbitfield new_blend_enabled, + enum gl_advanced_blend_mode new_mode) +{ + /* The advanced blend mode needs _NEW_COLOR to update the state constant. */ + if (_mesa_has_KHR_blend_equation_advanced(ctx) && + _mesa_advanded_blend_sh_constant_changed(ctx, new_blend_enabled, +new_mode)) { + FLUSH_VERTICES(ctx, _NEW_COLOR); + ctx->NewDriverState |= ctx->DriverFlags.NewBlend; + return; } - ctx->NewDriverState |= ctx->DriverFlags.NewBlend; + _mesa_flush_vertices_for_blend_state(ctx); } #endif diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c index 0b3de52..f733589 100644 --- a/src/mesa/main/enable.c +++ b/src/mesa/main/enable.c @@ -317,21 +317,22 @@ _mesa_set_enable(struct gl_context *ctx, GL
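The idea behind the two new inline helpers can be shown without any GL context. The following is a minimal sketch, assuming only what the hunk above shows: the shader-visible constant is the advanced blend mode when any blending is enabled and "none" otherwise, so _NEW_COLOR is only worth flagging when that derived value changes. The enum values and function names are hypothetical stand-ins for gl_advanced_blend_mode and the _mesa_* helpers.

```c
#include <stdbool.h>

/* Hypothetical stand-in for enum gl_advanced_blend_mode. */
enum sketch_blend_mode {
   SKETCH_BLEND_NONE = 0,
   SKETCH_BLEND_MULTIPLY,
   SKETCH_BLEND_SCREEN,
};

/* The value the shader sees: the mode only matters while blending is on. */
static enum sketch_blend_mode
sketch_sh_constant(unsigned blend_enabled, enum sketch_blend_mode mode)
{
   return blend_enabled ? mode : SKETCH_BLEND_NONE;
}

/* True only if the shader-visible constant would differ after the update. */
static bool
sketch_sh_constant_changed(unsigned old_enabled, enum sketch_blend_mode old_mode,
                           unsigned new_enabled, enum sketch_blend_mode new_mode)
{
   return sketch_sh_constant(new_enabled, new_mode) !=
          sketch_sh_constant(old_enabled, old_mode);
}
```

For example, switching from multiply to screen while blending is disabled leaves the constant at "none", so no state-constant update is needed; the same switch with blending enabled does change it.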
[Mesa-dev] [PATCH 3/5] st/mesa: don't translate blend state when it's disabled for a colorbuffer
From: Marek Olšák

---
 src/mesa/state_tracker/st_atom_blend.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c b/src/mesa/state_tracker/st_atom_blend.c
index 8f644ba..62042a6 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -154,22 +154,24 @@ st_update_blend( struct st_context *st )
       blend->independent_blend_enable = 1;
    }
 
    if (ctx->Color.ColorLogicOpEnabled) {
       /* logicop enabled */
       blend->logicop_enable = 1;
       blend->logicop_func = ctx->Color._LogicOp;
    }
    else if (ctx->Color.BlendEnabled && !ctx->Color._AdvancedBlendMode) {
       /* blending enabled */
       for (i = 0; i < num_state; i++) {
-         blend->rt[i].blend_enable = (ctx->Color.BlendEnabled >> i) & 0x1;
+         if (!(ctx->Color.BlendEnabled & (1 << i)))
+            continue;
 
+         blend->rt[i].blend_enable = 1;
          blend->rt[i].rgb_func =
             translate_blend(ctx->Color.Blend[i].EquationRGB);
 
          if (ctx->Color.Blend[i].EquationRGB == GL_MIN ||
              ctx->Color.Blend[i].EquationRGB == GL_MAX) {
             /* Min/max are special */
             blend->rt[i].rgb_src_factor = PIPE_BLENDFACTOR_ONE;
             blend->rt[i].rgb_dst_factor = PIPE_BLENDFACTOR_ONE;
          }
          else {
-- 
2.7.4
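A minimal sketch of the per-buffer skip, for anyone who wants to see the effect in isolation. It assumes, as the diff suggests, that the blend state was zeroed beforehand, so a disabled buffer simply keeps all-zero state instead of having its equation translated. sketch_rt and sketch_translate_rt are hypothetical stand-ins, and rgb_func is a plain int rather than a real PIPE_BLEND_* value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for one pipe_rt_blend_state entry. */
struct sketch_rt {
   bool blend_enable;
   int rgb_func;
};

/* Translate the state for colorbuffer i. blend_enabled is a bitmask with
 * one bit per colorbuffer, as ctx->Color.BlendEnabled is in the patch.
 */
static struct sketch_rt
sketch_translate_rt(uint32_t blend_enabled, unsigned i, int rgb_func)
{
   struct sketch_rt rt = { false, 0 };   /* mirrors the earlier memset to 0 */

   assert(i < 32);

   /* Blending is off for this buffer: leave it zeroed, skip translation. */
   if (!(blend_enabled & (UINT32_C(1) << i)))
      return rt;

   rt.blend_enable = true;
   rt.rgb_func = rgb_func;
   return rt;
}
```

With blend_enabled = 0x5, buffers 0 and 2 get their function translated and buffer 1 stays zeroed, which is the behavior the `continue` in the patch produces.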
[Mesa-dev] [PATCH 2/5] st/mesa: don't check for ARB_draw_buffers_blend in update_blend
From: Marek Olšák

If the GL API is missing, different blend functions can't be set through GL.
---
 src/mesa/state_tracker/st_atom_blend.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c b/src/mesa/state_tracker/st_atom_blend.c
index f7327d6..8f644ba 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -138,72 +138,68 @@ blend_per_rt(const struct gl_context *ctx)
    }
    return GL_FALSE;
 }
 
 void
 st_update_blend( struct st_context *st )
 {
    struct pipe_blend_state *blend = &st->state.blend;
    const struct gl_context *ctx = st->ctx;
    unsigned num_state = 1;
-   unsigned i, j;
+   unsigned i;
 
    memset(blend, 0, sizeof(*blend));
 
    if (blend_per_rt(ctx) || colormask_per_rt(ctx)) {
       num_state = ctx->Const.MaxDrawBuffers;
       blend->independent_blend_enable = 1;
    }
    if (ctx->Color.ColorLogicOpEnabled) {
       /* logicop enabled */
       blend->logicop_enable = 1;
       blend->logicop_func = ctx->Color._LogicOp;
    }
    else if (ctx->Color.BlendEnabled && !ctx->Color._AdvancedBlendMode) {
       /* blending enabled */
-      for (i = 0, j = 0; i < num_state; i++) {
-
+      for (i = 0; i < num_state; i++) {
          blend->rt[i].blend_enable = (ctx->Color.BlendEnabled >> i) & 0x1;
 
-         if (ctx->Extensions.ARB_draw_buffers_blend)
-            j = i;
-
          blend->rt[i].rgb_func =
-            translate_blend(ctx->Color.Blend[j].EquationRGB);
+            translate_blend(ctx->Color.Blend[i].EquationRGB);
 
          if (ctx->Color.Blend[i].EquationRGB == GL_MIN ||
              ctx->Color.Blend[i].EquationRGB == GL_MAX) {
             /* Min/max are special */
             blend->rt[i].rgb_src_factor = PIPE_BLENDFACTOR_ONE;
             blend->rt[i].rgb_dst_factor = PIPE_BLENDFACTOR_ONE;
          }
          else {
             blend->rt[i].rgb_src_factor =
-               translate_blend(ctx->Color.Blend[j].SrcRGB);
+               translate_blend(ctx->Color.Blend[i].SrcRGB);
             blend->rt[i].rgb_dst_factor =
-               translate_blend(ctx->Color.Blend[j].DstRGB);
+               translate_blend(ctx->Color.Blend[i].DstRGB);
          }
 
          blend->rt[i].alpha_func =
-            translate_blend(ctx->Color.Blend[j].EquationA);
+            translate_blend(ctx->Color.Blend[i].EquationA);
 
          if (ctx->Color.Blend[i].EquationA == GL_MIN ||
              ctx->Color.Blend[i].EquationA == GL_MAX) {
             /* Min/max are special */
             blend->rt[i].alpha_src_factor = PIPE_BLENDFACTOR_ONE;
             blend->rt[i].alpha_dst_factor = PIPE_BLENDFACTOR_ONE;
          }
          else {
             blend->rt[i].alpha_src_factor =
-               translate_blend(ctx->Color.Blend[j].SrcA);
+               translate_blend(ctx->Color.Blend[i].SrcA);
             blend->rt[i].alpha_dst_factor =
-               translate_blend(ctx->Color.Blend[j].DstA);
+               translate_blend(ctx->Color.Blend[i].DstA);
          }
       }
    }
    else {
       /* no blending / logicop */
    }
 
    for (i = 0; i < num_state; i++)
       blend->rt[i].colormask = GET_COLORMASK(ctx->Color.ColorMask, i);
 
-- 
2.7.4

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/5] st/mesa: generate blend state according to the number of enabled color buffers
From: Marek Olšák Non-MRT cases always translate blend state for 1 color buffer only. MRT cases only check and translate blend state for enabled color buffers. This also avoids an assertion failure in translate_blend for: dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_advanced_blend_eq_buffer_blend_eq --- src/mesa/state_tracker/st_atom_blend.c | 21 + src/mesa/state_tracker/st_atom_framebuffer.c | 1 + src/mesa/state_tracker/st_atom_list.h| 2 +- src/mesa/state_tracker/st_context.h | 1 + 4 files changed, 16 insertions(+), 9 deletions(-) diff --git a/src/mesa/state_tracker/st_atom_blend.c b/src/mesa/state_tracker/st_atom_blend.c index 24ef09b..cc2c288 100644 --- a/src/mesa/state_tracker/st_atom_blend.c +++ b/src/mesa/state_tracker/st_atom_blend.c @@ -105,59 +105,64 @@ translate_blend(GLenum blend) default: assert("invalid GL token in translate_blend()" == NULL); return 0; } } /** * Figure out if colormasks are different per rt. */ static GLboolean -colormask_per_rt(const struct gl_context *ctx) +colormask_per_rt(const struct gl_context *ctx, unsigned num_cb) { + GLbitfield full_mask = _mesa_replicate_colormask(0xf, num_cb); GLbitfield repl_mask0 = _mesa_replicate_colormask(GET_COLORMASK(ctx->Color.ColorMask, 0), -ctx->Const.MaxDrawBuffers); +num_cb); - return ctx->Color.ColorMask != repl_mask0; + return (ctx->Color.ColorMask & full_mask) != repl_mask0; } /** * Figure out if blend enables/state are different per rt. 
*/ static GLboolean -blend_per_rt(const struct gl_context *ctx) +blend_per_rt(const struct gl_context *ctx, unsigned num_cb) { - if (ctx->Color.BlendEnabled && - (ctx->Color.BlendEnabled != ((1U << ctx->Const.MaxDrawBuffers) - 1))) { + GLbitfield cb_mask = u_bit_consecutive(0, num_cb); + GLbitfield blend_enabled = ctx->Color.BlendEnabled & cb_mask; + + if (blend_enabled && blend_enabled != cb_mask) { /* This can only happen if GL_EXT_draw_buffers2 is enabled */ return GL_TRUE; } if (ctx->Color._BlendFuncPerBuffer || ctx->Color._BlendEquationPerBuffer) { /* this can only happen if GL_ARB_draw_buffers_blend is enabled */ return GL_TRUE; } return GL_FALSE; } void st_update_blend( struct st_context *st ) { struct pipe_blend_state *blend = &st->state.blend; const struct gl_context *ctx = st->ctx; + unsigned num_cb = st->state.fb_num_cb; unsigned num_state = 1; unsigned i; memset(blend, 0, sizeof(*blend)); - if (blend_per_rt(ctx) || colormask_per_rt(ctx)) { - num_state = ctx->Const.MaxDrawBuffers; + if (num_cb > 1 && + (blend_per_rt(ctx, num_cb) || colormask_per_rt(ctx, num_cb))) { + num_state = num_cb; blend->independent_blend_enable = 1; } for (i = 0; i < num_state; i++) blend->rt[i].colormask = GET_COLORMASK(ctx->Color.ColorMask, i); if (ctx->Color.ColorLogicOpEnabled) { /* logicop enabled */ blend->logicop_enable = 1; blend->logicop_func = ctx->Color._LogicOp; diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c b/src/mesa/state_tracker/st_atom_framebuffer.c index acbe980..a29f5b3 100644 --- a/src/mesa/state_tracker/st_atom_framebuffer.c +++ b/src/mesa/state_tracker/st_atom_framebuffer.c @@ -209,11 +209,12 @@ st_update_framebuffer_state( struct st_context *st ) framebuffer.width = 0; if (framebuffer.height == USHRT_MAX) framebuffer.height = 0; cso_set_framebuffer(st->cso_context, &framebuffer); st->state.fb_width = framebuffer.width; st->state.fb_height = framebuffer.height; st->state.fb_num_samples = util_framebuffer_get_num_samples(&framebuffer); 
st->state.fb_num_layers = util_framebuffer_get_num_layers(&framebuffer); + st->state.fb_num_cb = framebuffer.nr_cbufs; } diff --git a/src/mesa/state_tracker/st_atom_list.h b/src/mesa/state_tracker/st_atom_list.h index 8f50a72..5391d47 100644 --- a/src/mesa/state_tracker/st_atom_list.h +++ b/src/mesa/state_tracker/st_atom_list.h @@ -3,21 +3,20 @@ ST_STATE(ST_NEW_DSA, st_update_depth_stencil_alpha) ST_STATE(ST_NEW_CLIP_STATE, st_update_clip) ST_STATE(ST_NEW_FS_STATE, st_update_fp) ST_STATE(ST_NEW_GS_STATE, st_update_gp) ST_STATE(ST_NEW_TES_STATE, st_update_tep) ST_STATE(ST_NEW_TCS_STATE, st_update_tcp) ST_STATE(ST_NEW_VS_STATE, st_update_vp) ST_STATE(ST_NEW_POLY_STIPPLE, st_update_polygon_stipple) ST_STATE(ST_NEW_WINDOW_RECTANGLES, st_update_window_rectangles) -ST_STATE(ST_NEW_BLEND, st_update_blend) ST_STATE(ST_NEW_BLEND_COLOR, st_update_blend_color) ST_STATE(ST_NEW_VS_SAMPLER_VIEWS, st_update_vertex_textures) ST_STATE(ST_NEW_FS_SAMPLER_VIEWS, st_update_fragment_textures) ST_STATE(ST_NEW_GS_SAMPLER_VIEWS, st_update_geometry_textures) ST_STATE(ST_NEW_TCS_SAMPLER_VIEWS, st_update_tessctrl_textures) ST_STATE(ST_NEW_TES_SAMPLER_VIEWS
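The commit message above says that MRT cases only check blend state for enabled color buffers. A minimal sketch of that masking idea follows, covering only the blend-enable bits (the patch also compares color masks and per-buffer functions). The names `low_bits`, `blend_enables_differ` and `num_blend_states` are hypothetical and exist only for this illustration; they are not part of Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low `count` bits set. Hypothetical stand-in for a
 * helper that builds a run of consecutive bits starting at bit 0. */
static inline uint32_t low_bits(unsigned count)
{
   assert(count <= 32);
   return count == 32 ? UINT32_MAX : (UINT32_C(1) << count) - 1;
}

/* True when blending is enabled for some, but not all, of the bound
 * color buffers. Enable bits of unbound buffers are ignored. */
static bool blend_enables_differ(uint32_t blend_enabled, unsigned num_cb)
{
   uint32_t cb_mask = low_bits(num_cb);
   uint32_t enabled = blend_enabled & cb_mask;

   return enabled != 0 && enabled != cb_mask;
}

/* Number of per-render-target blend entries worth filling in:
 * one for the non-MRT case, num_cb when the buffers disagree. */
static unsigned num_blend_states(uint32_t blend_enabled, unsigned num_cb)
{
   if (num_cb > 1 && blend_enables_differ(blend_enabled, num_cb))
      return num_cb;
   return 1;
}
```

With two bound color buffers, an enable mask of 0xff counts as uniform because only the low two bits are compared, whereas 0x1 counts as per-buffer state.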
[Mesa-dev] [PATCH 1/5] mesa: change ctx->Color.ColorMask into a 32-bit bitmask
From: Marek Olšák 4 bits per draw buffer, 8 draw buffers in total --> 32 bits. This is easier to work with. --- src/mesa/drivers/common/driverfuncs.c| 8 ++-- src/mesa/drivers/common/meta.c | 41 +++--- src/mesa/drivers/common/meta.h | 2 +- src/mesa/drivers/dri/i915/intel_clear.c | 5 +-- src/mesa/drivers/dri/i915/intel_pixel.c | 5 +-- src/mesa/drivers/dri/i915/intel_pixel_copy.c | 5 +-- src/mesa/drivers/dri/i965/brw_blorp.c| 9 ++-- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 8 ++-- src/mesa/drivers/dri/i965/genX_state_upload.c| 13 +++--- src/mesa/drivers/dri/i965/intel_pixel.c | 5 +-- src/mesa/drivers/dri/i965/intel_pixel_copy.c | 5 +-- src/mesa/drivers/dri/nouveau/nouveau_driver.c| 8 +++- src/mesa/drivers/dri/nouveau/nv04_context.c | 5 +-- src/mesa/drivers/dri/nouveau/nv04_state_raster.c | 8 ++-- src/mesa/drivers/dri/nouveau/nv10_state_raster.c | 8 ++-- src/mesa/drivers/dri/nouveau/nv20_context.c | 8 ++-- src/mesa/drivers/dri/r200/r200_state.c | 8 ++-- src/mesa/drivers/dri/radeon/radeon_state.c | 8 ++-- src/mesa/drivers/x11/xm_dd.c | 4 +- src/mesa/main/accum.c| 16 +++ src/mesa/main/attrib.c | 16 +++ src/mesa/main/blend.c| 53 +--- src/mesa/main/blend.h| 10 + src/mesa/main/clear.c| 2 +- src/mesa/main/get.c | 16 +++ src/mesa/main/mtypes.h | 8 +++- src/mesa/state_tracker/st_atom_blend.c | 27 src/mesa/state_tracker/st_cb_clear.c | 39 ++--- src/mesa/swrast/s_clear.c| 8 +++- src/mesa/swrast/s_context.c | 10 + src/mesa/swrast/s_masking.c | 24 +++ src/mesa/swrast/s_span.c | 6 +-- src/mesa/swrast/s_triangle.c | 8 ++-- 33 files changed, 174 insertions(+), 232 deletions(-) diff --git a/src/mesa/drivers/common/driverfuncs.c b/src/mesa/drivers/common/driverfuncs.c index 99c1520..8f2e3e0 100644 --- a/src/mesa/drivers/common/driverfuncs.c +++ b/src/mesa/drivers/common/driverfuncs.c @@ -225,24 +225,24 @@ _mesa_init_driver_state(struct gl_context *ctx) ctx->Color.Blend[0].EquationRGB, ctx->Color.Blend[0].EquationA); ctx->Driver.BlendFuncSeparate(ctx, 
ctx->Color.Blend[0].SrcRGB, ctx->Color.Blend[0].DstRGB, ctx->Color.Blend[0].SrcA, ctx->Color.Blend[0].DstA); ctx->Driver.ColorMask(ctx, - ctx->Color.ColorMask[0][RCOMP], - ctx->Color.ColorMask[0][GCOMP], - ctx->Color.ColorMask[0][BCOMP], - ctx->Color.ColorMask[0][ACOMP]); + GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 0), + GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 1), + GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 2), + GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 3)); ctx->Driver.CullFace(ctx, ctx->Polygon.CullFaceMode); ctx->Driver.DepthFunc(ctx, ctx->Depth.Func); ctx->Driver.DepthMask(ctx, ctx->Depth.Mask); ctx->Driver.Enable(ctx, GL_ALPHA_TEST, ctx->Color.AlphaEnabled); ctx->Driver.Enable(ctx, GL_BLEND, ctx->Color.BlendEnabled); ctx->Driver.Enable(ctx, GL_COLOR_LOGIC_OP, ctx->Color.ColorLogicOpEnabled); ctx->Driver.Enable(ctx, GL_COLOR_SUM, ctx->Fog.ColorSumEnabled); ctx->Driver.Enable(ctx, GL_CULL_FACE, ctx->Polygon.CullFlag); diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index a48f700..a7dd139 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -508,24 +508,22 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state) save->ColorLogicOpEnabled = ctx->Color.ColorLogicOpEnabled; if (ctx->Color.ColorLogicOpEnabled) _mesa_set_enable(ctx, GL_COLOR_LOGIC_OP, GL_FALSE); } if (state & MESA_META_DITHER) { save->DitherFlag = ctx->Color.DitherFlag; _mesa_set_enable(ctx, GL_DITHER, GL_TRUE); } - if (state & MESA_META_COLOR_MASK) { - memcpy(save->ColorMask, ctx->Color.ColorMask, - sizeof(ctx->Color.ColorMask)); - } + if (state & MESA_META_COLOR_MASK) + save->ColorMask = ctx->Color.ColorMask; if (state & MESA_META_DEPTH_TEST) { save->Depth = ctx->Depth; /* struct copy */ if (ctx->Depth.Test) _mesa_set_enable(ctx, GL_DEPTH_TEST, GL_FALSE); } if (sta
[Mesa-dev] [PATCH 4/5] st/mesa: don't translate blend state when color writes are disabled
From: Marek Olšák

---
 src/mesa/state_tracker/st_atom_blend.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c b/src/mesa/state_tracker/st_atom_blend.c
index 62042a6..24ef09b 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -146,29 +146,34 @@ st_update_blend( struct st_context *st )
    const struct gl_context *ctx = st->ctx;
    unsigned num_state = 1;
    unsigned i;
 
    memset(blend, 0, sizeof(*blend));
 
    if (blend_per_rt(ctx) || colormask_per_rt(ctx)) {
       num_state = ctx->Const.MaxDrawBuffers;
       blend->independent_blend_enable = 1;
    }
+
+   for (i = 0; i < num_state; i++)
+      blend->rt[i].colormask = GET_COLORMASK(ctx->Color.ColorMask, i);
+
    if (ctx->Color.ColorLogicOpEnabled) {
       /* logicop enabled */
       blend->logicop_enable = 1;
       blend->logicop_func = ctx->Color._LogicOp;
    }
    else if (ctx->Color.BlendEnabled && !ctx->Color._AdvancedBlendMode) {
       /* blending enabled */
       for (i = 0; i < num_state; i++) {
-         if (!(ctx->Color.BlendEnabled & (1 << i)))
+         if (!(ctx->Color.BlendEnabled & (1 << i)) ||
+             !blend->rt[i].colormask)
             continue;
 
          blend->rt[i].blend_enable = 1;
          blend->rt[i].rgb_func =
             translate_blend(ctx->Color.Blend[i].EquationRGB);
 
          if (ctx->Color.Blend[i].EquationRGB == GL_MIN ||
              ctx->Color.Blend[i].EquationRGB == GL_MAX) {
             /* Min/max are special */
             blend->rt[i].rgb_src_factor = PIPE_BLENDFACTOR_ONE;
@@ -195,23 +200,20 @@ st_update_blend( struct st_context *st )
                translate_blend(ctx->Color.Blend[i].SrcA);
             blend->rt[i].alpha_dst_factor =
                translate_blend(ctx->Color.Blend[i].DstA);
          }
       }
    }
    else {
       /* no blending / logicop */
    }
 
-   for (i = 0; i < num_state; i++)
-      blend->rt[i].colormask = GET_COLORMASK(ctx->Color.ColorMask, i);
-
    blend->dither = ctx->Color.DitherFlag;
 
    if (_mesa_is_multisample_enabled(ctx) &&
        !(ctx->DrawBuffer->_IntegerBuffers & 0x1)) {
       /* Unlike in gallium/d3d10 these operations are only performed
        * if both msaa is enabled and we have a multisample buffer.
        */
       blend->alpha_to_coverage = ctx->Multisample.SampleAlphaToCoverage;
       blend->alpha_to_one = ctx->Multisample.SampleAlphaToOne;
    }
-- 
2.7.4
[Mesa-dev] [PATCH] gallium: remove pipe_blend_state::dither
From: Marek Olšák very few drivers actually implement it. --- src/gallium/auxiliary/util/u_dump_state.c | 1 - src/gallium/auxiliary/vl/vl_compositor.c | 1 - src/gallium/auxiliary/vl/vl_idct.c | 1 - src/gallium/auxiliary/vl/vl_mc.c | 1 - src/gallium/auxiliary/vl/vl_zscan.c| 1 - src/gallium/docs/source/cso/blend.rst | 10 -- src/gallium/drivers/etnaviv/etnaviv_blend.c| 14 +++--- src/gallium/drivers/freedreno/a2xx/fd2_blend.c | 3 --- src/gallium/drivers/freedreno/a3xx/fd3_blend.c | 3 --- src/gallium/drivers/freedreno/a4xx/fd4_blend.c | 3 --- src/gallium/drivers/freedreno/a5xx/fd5_blend.c | 3 --- src/gallium/drivers/i915/i915_state.c | 3 --- src/gallium/drivers/nouveau/nv30/nv30_state.c | 2 +- src/gallium/drivers/r300/r300_state.c | 12 src/gallium/drivers/trace/tr_dump_state.c | 2 -- src/gallium/drivers/virgl/virgl_encode.c | 2 +- src/gallium/include/pipe/p_state.h | 1 - src/gallium/state_trackers/nine/nine_pipe.c| 2 -- src/gallium/state_trackers/va/surface.c| 1 - src/gallium/state_trackers/vdpau/output.c | 1 - src/mesa/state_tracker/st_atom_blend.c | 2 -- src/mesa/state_tracker/st_cb_clear.c | 3 --- 22 files changed, 5 insertions(+), 67 deletions(-) diff --git a/src/gallium/auxiliary/util/u_dump_state.c b/src/gallium/auxiliary/util/u_dump_state.c index b68de13..ddaafef 100644 --- a/src/gallium/auxiliary/util/u_dump_state.c +++ b/src/gallium/auxiliary/util/u_dump_state.c @@ -591,21 +591,20 @@ util_dump_blend_state(FILE *stream, const struct pipe_blend_state *state) { unsigned valid_entries = 1; if (!state) { util_dump_null(stream); return; } util_dump_struct_begin(stream, "pipe_blend_state"); - util_dump_member(stream, bool, state, dither); util_dump_member(stream, bool, state, alpha_to_coverage); util_dump_member(stream, bool, state, alpha_to_one); util_dump_member(stream, bool, state, logicop_enable); if (state->logicop_enable) { util_dump_member(stream, enum_func, state, logicop_func); } else { util_dump_member(stream, bool, state, independent_blend_enable); diff 
--git a/src/gallium/auxiliary/vl/vl_compositor.c b/src/gallium/auxiliary/vl/vl_compositor.c index 725bfd9..d264d87 100644 --- a/src/gallium/auxiliary/vl/vl_compositor.c +++ b/src/gallium/auxiliary/vl/vl_compositor.c @@ -579,21 +579,20 @@ init_pipe_state(struct vl_compositor *c) sampler.min_img_filter = PIPE_TEX_FILTER_NEAREST; sampler.mag_img_filter = PIPE_TEX_FILTER_NEAREST; c->sampler_nearest = c->pipe->create_sampler_state(c->pipe, &sampler); memset(&blend, 0, sizeof blend); blend.independent_blend_enable = 0; blend.rt[0].blend_enable = 0; blend.logicop_enable = 0; blend.logicop_func = PIPE_LOGICOP_CLEAR; blend.rt[0].colormask = PIPE_MASK_RGBA; - blend.dither = 0; c->blend_clear = c->pipe->create_blend_state(c->pipe, &blend); blend.rt[0].blend_enable = 1; blend.rt[0].rgb_func = PIPE_BLEND_ADD; blend.rt[0].rgb_src_factor = PIPE_BLENDFACTOR_SRC_ALPHA; blend.rt[0].rgb_dst_factor = PIPE_BLENDFACTOR_INV_SRC_ALPHA; blend.rt[0].alpha_func = PIPE_BLEND_ADD; blend.rt[0].alpha_src_factor = PIPE_BLENDFACTOR_ONE; blend.rt[0].alpha_dst_factor = PIPE_BLENDFACTOR_ONE; c->blend_add = c->pipe->create_blend_state(c->pipe, &blend); diff --git a/src/gallium/auxiliary/vl/vl_idct.c b/src/gallium/auxiliary/vl/vl_idct.c index 3e6f581..9995087 100644 --- a/src/gallium/auxiliary/vl/vl_idct.c +++ b/src/gallium/auxiliary/vl/vl_idct.c @@ -528,21 +528,20 @@ init_state(struct vl_idct *idct) blend.rt[0].rgb_func = PIPE_BLEND_ADD; blend.rt[0].rgb_src_factor = PIPE_BLENDFACTOR_ONE; blend.rt[0].rgb_dst_factor = PIPE_BLENDFACTOR_ONE; blend.rt[0].alpha_func = PIPE_BLEND_ADD; blend.rt[0].alpha_src_factor = PIPE_BLENDFACTOR_ONE; blend.rt[0].alpha_dst_factor = PIPE_BLENDFACTOR_ONE; blend.logicop_enable = 0; blend.logicop_func = PIPE_LOGICOP_CLEAR; /* Needed to allow color writes to FB, even if blending disabled */ blend.rt[0].colormask = PIPE_MASK_RGBA; - blend.dither = 0; idct->blend = idct->pipe->create_blend_state(idct->pipe, &blend); if (!idct->blend) goto error_blend; for (i = 0; i < 2; ++i) { 
memset(&sampler, 0, sizeof(sampler)); sampler.wrap_s = PIPE_TEX_WRAP_REPEAT; sampler.wrap_t = PIPE_TEX_WRAP_REPEAT; sampler.wrap_r = PIPE_TEX_WRAP_REPEAT; sampler.min_img_filter = PIPE_TEX_FILTER_NEAREST; diff --git a/src/gallium/auxiliary/vl/vl_mc.c b/src/gallium/auxiliary/vl/vl_mc.c index a202fac..944b4fc 100644 --- a/src/gallium/auxiliary/vl/vl_mc.c +++ b/src/gallium/auxiliary/vl/vl_mc.c @@ -402,21 +402,20 @@ init_pipe_state(struct vl_mc *r) blend.rt[0].blend_enable = 1; blend.rt[0].rgb_func = PIPE_BLEND_ADD;
Re: [Mesa-dev] [PATCH] gallium: remove pipe_blend_state::dither
nine uses it, GL uses it, a bunch of drivers implement it ... why is it being removed? On Wed, Jan 31, 2018 at 2:55 PM, Marek Olšák wrote: > From: Marek Olšák > > very few drivers actually implement it. > --- > src/gallium/auxiliary/util/u_dump_state.c | 1 - > src/gallium/auxiliary/vl/vl_compositor.c | 1 - > src/gallium/auxiliary/vl/vl_idct.c | 1 - > src/gallium/auxiliary/vl/vl_mc.c | 1 - > src/gallium/auxiliary/vl/vl_zscan.c| 1 - > src/gallium/docs/source/cso/blend.rst | 10 -- > src/gallium/drivers/etnaviv/etnaviv_blend.c| 14 +++--- > src/gallium/drivers/freedreno/a2xx/fd2_blend.c | 3 --- > src/gallium/drivers/freedreno/a3xx/fd3_blend.c | 3 --- > src/gallium/drivers/freedreno/a4xx/fd4_blend.c | 3 --- > src/gallium/drivers/freedreno/a5xx/fd5_blend.c | 3 --- > src/gallium/drivers/i915/i915_state.c | 3 --- > src/gallium/drivers/nouveau/nv30/nv30_state.c | 2 +- > src/gallium/drivers/r300/r300_state.c | 12 > src/gallium/drivers/trace/tr_dump_state.c | 2 -- > src/gallium/drivers/virgl/virgl_encode.c | 2 +- > src/gallium/include/pipe/p_state.h | 1 - > src/gallium/state_trackers/nine/nine_pipe.c| 2 -- > src/gallium/state_trackers/va/surface.c| 1 - > src/gallium/state_trackers/vdpau/output.c | 1 - > src/mesa/state_tracker/st_atom_blend.c | 2 -- > src/mesa/state_tracker/st_cb_clear.c | 3 --- > 22 files changed, 5 insertions(+), 67 deletions(-) > > diff --git a/src/gallium/auxiliary/util/u_dump_state.c > b/src/gallium/auxiliary/util/u_dump_state.c > index b68de13..ddaafef 100644 > --- a/src/gallium/auxiliary/util/u_dump_state.c > +++ b/src/gallium/auxiliary/util/u_dump_state.c > @@ -591,21 +591,20 @@ util_dump_blend_state(FILE *stream, const struct > pipe_blend_state *state) > { > unsigned valid_entries = 1; > > if (!state) { >util_dump_null(stream); >return; > } > > util_dump_struct_begin(stream, "pipe_blend_state"); > > - util_dump_member(stream, bool, state, dither); > util_dump_member(stream, bool, state, alpha_to_coverage); > util_dump_member(stream, bool, state, 
alpha_to_one); > > util_dump_member(stream, bool, state, logicop_enable); > if (state->logicop_enable) { >util_dump_member(stream, enum_func, state, logicop_func); > } > else { >util_dump_member(stream, bool, state, independent_blend_enable); > > diff --git a/src/gallium/auxiliary/vl/vl_compositor.c > b/src/gallium/auxiliary/vl/vl_compositor.c > index 725bfd9..d264d87 100644 > --- a/src/gallium/auxiliary/vl/vl_compositor.c > +++ b/src/gallium/auxiliary/vl/vl_compositor.c > @@ -579,21 +579,20 @@ init_pipe_state(struct vl_compositor *c) > sampler.min_img_filter = PIPE_TEX_FILTER_NEAREST; > sampler.mag_img_filter = PIPE_TEX_FILTER_NEAREST; > c->sampler_nearest = c->pipe->create_sampler_state(c->pipe, &sampler); > > memset(&blend, 0, sizeof blend); > blend.independent_blend_enable = 0; > blend.rt[0].blend_enable = 0; > blend.logicop_enable = 0; > blend.logicop_func = PIPE_LOGICOP_CLEAR; > blend.rt[0].colormask = PIPE_MASK_RGBA; > - blend.dither = 0; > c->blend_clear = c->pipe->create_blend_state(c->pipe, &blend); > > blend.rt[0].blend_enable = 1; > blend.rt[0].rgb_func = PIPE_BLEND_ADD; > blend.rt[0].rgb_src_factor = PIPE_BLENDFACTOR_SRC_ALPHA; > blend.rt[0].rgb_dst_factor = PIPE_BLENDFACTOR_INV_SRC_ALPHA; > blend.rt[0].alpha_func = PIPE_BLEND_ADD; > blend.rt[0].alpha_src_factor = PIPE_BLENDFACTOR_ONE; > blend.rt[0].alpha_dst_factor = PIPE_BLENDFACTOR_ONE; > c->blend_add = c->pipe->create_blend_state(c->pipe, &blend); > diff --git a/src/gallium/auxiliary/vl/vl_idct.c > b/src/gallium/auxiliary/vl/vl_idct.c > index 3e6f581..9995087 100644 > --- a/src/gallium/auxiliary/vl/vl_idct.c > +++ b/src/gallium/auxiliary/vl/vl_idct.c > @@ -528,21 +528,20 @@ init_state(struct vl_idct *idct) > blend.rt[0].rgb_func = PIPE_BLEND_ADD; > blend.rt[0].rgb_src_factor = PIPE_BLENDFACTOR_ONE; > blend.rt[0].rgb_dst_factor = PIPE_BLENDFACTOR_ONE; > blend.rt[0].alpha_func = PIPE_BLEND_ADD; > blend.rt[0].alpha_src_factor = PIPE_BLENDFACTOR_ONE; > blend.rt[0].alpha_dst_factor = 
PIPE_BLENDFACTOR_ONE; > blend.logicop_enable = 0; > blend.logicop_func = PIPE_LOGICOP_CLEAR; > /* Needed to allow color writes to FB, even if blending disabled */ > blend.rt[0].colormask = PIPE_MASK_RGBA; > - blend.dither = 0; > idct->blend = idct->pipe->create_blend_state(idct->pipe, &blend); > if (!idct->blend) >goto error_blend; > > for (i = 0; i < 2; ++i) { >memset(&sampler, 0, sizeof(sampler)); >sampler.wrap_s = PIPE_TEX_WRAP_REPEAT; >sampler.wrap_t = PIPE_TEX_WRAP_REPEAT; >sampler.wrap_r = PIPE_TEX_WRAP_REPEAT; >sampler.min_img_filter = PIPE_TEX_
Re: [Mesa-dev] [PATCH] gallium: remove pipe_blend_state::dither
On Wed, Jan 31, 2018 at 8:58 PM, Ilia Mirkin wrote:
> nine uses it, GL uses it, a bunch of drivers implement it ... why is
> it being removed?

very few drivers actually implement it.

Marek
Re: [Mesa-dev] [PATCH] gallium: remove pipe_blend_state::dither
On Wed, Jan 31, 2018 at 2:59 PM, Marek Olšák wrote:
> On Wed, Jan 31, 2018 at 8:58 PM, Ilia Mirkin wrote:
>> nine uses it, GL uses it, a bunch of drivers implement it ... why is
>> it being removed?
>
> very few drivers actually implement it.

nv30, a2xx-a4xx, etnaviv, i915, virgl. None of the modern stuff
implements it (except a4xx, which is a DX11-ish chip), but that hardly
seems like a reason to remove it. I dunno, I don't feel too strongly
about it; the patch just seemed to lack a proper motivation.
Re: [Mesa-dev] [PATCH] gallium/st/clover: remove unused PIPE_SHADER_IR_LLVM
Acked-by: Marek Olšák

Marek

On Wed, Jan 31, 2018 at 6:37 AM, Timothy Arceri wrote:
> On 31/01/18 15:05, Timothy Arceri wrote:
>>
>> This has been unused since 100796c15c3a.
>> ---
>>
>> Please note this is not even compile tested as I don't have clover
>> 7.0.0 repo to go with my current llvm 7.0.0 setup. Any testing is
>> appreciated.
>
>
> That was meant to say clang repo.
Re: [Mesa-dev] [PATCH] st/glsl_to_nir: add more nir opts to st_nir_opts()
Reviewed-by: Marek Olšák

Marek

On Wed, Jan 31, 2018 at 3:00 AM, Timothy Arceri wrote:
> I forgot to add that adding the opts also required some of the lowering
> passes to be called slightly earlier.
>
>
> On 31/01/18 12:58, Timothy Arceri wrote:
>>
>> All of the current gallium nir driver use these optimisations but
>> they do so in their backends. Having these called in the backend
>> only can cause a number of problems:
>>
>> - Shader compile times are greater because the opts need to do
>>   significant passes over all shader variants.
>> - The shader cache is partially defeated due to the significant
>>   optimisation passes over variants.
>> - We might miss out on nir linking optimisation opportunities.
>>
>> Adding these passes to st_nir_opts() alleviates these problems.
>> ---
>>  src/mesa/state_tracker/st_glsl_to_nir.cpp | 36 ++++++++++++++++++++----------------
>>  1 file changed, 20 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp
>> index 65931bfa33..b9ac9fafc2 100644
>> --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp
>> +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp
>> @@ -270,6 +270,10 @@ st_nir_opts(nir_shader *nir)
>>     do {
>>        progress = false;
>> +      NIR_PASS_V(nir, nir_lower_vars_to_ssa);
>> +      NIR_PASS_V(nir, nir_lower_alu_to_scalar);
>> +      NIR_PASS_V(nir, nir_lower_phis_to_scalar);
>> +
>>        NIR_PASS_V(nir, nir_lower_64bit_pack);
>>        NIR_PASS(progress, nir, nir_copy_prop);
>>        NIR_PASS(progress, nir, nir_opt_remove_phis);
>> @@ -317,6 +321,22 @@ st_glsl_to_nir(struct st_context *st, struct gl_program *prog,
>>        (nir_variable_mode) (nir_var_shader_in | nir_var_shader_out);
>>     nir_remove_dead_variables(nir, mask);
>> +   if (options->lower_all_io_to_temps ||
>> +       nir->info.stage == MESA_SHADER_VERTEX ||
>> +       nir->info.stage == MESA_SHADER_GEOMETRY) {
>> +      NIR_PASS_V(nir, nir_lower_io_to_temporaries,
>> +                 nir_shader_get_entrypoint(nir),
>> +                 true, true);
>> +   } else if (nir->info.stage == MESA_SHADER_FRAGMENT) {
>> +      NIR_PASS_V(nir, nir_lower_io_to_temporaries,
>> +                 nir_shader_get_entrypoint(nir),
>> +                 true, false);
>> +   }
>> +
>> +   NIR_PASS_V(nir, nir_lower_global_vars_to_local);
>> +   NIR_PASS_V(nir, nir_split_var_copies);
>> +   NIR_PASS_V(nir, nir_lower_var_copies);
>> +
>>     st_nir_opts(nir);
>>     return nir;
>> @@ -481,22 +501,6 @@ st_nir_get_mesa_program(struct gl_context *ctx,
>>     set_st_program(prog, shader_program, nir);
>>     prog->nir = nir;
>> -
>> -   if (options->lower_all_io_to_temps ||
>> -       nir->info.stage == MESA_SHADER_VERTEX ||
>> -       nir->info.stage == MESA_SHADER_GEOMETRY) {
>> -      NIR_PASS_V(nir, nir_lower_io_to_temporaries,
>> -                 nir_shader_get_entrypoint(nir),
>> -                 true, true);
>> -   } else if (nir->info.stage == MESA_SHADER_FRAGMENT) {
>> -      NIR_PASS_V(nir, nir_lower_io_to_temporaries,
>> -                 nir_shader_get_entrypoint(nir),
>> -                 true, false);
>> -   }
>> -
>> -   NIR_PASS_V(nir, nir_lower_global_vars_to_local);
>> -   NIR_PASS_V(nir, nir_split_var_copies);
>> -   NIR_PASS_V(nir, nir_lower_var_copies);
>>  }
>>  static void
>>
Re: [Mesa-dev] [PATCH 1/5] mesa: change ctx->Color.ColorMask into a 32-bit bitmask
Am 31.01.2018 um 20:55 schrieb Marek Olšák: > From: Marek Olšák > > 4 bits per draw buffer, 8 draw buffers in total --> 32 bits. > > This is easier to work with. > --- > src/mesa/drivers/common/driverfuncs.c| 8 ++-- > src/mesa/drivers/common/meta.c | 41 +++--- > src/mesa/drivers/common/meta.h | 2 +- > src/mesa/drivers/dri/i915/intel_clear.c | 5 +-- > src/mesa/drivers/dri/i915/intel_pixel.c | 5 +-- > src/mesa/drivers/dri/i915/intel_pixel_copy.c | 5 +-- > src/mesa/drivers/dri/i965/brw_blorp.c| 9 ++-- > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 8 ++-- > src/mesa/drivers/dri/i965/genX_state_upload.c| 13 +++--- > src/mesa/drivers/dri/i965/intel_pixel.c | 5 +-- > src/mesa/drivers/dri/i965/intel_pixel_copy.c | 5 +-- > src/mesa/drivers/dri/nouveau/nouveau_driver.c| 8 +++- > src/mesa/drivers/dri/nouveau/nv04_context.c | 5 +-- > src/mesa/drivers/dri/nouveau/nv04_state_raster.c | 8 ++-- > src/mesa/drivers/dri/nouveau/nv10_state_raster.c | 8 ++-- > src/mesa/drivers/dri/nouveau/nv20_context.c | 8 ++-- > src/mesa/drivers/dri/r200/r200_state.c | 8 ++-- > src/mesa/drivers/dri/radeon/radeon_state.c | 8 ++-- > src/mesa/drivers/x11/xm_dd.c | 4 +- > src/mesa/main/accum.c| 16 +++ > src/mesa/main/attrib.c | 16 +++ > src/mesa/main/blend.c| 53 > +--- > src/mesa/main/blend.h| 10 + > src/mesa/main/clear.c| 2 +- > src/mesa/main/get.c | 16 +++ > src/mesa/main/mtypes.h | 8 +++- > src/mesa/state_tracker/st_atom_blend.c | 27 > src/mesa/state_tracker/st_cb_clear.c | 39 ++--- > src/mesa/swrast/s_clear.c| 8 +++- > src/mesa/swrast/s_context.c | 10 + > src/mesa/swrast/s_masking.c | 24 +++ > src/mesa/swrast/s_span.c | 6 +-- > src/mesa/swrast/s_triangle.c | 8 ++-- > 33 files changed, 174 insertions(+), 232 deletions(-) > ... 
> > diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c > index cf9a2f6..16ad8f6 100644 > --- a/src/mesa/main/get.c > +++ b/src/mesa/main/get.c > @@ -670,24 +670,24 @@ find_custom_value(struct gl_context *ctx, const struct > value_desc *d, union valu > > case GL_CURRENT_TEXTURE_COORDS: >unit = ctx->Texture.CurrentUnit; >v->value_float_4[0] = ctx->Current.Attrib[VERT_ATTRIB_TEX0 + unit][0]; >v->value_float_4[1] = ctx->Current.Attrib[VERT_ATTRIB_TEX0 + unit][1]; >v->value_float_4[2] = ctx->Current.Attrib[VERT_ATTRIB_TEX0 + unit][2]; >v->value_float_4[3] = ctx->Current.Attrib[VERT_ATTRIB_TEX0 + unit][3]; >break; > > case GL_COLOR_WRITEMASK: > - v->value_int_4[0] = ctx->Color.ColorMask[0][RCOMP] ? 1 : 0; > - v->value_int_4[1] = ctx->Color.ColorMask[0][GCOMP] ? 1 : 0; > - v->value_int_4[2] = ctx->Color.ColorMask[0][BCOMP] ? 1 : 0; > - v->value_int_4[3] = ctx->Color.ColorMask[0][ACOMP] ? 1 : 0; > + v->value_int_4[0] = GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 0) ? 1 > : 0; > + v->value_int_4[1] = GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 1) ? 1 > : 0; > + v->value_int_4[2] = GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 2) ? 1 > : 0; > + v->value_int_4[3] = GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 3) ? 1 > : 0; Here and below, GET_COLORMASK_BIT is already defined to return 1 or 0, so the conditional seems a bit overkill (albeit I guess the compiler is clever enough to get rid of it?) 
Roland >break; > > case GL_EDGE_FLAG: >v->value_bool = ctx->Current.Attrib[VERT_ATTRIB_EDGEFLAG][0] == 1.0F; >break; > > case GL_READ_BUFFER: >v->value_enum = ctx->ReadBuffer->ColorReadBuffer; >break; > > @@ -2255,24 +2255,24 @@ find_value_indexed(const char *func, GLenum pname, > GLuint index, union value *v) >if (!ctx->Extensions.ARB_draw_buffers_blend) > goto invalid_enum; >v->value_int = ctx->Color.Blend[index].EquationA; >return TYPE_INT; > > case GL_COLOR_WRITEMASK: >if (index >= ctx->Const.MaxDrawBuffers) > goto invalid_value; >if (!ctx->Extensions.EXT_draw_buffers2) > goto invalid_enum; > - v->value_int_4[0] = ctx->Color.ColorMask[index][RCOMP] ? 1 : 0; > - v->value_int_4[1] = ctx->Color.ColorMask[index][GCOMP] ? 1 : 0; > - v->value_int_4[2] = ctx->Color.ColorMask[index][BCOMP] ? 1 : 0; > - v->value_int_4[3] = ctx->Color.ColorMask[index][ACOMP] ? 1 : 0; > + v->value_int_4[0] = GET_COLORMASK_BIT(ctx->Color.ColorMask, index, 0) > ? 1 : 0; > + v->value
Re: [Mesa-dev] [PATCH] i965: Bump official kernel requirement to Linux v3.9.
Quoting Kenneth Graunke (2018-01-31 19:33:13)
> In commit 3f353342a6b6744773c26ed66b12afed42bd57af (present in 17.3.0)
> we started unconditionally using I915_EXEC_NO_RELOC, which was
> introduced in Linux v3.9. ChromeOS kernel 3.8 has backported this,
> so it should work too.
>
> Running on older kernels would likely result in every single batch
> being rejected by the kernel, which is pretty catastrophic. Yet, it
> appears that nobody noticed. So, let's just bump the official
> requirement and move forward ever so slowly.
>
> Fixes: 3f353342a6b ("i965: Use I915_EXEC_NO_RELOC")

I did think we were already checking for a more recent kernel, oh well.

I checked that v3.9 is the first kernel to support I915_EXEC_NO_RELOC.
Reviewed-by: Chris Wilson

but I'll leave it to someone else to vouch for bumping the requirement.
-Chris
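For readers who want to see what a "v3.9 or newer" requirement looks like in code, here is a minimal sketch of a release-string comparison. It is purely illustrative: `kernel_release_at_least` is a hypothetical helper, the input is assumed to look like the `release` field returned by uname(2), and nothing in this thread says the driver performs such a check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: parse a "major.minor[...]" release string and
 * compare it against a minimum major.minor version. */
static bool kernel_release_at_least(const char *release,
                                    int min_major, int min_minor)
{
   int major = 0, minor = 0;

   assert(release != NULL);
   if (sscanf(release, "%d.%d", &major, &minor) != 2)
      return false; /* unparseable: treat as too old */

   return major > min_major ||
          (major == min_major && minor >= min_minor);
}
```

A plain version comparison like this would reject the ChromeOS 3.8 kernel mentioned above even though it carries the backport, which is one reason probing for the feature itself is usually the more robust choice.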
Re: [Mesa-dev] [PATCH 1/5] mesa: change ctx->Color.ColorMask into a 32-bit bitmask
On Wed, Jan 31, 2018 at 9:31 PM, Roland Scheidegger wrote: > Am 31.01.2018 um 20:55 schrieb Marek Olšák: >> From: Marek Olšák >> >> 4 bits per draw buffer, 8 draw buffers in total --> 32 bits. >> >> This is easier to work with. >> --- >> src/mesa/drivers/common/driverfuncs.c| 8 ++-- >> src/mesa/drivers/common/meta.c | 41 +++--- >> src/mesa/drivers/common/meta.h | 2 +- >> src/mesa/drivers/dri/i915/intel_clear.c | 5 +-- >> src/mesa/drivers/dri/i915/intel_pixel.c | 5 +-- >> src/mesa/drivers/dri/i915/intel_pixel_copy.c | 5 +-- >> src/mesa/drivers/dri/i965/brw_blorp.c| 9 ++-- >> src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 8 ++-- >> src/mesa/drivers/dri/i965/genX_state_upload.c| 13 +++--- >> src/mesa/drivers/dri/i965/intel_pixel.c | 5 +-- >> src/mesa/drivers/dri/i965/intel_pixel_copy.c | 5 +-- >> src/mesa/drivers/dri/nouveau/nouveau_driver.c| 8 +++- >> src/mesa/drivers/dri/nouveau/nv04_context.c | 5 +-- >> src/mesa/drivers/dri/nouveau/nv04_state_raster.c | 8 ++-- >> src/mesa/drivers/dri/nouveau/nv10_state_raster.c | 8 ++-- >> src/mesa/drivers/dri/nouveau/nv20_context.c | 8 ++-- >> src/mesa/drivers/dri/r200/r200_state.c | 8 ++-- >> src/mesa/drivers/dri/radeon/radeon_state.c | 8 ++-- >> src/mesa/drivers/x11/xm_dd.c | 4 +- >> src/mesa/main/accum.c| 16 +++ >> src/mesa/main/attrib.c | 16 +++ >> src/mesa/main/blend.c| 53 >> +--- >> src/mesa/main/blend.h| 10 + >> src/mesa/main/clear.c| 2 +- >> src/mesa/main/get.c | 16 +++ >> src/mesa/main/mtypes.h | 8 +++- >> src/mesa/state_tracker/st_atom_blend.c | 27 >> src/mesa/state_tracker/st_cb_clear.c | 39 ++--- >> src/mesa/swrast/s_clear.c| 8 +++- >> src/mesa/swrast/s_context.c | 10 + >> src/mesa/swrast/s_masking.c | 24 +++ >> src/mesa/swrast/s_span.c | 6 +-- >> src/mesa/swrast/s_triangle.c | 8 ++-- >> 33 files changed, 174 insertions(+), 232 deletions(-) >> > > ... 
> > >> >> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c >> index cf9a2f6..16ad8f6 100644 >> --- a/src/mesa/main/get.c >> +++ b/src/mesa/main/get.c >> @@ -670,24 +670,24 @@ find_custom_value(struct gl_context *ctx, const struct >> value_desc *d, union valu >> >> case GL_CURRENT_TEXTURE_COORDS: >>unit = ctx->Texture.CurrentUnit; >>v->value_float_4[0] = ctx->Current.Attrib[VERT_ATTRIB_TEX0 + unit][0]; >>v->value_float_4[1] = ctx->Current.Attrib[VERT_ATTRIB_TEX0 + unit][1]; >>v->value_float_4[2] = ctx->Current.Attrib[VERT_ATTRIB_TEX0 + unit][2]; >>v->value_float_4[3] = ctx->Current.Attrib[VERT_ATTRIB_TEX0 + unit][3]; >>break; >> >> case GL_COLOR_WRITEMASK: >> - v->value_int_4[0] = ctx->Color.ColorMask[0][RCOMP] ? 1 : 0; >> - v->value_int_4[1] = ctx->Color.ColorMask[0][GCOMP] ? 1 : 0; >> - v->value_int_4[2] = ctx->Color.ColorMask[0][BCOMP] ? 1 : 0; >> - v->value_int_4[3] = ctx->Color.ColorMask[0][ACOMP] ? 1 : 0; >> + v->value_int_4[0] = GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 0) ? 1 >> : 0; >> + v->value_int_4[1] = GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 1) ? 1 >> : 0; >> + v->value_int_4[2] = GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 2) ? 1 >> : 0; >> + v->value_int_4[3] = GET_COLORMASK_BIT(ctx->Color.ColorMask, 0, 3) ? 1 >> : 0; > > Here and below, GET_COLORMASK_BIT is already defined to return 1 or 0, > so the conditional seems a bit overkill (albeit I guess the compiler is > clever enough to get rid of it?) Yes (and yes). I'll just adjust it without re-sending. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
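The "4 bits per draw buffer, 8 draw buffers" layout discussed above can be sketched as follows. This is only an illustration: the helper names are hypothetical and are not the macros from the patch, and the bit order (R in the lowest bit of each nibble) is an assumption based on the channel indices 0..3 used in the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DRAW_BUFFERS 8 /* from the commit message: 8 draw buffers */

/* Return 1 if channel `chan` (assumed 0=R, 1=G, 2=B, 3=A) of draw buffer
 * `buf` is write-enabled in the packed mask, else 0. */
unsigned
colormask_get_bit(uint32_t mask, unsigned buf, unsigned chan)
{
   assert(buf < MAX_DRAW_BUFFERS && chan < 4);
   return (mask >> (4 * buf + chan)) & 0x1u;
}

/* Return the 4-bit nibble for draw buffer `buf`. */
unsigned
colormask_get(uint32_t mask, unsigned buf)
{
   assert(buf < MAX_DRAW_BUFFERS);
   return (mask >> (4 * buf)) & 0xfu;
}

/* Return `mask` with the nibble of draw buffer `buf` replaced by `rgba`. */
uint32_t
colormask_set(uint32_t mask, unsigned buf, unsigned rgba)
{
   assert(buf < MAX_DRAW_BUFFERS && rgba <= 0xfu);
   mask &= ~(UINT32_C(0xf) << (4 * buf));
   return mask | ((uint32_t)rgba << (4 * buf));
}
```

Because the bit accessor already masks with 0x1, its result is 0 or 1 and needs no extra "? 1 : 0", which is the point Roland raises in the review.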
Re: [Mesa-dev] [PATCH] st/glsl_to_nir: add more nir opts to st_nir_opts()
Timothy Arceri writes:

> All of the current gallium nir driver use these optimisations but
> they do so in their backends. Having these called in the backend
> only can cause a number of problems:
>
> - Shader compile times are greater because the opts need to do
>   significant passes over all shader variants.
> - The shader cache is partially defeated due to the significant
>   optimisation passes over variants.
> - We might miss out on nir linking optimisation opportunities.
>
> Adding these passes to st_nir_opts() alleviates these problems.

Maybe some driver gaining NIR input would want vector math instead of
scalar, but if all of our drivers agree at the moment, then let's put
it here.

Reviewed-by: Eric Anholt
Re: [Mesa-dev] [PATCH] gallium: remove pipe_blend_state::dither
Marek Olšák writes:

> From: Marek Olšák
>
> very few drivers actually implement it.

I disagree. If the hardware supports it and the API supports it, then
we should support it, too. I've got a branch around somewhere for vc4
dithering.
Re: [Mesa-dev] [PATCH] gallium: remove pipe_blend_state::dither
On Jan 31, 2018 11:14 PM, "Eric Anholt" wrote:

> Marek Olšák writes:
>
> > From: Marek Olšák
> >
> > very few drivers actually implement it.
>
> I disagree. If the hardware supports it and the API supports it, then
> we should support it, too. I've got a branch around somewhere for vc4
> dithering.

Fair enough.

Marek
[Mesa-dev] [PATCH] r600: fix buffer resinfo opcode translation.
From: Dave Airlie

The vtx operations never got translated, so things worked only because
0 was equal to 0. Translate them so we can use the proper buffer
resinfo code.

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/r600_asm.c    | 2 +-
 src/gallium/drivers/r600/r600_shader.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_asm.c b/src/gallium/drivers/r600/r600_asm.c
index 92c2bdf..21d069d 100644
--- a/src/gallium/drivers/r600/r600_asm.c
+++ b/src/gallium/drivers/r600/r600_asm.c
@@ -1510,7 +1510,7 @@ int cm_bytecode_add_cf_end(struct r600_bytecode *bc)
 /* common to all 3 families */
 static int r600_bytecode_vtx_build(struct r600_bytecode *bc, struct r600_bytecode_vtx *vtx, unsigned id)
 {
-	bc->bytecode[id] = S_SQ_VTX_WORD0_VTX_INST(vtx->op) |
+	bc->bytecode[id] = S_SQ_VTX_WORD0_VTX_INST(r600_isa_fetch_opcode(bc->isa->hw_class, vtx->op)) |
 			S_SQ_VTX_WORD0_BUFFER_ID(vtx->buffer_id) |
 			S_SQ_VTX_WORD0_FETCH_TYPE(vtx->fetch_type) |
 			S_SQ_VTX_WORD0_SRC_GPR(vtx->src_gpr) |
diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index be02412..46e2d08 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -6939,7 +6939,7 @@ static int r600_do_buffer_txq(struct r600_shader_ctx *ctx, int reg_idx, int offs
 	} else {
 		struct r600_bytecode_vtx vtx;
 		memset(&vtx, 0, sizeof(vtx));
-		vtx.op = FETCH_OP_GDS_MIN_UINT; /* aka GET_BUFFER_RESINFO */
+		vtx.op = FETCH_OP_GET_BUFFER_RESINFO;
 		vtx.buffer_id = id + R600_MAX_CONST_BUFFERS;
 		vtx.fetch_type = SQ_VTX_FETCH_NO_INDEX_OFFSET;
 		vtx.src_gpr = 0;
--
2.9.5
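One way to picture the bug described in this patch is a table that maps abstract op indices to per-hardware-class encodings. The sketch below is an assumption-laden illustration: every enum, name, and encoding value is made up and does not match the real r600 ISA tables. It only shows why emitting the abstract index directly can appear to work for op 0 and break for any other op.

```c
#include <assert.h>

/* Everything here is hypothetical and chosen for illustration only. */
enum hw_class { HW_R600, HW_EVERGREEN, HW_CAYMAN, HW_CLASS_COUNT };

enum fetch_op { FETCH_VFETCH, FETCH_GET_RESINFO, FETCH_OP_COUNT };

/* Per-class hardware encoding of each abstract op; -1 means unsupported. */
static const int fetch_encoding[FETCH_OP_COUNT][HW_CLASS_COUNT] = {
   [FETCH_VFETCH]      = { 0x00, 0x00, 0x00 },
   [FETCH_GET_RESINFO] = {   -1, 0x0e, 0x0e },
};

/* Translate an abstract fetch op into the encoding used by `hw`. */
int
fetch_hw_opcode(enum hw_class hw, enum fetch_op op)
{
   assert(hw < HW_CLASS_COUNT && op < FETCH_OP_COUNT);
   return fetch_encoding[op][hw];
}
```

In this toy table the abstract index of FETCH_VFETCH (0) coincides with its encoding (0), so skipping the translation goes unnoticed; FETCH_GET_RESINFO has index 1 but encoding 0x0e, so it must go through the lookup.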
Re: [Mesa-dev] [PATCH v3 06/24] anv/image: Add a helper for determining when fast clears are supported
On Mon, Jan 22, 2018 at 12:19:33AM -0800, Jason Ekstrand wrote: > v2 (Jason Ekstrand): > - Return an enum instead of a boolean > > v3 (Jason Ekstrand): > - Return ANV_FAST_CLEAR_NONE instead of false (Topi) > - Rename ANV_FAST_CLEAR_ANY to ANV_FAST_CLEAR_DEFAULT_VALUE > - Add documentation for the enum values > > Reviewed-by: Topi Pohjolainen > --- > src/intel/vulkan/anv_image.c | 71 > ++ > src/intel/vulkan/anv_private.h | 16 ++ > 2 files changed, 87 insertions(+) > > diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c > index 0aa8cd9..4cd4fe1 100644 > --- a/src/intel/vulkan/anv_image.c > +++ b/src/intel/vulkan/anv_image.c > @@ -861,6 +861,77 @@ anv_layout_to_aux_usage(const struct gen_device_info * > const devinfo, > unreachable("layout is not a VkImageLayout enumeration member."); > } > > +/** > + * This function returns the level of unresolved fast-clear support of the > + * given image in the given VkImageLayout. > + * > + * @param devinfo The device information of the Intel GPU. > + * @param image The image that may contain a collection of buffers. > + * @param aspect The aspect of the image to be accessed. > + * @param layout The current layout of the image aspect(s). > + */ > +enum anv_fast_clear_type > +anv_layout_to_fast_clear_type(const struct gen_device_info * const devinfo, > + const struct anv_image * const image, > + const VkImageAspectFlagBits aspect, > + const VkImageLayout layout) > +{ > + /* The aspect must be exactly one of the image aspects. */ > + assert(_mesa_bitcount(aspect) == 1 && (aspect & image->aspects)); > + > + uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect); > + > + /* If there is no auxiliary surface allocated, there are no fast-clears */ > + if (image->planes[plane].aux_surface.isl.size == 0) > + return ANV_FAST_CLEAR_NONE; > + > + /* All images that use an auxiliary surface are required to be tiled. 
*/ > + assert(image->tiling == VK_IMAGE_TILING_OPTIMAL); > + > + /* Stencil has no aux */ > + assert(aspect != VK_IMAGE_ASPECT_STENCIL_BIT); > + > + if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) { > + /* For depth images (with HiZ), the layout supports fast-clears if and > + * only if it supports HiZ. However, we only support fast-clears to > the > + * default depth value. > + */ > + enum isl_aux_usage aux_usage = > + anv_layout_to_aux_usage(devinfo, image, aspect, layout); > + return aux_usage == ISL_AUX_USAGE_HIZ ? > + ANV_FAST_CLEAR_DEFAULT_VALUE : ANV_FAST_CLEAR_NONE; > + } > + > + assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV); > + > + /* Multisample fast-clear is not yet supported. */ > + if (image->samples > 1) > + return ANV_FAST_CLEAR_NONE; > + > + /* The only layout which actually supports fast-clears today is > +* VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL. Some day in the future > +* this may change if our ability to track clear colors improves. > +*/ > + switch (layout) { > + case VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL: > + return ANV_FAST_CLEAR_ANY; > + > + case VK_IMAGE_LAYOUT_PRESENT_SRC_KHR: > + return ANV_FAST_CLEAR_NONE; Just realized that TRANSFER_DST supports ANV_FAST_CLEAR_DEFAULT_VALUE. -Nanley > + > + default: > + /* If the image has CCS_E enabled all the time then we can use > + * fast-clear as long as the clear color is the default value of zero > + * since this is the default value we program into every surface state > + * used for texturing. 
> + */ > + if (image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E) > + return ANV_FAST_CLEAR_DEFAULT_VALUE; > + else > + return ANV_FAST_CLEAR_NONE; > + } > +} > + > > static struct anv_state > alloc_surface_state(struct anv_device *device) > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h > index cf82196..b96895b 100644 > --- a/src/intel/vulkan/anv_private.h > +++ b/src/intel/vulkan/anv_private.h > @@ -2423,6 +2423,16 @@ struct anv_image { > } planes[3]; > }; > > +/* The ordering of this enum is important */ > +enum anv_fast_clear_type { > + /** Image does not have/support any fast-clear blocks */ > + ANV_FAST_CLEAR_NONE = 0, > + /** Image has/supports fast-clear but only to the default value */ > + ANV_FAST_CLEAR_DEFAULT_VALUE = 1, > + /** Image has/supports fast-clear with an arbitrary fast-clear value */ > + ANV_FAST_CLEAR_ANY = 2, > +}; > + > /* Returns the number of auxiliary buffer levels attached to an image. */ > static inline uint8_t > anv_image_aux_levels(const struct anv_image * const image, > @@ -2545,6 +2555,12 @@ anv_layout_to_aux_usage(const struct gen_device_info * > const devinfo, > const VkImageAspectFlagBits aspect, >
Re: [Mesa-dev] [PATCH] r600: fix buffer resinfo opcode translation.
Ah I see now how that's supposed to work... Previous to adding GET_BUFFER_RESINFO the op was just a fixed zero, and the op for this is the same on eg/cm (and we should not hit it with r600). But indeed that looks more like the code elsewhere... Reviewed-by: Roland Scheidegger Am 01.02.2018 um 01:33 schrieb Dave Airlie: > From: Dave Airlie > > The vtx operations never got translated, so things worked by > 0 being equal to 0, translate them so we can use the proper buffer > resinfo code. > > Signed-off-by: Dave Airlie > --- > src/gallium/drivers/r600/r600_asm.c| 2 +- > src/gallium/drivers/r600/r600_shader.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/src/gallium/drivers/r600/r600_asm.c > b/src/gallium/drivers/r600/r600_asm.c > index 92c2bdf..21d069d 100644 > --- a/src/gallium/drivers/r600/r600_asm.c > +++ b/src/gallium/drivers/r600/r600_asm.c > @@ -1510,7 +1510,7 @@ int cm_bytecode_add_cf_end(struct r600_bytecode *bc) > /* common to all 3 families */ > static int r600_bytecode_vtx_build(struct r600_bytecode *bc, struct > r600_bytecode_vtx *vtx, unsigned id) > { > - bc->bytecode[id] = S_SQ_VTX_WORD0_VTX_INST(vtx->op) | > + bc->bytecode[id] = > S_SQ_VTX_WORD0_VTX_INST(r600_isa_fetch_opcode(bc->isa->hw_class, vtx->op)) | > S_SQ_VTX_WORD0_BUFFER_ID(vtx->buffer_id) | > S_SQ_VTX_WORD0_FETCH_TYPE(vtx->fetch_type) | > S_SQ_VTX_WORD0_SRC_GPR(vtx->src_gpr) | > diff --git a/src/gallium/drivers/r600/r600_shader.c > b/src/gallium/drivers/r600/r600_shader.c > index be02412..46e2d08 100644 > --- a/src/gallium/drivers/r600/r600_shader.c > +++ b/src/gallium/drivers/r600/r600_shader.c > @@ -6939,7 +6939,7 @@ static int r600_do_buffer_txq(struct r600_shader_ctx > *ctx, int reg_idx, int offs > } else { > struct r600_bytecode_vtx vtx; > memset(&vtx, 0, sizeof(vtx)); > - vtx.op = FETCH_OP_GDS_MIN_UINT; /* aka GET_BUFFER_RESINFO */ > + vtx.op = FETCH_OP_GET_BUFFER_RESINFO; > vtx.buffer_id = id + R600_MAX_CONST_BUFFERS; > vtx.fetch_type = 
SQ_VTX_FETCH_NO_INDEX_OFFSET; > vtx.src_gpr = 0; > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Call prepare_external after implicit window-system MSAA resolves
This fixes some rendering corruption in a couple of Android apps that
use window-system MSAA.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104741
Cc: mesa-sta...@lists.freedesktop.org
Cc: Chad Versace
---
 src/mesa/drivers/dri/i965/brw_context.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index addacf2..e5d3b5c 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1283,6 +1283,21 @@ intel_resolve_for_dri2_flush(struct brw_context *brw,
             intel_miptree_prepare_external(brw, rb->mt);
          } else {
             intel_renderbuffer_downsample(brw, rb);
+
+            /* Call prepare_external on the single-sample miptree to do any
+             * needed resolves prior to handing it off to the window system.
+             * This is needed in the case that rb->singlesample_mt is Y-tiled
+             * with CCS_E enabled but without I915_FORMAT_MOD_Y_TILED_CCS_E. In
+             * this case, the MSAA resolve above will write compressed data into
+             * rb->singlesample_mt.
+             *
+             * TODO: Some day, if we decide to care about the tiny performance
+             * hit we're taking by doing the MSAA resolve and then a CCS resolve,
+             * we could detect this case and just allocate the single-sampled
+             * miptree without aux. However, that would be a lot of plumbing and
+             * this is a rather exotic case so it's not really worth it.
+             */
+            intel_miptree_prepare_external(brw, rb->singlesample_mt);
          }
       }
    }
--
2.5.0.400.gff86faf
[Mesa-dev] [PATCH] r600/eg: make sure we allow vpm bit on other CF ops.
From: Dave Airlie

The vpm bit wasn't being applied to the push/pop instructions.

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/eg_asm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/r600/eg_asm.c b/src/gallium/drivers/r600/eg_asm.c
index f8651bd..c03a9d8 100644
--- a/src/gallium/drivers/r600/eg_asm.c
+++ b/src/gallium/drivers/r600/eg_asm.c
@@ -137,6 +137,7 @@ int eg_bytecode_cf_build(struct r600_bytecode *bc, struct r600_bytecode_cf *cf)
 			/* other instructions */
 			bc->bytecode[id++] = S_SQ_CF_WORD0_ADDR(cf->cf_addr >> 1);
 			bc->bytecode[id] = S_SQ_CF_WORD1_CF_INST(opcode) |
+				S_SQ_CF_WORD1_VALID_PIXEL_MODE(cf->vpm) |
 				S_SQ_CF_WORD1_BARRIER(1) |
 				S_SQ_CF_WORD1_COND(cf->cond) |
 				S_SQ_CF_WORD1_POP_COUNT(cf->pop_count) |
--
2.9.5
[Mesa-dev] [PATCH] r600/cayman: initial attempt at gl_HelperInvocation (v2)
From: Dave Airlie This is a cayman only patch, it doesn't appear that evergreen supports the ALU on VPM. I'll try and figure it out later. All I can say for this patch is it passes the piglit test and the CTS tests. This also disable sb for helper invocations until it can handle the special ALU clause I'd like to push this (evergreen is left as an exercise for the reader :-) v2: move to using alu vpm mode, and just setting 0, -1. move calcs to top of pixel shader and store value. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/r600_isa.c| 1 + src/gallium/drivers/r600/r600_isa.h| 5 +-- src/gallium/drivers/r600/r600_shader.c | 64 ++ src/gallium/drivers/r600/r600_shader.h | 1 + 4 files changed, 69 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/r600/r600_isa.c b/src/gallium/drivers/r600/r600_isa.c index 2633cdcdb9..611b370bf5 100644 --- a/src/gallium/drivers/r600/r600_isa.c +++ b/src/gallium/drivers/r600/r600_isa.c @@ -506,6 +506,7 @@ static const struct cf_op_info cf_op_table[] = { {"ALU_EXT", { -1, -1, 0x0C, 0x0C }, CF_CLAUSE | CF_ALU | CF_ALU_EXT }, {"ALU_CONTINUE", { 0x0D, 0x0D, 0x0D, -1 }, CF_CLAUSE | CF_ALU }, {"ALU_BREAK", { 0x0E, 0x0E, 0x0E, -1 }, CF_CLAUSE | CF_ALU }, + {"ALU_VALID_PIXEL_MODE", { -1, -1, -1, 0x0E }, CF_CLAUSE | CF_ALU }, {"ALU_ELSE_AFTER",{ 0x0F, 0x0F, 0x0F, 0x0F }, CF_CLAUSE | CF_ALU }, {"CF_NATIVE", { 0x00, 0x00, 0x00, 0x00 }, 0 } }; diff --git a/src/gallium/drivers/r600/r600_isa.h b/src/gallium/drivers/r600/r600_isa.h index f6e26976c5..fcaf1f766b 100644 --- a/src/gallium/drivers/r600/r600_isa.h +++ b/src/gallium/drivers/r600/r600_isa.h @@ -646,10 +646,11 @@ struct cf_op_info #define CF_OP_ALU_EXT 84 #define CF_OP_ALU_CONTINUE 85 #define CF_OP_ALU_BREAK86 -#define CF_OP_ALU_ELSE_AFTER 87 +#define CF_OP_ALU_VALID_PIXEL_MODE 87 +#define CF_OP_ALU_ELSE_AFTER 88 /* CF_NATIVE means that r600_bytecode_cf contains pre-encoded native data */ -#define CF_NATIVE 88 +#define CF_NATIVE 89 enum r600_chip_class { ISA_CC_R600, diff 
--git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index a462691f7a..54c67c7f83 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -197,6 +197,7 @@ int r600_pipe_shader_create(struct pipe_context *ctx, use_sb &= !shader->shader.uses_atomics; use_sb &= !shader->shader.uses_images; + use_sb &= !shader->shader.uses_helper_invocation; /* Check if the bytecode has already been built. */ if (!shader->shader.bc.bytecode) { @@ -346,6 +347,7 @@ struct r600_shader_ctx { boolean clip_vertex_write; unsignedcv_output; unsignededgeflag_output; + int helper_invoc_reg; int cs_block_size_reg; int cs_grid_size_reg; bool cs_block_size_loaded, cs_grid_size_loaded; @@ -1295,6 +1297,44 @@ static int load_sample_position(struct r600_shader_ctx *ctx, struct r600_shader_ return t1; } +static int eg_load_helper_invocation(struct r600_shader_ctx *ctx) +{ + /* TODO eg support */ + return -1; +} + +static int cm_load_helper_invocation(struct r600_shader_ctx *ctx) +{ + int r; + + struct r600_bytecode_alu alu; + + memset(&alu, 0, sizeof(struct r600_bytecode_alu)); + alu.op = ALU_OP1_MOV; + alu.dst.sel = ctx->helper_invoc_reg; + alu.dst.chan = 0; + alu.src[0].sel = V_SQ_ALU_SRC_LITERAL; + alu.src[0].value = 0x; + alu.dst.write = 1; + alu.last = 1; + r = r600_bytecode_add_alu(ctx->bc, &alu); + if (r) + return r; + + memset(&alu, 0, sizeof(struct r600_bytecode_alu)); + alu.op = ALU_OP1_MOV; + alu.dst.sel = ctx->helper_invoc_reg; + alu.dst.chan = 0; + alu.src[0].sel = V_SQ_ALU_SRC_0; + alu.dst.write = 1; + alu.last = 1; + r = r600_bytecode_add_alu_type(ctx->bc, &alu, CF_OP_ALU_VALID_PIXEL_MODE); + if (r) + return r; + + return 0; +} + static int load_block_grid_size(struct r600_shader_ctx *ctx, bool load_block) { struct r600_bytecode_vtx vtx; @@ -1458,6 +1498,12 @@ static void tgsi_src(struct r600_shader_ctx *ctx, r600_src->sel = load_block_grid_size(ctx, false); } else if 
(ctx->info.system_value_semantic_name[tgsi_src->Register.Index] == TGSI_SEMANTIC_BLOCK_SIZE) {
[Mesa-dev] [PATCH 4/9] glsl/lower_64bit: use the correct packing function for doubles
From: Dave Airlie

This picks the correct double packing function.

Signed-off-by: Dave Airlie
---
 src/compiler/glsl/lower_64bit.cpp | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index c7c6d1cb31..020ec2e9c3 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -255,8 +255,10 @@ lower_64bit::compact_destination(ir_factory &body,
                                  ir_variable *result[4])
 {
    const ir_expression_operation pack_opcode =
-      type->base_type == GLSL_TYPE_UINT64
-      ? ir_unop_pack_uint_2x32 : ir_unop_pack_int_2x32;
+      type->base_type == GLSL_TYPE_DOUBLE
+      ? ir_unop_pack_double_2x32 :
+      (type->base_type == GLSL_TYPE_UINT64
+       ? ir_unop_pack_uint_2x32 : ir_unop_pack_int_2x32);
 
    ir_variable *const compacted_result =
       body.make_temp(type, "compacted_64bit_result");
--
2.14.3
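To illustrate why the pack operation has to match the destination type, here is a small C analogue of packing two 32-bit words into a 64-bit value. The function names echo the IR op names but are stand-ins written for this note, not Mesa code. The sketch assumes IEEE-754 binary64 doubles and the GLSL convention that the first component holds the least significant 32 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Combine two 32-bit halves into one 64-bit integer (lo = low bits). */
uint64_t
pack_uint_2x32(uint32_t lo, uint32_t hi)
{
   return ((uint64_t)hi << 32) | lo;
}

/* Same bit pattern, but reinterpreted as a double instead of an integer.
 * memcpy is used so the reinterpretation is well defined in C. */
double
pack_double_2x32(uint32_t lo, uint32_t hi)
{
   uint64_t bits = pack_uint_2x32(lo, hi);
   double d;

   assert(sizeof d == sizeof bits);
   memcpy(&d, &bits, sizeof d);
   return d;
}
```

The same pair of words yields a large integer through one function and the value 1.0 or 2.0 through the other, so a typed IR needs the matching pack op to give the result the right type.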
[Mesa-dev] [PATCH 3/9] glsl/lower_64bit: extract non-64bit sources from vectors.
From: Dave Airlie In order to deal with conversions properly we need to extract non-64bit sources from vectors instead of expanding them as the 64-bit code does. We need non-64bit sources for the 32->64 conversion functions. Signed-off-by: Dave Airlie --- src/compiler/glsl/lower_64bit.cpp | 38 ++ 1 file changed, 34 insertions(+), 4 deletions(-) diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp index ac62d1db1e..c7c6d1cb31 100644 --- a/src/compiler/glsl/lower_64bit.cpp +++ b/src/compiler/glsl/lower_64bit.cpp @@ -52,6 +52,7 @@ using namespace ir_builder; namespace lower_64bit { void expand_source(ir_factory &, ir_rvalue *val, ir_variable **expanded_src); +void extract_source(ir_factory &, ir_rvalue *val, ir_variable **extracted_src); ir_dereference_variable *compact_destination(ir_factory &, const glsl_type *type, @@ -226,6 +227,25 @@ lower_64bit::expand_source(ir_factory &body, expanded_src[i] = expanded_src[0]; } +void +lower_64bit::extract_source(ir_factory &body, +ir_rvalue *val, +ir_variable **extracted_src) +{ + ir_variable *const temp = body.make_temp(val->type, "tmp"); + + body.emit(assign(temp, val)); + unsigned i; + for (i = 0; i < val->type->vector_elements; i++) { + extracted_src[i] = body.make_temp(val->type->get_scalar_type(), "extracted_source"); + + body.emit(assign(extracted_src[i], swizzle(temp, i, 1))); + } + + for (/* empty */; i < 4; i++) + extracted_src[i] = extracted_src[0]; +} + /** * Convert a series of uvec2 results into a single 64-bit integer vector */ @@ -262,14 +282,24 @@ lower_64bit::lower_op_to_function_call(ir_instruction *base_ir, void *const mem_ctx = ralloc_parent(ir); exec_list instructions; unsigned source_components = 0; - const glsl_type *const result_type = - ir->type->base_type == GLSL_TYPE_UINT64 - ? 
glsl_type::uvec2_type : glsl_type::ivec2_type; + const glsl_type *result_type; + + if (ir->type->is_64bit()) { + if (ir->type->base_type == GLSL_TYPE_UINT64 || + ir->type->base_type == GLSL_TYPE_DOUBLE) + result_type = glsl_type::uvec2_type; + else + result_type = glsl_type::ivec2_type; + } else + result_type = ir->type->get_scalar_type(); ir_factory body(&instructions, mem_ctx); for (unsigned i = 0; i < num_operands; i++) { - expand_source(body, ir->operands[i], src[i]); + if (ir->operands[i]->type->is_64bit()) + expand_source(body, ir->operands[i], src[i]); + else + extract_source(body, ir->operands[i], src[i]); if (ir->operands[i]->type->vector_elements > source_components) source_components = ir->operands[i]->type->vector_elements; -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
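The padding behaviour of extract_source() in this patch can be shown with a plain-C analogue. This is a sketch only: float scalars stand in for the ir_variable pointers, and extract_components() is a hypothetical name.

```c
#include <assert.h>

/* Copy the `n` components of `vec` into a fixed array of four slots.
 * Slots past the end of the vector reuse component 0, mirroring the
 * second loop in the patch. */
void
extract_components(const float *vec, unsigned n, float out[4])
{
   unsigned i;

   assert(n >= 1 && n <= 4);

   for (i = 0; i < n; i++)
      out[i] = vec[i];

   for (/* empty */; i < 4; i++)
      out[i] = out[0];
}
```

Filling the unused slots with a valid value means later code can index all four entries without first checking the vector width.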
[Mesa-dev] [PATCH 5/9] glsl/lower_64bit: add ability to handle 32-bit sources.
From: Dave Airlie

If this function saw a 32-bit source it would just return the IR
without doing any conversion. This adds the ability to denote where
32-bit sources are expected, and will be used in subsequent patches to
add 32->64 conversions.

Signed-off-by: Dave Airlie
---
 src/compiler/glsl/lower_64bit.cpp | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 020ec2e9c3..b72b5cf799 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -122,7 +122,7 @@ private:
    ir_factory added_functions;
 
    ir_rvalue *handle_op(ir_expression *ir, const char *function_name,
-                        function_generator generator);
+                        function_generator generator, bool conv_from_32bit = false);
 };
 
 } /* anonymous namespace */
@@ -347,12 +347,15 @@ lower_64bit::lower_op_to_function_call(ir_instruction *base_ir,
 ir_rvalue *
 lower_64bit_visitor::handle_op(ir_expression *ir,
                                const char *function_name,
-                               function_generator generator)
+                               function_generator generator,
+                               bool conv_from_32bit)
 {
-   for (unsigned i = 0; i < ir->num_operands; i++)
-      if (!ir->operands[i]->type->is_integer_64())
-         return ir;
-
+   if (conv_from_32bit == false) {
+      for (unsigned i = 0; i < ir->num_operands; i++)
+         if (!ir->operands[i]->type->is_integer_64() &&
+             !ir->operands[i]->type->is_double())
+            return ir;
+   }
    /* Get a handle to the correct ir_function_signature for the core
     * operation.
     */
--
2.14.3
[Mesa-dev] [PATCH 8/9] st/glsl: lower int->fp64 conversion if cap is set.
From: Dave Airlie

This just enables the lowering if requested.

Signed-off-by: Dave Airlie
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 43dd5fdd40..24dd5b35fd 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -7028,8 +7028,15 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
                                 options->EmitNoIndirectUniform);
       }
 
+      unsigned what_to_64bit_lower = 0;
+
+      if (pscreen->get_param(pscreen, PIPE_CAP_LOWER_INT_TO_FP64_CONVERSIONS))
+         what_to_64bit_lower |= UI2D;
       if (!pscreen->get_param(pscreen, PIPE_CAP_INT64_DIVMOD))
-         lower_64bit_instructions(ir, DIV64 | MOD64);
+         what_to_64bit_lower |= DIV64 | MOD64;
+
+      if (what_to_64bit_lower)
+         lower_64bit_instructions(ir, what_to_64bit_lower);
 
       if (ctx->Extensions.ARB_shading_language_packing) {
          unsigned lower_inst = LOWER_PACK_SNORM_2x16 |
--
2.14.3
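The flag accumulation in this patch can be restated as a small standalone function. In the sketch below the flag values are the ones shown for ir_optimization.h in this series, while lowering_mask() and its two boolean parameters are hypothetical stand-ins for the pscreen->get_param() queries.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as they appear in ir_optimization.h in this series. */
#define DIV64 (1U << 2)
#define MOD64 (1U << 3)
#define UI2D  (1U << 4)

/* Hypothetical helper: build the set of 64-bit lowerings to request.
 * The parameters stand in for the driver capability queries. */
unsigned
lowering_mask(bool lower_int_to_fp64, bool has_int64_divmod)
{
   unsigned mask = 0;

   if (lower_int_to_fp64)
      mask |= UI2D;
   if (!has_int64_divmod)
      mask |= DIV64 | MOD64;

   /* A result of 0 means the lowering pass can be skipped entirely. */
   return mask;
}
```

Collecting the flags first and calling the pass once keeps a single walk over the IR regardless of how many lowerings the driver asks for.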
[Mesa-dev] [PATCH 6/9] glsl/lower_64bit: add flag to denote we want to lower int/uint->double
From: Dave Airlie

Some hardware has no conversion for these (cayman), so we want to
lower them early using the common code. This adds a flag to allow the
lowering pass to take these conversions into consideration.

Signed-off-by: Dave Airlie
---
 src/compiler/glsl/ir_optimization.h |  2 ++
 src/compiler/glsl/lower_64bit.cpp   | 12 +++++++++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 5b21319261..0e76951bda 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -60,6 +60,8 @@
 #define SIGN64            (1U << 1)
 #define DIV64             (1U << 2)
 #define MOD64             (1U << 3)
+/* lower u->d and i->d */
+#define UI2D              (1U << 4)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index b72b5cf799..a9b2b98f83 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -419,7 +419,17 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
          *rvalue = handle_op(ir, "__builtin_umul64", generate_ir::umul64);
       }
       break;
-
+   case ir_unop_i2d:
+      if (lowering(UI2D)) {
+         assert(ir->type->base_type == GLSL_TYPE_DOUBLE);
+         *rvalue = handle_op(ir, "__builtin_int_to_fp64", generate_ir::int_to_fp64, true);
+      }
+      break;
+   case ir_unop_u2d:
+      if (lowering(UI2D)) {
+         assert(ir->type->base_type == GLSL_TYPE_DOUBLE);
+         *rvalue = handle_op(ir, "__builtin_uint_to_fp64", generate_ir::uint_to_fp64, true);
+      }
    default:
       break;
    }
--
2.14.3
[Mesa-dev] some initial fp64 lowering to fix some cayman tests
Elie has been working on soft-fp64 code in a branch for a while, and
I've been slowly helping to get it in place so we can enable GL 4.x on
the evergreen GPUs.

However, I've discovered that on cayman we don't have int->double or
uint->double support, and my conversion functions that went through
f32 (int->f32->f64 and uint->f32->f64) weren't passing CTS tests or
piglits.

Instead of trying to do all this work in the r600 shader code, I've
decided to start bringing in some of Elie's work now and use it to fix
the problem. This isn't soft-fp64, but it adds some of the
infrastructure that soft-fp64 will need, and might be an easier way to
start introducing the big picture.

Dave.
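The mail does not say why the int->f32->f64 path failed the tests. One plausible reason, offered here as an assumption, is that a 32-bit float has only a 24-bit significand, so the intermediate step rounds large integers before they ever reach double precision. A minimal C sketch with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Convert through a 32-bit float first, like an int->f32->f64 path.
 * The volatile forces the value to be rounded to float precision. */
double
int_to_double_via_float(int32_t x)
{
   volatile float f = (float)x;
   return (double)f;
}

/* Direct conversion: every 32-bit integer is exactly representable
 * in an IEEE-754 double. */
double
int_to_double_direct(int32_t x)
{
   return (double)x;
}
```

For 16777217 (2^24 + 1) the direct conversion is exact, while the two-step path returns 16777216.0 under the default round-to-nearest mode. Values up to 2^24 in magnitude survive both paths unchanged.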