On 19 December 2013 14:52, Francisco Jerez <curroje...@riseup.net> wrote:
> Paul Berry <stereotype...@gmail.com> writes: > > >[...] > > In v2 of this patch, you add the following code to > > fs_visitor::try_copy_propagate(): > > > > + /* Bail if the result of composing both strides cannot be expressed > > + * as another stride. > > + */ > > + if (entry->src.stride != 1 && > > + (inst->src[arg].stride * > > + type_sz(inst->src[arg].type)) % type_sz(entry->src.type) != 0) > > return false; > > > > I don't understand. It seems like this is trying to make sure that > > (src_stride * src_type_sz) / entry_type_sz is an integer so we can later > > use the factor (src_type_sz / entry_type_sz) as a multiplicative factor > to > > correct the stride without creating a fractional result. But I don't see > > us applying this correction factor anywhere. Is there some code missing? > > That isn't exactly the purpose of that check. > > The problem arises when you are trying to compose two regions of > different base types. If the stride in bytes of 'inst->src[arg]' is not > a multiple of the element size of 'entry->src' you end up with an > inhomogeneous stride that can only be expressed as a two-dimensional > region. Instead of adding support for two-dimensional regions (which > would be hard, not very useful, and only a partial solution because then > we would need to handle composition of two-dimensional regions that > yield three- and four-dimensional regions as a result) I disallowed > copy-propagation in that case. > > Thanks. > Ok, I see. I had to spend quite a while fiddling with scratch paper to understand this. Would you mind adding an example to the comment to clarify this? Perhaps something like this: /* Bail if the result of composing both strides cannot be expressed as * another stride. This avoids, for example, trying to transform this: * * MOV (8) rX<1>UD rY<0;1,0>UD * FOO (8) ... rX<8;8,2>UW * * into this: * * FOO (8) ... rY<8;8,0>UW * * Which would have different semantics. */ Note: the code sequence in my example is bogus because it crosses data between SIMD execution paths, so it would never actually arise in practice. I wasn't able to come up with a realistic example; in fact, after trying and failing to come up with a non-bogus example, I'm pretty convinced that this particular bailout case will never happen unless there's a bug elsewhere in the compiler. You might want to consider adding an assert(false) into the code path so that if it ever does occur we'll have a chance to investigate. I don't have strong feelings about adding the assert, though. With the comment updated (and the other comment suggestion from my previous email about is_power_of_two()), this patch is: Reviwed-by: Paul Berry <stereotype...@gmail.com>
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev