Jason Ekstrand <ja...@jlekstrand.net> writes: > On Fri, Jul 31, 2015 at 6:15 AM, Francisco Jerez <curroje...@riseup.net> > wrote: >> Jason Ekstrand <ja...@jlekstrand.net> writes: >> >>> On Thu, Jul 23, 2015 at 4:38 AM, Francisco Jerez <curroje...@riseup.net> >>> wrote: >>>> Jason Ekstrand <ja...@jlekstrand.net> writes: >>>> >>>>> This all looks correct as far as I can tell. However, I'm very >>>>> concerned about the number of checks such as >>>>> has_matching_typed_format() that are built-in to the compiler (via >>>>> surface_builder) where we then go on to do something that is highly >>>>> dependant on state setup doing the exact same check (not via >>>>> surface_builder) and doing the right thing. Another example would be >>>>> the interplay with has_split_bit_layout. How do we know these won't >>>>> get out of sync? >>>>> >>>> They cannot get out of sync because the policy which surface format to >>>> use for a given GLSL format is shared between the compiler and the >>>> state-setup code, see brw_lower_mesa_image_format() in "i965: Implement >>>> surface state set-up for shader images.". >>> >>> I came back to this today and looked at it again. I think my main >>> objection is that all of these metadata functions work in terms of a >>> devinfo and a format and then immediately call >>> brw_lower_mesa_image_format. Why don't they work instead in terms >>> comparing the lowered format to the nominal format? Then it would be >>> a lot more obvious that you're doing a transformation from one format >>> to another. As is, it's a bunch of magic "what do I have to do next" >>> queries >> >> I don't think any of these functions implies what to do next, they all >> encode clearly defined device-specific properties of the formats: >> >> - Does the device support any format with the same bit layout? >> (has_supported_bit_layout()) >> >> - Does the device support any format with the same bit layout *and* >> encoding? (is_conversion_trivial()) >> >> - Does the device support a typed format of the same size? >> (has_matching_typed_format()) >> >> - Does the device split the pixel data into a number of discontiguous >> segments? (has_split_bit_layout()) >> >> - Does the device return garbage in the unused bits when reading from >> the given format? (has_undefined_high_bits()) >> >> - Do all fields of the format have the same size? (This is not even >> device-specific) >> >> - Is the format represented as a signed integer in memory? (Ditto) >> >> You may have noticed that except for the last two the natural subject >> and object of all questions are "device" and "format" respectively -- >> I'm not sure I see how rephrasing them to involve the lowered format >> (that in fact a number of them don't care about) would make any of these >> questions easier to understand. > > I *understand* the questions being asked. My claim is this is the > *wrong* set of questions. By the time you get to this stage, someone > higher up has made the decision that they're not going to give you > format A but will rather give you format B which they know will also > work. (This decision is made by code that understands the hardware > limitations and you're guaranteed that B is a format you can read and > transform into A.) The fact that that decision is passed down through > brw_lower_mesa_image_format is secondary. At this stage, you have two > formats, A and B and need to figure out how to get from one to the > other. > > For example, instead of "Does the device support a typed format of the > same size?" the better question would be "Are these two formats the > same size?".
Those two questions are not equivalent, lowered formats are always the same size as the original, whether a format is supported for typed surface access or not is inherently devinfo-specific, the devinfo argument couldn't be replaced with a lower_format argument. The same goes for "Does the device return garbage in the unused bits when reading From the given format?". > Instead of asking "Does the device support any format with the same > bit layout *and* encoding?" you should ask "Do these two formats have > the same bit layout *and* encoding?" > *Shrug*, I think they're only wrong questions if you start from the assumption that the policy that computes the lowered format from a given image format is broken -- E.g. that it gives you a suboptimal format of different encoding even though the hardware supports an identically encoded one. You can quickly rule out that assumption by looking up the definition of is_conversion_trivial() and seeing that it does little more than call brw_lower_mesa_image_format(). I'm afraid that I was slightly misleading earlier: is_conversion_trivial() is really about the encoding and only has indirect bit layout implications. The current implementation is based on the assumption that the lowered format we're converting from or to is just the canonical one -- I.e. it cannot be readily extended to check whether the conversion between any two arbitrary formats is trivial, so even though replacing the devinfo argument with a lower_format would be technically possible it would lead to an IMO a more disturbing problem. is_conversion_trivial(mesa_format, mesa_format) would be a partial function with ill-defined results for certain combinations of its arguments, and it wouldn't have any way to assert() that the pair is sensible without an additional devinfo pointer. To make the matter worse the C++ type system cannot distinguish between a lowered and an unlowered format while it can distinguish between a devinfo and a format. The current image_format_info queries are total functions of their arguments and they are all consistent about taking the original format *only*. I'd like to preserve those properties. > Does the difference I'm making make sense? In the current code, it > looks like you're doing device queries and making decisions about how > to work around device limitations. emit_image_load and store don't make any decisions themselves, and I'm not sure I fully understand what made you think they were -- The code-paths taken through them are driven by the image_format_info queries so I understand that you may have thought that *those* were the ones making decisions, or not, either way it's not an obstacle to convince oneself of the correctness of emit_image_load/store if you temporarily assume that the results returned by image_format_info are sensible and consistent with the canonical format lowering policy -- Assumption you can quickly verify afterwards because their semantics are clearly defined and their result is usually a simple function of brw_lower_mesa_image_format(). There's one case however in which I think it would be fine to replace the devinfo argument with a mesa_format one (although it would somewhat break the rule of only passing unlowered formats to the queries). It may also be the one that contributed the most to your impression that emit_image_load and store were making decisions: We could change "has_supported_bit_layout(devinfo, format)" to "has_same_bit_layout(format, lower_format)". has_same_bit_layout() would be total and commutative, so my objections above don't apply. For the other ones I could mention more explicitly in the doxygen comments that the functions don't make any decisions independently and the result is based on the usual lowering policy for the given device. > Hence my original comment about things getting out of sync. In > reality, all of those decisions have already been made by > brw_lower_mesa_image_format, this code is just doing what it needs to > do to go from format A to format B. > >>> which, while in the same namespace, are in a different file. >> >> I happen to have cleaned that up to be in the same file a short while >> ago -- They're no longer used by anything else outside of >> brw_fs_surface_builder.cpp: >> http://cgit.freedesktop.org/~currojerez/mesa/commit/?h=image-load-store-lower&id=6b83876a343cabb89083212d20d5b3721b10c200 >> >>> It would also be more efficient because you would only call >>> brw_lower_mesa_image_format once. >> >> A more elegant way to "fix" that would be to mark >> brw_lower_mesa_image_format() as pure, or enable LTO -- It's not like it >> would make any measurable difference in practice though. >> >>> >>> Does that make more sense? >>> --Jason >>> >>>> Thanks. >>>> >>>>> On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez <curroje...@riseup.net> >>>>> wrote: >>>>>> v2: Drop VEC4 suport. >>>>>> v3: Rebase. >>>>>> --- >>>>>> .../drivers/dri/i965/brw_fs_surface_builder.cpp | 216 >>>>>> +++++++++++++++++++++ >>>>>> src/mesa/drivers/dri/i965/brw_fs_surface_builder.h | 17 ++ >>>>>> 2 files changed, 233 insertions(+) >>>>>> >>>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp >>>>>> b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp >>>>>> index ea1c4aa..46b449f 100644 >>>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp >>>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp >>>>>> @@ -587,3 +587,219 @@ namespace { >>>>>> } >>>>>> } >>>>>> } >>>>>> + >>>>>> +namespace brw { >>>>>> + namespace image_access { >>>>>> + /** >>>>>> + * Load a vector from a surface of the given format and >>>>>> dimensionality >>>>>> + * at the given coordinates. >>>>>> + */ >>>>>> + fs_reg >>>>>> + emit_image_load(const fs_builder &bld, >>>>>> + const fs_reg &image, const fs_reg &addr, >>>>>> + mesa_format format, unsigned dims) >>>>>> + { >>>>>> + using namespace image_format_info; >>>>>> + using namespace image_format_conversion; >>>>>> + using namespace image_validity; >>>>>> + using namespace image_coordinates; >>>>>> + using namespace surface_access; >>>>>> + const brw_device_info *devinfo = bld.shader->devinfo; >>>>>> + const mesa_format lower_format = >>>>>> + brw_lower_mesa_image_format(devinfo, format); >>>>>> + fs_reg tmp; >>>>>> + >>>>>> + if (has_matching_typed_format(devinfo, format)) { >>>>>> + /* Hopefully we get here most of the time... */ >>>>>> + tmp = emit_typed_read(bld, image, addr, dims, >>>>>> + >>>>>> _mesa_format_num_components(lower_format)); >>>>>> + } else { >>>>>> + /* Untyped surface reads return 32 bits of the surface per >>>>>> + * component, without any sort of unpacking or type >>>>>> conversion, >>>>>> + */ >>>>>> + const unsigned size = _mesa_get_format_bytes(format) / 4; >>>>>> + >>>>>> + /* they don't properly handle out of bounds access, so we >>>>>> have to >>>>>> + * check manually if the coordinates are valid and >>>>>> predicate the >>>>>> + * surface read on the result, >>>>>> + */ >>>>>> + const brw_predicate pred = >>>>>> + emit_bounds_check(bld, image, addr, dims); >>>>>> + >>>>>> + /* and they don't know about surface coordinates, we need to >>>>>> + * convert them to a raw memory offset. >>>>>> + */ >>>>>> + const fs_reg laddr = emit_address_calculation(bld, image, >>>>>> addr, dims); >>>>>> + >>>>>> + tmp = emit_untyped_read(bld, image, laddr, 1, size, pred); >>>>>> + >>>>>> + /* An out of bounds surface access should give zero as >>>>>> result. */ >>>>>> + for (unsigned c = 0; c < 4; ++c) >>>>>> + set_predicate(pred, bld.SEL(offset(tmp, bld, c), >>>>>> + offset(tmp, bld, c), >>>>>> fs_reg(0))); >>>>>> + } >>>>>> + >>>>>> + /* Set the register type to D instead of UD if the data type is >>>>>> + * represented as a signed integer in memory so that sign >>>>>> extension >>>>>> + * is handled correctly by unpack. >>>>>> + */ >>>>>> + if (needs_sign_extension(format)) >>>>>> + tmp = retype(tmp, BRW_REGISTER_TYPE_D); >>>>>> + >>>>>> + if (!has_supported_bit_layout(devinfo, format)) { >>>>>> + /* Unpack individual vector components from the bitfield if >>>>>> the >>>>>> + * hardware is unable to do it for us. >>>>>> + */ >>>>>> + if (has_split_bit_layout(devinfo, format)) >>>>>> + tmp = emit_pack(bld, tmp, get_bit_shifts(lower_format), >>>>>> + get_bit_widths(lower_format)); >>>>>> + else >>>>>> + tmp = emit_unpack(bld, tmp, get_bit_shifts(format), >>>>>> + get_bit_widths(format)); >>>>>> + >>>>>> + } else if ((needs_sign_extension(format) && >>>>>> + !is_conversion_trivial(devinfo, format)) || >>>>>> + has_undefined_high_bits(devinfo, format)) { >>>>>> + /* Perform a trivial unpack even though the bit layout >>>>>> matches in >>>>>> + * order to get the most significant bits of each component >>>>>> + * initialized properly. >>>>>> + */ >>>>>> + tmp = emit_unpack(bld, tmp, color_u(0, 32, 64, 96), >>>>>> + get_bit_widths(format)); >>>>>> + } >>>>>> + >>>>>> + if (!_mesa_is_format_integer(format)) { >>>>>> + if (is_conversion_trivial(devinfo, format)) { >>>>>> + /* Just need to cast the vector to the target type. */ >>>>>> + tmp = retype(tmp, BRW_REGISTER_TYPE_F); >>>>>> + } else { >>>>>> + /* Do the right sort of type conversion to float. */ >>>>>> + if (_mesa_get_format_datatype(format) == GL_FLOAT) >>>>>> + tmp = emit_convert_from_float( >>>>>> + bld, tmp, get_bit_widths(format)); >>>>>> + else >>>>>> + tmp = emit_convert_from_scaled( >>>>>> + bld, tmp, get_bit_widths(format), >>>>>> + _mesa_is_format_signed(format)); >>>>>> + } >>>>>> + } >>>>>> + >>>>>> + /* Initialize missing components of the result. */ >>>>>> + return emit_pad(bld, tmp, get_bit_widths(format)); >>>>>> + } >>>>>> + >>>>>> + /** >>>>>> + * Store a vector in a surface of the given format and >>>>>> dimensionality at >>>>>> + * the given coordinates. >>>>>> + */ >>>>>> + void >>>>>> + emit_image_store(const fs_builder &bld, const fs_reg &image, >>>>>> + const fs_reg &addr, const fs_reg &src, >>>>>> + mesa_format format, unsigned dims) >>>>>> + { >>>>>> + using namespace image_format_info; >>>>>> + using namespace image_format_conversion; >>>>>> + using namespace image_validity; >>>>>> + using namespace image_coordinates; >>>>>> + using namespace surface_access; >>>>>> + const brw_device_info *devinfo = bld.shader->devinfo; >>>>>> + >>>>>> + if (format == MESA_FORMAT_NONE) { >>>>>> + /* We don't know what the format is, but that's fine >>>>>> because it >>>>>> + * implies write-only access, and typed surface writes are >>>>>> always >>>>>> + * able to take care of type conversion and packing for us. >>>>>> + */ >>>>>> + emit_typed_write(bld, image, addr, src, dims, 4); >>>>>> + >>>>>> + } else { >>>>>> + const mesa_format lower_format = >>>>>> + brw_lower_mesa_image_format(devinfo, format); >>>>>> + fs_reg tmp = src; >>>>>> + >>>>>> + if (!is_conversion_trivial(devinfo, format)) { >>>>>> + /* Do the right sort of type conversion. */ >>>>>> + if (_mesa_get_format_datatype(format) == GL_FLOAT) >>>>>> + tmp = emit_convert_to_float(bld, tmp, >>>>>> get_bit_widths(format)); >>>>>> + >>>>>> + else if (_mesa_is_format_integer(format)) >>>>>> + tmp = emit_convert_to_integer(bld, tmp, >>>>>> get_bit_widths(format), >>>>>> + >>>>>> _mesa_is_format_signed(format)); >>>>>> + >>>>>> + else >>>>>> + tmp = emit_convert_to_scaled(bld, tmp, >>>>>> get_bit_widths(format), >>>>>> + >>>>>> _mesa_is_format_signed(format)); >>>>>> + } >>>>>> + >>>>>> + /* We're down to bit manipulation at this point. */ >>>>>> + tmp = retype(tmp, BRW_REGISTER_TYPE_UD); >>>>>> + >>>>>> + if (!has_supported_bit_layout(devinfo, format)) { >>>>>> + /* Pack the vector components into a bitfield if the >>>>>> hardware >>>>>> + * is unable to do it for us. >>>>>> + */ >>>>>> + if (has_split_bit_layout(devinfo, format)) >>>>>> + tmp = emit_unpack(bld, tmp, >>>>>> get_bit_shifts(lower_format), >>>>>> + get_bit_widths(lower_format)); >>>>>> + >>>>>> + else >>>>>> + tmp = emit_pack(bld, tmp, get_bit_shifts(format), >>>>>> + get_bit_widths(format)); >>>>>> + } >>>>>> + >>>>>> + if (has_matching_typed_format(devinfo, format)) { >>>>>> + /* Hopefully we get here most of the time... */ >>>>>> + emit_typed_write(bld, image, addr, tmp, dims, >>>>>> + >>>>>> _mesa_format_num_components(lower_format)); >>>>>> + >>>>>> + } else { >>>>>> + /* Untyped surface writes store 32 bits of the surface >>>>>> per >>>>>> + * component, without any sort of packing or type >>>>>> conversion, >>>>>> + */ >>>>>> + const unsigned size = _mesa_get_format_bytes(format) / 4; >>>>>> + >>>>>> + /* they don't properly handle out of bounds access, so >>>>>> we have >>>>>> + * to check manually if the coordinates are valid and >>>>>> predicate >>>>>> + * the surface write on the result, >>>>>> + */ >>>>>> + const brw_predicate pred = >>>>>> + emit_bounds_check(bld, image, addr, dims); >>>>>> + >>>>>> + /* and, phew, they don't know about surface coordinates, >>>>>> we >>>>>> + * need to convert them to a raw memory offset. >>>>>> + */ >>>>>> + const fs_reg laddr = emit_address_calculation(bld, >>>>>> image, addr, dims); >>>>>> + >>>>>> + emit_untyped_write(bld, image, laddr, tmp, 1, size, >>>>>> pred); >>>>>> + } >>>>>> + } >>>>>> + } >>>>>> + >>>>>> + /** >>>>>> + * Perform an atomic read-modify-write operation in a surface of >>>>>> the >>>>>> + * given dimensionality at the given coordinates. Main building >>>>>> block >>>>>> + * of the imageAtomic GLSL built-ins. >>>>>> + */ >>>>>> + fs_reg >>>>>> + emit_image_atomic(const fs_builder &bld, >>>>>> + const fs_reg &image, const fs_reg &addr, >>>>>> + const fs_reg &src0, const fs_reg &src1, >>>>>> + unsigned dims, unsigned rsize, unsigned op) >>>>>> + { >>>>>> + using namespace image_validity; >>>>>> + using namespace image_coordinates; >>>>>> + using namespace surface_access; >>>>>> + /* Avoid performing an atomic operation on an unbound surface. >>>>>> */ >>>>>> + const brw_predicate pred = emit_surface_check(bld, image); >>>>>> + >>>>>> + /* Thankfully we can do without untyped atomics here. */ >>>>>> + const fs_reg tmp = emit_typed_atomic(bld, image, addr, src0, >>>>>> src1, >>>>>> + dims, rsize, op, pred); >>>>>> + >>>>>> + /* An unbound surface access should give zero as result. */ >>>>>> + if (rsize) >>>>>> + set_predicate(pred, bld.SEL(tmp, tmp, fs_reg(0))); >>>>>> + >>>>>> + return tmp; >>>>>> + } >>>>>> + } >>>>>> +} >>>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.h >>>>>> b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.h >>>>>> index 5424be6..cad8c27 100644 >>>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.h >>>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.h >>>>>> @@ -213,5 +213,22 @@ namespace brw { >>>>>> _mesa_get_format_datatype(format) == GL_INT); >>>>>> } >>>>>> } >>>>>> + >>>>>> + namespace image_access { >>>>>> + fs_reg >>>>>> + emit_image_load(const fs_builder &bld, >>>>>> + const fs_reg &image, const fs_reg &addr, >>>>>> + mesa_format format, unsigned dims); >>>>>> + >>>>>> + void >>>>>> + emit_image_store(const fs_builder &bld, const fs_reg &image, >>>>>> + const fs_reg &addr, const fs_reg &src, >>>>>> + mesa_format format, unsigned dims); >>>>>> + fs_reg >>>>>> + emit_image_atomic(const fs_builder &bld, >>>>>> + const fs_reg &image, const fs_reg &addr, >>>>>> + const fs_reg &src0, const fs_reg &src1, >>>>>> + unsigned dims, unsigned rsize, unsigned op); >>>>>> + } >>>>>> } >>>>>> #endif >>>>>> -- >>>>>> 2.4.3 >>>>>> >>>>>> _______________________________________________ >>>>>> mesa-dev mailing list >>>>>> mesa-dev@lists.freedesktop.org >>>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev