Re: [Mesa-dev] [PATCH 3/4] glsl: add ir_cache class and functions for shader serialization

Tapani Pälli Thu, 31 Oct 2013 00:10:05 -0700

On 10/29/2013 03:48 PM, Paul Berry wrote:

On 29 October 2013 00:53, Tapani <tapani.pa...@intel.com<mailto:tapani.pa...@intel.com>> wrote:
    On 10/28/2013 10:09 PM, Paul Berry wrote:
        +static uint32_t _unique_id(ir_variable *var)
        +{
        +   char buffer[256];
        +   _mesa_snprintf(buffer, 256, "%s_%p", var->name, var);
        +   return _mesa_str_checksum(buffer);


    Two problems:

    1. This is a lot of work to obtain a unique identifier.  The
    pointer is already unique; we should just use that.
    I was originally using a pointer but then there's the 32 vs 64bit
    address problem. I wanted to avoid usage of pointers.
    2. There's no guarantee that the checksum of the string will be
    unique. Because of the birthday paradox, checksum collisions are
    probably going to happen all the time.
    I think there is because the pointer address is unique and is used
    as part of the string.
Yes, but the checksumming process destroys uniqueness (it has to,because the output of the checksum operation is a 32-bit int, and theinput is a string. There are more than 2^32 possible strings, so bythe pigeon hole principle at least one checksum must correspond tomore than one possible input string). In the case of_mesa_str_checksum() (which is not a particularly good hash from acryptographic perspective) the collisions are pretty easy to come by.For example, these were the first two collisions I found by writing aquick test program:
   printf("checksum of \"%s\" is 0x%x\n", "foo_0000003c",
          _mesa_str_checksum("foo_0000003c"));
   printf("checksum of \"%s\" is 0x%x\n", "foo_000001a8",
          _mesa_str_checksum("foo_000001a8"));

checksum of "foo_0000003c" is 0x1360
checksum of "foo_000001a8" is 0x1360


OK, I'll put the address itself back in use then.

    Instead of going to a lot of work to generate a unique id for
    each ir_variable, I think we should just serialize the
    ir_variable pointer address.  If we are trying to use the same
    binary format for 32-bit and 64-bit builds (see my clarifying
    question at the top of this email), we'll need to extend the
    pointer to 64 bits before serializing it.

    Alternatively, if you want to keep the unique ID's small, we'll
    have to use some hashtable mechanism to map each variable to its
    own ID.


    OK, I will verify if this is a problem. For me it does not seem
    like a problem as I'm using the address there.

        +}
        +
        +
        +/* data required to serialize ir_variable */
        +struct ir_cache_variable_data


    As with glsl_type_data, I don't understand why we need to have
    this class rather than just adding a serialize() method to the
    ir_variable class.


    This could be a serialize function in ir_cache_serialize.cpp. I
    did not want to touch existing ir classes, I think it helps to
    have the cache code separated and in one place. This is of course
    debatable, every ir instruction could have serialize() and
    unserialize() to deal with this.

My chief worry with separating them is that people will forget toupdate the serialization code when they change the ir class, becausethe thing they need to update will be far away. This is particularlya problem when adding new members to the class (something we do quitefrequently when implementing new GLSL features).

I understand and I've been worried about this from the maintainabilitypoint of view as well. I've thought to add some documentation on their.h file about this.


// Tapani

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] glsl: add ir_cache class and functions for shader serialization

Reply via email to