On 11/05/2013 10:16 PM, Paul Berry wrote:
On 1 November 2013 02:16, Tapani Pälli <tapani.pa...@intel.com
<mailto:tapani.pa...@intel.com>> wrote:
ir_deserializer can create a gl_shader structure out of binary
data from ir_serializer, will be used by OES_get_program_binary
implementation.
Signed-off-by: Tapani Pälli <tapani.pa...@intel.com
<mailto:tapani.pa...@intel.com>>
---
src/glsl/Makefile.sources | 1 +
src/glsl/ir_cache_deserializer.cpp | 1341
++++++++++++++++++++++++++++++++++++
src/glsl/ir_cache_deserializer.h | 301 ++++++++
3 files changed, 1643 insertions(+)
create mode 100644 src/glsl/ir_cache_deserializer.cpp
create mode 100644 src/glsl/ir_cache_deserializer.h
General comment: my attitude about error handling is different when
reviewing this patch than it was when reviewing the last patch,
because in the case of deserialization, we're dealing with data passed
in by the client, so I think it's far more important to be robust in
the case of malformed input data. We probably don't need to worry
about malicious clients, since the client is in process so if they
wanted to be malicious they could just walk all over memory. But it
will make life a lot easier for library clients (and for us when we're
debugging) if we can avoid having Mesa die in a fire when it receives
a corrupted shader binary.
Yes, I agree, it makes sense to have some more error checking in
deserialization, it has to be basic and quick though.
Having said that, I still don't think integer return values are the
way to go, for the following reasons:
- it's confusing for 0 to mean success.
- we're going to forget to check return values.
- it complicates the signatures of a lot of functions that should be
able to just return the thing that they read.
What I'd recommend doing instead is adding a "failed" boolean to
memory_map. That way rather than having to check every single return
value for failure, we just have to check the "failed" boolean at
appropriate critical times (such as inside loops and at the end of
functions like ir_read_variable()), and return early in the event of
failure. An additional advantage of this is that it makes it easy to
add bounds checking to memory_map--the read() and ffwd() functions can
simply check if the position goes out of bounds, and if so, set
"failed" flag to 0 and read zeros.
ok, I'll take a look how this would work out. The current checks have
actually helped as they catch not just invalid/corrupt data from client
but possible errors in serialization too, even though they are just
simple value checks but there's quite a lot of them and they help in
stopping the parsing immediately. If allowed to go further after error,
it'll be harder to track when did it start to fail so I think there will
be still a lot of checks.
+ir_deserializer::read_header(struct gl_shader *shader, memory_map
&map,
+ const char *mesa_sha)
Rather than pass the memory_map to all of the ir_deserializer
functions, I'd prefer to see it just be a member of the
ir_deserializer class. It's a closer parallel to what you do with
memory_writer in ir_serializer, and it makes the code easier to read
by making all of the function signatures and invocations smaller.
Yep, I think it can be changed, the actual memory_map will come from
caller of this class. I think read_glsl_type() is the only method which
will require support to pass a memory_map, this is used when reading
uniform_storage when deserializing a whole pgoram.
As with the memory_writer::write*() functions, I think the
memory_map::read() functions should include the type in their name,
and should use return by value rather than a pointer argument. E.g.:
int32_t read_int32();
bool read_bool();
and so on.
(Note: I think it's still ok to have a read() function that operates
on a pointer for reading larger structs).
Yes, will change these.
Although technically the C standard allows locals whose name begin
with an underscore (provided that what follows is a lower case
letter), it seems strange--names beginning with underscores typically
refer to globals that belong to libraries.
Also, it looks like all the callers of read_glsl_type pass this->state
for this parameter, so I don't understand why the parameter is needed
at all.
Yes, it is possible that in some of calls it might not be explicitly
needed parameter, read_glsl_type must support also situation where we
don't have state, this is used when reading uniform_storage.
+{
+ uint32_t type_size;
+
+ char *name = map.read_string();
+ map.read(&type_size);
+
+ const glsl_type *type_exists = NULL;
"type_exists" sounds like the name of a boolean. Can we rename this
to "existing_type"?
ok
+
+ if (_state && _state->symbols)
+ type_exists = _state->symbols->get_type(name);
I don't see any circumstances where _state or _state->symbols would be
NULL. Am I missing something?
state can be null, state->symbols is a paranoid check and can be removed
+
+ /* if type exists, move read pointer forward and return type */
+ if (type_exists) {
+ map.ffwd(type_size);
+ return type_exists;
+ }
A single shader is allowed to have multiple distinct interface types
with the same name (e.g. an input and an output with the block name,
but different interface members), so this won't work. It will cause
the distinct interface types to alias to each other.
This will need some more thought then, I'll check if there's some tests
that would utilize this.
I would have thought that the logic above for looking up existing
types would take care of samplers, since they are built-in types (and
hence populated in the symbol table before deserialization). Am I
missing something here?
Yes, samplers could be removed. I wanted this function to be able to
construct whatever type but you are right that samplers are all built-ins.
If we do need to keep this switch statement, I recommend adding some
error handling in the default case to make sure the "failed" flag is
set and then return glsl_type::error_type. That will reduce the risk
of Mesa crashing in the event of a corrupted shader binary.
I will remove samplers from here
+ glsl_struct_field *fields = ralloc_array(mem_ctx,
+ glsl_struct_field, length);
We should check for a memory allocation failure here and do
appropriate error handling; that way a corrupted binary that has an
outrageously large length stored in it won't lead to a crash.
ok, will fix
+ir_deserializer::read_ir_variable(struct exec_list *list,
memory_map &map)
General comment on code organization: I'd prefer to see
deserialization functions like these written as constructors (i.e.
this function would be an ir_variable constructor, which takes an
ir_deserializer as a single parameter). The advantages of this
approach are (a) the deserialization code is closer to the other
ir_variable-related code, so there's less danger of us forgetting to
update it when we change the structure of ir_variable, and (b) if we
change ir_variable and forget to update the deserialization code,
static analysis tools like coverity will alert us to the bug.
Yep, I'll refactor and go this way now.
uint8_t read_only;
map.read(&read_only);
var->read_only = read_only;
we simply would do:
var->read_only = map.read_uint8();
Of course, my idea from the last patch of serializing a "bits" data
structure would make this example moot.
I will do the separate 'bits' structure, it'll make code much more
easier to maintain.
+
+ list->push_tail(var);
+
+ return 0;
+}
Rather than having this function read the variable and add it to a
list, I'd prefer for it to read the variable and return a pointer to
it. Then the caller can decide whether to add it to a list or do
something else to it. The key advantage IMHO is more flexibility for
the future. For example, we have talked about the idea of changing
the storage of function parameter ir_variables so that they use an
array of ir_variable pointers rather than an exec_list. As the
deserialization code is written right now we have extra obstacles to
making that future change.
A similar comment applies to most of the other read...() functions below.
Sure, can be done. Currently it's been all the time about building the
lists so I wanted to have it in functions instead of caller.
+ uint32_t sig_amount;
As in the previous patch, I'd recommend renaming "sig_amount" to
"num_signatures" and "par_count" to "param_count".
ok
+ if (ir_type != ir_type_function_signature) {
+ CACHE_DEBUG("cache format error with function %s\n", name);
+ return -1;
+ }
It seems silly to store ir_type in the instruction stream and then
generate an error if it doesn't match. If we want to be robust in the
face of a corrupted shader binary, we should be checking for things
that don't make sense (like is_builtin being a number other than 0 or
1) rather than storing redundant information like ir_type in the
instruction stream and then checking that it equals the only value
that it could possibly be.
agreed, this check can be removed
+
+ /* insert function parameter variables to prototypes list
... */
+ foreach_list_const(node, &sig->parameters) {
+ ir_variable *var = ((ir_instruction *) node)->as_variable();
+ if (var)
+ prototypes->push_tail(var->clone(mem_ctx, NULL));
+ }
Can you explain what's going on with the variables in the prototypes
list? I don't understand what purpose this serves.
There can be further references to these variables so this list will be
then used to search for them.
+read_errors:
+ CACHE_DEBUG("%s: read errors with [%s]\n", __func__, name);
+ if (sig)
+ ralloc_free(sig);
Part of the point of ralloc is that we shouldn't have to do this kind
of clean up. Instead, in the event of failure, we should just free
the toplevel memory context, and all the memory we allocated during
the failed deserialization attempt will be automatically reclaimed.
If we let ralloc take care of reclaiming the memory at the end of
deserialization, then we can replace a lot of these goto labels for
error conditions with early returns, which I think will make the code
easier to read.
true, I will go through these cases
+ return NULL;
If we adopt the idea of the "failed" boolean that I talked about
above, then that gives us the opportunity to return a dummy
ir_dereference_array rather than NULL. The advantage of doing that is
that there's less risk of the caller segfaulting before it figures out
that we are in an error condition.
Of course, if we adopt my suggestion of turning these reader functions
into IR constructors, that will happen automatically.
Yes, I'm going for the constructor approach.
+
+read_return:
+ if (list)
+ list->push_tail(con);
This is a great example of why it should be the caller's
responsibility to add the deserialized IR to the appropriate list.
That way callers that need to put the IR in a list can do so, and
callers that don't need to put it in a list won't.
will be changed
+
+static uint8_t
+read_bool(memory_map &map)
+{
+ uint8_t value = 0;
+ map.read(&value);
+ return value;
+}
As in the previous patch, this should just be a function in memory_map.
will fix
Thanks for the review;
// Tapani
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev