Working through the new pass... Overall it looks pretty good. There's a certain level of trust I'll extend WRT getting the low level details right -- a thorough testsuite obviously helps there.


+const pass_data pass_data_sprintf_length = {
+  GIMPLE_PASS,        // pass type
+  "sprintf_length",   // pass name
+  OPTGROUP_NONE,      // optinfo_flags
+  TV_NONE,            // tv_id
+  PROP_cfg,           // properties_required
+  0,                 // properties_provided
+  0,                 // properties_destroyed
+  0,                 // properties_start
+  0,                 // properties_finish
So the worry here is that you need PROP_ssa in the late pass. I'm not sure if we have the infrastructure to allow you to distinguish the two. I guess we could literally make it two passes with two distinct pass_data structures.


In handle_gimple_call, you don't handle anti ranges -- I haven't looked closely at canonicalization rules in VRP, so I don't know if you're likely to see a range like ![MININT, -1] which would be a synonym for [0..MAXINT]. It might be worth doing some instrumentation to see if you're getting useful anti-ranges with any kind of consistency. ISTM the most interesting ones are going to allow you to drop or insist on a "-" sign.



+      /* As a special case, bounded function allow the explicitly
+        specified destination size argument to be zero as a request
+        to determine the number of bytes on output without actually
+        writing any output.  Disable length checking for those.  */
This doesn't parse well.


+  /* Try to use __builtin_object_size although it rarely returns
+     a useful result even for straighforward cases.  */
Hopefully you've stashed away some tests for this so that we can work on them independently of the sprintf checking itself? ISTM that the recursive handling of POINTER_PLUS_EXPR really ought to move into the generic builtin_object_size handling itself as an independent change.



+  /* Bail early if format length checking is disabled, either
+     via a command line option, or as a result of the size of
+     the destination object not being available, or due to
+     format directives or arguments encountered during processing
+     that prevent length tracking with any reliability.  */
+
+  if (HOST_WIDE_INT_MAX <= info.objsize)
+      return;
I think the return is indented too deep.

+      if (*pf == 0)
+       {
+         /* Incomplete directive.  */
+         return;
+       }
For this and the other early returns from compute_format_length, do we know if -Wformat will catch these errors? Might that be a low priority follow-up project if it doesn't?


+static void
+add_bytes (const pass_sprintf_length::call_info &info,
+          const char                           *beg,
+          const char                           *end,
+          format_result                        *res)
In general, don't try to line up parameters, args or variable declarations like you've done here. Similarly in other places.

+  /* if (warn_format_length */
+  /*     && (overflowed || navail < nbytes */
+  /*     || (1 < warn_format_length && )) */
Presumably old implementation and/or debugging code...  Please remove it.

Please check indention of the curleys & code block in the loop over the phi arguments in get_string_length.


In format_floating, we end up using actual host floating point arithmetic. That's generally been frowned upon. We're not doing anything in terms of code generation here, so ultimately that might be what allows us to still be safe from a cross compilation standpoint.

It's also not clear if that routine handles the other target floating point formats. For things like VAX FP, I'm comfortable issuing a warning that this stuff isn't supported on that target. We have other targets where a double might be the same size as a float. Though I guess in that case the worst we get is false positives, so that may not be a huge issue either. I'm not sure how to check for this stuff without bringing in aspects of the target though, which we'd like to avoid.

In format_integer you have target specific ifdefs (solaris and aix). First we want to avoid conditionally compiled code. Second, from a modularity something like TARGET_AIX and TARGET_SOLARIS really aren't appropriate in here. I suspect what you're going to want to do is add something to the targetm structure that you can query and/or call here is the best we can do.

Again, overall it looks good. My biggest concern is format_integer and format_float and the bleeding of target dependencies into this code. To some degree it may be unavoidable and we can mitigate the damage with stuff in targetm -- I'd like to hear your thoughts.

Jeff

Reply via email to