On 6/25/19 5:03 PM, Martin Sebor wrote:

> 
> The caller ensures that handle_char_store is only called for stores
> to arrays (MEM_REF) or single elements as wide as char.
Where?  I don't see it, even after fixing the formatting in
strlen_check_and_optimize_stmt :-)

>   gimple *stmt = gsi_stmt (*gsi);
> 
>   if (is_gimple_call (stmt))

I think we can agree that we don't have a call, so this is false.

>  else if (is_gimple_assign (stmt) && !gimple_clobber_p (stmt))
>     {
>       tree lhs = gimple_assign_lhs (stmt);
This should be true IIUC, so we'll go into its THEN block.


>       if (TREE_CODE (lhs) == SSA_NAME && POINTER_TYPE_P (TREE_TYPE (lhs)))
Should be false.

>       else if (TREE_CODE (lhs) == SSA_NAME && INTEGRAL_TYPE_P (TREE_TYPE 
> (lhs)))

Should also be false.

>       else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs))
Should be true since LHS is a MEM_REF.


>        {
>           tree type = TREE_TYPE (lhs);
>           if (TREE_CODE (type) == ARRAY_TYPE)
>             type = TREE_TYPE (type);
>           if (TREE_CODE (type) == INTEGER_TYPE
>               && TYPE_MODE (type) == TYPE_MODE (char_type_node)
>               && TYPE_PRECISION (type) == TYPE_PRECISION (char_type_node))
>             {
>               if (! handle_char_store (gsi))
>                 return false;
>             }
>         }
If TREE_TYPE (type) is an ARRAY_TYPE, we strip the ARRAY_TYPE.  We then
verify that TYPE is a single character type.  _But_ we stripped away the
ARRAY_TYPE.  So ISTM that we allow either an array of chars or a single
char on the LHS.

So how does this avoid multiple character stores?!?  We could have had
an ARRAY_REF on the LHS and we never check the number of elements in the
array.  There's no check on the RHS either.  SO I don't see how we
guarantee that we're dealing with a single character store.

What am I missing here?



> 
> What you describe sounds like
> 
>   char a[N];
>   *(int*)a = 0x31323300;
> 
> which is represented as
> 
>   MEM[(int *)&b] = 825373440;This would be closer (I realize it's not C):

  char a[N];
  a[0..3] = 0x313233300;


> 
> The LHS type of that is int so the function doesn't get called.
I'm concerned about the case where the LHS is an array.

>> And if the NUL byte in the original was at byte offset 2, then didn't we
>> just change the length by overwriting where the NUL is?
> 
> No, because cmp is the result of compare_nonzero_chars and cmp > 0
> means:
> 
>   1 if SI is known to start with more than OFF nonzero characters
> 
> i.e., the character is being stored before the terminating nul.
> This is the basis of the original optimization:
> 
>   /* If si->nonzero_chars > OFFSET, we aren't overwriting '\0',
>      and if we aren't storing '\0', we know that the length of the
>      string and any other zero terminated string in memory remains
>      the same.
But all this is predicated on the assumption that we're dealing with a
single character memory store.  I don't see what enforces that precondition.

Reply via email to