------- Comment #10 from rogerio at rilhas dot com  2010-08-11 01:57 -------
I'm replying now not in the context of the bug (since, as I mentioned, I must
move on), but just as a conversation between two people. So please don't get
me wrong for insisting.
The cdecl calling convention on x86-32 machines says that for any function call
the arguments are placed on the stack as a series of 4-byte pushed values. I've
checked that GCC respects this convention when calling a function without
optimization. So, for a typical function call, GCC pushes the parameters
correctly according to the cdecl calling convention:

func(a, b, c, d)

... the parameters will be pushed right-to-left like:

push d (4-byte)
push c (4-byte)
push b (4-byte)
push a (4-byte)
call func

Obviously there are exceptions for doubles and structs, but this is not the
case here as char* and int are pushed exactly the same way, as 4-byte values.
This is well defined, right? The cdecl calling convention defines this clearly,
correct? Or does GCC somehow use another calling convention?
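
As a rough sketch of the layout described above, assuming every argument is
exactly 4 bytes wide and measuring from ESP at function entry (just after
"call" has pushed the 4-byte return address), the expected offsets can be
computed as below. The names cdecl_arg_offset and check_four_arg_layout are
hypothetical, made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Offset in bytes from ESP at function entry to the index-th argument
   (0 = leftmost), assuming every argument occupies exactly 4 bytes.
   The leading 4 skips the return address pushed by the call instruction. */
size_t cdecl_arg_offset(size_t index)
{
    return 4 + 4 * index;
}

/* Self-check for func(a, b, c, d): a, b, c, d at +4, +8, +12, +16. */
void check_four_arg_layout(void)
{
    assert(cdecl_arg_offset(0) == 4);   /* a */
    assert(cdecl_arg_offset(1) == 8);   /* b */
    assert(cdecl_arg_offset(2) == 12);  /* c */
    assert(cdecl_arg_offset(3) == 16);  /* d */
}
```

This only restates the arithmetic of the convention for 4-byte arguments; it
does not inspect what any particular compiler actually emits.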

Another thing well-defined in C is that when I request the address of a
variable I get the address of it. So, inside func, when I request &a I get the
stack address where "a" was pushed to. This is also well defined, right?

Another thing well defined in C is what happens when navigating an array out of
its bounds. Unlike what you say, the behaviour is deterministic on absolutely
every machine: if my PTR4 pointer to a 4-byte value contains the value X, then
PTR4[1] accesses memory address X+4. In the same way, if my PTR2 points to a
2-byte value at address Y, then PTR2[1] will access address Y+2. There is no
hack here either, the behaviour is very well defined, and C doesn't care about
or check the bounds of the array.
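
A minimal sketch of the stride claim (X+4 for PTR4, Y+2 for PTR2). It assumes
8-bit bytes and uses two-element arrays, so every address it computes lies
inside a declared object; it therefore shows only the element stride, not
what happens past the end of an array. The names stride4, stride2, words and
halves are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static_assert(sizeof(uint32_t) == 4, "sketch assumes 8-bit bytes");
static_assert(sizeof(uint16_t) == 2, "sketch assumes 8-bit bytes");

/* Byte distance between ptr4[0] and ptr4[1] for a pointer to 4-byte values. */
ptrdiff_t stride4(void)
{
    uint32_t words[2] = {0, 0};
    uint32_t *ptr4 = words;
    return (char *)&ptr4[1] - (char *)&ptr4[0];
}

/* Byte distance between ptr2[0] and ptr2[1] for a pointer to 2-byte values. */
ptrdiff_t stride2(void)
{
    uint16_t halves[2] = {0, 0};
    uint16_t *ptr2 = halves;
    return (char *)&ptr2[1] - (char *)&ptr2[0];
}
```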

So, if the calling convention states (in the example above) that "b" is placed
4 bytes after "a", and that "c" is placed 4 bytes after "b", and so on, it
follows, without any doubt, that if I get a PTR4 to point to "a", and the
address of "a" on the stack is X, then PTR4[0] accesses "a", PTR4[1] accesses
"b", PTR4[2] accesses "c", and so on. The size of the array doesn't matter: C
will not check nor care about it, so I can navigate the stack with it. That is
what I do with format_address: since it is a char** and a char* has size 4,
format_address is a PTR4. So, format_address[0] is "a" (or it should be if
&format returned the true address of "a"), and *without any doubt or hack*
format_address[1] is "b", and *with exactly the same confidence*,
format_address[2] is "c", and so on. This is one of the most well established
bases in C, for several decades now, so there is really no arguing about it.

The problem is that GCC, when optimizing, places "a" somewhere on the stack
which is not contiguous to "b". I think (although I'm not sure) that GCC does
something like this:

push d
push c
push b
push a
call format_direct

format_direct:
push some varied stuff
push [a copy of the original "a" again somewhere]
push [address of the above copy of "a"]
call format_indirect

format_indirect: (example)
mov eax,[address of the copy of "a"+0] // not the original "a", but the copy
mov ebx,[address of the copy of "a"+4] // contains some varied stuff, not "b"

I'm not sure if GCC does exactly this, but I'm sure that, when optimizing, GCC
does not place "a" adjacent to "b", and it should. Since I cannot get to "b"
I'm not sure if the same happens to "c".

This clearly violates the calling convention. If the calling convention were
respected then "a", "b", "c", and "d" would all be adjacent, so I could
navigate the 1-entry (4-byte) array using format_address[0], format_address[1],
format_address[2], and format_address[3]. I don't see how you can refute one of
the most well established properties of C, as the language allows me to
populate any memory buffer with whichever stuff and then navigate with a 4-byte
pointer (as long as properly aligned, which is the case here). I'm just doing
that with the stack, which should be (by the calling convention) a 4-byte
aligned memory buffer with 4 adjacent 4-byte values "abcd".

So, my point is: if - req1) GCC placed the parameters on the stack adjacent to
one another (as it should as a result of the selected calling convention) *AND*
- req2) it gave me the address of the original format parameter on the stack
(as it should by the definition of getting the address of a function parameter),
then the code would work correctly, since - req3) it is well established that C
does not perform array boundary checking and, thus, PTR4[1] accesses the memory
location exactly 4 bytes after memory location PTR4[0].

Since I believe none of these 3 requirements is refutable, and since they are
all well established, then - bug1) GCC is not putting the parameters adjacent
on the stack as it should *OR* - bug2) GCC is not giving me the correct address
when I ask for &format. The - bug3) of not doing pointer arithmetic correctly
because the array has size 1 (your argument) is simply not true: I checked it
while debugging, and I've seen it to be correct on the most varied machines
(from PIC microcontrollers to PCs and mobile phones); otherwise it would not be
C. The only thing that is allowed to vary (for performance reasons) is the
parameter width (PIC16 microcontrollers have 1-byte parameters, PIC24 has
2-byte parameters, and x86-32 has 4-byte parameters).

Another interpretation of your comment of "array of size 1" would be to think
that a char* would take up more or less than 4 bytes, resulting in something
like:

option a:
push32 d
push32 c
push32 b
push16 a (less than 4 bytes, so accessing format_address[1] would be wrong)

option b:
push32 d
push32 c
push32 b
push64 a (more than 4 bytes, so accessing format_address[1] would be wrong)

For performance reasons it would be fairly obvious that none of the examples
would apply, but I checked them anyway and that is not what happens: the format
pointer is put on the stack as 4 bytes.

So, to use your reasoning: it is well established that if I have a buffer of 16
bytes, and if I declare a 1-entry (4-byte) int* PTR4 to point to the first byte
of that buffer, it is absolutely well defined that PTR4[0] will access bytes 0
to 3 of the buffer, PTR4[1] will access bytes 4 to 7, PTR4[2] will access bytes
8 to 11, and PTR4[3] will access bytes 12 to 15. It has been established for
several decades now that this is not undefined behaviour even if PTR4 is
declared as a 1-entry (4-byte) pointer, and this type of code runs predictably
on every platform in a very well-defined way. It is, in fact, the principle for
serialization, which I have done numerous times for the last 20 years.
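
A sketch of the 16-byte buffer case, under two assumptions that are not in
the text above: the buffer is declared as four 32-bit words (so alignment and
object bounds are not in question), and bytes are 8 bits wide. The helper
name word_matches_bytes is hypothetical. It compares what ptr4[i] reads
against bytes 4*i to 4*i+3 of the same data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(uint32_t) == 4, "sketch assumes 8-bit bytes");

/* Returns 1 if the word read through ptr4[i] equals bytes 4*i .. 4*i+3
   of the 16-byte buffer, 0 otherwise. Valid for i in 0..3. */
int word_matches_bytes(size_t i)
{
    unsigned char raw[16];
    uint32_t buffer[4];              /* the 16-byte buffer, word-aligned */
    uint32_t expected;
    uint32_t *ptr4 = buffer;         /* points to the first byte of the buffer */

    for (int k = 0; k < 16; k++)
        raw[k] = (unsigned char)(k + 1);   /* distinct byte values 1..16 */
    memcpy(buffer, raw, sizeof raw);       /* populate the buffer */

    /* Rebuild the same 4 bytes independently, in native byte order. */
    memcpy(&expected, raw + 4 * i, sizeof expected);
    return ptr4[i] == expected;
}
```

The comparison goes through memcpy on both sides, so the result does not
depend on whether the machine is little-endian or big-endian.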

I think it is clear why I believe your arguments are missing the point: it is
simply not true that this behaviour is undefined.


-- 

rogerio at rilhas dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45249
