On Thu, Aug 15, 2024 at 01:41:20PM -0600, ben via cctalk wrote:
[...]
> I don't know about the VAX,but my gripe is the x86 and the 68000 don't
> automaticaly promote smaller data types to larger ones. What little
> programming I have done was in C never cared about that detail. Now I can
> see way it is hard to generate good code in C when all the CPU's are brain
> dead in that aspect.
This makes them a perfect match for a brain-dead language. But what does it even *mean* to "automaticaly promote smaller data types to larger ones"? That's a rhetorical question, because your answer will probably disagree with what the C standard actually says :)

Widening an integer is normally done by replicating the sign bit into the new upper bits. For unsigned values the sign bit is implicitly zero, although we usually say "sign extend" or "zero extend" to be clearer about whether we're dealing with signed or unsigned values. C will typically do one or the other of these, but not always the one you expected (first example at the end of this message).

For sign-extension, m68k has the EXT instruction, and x86 has CBW/CWDE. For zero-extension, pre-clear the register before loading a smaller value into a subregister. From the 386 onwards, there are MOVZX/MOVSX which do load-and-extend in a single operation.

If the result of a calculation is then truncated when written back to memory, the upper bits of the register may never have had an effect on the result and did not need to be set to a known value, so this palaver is quite unnecessary. The value was only extended in the first place because C's promotion rules required it to be, and the compiler backend has had to prove otherwise to eliminate it again (second example at the end).

As it happens, it's not unnecessary on modern out-of-order CPUs, so there's a lot more use of MOVZX etc. in code compiled for x86-64. Loading into a subregister without clearing the full register first introduces a false dependency on the old value of the upper bits, resulting in a pipeline stall and a performance hit. However, this is "just" for performance rather than correctness.

Said performance hit is likely the main reason why x86-64 automatically zero-extends when loading a 32-bit value into a register, so MOVZX is no longer required for that operation. So in fact x86 *does* "automaticaly promote smaller data types to larger ones". Not doing so would have caused an unacceptable performance hit when running 32-bit code (which was basically all of it back in 2003 when the first Opteron was released) or 64-bit code making heavy use of 32-bit data.

Now, what kind of badly-written code and/or braindead programming language would go out of its way to be inefficient and use 32-bit arithmetic instead of the native register width? I'm sure you can "C" where I'm going here. `int` is extremely special to C, which really wants to do everything with 32-bit values: smaller values are widened, larger values are very grudgingly tolerated. C programmers habitually use `int` for array indices rather than `size_t`, particularly in `for` loops. Apparently everything is *still* a VAX. So on 64-bit platforms the index needs to be widened before being added to the pointer (third example at the end), and there's so much terrible C code out there -- as if there were any other kind -- that the CPUs need hardware mitigations to defend against it.

It's not just modern hardware which is a poor fit for C: classic hardware is too. A lot of the architectural assumptions baked into the C model make it hard to generate efficient code for the 6502 or Z80, for example. But please, feel free to tell me how C is just fine and it's the CPUs which are at fault, even the ones heavily optimised to run typical C code.
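To make the extension point concrete, here's a minimal C sketch (the variable names are mine, and the MOVSX/MOVZX notes in the comments assume a typical x86 compiler; the standard only dictates the resulting values):

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int8_t  s = -1;     /* bit pattern 0xFF, interpreted as signed   */
    uint8_t u = 0xFF;   /* same bit pattern, interpreted as unsigned */

    /* Widening to int: the signed byte is sign-extended (MOVSX-style),
     * the unsigned byte is zero-extended (MOVZX-style). */
    int ws = s;         /* 0xFFFFFFFF, i.e. -1  */
    int wu = u;         /* 0x000000FF, i.e. 255 */

    printf("sign-extended: %d, zero-extended: %d\n", ws, wu);
    return 0;
}
```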
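And a sketch of the promote-then-truncate dance described above: the arithmetic is done at `int` width because the promotion rules demand it, and the store back to a narrow type throws the upper bits away again (hypothetical values, chosen to make the widening visible):

```c
#include <stdio.h>

int main(void)
{
    unsigned char a = 200, b = 100;

    /* Both operands are promoted to int before the addition, so the
     * intermediate result is 300 and the comparison happens at int width. */
    if (a + b > 255)
        printf("promoted:  a + b == %d\n", a + b);

    /* Storing the sum back into an unsigned char truncates it to 8 bits:
     * 300 mod 256 == 44, so the widening never affected the stored result. */
    unsigned char c = (unsigned char)(a + b);
    printf("truncated: c == %d\n", c);

    return 0;
}
```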
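Finally, the `int`-as-index habit. A sketch of the two styles (the function names are mine); on a 64-bit target the `int` version has to sign-extend the index -- typically a MOVSXD on x86-64 -- before the pointer arithmetic, unless the optimiser can prove the extension away, while the `size_t` version is already register-width:

```c
#include <stddef.h>

/* `int` index: on a 64-bit machine the 32-bit index has to be widened
 * before it can be added to the 64-bit pointer. */
long sum_int_index(const long *p, int n)
{
    long total = 0;
    for (int i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* `size_t` index: already pointer-width, so no extension is needed. */
long sum_size_t_index(const long *p, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}
```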