On 11/11/13 07:32, Ulrich Weigand wrote:
Hello,
when implementing the new ABI for powerpc64le-linux, I ran into an assertion
failure in store_unaligned_arguments_into_pseudos:
gcc_assert (args[i].partial % UNITS_PER_WORD == 0);
This can happen in the new ABI since we pass "homogeneous structures"
consisting of soleley floating point elements of the same type in
floating-point registers until they run out, and the rest of the
structure in memory. If the structure member type is a 4-byte float,
and we only have an odd number of floating point registers available,
then args[i].partial can legitimately end up not being a multiple
of UNITS_PER_WORD (i.e. 8 on the platform).
Now, there are a number of similar checks that args[i].partial is
aligned elsewhere in calls.c and functions.c. But for all of those
the logic is: if args[i].reg is a REG, then args[i].partial must be
a multiple of the word size; but if args[i].regs is a PARALLEL, then
args[i].partial can be any arbitrary value. In the powerpc64le-linux
use case, the back-end always generates PARALLEL in this situation,
so it seemed the middle-end ought to support this -- and it does,
except for this one case.
However, looking more closely, it seems store_unaligned_arguments_into_pseudos
is not really useful for PARALLEL arguments in the first place. What this
routine does is load arguments into args[i].aligned_regs. But if we have
an argument where args[i].reg is a PARALLEL, args[i].aligned_regs will in
fact never be used later on at all! Instead, PARALLEL will always be
handled directly via emit_group_move (in load_register_parameters), so
the code generated by store_unaligned_arguments_into_pseudos for such cases
is simply dead anyway.
Thus, I'd suggest to simply have store_unaligned_arguments_into_pseudos
skip PARALLEL arguments.
Tested on powerpc64-linux and powerpc64le-linux.
OK for mainline?
Bye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand <ulrich.weig...@de.ibm.com>
* calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL
arguments.
Does this work on the PA, particularly the 32bit ABI?
/* Structures 5 to 8 bytes in size are passed in the general
registers in the same manner as other non floating-point
objects. The data is right-justified and zero-extended
to 64 bits. This is opposite to the normal justification
used on big endian targets and requires special treatment.
We now define BLOCK_REG_PADDING to pad these objects.
Aggregates, complex and vector types are passed in the same
manner as structures. */
if (mode == BLKmode
|| (type && (AGGREGATE_TYPE_P (type)
|| TREE_CODE (type) == COMPLEX_TYPE
|| TREE_CODE (type) == VECTOR_TYPE)))
{
rtx loc = gen_rtx_EXPR_LIST (VOIDmode,
gen_rtx_REG (DImode,
gpr_reg_base),
const0_rtx);
return gen_rtx_PARALLEL (BLKmode, gen_rtvec (1, loc));
}
It's been eons since I looked at this stuff, so if it's not relevant,
then just say so and I'll put those concerns aside.
jeff