In https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01530.html, Seghar had some questions about that patch.
This patch addresses some of those concerns. Instead of limiting the vector element number in rs6000_split_vec_extract_var so that the memory access does not go out of bounds, I decided to move the logic to rs6000_adjust_vec_address. Rs6000_split_vec_extract_var is the only caller of rs6000_adjust_vec_address that passes in a variable element number. The function rs6000_adjust_vec_address has 3 parts: 1) Calculation of the byte offset within the vector; 2) Creation of the new vector address; 3) Validating that the new address is valid for the register being loaded. In this patch, I moved the code that calculates the byte offset to a separate function, and moved in the AND that was originally done in rs6000_split_vec_extract_var. I have built and bootstrapped a compiler with this patch installed on a little endian power8 system and there were no regressions in the test suite. In addition, I built -mcpu=future versions of Spec 2006 and Spec 2017, and there were no additional failures. Can I check this patch into the trunk? 2020-01-09 Michael Meissner <meiss...@linux.ibm.com> * config/rs6000/rs6000.c (get_vector_offset): New helper function to calculate the offset in memory from the start of a vector of a particular element. Add code to keep the element number in bounces if the element number is variable. (rs6000_adjust_vec_address): Move calculation of offset of the vector element to get_vector_offset. (rs6000_split_vec_extract_var): Do not do the initial AND of element here, move the code to get_vector_offset. Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 280071) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -6753,6 +6753,43 @@ hard_reg_and_mode_to_addr_mask (rtx reg, return addr_mask; } +/* Return the offset within a memory object (MEM) of a vector type to a given + element within the vector (ELEMENT) with an element size (SCALAR_SIZE). If + the element is constant, we return a constant integer. Otherwise, we use a + base register temporary to calculate the offset after making it to fit + within the vector and scaling it. */ + +static rtx +get_vector_offset (rtx mem, rtx element, rtx base_tmp, unsigned scalar_size) +{ + if (CONST_INT_P (element)) + return GEN_INT (INTVAL (element) * scalar_size); + + /* All insns should use the 'Q' constraint (address is a single register) if + the element number is not a constant. */ + rtx addr = XEXP (mem, 0); + gcc_assert (REG_P (addr) || SUBREG_P (addr)); + + /* Mask the element to make sure the element number is between 0 and the + maximum number of elements - 1 so that we don't generate an address + outside the vector. */ + rtx num_ele_m1 = GEN_INT (GET_MODE_NUNITS (GET_MODE (mem)) - 1); + rtx and_op = gen_rtx_AND (Pmode, element, num_ele_m1); + emit_insn (gen_rtx_SET (base_tmp, and_op)); + + /* Shift the element to get the byte offset from the element number. */ + int shift = exact_log2 (scalar_size); + gcc_assert (shift >= 0); + + if (shift > 0) + { + rtx shift_op = gen_rtx_ASHIFT (Pmode, base_tmp, GEN_INT (shift)); + emit_insn (gen_rtx_SET (base_tmp, shift_op)); + } + + return base_tmp; +} + /* Adjust a memory address (MEM) of a vector type to point to a scalar field within the vector (ELEMENT) with a mode (SCALAR_MODE). Use a base register temporary (BASE_TMP) to fixup the address. Return the new memory address @@ -6767,7 +6804,6 @@ rs6000_adjust_vec_address (rtx scalar_re { unsigned scalar_size = GET_MODE_SIZE (scalar_mode); rtx addr = XEXP (mem, 0); - rtx element_offset; rtx new_addr; bool valid_addr_p; @@ -6779,30 +6815,7 @@ rs6000_adjust_vec_address (rtx scalar_re /* Calculate what we need to add to the address to get the element address. */ - if (CONST_INT_P (element)) - element_offset = GEN_INT (INTVAL (element) * scalar_size); - else - { - /* All insns should use the 'Q' constraint (address is a single register) - if the element number is not a constant. */ - gcc_assert (REG_P (addr) || SUBREG_P (addr)); - - int byte_shift = exact_log2 (scalar_size); - gcc_assert (byte_shift >= 0); - - if (byte_shift == 0) - element_offset = element; - - else - { - if (TARGET_POWERPC64) - emit_insn (gen_ashldi3 (base_tmp, element, GEN_INT (byte_shift))); - else - emit_insn (gen_ashlsi3 (base_tmp, element, GEN_INT (byte_shift))); - - element_offset = base_tmp; - } - } + rtx element_offset = get_vector_offset (mem, element, base_tmp, scalar_size); /* Create the new address pointing to the element within the vector. If we are adding 0, we don't have to change the address. */ @@ -6938,13 +6951,9 @@ rs6000_split_vec_extract_var (rtx dest, systems. */ if (MEM_P (src)) { - int num_elements = GET_MODE_NUNITS (mode); - rtx num_ele_m1 = GEN_INT (num_elements - 1); - - emit_insn (gen_anddi3 (element, element, num_ele_m1)); - gcc_assert (REG_P (tmp_gpr)); - emit_move_insn (dest, rs6000_adjust_vec_address (dest, src, element, - tmp_gpr, scalar_mode)); + emit_move_insn (dest, + rs6000_adjust_vec_address (dest, src, element, tmp_gpr, + scalar_mode)); return; } -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797