Hi all, > I believe this patch is wrong. Please revert. See the PR84762 testcase. >
I'm happy to revert it given the regression and GCC 8 release being imminent, but looking at the code again it seems to me that the original code may also not be fully correct? Particularly I'm wondering what happens if you overalign the struct. > There are two problems. Firstly, if padding_correction is non-zero, > then xbitpos % BITS_PER_WORD is non-zero and in > > store_bit_field (dst_word, bitsize, xbitpos % BITS_PER_WORD, > 0, 0, word_mode, > extract_bit_field (src_word, bitsize, > bitpos % BITS_PER_WORD, 1, > NULL_RTX, word_mode, word_mode, > false, NULL), > false); > > the stored bit-field exceeds the destination register size. You could > fix that by making bitsize the gcd of bitsize and padding_correction. > > However, that doesn't fix the second problem which is that the > extracted bit-field can exceed the source size. That will result in > rubbish being read into a register. Yes, this is because I misunderstood how extract_bit_field works, I had though that the word_mode was the size of the load and that the bitsize, bitpos is just used to determine the mask & shift. But it seems to do the load based on bitsize and word_mode is just the mode you want the results in? In which case, wouldn't just adjusting bitsize to be the largest size smaller than the number of bits we want to copy in each loop iteration work? alignment permitted. Regards, Tamar ________________________________________ From: Alan Modra <amo...@gmail.com> Sent: Tuesday, April 3, 2018 1:25 PM To: Richard Biener Cc: Peter Bergner; Tamar Christina; gcc-patches@gcc.gnu.org; nd; l...@redhat.com; i...@airs.com Subject: Re: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)] On Tue, Apr 03, 2018 at 01:01:23PM +0200, Richard Biener wrote: > On Fri, 30 Mar 2018, Peter Bergner wrote: > > > On 3/29/18 9:35 AM, Alan Modra wrote: > > > On Thu, Nov 16, 2017 at 03:27:01PM +0000, Tamar Christina wrote: > > >> --- a/gcc/expr.c > > >> +++ b/gcc/expr.c > > >> @@ -2769,7 +2769,9 @@ copy_blkmode_to_reg (machine_mode mode, tree src) > > >> > > >> n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD; > > >> dst_words = XALLOCAVEC (rtx, n_regs); > > >> - bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); > > >> + bitsize = BITS_PER_WORD; > > >> + if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE > > >> (src)))) > > >> + bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); > > >> > > >> /* Copy the structure BITSIZE bits at a time. */ > > >> for (bitpos = 0, xbitpos = padding_correction; > > > > > > I believe this patch is wrong. Please revert. See the PR84762 testcase. > > > > > > There are two problems. Firstly, if padding_correction is non-zero, > > > then xbitpos % BITS_PER_WORD is non-zero and in > > > > > > store_bit_field (dst_word, bitsize, xbitpos % BITS_PER_WORD, > > > 0, 0, word_mode, > > > extract_bit_field (src_word, bitsize, > > > bitpos % BITS_PER_WORD, 1, > > > NULL_RTX, word_mode, word_mode, > > > false, NULL), > > > false); > > > > > > the stored bit-field exceeds the destination register size. You could > > > fix that by making bitsize the gcd of bitsize and padding_correction. > > > > > > However, that doesn't fix the second problem which is that the > > > extracted bit-field can exceed the source size. That will result in > > > rubbish being read into a register. > > > > FYI, I received an automated response saying Tamar is away on vacation > > with no return date specified. That means he won't be able to revert > > the patch. What do we do now? > > The code before the change already looks fishy to me. Yes, it is fishy so far as the code in the loop relies on alignment determining a small enough bitsize. Adding if (bytes % UNITS_PER_WORD != 0) bitsize = gcd (bitsize, (bytes % UNITS_PER_WORD) * BITS_PER_UNIT); after bitsize is calculated from alignment would make the code correct, I believe. But I think that will fail the testcase Tamar added. > x = expand_normal (src); > > so what do we expect this to expand to in general? Fortunately > it seems there are exactly two callers so hopefully a > gcc_assert (MEM_P (x)) would work? > > The fishy part is looking at TYPE_ALIGN (TREE_TYPE (src)). > > In case x is a MEM we should use MEM_ALIGN and if not then we > shouldn't do anything but just use word_mode here. No! You can't use a bitsize of BITS_PER_WORD here. See the code I quoted above. -- Alan Modra Australia Development Lab, IBM ________________________________________ From: Richard Biener <rguent...@suse.de> Sent: Tuesday, April 3, 2018 1:30:23 PM To: Alan Modra Cc: Peter Bergner; Tamar Christina; gcc-patches@gcc.gnu.org; nd; l...@redhat.com; i...@airs.com Subject: Re: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)] On Tue, 3 Apr 2018, Alan Modra wrote: > On Tue, Apr 03, 2018 at 01:01:23PM +0200, Richard Biener wrote: > > On Fri, 30 Mar 2018, Peter Bergner wrote: > > > > > On 3/29/18 9:35 AM, Alan Modra wrote: > > > > On Thu, Nov 16, 2017 at 03:27:01PM +0000, Tamar Christina wrote: > > > >> --- a/gcc/expr.c > > > >> +++ b/gcc/expr.c > > > >> @@ -2769,7 +2769,9 @@ copy_blkmode_to_reg (machine_mode mode, tree src) > > > >> > > > >> n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD; > > > >> dst_words = XALLOCAVEC (rtx, n_regs); > > > >> - bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); > > > >> + bitsize = BITS_PER_WORD; > > > >> + if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE > > > >> (src)))) > > > >> + bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); > > > >> > > > >> /* Copy the structure BITSIZE bits at a time. */ > > > >> for (bitpos = 0, xbitpos = padding_correction; > > > > > > > > I believe this patch is wrong. Please revert. See the PR84762 > > > > testcase. > > > > > > > > There are two problems. Firstly, if padding_correction is non-zero, > > > > then xbitpos % BITS_PER_WORD is non-zero and in > > > > > > > > store_bit_field (dst_word, bitsize, xbitpos % BITS_PER_WORD, > > > > 0, 0, word_mode, > > > > extract_bit_field (src_word, bitsize, > > > > bitpos % BITS_PER_WORD, 1, > > > > NULL_RTX, word_mode, > > > > word_mode, > > > > false, NULL), > > > > false); > > > > > > > > the stored bit-field exceeds the destination register size. You could > > > > fix that by making bitsize the gcd of bitsize and padding_correction. > > > > > > > > However, that doesn't fix the second problem which is that the > > > > extracted bit-field can exceed the source size. That will result in > > > > rubbish being read into a register. > > > > > > FYI, I received an automated response saying Tamar is away on vacation > > > with no return date specified. That means he won't be able to revert > > > the patch. What do we do now? > > > > The code before the change already looks fishy to me. > > Yes, it is fishy so far as the code in the loop relies on alignment > determining a small enough bitsize. Adding > > if (bytes % UNITS_PER_WORD != 0) > bitsize = gcd (bitsize, (bytes % UNITS_PER_WORD) * BITS_PER_UNIT); > > after bitsize is calculated from alignment would make the code > correct, I believe. But I think that will fail the testcase Tamar > added. > > > x = expand_normal (src); > > > > so what do we expect this to expand to in general? Fortunately > > it seems there are exactly two callers so hopefully a > > gcc_assert (MEM_P (x)) would work? > > > > The fishy part is looking at TYPE_ALIGN (TREE_TYPE (src)). > > > > In case x is a MEM we should use MEM_ALIGN and if not then we > > shouldn't do anything but just use word_mode here. > > No! You can't use a bitsize of BITS_PER_WORD here. See the code I > quoted above. Ah, so it should possibly use TYPE_SIZE instead of TYPE_ALIGN, otherwise it will handle the over-aligned case (align to 8 bytes, size 4 bytes) incorrectly as well? Richard.