This patch is a rewrite of parts 2, 3, and 4 of the last version of the
patch series. It tries to address Bernd Edlinger's comments about the
last version in PR56997 and my own dissatisfaction with how messy this
code was!
The particular problem Bernd pointed out was that, upon finding an
unaligned bitfield that had to be split across multiple accesses, the
old code attempted to do each access in the mode of the container
instead of giving up and doing what it does normally, with the end
result that it could write past the end of the object. This is because
store_fixed_bit_field calls itself recursively for split fields, and
likewise for extract_fixed_bit_field. Also, by the time it recognizes
that it can't follow the -fstrict-volatile-bitfields rules, it's already
thrown away state relating to bit regions and skipped a bunch of
possible optimizations.
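For concreteness, here is the sort of layout that can trigger the
problem (my own illustration, not the exact testcase from the PR),
assuming a 32-bit int:

  struct __attribute__((packed)) S {
    char c;
    volatile unsigned int f : 15;   /* packed object is only 3 bytes */
  };

With the old code, each piece of the split access to f could end up
being done in the container mode (SImode here), and a 4-byte access
into a 3-byte object runs past the end of it.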
After thinking about it for a while, I decided the best approach was to
test whether -fstrict-volatile-bitfields applies at the very beginning
and skip straight to the code that handles it. Then the more deeply
nested code need not check for -fstrict-volatile-bitfields at all, which
makes it a lot simpler.
This version of the patch still overrides the bit region specified by
the C/C++ memory model when -fstrict-volatile-bitfields applies, but
that's now happening in store_bit_field and doesn't affect cases where
it can't emit the single access anyway.
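To make that conflict concrete, consider a layout like this (again my
own illustration, assuming a 32-bit int):

  struct S {
    volatile unsigned int f : 8;   /* memory-model bit region is this byte */
    char c;                        /* outside the bit region */
  };

A store to f under -fstrict-volatile-bitfields is done as a full
int-sized read-modify-write, which also covers the byte holding c;
that is exactly the bit-region override described above.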
I think there is still a case where this version of the patch could
end up reading or writing past the end of an object. For example,
consider a packed struct that is 3 bytes long, with a bitfield that has
int type (sketched below). I am not sure how to detect this situation
at this point in the code, but leaving just that case unfixed for now
seems better than leaving the current mess in place. Perhaps it could
be addressed by a follow-up patch. Also, assuming the change in part 4
of the new series to make -fstrict-volatile-bitfields no longer the
default on any target is required as a condition of overriding the new
memory model, programmers at least will be less likely to be taken by
surprise when it does something weird with packed structures.
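A sketch of that remaining case (illustrative only), again assuming a
32-bit int:

  struct __attribute__((packed)) S {
    volatile unsigned int f : 24;   /* sizeof (struct S) == 3 */
  };

The single int-mode access that -fstrict-volatile-bitfields asks for
here would read or write one byte past the end of the object.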
Is this version OK to commit, or at least getting closer? I'd like to
wind this project up soon, somehow or another.
-Sandra
2013-06-30 Sandra Loosemore <san...@codesourcery.com>
PR middle-end/23623
PR middle-end/48784
PR middle-end/56341
PR middle-end/56997
gcc/
* expmed.c (strict_volatile_bitfield_p): New function.
(store_bit_field_1): Don't special-case strict volatile
bitfields here.
(store_bit_field): Handle strict volatile bitfields here instead.
(store_fixed_bit_field): Don't special-case strict volatile
bitfields here.
(extract_bit_field_1): Don't special-case strict volatile
bitfields here.
(extract_bit_field): Handle strict volatile bitfields here instead.
(extract_fixed_bit_field): Don't special-case strict volatile
bitfields here. Simplify surrounding code to resemble that in
store_fixed_bit_field.
* doc/invoke.texi (Code Gen Options): Update
-fstrict-volatile-bitfields description.
diff -u gcc/expmed.c gcc/expmed.c
--- gcc/expmed.c (working copy)
+++ gcc/expmed.c (working copy)
@@ -415,6 +415,42 @@ lowpart_bit_field_p (unsigned HOST_WIDE_
return bitnum % BITS_PER_WORD == 0;
}
+/* Return true if -fstrict-volatile-bitfields applies to an access of OP0
+ containing BITSIZE bits starting at BITNUM, with field mode FIELDMODE. */
+
+static bool
+strict_volatile_bitfield_p (rtx op0, unsigned HOST_WIDE_INT bitsize,
+ unsigned HOST_WIDE_INT bitnum,
+ enum machine_mode fieldmode)
+{
+ unsigned HOST_WIDE_INT modesize = GET_MODE_BITSIZE (fieldmode);
+
+ /* -fstrict-volatile-bitfields must be enabled and we must have a
+ volatile MEM. */
+ if (!MEM_P (op0)
+ || !MEM_VOLATILE_P (op0)
+ || flag_strict_volatile_bitfields <= 0)
+ return false;
+
+ /* Non-integral modes likely only happen with packed structures.
+ Punt. */
+ if (!SCALAR_INT_MODE_P (fieldmode))
+ return false;
+
+ /* The bit size must not be larger than the field mode, and
+ the field mode must not be larger than a word. */
+ if (bitsize > modesize || modesize > BITS_PER_WORD)
+ return false;
+
+ /* Check for cases of unaligned fields that must be split. */
+ if (bitnum % BITS_PER_UNIT + bitsize > modesize
+ || (STRICT_ALIGNMENT
+ && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+ return false;
+
+ return true;
+}
+
/* Return true if OP is a memory and if a bitfield of size BITSIZE at
bit number BITNUM can be treated as a simple value of mode MODE. */
@@ -813,12 +849,8 @@ store_bit_field_1 (rtx str_rtx, unsigned
cheap register alternative is available. */
if (MEM_P (op0))
{
- /* Do not use unaligned memory insvs for volatile bitfields when
- -fstrict-volatile-bitfields is in effect. */
- if (!(MEM_VOLATILE_P (op0)
- && flag_strict_volatile_bitfields > 0)
- && get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum,
- fieldmode)
+ if (get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum,
+ fieldmode)
&& store_bit_field_using_insv (&insv, op0, bitsize, bitnum, value))
return true;
@@ -871,6 +903,27 @@ store_bit_field (rtx str_rtx, unsigned H
enum machine_mode fieldmode,
rtx value)
{
+ /* Handle -fstrict-volatile-bitfields in the cases where it applies. */
+ if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, fieldmode))
+ {
+
+ /* Storing any naturally aligned field can be done with a simple
+ store. For targets that support fast unaligned memory, any
+ naturally sized, unit aligned field can be done directly. */
+ if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, fieldmode))
+ {
+ str_rtx = adjust_bitfield_address (str_rtx, fieldmode,
+ bitnum / BITS_PER_UNIT);
+ emit_move_insn (str_rtx, value);
+ }
+ else
+ /* Explicitly override the C/C++ memory model; ignore the
+ bit range so that we can do the access in the mode mandated
+ by -fstrict-volatile-bitfields instead. */
+ store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value);
+ return;
+ }
+
/* Under the C++0x memory model, we must not touch bits outside the
bit region. Adjust the address to start at the beginning of the
bit region. */
@@ -923,29 +976,12 @@ store_fixed_bit_field (rtx op0, unsigned
if (MEM_P (op0))
{
- unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE;
-
- if (bitregion_end)
- maxbits = bitregion_end - bitregion_start + 1;
-
- /* Get the proper mode to use for this field. We want a mode that
- includes the entire field. If such a mode would be larger than
- a word, we won't be doing the extraction the normal way.
- We don't want a mode bigger than the destination. */
-
mode = GET_MODE (op0);
if (GET_MODE_BITSIZE (mode) == 0
|| GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
mode = word_mode;
-
- if (MEM_VOLATILE_P (op0)
- && GET_MODE_BITSIZE (GET_MODE (op0)) > 0
- && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
- && flag_strict_volatile_bitfields > 0)
- mode = GET_MODE (op0);
- else
- mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
- MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+ mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
+ MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
{
@@ -1429,19 +1465,8 @@ extract_bit_field_1 (rtx str_rtx, unsign
If that's wrong, the solution is to test for it and set TARGET to 0
if needed. */
- /* If the bitfield is volatile, we need to make sure the access
- remains on a type-aligned boundary. */
- if (GET_CODE (op0) == MEM
- && MEM_VOLATILE_P (op0)
- && GET_MODE_BITSIZE (GET_MODE (op0)) > 0
- && flag_strict_volatile_bitfields > 0)
- goto no_subreg_mode_swap;
-
- /* Only scalar integer modes can be converted via subregs. There is an
- additional problem for FP modes here in that they can have a precision
- which is different from the size. mode_for_size uses precision, but
- we want a mode based on the size, so we must avoid calling it for FP
- modes. */
+ /* Get the mode of the field to use for atomic access or subreg
+ conversion. */
mode1 = mode;
if (SCALAR_INT_MODE_P (tmode))
{
@@ -1474,8 +1499,6 @@ extract_bit_field_1 (rtx str_rtx, unsign
return convert_extracted_bit_field (op0, mode, tmode, unsignedp);
}
- no_subreg_mode_swap:
-
/* Handle fields bigger than a word. */
if (bitsize > BITS_PER_WORD)
@@ -1595,11 +1618,8 @@ extract_bit_field_1 (rtx str_rtx, unsign
cheap register alternative is available. */
if (MEM_P (op0))
{
- /* Do not use extv/extzv for volatile bitfields when
- -fstrict-volatile-bitfields is in effect. */
- if (!(MEM_VOLATILE_P (op0) && flag_strict_volatile_bitfields > 0)
- && get_best_mem_extraction_insn (&extv, pattern, bitsize, bitnum,
- tmode))
+ if (get_best_mem_extraction_insn (&extv, pattern, bitsize, bitnum,
+ tmode))
{
rtx result = extract_bit_field_using_extv (&extv, op0, bitsize,
bitnum, unsignedp,
@@ -1665,6 +1685,31 @@ extract_bit_field (rtx str_rtx, unsigned
unsigned HOST_WIDE_INT bitnum, int unsignedp, rtx target,
enum machine_mode mode, enum machine_mode tmode)
{
+ enum machine_mode mode1;
+
+ /* Handle -fstrict-volatile-bitfields in the cases where it applies. */
+ if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) > 0)
+ mode1 = GET_MODE (str_rtx);
+ else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
+ mode1 = GET_MODE (target);
+ else
+ mode1 = tmode;
+
+ if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, mode1))
+ {
+ rtx result;
+
+ /* Extraction of a full MODE1 value can be done with a load as long as
+ the field is on a byte boundary and is sufficiently aligned. */
+ if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, mode1))
+ result = adjust_bitfield_address (str_rtx, mode1,
+ bitnum / BITS_PER_UNIT);
+ else
+ result = extract_fixed_bit_field (mode, str_rtx, bitsize, bitnum,
+ target, unsignedp);
+ return convert_extracted_bit_field (result, mode, tmode, unsignedp);
+ }
+
return extract_bit_field_1 (str_rtx, bitsize, bitnum, unsignedp,
target, mode, tmode, true);
}
@@ -1691,45 +1736,19 @@ extract_fixed_bit_field (enum machine_mo
includes the entire field. If such a mode would be larger than
a word, we won't be doing the extraction the normal way. */
- if (MEM_VOLATILE_P (op0)
- && flag_strict_volatile_bitfields > 0)
- {
- if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0)
- mode = GET_MODE (op0);
- else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
- mode = GET_MODE (target);
- else
- mode = tmode;
- }
- else
- mode = get_best_mode (bitsize, bitnum, 0, 0,
- MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
+ mode = GET_MODE (op0);
+ if (GET_MODE_BITSIZE (mode) == 0
+ || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
+ mode = word_mode;
+ mode = get_best_mode (bitsize, bitnum, 0, 0,
+ MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
/* The only way this should occur is if the field spans word
boundaries. */
return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);
- unsigned int total_bits = GET_MODE_BITSIZE (mode);
- HOST_WIDE_INT bit_offset = bitnum - bitnum % total_bits;
-
- /* If we're accessing a volatile MEM, we can't apply BIT_OFFSET
- if it results in a multi-word access where we otherwise wouldn't
- have one. So, check for that case here. */
- if (MEM_P (op0)
- && MEM_VOLATILE_P (op0)
- && flag_strict_volatile_bitfields > 0
- && bitnum % BITS_PER_UNIT + bitsize <= total_bits
- && bitnum % GET_MODE_BITSIZE (mode) + bitsize > total_bits)
- {
- /* If the target doesn't support unaligned access, give up and
- split the access into two. */
- if (STRICT_ALIGNMENT)
- return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);
- bit_offset = bitnum - bitnum % BITS_PER_UNIT;
- }
- op0 = adjust_bitfield_address (op0, mode, bit_offset / BITS_PER_UNIT);
- bitnum -= bit_offset;
+ op0 = narrow_bit_field_mem (op0, mode, bitsize, bitnum, &bitnum);
}
mode = GET_MODE (op0);
diff -u gcc/doc/invoke.texi gcc/doc/invoke.texi
--- gcc/doc/invoke.texi (working copy)
+++ gcc/doc/invoke.texi (working copy)
@@ -20887,6 +20887,12 @@ instruction, even though that accesses b
any portion of the bit-field, or memory-mapped registers unrelated to
the one being updated.
+In some cases, such as when the @code{packed} attribute is applied to a
+structure field, it may not be possible to access the field with a single
+read or write that is correctly aligned for the target machine. In this
+case GCC falls back to generating multiple accesses rather than code that
+will fault or truncate the result at run time.
+
The default value of this option is determined by the application binary
interface for the target processor.