Evgeny Karpov <evgeny.kar...@microsoft.com> writes: > Tuesday, October 29, 2024 > Richard Sandiford <richard.sandif...@arm.com> wrote: > >> Hmm, I see. I think this is surprising enough that it would be worth >> a comment. How about: >> >> /* Since the assembly directive only specifies a size, and not an >> alignment, we need to follow the default ASM_OUTPUT_LOCAL behavior >> and round the size up to at least a multiple of BIGGEST_ALIGNMENT bits, >> so that each uninitialized object starts on such a boundary. >> However, we also want to allow the alignment (and thus minimum size) >> to exceed BIGGEST_ALIGNMENT. */ > > Thanks for the suggestion. It will be included in the next version of the > patch. > >> But how does using a larger size force the linker to assign a larger >> alignment than BIGGEST_ALIGNMENT? Is there a second limit in play? >> >> Or does this patch not guarantee that the ffmpeg variable gets the >> alignment it wants? Is it just about suppresing the error? >> >> If it's just about suppressing the error without guaranteeing the >> requested alignment, then, yeah, I think patching ffmpeg would >> be better. If the patch does guarantee the alignment, then the >> patch seems ok, but I think the comment should explain how, and >> explain why BIGGEST_ALIGNMENT isn't larger. > > It looks like it generates the expected assembly code for the alignments > and the correct object file, and it should be the expected code for FFmpeg. > > The alignment cannot be larger than 8192, otherwise, it will generate an > error. > > error: requested alignment ‘16384’ exceeds object file maximum 8192 > 16 | float __attribute__((aligned (1 << 14))) large_aligned_array10[3];
OK, thanks. But... > Here an example: > > float large_aligned_array[3]; > float __attribute__((aligned (8))) large_aligned_array2[3]; > float __attribute__((aligned (16))) large_aligned_array3[3]; > float __attribute__((aligned (32))) large_aligned_array4[3]; > float __attribute__((aligned (64))) large_aligned_array5[3]; > float __attribute__((aligned (128))) large_aligned_array6[3]; > float __attribute__((aligned (256))) large_aligned_array7[3]; > float __attribute__((aligned (512))) large_aligned_array8[3]; > float __attribute__((aligned (1024))) large_aligned_array9[3]; > > > .align 3 > .def large_aligned_array; .scl 3; .type 0; .endef > large_aligned_array: > .space 12 // skip Is this the construct used by the patch? The original patch was: +#define ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGNMENT) \ + { \ + unsigned HOST_WIDE_INT rounded = MAX ((SIZE), 1); \ + unsigned HOST_WIDE_INT alignment = MAX ((ALIGNMENT), BIGGEST_ALIGNMENT); \ + rounded += (alignment / BITS_PER_UNIT) - 1; \ + rounded = (rounded / (alignment / BITS_PER_UNIT) \ + * (alignment / BITS_PER_UNIT)); \ + ASM_OUTPUT_LOCAL (FILE, NAME, SIZE, rounded); \ + } with patch 5 defining ASM_OUTPUT_LOCAL as: +#define ASM_OUTPUT_LOCAL(FILE, NAME, SIZE, ROUNDED) \ +( fputs (".lcomm ", (FILE)), \ + assemble_name ((FILE), (NAME)), \ + fprintf ((FILE), ",%u\n", (int)(ROUNDED))) So for the change in question, I was expecting, say, a 1024-byte-aligned float[3] to be defined using: .lcomm array, 1024 If we have access to .align, couldn't we define ASM_OUTPUT_ALIGNED_LOCAL to use that, using the style you quoted above? Or is the .lcomm approach needed to work with -fcommon? Thanks, Richard > .global large_aligned_array2 > .align 3 > .def large_aligned_array2; .scl 3; .type 0; .endef > large_aligned_array2: > .space 12 // skip > > .global large_aligned_array3 > .align 4 > .def large_aligned_array3; .scl 3; .type 0; .endef > large_aligned_array3: > .space 12 // skip > > .global large_aligned_array4 > .align 5 > .def large_aligned_array4; .scl 3; .type 0; .endef > large_aligned_array4: > .space 12 // skip > > .global large_aligned_array5 > .align 6 > .def large_aligned_array5; .scl 3; .type 0; .endef > large_aligned_array5: > .space 12 // skip > > .global large_aligned_array6 > .align 7 > .def large_aligned_array6; .scl 3; .type 0; .endef > large_aligned_array6: > .space 12 // skip > > .global large_aligned_array7 > .align 8 > .def large_aligned_array7; .scl 3; .type 0; .endef > large_aligned_array7: > .space 12 // skip > > .global large_aligned_array8 > .align 9 > .def large_aligned_array8; .scl 3; .type 0; .endef > large_aligned_array8: > .space 12 // skip > > .global large_aligned_array9 > .align 10 > .def large_aligned_array9; .scl 3; .type 0; .endef > large_aligned_array9: > .space 12 // skip > > > Symbols in the object file also look good. > > 015 00000000 SECT2 notype External | large_aligned_array > 016 00000010 SECT2 notype External | large_aligned_array2 > 017 00000020 SECT2 notype External | large_aligned_array3 > 018 00000040 SECT2 notype External | large_aligned_array4 > 019 00000080 SECT2 notype External | large_aligned_array5 > 01A 00000100 SECT2 notype External | large_aligned_array6 > 01B 00000200 SECT2 notype External | large_aligned_array7 > 01C 00000400 SECT2 notype External | large_aligned_array8 > 01D 00000800 SECT2 notype External | large_aligned_array9