Hi! When not using .base64 directive, we emit for long sequences of zeros .string "foobarbaz" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "" The following patch changes that to .string "foobarbaz" .zero 12 It keeps emitting .string "" if there is just one zero or two zeros where the first one is preceded by non-zeros, so we can have .string "foobarbaz" .string "" or .base64 "VG8gYmUgb3Igbm90IHRvIGJlLCB0aGF0IGlzIHRoZSBxdWVzdGlvbg==" .string "" but not 2 .string "" in a row.
On a testcase I have with around 310440 0-255 unsigned char character constants mostly derived from cc1plus start but with too long sequences of 0s which broke transformation to STRING_CST adjusted to have at most 126 consecutive 0s, I see: 1504498 bytes long assembly without this patch on i686-linux (without .base64 support in binutils) 1155071 bytes long assembly with this patch on i686-linux (without .base64 support in binutils) 431390 bytes long assembly without this patch on x86_64-linux (with .base64 support in binutils) 427593 bytes long assembly with this patch on x86_64-linux (with .base64 support in binutils) All 4 assemble to identical *.o file when using x86_64-linux .base64 supporting gas, and the former 2 when using older x86_64-linux gas assemble to identical content as well. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2024-07-17 Jakub Jelinek <ja...@redhat.com> * varasm.cc (default_elf_asm_output_ascii): Use ASM_OUTPUT_SKIP instead of 2 or more default_elf_asm_output_limited_string (f, "") calls and adjust base64 heuristics correspondingly. --- gcc/varasm.cc.jj 2024-07-16 13:36:54.259748720 +0200 +++ gcc/varasm.cc 2024-07-16 14:08:19.211753867 +0200 @@ -8538,6 +8538,7 @@ default_elf_asm_output_ascii (FILE *f, c if (s >= last_base64) { unsigned cnt = 0; + unsigned char prev_c = ' '; const char *t; for (t = s; t < limit && (t - s) < (long) ELF_STRING_LIMIT - 1; t++) { @@ -8560,7 +8561,13 @@ default_elf_asm_output_ascii (FILE *f, c break; case 1: if (c == 0) - cnt += 2 + strlen (STRING_ASM_OP) + 1; + { + if (prev_c == 0 + && t + 1 < limit + && (t + 1 - s) < (long) ELF_STRING_LIMIT - 1) + break; + cnt += 2 + strlen (STRING_ASM_OP) + 1; + } else cnt += 4; break; @@ -8568,6 +8575,7 @@ default_elf_asm_output_ascii (FILE *f, c cnt += 2; break; } + prev_c = c; } if (cnt > ((unsigned) (t - s) + 2) / 3 * 4 && (t - s) >= 3) { @@ -8633,8 +8641,18 @@ default_elf_asm_output_ascii (FILE *f, c bytes_in_chunk = 0; } - default_elf_asm_output_limited_string (f, s); - s = p; + if (p == s && p + 1 < limit && p[1] == '\0') + { + for (p = s + 2; p < limit && *p == '\0'; p++) + continue; + ASM_OUTPUT_SKIP (f, (unsigned HOST_WIDE_INT) (p - s)); + s = p - 1; + } + else + { + default_elf_asm_output_limited_string (f, s); + s = p; + } } else { Jakub