hctim added a comment.

> summary of DWARF:
> & how many of these descriptions get added to the debug info?

afaict, there is now:

- 1x .debug_addr entry for each string
- 1x .debug_info DW_TAG_variable for each string
- 1x DW_TAG_array_type + DW_TAG_subrange_type for each unique sizeof(string)

i tried to check whether there are other bits lying around that could be optimised. i briefly considered diffing the llvm-dwarfdump output before/after for clang itself, but the dump files reached 20gb, so i rethought that decision. the dwarfdump diff for the clang/test/CodeGen/debug-info-variables.c dwo is below.

> Numbers for Split DWARF may be helpful too - given this'll add an extra
> address/relocation for every string literal, it might make object size
> (specifically unlinked object size where relocations are expensive/plentiful)
> significantly larger in problematic ways.

sorry, i don't understand why split-dwarf means this requires an additional relocation (i'm not really sure what split-dwarf is beyond putting the dwarf in a separate file, and i don't see why that would change relocations). i made a quick dwarfdump diff on clang/test/CodeGen/debug-info-variables.c (with split-dwarf):

sections old:

  [Nr] Name                   Type     Address          Off    Size   ES Flg Lk Inf Al
  [ 2] .debug_str.dwo         PROGBITS 0000000000000000 000040 0000eb 01 MSE  0   0  1
  [ 3] .debug_str_offsets.dwo PROGBITS 0000000000000000 00012b 00002c 00   E  0   0  1
  [ 4] .debug_info.dwo        PROGBITS 0000000000000000 000157 000077 00   E  0   0  1
  [ 5] .debug_abbrev.dwo      PROGBITS 0000000000000000 0001ce 000091 00   E  0   0  1

sections new:

  [Nr] Name                   Type     Address          Off    Size   ES Flg Lk Inf Al
  [ 3] .debug_str.dwo         PROGBITS 0000000000000000 000078 0000ff 01 MSE  0   0  1
  [ 2] .debug_str_offsets.dwo PROGBITS 0000000000000000 000040 000038 00   E  0   0  1
  [ 4] .debug_info.dwo        PROGBITS 0000000000000000 000177 000092 00   E  0   0  1
  [ 5] .debug_abbrev.dwo      PROGBITS 0000000000000000 000209 0000aa 00   E  0   0  1

so `.debug_str += 0x14`, `.debug_str_offsets += 0xc`, `.debug_info += 0x1b` and `.debug_abbrev += 0x19`.
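for anyone who wants to re-check the arithmetic, here is a minimal python sketch that recomputes the per-section growth from the Size columns of the two listings. the dictionaries and the `section_deltas` helper are made up for illustration and are not part of the patch or of any llvm tool; the sizes are copied by hand from the section headers quoted in this comment.

```python
# Sizes (in bytes, written as hex) copied by hand from the "Size" column of the
# old and new section-header listings. Illustrative only.
OLD_SIZES = {
    ".debug_str.dwo": 0x0000EB,
    ".debug_str_offsets.dwo": 0x00002C,
    ".debug_info.dwo": 0x000077,
    ".debug_abbrev.dwo": 0x000091,
}
NEW_SIZES = {
    ".debug_str.dwo": 0x0000FF,
    ".debug_str_offsets.dwo": 0x000038,
    ".debug_info.dwo": 0x000092,
    ".debug_abbrev.dwo": 0x0000AA,
}


def section_deltas(old, new):
    """Return {section name: size growth in bytes} for sections present in both."""
    return {name: new[name] - old[name] for name in old if name in new}


deltas = section_deltas(OLD_SIZES, NEW_SIZES)
total_growth = sum(deltas.values())

if __name__ == "__main__":
    for name, delta in deltas.items():
        print(f"{name:<24} += {delta:#x} ({delta} bytes)")
    print(f"{'total':<24} += {total_growth:#x} ({total_growth} bytes)")
```

this only covers the four .dwo sections listed here. it says nothing about .debug_addr or relocations in the object file, which is where the split-dwarf question was aimed.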
as before, the DW_TAG_array_type + DW_TAG_subrange_type would be amortised across strings with the same size. unfortunately, i don't see any further places to optimise (except for the full `const char*` amortization which, as noted in a previous comment, didn't make much of an improvement for the entire clang binary).

diff --git a/tmp/dwo/dwo-dump b/dwo-dump
index e0fdd77..6415086 100644
--- a/tmp/dwo/dwo-dump
+++ b/dwo-dump
@@ -3,14 +3,13 @@ debug-info-variables.dwo: file format elf64-x86-64
 .debug_abbrev.dwo contents:
 Abbrev table for offset: 0x00000000
 [1] DW_TAG_compile_unit DW_CHILDREN_yes
-        DW_AT_producer DW_FORM_GNU_str_index
+        DW_AT_producer DW_FORM_strx1
         DW_AT_language DW_FORM_data2
-        DW_AT_name DW_FORM_GNU_str_index
-        DW_AT_GNU_dwo_name DW_FORM_GNU_str_index
-        DW_AT_GNU_dwo_id DW_FORM_data8
+        DW_AT_name DW_FORM_strx1
+        DW_AT_dwo_name DW_FORM_strx1

 [2] DW_TAG_variable DW_CHILDREN_no
-        DW_AT_name DW_FORM_GNU_str_index
+        DW_AT_name DW_FORM_strx1
         DW_AT_type DW_FORM_ref4
         DW_AT_external DW_FORM_flag_present
         DW_AT_decl_file DW_FORM_data1
@@ -18,155 +17,194 @@ Abbrev table for offset: 0x00000000
         DW_AT_location DW_FORM_exprloc

 [3] DW_TAG_base_type DW_CHILDREN_no
-        DW_AT_name DW_FORM_GNU_str_index
+        DW_AT_name DW_FORM_strx1
         DW_AT_encoding DW_FORM_data1
         DW_AT_byte_size DW_FORM_data1

-[4] DW_TAG_subprogram DW_CHILDREN_no
-        DW_AT_low_pc DW_FORM_GNU_addr_index
+[4] DW_TAG_variable DW_CHILDREN_no
+        DW_AT_type DW_FORM_ref4
+        DW_AT_decl_file DW_FORM_data1
+        DW_AT_decl_line DW_FORM_data1
+        DW_AT_location DW_FORM_exprloc
+
+[5] DW_TAG_array_type DW_CHILDREN_yes
+        DW_AT_type DW_FORM_ref4
+
+[6] DW_TAG_subrange_type DW_CHILDREN_no
+        DW_AT_type DW_FORM_ref4
+        DW_AT_count DW_FORM_data1
+
+[7] DW_TAG_base_type DW_CHILDREN_no
+        DW_AT_name DW_FORM_strx1
+        DW_AT_byte_size DW_FORM_data1
+        DW_AT_encoding DW_FORM_data1
+
+[8] DW_TAG_subprogram DW_CHILDREN_no
+        DW_AT_low_pc DW_FORM_addrx
         DW_AT_high_pc DW_FORM_data4
         DW_AT_frame_base DW_FORM_exprloc
-        DW_AT_name DW_FORM_GNU_str_index
+        DW_AT_name DW_FORM_strx1
         DW_AT_decl_file DW_FORM_data1
         DW_AT_decl_line DW_FORM_data1
         DW_AT_type DW_FORM_ref4
         DW_AT_external DW_FORM_flag_present

-[5] DW_TAG_subprogram DW_CHILDREN_yes
-        DW_AT_low_pc DW_FORM_GNU_addr_index
+[9] DW_TAG_subprogram DW_CHILDREN_yes
+        DW_AT_low_pc DW_FORM_addrx
         DW_AT_high_pc DW_FORM_data4
         DW_AT_frame_base DW_FORM_exprloc
-        DW_AT_name DW_FORM_GNU_str_index
+        DW_AT_name DW_FORM_strx1
         DW_AT_decl_file DW_FORM_data1
         DW_AT_decl_line DW_FORM_data1
         DW_AT_prototyped DW_FORM_flag_present
         DW_AT_type DW_FORM_ref4
         DW_AT_external DW_FORM_flag_present

-[6] DW_TAG_formal_parameter DW_CHILDREN_no
+[10] DW_TAG_formal_parameter DW_CHILDREN_no
         DW_AT_location DW_FORM_exprloc
-        DW_AT_name DW_FORM_GNU_str_index
+        DW_AT_name DW_FORM_strx1
         DW_AT_decl_file DW_FORM_data1
         DW_AT_decl_line DW_FORM_data1
         DW_AT_type DW_FORM_ref4

-[7] DW_TAG_variable DW_CHILDREN_no
+[11] DW_TAG_variable DW_CHILDREN_no
         DW_AT_location DW_FORM_exprloc
-        DW_AT_name DW_FORM_GNU_str_index
+        DW_AT_name DW_FORM_strx1
         DW_AT_decl_file DW_FORM_data1
         DW_AT_decl_line DW_FORM_data1
         DW_AT_type DW_FORM_ref4

-[8] DW_TAG_pointer_type DW_CHILDREN_no
+[12] DW_TAG_pointer_type DW_CHILDREN_no
         DW_AT_type DW_FORM_ref4

-[9] DW_TAG_const_type DW_CHILDREN_no
+[13] DW_TAG_const_type DW_CHILDREN_no
         DW_AT_type DW_FORM_ref4


 .debug_info.dwo contents:
-0x00000000: Compile Unit: length = 0x00000073, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000077)
+0x00000000: Compile Unit: length = 0x0000008e, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size = 0x08, DWO_id = 0xa7a0aceb112fa998 (next unit at 0x00000092)

-0x0000000b: DW_TAG_compile_unit
-              DW_AT_producer ("clang version 14.0.0 (https://github.com/llvm/llvm-project.git 5357a98c823a5262814e269db265b5d8e1f2c4f2)")
+0x00000014: DW_TAG_compile_unit
+              DW_AT_producer ("clang version 15.0.0 (https://github.com/llvm/llvm-project.git 4c7fff8247ec5485701d4e04ea45b9fe399e1c5a)")
               DW_AT_language (DW_LANG_C99)
               DW_AT_name ("/usr/local/google/home/mitchp/llvm/clang/test/CodeGen/debug-info-variables.c")
-              DW_AT_GNU_dwo_name ("debug-info-variables.dwo")
-              DW_AT_GNU_dwo_id (0x02255219cf8b78f3)
+              DW_AT_dwo_name ("debug-info-variables.dwo")

-0x00000019: DW_TAG_variable
+0x0000001a: DW_TAG_variable
               DW_AT_name ("global")
-              DW_AT_type (0x00000024 "int")
+              DW_AT_type (0x00000025 "int")
               DW_AT_external (true)
               DW_AT_decl_file (0x01)
               DW_AT_decl_line (4)
-              DW_AT_location (DW_OP_GNU_addr_index 0x0)
+              DW_AT_location (DW_OP_addrx 0x0)

-0x00000024: DW_TAG_base_type
+0x00000025: DW_TAG_base_type
               DW_AT_name ("int")
               DW_AT_encoding (DW_ATE_signed)
               DW_AT_byte_size (0x04)

-0x00000028: DW_TAG_subprogram
-              DW_AT_low_pc (indexed (00000001) address = <unresolved>)
+0x00000029: DW_TAG_variable
+              DW_AT_type (0x00000033 "char [11]")
+              DW_AT_decl_file (0x01)
+              DW_AT_decl_line (8)
+              DW_AT_location (DW_OP_addrx 0x1)
+
+0x00000033: DW_TAG_array_type
+              DW_AT_type (0x0000003f "char")
+
+0x00000038: DW_TAG_subrange_type
+              DW_AT_type (0x00000043 "__ARRAY_SIZE_TYPE__")
+              DW_AT_count (0x0b)
+
+0x0000003e: NULL
+
+0x0000003f: DW_TAG_base_type
+              DW_AT_name ("char")
+              DW_AT_encoding (DW_ATE_signed_char)
+              DW_AT_byte_size (0x01)
+
+0x00000043: DW_TAG_base_type
+              DW_AT_name ("__ARRAY_SIZE_TYPE__")
+              DW_AT_byte_size (0x08)
+              DW_AT_encoding (DW_ATE_unsigned)
+
+0x00000047: DW_TAG_subprogram
+              DW_AT_low_pc (indexed (00000002) address = <unresolved>)
               DW_AT_high_pc (0x0000000d)
               DW_AT_frame_base (DW_OP_reg6 RBP)
               DW_AT_name ("s")
               DW_AT_decl_file (0x01)
               DW_AT_decl_line (7)
-              DW_AT_type (0x00000068 "const char *")
+              DW_AT_type (0x00000087 "const char *")
               DW_AT_external (true)

-0x00000037: DW_TAG_subprogram
-              DW_AT_low_pc (indexed (00000002) address = <unresolved>)
+0x00000056: DW_TAG_subprogram
+              DW_AT_low_pc (indexed (00000003) address = <unresolved>)
               DW_AT_high_pc (0x00000018)
               DW_AT_frame_base (DW_OP_reg6 RBP)
               DW_AT_name ("sum")
               DW_AT_decl_file (0x01)
               DW_AT_decl_line (14)
               DW_AT_prototyped (true)
-              DW_AT_type (0x00000024 "int")
+              DW_AT_type (0x00000025 "int")
               DW_AT_external (true)

-0x00000046: DW_TAG_formal_parameter
+0x00000065: DW_TAG_formal_parameter
               DW_AT_location (DW_OP_fbreg -4)
               DW_AT_name ("p")
               DW_AT_decl_file (0x01)
               DW_AT_decl_line (14)
-              DW_AT_type (0x00000024 "int")
+              DW_AT_type (0x00000025 "int")

-0x00000051: DW_TAG_formal_parameter
+0x00000070: DW_TAG_formal_parameter
               DW_AT_location (DW_OP_fbreg -8)
               DW_AT_name ("q")
               DW_AT_decl_file (0x01)
               DW_AT_decl_line (14)
-              DW_AT_type (0x00000024 "int")
+              DW_AT_type (0x00000025 "int")

-0x0000005c: DW_TAG_variable
+0x0000007b: DW_TAG_variable
               DW_AT_location (DW_OP_fbreg -12)
               DW_AT_name ("r")
               DW_AT_decl_file (0x01)
               DW_AT_decl_line (15)
-              DW_AT_type (0x00000024 "int")
-
-0x00000067: NULL
+              DW_AT_type (0x00000025 "int")

-0x00000068: DW_TAG_pointer_type
-              DW_AT_type (0x0000006d "const char")
+0x00000086: NULL

-0x0000006d: DW_TAG_const_type
-              DW_AT_type (0x00000072 "char")
+0x00000087: DW_TAG_pointer_type
+              DW_AT_type (0x0000008c "const char")

-0x00000072: DW_TAG_base_type
-              DW_AT_name ("char")
-              DW_AT_encoding (DW_ATE_signed_char)
-              DW_AT_byte_size (0x01)
+0x0000008c: DW_TAG_const_type
+              DW_AT_type (0x0000003f "char")

-0x00000076: NULL
+0x00000091: NULL

 .debug_str.dwo contents:
 0x00000000: "global"
 0x00000007: "int"
-0x0000000b: "s"
-0x0000000d: "char"
-0x00000012: "sum"
-0x00000016: "p"
-0x00000018: "q"
-0x0000001a: "r"
-0x0000001c: "clang version 14.0.0 (https://github.com/llvm/llvm-project.git 5357a98c823a5262814e269db265b5d8e1f2c4f2)"
-0x00000085: "/usr/local/google/home/mitchp/llvm/clang/test/CodeGen/debug-info-variables.c"
-0x000000d2: "debug-info-variables.dwo"
+0x0000000b: "char"
+0x00000010: "__ARRAY_SIZE_TYPE__"
+0x00000024: "s"
+0x00000026: "sum"
+0x0000002a: "p"
+0x0000002c: "q"
+0x0000002e: "r"
+0x00000030: "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 4c7fff8247ec5485701d4e04ea45b9fe399e1c5a)"
+0x00000099: "/usr/local/google/home/mitchp/llvm/clang/test/CodeGen/debug-info-variables.c"
+0x000000e6: "debug-info-variables.dwo"

 .debug_str_offsets.dwo contents:
-0x00000000: Contribution size = 44, Format = DWARF32, Version = 4
-0x00000000: 00000000 "global"
-0x00000004: 00000007 "int"
-0x00000008: 0000000b "s"
-0x0000000c: 0000000d "char"
-0x00000010: 00000012 "sum"
-0x00000014: 00000016 "p"
-0x00000018: 00000018 "q"
-0x0000001c: 0000001a "r"
-0x00000020: 0000001c "clang version 14.0.0 (https://github.com/llvm/llvm-project.git 5357a98c823a5262814e269db265b5d8e1f2c4f2)"
-0x00000024: 00000085 "/usr/local/google/home/mitchp/llvm/clang/test/CodeGen/debug-info-variables.c"
-0x00000028: 000000d2 "debug-info-variables.dwo"
+0x00000000: Contribution size = 52, Format = DWARF32, Version = 5
+0x00000008: 00000000 "global"
+0x0000000c: 00000007 "int"
+0x00000010: 0000000b "char"
+0x00000014: 00000010 "__ARRAY_SIZE_TYPE__"
+0x00000018: 00000024 "s"
+0x0000001c: 00000026 "sum"
+0x00000020: 0000002a "p"
+0x00000024: 0000002c "q"
+0x00000028: 0000002e "r"
+0x0000002c: 00000030 "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 4c7fff8247ec5485701d4e04ea45b9fe399e1c5a)"
+0x00000030: 00000099 "/usr/local/google/home/mitchp/llvm/clang/test/CodeGen/debug-info-variables.c"
+0x00000034: 000000e6 "debug-info-variables.dwo"

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123534/new/

https://reviews.llvm.org/D123534

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits