> On 5 Nov 2021, at 15:25, Jakub Jelinek <ja...@redhat.com> wrote:
> 
> On Fri, Nov 05, 2021 at 11:31:58AM +0100, Richard Biener wrote:
>> On Fri, Nov 5, 2021 at 10:54 AM Jakub Jelinek <ja...@redhat.com> wrote:
>>> 
>>> On Fri, Nov 05, 2021 at 10:42:05AM +0100, Richard Biener via Gcc-patches wrote:
>>>> I had the impression we have support for PCH file relocation to deal with ASLR
>>>> at least on some platforms.
>>> 
>>> Unfortunately we do not, e.g. if you build cc1/cc1plus as PIE on
>>> x86_64-linux, PCH will stop working unless one always invokes it with
>>> disabled ASLR through personality.
>>> 
>>> I think this is related to function pointers and pointers to .rodata/.data
>>> etc. variables in GC memory, we currently do not relocate that.
>>> 
>>> What we perhaps could do (at least assuming all the ELF PT_LOAD segments
>>> are adjacent with a single load base for them - I think at least ia64
>>> non-PIE binaries were violating this by having .text and .data PT_LOAD
>>> segments many terabytes apart with a hole in between not protected in any
>>> way, but dunno if that is for PIEs too) is to try, in a host-specific
>>> way, to remember the address range in which the function pointers and
>>> .rodata/.data can exist: remember the extent start and end from PCH
>>> generation, and on PCH load query those addresses for the current compiler
>>> and relocate everything in that extent by the load bias from the last run.
>>> But, the assumption for this is that those function and data/rodata pointers
>>> in GC memory are actually marked at least as pointers...
>> 
>> If any such pointers exist they must be marked GTY((skip)) since they do not
>> point to GC memory...  So we'd need to invent special-handling for those.
>> 
>>> Do we e.g. have objects with virtual classes in GC memory and if so, do we
>>> catch their virtual table pointers?
>> 
>> Who knows, but then I don't remember adding stuff that should end in a PCH.
> 
> So, I've investigated a little bit.
> Apparently all the relocation we currently do for PCH is done at PCH write
> time, we choose some address range in the address space we think will be likely
> mmappable each time successfully, relocate all pointers pointing to GC
> memory to point in there and then write that to file, together with the
> scalar GTY global vars values and GTY pointers in global vars.
> On PCH load, we just try to mmap memory in the right range, fail PCH load if
> unsuccessful, and read the GC memory into that range and update scalar and
> pointer GTY global vars from what we've recorded.
> The patch that made PCH load fail for PIEs etc. was
> https://gcc.gnu.org/legacy-ml/gcc-patches/2003-10/msg01994.html
> If we wanted to relocate pointers to functions and .data/.rodata etc.,
> ideally we'd create a relocation list of addresses that should be
> incremented by the bias and quickly relocate those.
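
A minimal sketch in C of the relocation-list idea quoted above. Everything in it
is an assumption for illustration: struct reloc_list, reloc_list_build and
reloc_list_apply are hypothetical names, not anything in ggc-common.c, and the
word-by-word scan is only a stand-in for recording properly marked pointers (a
scan like this can mistake an integer that happens to fall in the range for an
address).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical relocation list: byte offsets of the pointer-sized slots
   in a saved image whose values lay inside the executable's extent.  */
struct reloc_list
{
  size_t *offsets;
  size_t count;
  uintptr_t exe_start;   /* Extent of the executable at write time.  */
  uintptr_t exe_end;
};

/* Write side: scan IMAGE (SIZE bytes, pointer-aligned) and record every
   slot whose value is in [EXE_START, EXE_END).
   Returns 0 on success, -1 if the offset array cannot be allocated.  */
int
reloc_list_build (struct reloc_list *rl, const void *image, size_t size,
                  uintptr_t exe_start, uintptr_t exe_end)
{
  assert (exe_start <= exe_end);

  const uintptr_t *slots = (const uintptr_t *) image;
  size_t nslots = size / sizeof (uintptr_t);

  rl->offsets = malloc ((nslots ? nslots : 1) * sizeof (size_t));
  if (rl->offsets == NULL)
    return -1;
  rl->count = 0;
  rl->exe_start = exe_start;
  rl->exe_end = exe_end;

  for (size_t i = 0; i < nslots; i++)
    if (slots[i] >= exe_start && slots[i] < exe_end)
      rl->offsets[rl->count++] = i * sizeof (uintptr_t);
  return 0;
}

/* Load side: add the load bias (new start minus recorded start) to every
   recorded slot.  Unsigned arithmetic wraps, so a lower new base works.  */
void
reloc_list_apply (const struct reloc_list *rl, void *image,
                  uintptr_t new_exe_start)
{
  uintptr_t bias = new_exe_start - rl->exe_start;

  for (size_t i = 0; i < rl->count; i++)
    {
      uintptr_t *slot = (uintptr_t *) ((char *) image + rl->offsets[i]);
      *slot += bias;
    }
}
```

The load side only touches the recorded slots, so its cost is proportional to
the number of such pointers rather than to the size of the image.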

It is hard to judge the relative effort in the two immediately visible solutions:

1. relocatable PCH
2. taking the tree streamer from the modules implementation, moving its home
    to c-family and adding hooks so that each FE can stream its own special trees.

ISTM that part of the reason people dislike PCH is that the implementation is
mixed up with the GC solution - the rendering is non-transparent etc.

So, in some ways, (2) above would be a better investment - the process of PCH is:
generate:
“get to the end of parsing a TU” .. stream the AST
consume:
.. see a header .. stream the PCH AST in if there is one available for the header.

There is no reason for this to be mixed into the GC solution - the read in (currently)
happens to an empty TU and there should be nothing in the AST that carries any
reference to the compiler’s executable.

just 0.02 GBP.
Iain


> 
> I wrote the following ugly hack:
> 
> --- ggc-common.c.jj   2021-08-19 11:42:27.365422400 +0200
> +++ ggc-common.c      2021-11-05 15:37:51.447222544 +0100
> @@ -404,6 +404,9 @@ struct mmap_info
> 
> /* Write out the state of the compiler to F.  */
> 
> +char *exestart = (char *) 2;
> +char *exeend = (char *) 2;
> +
> void
> gt_pch_save (FILE *f)
> {
> @@ -458,6 +461,14 @@ gt_pch_save (FILE *f)
>     for (rti = *rt; rti->base != NULL; rti++)
>       if (fwrite (rti->base, rti->stride, 1, f) != 1)
>       fatal_error (input_location, "cannot write PCH file: %m");
> +      else if ((((uintptr_t) rti->base) & (sizeof (void *) - 1)) == 0)
> +        {
> +          char *const *p = (char *const *) rti->base;
> +          char *const *q = (char *const *) ((uintptr_t) rti->base + (rti->stride & ~(sizeof (void *) - 1)));
> +          for (; p < q; p++)
> +         if (*p >= exestart && *p < exeend)
> +           fprintf (stderr, "scalar at %p points to executable %p\n", (void *) p, (void *) *p);
> +        }
> 
>   /* Write out all the global pointers, after translation.  */
>   write_pch_globals (gt_ggc_rtab, &state);
> @@ -546,6 +557,15 @@ gt_pch_save (FILE *f)
>       state.ptrs[i]->note_ptr_fn (state.ptrs[i]->obj,
>                                 state.ptrs[i]->note_ptr_cookie,
>                                 relocate_ptrs, &state);
> +      if ((((uintptr_t) state.ptrs[i]->obj) & (sizeof (void *) - 1)) == 0)
> +        {
> +          char *const *p = (char *const *) (state.ptrs[i]->obj);
> +          char *const *q = (char *const *) ((uintptr_t) (state.ptrs[i]->obj) + (state.ptrs[i]->size & ~(sizeof (void *) - 1)));
> +          for (; p < q; p++)
> +         if (*p >= exestart && *p < exeend)
> +           fprintf (stderr, "object %p at %p points to executable %p\n", (void *) (state.ptrs[i]->obj), (void *) p, (void *) *p);
> +        }
> +
>       ggc_pch_write_object (state.d, state.f, state.ptrs[i]->obj,
>                           state.ptrs[i]->new_addr, state.ptrs[i]->size,
>                           state.ptrs[i]->note_ptr_fn == gt_pch_p_S);
> 
> and under debugger set exestart and exeend from /proc/*/maps of the cc1plus
> process being debugged (the extent of cc1plus mappings).
> This resulted in something like:
> scalar at 0x3d869a8 points to executable 0x2dd85e0
> scalar at 0x3d869b0 points to executable 0x2dd85e4
> scalar at 0x3d869c8 points to executable 0x2dd85e7
> ...
> object 0x7fffea007e70 at 0x7fffea007e70 points to executable 0x11e48c2
> object 0x7fffe953dcc0 at 0x7fffe953dcc0 points to executable 0x201e222
> object 0x7fffe401d260 at 0x7fffe401d260 points to executable 0x4b0a27
> object 0x7fffea02fce0 at 0x7fffea02fce0 points to executable 0x18bb2b0
> object 0x7fffe7034ca0 at 0x7fffe7034ca0 points to executable 0x2f81537
> object 0x7fffe700f8a0 at 0x7fffe700f8a0 points to executable 0x2c36a32
> on stderr.  Unfortunately, I didn't try to rebuild the compiler as PIE, so
> the range was 0x400000 .. 0x3d9b000 and I'm not really sure whether
> everything it dumped was actually an address or just some nice number like 0x1000000
> etc.  Much better would be to have the compiler as PIE, run it twice and
> only look at values that actually changed, or link the compiler at some very
> unlikely virtual address offset so that addresses into it would be easy to
> spot.
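
The exestart/exeend values above were set by hand from /proc/*/maps under the
debugger. A rough sketch of computing the same extent in-process, assuming Linux
and its documented /proc/self/maps line format; find_exe_extent is a
hypothetical helper, not an existing GCC function. It only counts mappings whose
pathname matches /proc/self/exe, so any anonymous tail of .bss is not included.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* For readlink and ssize_t.  */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Store in *START and *END the lowest start and highest end address of
   all mappings backed by the running executable (Linux only).
   Returns 0 on success, -1 on failure.  */
int
find_exe_extent (uintptr_t *start, uintptr_t *end)
{
  assert (start != NULL && end != NULL);

  /* Resolve the path of the running executable.  */
  char exe[4096];
  ssize_t n = readlink ("/proc/self/exe", exe, sizeof exe - 1);
  if (n < 0)
    return -1;
  exe[n] = '\0';

  FILE *f = fopen ("/proc/self/maps", "r");
  if (f == NULL)
    return -1;

  char line[4096 + 128];
  uintptr_t lo = UINTPTR_MAX, hi = 0;
  while (fgets (line, sizeof line, f) != NULL)
    {
      unsigned long s, e;
      int path_off = 0;

      /* Line format: start-end perms offset dev inode   pathname  */
      if (sscanf (line, "%lx-%lx %*s %*s %*s %*s %n", &s, &e, &path_off) < 2
          || path_off == 0)
        continue;

      char *path = line + path_off;
      path[strcspn (path, "\n")] = '\0';
      if (strcmp (path, exe) != 0)
        continue;   /* Not a mapping of the executable itself.  */

      if (s < lo)
        lo = s;
      if (e > hi)
        hi = e;
    }
  fclose (f);

  if (lo >= hi)
    return -1;
  *start = lo;
  *end = hi;
  return 0;
}
```

With something like this the hack could fill in exestart and exeend itself at
the top of gt_pch_save instead of needing a debugger session.
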
> All the "scalar at " messages are for offsets in the ovl_op_info
> array.
> struct GTY(()) ovl_op_info_t {
>  /* The IDENTIFIER_NODE for the operator.  */
>  tree identifier;
>  /* The name of the operator.  */
>  const char *name;
>  /* The mangled name of the operator.  */
>  const char *mangled_name;
>  /* The (regular) tree code.  */
>  enum tree_code tree_code : 16;
>  /* The (compressed) operator code.  */
>  enum ovl_op_code ovl_op_code : 8;
>  /* The ovl_op_flags of the operator */
>  unsigned flags : 8;
> };
> For that particular case gengtype emits:
>  {
>    &ovl_op_info[0][0].identifier,
>    1 * (2) * (OVL_OP_MAX),
>    sizeof (ovl_op_info[0][0]),
>    &gt_ggc_mx_tree_node,
>    &gt_pch_nx_tree_node
>  },
>  {
>    &ovl_op_info[0][0].name,
>    1 * (2) * (OVL_OP_MAX),
>    sizeof (ovl_op_info[0][0]),
>    (gt_pointer_walker) &gt_ggc_m_S,
>    (gt_pointer_walker) &gt_pch_n_S
>  },
>  {
>    &ovl_op_info[0][0].mangled_name,
>    1 * (2) * (OVL_OP_MAX),
>    sizeof (ovl_op_info[0][0]),
>    (gt_pointer_walker) &gt_ggc_m_S,
>    (gt_pointer_walker) &gt_pch_n_S
>  },
> so I believe we treat the identifier as always a GC memory object pointer,
> and name and mangled_name are const char * pointers which I vaguely remember
> we allow to be either NULL, or 1 or GC memory pointers or string literals
> (but can't find how it deals with that last category in the source).
> From the source:
> ovl_op_info_t ovl_op_info[2][OVL_OP_MAX] = 
>  {
>    {
>      {NULL_TREE, NULL, NULL, ERROR_MARK, OVL_OP_ERROR_MARK, 0},
>      {NULL_TREE, NULL, NULL, NOP_EXPR, OVL_OP_NOP_EXPR, 0},
> #define DEF_OPERATOR(NAME, CODE, MANGLING, FLAGS) \
>      {NULL_TREE, NAME, MANGLING, CODE, OVL_OP_##CODE, FLAGS},
> #define OPERATOR_TRANSITION }, {                        \
>      {NULL_TREE, NULL, NULL, ERROR_MARK, OVL_OP_ERROR_MARK, 0},
> #include "operators.def"
>    }
>  };
> where operators.def has e.g.:
> DEF_OPERATOR ("new", NEW_EXPR, "nw", OVL_OP_FLAG_ALLOC)
> in this particular array the strings are always string literals.
> I guess to get ovl_op_info out of the picture we could mark
> name and mangled_name as GTY((skip)).
> But that is just 178 records, the remaining 52520 are in GC memory
> objects.  Figuring out what exactly it is in will be harder...
> From the addresses it printed in the last column, the following point
> to the start of some cc1plus symbol:
>  3310: 0000000000c121d2   831 FUNC    LOCAL  DEFAULT   14 _ZL9min_vis_rPP9tree_nodePiPv
> 134773: 0000000000fa67a9    47 FUNC    GLOBAL DEFAULT   14 _Z20ggc_round_alloc_sizem
>  6151: 0000000000fa67a9    47 FUNC    GLOBAL DEFAULT   14 _Z20ggc_round_alloc_sizem
> 188594: 000000000102d0a0    26 FUNC    WEAK   DEFAULT   14 _Z4is_aIP7gswitch6gimpleEbPT0_
> 37908: 000000000102d0a0    26 FUNC    WEAK   DEFAULT   14 _Z4is_aIP7gswitch6gimpleEbPT0_
> 50655: 0000000001707c85    37 FUNC    LOCAL  DEFAULT   14 _ZL20realloc_for_line_mapPvm
> 131570: 000000000178d3e0    66 FUNC    WEAK   DEFAULT   14 _ZNK3vecI13numbered_tree7va_heap6vl_ptrE5spaceEi
>  1653: 000000000178d3e0    66 FUNC    WEAK   DEFAULT   14 _ZNK3vecI13numbered_tree7va_heap6vl_ptrE5spaceEi
> 129108: 000000000178e520    43 FUNC    WEAK   DEFAULT   14 _ZNK3vecI12loc_map_pair7va_heap8vl_embedE5spaceEj
> 51650: 000000000178e520    43 FUNC    WEAK   DEFAULT   14 _ZNK3vecI12loc_map_pair7va_heap8vl_embedE5spaceEj
> 77141: 0000000001b6cb5a   159 FUNC    LOCAL  DEFAULT   14 _ZL10emit_localP9tree_nodePKcmm
> 77142: 0000000001b6cbf9    75 FUNC    LOCAL  DEFAULT   14 _ZL8emit_bssP9tree_nodePKcmm
> 77143: 0000000001b6cc44    75 FUNC    LOCAL  DEFAULT   14 _ZL11emit_commonP9tree_nodePKcmm
> 77144: 0000000001b6cc8f   231 FUNC    LOCAL  DEFAULT   14 _ZL15emit_tls_commonP9tree_nodePKcmm
> 181390: 0000000001b7e3d0    44 FUNC    GLOBAL DEFAULT   14 _Z21output_section_asm_opPKv
> 25347: 0000000001b7e3d0    44 FUNC    GLOBAL DEFAULT   14 _Z21output_section_asm_opPKv
> 160243: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC2E11combined_fn
> 163230: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC1E11combined_fn
> 26343: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC1E11combined_fn
> 40584: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC2E11combined_fn
> 12547: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC2IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
> 165150: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC2IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
> 181147: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC1IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
> 26558: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC1IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
>  8400: 0000000002e13f60    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_SSE4_1
>  8448: 0000000002e14260    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_WBNOINVD
> 10166: 0000000002e444a0     4 OBJECT  LOCAL  DEFAULT   16 _ZN15zero_regs_flagsL11ALL_GPR_ARGE
> 11568: 0000000002e51420    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_AVX512FP16
> 11735: 0000000002e52f60    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_VPCLMULQDQ
> 12575: 0000000002e5f560    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_ROCKETLAKE
> 165019: 0000000002e605a0    20 OBJECT  GLOBAL DEFAULT   16 class_narrowest_mode
>  9991: 0000000002e605a0    20 OBJECT  GLOBAL DEFAULT   16 class_narrowest_mode
> 12749: 0000000002e60f60   160 OBJECT  LOCAL  DEFAULT   16 _ZL22extra_order_size_table
> 14715: 0000000002e7e340    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SKYLAKE_AVX512
> 15895: 0000000002e84480    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_UINTR
> 17084: 0000000002e8c160    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SAPPHIRERAPIDS
> 18397: 0000000002e946a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_SSE4_2
> 18986: 0000000002e97580    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNL
> 18990: 0000000002e975c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL17PTA_GOLDMONT_PLUS
> 22195: 0000000002eb1640    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_FMA
> 30065: 0000000002eed6e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_AVX512BF16
> 31474: 0000000002ef3560     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
> 34906: 0000000002f02580    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AVX512BW
> 37696: 0000000002f0e420     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
> 37701: 0000000002f0e484     4 OBJECT  LOCAL  DEFAULT   16 _ZL40LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES
> 38868: 0000000002f13420    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNL
> 39129: 0000000002f143c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_XSAVES
> 40610: 0000000002f1e7c0     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
> 42157: 0000000002f293c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL16PTA_AVX5124VNNIW
> 42201: 0000000002f29680    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SKYLAKE_AVX512
> 42207: 0000000002f296e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_ICELAKE_SERVER
> 49618: 0000000002f556e0     4 OBJECT  LOCAL  DEFAULT   16 _ZN15zero_regs_flagsL8USED_ARGE
> 50904: 0000000002f5d4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_AVX
> 51188: 0000000002f5e6e0    48 OBJECT  LOCAL  DEFAULT   16 _ZN12_GLOBAL__N_1L17pass_data_tm_initE
> 56440: 0000000002f7d440    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_SILVERMONT
> 57404: 0000000002f81640     4 OBJECT  LOCAL  DEFAULT   16 _ZL14MAX_LOCATION_T
> 57424: 0000000002f816a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_64BIT
> 60100: 0000000002f903a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AMX_TILE
> 67672: 0000000002fae460    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_COOPERLAKE
> 68780: 0000000002fb37c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AVX512CD
> 70316: 0000000002fbb4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_LWP
> 70637: 0000000002fbc7a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
> 70837: 0000000002fbd4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
> 73878: 0000000002fcb960    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_HRESET
> 79867: 00000000030435c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_SILVERMONT
> 81991: 0000000003053520    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_F16C
> 82244: 0000000003054500    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_CLDEMOTE
> 86070: 00000000033ec560    99 OBJECT  LOCAL  DEFAULT   16 _ZL26znver1_agu_min_issue_delay
> 86071: 00000000033ec5e0  1334 OBJECT  LOCAL  DEFAULT   16 _ZL15geode_translate
> 86228: 00000000034419c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_NO_80387
> 94224: 0000000003849420    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
> 94230: 0000000003849480    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_XOP
> 94647: 000000000384aa40    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_SGX
> 95488: 000000000384e4c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNM
> 95820: 000000000384f6a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_WBNOINVD
> 95822: 000000000384f6c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
> 95824: 000000000384f6e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_WAITPKG
> 96072: 0000000003850640    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_LZCNT
> 96074: 0000000003850660    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_MOVBE
> 96080: 00000000038506c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_SSE
> 98344: 000000000385a4c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_ENQCMD
> 99309: 000000000385da40    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_POPCNT
> 103332: 000000000386f2c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_CLFLUSHOPT
> 103344: 000000000386f380    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_PKU
> 103352: 000000000386f400    16 OBJECT  LOCAL  DEFAULT   16 _ZL15PTA_AVX512VBMI2
> 103709: 0000000003870a40    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
> 104337: 0000000003873660    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_HRESET
> 106315: 000000000387d260    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_NO_TUNE
> 109183: 000000000388c160    16 OBJECT  LOCAL  DEFAULT   16 _ZL16PTA_AVX5124VNNIW
> 111159: 0000000003894a40    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_CANNONLAKE
> 112043: 00000000038994c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
> 112049: 0000000003899520    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_CLDEMOTE
> 113040: 000000000389d6c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AMX_BF16
> 21876: 0000000003d8d5c0    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names
> 31109: 0000000003d8e100    40 OBJECT  LOCAL  DEFAULT   28 _ZL30unspecified_modref_access_node
> 78193: 0000000003d932e0    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names
> 78366: 0000000003d93320    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names
> 
>       Jakub
