On Fri, Nov 05, 2021 at 11:31:58AM +0100, Richard Biener wrote:
> On Fri, Nov 5, 2021 at 10:54 AM Jakub Jelinek <ja...@redhat.com> wrote:
> >
> > On Fri, Nov 05, 2021 at 10:42:05AM +0100, Richard Biener via Gcc-patches wrote:
> > > I had the impression we have support for PCH file relocation to deal with ASLR
> > > at least on some platforms.
> >
> > Unfortunately we do not, e.g. if you build cc1/cc1plus as PIE on
> > x86_64-linux, PCH will stop working unless one always invokes it with
> > disabled ASLR through personality.
> >
> > I think this is related to function pointers and pointers to .rodata/.data
> > etc. variables in GC memory, we currently do not relocate that.
> >
> > What we perhaps could do is (at least assuming all the ELF PT_LOAD segments
> > are adjacent with a single load base for them - I think at least ia64
> > non-PIE binaries were violating this by having .text and .data PT_LOAD
> > segments many terabytes apart with a hole in between not protected in any
> > way, but dunno if that is for PIEs too), perhaps try in a host
> > specific way remember the address range in which the function pointers and
> > .rodata/.data can exist, remember the extent start and end from PCH generation
> > and on PCH load query those addresses for the current compiler and relocate
> > everything in that extent by the load bias from the last run.
> > But, the assumption for this is that those function and data/rodata pointers
> > in GC memory are actually marked at least as pointers...
> 
> If any such pointers exist they must be marked GTY((skip)) since they do not
> point to GC memory...  So we'd need to invent special-handling for those.
> 
> > Do we e.g. have objects with virtual classes in GC memory and if so, do we
> > catch their virtual table pointers?
> 
> Who knows, but then I don't remember adding stuff that should end in a PCH.

So, I've investigated a little bit.
Apparently all the relocation we currently do for PCH is done at PCH write
time: we choose some address range in the address space that we think will
likely be mmappable successfully each time, relocate all pointers pointing to
GC memory to point in there and then write that to the file, together with the
values of scalar GTY global vars and of GTY pointers in global vars.
On PCH load, we just try to mmap memory in the right range, fail the PCH load
if unsuccessful, read the GC memory into that range and update scalar and
pointer GTY global vars from what we've recorded.
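The load side described above boils down to something like the following
sketch.  It is not GCC's actual host hook; map_at_expected is a made-up name,
and it assumes POSIX mmap with MAP_ANONYMOUS.  It only shows the "map at the
recorded address or give up" behaviour:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not GCC's host hook: ask for LEN bytes at WANT
   and report failure if the kernel places the mapping anywhere else.
   Returns WANT on success, NULL otherwise.  */
void *
map_at_expected (void *want, size_t len)
{
  assert (len > 0);
  /* WANT is only a hint here (no MAP_FIXED), so an existing mapping at
     that address is never clobbered.  */
  void *got = mmap (want, len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (got == MAP_FAILED)
    return NULL;
  if (got != want)
    {
      /* Wrong place: pointers stored in the image would be invalid,
         so undo the mapping and let the caller fail the load.  */
      munmap (got, len);
      return NULL;
    }
  return got;
}
```

A caller would read the saved GC memory into the returned range, and treat
NULL as "this PCH file cannot be used in this process".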
The patch that made PCH load fail for PIEs etc. was
https://gcc.gnu.org/legacy-ml/gcc-patches/2003-10/msg01994.html
If we wanted to relocate pointers to functions and .data/.rodata etc.,
ideally we'd create a relocation list of the addresses whose contents should
be incremented by the load bias and quickly relocate those on PCH load.
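To make the relocation list idea concrete, here is a minimal sketch.  Nothing
like this exists in ggc-common.c; apply_reloc_list, the offset-based list and
all parameter names are assumptions made for illustration only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: add BIAS to every pointer-sized
   slot of IMAGE named by OFFSETS[0..N-1].  BIAS is the difference between
   the current load base and the one recorded when the image was written.  */
void
apply_reloc_list (unsigned char *image, size_t image_size,
                  const size_t *offsets, size_t n, intptr_t bias)
{
  assert (image_size >= sizeof (uintptr_t));
  for (size_t i = 0; i < n; i++)
    {
      uintptr_t v;
      assert (offsets[i] <= image_size - sizeof v);
      /* memcpy avoids alignment and aliasing assumptions about IMAGE.  */
      memcpy (&v, image + offsets[i], sizeof v);
      v += (uintptr_t) bias;
      memcpy (image + offsets[i], &v, sizeof v);
    }
}
```

The cost on load is one pass over the list, independent of how much GC
memory holds no such pointers.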

I wrote the following ugly hack:

--- ggc-common.c.jj     2021-08-19 11:42:27.365422400 +0200
+++ ggc-common.c        2021-11-05 15:37:51.447222544 +0100
@@ -404,6 +404,9 @@ struct mmap_info
 
 /* Write out the state of the compiler to F.  */
 
+char *exestart = (char *) 2;
+char *exeend = (char *) 2;
+
 void
 gt_pch_save (FILE *f)
 {
@@ -458,6 +461,14 @@ gt_pch_save (FILE *f)
     for (rti = *rt; rti->base != NULL; rti++)
       if (fwrite (rti->base, rti->stride, 1, f) != 1)
        fatal_error (input_location, "cannot write PCH file: %m");
+      else if ((((uintptr_t) rti->base) & (sizeof (void *) - 1)) == 0)
+        {
+          char *const *p = (char *const *) rti->base;
+          char *const *q = (char *const *) ((uintptr_t) rti->base + (rti->stride & ~(sizeof (void *) - 1)));
+          for (; p < q; p++)
+           if (*p >= exestart && *p < exeend)
+             fprintf (stderr, "scalar at %p points to executable %p\n", (void *) p, (void *) *p);
+        }
 
   /* Write out all the global pointers, after translation.  */
   write_pch_globals (gt_ggc_rtab, &state);
@@ -546,6 +557,15 @@ gt_pch_save (FILE *f)
       state.ptrs[i]->note_ptr_fn (state.ptrs[i]->obj,
                                  state.ptrs[i]->note_ptr_cookie,
                                  relocate_ptrs, &state);
+      if ((((uintptr_t) state.ptrs[i]->obj) & (sizeof (void *) - 1)) == 0)
+        {
+          char *const *p = (char *const *) (state.ptrs[i]->obj);
+          char *const *q = (char *const *) ((uintptr_t) (state.ptrs[i]->obj) + (state.ptrs[i]->size & ~(sizeof (void *) - 1)));
+          for (; p < q; p++)
+           if (*p >= exestart && *p < exeend)
+             fprintf (stderr, "object %p at %p points to executable %p\n", (void *) (state.ptrs[i]->obj), (void *) p, (void *) *p);
+        }
+
       ggc_pch_write_object (state.d, state.f, state.ptrs[i]->obj,
                            state.ptrs[i]->new_addr, state.ptrs[i]->size,
                            state.ptrs[i]->note_ptr_fn == gt_pch_p_S);

and under debugger set exestart and exeend from /proc/*/maps of the cc1plus
process being debugged (the extent of cc1plus mappings).
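For reference, the same extent could also be computed by the process itself
instead of by hand in the debugger.  A rough Linux-only sketch follows;
exe_extent is a hypothetical helper, not something in GCC, and it only
considers mappings whose pathname equals /proc/self/exe, so anonymous .bss
pages are not included:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: compute [*START, *END) covering every mapping of
   the running executable listed in /proc/self/maps (Linux only).
   Returns 0 on success, -1 on failure.  */
int
exe_extent (uintptr_t *start, uintptr_t *end)
{
  char exe[4096], line[8192], path[4096];
  assert (start != NULL && end != NULL);

  ssize_t n = readlink ("/proc/self/exe", exe, sizeof exe - 1);
  if (n < 0)
    return -1;
  exe[n] = '\0';

  FILE *f = fopen ("/proc/self/maps", "r");
  if (f == NULL)
    return -1;

  uintptr_t lo = UINTPTR_MAX, hi = 0;
  while (fgets (line, sizeof line, f) != NULL)
    {
      unsigned long s, e;
      /* Line format: start-end perms offset dev inode pathname  */
      if (sscanf (line, "%lx-%lx %*s %*s %*s %*s %4095[^\n]",
                  &s, &e, path) == 3
          && strcmp (path, exe) == 0)
        {
          if (s < lo)
            lo = s;
          if (e > hi)
            hi = e;
        }
    }
  fclose (f);

  if (hi == 0)
    return -1;
  *start = lo;
  *end = hi;
  return 0;
}
```

The two results would then be stored into exestart and exeend before
gt_pch_save runs.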
Running the hacked cc1plus resulted in something like:
scalar at 0x3d869a8 points to executable 0x2dd85e0
scalar at 0x3d869b0 points to executable 0x2dd85e4
scalar at 0x3d869c8 points to executable 0x2dd85e7
...
object 0x7fffea007e70 at 0x7fffea007e70 points to executable 0x11e48c2
object 0x7fffe953dcc0 at 0x7fffe953dcc0 points to executable 0x201e222
object 0x7fffe401d260 at 0x7fffe401d260 points to executable 0x4b0a27
object 0x7fffea02fce0 at 0x7fffea02fce0 points to executable 0x18bb2b0
object 0x7fffe7034ca0 at 0x7fffe7034ca0 points to executable 0x2f81537
object 0x7fffe700f8a0 at 0x7fffe700f8a0 points to executable 0x2c36a32
on stderr.  Unfortunately, I didn't try to rebuild the compiler as PIE, so
the range was 0x400000 .. 0x3d9b000 and I'm not really sure whether
everything it dumped was actually an address or just some nice number like
0x1000000 etc.  Much better would be to have the compiler as PIE, run it
twice and only look at values that actually changed, or link the compiler at
some very unlikely virtual address offset so that addresses into it would be
easy to spot.
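The "run it twice" variant could be checked mechanically along these lines.
This is only a sketch under assumptions: find_biased_words is a made-up name,
both runs are assumed to have dumped the same words in the same order, and
the difference between the two load bases is assumed to be known:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: RUN_A and RUN_B hold the same NWORDS pointer-sized
   words dumped from two runs whose load bases differ by DELTA (base of B
   minus base of A).  Store the indices of words that moved by exactly
   DELTA into OUT (capacity NWORDS) and return how many were found.  */
size_t
find_biased_words (const uintptr_t *run_a, const uintptr_t *run_b,
                   size_t nwords, uintptr_t delta, size_t *out)
{
  assert (delta != 0);
  size_t found = 0;
  for (size_t i = 0; i < nwords; i++)
    /* Unsigned subtraction wraps, so this also works when B's base is
       below A's base.  */
    if (run_b[i] - run_a[i] == delta)
      out[found++] = i;
  return found;
}
```

Plain numbers such as 0x1000000 stay identical across the two runs and are
filtered out, which is exactly what the non-PIE experiment could not do.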
All the "scalar at" messages are for offsets in the ovl_op_info
array, whose element type is:
struct GTY(()) ovl_op_info_t {
  /* The IDENTIFIER_NODE for the operator.  */
  tree identifier;
  /* The name of the operator.  */
  const char *name;
  /* The mangled name of the operator.  */
  const char *mangled_name;
  /* The (regular) tree code.  */
  enum tree_code tree_code : 16;
  /* The (compressed) operator code.  */
  enum ovl_op_code ovl_op_code : 8;
  /* The ovl_op_flags of the operator */
  unsigned flags : 8;
};
For that particular case gengtype emits:
  {
    &ovl_op_info[0][0].identifier,
    1 * (2) * (OVL_OP_MAX),
    sizeof (ovl_op_info[0][0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
  {
    &ovl_op_info[0][0].name,
    1 * (2) * (OVL_OP_MAX),
    sizeof (ovl_op_info[0][0]),
    (gt_pointer_walker) &gt_ggc_m_S,
    (gt_pointer_walker) &gt_pch_n_S
  },
  {
    &ovl_op_info[0][0].mangled_name,
    1 * (2) * (OVL_OP_MAX),
    sizeof (ovl_op_info[0][0]),
    (gt_pointer_walker) &gt_ggc_m_S,
    (gt_pointer_walker) &gt_pch_n_S
  },
so I believe we always treat identifier as a pointer to a GC memory object,
while name and mangled_name are const char * pointers, which I vaguely
remember we allow to be NULL, 1, GC memory pointers or string literals
(but I can't find how the source deals with that last category).
From the source:
ovl_op_info_t ovl_op_info[2][OVL_OP_MAX] = 
  {
    {
      {NULL_TREE, NULL, NULL, ERROR_MARK, OVL_OP_ERROR_MARK, 0},
      {NULL_TREE, NULL, NULL, NOP_EXPR, OVL_OP_NOP_EXPR, 0},
#define DEF_OPERATOR(NAME, CODE, MANGLING, FLAGS) \
      {NULL_TREE, NAME, MANGLING, CODE, OVL_OP_##CODE, FLAGS},
#define OPERATOR_TRANSITION }, {                        \
      {NULL_TREE, NULL, NULL, ERROR_MARK, OVL_OP_ERROR_MARK, 0},
#include "operators.def"
    }
  };
where operators.def has e.g.:
DEF_OPERATOR ("new", NEW_EXPR, "nw", OVL_OP_FLAG_ALLOC)
so in this particular array the strings are always string literals.
I guess to get ovl_op_info out of the picture we could mark
name and mangled_name as GTY((skip)).
But that is just 178 records; the remaining 52520 are in GC memory
objects.  Figuring out what exactly they are in will be harder...
Of the addresses it printed in the last column, the following point
to the start of some cc1plus symbol:
  3310: 0000000000c121d2   831 FUNC    LOCAL  DEFAULT   14 _ZL9min_vis_rPP9tree_nodePiPv
134773: 0000000000fa67a9    47 FUNC    GLOBAL DEFAULT   14 _Z20ggc_round_alloc_sizem
  6151: 0000000000fa67a9    47 FUNC    GLOBAL DEFAULT   14 _Z20ggc_round_alloc_sizem
188594: 000000000102d0a0    26 FUNC    WEAK   DEFAULT   14 _Z4is_aIP7gswitch6gimpleEbPT0_
 37908: 000000000102d0a0    26 FUNC    WEAK   DEFAULT   14 _Z4is_aIP7gswitch6gimpleEbPT0_
 50655: 0000000001707c85    37 FUNC    LOCAL  DEFAULT   14 _ZL20realloc_for_line_mapPvm
131570: 000000000178d3e0    66 FUNC    WEAK   DEFAULT   14 _ZNK3vecI13numbered_tree7va_heap6vl_ptrE5spaceEi
  1653: 000000000178d3e0    66 FUNC    WEAK   DEFAULT   14 _ZNK3vecI13numbered_tree7va_heap6vl_ptrE5spaceEi
129108: 000000000178e520    43 FUNC    WEAK   DEFAULT   14 _ZNK3vecI12loc_map_pair7va_heap8vl_embedE5spaceEj
 51650: 000000000178e520    43 FUNC    WEAK   DEFAULT   14 _ZNK3vecI12loc_map_pair7va_heap8vl_embedE5spaceEj
 77141: 0000000001b6cb5a   159 FUNC    LOCAL  DEFAULT   14 _ZL10emit_localP9tree_nodePKcmm
 77142: 0000000001b6cbf9    75 FUNC    LOCAL  DEFAULT   14 _ZL8emit_bssP9tree_nodePKcmm
 77143: 0000000001b6cc44    75 FUNC    LOCAL  DEFAULT   14 _ZL11emit_commonP9tree_nodePKcmm
 77144: 0000000001b6cc8f   231 FUNC    LOCAL  DEFAULT   14 _ZL15emit_tls_commonP9tree_nodePKcmm
181390: 0000000001b7e3d0    44 FUNC    GLOBAL DEFAULT   14 _Z21output_section_asm_opPKv
 25347: 0000000001b7e3d0    44 FUNC    GLOBAL DEFAULT   14 _Z21output_section_asm_opPKv
160243: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC2E11combined_fn
163230: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC1E11combined_fn
 26343: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC1E11combined_fn
 40584: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC2E11combined_fn
 12547: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC2IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
165150: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC2IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
181147: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC1IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
 26558: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC1IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
  8400: 0000000002e13f60    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_SSE4_1
  8448: 0000000002e14260    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_WBNOINVD
 10166: 0000000002e444a0     4 OBJECT  LOCAL  DEFAULT   16 _ZN15zero_regs_flagsL11ALL_GPR_ARGE
 11568: 0000000002e51420    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_AVX512FP16
 11735: 0000000002e52f60    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_VPCLMULQDQ
 12575: 0000000002e5f560    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_ROCKETLAKE
165019: 0000000002e605a0    20 OBJECT  GLOBAL DEFAULT   16 class_narrowest_mode
  9991: 0000000002e605a0    20 OBJECT  GLOBAL DEFAULT   16 class_narrowest_mode
 12749: 0000000002e60f60   160 OBJECT  LOCAL  DEFAULT   16 _ZL22extra_order_size_table
 14715: 0000000002e7e340    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SKYLAKE_AVX512
 15895: 0000000002e84480    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_UINTR
 17084: 0000000002e8c160    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SAPPHIRERAPIDS
 18397: 0000000002e946a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_SSE4_2
 18986: 0000000002e97580    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNL
 18990: 0000000002e975c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL17PTA_GOLDMONT_PLUS
 22195: 0000000002eb1640    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_FMA
 30065: 0000000002eed6e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_AVX512BF16
 31474: 0000000002ef3560     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
 34906: 0000000002f02580    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AVX512BW
 37696: 0000000002f0e420     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
 37701: 0000000002f0e484     4 OBJECT  LOCAL  DEFAULT   16 _ZL40LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES
 38868: 0000000002f13420    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNL
 39129: 0000000002f143c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_XSAVES
 40610: 0000000002f1e7c0     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
 42157: 0000000002f293c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL16PTA_AVX5124VNNIW
 42201: 0000000002f29680    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SKYLAKE_AVX512
 42207: 0000000002f296e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_ICELAKE_SERVER
 49618: 0000000002f556e0     4 OBJECT  LOCAL  DEFAULT   16 _ZN15zero_regs_flagsL8USED_ARGE
 50904: 0000000002f5d4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_AVX
 51188: 0000000002f5e6e0    48 OBJECT  LOCAL  DEFAULT   16 _ZN12_GLOBAL__N_1L17pass_data_tm_initE
 56440: 0000000002f7d440    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_SILVERMONT
 57404: 0000000002f81640     4 OBJECT  LOCAL  DEFAULT   16 _ZL14MAX_LOCATION_T
 57424: 0000000002f816a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_64BIT
 60100: 0000000002f903a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AMX_TILE
 67672: 0000000002fae460    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_COOPERLAKE
 68780: 0000000002fb37c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AVX512CD
 70316: 0000000002fbb4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_LWP
 70637: 0000000002fbc7a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
 70837: 0000000002fbd4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
 73878: 0000000002fcb960    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_HRESET
 79867: 00000000030435c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_SILVERMONT
 81991: 0000000003053520    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_F16C
 82244: 0000000003054500    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_CLDEMOTE
 86070: 00000000033ec560    99 OBJECT  LOCAL  DEFAULT   16 _ZL26znver1_agu_min_issue_delay
 86071: 00000000033ec5e0  1334 OBJECT  LOCAL  DEFAULT   16 _ZL15geode_translate
 86228: 00000000034419c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_NO_80387
 94224: 0000000003849420    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
 94230: 0000000003849480    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_XOP
 94647: 000000000384aa40    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_SGX
 95488: 000000000384e4c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNM
 95820: 000000000384f6a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_WBNOINVD
 95822: 000000000384f6c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
 95824: 000000000384f6e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_WAITPKG
 96072: 0000000003850640    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_LZCNT
 96074: 0000000003850660    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_MOVBE
 96080: 00000000038506c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_SSE
 98344: 000000000385a4c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_ENQCMD
 99309: 000000000385da40    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_POPCNT
103332: 000000000386f2c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_CLFLUSHOPT
103344: 000000000386f380    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_PKU
103352: 000000000386f400    16 OBJECT  LOCAL  DEFAULT   16 _ZL15PTA_AVX512VBMI2
103709: 0000000003870a40    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
104337: 0000000003873660    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_HRESET
106315: 000000000387d260    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_NO_TUNE
109183: 000000000388c160    16 OBJECT  LOCAL  DEFAULT   16 _ZL16PTA_AVX5124VNNIW
111159: 0000000003894a40    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_CANNONLAKE
112043: 00000000038994c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
112049: 0000000003899520    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_CLDEMOTE
113040: 000000000389d6c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AMX_BF16
 21876: 0000000003d8d5c0    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names
 31109: 0000000003d8e100    40 OBJECT  LOCAL  DEFAULT   28 _ZL30unspecified_modref_access_node
 78193: 0000000003d932e0    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names
 78366: 0000000003d93320    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names

        Jakub
