Thanks Thomas for make it happen.

Then we have 2 patches, right? V4 Streamer and V1 LTO: Capture. Not quite sure 
if these 2 has some dependencies when commit (I suppose both are well tested 
and approved). But anything I can do to make some progress please feel free to 
let me know.

Again, very appreciate for the great help from Thomas, it is really save my day!

Pan

-----Original Message-----
From: Thomas Schwinge <tho...@codesourcery.com> 
Sent: Friday, June 30, 2023 4:50 PM
To: Li, Pan2 <pan2...@intel.com>; juzhe.zh...@rivai.ai; 
gcc-patches@gcc.gnu.org; Richard Biener <rguent...@suse.de>; Jakub Jelinek 
<ja...@redhat.com>
Cc: Robin Dapp <rdapp....@gmail.com>; jeffreya...@gmail.com; Wang, Yanzhang 
<yanzhang.w...@intel.com>; kito.ch...@gmail.com; Tobias Burnus 
<tob...@codesourcery.com>
Subject: [v4] Streamer: Fix out of range memory access of machine mode

Hi!

On 2023-06-30T01:39:39+0000, "Li, Pan2" <pan2...@intel.com> wrote:
> That’s very cool, thanks Thomas for help!

:-)

> Let’s wait the AMD test running result for the final version of the patch.

That's all looking good, too.

> From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
> Sent: Friday, June 30, 2023 9:27 AM

> Could you merge your patch after you tested?

I've done that, and with (already approved)
<https://inbox.sourceware.org/87v8f5uzob....@euler.schwinge.homeip.net>
"LTO: Capture 'lto_file_decl_data *file_data' in 'class lto_input_block'"
split out, OK to push the attached
v4 "Streamer: Fix out of range memory access of machine mode"?


Grüße
 Thomas


> From: Thomas Schwinge<mailto:tho...@codesourcery.com>
> Date: 2023-06-30 04:14

> Subject: Re: [PATCH v3] Streamer: Fix out of range memory access of machine 
> mode
> Hi!
>
> On 2023-06-29T11:29:57+0200, I wrote:
>> On 2023-06-21T15:58:24+0800, Pan Li via Gcc-patches 
>> <gcc-patches@gcc.gnu.org<mailto:gcc-patches@gcc.gnu.org>> wrote:
>>> We extend the machine mode from 8 to 16 bits already. But there still
>>> one placing missing from the streamer. It has one hard coded array
>>> for the machine code like size 256.
>>>
>>> In the lto pass, we memset the array by MAX_MACHINE_MODE count but the
>>> value of the MAX_MACHINE_MODE will grow as more and more modes are
>>> added. While the machine mode array in tree-streamer still leave 256 as is.
>>>
>>> Then, when the MAX_MACHINE_MODE is greater than 256, the memset of
>>> lto_output_init_mode_table will touch the memory out of range unexpected.
>>
>> Uh.  :-O
>>
>>> This patch would like to take the MAX_MACHINE_MODE as the size of the
>>> array in streamer, to make sure there is no potential unexpected
>>> memory access in future. Meanwhile, this patch also adjust some place
>>> which has MAX_MACHINE_MODE <= 256 assumption.
>>
>> Thanks to Jakub and Richard for guidance re the offloading compilation
>> case, where we've got different 'MAX_MACHINE_MODE's between stream-out
>> and stream-in, and a modes mapping table.
>>
>> However, with this patch, there are ICEs all over the place...  I'm
>> having a look.
>
> Your patch has all the right ideas, there are just a few additional
> changes necessary.  Please merge in the attached
> "f into Streamer: Fix out of range memory access of machine mode", with
> 'Co-authored-by: Thomas Schwinge 
> <tho...@codesourcery.com<mailto:tho...@codesourcery.com>>'.  This has
> already survived compiler-side 'lto.exp' testing and
> 'check-target-libgomp' with Nvidia GPU offloading; AMD GPU testing is now
> running (not expecting any bad surprises).  Will let you know by (my)
> tomorrow morning in case there are any more problems.
>
> Explanation:
>
>>> --- a/gcc/lto-streamer-in.cc
>>> +++ b/gcc/lto-streamer-in.cc
>>> @@ -1985,8 +1985,6 @@ lto_input_mode_table (struct lto_file_decl_data 
>>> *file_data)
>>>      internal_error ("cannot read LTO mode table from %s",
>>>                   file_data->file_name);
>>>
>>> -  unsigned char *table = ggc_cleared_vec_alloc<unsigned char> (1 << 8);
>>> -  file_data->mode_table = table;
>>>    const struct lto_simple_header_with_strings *header
>>>      = (const struct lto_simple_header_with_strings *) data;
>>>    int string_offset;
>>> @@ -1998,16 +1996,22 @@ lto_input_mode_table (struct lto_file_decl_data 
>>> *file_data)
>>>                               header->string_size, vNULL);
>>>    bitpack_d bp = streamer_read_bitpack (&ib);
>>>
>>> +  unsigned mode_bits = bp_unpack_value (&bp, 5);
>>> +  unsigned char *table = ggc_cleared_vec_alloc<unsigned char> (1 << 
>>> mode_bits);
>>> +
>>> +  file_data->mode_table = table;
>>> +  file_data->mode_bits = mode_bits;
>
> Here, we set 'file_data->mode_bits' for the offloading case (where
> 'lto_input_mode_table' is called) -- but it's not set for the
> non-offloading case (where 'lto_input_mode_table' isn't called).  (See my
> 'gcc/lto/lto-common.cc:lto_read_decls' change.)  That's "not currently a
> problem", as 'file_data->mode_bits' isn't used anywhere...
>
>>> --- a/gcc/lto-streamer.h
>>> +++ b/gcc/lto-streamer.h
>>> @@ -604,6 +604,8 @@ struct GTY(()) lto_file_decl_data
>>>    int order_base;
>>>
>>>    int unit_base;
>>> +
>>> +  unsigned mode_bits;
>>>  };
>
>>>  inline machine_mode
>>>  bp_unpack_machine_mode (struct bitpack_d *bp)
>>>  {
>>> -  return (machine_mode)
>>> -        ((class lto_input_block *)
>>> -         bp->stream)->mode_table[bp_unpack_enum (bp, machine_mode, 1 << 
>>> 8)];
>>> +  int last = 1 << ceil_log2 (MAX_MACHINE_MODE);
>>> +  lto_input_block *input_block = (class lto_input_block *) bp->stream;
>>> +  int index = bp_unpack_enum (bp, machine_mode, last);
>>> +
>>> +  return (machine_mode) input_block->mode_table[index];
>>>  }
>
> ..., but 'file_data->mode_bits' needs to be considered here, in the
> stream-in for offloading, where 'file_data->mode_bits' -- that is, the
> host 'MAX_MACHINE_MODE' -- very likely is different from the offload
> device 'MAX_MACHINE_MODE'.
>
> Easiest is in 'gcc/lto-streamer.h:class lto_input_block' to capture
> 'lto_file_decl_data *file_data' instead of just
> 'unsigned char *mode_table', and adjust all users.
>
> That's it.  :-)
>
>>> --- a/gcc/tree-streamer.h
>>> +++ b/gcc/tree-streamer.h
>
>>> @@ -108,15 +108,19 @@ inline void
>>>  bp_pack_machine_mode (struct bitpack_d *bp, machine_mode mode)
>>>  {
>>>    streamer_mode_table[mode] = 1;
>>> -  bp_pack_enum (bp, machine_mode, 1 << 8, mode);
>>> +  int last = 1 << ceil_log2 (MAX_MACHINE_MODE);
>>> +
>>> +  bp_pack_enum (bp, machine_mode, last, mode);
>>>  }
>
> That use of 'MAX_MACHINE_MODE' is safe, as that only concerns the
> stream-out phase.
>
>>> --- a/gcc/tree-streamer.cc
>>> +++ b/gcc/tree-streamer.cc
>>> @@ -35,7 +35,7 @@ along with GCC; see the file COPYING3.  If not see
>>>     During streaming in, we translate the on the disk mode using this
>>>     table.  For normal LTO it is set to identity, for ACCEL_COMPILER
>>>     depending on the mode_table content.  */
>>> -unsigned char streamer_mode_table[1 << 8];
>>> +unsigned char streamer_mode_table[MAX_MACHINE_MODE];
>
> Likewise.
>
>
> Grüße
> Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Reply via email to