Jakub Jelinek <ja...@redhat.com> writes:
> On Tue, Jul 30, 2024 at 11:25:42AM +0200, Richard Biener wrote:
>> Only "relevant" stuff should be streamed - the offload code and all
>> trees referred to.
>
> Yeah.
>
>> > > I think all current issues are because of poly-* leaking in for cases
>> > > where a non-poly would have worked fine, but I have not had a look
>> > > myself.
>> > 
>> > One of the cases that Prathamesh mentions is streaming the mode sizes.
>> > Are those modes "offload target modes" or "host modes"?  It seems like
>> > it shouldn't be an error for the host to have VLA modes per se.  It's
>> > just that those modes can't be used in the host/offload interface.
>> 
>> There's a requirement that a mode mapping exists from the host to
>> target enum machine_mode.  I don't remember exactly how we compute
>> that mapping and whether streaming of some data (and thus poly-int)
>> is part of this.
>
> During streaming out, the code records what machine modes are being streamed
> (in streamer_mode_table).
> For those modes (and their inner modes) then lto_write_mode_table
> should stream a table with mode details like class, bits, size, inner mode,
> nunits, real mode format if any, etc.
> That table is then streamed in in the offloading compiler and it attempts to
> find corresponding modes (and emits fatal_error if there is no such mode;
> consider say x86_64 long double with XFmode being used in offloading code
> which doesn't have XFmode support).
> Now, because Richard S. changed GET_MODE_SIZE etc. to give poly_int rather
> than int, this has been changed to use bp_pack_poly_value; but that relies
> on the same number of coefficients for poly_int, which is not the case when
> e.g. offloading aarch64 to gcn or nvptx.
>
> From what I can see, this mode table handling is the only use of
> bp_pack_poly_value.  So the options are either to stream at the start of the
> mode table the NUM_POLY_INT_COEFFS value and in bp_unpack_poly_value pass to
> it what we've read and fill in any remaining coeffs with zeros, or in each
> bp_pack_poly_value stream the number of coefficients and then stream that
> back in and fill in remaining ones (and diagnose if it would try to read
> non-zero coefficient which isn't stored).
> I think streaming NUM_POLY_INT_COEFFS once would be more compact (at least
> for non-aarch64/riscv targets).

Ah, ok, thanks for the explanation.  In that case, I agree that either
of those two would work (no personal preference for which).
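
For concreteness, here is a rough sketch of the first option (write the
coefficient count once, then zero-fill or diagnose on the reading side).
This is not GCC code: toy_stream, write_table_header, read_table_header,
pack_poly and unpack_poly are hypothetical stand-ins for the real bitpack
routines, and a coefficient is modelled as a plain int64_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a bit-packed stream; GCC's real streamer differs.
struct toy_stream {
  std::vector<int64_t> words;
  std::size_t pos = 0;
  void put(int64_t v) { words.push_back(v); }
  int64_t get() {
    if (pos >= words.size())
      throw std::runtime_error("stream underrun");
    return words[pos++];
  }
};

// Writer: record the host's coefficient count once, ahead of the mode table.
void write_table_header(toy_stream &s, unsigned host_coeffs) {
  assert(host_coeffs >= 1);
  s.put(host_coeffs);
}

// Reader: recover the count that the writer recorded.
unsigned read_table_header(toy_stream &s) {
  return static_cast<unsigned>(s.get());
}

// Writer: emit every coefficient of one poly value, in order.
void pack_poly(toy_stream &s, const std::vector<int64_t> &coeffs) {
  for (int64_t c : coeffs)
    s.put(c);
}

// Reader: consume `stored` coefficients and return `wanted` of them.
// Coefficients the writer did not store are filled with zero; a stored
// non-zero coefficient that the reader cannot represent is an error.
std::vector<int64_t> unpack_poly(toy_stream &s, unsigned stored,
                                 unsigned wanted) {
  std::vector<int64_t> out(wanted, 0);
  for (unsigned i = 0; i < stored; ++i) {
    int64_t c = s.get();
    if (i < wanted)
      out[i] = c;
    else if (c != 0)
      throw std::runtime_error("non-zero coefficient not representable");
  }
  return out;
}
```

With a two-coefficient writer and a one-coefficient reader, a constant
size such as {16, 0} reads back as {16}, while a genuinely variable size
such as {16, 16} is diagnosed.  In the other direction, {8} reads back
as {8, 0}.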

Richard
