Jakub Jelinek <ja...@redhat.com> writes:
> On Tue, Jul 30, 2024 at 11:25:42AM +0200, Richard Biener wrote:
>> Only "relevant" stuff should be streamed - the offload code and all
>> trees referred to.
>
> Yeah.
>
>> > > I think all current issues are because of poly-* leaking in for cases
>> > > where a non-poly would have worked fine, but I have not had a look
>> > > myself.
>> >
>> > One of the cases that Prathamesh mentions is streaming the mode sizes.
>> > Are those modes "offload target modes" or "host modes"?  It seems like
>> > it shouldn't be an error for the host to have VLA modes per se.  It's
>> > just that those modes can't be used in the host/offload interface.
>>
>> There's a requirement that a mode mapping exists from the host to
>> target enum machine_mode.  I don't remember exactly how we compute
>> that mapping and whether streaming of some data (and thus poly-int)
>> is part of this.
>
> During streaming out, the code records what machine modes are being streamed
> (in streamer_mode_table).
> For those modes (and their inner modes) then lto_write_mode_table
> should stream a table with mode details like class, bits, size, inner mode,
> nunits, real mode format if any, etc.
> That table is then streamed in in the offloading compiler and it attempts to
> find corresponding modes (and emits fatal_error if there is no such mode;
> consider say x86_64 long double with XFmode being used in offloading code
> which doesn't have XFmode support).
> Now, because Richard S. changed GET_MODE_SIZE etc. to give poly_int rather
> than int, this has been changed to use bp_pack_poly_value; but that relies
> on the same number of coefficients for poly_int, which is not the case when
> e.g. offloading aarch64 to gcn or nvptx.
>
> From what I can see, this mode table handling is the only use of
> bp_pack_poly_value.  So the options are either to stream at the start of the
> mode table the NUM_POLY_INT_COEFFS value and in bp_unpack_poly_value pass to
> it what we've read and fill in any remaining coeffs with zeros, or in each
> bp_pack_poly_value stream the number of coefficients and then stream that
> back in and fill in remaining ones (and diagnose if it would try to read
> non-zero coefficient which isn't stored).
> I think streaming NUM_POLY_INT_COEFFS once would be more compact (at least
> for non-aarch64/riscv targets).

Ah, ok, thanks for the explanation.  In that case, I agree that either of
those two would work (no personal preference for which).

Richard
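
For illustration, the first option (stream the coefficient count once at the
start of the mode table, then zero-fill or diagnose on the reading side) could
look roughly like the standalone sketch below.  This is not GCC code: `stream`,
`write_table` and `read_table` are hypothetical stand-ins for the bitpack
streamer and the bp_pack_poly_value / bp_unpack_poly_value calls in the mode
table handling, and a poly_int is modelled as a plain array of coefficients.
The template parameters play the role of NUM_POLY_INT_COEFFS on each side.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the LTO bitpack stream: a flat list of words.
using stream = std::vector<std::int64_t>;

// Writer (host compiler).  HostN models the host's NUM_POLY_INT_COEFFS.
// The coefficient count is written once, ahead of all table entries.
template <std::size_t HostN>
void write_table (stream &out,
                  const std::vector<std::array<std::int64_t, HostN>> &sizes)
{
  out.push_back (static_cast<std::int64_t> (HostN));
  out.push_back (static_cast<std::int64_t> (sizes.size ()));
  for (const auto &size : sizes)
    for (std::int64_t coeff : size)
      out.push_back (coeff);
}

// Reader (offload compiler).  TargetN models the target's
// NUM_POLY_INT_COEFFS, which may differ from the host's.
template <std::size_t TargetN>
std::vector<std::array<std::int64_t, TargetN>> read_table (const stream &in)
{
  std::size_t pos = 0;
  const auto host_n = static_cast<std::size_t> (in.at (pos++));
  const auto count = static_cast<std::size_t> (in.at (pos++));

  std::vector<std::array<std::int64_t, TargetN>> sizes;
  for (std::size_t i = 0; i < count; ++i)
    {
      // Value-initialised, so coefficients the host did not stream are zero.
      std::array<std::int64_t, TargetN> size{};
      for (std::size_t j = 0; j < host_n; ++j)
        {
          const std::int64_t coeff = in.at (pos++);
          if (j < TargetN)
            size[j] = coeff;
          else if (coeff != 0)
            // The target cannot represent this value; a real implementation
            // would presumably report it the way missing modes are reported.
            throw std::runtime_error ("non-zero coefficient not representable");
        }
      sizes.push_back (size);
    }
  assert (pos == in.size ());
  return sizes;
}
```

With a two-coefficient host and a one-coefficient target (the aarch64 to
gcn/nvptx case), constant sizes round-trip unchanged and only a genuinely
variable-length size is rejected.  The per-value alternative would move the
count from the table header into every packed value, at the cost of one extra
field per poly_int.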