On Sun Jul 7, 2024 at 9:46 AM AEST, David Gibson wrote:
> On Sat, Jul 06, 2024 at 11:37:08AM +0100, Peter Maydell wrote:
> > On Fri, 5 Jul 2024 at 06:13, David Gibson <da...@gibson.dropbear.id.au> 
> > wrote:
> > >
> > > On Fri, Jul 05, 2024 at 02:40:19PM +1000, Nicholas Piggin wrote:
> > > > On Fri Jul 5, 2024 at 11:41 AM AEST, David Gibson wrote:
> > > > > On Fri, Jul 05, 2024 at 11:18:47AM +1000, Nicholas Piggin wrote:
> > > > > > On Thu Jul 4, 2024 at 10:15 PM AEST, Peter Maydell wrote:
> > > > > > > On Sat, 29 Jun 2024 at 04:17, David Gibson 
> > > > > > > <da...@gibson.dropbear.id.au> wrote:
> > > > > > > >
> > > > > > > > On Fri, Jun 28, 2024 at 04:20:02PM +0100, Peter Maydell wrote:
> > > > > > > > > On Thu, 27 Jun 2024 at 14:39, Akihiko Odaki 
> > > > > > > > > <akihiko.od...@daynix.com> wrote:
> > > > > > > > > >
> > > > > > > > > > FDT properties are aligned by 4 bytes, not 8 bytes.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Akihiko Odaki <akihiko.od...@daynix.com>
> > > > > > > > > > ---
> > > > > > > > > >  hw/ppc/vof.c | 2 +-
> > > > > > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/hw/ppc/vof.c b/hw/ppc/vof.c
> > > > > > > > > > index e3b430a81f4f..b5b6514d79fc 100644
> > > > > > > > > > --- a/hw/ppc/vof.c
> > > > > > > > > > +++ b/hw/ppc/vof.c
> > > > > > > > > > @@ -646,7 +646,7 @@ static void 
> > > > > > > > > > vof_dt_memory_available(void *fdt, GArray *claimed, 
> > > > > > > > > > uint64_t base)
> > > > > > > > > >      mem0_reg = fdt_getprop(fdt, offset, "reg", &proplen);
> > > > > > > > > >      g_assert(mem0_reg && proplen == sizeof(uint32_t) * (ac 
> > > > > > > > > > + sc));
> > > > > > > > > >      if (sc == 2) {
> > > > > > > > > > -        mem0_end = be64_to_cpu(*(uint64_t *)(mem0_reg + 
> > > > > > > > > > sizeof(uint32_t) * ac));
> > > > > > > > > > +        mem0_end = ldq_be_p(mem0_reg + sizeof(uint32_t) * 
> > > > > > > > > > ac);
> > > > > > > > > >      } else {
> > > > > > > > > >          mem0_end = be32_to_cpu(*(uint32_t *)(mem0_reg + 
> > > > > > > > > > sizeof(uint32_t) * ac));
> > > > > > > > > >      }
> > > > > > > > >
> > > > > > > > > I did wonder if there was a better way to do what this is 
> > > > > > > > > doing,
> > > > > > > > > but neither we (in system/device_tree.c) nor libfdt seem to
> > > > > > > > > provide one.
> > > > > > > >
> > > > > > > > libfdt does provide unaligned access helpers (fdt32_ld() etc.), 
> > > > > > > > but
> > > > > > > > not an automatic aligned-or-unaligned helper.   Maybe we should 
> > > > > > > > add that?
> > > > > > >
> > > > > > > fdt32_ld() and friends only do the "load from this bit of memory"
> > > > > > > part, which we already have QEMU utility functions for (and which
> > > > > > > are this patch uses).
> > > > > > >
> > > > > > > This particular bit of code is dealing with an fdt property 
> > > > > > > ("memory")
> > > > > > > that is an array of (address, size) tuples where address and size
> > > > > > > can independently be either 32 or 64 bits, and it wants the
> > > > > > > size value of tuple 0. So the missing functionality is something 
> > > > > > > at
> > > > > > > a higher level than fdt32_ld() which would let you say "give me
> > > > > > > tuple N field X" with some way to specify the tuple layout. (Which
> > > > > > > is an awkward kind of API to write in C.)
> > > > > > >
> > > > > > > Slightly less general, but for this case we could perhaps have
> > > > > > > something like the getprop equivalent of 
> > > > > > > qemu_fdt_setprop_sized_cells():
> > > > > > >
> > > > > > >   uint64_t value_array[2];
> > > > > > >   qemu_fdt_getprop_sized_cells(fdt, nodename, "memory", 
> > > > > > > &value_array,
> > > > > > >                                ac, sc);
> > > > > > >   /*
> > > > > > >    * fills in value_array[0] with address, value_array[1] with 
> > > > > > > size,
> > > > > > >    * probably barfs if the varargs-list of cell-sizes doesn't
> > > > > > >    * cover the whole property, similar to the current assert on
> > > > > > >    * proplen.
> > > > > > >    */
> > > > > > >   mem0_end = value_array[0];
> > > > > >
> > > > > > Since 4/8 byte cells are most common and size is probably
> > > > > > normally known, what about something simpler to start with?
> > > > >
> > > > > Hrm, I don't think this helps much.  As Peter points out the actual
> > > > > load isn't really the issue, it's locating the right spot for it.
> > > >
> > > > I don't really see why that's a problem, it's just a pointer
> > > > addition - base + fdt_address_cells * 4. The problem was in
> > >
> > > This is harder if #address-cells and #size-cells are different, or if
> > > you're parsing ranges and #address-cells is different between parent
> > > and child node.
> > >
> > > > the memory access (yes it's fixed with the patch but you could
> > > > add a general libfdt way to do it).
> > >
> > > Huh.. well I'm getting different impressions of what the problem
> > > actually is from what I initially read versus Peter Maydell's
> > > comments, so I don't really know what to think.
> > >
> > > If it's just the load then fdt32_ld() etc. already exist.  Or is it
> > > really such a hot path that unconditionally handling unaligned
> > > accesses isn't tenable?
> > 
> > The specific problem here is that the code as written tries to
> > cast a not-aligned-enough pointer to uint64_t* to do the load,
> > which is UB.
>
> Ah... and I'm assuming it's the cast itself which triggers the UB, not
> just dereferencing it.

Oh it's just the cast itself that is UB? Looks like that's true.
Interesting gcc and clang don't flag it, I guess they care about
warning on practical breakage first.

> Which makes the interface of fdt32_ld()
> etc. unusable for their intended purpose.  Well.. damn.  Now... how do
> I fix it without breaking compatibility any more than I have to.

Why not just make them take a void * ptr? I don't think that would
break anything but existing code that was forced to add the cast
may be at risk of the UB.

Could also add fdt64_unaligned_t types with aligned(1) attribute
for new code. Those can just be dereferenced directly and the
caller the compiler can choose the appropriate access supported by
the host. (Actually gcc can recognise that load unaligned and
byteswap pattern and do the same anyway, but clang at least can
not yet).

Thanks,
Nick

Reply via email to