On Wed, May 07, 2025 at 03:11:38PM +0530, Sughosh Ganu wrote:
> On Wed, 7 May 2025 at 13:19, Sughosh Ganu <sughosh.g...@linaro.org> wrote:
> >
> > On Tue, 6 May 2025 at 16:35, Heinrich Schuchardt
> > <heinrich.schucha...@canonical.com> wrote:
> > >
> > > Sughosh Ganu <sughosh.g...@linaro.org> wrote on Tue, 6 May 2025, 12:50:
> > >>
> > >> On Tue, 6 May 2025 at 15:19, Heinrich Schuchardt
> > >> <heinrich.schucha...@canonical.com> wrote:
> > >> >
> > >> > On 5/6/25 11:24, Sughosh Ganu wrote:
> > >> > > U-Boot has support for both the 32-bit and 64-bit RiscV platforms.
> > >> > > Set the width of the phys_{addr,size}_t data types based on the
> > >> > > register size of the architecture.
> > >> > >
> > >> > > Currently, even the 32-bit RiscV platforms have a 64-bit
> > >> > > phys_{addr,size}_t data types. This causes issues on the 32-bit
> > >> > > platforms, where the upper 32-bits of the variables of these types
> > >> > > can have junk data, and that can cause all kinds of side-effects.
> > >> >
> > >> > How could it be that the upper 32-bit have junk data?
> > >> >
> > >> > When we convert from a shorter variable the compiler should fill the
> > >> > upper bits with zero.
> > >>
> > >> That does not seem to be happening. The efi_fit test fails on the
> > >> qemu-riscv32 platform, when attempting to boot the OS from the FIT
> > >> image.
> > >>
> > >> These are the values of the base address that I see in the
> > >> _lmb_alloc_addr() function.
> > >>
> > >> _lmb_alloc_addr: 755, rgn => -1, base => 0x1a1c0e00802000bc, size => 0x50b1
> > >
> > > As you are running on QEMU you should be able to track down where the
> > > value is actually assigned with gdb. This could for instance be a buffer
> > > overrun.
> >
> > I was able to hook up gdb and re-create the issue. What I observe is
> > that when the lmb_allocate_mem() function is called, the base address
> > parameter, which is 64-bits, shows a value with the upper 32-bits not
> > zeroed out. So, this looks like a compiler issue, where the upper
> > 32-bits are not being zeroed out. Fwiw, this shows up with the
> > compiler being used in the CI environment, as well as the one that I
> > am using.
>
> Thinking a bit on this, I don't think this is a compiler issue. The
> problem is that we are using the ulong type in some places (especially
> in the boot* commands) for storing the address values, while we use
> phys_addr_t in other places. And because this is a pointer being
> passed across functions, when the data-type that the pointer is
> pointing to changes from a 32-bit to 64-bit value, the upper 32-bits
> get considered. So the issue is that we use ulong in some places, and
> phys_addr_t in others for storing the addresses.
>
> But I think that the solution for this (at least for now) is to set
> phys_addr_t based on the underlying architecture. In the long run,
> there needs to be an audit of the usage of ulong for storing
> addresses, and that needs to be changed to phys_addr_t.

Thanks for digging into this more. I agree with what you're saying here
for both the short and long term.

-- 
Tom
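
For illustration only, the width mismatch described in the quoted message (a 32-bit ulong stored by the caller, then read through a pointer to a 64-bit phys_addr_t) can be sketched in plain C. This is a minimal model, not U-Boot code: the helper names put_le32(), get_le64(), read_via_wide_pointer() and widen_by_value() are hypothetical, memory is modelled as a little-endian byte array so the result does not depend on the host, and the "neighbour" word is an assumption standing in for whatever happens to sit next to the 32-bit value. The thread does not say where the upper bits actually came from; the sample values are simply the two halves of the logged base.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value as little-endian bytes,
 * the way an rv32 target would lay it out in memory. */
static void put_le32(uint8_t *p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/* Hypothetical helper: load 64 bits from little-endian bytes. */
static uint64_t get_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/*
 * The caller stores a 32-bit address (ulong on rv32). The callee then
 * reads 64 bits from the same location, as if it had been handed a
 * pointer to a 64-bit phys_addr_t. The upper half of the result is
 * whatever occupies the adjacent four bytes (assumed here).
 */
uint64_t read_via_wide_pointer(uint32_t addr, uint32_t neighbour)
{
	uint8_t mem[8];

	put_le32(mem, addr);          /* the 32-bit ulong */
	put_le32(mem + 4, neighbour); /* unrelated adjacent data */
	return get_le64(mem);         /* 64-bit read picks up both words */
}

/* Passing by value instead: the conversion zero-extends. */
uint64_t widen_by_value(uint32_t addr)
{
	return (uint64_t)addr;
}
```

With addr = 0x802000bc and neighbour = 0x1a1c0e00, read_via_wide_pointer() returns 0x1a1c0e00802000bc, while widen_by_value() returns 0x802000bc. That is consistent with the point that a by-value conversion is zero-extended by the compiler, whereas a by-pointer access with the wrong pointee width is not.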