On 9/19/20 7:55 AM, Stefan Agner wrote: > On 2020-09-14 10:15, Matthias Brugger wrote: >> On 10/09/2020 23:12, Stefan Agner wrote: >>> On 2020-09-07 16:36, Peter Robinson wrote: >>>>> Any thoughts on this issue? >>>> >>>> Any reason why you're using 2020.01 and not at least 2020.07, or at >>>> least seeing if it's reproducible on 2020.10-rc3? The RPi4 support has >>>> changed quite a bit since then I suspect. >>>> >>> >>> Hi Peter, >>> >>> It's a stable release and we support a couple of devices with the same >>> U-Boot version. I'd rather prefer to stay with 2020.01 for RPi4 as well. >>> >>> We are on 2020.07 on development branch, and it does work fine there. So >>> I thought it can't be that hard, just bisect and backport whatever fixes >>> it... Unfortunately, it seems that there is no particular commit which >>> fixes it (the bisect ended up in a random unrelated commit, and it seems >>> that the issue appears/disappears depending on alignment/size...). >>> >>> I also did applied pretty much every RPi4 related commit made after >>> 2020.01 up until master back to 2020.01, no success either. >>> >> >> Which version of the Raspberry Pi firmware did you use? >> Unfortunately changes in the FW breaks stuff on U-Boot from time to time. >> > > Ok, I am now able to reproduce the issue on master as well as 2020.07 > with standard rpi_4_32b_defconfig, but I still need to have parts of a > change which seems to trigger the issue in. From what I can tell, the > change *really* should not lead to a freeze. The change is just > accessing global variables from the data section... (see below). > > To me it still seems as if relocation somehow did not work correctly in > one way or another. > > Are there maybe restrictions in U-Boot when the data section can be > accessed? E.g. is it not legal to access the data section from the > serial driver?
One thing bit me recently, and might be relevant here. Because putc_retry is initialized to zero, it is located in bss and not data. In U-Boot, bss is not accessable before relocation. The serial driver is one of the devices which U-Boot needs before relocating, so setting putc_retry may overwrite data in the device tree. To get around this, you could try adding __attribute__((section(".data"))) to that variable. --Sean > > > diff --git a/drivers/serial/serial_bcm283x_mu.c > b/drivers/serial/serial_bcm283x_mu.c > index 8a4af87eb6..74de6801ab 100644 > --- a/drivers/serial/serial_bcm283x_mu.c > +++ b/drivers/serial/serial_bcm283x_mu.c > @@ -50,7 +50,8 @@ struct bcm283x_mu_regs { > struct bcm283x_mu_priv { > struct bcm283x_mu_regs *regs; > }; > - > +static char *fs_argv[15]; > +static uint32_t putc_retry = 0; > static int bcm283x_mu_serial_getc(struct udevice *dev); > > static int bcm283x_mu_serial_setbrg(struct udevice *dev, int baudrate) > @@ -95,6 +96,8 @@ static int bcm283x_mu_serial_putc(struct udevice *dev, > const char data) > struct bcm283x_mu_priv *priv = dev_get_priv(dev); > struct bcm283x_mu_regs *regs = priv->regs; > > + putc_retry++; > + > /* Wait until there is space in the FIFO */ > if (!(readl(®s->lsr) & BCM283X_MU_LSR_TX_EMPTY)) > return -EAGAIN; > @@ -162,6 +165,10 @@ static int bcm283x_mu_serial_probe(struct udevice > *dev) > struct bcm283x_mu_priv *priv = dev_get_priv(dev); > fdt_addr_t addr; > > + /* Make sure compiler does not optimize out this fs_argv > instance */ > + if (fs_argv[0]) > + fs_argv[0] = "test"; > + > /* Don't spawn the device if it's not muxed */ > if (!bcm283x_is_serial_muxed()) > return -ENODEV; > > Most curious of all, it seems that the name (!!!) of the variable > fs_argv matters! I am not sure if that changes order of variables in > data section or something. I can also reproduce the issue with two > compilers (GCC 8.3 and GCC 9.2), so a compiler error seems somewhat > unlikely... > > Any ideas? I am a bit out of idea how to debug this (I guess JTAG/gdb > might help, but I don't have such a setup). > > FWIW, I plan to just drop the change which seems to at least partially > cause the isssue > (https://github.com/home-assistant/operating-system/blob/dev/buildroot-external/board/raspberrypi/patches/uboot/0002-avoid-block-uart-write.patch). > Still I think there is something wrong which will show itself someday in > a certain configuration. > > -- > Stefan > > >> Regards, >> Mathias >> >>> I fear that the problem in fact is still in master, but only appears if >>> certain things align a certain way... That is why I thought I bring it >>> up, to see if anybody else has noticed something along this lines. We do >>> have a rather trimmed down configuration, which might make the problem >>> appear more (fit in a D cache...). >>> >>> I probably will just disable cached around relocation for 2020.01 and >>> see if it resurfaces on development branch. >>> >>> -- >>> Stefan >>> >>> >>>>> Just to be sure, I did some memory testing on the 2GB module, but no >>>>> issues found. >>>>> >>>>> I still somehow suspected that something else might be wrong with my >>>>> hardware, so I bought a new RPi4 (this time with 4GB of RAM) but the >>>>> very same with that: >>>>> >>>>> U-Boot 2020.01 (Aug 23 2020 - 22:02:31 +0000) >>>>> >>>>> DRAM: 3.9 GiB >>>>> <freeze> >>>>> >>>>> I still think there is something wrong with caching. From what I >>>>> understand caches are enabled by the RPi (4) firmware. Is it safe to run >>>>> 32-bit ARM U-Boot with enabled caches? >>>>> >>>>> -- >>>>> Stefan >>>>> >>>>> On 2020-08-23 19:06, Stefan Agner wrote: >>>>>> Hi, >>>>>> >>>>>> I noticed a quite common freeze when running 32-bit U-Boot 2020.01 >>>>>> (rpi_4_32b_defconfig) on a 2GB RPi4 model: >>>>>> >>>>>> U-Boot 2020.01 (Aug 07 2020 - 13:00:23 +0000) >>>>>> >>>>>> DRAM: 1.9 GiB >>>>>> <freeze, no more output> >>>>>> >>>>>> This happens fairly often, I would say 4 out of 5 boot tries. However, >>>>>> if it boots, everything seems to run fine. >>>>>> >>>>>> The issue seems to go away when using 2020.04 or any newer release, >>>>>> however, when trying to find the actual patch fixing the issue using git >>>>>> bisect I ended up with a MMC merge request which really seems unrelated >>>>>> (36bdcf7f3b). It seems that the problem is quite evasive and disappears >>>>>> if certain structure are aligned differently etc. >>>>>> >>>>>> Enabling initcall debugging showed that U-Boot crashes right after >>>>>> relocation: >>>>>> >>>>>> ... >>>>>> initcall: 00016f2c >>>>>> >>>>>> RAM Configuration: >>>>>> Bank #0: 0 948 MiB >>>>>> Bank #1: 40000000 1 GiB >>>>>> Bank #2: 0 0 Bytes >>>>>> Bank #3: 0 0 Bytes >>>>>> >>>>>> DRAM: 1.9 GiB >>>>>> initcall: 00016bb8 >>>>>> New Stack Pointer is: 3af6d9e0 >>>>>> initcall: 00016da4 >>>>>> initcall: 00016ef0 >>>>>> initcall: 00016ef8 >>>>>> initcall: 00016d38 >>>>>> Relocation Offset is: 3b375000 >>>>>> Relocating to 3b37d000, new gd at 3af78ec0, sp at 3af6d9e0 >>>>>> initcall: 00016ec8 [clear_bss] >>>>>> initcall: 0004465c [display_options?? only appears sometimes] >>>>>> <freeze> >>>>>> >>>>>> I realized when using CONFIG_SYS_(I|D)CACHE_OFF=y the problem seems to >>>>>> disappear. But to be 100% certain that it is cache related, I used my >>>>>> original configuration (which is known to "reliably" freeze), and >>>>>> replaced 00016ec8 with 00008688 manually in the binary, essentially >>>>>> swapping out function pointers in "init_sequence_f" [00008688 is >>>>>> cleanup_before_linux, which flushes and disables I-cache/D-cache]. And >>>>>> indeed, that hacked up binary does boot reliably every time: >>>>>> >>>>>> ... >>>>>> initcall: 00016f2c >>>>>> >>>>>> RAM Configuration: >>>>>> Bank #0: 0 948 MiB >>>>>> Bank #1: 40000000 1 GiB >>>>>> Bank #2: 0 0 Bytes >>>>>> Bank #3: 0 0 Bytes >>>>>> >>>>>> DRAM: 1.9 GiB >>>>>> initcall: 00016bb8 >>>>>> New Stack Pointer is: 3af6d9e0 >>>>>> initcall: 00016da4 >>>>>> initcall: 00016ef0 >>>>>> initcall: 00016ef8 >>>>>> initcall: 00016d38 >>>>>> Relocation Offset is: 3b375000 >>>>>> Relocating to 3b37d000, new gd at 3af78ec0, sp at 3af6d9e0 >>>>>> initcall: 00008688 >>>>>> initcall: 3b38c10c >>>>>> initcall: 3b38c114 >>>>>> initcall: 000172e0 (relocated to 3b38c2e0) >>>>>> initcall: 0001712c (relocated to 3b38c12c) >>>>>> ... >>>>>> >>>>>> From what I understand on RPi4 caches are enabled when entering U-Boot. >>>>>> I was wondering if the relocation code really can handle that? >>>>>> >>>>>> -- >>>>>> Stefan >>>