Hi Tom,

On Tue, 19 Oct 2021 at 16:53, Tom Rini <tr...@konsulko.com> wrote:
>
> On Tue, Oct 19, 2021 at 05:39:12PM +0200, Stefano Babic wrote:
> > Hi Simon,
> >
> > On 07.10.21 15:43, Simon Glass wrote:
> > > Hi Stefano,
> > >
> > > On Thu, 7 Oct 2021 at 04:37, Stefano Babic <sba...@denx.de> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > CI stops by building aarch64 without notice, for reference:
> > > >
> > > > https://source.denx.de/u-boot/custodians/u-boot-imx/-/jobs/332319
> > > >
> > > > There is no error, just process is killed. It looks like it stops at
> > > > xilinx_zynqmp_virt,
> > > >
> > > > ./tools/buildman/buildman -o /tmp -P -E -W aarch64
> > > >
> > > > but board can be built without issues.
> > > >
> > > > If I build on my host (not in docker, anyway), it generally builds fine
> > > > - but it crashes sometimes, too. On gitlab instance, it crashes.
> > > > Issue does not seem that depends on merged patches, and introduces
> > > > boards were already built successfully. Any hint ? I have also no idea
> > > > what I should look as what I see is just
> > > >
> > > > "usr/bin/bash: line 104: 24 Killed
> > > > ./tools/buildman/buildman -o /tmp -P -E -W aarch64"
> > >
> > > I cannot see that link. I am not sure what is going on. Does it say
> > > what signal killed it?
> >
> > Pipelines on our server were not public - I have enabled now for u-boot-imx.
> >
> > >
> > > Does it sit there for an hour and timeout? If so, then I did see that
> > > myself once recently, when the Kconfig needed stdin, but I could not
> > > quite tie it down. I think buildman would provide it, but sometimes
> > > not, apparently. So it can happen when there is an existing build
> > > there and your new one which adds Kconfig options that don't have
> > > defaults, or something like that?
> > >
> >
> > I have investigated further, and I can reproduce it on my host outside the
> > gitlab server. buildman causes an OOM, but I cannot find the cause.
> >
> > Strange enough, this happens with the "aarch64" target, and I cannot
> > reproduce it with Tom's master. So it seems that -master is ok, and something
> > on u-boot-imx generates the OOM.
> >
> > However....
> >
> > The OOM happens always when -2 (two boards remain) appears. I can see with
> > htop that buildman starts to allocate memory until it is exhausted (64GB RAM
> > + 8 GB swap). Then the kernel decides that it is enough and kills buildman -
> > this is what I see on CI.
> >
> > You can see now the pipelines:
> >
> > https://source.denx.de/u-boot/custodians/u-boot-imx/-/pipelines/9520
> >
> > I have then split aarch64 and I built imx8 separately - same result. The
> > pipeline stops with xilinx board, but they have nothing to do. In fact, I
> > can build all xilinx board separately. If I run buildman -W aarch64 -x
> > xilinx, OOM is shown by another board.
> >
> > Strange enough, I can build each single board with buildman without issues,
> > neither errors nor warnings. Just when buildman runs all together (aarch64,
> > 308 boards), the OOM is generated.
> >
> > Bisect does not help: I started bisect, and at the end this commit was
> > presented:
> >
> > commit 53a24dee86fb72ae41e7579607bafe13442616f2
> > Author: Fabio Estevam <feste...@denx.de>
> > Date:   Mon Aug 23 21:11:09 2021 -0300
> >
> >     imx8mm-cl-iot-gate: Split the defconfigs
>
> I strongly suspect what's going on here is that these new defconfigs are
> out of sync with changes now in Kconfig. The build itself will just sit
> there, waiting for the "oldconfig" prompt to be answered.
>
> I want to say the problem here is that stdin is open, rather than
> pointing to something closed and would lead to the build failing
> immediately, rather than once a timeout is hit, or OOM kicks in due to
> kconfig chewing up all the memory.

Yes that's exactly what I saw... In fact, see this commit:

e62a24ce27a buildman: Avoid hanging when the config changes

But that was 3 years ago.

Regards,
Simon
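
P.S. To illustrate the stdin point quoted above, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not what buildman or commit e62a24ce27a actually does: PROMPTING_CHILD and run_without_stdin() are hypothetical names, and the child is a toy stand-in for a tool that waits for an answer on stdin, not the real kconfig. With stdin pointed at /dev/null the read returns end-of-file straight away, so the child can exit with an error instead of waiting for input that never comes.

```python
import subprocess
import sys

# Hypothetical stand-in for a tool that prompts for input, e.g. a config
# step asking about a new option that has no default. Not the real kconfig.
PROMPTING_CHILD = (
    "import sys\n"
    "line = sys.stdin.readline()\n"
    "# An empty read means EOF: nobody is there to answer the prompt.\n"
    "sys.exit(0 if line else 1)\n"
)


def run_without_stdin(timeout=10):
    """Run the toy child with stdin at /dev/null and return its exit code."""
    proc = subprocess.run(
        [sys.executable, "-c", PROMPTING_CHILD],
        stdin=subprocess.DEVNULL,  # child sees EOF instead of an open stdin
        timeout=timeout,           # safety net in case the child still blocks
    )
    return proc.returncode


if __name__ == "__main__":
    print("exit code with stdin=/dev/null:", run_without_stdin())
```

If stdin were instead inherited from a terminal or an open pipe, the same child would block in readline() until something answered, which matches the "sits there waiting" behaviour described in the thread. How the real kconfig reacts to end-of-file is not something this sketch shows.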