There has been a fair amount of offline investigation of this bug (thanks to Edmund Grimley-Evans). Here are the most relevant emails, to keep this bug up to date.
----- Forwarded message from Edward Nevill <[email protected]> -----

Date: Thu, 18 Sep 2014 00:16:58 +0100
From: Edward Nevill <[email protected]>
To: Wookey <[email protected]>
Cc: Edmund Grimley-Evans <[email protected]>
Subject: Re: Fw: Re: openjdk-7 build failure on juno
X-Spam-Status: No, score=-2.6 required=4.5 tests=AWL,BAYES_00,
 RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.2

Hi,

OK. Looking at Edmund's description of the problem I believe this is
due to a problem with ISBs.

The problem is that when the JIT patches up code on the fly it does

- flush dcache
- invalidate icache

on the patched region (with appropriate memory barriers).

The problem is that on ARMv8 this is not sufficient. Every core that
might execute the code that has just been patched must do an ISB, not
just the core that is doing the patching.

So essentially the core that is doing the patching must force all
other cores to do an ISB. The only practical way to do this is to
force all other threads within the process to do an ISB.

The reason this does not happen on Mustang is that the ICache is more
coherent on Mustang than on A57. I had thought that the additional ISB
was unnecessary on Mustang, but Edmund says he has seen the problem
once on Mustang.

So, the good news is that a patch to fix this has been pushed to the
aarch64 jdk7 hotspot tip

http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot

However, you are building from IcedTea, which does not use this.
Instead it uses a snapshot tarball of the hotspot tree. This is why
you could not find the gcc.make: it is wrapped up in the hotspot
tarball.

There is a file hotspot.map in the top level of the IcedTea build
which, for each architecture, points to the hotspot tarball for that
arch.

So, what needs to happen is that a new hotspot tarball is created from
the repository above, and the entry in hotspot.map is then replaced
with the new entry along with the appropriate SHA256SUM.

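For anyone regenerating the tarball, here is a minimal sketch of computing the SHA-256 checksum that would accompany the new hotspot.map entry. The tarball name `hotspot.tar.gz` is only a placeholder, and the exact layout of a hotspot.map line is not shown here; check the existing entries in your IcedTea tree before editing.

```python
import hashlib
import sys


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that a large tarball does not have
    to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Usage (placeholder file name): python sha256_tarball.py hotspot.tar.gz
    for tarball in sys.argv[1:]:
        print(sha256_of_file(tarball), tarball)
```

The output has the same shape as that of the `sha256sum` command-line tool, which can be used to cross-check the digest.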
If you let me know what version of IcedTea you are using I can create
the revised tarball tomorrow.

I *think* this should fix the problem. If this is the problem then
just using taskset 01 on the whole build should make the problem go
away, albeit very slowly.

All the best,
Ed.

On 17/09/2014, Wookey <[email protected]> wrote:
> some more info from other-Ed. Edmund - keep Ed Nevill cc:ed on this as
> he might have some idea what to poke.
>
> ----- Forwarded message from Edmund Grimley Evans <[email protected]> -----
>
> Date: Wed, 17 Sep 2014 15:25:05 +0100
> From: Edmund Grimley Evans <[email protected]>
> To: Wookey <[email protected]>
> Subject: Re: debian-ports
> X-Spam-Status: No, score=-2.6 required=4.5 tests=BAYES_00,FREEMAIL_FROM,
>  RCVD_IN_DNSWL_LOW,T_DKIM_INVALID
>  autolearn=ham version=3.3.2
>
> Some updates:
>
> A good way to copy the ginormous build tree is with:
>
>   tar cf - dir | ssh host 'cd somewhere && tar xf -'
>
> (It took me a little while to understand that the argument to ssh is
> not really what I call a "command" but some strings that get handed to
> the shell!)
>
> I copied the mustang's successful build tree to the juno, and the
> juno's unsuccessful build tree to the mustang.
>
> On the mustang, with either tree, the command I quoted earlier nearly
> always runs in less than 10 seconds and works. However - and this is
> interesting - I did see it fail on one occasion in a very similar way
> to how it fails on the juno.
>
> On the juno, with either tree, the command usually fails, but
> sometimes works, and, I happened to notice, it worked a lot more
> frequently while I was copying the other build tree across. Sometimes
> it runs for more than an hour. I've never seen it run for less than
> 2.5 minutes, but in that particular case it succeeded in 2.5 minutes.
>
> You can make the command run successfully on the juno, in just 40 s,
> every time I tried it, by the simple device of using taskset 0x10
> (making the program use just one core).
>
> So it appears to be a threading/lock issue. I would guess that either
> the program is using the wrong kind of threading/lock primitive, or
> the threading/locking is broken.
>
> This is definitely worth investigating. It could be that the issue
> that is causing the openjdk-7 build to fail almost every time is also
> causing intermittent problems with other programs.
>
> Can you pass this news on to anyone who might be interested?
>
> Edmund
>
> ----- End forwarded message -----
>
> Wookey
> --
> Principal hats: Linaro, Emdebian, Wookware, Balloonboard, ARM
> http://wookware.org/

----- End forwarded message -----

Wookey
--
Principal hats: Linaro, Emdebian, Wookware, Balloonboard, ARM
http://wookware.org/
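
As a footnote to the taskset workaround discussed in the emails above: the mask taskset takes is hexadecimal, so 01 selects CPU 0 and 0x10 selects CPU 4; either way the process is confined to one core. A rough sketch of the same single-core pinning done from a script, assuming Linux and Python 3.7 or later, follows. The helper name `run_on_single_cpu` is made up for illustration and is not part of any build system.

```python
import os
import subprocess
import sys


def run_on_single_cpu(argv, cpu=None, **kwargs):
    """Run `argv` with the child process pinned to one CPU (Linux only).

    If `cpu` is None, the lowest-numbered CPU this process is currently
    allowed to use is chosen. Extra keyword arguments are passed on to
    subprocess.run().
    """
    if cpu is None:
        cpu = min(os.sched_getaffinity(0))

    def pin_child():
        # Runs in the child between fork and exec; pid 0 means "this process".
        os.sched_setaffinity(0, {cpu})

    return subprocess.run(argv, preexec_fn=pin_child, **kwargs)


if __name__ == "__main__":
    # Usage: python pin_one_cpu.py <command> [args...]
    sys.exit(run_on_single_cpu(sys.argv[1:]).returncode)
```

Threads and processes started by the pinned command inherit its affinity, so the whole command tree is serialised onto one core. That is consistent with the failure going away under taskset, but it only hides the missing ISBs and does not fix them.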

