Having posted about our success with testing Alpha 7, I need to ‘fess up about 
the end result of all this palaver from July:

With a bit of instrumentation in our build pipeline it turned out that we were 
thrashing on heap space in a few places and, for whatever reason, that was 
causing all the ClassNotFoundException problems. Sigh. I bumped the heap size 
up in the JVM options for Boot and everything ran flawlessly…

…the hint that pushed me in this direction was that, even on Clojure 1.9.0, we 
_occasionally_ saw problems loading instant18 (which I’d assumed was a weird 
edge case in Boot’s async pod setup/refresh)… and that started cropping up more 
often, and then we started seeing the same CNFE problems with 1.9.0 and that 
WebDriver test when run in a long Boot pipeline…

…so clearly the issue WASN’T the new Alpha build: that just happened to push us 
nearer the heap/GC issue.
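
A minimal sketch of the kind of instrumentation described above, for anyone who wants to check for the same heap pressure. This is not the code used in this pipeline: the class name HeapProbe and its methods are hypothetical, and it relies only on the standard java.lang.management beans. The idea is to log a snapshot between build steps. If the used fraction stays close to 100% while the cumulative GC time keeps climbing, the heap is probably too small for the combined pipeline, and raising it with the standard -Xmx JVM option is worth trying.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reports heap occupancy and cumulative GC time.
public class HeapProbe {

    // Fraction of the maximum heap in use, or -1.0 when no maximum is defined.
    public static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0;
        }
        return (double) usedBytes / (double) maxBytes;
    }

    // Sum of the collection time (ms) reported by every garbage collector.
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // One-line summary, intended to be logged between build steps.
    public static String snapshot(String label) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long mb = 1024L * 1024L;
        double frac = usedFraction(heap.getUsed(), heap.getMax());
        String pct = frac < 0 ? "n/a" : String.format("%.0f%%", frac * 100.0);
        return String.format("[%s] heap used=%dMB max=%dMB (%s) gcTime=%dms",
                label, heap.getUsed() / mb, heap.getMax() / mb, pct, totalGcMillis());
    }

    public static void main(String[] args) {
        // "example-step" is a placeholder label, not a real task name.
        System.out.println(snapshot("example-step"));
    }
}
```

Comparing snapshots taken before and after each step shows which step pushes the heap toward its limit, which is the sort of signal that would point at memory rather than at the Clojure version.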

Sean Corfield -- (970) FOR-SEAN -- (904) 302-SEAN
An Architect's View -- http://corfield.org/

"If you're not annoying somebody, you're not really alive."
-- Margaret Atwood


From: Sean Corfield <s...@corfield.org>
Sent: Saturday, July 21, 2018 8:22:21 PM
To: clojure@googlegroups.com
Subject: RE: [ANN] Clojure 1.10.0-alpha5

Things tried so far:

  *   Update JDK to 1.8.0_144 (to match one of our other environments). No 
effect on the problem (well, I haven’t seen any JVM crashes since!).
  *   Update Boot to 2.8.1. No effect. Ghadi wondered if Boot’s class loading 
might have changed since 2.7.2 and that might be a root cause.
  *   Change our multi-version testing so we’re only loading Clojure 
1.10.0-alpha6 (we were trying both alpha 4 and alpha 6 before). No effect. I 
wondered if loading Clojure based on two different versions of ASM (even 
isolated via pods) might cause issues.

We can reliably run the Boot task on its own – clj-webdriver loads and all our 
tests pass as expected. All of our tasks combined up until this point run just 
fine. All tasks individually run just fine. The only time we see a failure is 
when we combine the WebDriver-based test task with all the other tasks – and it 
reliably fails to load some class (usually a clj-webdriver class but on one run 
it failed to load clojure.tools.logging.impl.Logger) originating from Boot’s 
pod refresh, preparing to load clj-webdriver, as far as I can tell.

I haven’t been able to produce a smaller combination of tasks that exhibits the 
problem. I will observe that I have _occasionally_ seen CNFEs during pod 
refresh in the past, when Components are being stopped (and unloaded?) 
asynchronously, so it may be that this is an intermittent/non-deterministic bug 
in Boot pods that is exacerbated by Clojure 1.10.0-alpha5 and later?

For now we’re staying on Clojure 1.9.0 (and running just our unit tests against 
both that and master-SNAPSHOT). As far as we can tell, all our apps run fine on 
Clojure 1.10.0-alpha6 so this seems to just impact our Boot-based build 
toolchain and we can probably work around that, so we’ll probably switch to 
1.10 at some point (before release) so we can test it in production and provide 
feedback.


From: Sean Corfield <s...@corfield.org>
Sent: Thursday, July 19, 2018 4:55:47 PM
To: clojure@googlegroups.com
Subject: RE: [ANN] Clojure 1.10.0-alpha5

Progress so far… with Alpha 6.

I’ve encountered a number of (random) JVM SEGV fatal errors running our Boot 
build pipeline and if I don’t hit any of those, I hit an exception like this 
fairly reliably:

https://gist.github.com/seancorfield/f29bdb948a2a533c14a07ff6ffd6548a

(the missing class seems to vary from run to run but the exception is always in 
the same place)

It _seems_ to be an interaction between something new in Alpha 5 and Boot’s pod 
machinery since all of the failures I’m encountering seem to happen when Boot is 
attempting to refresh pods in its pool of pods. We rely heavily on pods to 
isolate various parts of our build pipeline (since we load in different sets of 
dependencies for different sets of tests).

Ghadi suggested I update my local JDK to a more recent version and try again so 
that’s next on my list.

Narrowing this down is going to be hard: if I run the build pipeline in 
separate “chunks” – which means less interaction between pods – each chunk 
always passes.

So, overall, no failures from our test suite itself for any of our application 
components (good). Just random failures within the build tool itself ☹


From: Sean Corfield <s...@corfield.org>
Sent: Thursday, July 19, 2018 1:08:33 PM
To: clojure@googlegroups.com
Subject: RE: [ANN] Clojure 1.10.0-alpha5


Yes, which allowed us to actually _try_ to run our build pipeline – so the 
problems we’re seeing are fallout from the big changes in Alpha 5… I just 
haven’t nailed them down yet 😊 Everything works fine on Alpha 4.





From: clojure@googlegroups.com <clojure@googlegroups.com> on behalf of Alex 
Miller <a...@puredanger.com>
Sent: Wednesday, July 18, 2018 10:50:59 AM
To: Clojure
Subject: RE: [ANN] Clojure 1.10.0-alpha5

The only change in alpha6 was the asm fix (your patch!).... :)

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
