> From: Mutter, Florian <florian.mut...@fashion-digital.de>
> Sent: Tuesday, 9 April 2024 09:39
> To: users@groovy.apache.org <users@groovy.apache.org>
> Subject: Re: [EXT] Re: Deadlock in Java9.getDefaultImportClasses() ?
>  
> > From: Jochen Theodorou <blackd...@gmx.org>
> > Date: Monday, 8. April 2024 at 17:58
> > To: users@groovy.apache.org <users@groovy.apache.org>
> > Subject: [EXT] Re: Deadlock in Java9.getDefaultImportClasses() ?
> > On 08.04.24 16:48, Mutter, Florian wrote:
> > > After updating Kubernetes from 1.27 to 1.28 one of our applications is
> > > not working anymore.
> > >
> > > The application uses the Thymeleaf templating engine, which uses Groovy
> > > under the hood. The application does not respond to requests that
> > > require a template to be rendered. Looking at the stack trace did not
> > > give us any hint what is causing this. In the profiler it looks like a
> > > lot of time is spent waiting in the Java9.getDefaultImportClasses()
> > > method. We could not find any code in there, or in ClassFinder.find(),
> > > that looks like it could cause a deadlock.
> > >
> > > When attaching the debugger and adding some breakpoints in
> > > Java9.getDefaultImportClasses(), it did work 🤷‍♂️.
> >
> > This is really weird. Are the Java, Groovy, and Thymeleaf versions the same?
> > What Groovy version are you using btw? What version of Java?
> 
> It's the exact same Docker image that we use on both versions of Kubernetes.
> 
> Java version is 17.0.10+7-Debian-1deb11u1
> Groovy is 4.0.10
> 
> > > The only thing we could see that differs between a working setup and a
> > > non-working one is the updated Kubernetes with updated node images
> > > using a newer Linux kernel. No idea how this could impact the code.
> >
> > The Linux kernel should not impact that unless it is a bug in Java.
> >
> > > Does anyone have an idea what could cause this or what we could do to
> > > identify the cause of the deadlock?
> > >
> > > I attached a screenshot of the profiler.
> >
> > It would be nice to know the line number; then we would at least know
> > which of the 3 possible supplyAsync().get() calls fails. I was thinking
> > maybe this can happen if there is an exception... but normally get()
> > should produce an ExecutionException in that case.
> 
> We'll try to find out where exactly it is hanging.

We found the issue: on the cluster with the new Kubernetes version the 
application was started on a smaller node with only 8 cores, so the common 
ForkJoinPool was only a quarter of its previous size. It seems the application 
runs a lot of long-lived tasks (e.g. for consuming Kafka messages) that 
occupied the pool and never allowed the Groovy tasks to run. Setting 
-XX:ActiveProcessorCount to 16 or higher fixed the issue.
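For anyone who wants to verify this on their own nodes, here is a minimal
sketch (plain Java; the class name PoolCheck is made up).
availableProcessors() reflects -XX:ActiveProcessorCount as well as container
CPU limits, and the common pool defaults to one worker fewer than that:

import java.util.concurrent.ForkJoinPool;

public class PoolCheck {
    public static void main(String[] args) {
        // Reflects -XX:ActiveProcessorCount and cgroup CPU limits
        System.out.println("Available processors: "
                + Runtime.getRuntime().availableProcessors());
        // Defaults to availableProcessors() - 1, so an 8-core node leaves
        // only 7 workers for everything sharing the common pool
        System.out.println("Common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
    }
}

Running it with and without -XX:ActiveProcessorCount=16 shows the flag
taking effect.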

Maybe Groovy should create its own thread pool so it does not interfere with 
other parts of the application. On the other hand, if this only runs at 
startup, maybe an extra thread pool is overkill.
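To illustrate the idea (only a sketch, not Groovy's actual code):
supplyAsync() without an executor argument runs on the common pool, while the
two-argument overload takes a dedicated executor that cannot be starved by
other work:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DedicatedPoolSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical dedicated pool for one-off startup work
        ExecutorService startupPool = Executors.newFixedThreadPool(2);
        try {
            // supplyAsync(supplier) would use ForkJoinPool.commonPool();
            // passing an executor runs the task there instead
            CompletableFuture<String> result = CompletableFuture.supplyAsync(
                    () -> "collect default imports", startupPool);
            System.out.println(result.get());
        } finally {
            startupPool.shutdown();
        }
    }
}

Since the pool would only be needed during startup, shutting it down right
after keeps the overhead small.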

Best
Florian
