> On 14 Jul 2024, at 16:56, Jaroslav Bachorik <j.bacho...@gmail.com> wrote:
> 
> 
> The bottom line is that the clustering solution allows specifying JVM options 
> and extra resources that will be distributed to all nodes. Hence, if you want 
> to add an agent, you need to add the JVM options to point to the location 
> where the agent jar (resource) will be placed.  As you can expect, things can 
> break and you end up killing your Spark pipeline instead of running without 
> observability 🤷‍♂️

So there’s a rather good solution to that: configuration @files. The JVM 
options passed could be simply `@config`, and when deploying the app you 
generate that file with or without a -javaagent option, depending on whether 
the deployment (or the machine) includes the agent or not.
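
To make that concrete, here is a minimal sketch of a deploy-time step that writes such a file. The class name `ArgFileWriter`, the file name `config`, and the `-Xmx2g` base option are placeholders I made up for illustration; it also assumes the agent path contains no whitespace (otherwise the entry would need quoting in the @file).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical deploy-time helper: generates the argument file that the
// launcher later reads via `java @config ...`.
class ArgFileWriter {

    // Returns the option lines for the @file. The -javaagent line is added
    // only when the agent JAR actually exists on this machine.
    static List<String> buildOptions(Path agentJar, List<String> baseOptions) {
        List<String> lines = new ArrayList<>(baseOptions);
        if (agentJar != null && Files.isRegularFile(agentJar)) {
            // Assumes a path without whitespace; quote it otherwise.
            lines.add("-javaagent:" + agentJar.toAbsolutePath());
        }
        return lines;
    }

    // Writes one option per line to the target @file.
    static void writeArgFile(Path target, Path agentJar, List<String> baseOptions)
            throws IOException {
        Files.write(target, buildOptions(agentJar, baseOptions));
    }

    public static void main(String[] args) throws IOException {
        // First argument (optional): where the agent JAR is expected to be.
        Path agentJar = args.length > 0 ? Path.of(args[0]) : null;
        writeArgFile(Path.of("config"), agentJar, List.of("-Xmx2g"));
    }
}
```

The cluster-wide JVM options then stay fixed at `@config`, and only the generated file differs between nodes that have the agent and nodes that don't.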

Now, I view a Java application as a tightly-coupled triple consisting of:

1. Code: A particular set of classes, some of which may be application code, 
some may be libraries and some may be agents.

2. Runtime: A particular jlinked runtime generated from a particular JDK 
version containing a particular set of JDK modules. The pre-jlinked runtime 
included in some JDK version is a special case of such a runtime.

3. Configuration: A particular set of options telling the runtime how to run 
the code. This includes configuring which JARs are named modules, which are in 
the unnamed module, and which are agents, as well as GC configuration, system 
properties etc. The runtime configuration can be set in multiple ways: a set 
of command-line options in a startup script; a set of command-line options in an 
@file; a set of command-line options baked into the runtime with jlink; a set 
of JAR manifest attributes in an executable JAR’s manifest.

Now, there should be no expectation that the application should continue to run 
when any one of these is changed on its own, without at least changing another. 
So, for example, changing the code may require changing the runtime (e.g. if a 
new JDK API unavailable in an old version is now used) or the configuration 
(e.g. if the program now requires more heap space). Changing the runtime may 
require changing the configuration, as configuration options are not backward 
compatible (e.g. memory footprint may change, requiring a change to the heap 
size, or some integrity requirement may be added, requiring that a permission 
be added, etc.). Changing the configuration may similarly require changing the runtime 
(e.g. if a selected GC is unavailable in an old runtime).

In general, a working Java application is such a triplet where all three 
elements are tightly coupled. Because of that, we’re hesitant to add JDK 
capabilities that may give the wrong impression that the triplet is not tightly 
coupled.

Perhaps the present issue is too narrow to discuss this broader subject, but 
we’ve often heard claims about the difficulty of setting a configuration, 
sometimes alongside the incorrect expectation that the same configuration 
should work across multiple runtime versions. That is why we’re interested in 
collecting concrete examples to help us understand this challenge better, and 
we would appreciate it if others could offer more details/examples.

Anyway, do you think the @file solution could address the difficulty you 
encountered? 

> 
> However, I don’t understand the argument that running without agent can lead 
> to subtle errors. Agents were always meant to be optional - the core 
> application functionality should not really be dependent on the agent 
> availability. But, obviously, I was wrong all this time. 

You are absolutely right that agents were originally intended for 
observability, but they’ve long since been used for functional purposes, too, 
and so can no longer be considered a pure observability feature.

— Ron
