Hi Dan,
I believe that will do the trick for the AbstractCacheServer. The
record approach looks pretty clean.
The Cleaner.create() function does start a thread [1] that is used
process the clean up method so it does mean the runnable implementing
the run function does need to be mindful of blocking. The current
implementation is a synchronized instance method which locks on the
instance. Given the instance will be gone inside the cleaner you could
consider locking on the wrapper to get similar behavior[2] . Given the
cleaner is being called when the object is a phantom reference I would
not expect a lot of possibilities for contention for this lock.
Anything calling it from outside the class would need a reference. At
the very least it would be the same risk as what was happening in the
finalize method.
- Bob
[1]
https://github.com/openjdk/jdk/blob/890adb6410dab4606a4f26a942aed02fb2f55387/src/java.base/share/classes/java/lang/ref/Cleaner.java#L179
[2]
https://gist.github.com/bobpaulin/4cf9706c77b645dce5955bd989cac73e#file-cleanerexample-java-L65
On 3/24/2025 12:06 PM, Dan S wrote:
Bob,
Can you please take a look at the changes I made to
AbstractCacheServer
<https://github.com/dan-s1/nifi/commit/bdfe1c931596b5a4cc537e31f8c9925531535bd7> and
see if those changes make sense. I am currently working on separating
the state from StandardProcessSession although another wrinkle I
noticed is that the rollback method is synchronized. The cleaner
article <https://inside.java/2022/05/25/clean-cleaner/> you shared
with me has the following quote
When a /Cleaner/ is shared among different uses, and
synchronization is needed for the state, the cleanup function
might block the /Cleaner/ thread. As with any synchronization,
check carefully that the synchronization does not result in
deadlock or delay in the cleanup. For example, if the cleanup
involved closing a network connection, it would be prudent to fork
off a task independent of the /Cleaner/ thread.
On Wed, Mar 19, 2025 at 8:52 PM Bob Paulin <b...@bobpaulin.com> wrote:
Hi Dan,
<snip>
It would seem to me all the instant variables used in the
rollback and/or
the methods it references would have to be sent arguments to the
Runnable
and the same actions rollback does and the methods it calls
would have to
be done by the Runnable. Is that correct?
</snip>
Yes, but any non-final variables (like checkpoint) or mutable
primitives
(like the counters) can't be used in the Runnable directly as the
reference value may change outside the constructor. They may be
wrapped
in another class to get the desired behavior [1]. I think a static
class similar to Tomcat [2] might be better for this. No issues with
passing this.instanceVariable either which made me realize the real
problem was using the lambda to create the Runnable. It was
passing a
reference to my test class which caused it not to be phantom
reference
anymore. Fun.
<snip>
the actual StandardProcessSession as ultimately its rollback
method has to be called and this method is not static. So how is it
possible for the Runnable registered with the Cleaner not to
have a handle
on the StandardProcessSession instance?
</snip>
Right you're not going to be able to use rollback as an instance
method in this case. The cleaner will not have access to the
Phantom object directly. So implementing rollback as a static
method or inside the static class might involve wrapping all the
instance variables it needs into a new wrapper object so you don't
have too many parameters. It also would mean that
StandardProcessSession would need to mutate those variables
through the wrapper class rather than directly as it is now.
- Bob
[1] https://gist.github.com/bobpaulin/4cf9706c77b645dce5955bd989cac73e
[2]
https://github.com/apache/tomcat/blob/2bc4d43c9510da7e3da5d509517ad8bcd66739bd/java/org/apache/tomcat/util/net/openssl/OpenSSLContext.java#L163
On 3/19/2025 4:03 PM, Dan S wrote:
> Bob,
> It would seem to me all the instant variables used in the
rollback and/or
> the methods it references would have to be sent arguments to the
Runnable
> and the same actions rollback does and the methods it calls
would have to
> be done by the Runnable. Is that correct?
>
> On Wed, Mar 19, 2025 at 4:57 PM Dan S <dsti...@gmail.com> wrote:
>
>> Bob thanks for your response. I am still confused how to apply
the example
>> you have to the actual StandardProcessSession as ultimately its
rollback
>> method has to be called and this method is not static. So how is it
>> possible for the Runnable registered with the Cleaner not to
have a handle
>> on the StandardProcessSession instance?
>>
>> On Tue, Mar 18, 2025 at 10:46 AM Bob Paulin <b...@apache.org> wrote:
>>
>>> Formatting came through poorly. Please see [1]
>>>
>>> - Bob
>>>
>>> [1]
https://gist.github.com/bobpaulin/ce6a83e45cd6bbf34051935324f10c63
>>>
>>> On 2025/03/18 14:40:19 Bob Paulin wrote:
>>>> Hi
>>>>
>>>> In both cases it appears that finalize is being used as
resource clean
>>>> up following the object being de-referenced. In both cases
finalize is
>>>> not the primary means of cleanup but rather a fall-through for
>>>> exceptional cases where processing is interrupted. The
consequences
>>> are
>>>> different in each case in the case of AbstractCacheServer the
>>>> cacheServer resource is not released which is a memory
concern. In the
>>>> case of StandardProcessSession it's a call to rollback which is
>>>> important for data integrity. So I'm not sure removal is without
>>>> consequences but more so for StandardProcessSession. I also
agree that
>>>> implementing closeable might not be possible in all cases for
>>>> StandardProcessSession.
>>>>
>>>> Cleaner looks like the simplest approach given it's wrapping
some of
>>> the
>>>> complexity in working with the ReferenceQueue[1]. There are more
>>>> sophisticated things we could do using the ReferenceQueue
directly (see
>>>> [2]). There are some important caveats to implementing this
that would
>>>> need to be followed to ensure it works. First we can't use
references
>>>> to the object AbstractCacheServer or StandardProcessSession
within the
>>>> runnable or it will not trigger (since that would create a
reference
>>>> thus it would not be Phantom). It may be valuable to create
a static
>>>> method on each of these classes that requires the constructor
objects
>>> to
>>>> be passed to it in order to discourage the usage references
to the
>>>> object. Second we'll want to avoid passing "this.varaibles"
to the
>>>> static method. Below is some example code for reference.
Notice what
>>>> happens if you try to pass this.value instead of value. The
cleaner is
>>>> not called because it's not a phantom ref anymore. So I'm not
sure it's
>>>> required to pass the cleaning function to an additional class (as
>>>> demonstrated below) but it might help keep things cleaner and
further
>>>> avoid references to the object that we're cleaning from.
Hope this is
>>>> helpful.
>>>>
>>>> packagecom.bobpaulin.nifi;
>>>>
>>>> importjava.lang.ref.Cleaner;
>>>>
>>>> importjava.lang.ref.PhantomReference;
>>>>
>>>> importjava.lang.ref.ReferenceQueue;
>>>>
>>>> importorg.slf4j.Logger;
>>>>
>>>> importorg.slf4j.LoggerFactory;
>>>>
>>>> publicclassCleanerExample {
>>>>
>>>> privatestaticfinalLogger LOGGER=
>>>> LoggerFactory.getLogger(CleanerExample.class);
>>>>
>>>> //Static class just for demo purposes
>>>>
>>>> publicstaticclassStandardProcessorSession {
>>>>
>>>> privatestaticfinalCleaner cleaner= Cleaner.create();
>>>>
>>>> privatefinalCleaner.Cleanable cleanable;
>>>>
>>>> privatefinalString value;
>>>>
>>>> publicStandardProcessorSession(finalString value) {
>>>>
>>>> this.value= value;
>>>>
>>>> finalRunnable cleanUpRunner= () -> rollback(value);
>>>>
>>>> this.cleanable= cleaner.register(this, cleanUpRunner);
>>>>
>>>> }
>>>>
>>>> publicstaticvoidrollback(finalString value) {
>>>>
>>>> System.out.println("Rollback Called. Value is "+ value);
>>>>
>>>> }
>>>>
>>>> }
>>>>
>>>> publicstaticvoidmain(finalString[] args) {
>>>>
>>>> StandardProcessorSession processSession=
>>>> newStandardProcessorSession("Hello");
>>>>
>>>> //Needed for testing
>>>>
>>>> finalReferenceQueue<StandardProcessorSession> refQueue=
>>>> newReferenceQueue<>();
>>>>
>>>> // Create a phantom reference artificially
>>>>
>>>> finalPhantomReference<StandardProcessorSession>
helloPhantomReference=
>>>> newPhantomReference<>(
>>>>
>>>> processSession, refQueue);
>>>>
>>>> newThread(() -> {
>>>>
>>>> LOGGER.info("Starting ReferenceQueue consumer thread.");
>>>>
>>>> booleanreferenceNotFound= true;
>>>>
>>>> while(referenceNotFound) {
>>>>
>>>> try{
>>>>
>>>> @SuppressWarnings("unchecked")
>>>>
>>>> finalPhantomReference<StandardProcessorSession> reference=
>>>> (PhantomReference<StandardProcessorSession>) refQueue
>>>>
>>>> .remove();
>>>>
>>>> referenceNotFound= false;
>>>>
>>>> LOGGER.info("Finalized has been finalized.");
>>>>
>>>> } catch(finalInterruptedException e) {
>>>>
>>>> // Just for the purpose of this example.
>>>>
>>>> break;
>>>>
>>>> }
>>>>
>>>> }
>>>>
>>>> LOGGER.info("Finished ReferenceQueue consumer thread.");
>>>>
>>>> }).start();
>>>>
>>>> // Lose the strong reference to the object.
>>>>
>>>> processSession= null;
>>>>
>>>> // Attempt to trigger a GC.
>>>>
>>>> System.gc();
>>>>
>>>> }
>>>>
>>>> }
>>>>
>>>>
>>>> - Bob
>>>> [1] https://inside.java/2022/05/25/clean-cleaner/
>>>> [2]
>>>>
>>>
https://unitstep.net/blog/2018/03/10/java-phantomreferences-a-better-choice-than-finalize/
>>>> On 3/17/2025 4:11 PM, Dan S wrote:
>>>>> I am currently working on NIFI-14340
>>>>> <https://issues.apache.org/jira/browse/NIFI-14340> trying to
replace
>>> the
>>>>> overridden methods finalize in both AbstractCacheServer and
>>>>> StandardProcessSession as
>>>>> per the javadocs
>>>>> <
>>>
https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Object.html#finalize()
>>>>> finalize
>>>>> is deprecated for removal.
>>>>>
>>>>> So my first question is whether we would like a replacement
for these
>>>>> finalize methods or is the solution for this ticket is as
simple as
>>> just
>>>>> removing them?
>>>>> If we would like to replace them based on the suggestions
given in
>>> JEP 421
>>>>> <https://openjdk.org/jeps/421> then I have a few other
questions.
>>>>> The first suggested solution is to make the class implement
>>>>> java.lang.AutoCloseable and then instantiate and use it in a
try with
>>>>> resources statement within a single lexical scope.
>>>>> It does not seem to me that is plausible for AbstractCacheServer
>>> which is a
>>>>> controller service and from what I understand is not used
within a
>>> single
>>>>> lexical scope.
>>>>> It also does not seem this would work for
StandardProcessSession as
>>> there
>>>>> are cases where instances of ProcessSession handle
transferring flow
>>>>> file(s) and commits in a catch block. If used
>>>>> in a try resources it would be closed before the catch block
code is
>>> called.
>>>>> The second suggestion is to use Cleaners but that also seems
>>> difficult as
>>>>> it seems from what I have read the right way to implement
this would
>>> be
>>>>> passing AbstractCacheServer and StandardProcessSession into a
>>> different
>>>>> class to handle cleaning. I am not sure how that would be
done for
>>> these
>>>>> classes.
>>>>>
>>>>> Can I please get guidance on how we should proceed regarding
these
>>> finalize
>>>>> methods? Thank you!
>>>>>