Re: RFR: 8339687: Rearrange reachabilityFence()s in jdk.test.lib.util.ForceGC

Brent Christian Tue, 10 Sep 2024 13:14:30 -0700

On Sat, 7 Sep 2024 00:59:25 GMT, Stuart Marks <sma...@openjdk.org> wrote:


>> From the bug description:
>> ForceGC would be improved by moving the Reference.reachabilityFence() calls 
>> for 'obj' and 'ref'.
>> 
>> Reference.reachabilityFence(obj) is currently placed after 'obj' has been 
>> set to null, so effectively does nothing. It should occur before obj = null;
>> 
>> For Reference.reachabilityFence(ref): 'ref' is a PhantomReference to 'obj', 
>> and is registered with 'queue'. ForceGC.waitFor() later remove()s the 
>> reference from the queue, as an indication that some GC and reference 
>> processing has taken place (hopefully causing the BooleanSupplier to return 
>> true).
>> 
>> The code expects the PhantomReference to be cleared and be put on the queue. 
>> But recall that a Reference refers to its queue, and not the other way 
>> around. If a Reference becomes unreachable and is garbage collected, it will 
>> never be enqueued.
>> 
>> I argue that the VM/GC could determine that 'ref' is not used by waitFor() 
>> and collect it before the call to queue.remove(). Moving 
>> Reference.reachabilityFence(ref) after the for() loop would prevent this 
>> scenario.
>> 
>> While this is only a very minor deficiency in ForceGC, I believe it would be 
>> good to ensure that the code behaves as expected.
>
> I added a couple specific comments on the code that I thought ought to be 
> addressed in this PR.
> 
> There is a broader issue with the timeout logic that we should be aware of, 
> however, we might or might not choose to address it in this PR. The main 
> issue is that the caller has requested a particular amount of time as the 
> timeout, and the timeout loop divides by 200ms to determine the maximum 
> number of retries. This _assumes_ that each loop will take 200ms. However, 
> this might not be true, because we don't know how long the booleanSupplier 
> takes, we don't know how long System.gc() takes, and we don't know how long 
> queue.remove() takes. This isn't an idle concern. Somebody might pass in a 
> booleanSupplier that itself has a timeout (say of 1 second) which will cause 
> this method to take about six times longer than expected to time out.
> 
> The usual approach for timeout logic is to take the initial System.nanoTime() 
> value and compare subsequent return values of nanoTime() to the timeout 
> duration, and exit the loop if the timeout duration has been exceeded. See 
> the nanoTime() javadoc for an example.

> The sketchy timeout handling mentioned by @stuart-marks seems to me to be out 
> of scope for this issue.

I also would prefer that updates to timeout handling be done as a separate 
issue.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20898#issuecomment-2341925965

Re: RFR: 8339687: Rearrange reachabilityFence()s in jdk.test.lib.util.ForceGC

Reply via email to