ffang opened a new pull request, #3143:
URL: https://github.com/apache/cxf/pull/3143

   ## Problem
   
   `RetryStrategy` contains a shared mutable `counter` instance field that 
makes it stateful. When configured as a singleton bean — as implied by the 
documentation and consistent with its stateless siblings `SequentialStrategy` 
and `RandomStrategy` — this counter is shared across all invocations.
   
   The result is that retries from independent call chains interfere with each 
other:
   
   - Sequential calls on the same thread carry leftover count from a previous 
invocation into the next one
   - Concurrent calls on different threads race against the same integer
   
   In both cases the effective `maxNumberOfRetries` becomes non-deterministic — 
the actual retry limit is anywhere between 1 and the configured maximum, with 
no error or warning.
   
   For example, with `maxNumberOfRetries = 5`: if one call chain fails twice 
and then succeeds, a subsequent call chain starts with `counter = 2` and only 
gets 3 retries instead of 5.
   
   ## Root Cause Analysis
   
   The bug has two dimensions:
   
   **1. Shared instance field**
   
   `RetryStrategy.counter` is incremented by every invocation that shares the 
same strategy instance.
   
   **2. Two distinct code paths both consume the counter**
   
   - `getAlternateEndpoints(Exchange)` → `stillTheSameAddress(Exchange)` 
(endpoint-based failover)
   - `getFailoverTarget` → `selectAlternateAddress` / `selectAlternateEndpoint` 
→ `getNextAlternate` → `stillTheSameAddress()` no-arg (address-based failover)
   
   ## Solution
   
   The retry counter is moved to a `ThreadLocal<Map<Integer, Integer>>` keyed 
by `Exchange` object identity (`System.identityHashCode`). Since 
`exchange.clear()` clears the Exchange's property map but never replaces the 
Exchange object itself, the counter survives between failover attempts while 
remaining scoped to a single invocation.
   
   Both code paths are covered:
   
   - **`stillTheSameAddress(Exchange)`** — new Exchange-aware variant that 
reads and writes the counter directly via the exchange identity key.
   - **`stillTheSameAddress()` no-arg** — kept for subclass compatibility. A 
second `ThreadLocal<Exchange>` (`CURRENT_EXCHANGE`) is set by 
`FailoverTargetSelector.getFailoverTarget` in a `try/finally` block before 
calling `selectAlternateAddress` / `selectAlternateEndpoint`, so the no-arg 
path can look up the correct per-exchange counter. This is safe under async 
dispatch and thread-pool reuse because it is keyed by Exchange identity, not by 
thread.
   
   ### Changes to `RetryStrategy`
   
   - Remove `private int counter` instance field
   - Add `EXCHANGE_COUNTERS` — `ThreadLocal<Map<Integer, Integer>>` keyed by 
exchange identity hash
   - Add `CURRENT_EXCHANGE` — `ThreadLocal<Exchange>` set by 
`FailoverTargetSelector` before entering the alternate-selection chain
   - Add `stillTheSameAddress(Exchange)` — Exchange-aware variant used by 
`getAlternateEndpoints`
   - Add `withExchange()` — helper that sets/restores `CURRENT_EXCHANGE` in a 
`try/finally`
   - Add `setCurrentExchange` / `clearCurrentExchange` — called by 
`FailoverTargetSelector.getFailoverTarget`
   - Add `resetCounter(Exchange)` / `removeCounter(Exchange)` — lifecycle hooks 
called by `FailoverTargetSelector`
   
   ### Changes to `FailoverTargetSelector`
   
   | Method | Change |
   |--------|--------|
   | `prepare()` | Calls `RetryStrategy.resetCounter(exchange)` to start each 
new top-level invocation from zero |
   | `getFailoverTarget()` | Wraps both `selectAlternateAddress` and 
`selectAlternateEndpoint` in a `try/finally` that sets and clears 
`CURRENT_EXCHANGE` |
   | `complete()` | Calls `RetryStrategy.removeCounter(exchange)` to prevent 
unbounded growth of the per-thread map |
   
   ## Testing
   
   Added `RetryStrategyTest` with 7 unit tests:
   
   | Test | What it verifies |
   |------|-----------------|
   | `testExchangePathRetriesExactlyMaxTimes` | Counter exhausts after exactly 
`maxNumberOfRetries` attempts |
   | `testExchangeCounterDoesNotLeakAcrossInvocations` | Fresh exchange always 
starts from zero regardless of prior state |
   | `testResetCounterClearsExchangeState` | `resetCounter` restores a clean 
slate (simulates `prepare()`) |
   | `testMaxRetriesZeroAlwaysReturnsSameAddress` | `maxNumberOfRetries=0` 
means unlimited retries |
   | `testNoArgPathUsesExchangeCounter` | No-arg `stillTheSameAddress()` shares 
the same per-exchange counter |
   | `testNoArgCounterDoesNotLeakAcrossExchanges` | No-arg path also isolates 
state between independent exchanges |
   | `testConcurrentInvocationsDoNotInterfere` | Two threads sharing a 
singleton strategy each get their full quota |
   
   All existing system tests pass:
   
   - `FailoverTest#testSequentialStrategyWithRetries`
   - `CircuitBreakerFailoverTest#testSequentialStrategyWithRetries`
   - `FailoverWebClientTest#testRetryFailover`
   - `FailoverWebClientTest#testCircuitBreakerRetryFailover`
   - `FailoverWebClientTest#testRetryFailoverAlternateAddresses`
   - `JAXRSSoapBookTest#testCheckBookClientErrorResponseWithFailover`
   
   ## Related
   
   - Jira: https://issues.apache.org/jira/browse/CXF-9213
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to