I'm still not a network engineer so feel free to disregard my comments. Only doing local reasoning with the Cache class: It's a bad sign that it uses two different mechanisms to deal with concurrency - Cache is synchronized and it also uses a concurrent collection. It seems there's a simple bug in Cache#remove that's easy to fix.
Nobody much likes LinkedList, but it does have the advantage of always shrinking to fit its contents. On Mon, Nov 4, 2019 at 7:09 AM Daniel Fuchs <daniel.fu...@oracle.com> wrote: > Hi Martin, > > On 02/11/2019 16:40, Martin Buchholz wrote: > > Hi Julia, > > > > I think this is the wrong fix. > > I disagree :-), but I understand why you could think that: > see below. > > > Looking at the Cache class - all the methods are synchronized, so it > > looks like any failure is not actually due to concurrent execution, but > > any bug can be reproduced in a single thread. (please add such a repro > > to the bug report) > > Yes - the trick is that for the ConcurrentModificationException > to happen, you have to have something to remove in the cache when > Cacghe::store() is called. > And the only way for that to happen is to have concurrent > responses adding their credentials to the cache at the > "same" time (where "same" means only after all 401 responses > have been received). > For that to happen, then you need concurrent requests to be sent > before any credentials is put in the cache, so that all requests > will call the authenticator, and all responses "compete" to add > their own successful credentials to the cache. In other words: > > req#1 is sent, finds no credentials, gets 401, finds no credential, > calls authenticator, send Authorization. > req#2 is sent, finds no credentials, gets 401, finds no credential, > calls authenticator, send Authorization. > req#3 is sent, finds no credentials, gets 401, finds no credential, > calls authenticator, send Authorization. > req#4 is sent, finds no credentials, gets 401, finds no credential, > calls authenticator, send Authorization. > req#5 is sent, finds no credentials, gets 401, finds no credential, > calls authenticator, send Authorization. > ... > > server waits for all requests to reach this point, then sends back 200 > responses. At this point, the client will (concurrently) receive all > the 200 responses, and will try to update its cache for each response > received. > > resp#1 comes with 200. Client calls Cache::store. There's nothing in > the cache so Cache::remove(3 args) does nothing. > > resp#2 comes with 200. Client calls Cache::store. There's something > in the cache - so Cache::remove(3 args) starts iterating. If the > entries match then the previously stored credentials will be > removed, and ConcurrentModificationException may be thrown (or not). > As you well know ConcurrentModificationException is thrown on best > effort. It doesn't always get thrown... > > resp#3 comes and the whole store/remove dance starts again. > etc... > > This cannot be reproduced with a single thread, because with > sequential requests, if the server sends back a 401, then > the Cache::remove(1 arg) is called when 401 is received, and that > doesn't loop and can't raise any CME. Then when 200 is received > the obsolete credentials have been already removed from > the cache, and therefore the 3 args Cache::remove doesn't find > anything matching to remove - so CME will never be thrown > either. > > You really have to have several concurrent requests/response > competing to authenticate and update the cache to see the CME, > even if though cache is properly synchronized and the CME is > thrown because List::remove() is called by the same thread > that iterates. > > > It looks like Cache#remove can be fixed simply by using an explicit > > iterator and using iterator#remove instead of List#remove. > > Yes - we could have used an Iterator. I don't like much LinkedList. > We could have used ArrayList and an explicit Iterator instead. > I suggested using CopyOnWriteArrayList just to avoid using the > Iterator explicitly. Cache are usually few writes and multiple > reads so that looked appropriate. > > Would you still advised using plain ArrayList and an Iterator instead? > > > > Cache looks fishy in other ways, e.g. why does Cache#remove ignore > > authscheme? > > Right, technical debt I'd say: the HttpClient only supports Basic > authentication so there's really only one scheme: "Basic". > That said, maybe we should clean that up and either have the cache > support multiple schemes properly or reap that out from the cache. > > best regards, > > -- daniel > > > > > (Yes, proper caching infrastructure should be added to the JDK someday, > > but that's a big project) > > > > > > On Fri, Nov 1, 2019 at 9:48 AM Julia Boes <julia.b...@oracle.com > > <mailto:julia.b...@oracle.com>> wrote: > > > > Hi, > > > > This fix addresses an issue in AuthenticationFilter.Cache::remove > that > > can lead to a ConcurrentModificationException. The cache is > implemented > > as LinkedList, the method iterates over the list and potentially > > removes > > items without using an iterator. > > > > To reproduce the bug, multiple requests are sent asynchronously so > that > > successful response headers might arrive concurrently and compete in > > updating the cache. This increases the chances to trigger the > > removal of > > a previously cached credential while storing the credentials of the > > next > > response. Under these circumstances, ConcurrentModificationException > is > > thrown with very high frequency (if not always). > > > > The proposed fix replaces LinkedList with CopyOnWriteArrayList to > > preclude the exception. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8232853 > > > > Webrev: > > > http://cr.openjdk.java.net/~jboes/webrevs/8232853/webrev.01/index.html > > > > The test fails reliably without the fix, and passes with it (jdk_net > > tests run 50 times). > > > > > > Regards, > > > > Julia > > > > > >