[
https://issues.apache.org/jira/browse/HTTPCORE-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Oleg Kalnichevski resolved HTTPCORE-609.
----------------------------------------
Fix Version/s: 4.4.13
Resolution: Fixed
Merged to 4.4.x.
Oleg
> Null Pointer Exception because of race condition in DefaultConnectingIOReactor
> ------------------------------------------------------------------------------
>
> Key: HTTPCORE-609
> URL: https://issues.apache.org/jira/browse/HTTPCORE-609
> Project: HttpComponents HttpCore
> Issue Type: Bug
> Components: HttpCore NIO
> Affects Versions: 4.4.12
> Reporter: Anurag Agarwal
> Priority: Major
> Fix For: 4.4.13
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> A race condition in DefaultConnectingIOReactor causes a NullPointerException
> that terminates the I/O reactor.
>
> {code:java}
> [ERROR] [] 2019-10-16 11:27:31.904 [pool-2-thread-1] InternalHttpAsyncClient - I/O reactor terminated abnormally
> java.lang.NullPointerException: null
>     at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:170) ~[AdExchange%23%23stable_1.32.23.44.jar:?]
>     at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:148) ~[AdExchange%23%23stable_1.32.23.44.jar:?]
>     at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:351) ~[AdExchange%23%23stable_1.32.23.44.jar:?]
>     at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:221) ~[AdExchange%23%23stable_1.32.23.44.jar:?]
>     at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[AdExchange%23%23stable_1.32.23.44.jar:?]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
> {code}
>
>
> This race condition is hard to reproduce; I reproduced it using a debugger.
> It was introduced by
> [https://github.com/apache/httpcomponents-core/commit/aa812282f26fdd1975233a892c5405fa0da781b4#diff-d577e717cb1e97f3a3c0adbc8d563062]
>
> I will try my best to explain the steps of what happens in this scenario:
> # A session request to establish a connection is submitted to the
> ConnectingIOReactor.
> #
> [https://github.com/apache/httpcomponents-core/blob/e7b4f5cc306b3eb4b6e1e24627971049001ddd68/httpcore-nio/src/main/java/org/apache/http/impl/nio/reactor/DefaultConnectingIOReactor.java#L258-L260]
> Here it is checked whether the request is already completed. If it is not,
> a socket is assigned and the connection attempt proceeds.
> # The race condition occurs when, after the second step but before the
> selection key is assigned at
> [https://github.com/apache/httpcomponents-core/blob/e7b4f5cc306b3eb4b6e1e24627971049001ddd68/httpcore-nio/src/main/java/org/apache/http/impl/nio/reactor/DefaultConnectingIOReactor.java#L316-L318]
> the request is cancelled.
> # In this case, in
> [https://github.com/apache/httpcomponents-core/blob/e7b4f5cc306b3eb4b6e1e24627971049001ddd68/httpcore-nio/src/main/java/org/apache/http/impl/nio/reactor/SessionRequestImpl.java#L208-L227]
> the key has not yet been assigned, so there is no key to cancel. The request
> is nevertheless cancelled successfully and the interested parties are
> notified via callbacks.
> # When
> [https://github.com/apache/httpcomponents-core/blob/e7b4f5cc306b3eb4b6e1e24627971049001ddd68/httpcore-nio/src/main/java/org/apache/http/impl/nio/reactor/DefaultConnectingIOReactor.java#L316-L318]
> now executes, we end up with a cancelled request that has an active
> selection key.
> # After the processSessionRequests method completes, suppose the expiry time
> of the request has also been reached, so that in the same call we hit
> processTimeouts, which attempts to time out the current request. Since the
> request is already in the completed state (via cancel), it will not be timed
> out, but the attachment of the key is set to null in the finally block.
> # At this point the request is completed but the key has not been cancelled,
> and the attachment of the key is null. When the key becomes ready, it is
> passed to the processEvent method, where
> [https://github.com/apache/httpcomponents-core/blob/e7b4f5cc306b3eb4b6e1e24627971049001ddd68/httpcore-nio/src/main/java/org/apache/http/impl/nio/reactor/DefaultConnectingIOReactor.java#L169-L170]
> throws a NullPointerException because the attachment is null.
> This race condition may seem rare, but under heavy load it is not hard to
> hit. When it happens, the IOReactor terminates and no further requests can be
> made via the client.
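
The end state described in the last step can be illustrated with plain java.nio, without HttpCore. The following is only a minimal sketch: `StaleKeyDemo`, `RequestHandle`, and `unguardedDispatch` are hypothetical names made up for this illustration, and the dispatch method merely imitates code that dereferences a key's attachment without a null check. It is not the actual DefaultConnectingIOReactor code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

final class StaleKeyDemo {

    /** Hypothetical stand-in for the handle attached to each selection key. */
    static final class RequestHandle {
        final String name;

        RequestHandle(String name) {
            this.name = name;
        }
    }

    /** Imitates a dispatch that dereferences the attachment without a null check. */
    static String unguardedDispatch(SelectionKey key) {
        RequestHandle handle = (RequestHandle) key.attachment();
        return handle.name; // throws NullPointerException if the attachment was cleared
    }

    /** Returns true if an unguarded dispatch fails with an NPE while the key is still valid. */
    static boolean reproduces() {
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            channel.configureBlocking(false);
            // The key is registered although the request was already cancelled,
            // so nothing ever cancels the key.
            SelectionKey key = channel.register(
                    selector, SelectionKey.OP_CONNECT, new RequestHandle("request-1"));
            // The timeout pass clears the attachment but leaves the key registered.
            key.attach(null);
            try {
                unguardedDispatch(key);
                return false;
            } catch (NullPointerException expected) {
                return key.isValid();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on a still-valid key: " + reproduces());
    }
}
```

The sketch skips the timing entirely and only sets up the final combination of a valid, registered key and a null attachment, which is the state the steps above lead to.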
>
> I tried to add a test case for this but could not. In case it helps, here
> are the parameters I am using and the place where I set the breakpoint.
>
> soTimeout -> 1000
> connectTimeout -> 1000
> soKeepAlive -> true
> tcpNoDelay -> true
> selectInterval -> 10
> soThreadCount -> 4
> soReuseAddress -> true
>
> DefaultRequestConfig
> socketTimeout -> 1000
> connectTimeout -> 1000
>
> Debug Point:
> [https://github.com/apache/httpcomponents-core/blob/e7b4f5cc306b3eb4b6e1e24627971049001ddd68/httpcore-nio/src/main/java/org/apache/http/impl/nio/reactor/DefaultConnectingIOReactor.java#L263]
>
> At this point I call request.cancel and wait 1-2 seconds before resuming,
> which reproduces the scenario above.
>
> A suggested solution is to check, when assigning the key to the request,
> whether the request is already completed and, if so, to cancel the key and
> close the channel. We have been using this in production, and it resolves
> the above issue. I will also submit a corresponding PR.
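
The suggested check could look roughly like the sketch below. It is not the submitted patch: `GuardedKeyAssignment`, `PendingRequest`, `assignKey`, and `release` are hypothetical, simplified names, and the real SessionRequestImpl may be structured differently. The sketch only shows the idea that a key assigned to an already completed request is cancelled and its channel closed right away.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

final class GuardedKeyAssignment {

    /** Hypothetical, simplified stand-in for a session request. */
    static final class PendingRequest {
        private boolean completed;
        private SelectionKey key;

        /** Completes the request; releases the key only if one was already assigned. */
        synchronized void cancel() {
            completed = true;
            if (key != null) {
                release(key);
            }
        }

        /** Assigns the key; if the request is already completed, releases it at once. */
        synchronized void assignKey(SelectionKey newKey) {
            key = newKey;
            if (completed) {
                release(newKey);
            }
        }

        private static void release(SelectionKey k) {
            k.cancel();
            try {
                k.channel().close();
            } catch (IOException ignore) {
                // Nothing useful can be done if closing fails.
            }
        }
    }

    /** Returns true if a key assigned after cancellation ends up cancelled and closed. */
    static boolean lateAssignmentReleasesKey() {
        try (Selector selector = Selector.open()) {
            SocketChannel channel = SocketChannel.open();
            channel.configureBlocking(false);
            PendingRequest request = new PendingRequest();
            request.cancel(); // cancelled before any key exists
            SelectionKey key = channel.register(selector, SelectionKey.OP_CONNECT);
            request.assignKey(key); // the guard releases the late key
            return !key.isValid() && !channel.isOpen();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println("Late key released: " + lateAssignmentReleasesKey());
    }
}
```

Both methods are synchronized on the request so that the completed flag and the key cannot be observed in an inconsistent state; that locking choice is an assumption of this sketch.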
--
This message was sent by Atlassian Jira
(v8.3.4#803005)