Andreas Sewe created HTTPCLIENT-1789:
----------------------------------------

             Summary: 200 Response with Vary header does not invalidate cached 
404 response without
                 Key: HTTPCLIENT-1789
                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1789
             Project: HttpComponents HttpClient
          Issue Type: Bug
          Components: HttpCache
         Environment: Tested with the org.apache.httpcomponents.httpclient 
4.3.6 OSGi bundle distributed by Eclipse Orbit: 
<http://download.eclipse.org/tools/orbit/downloads/drops/R20160520211859/>
            Reporter: Andreas Sewe


While implementing my own {{HttpCacheStorage}} I noticed the following 
problematic cache revalidation behavior. FYI, this behavior also occurs with 
{{BasicHttpCacheStorage}} (created through 
{{CachingHttpClients.createMemoryBound()}}), so it is not caused by my 
{{HttpCacheStorage}} implementation. Consider this sequence of requests and 
responses:

* {{GET /something HTTP/1.1}}
* {{Accept: application/json}}

* {{HTTP/1.1 404 Not Found}}
* {{Cache-Control: max-age=60}}

This response is cached under the key {{/something}}. After 60 seconds, another 
{{GET}} request is performed and sent over the network, as the cached {{404}} 
response is stale.
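
For reference, the staleness check in this step can be sketched in plain Java. This is only an illustration of the rule in RFC 7234, Section 4.2 (fresh while the freshness lifetime exceeds the current age), not HttpClient's actual code; {{FreshnessSketch}} and {{isFresh}} are hypothetical names, and the age is simplified to "now minus {{Date}}".

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper, not part of HttpClient: models the max-age freshness check. */
final class FreshnessSketch {

    /**
     * RFC 7234, Section 4.2: a response is fresh while its freshness lifetime
     * is strictly greater than its current age. The age is simplified here to
     * (now - Date); the full calculation also considers the Age header.
     */
    static boolean isFresh(Instant date, long maxAgeSeconds, Instant now) {
        long currentAge = Math.max(0, Duration.between(date, now).getSeconds());
        return maxAgeSeconds > currentAge;
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2016-11-01T12:00:00Z"); // arbitrary example time
        // The 404 with max-age=60, checked just before and exactly at 60 seconds.
        System.out.println(isFresh(t0, 60, t0.plusSeconds(59)));
        System.out.println(isFresh(t0, 60, t0.plusSeconds(60)));
    }
}
```

Under this simplified model the {{404}} stops being fresh once its age reaches 60 seconds, which is why the second request goes to the network.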

* {{GET /something HTTP/1.1}}
* {{Accept: application/json}}

* {{HTTP/1.1 200 OK}}
* {{Vary: Accept}}
* {{Cache-Control: max-age=120}}

This response is cached under the key {{\{Accept:application/json\}/something}} 
and key {{/something}}’s {{variantMap}} is updated to refer to this key. After 
another 60 seconds, a third {{GET}} request is performed, which again results in 
*network I/O*, even though IMHO it should not.
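
To make the bookkeeping above concrete, here is a rough sketch of how such a variant key could be derived. The key notation is copied from this report; {{VariantKeySketch}}, {{variantKey}} and the {{&}} separator for multiple headers are assumptions made for illustration, and HttpClient's real key format may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper, not part of HttpClient: models the variant bookkeeping. */
final class VariantKeySketch {

    /** Builds a key such as {Accept:application/json}/something. */
    static String variantKey(String uri, List<String> varyHeaders,
                             Map<String, String> requestHeaders) {
        // Assumption: several selecting headers would be joined with '&'.
        StringJoiner selector = new StringJoiner("&", "{", "}");
        for (String name : varyHeaders) {
            selector.add(name + ":" + requestHeaders.getOrDefault(name, ""));
        }
        return selector + uri;
    }

    public static void main(String[] args) {
        Map<String, String> request = Map.of("Accept", "application/json");
        String key = variantKey("/something", List.of("Accept"), request);

        // Stand-in for the variantMap of the root entry "/something":
        // it records which storage key holds the variant for this selector.
        Map<String, String> variantMap = new LinkedHashMap<>();
        variantMap.put("{Accept:application/json}", key);

        System.out.println(key);
        System.out.println(variantMap);
    }
}
```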

* {{GET /something HTTP/1.1}}
* {{Accept: application/json}}

* {{HTTP/1.1 200 OK}}
* {{Vary: Accept}}
* {{Cache-Control: max-age=120}}

This re-validation occurs because a stale {{404}} response for {{/something}} 
was cached – although its {{variantMap}} contains a fresh, selectable {{200}} 
response.

FWIW, [RFC 7234|https://tools.ietf.org/html/rfc7234#page-9] has this to say 
about the subject:

{quote}
   The stored response with matching selecting header fields is known as
   the selected response.

   If multiple selected responses are available (potentially including
   responses without a Vary header field), the cache will need to choose
   one to use.  When a selecting header field has a known mechanism for
   doing so (e.g., qvalues on Accept and similar request header fields),
   that mechanism MAY be used to select preferred responses; of the
   remainder, the most recent response (as determined by the Date header
   field) is used, as per Section 4.
{quote}

According to this, the {{200}} response should have been selected, as its 
{{Date}} is newer than that of the {{404}} response. Instead, another request for 
{{/something}} is sent to the server, even though the most recent cache entry 
is still fresh.
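
The selection I would expect can be sketched as follows. This is a simplified model of the quoted RFC text (choose the most recent matching response by {{Date}}), not a patch against HttpClient; {{SelectionSketch}} and {{Stored}} are hypothetical stand-ins, and the timestamps are made up to mirror the sequence above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical model of response selection; not HttpClient's implementation. */
final class SelectionSketch {

    /** Minimal stand-in for a stored response. */
    static final class Stored {
        final int status;
        final Instant date;
        final long maxAgeSeconds;

        Stored(int status, Instant date, long maxAgeSeconds) {
            this.status = status;
            this.date = date;
            this.maxAgeSeconds = maxAgeSeconds;
        }

        /** Simplified freshness: max-age must exceed (now - Date). */
        boolean isFresh(Instant now) {
            return maxAgeSeconds > Duration.between(date, now).getSeconds();
        }
    }

    /** Among all stored responses that match the request, pick the newest Date. */
    static Optional<Stored> select(List<Stored> matching) {
        return matching.stream().max(Comparator.comparing((Stored s) -> s.date));
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2016-11-01T12:00:00Z"); // arbitrary example time
        Stored notFound = new Stored(404, t0, 60);              // no Vary header
        Stored ok = new Stored(200, t0.plusSeconds(60), 120);   // Vary: Accept
        Instant now = t0.plusSeconds(120);                      // third request

        Stored selected = select(List.of(notFound, ok)).get();
        System.out.println(selected.status + " fresh=" + selected.isFresh(now));
    }
}
```

In this model the {{200}} is selected and is still fresh at the time of the third request, so no network I/O would be needed.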



