[ https://issues.apache.org/jira/browse/HTTPCLIENT-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17724557#comment-17724557 ]
Oleg Kalnichevski commented on HTTPCLIENT-1347:
-----------------------------------------------

[~arturobernalg] I have looked at the original reproducer and ported it to the latest stable versions of Node.js and HttpClient. I am afraid the problem is still there. The test server you have been using does not send a `Vary` header in its responses and therefore does not trigger the defect. Here is the message exchange generated with curl against the test server provided by the original issue reporter. Note the `Vary: Accept-Encoding` header in the response.

{noformat}
$ curl -H 'Accept-Encoding: gzip' -v http://localhost:8000/assets/1
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /assets/1 HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.81.0
> Accept: */*
> Accept-Encoding: gzip
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< X-Powered-By: Express
< Accept-Ranges: bytes
< Cache-Control: public, max-age=31536000
< Last-Modified: Sat, 05 Nov 2011 19:41:40 GMT
< ETag: W/"c23c6-13375402d20"
< Content-Type: application/octet-stream
< Vary: Accept-Encoding
< Content-Encoding: gzip
< Date: Sat, 20 May 2023 15:19:25 GMT
< Connection: keep-alive
< Keep-Alive: timeout=5
< Transfer-Encoding: chunked
<
{noformat}

Here is the updated server and client test code that I used to reproduce the problem.

*Server:*
{code:javascript}
const compression = require('compression')
const express = require('express')

const app = express()

// Compress every response so the Vary/Content-Encoding headers are always sent
app.use(compression({ filter: function () { return true; } }));

const oneYear = 31557600000;
app.use('/assets', express.static('assets', { maxAge: oneYear }))

const port = 8000
app.listen(port, () => {
  console.log(`Example app listening on port ${port}`)
})
{code}

*Client:*
{code:java}
public static void main(String[] args) throws Exception {
    SurespotHttpCacheStorage cacheStorage = new SurespotHttpCacheStorage();
    CloseableHttpClient client = CachingHttpClientBuilder.create()
            .setCacheConfig(CacheConfig.custom()
                    .setMaxCacheEntries(50)
                    .setMaxObjectSize(1000000)
                    .build())
            .setHttpCacheStorage(cacheStorage)
            .build();
    client.execute(new HttpGet("http://localhost:8000/assets/1"), response -> {
        EntityUtils.consume(response.getEntity());
        return null;
    });
    client.execute(new HttpGet("http://localhost:8000/assets/2"), response -> {
        EntityUtils.consume(response.getEntity());
        return null;
    });
    client.execute(new HttpGet("http://localhost:8000/assets/1"), response -> {
        EntityUtils.consume(response.getEntity());
        return null;
    });
    client.execute(new HttpGet("http://localhost:8000/assets/2"), response -> {
        EntityUtils.consume(response.getEntity());
        return null;
    });
    cacheStorage.dumpCache();
}

public static class SurespotHttpCacheStorage implements HttpCacheStorage {

    private final HashMap<String, HttpCacheEntry> mCache = new HashMap<>();

    @Override
    public HttpCacheEntry getEntry(String key) {
        System.out.println("getting entry: " + key);
        return mCache.get(key);
    }

    @Override
    public void putEntry(String key, HttpCacheEntry entry) {
        System.out.println("putting entry: " + key);
        mCache.put(key, entry);
    }

    @Override
    public void removeEntry(String key) {
        System.out.println("removing entry: " + key);
        mCache.remove(key);
    }

    @Override
    public void updateEntry(String key, HttpCacheCASOperation casOperation)
            throws ResourceIOException, HttpCacheUpdateException {
        System.out.println("updating entry: " + key);
        HttpCacheEntry entry = getEntry(key);
        // I believe we should not update non-existent entries in the update method,
        // but it throws exceptions otherwise
        // if (entry != null) {
        putEntry(key, casOperation.execute(entry));
        // } else {
        //     throw new HttpCacheUpdateException("key not found: " + key);
        // }
        // System.out.println(mCache.toString());
    }

    @Override
    public Map<String, HttpCacheEntry> getEntries(Collection<String> keys) throws ResourceIOException {
        Set<String> set = new HashSet<>(keys);
        return mCache.entrySet().stream()
                .filter(e -> set.contains(e.getKey()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public void dumpCache() {
        for (Map.Entry<String, HttpCacheEntry> entry : mCache.entrySet()) {
            System.out.println("cache key: " + entry.getKey()
                    + " content length: " + entry.getValue().getResource().length());
        }
    }
}
{code}

*Output:*
{noformat}
getting entry: http://localhost:8000/assets/1
getting entry: http://localhost:8000/assets/1
getting entry: http://localhost:8000/assets/1
putting entry: {Accept-Encoding=gzip%2C+x-gzip%2C+deflate}http://localhost:8000/assets/1
updating entry: http://localhost:8000/assets/1
getting entry: http://localhost:8000/assets/1
putting entry: http://localhost:8000/assets/1
getting entry: http://localhost:8000/assets/2
getting entry: http://localhost:8000/assets/2
getting entry: http://localhost:8000/assets/2
putting entry: {Accept-Encoding=gzip%2C+x-gzip%2C+deflate}http://localhost:8000/assets/2
updating entry: http://localhost:8000/assets/2
getting entry: http://localhost:8000/assets/2
putting entry: http://localhost:8000/assets/2
getting entry: http://localhost:8000/assets/1
getting entry: {Accept-Encoding=gzip%2C+x-gzip%2C+deflate}http://localhost:8000/assets/1
getting entry: http://localhost:8000/assets/2
getting entry: {Accept-Encoding=gzip%2C+x-gzip%2C+deflate}http://localhost:8000/assets/2
cache key: http://localhost:8000/assets/2 content length: 904333
cache key: http://localhost:8000/assets/1 content length: 793875
cache key: {Accept-Encoding=gzip%2C+x-gzip%2C+deflate}http://localhost:8000/assets/2 content length: 904333
cache key: {Accept-Encoding=gzip%2C+x-gzip%2C+deflate}http://localhost:8000/assets/1 content length: 793875
{noformat}

There are four entries in the cache, which seems correct (one root entry and one variant entry per URI), but storing the same content twice does look wasteful indeed.
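As an aside, the update contract that tripped up the reporter is easier to see in isolation. A minimal sketch (hypothetical class name, plain strings standing in for `HttpCacheEntry`, and `UnaryOperator` standing in for `HttpCacheCASOperation`) of a storage where the CAS operation itself decides how to handle an absent entry, so `updateEntry` never needs to reject missing keys:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical in-memory storage illustrating the CAS-style update contract:
// the operation receives the current entry (null if absent) and returns the
// value to store, so the storage layer itself never rejects missing keys.
public class SimpleCacheStorage {

    private final Map<String, String> cache = new HashMap<>();

    public String getEntry(String key) {
        return cache.get(key);
    }

    public void putEntry(String key, String entry) {
        cache.put(key, entry);
    }

    // Apply the operation to the current entry (possibly null) and store
    // whatever it returns; creating a missing entry is the operation's call.
    public void updateEntry(String key, UnaryOperator<String> casOperation) {
        putEntry(key, casOperation.apply(getEntry(key)));
    }
}
```

With this shape, an update against a missing key simply lets the operation create the initial entry instead of throwing.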
Oleg

> gzip responses doubly cached
> ----------------------------
>
>                 Key: HTTPCLIENT-1347
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1347
>             Project: HttpComponents HttpClient
>          Issue Type: Improvement
>          Components: HttpCache
>    Affects Versions: 4.2.5
>        Environment: ARCH Linux kernel 3.8.8-1
>                      node.js 0.8.22
>            Reporter: Adam Patacchiola
>            Priority: Major
>              Labels: stuck, volunteers-wanted
>             Fix For: Stuck
>
>         Attachments: Archive.zip, Screen Shot 2014-01-11 at 7.11.36 PM.png,
>                      Screen Shot 2014-01-13 at 3.56.19 PM.png, Showing_entry_pointer.png,
>                      httpClientCacheTest.tar.gz, httpClientTestServer.js, output.out
>
>
> Compressed responses are cached twice.
>
> Run the attached server (node.js 0.8.22) and client tests. Create an "assets" directory under where you are running the server and add two files named 1 and 2 (< 1000000 bytes). You will see that after the test is run, the cache dump output displays 2 sets of entries for each request, each containing the full content length of the file.
>
> Changing the implementation of HttpCacheStorage updateEntry to not update non-existent entries (as I believe the correct implementation should do) throws exceptions.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)