[ https://issues.apache.org/jira/browse/CXF-8765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17909955#comment-17909955 ]
Ben Manes edited comment on CXF-8765 at 1/5/25 8:25 PM:
--------------------------------------------------------

Unfortunately, I am still able to reproduce this denial of service attack on Ehcache. Running a SQL database workload trace took 22.5 minutes on the latest Ehcache3 for a 28% hit rate. It took 42s if I override the key's hashCode using xxHash, but the hit rate drops to 21.97%. By comparison, a simple LRU (e.g. LinkedHashMap) takes 6.9s at a 20.3% hit rate, and Caffeine takes 7.6s at a 45.8% hit rate.

Since this problem gets worse with the cache size, if your caches are small then it might not be worth your time. If so, though, WSS4J's usage of a disk cache would seem unnecessary (defaults to 5,000 heap entries + 10 MB disk). JCache is dead, so even if the API were rich enough, the migration might not be very beneficial in retrospect.

I still find it very frustrating that they have no interest in fixing this exploit, since a developer might actually want a moderately sized cache and be surprised, and that this was disclosed to them during their pre-release (3.0-M4) and promptly swept under the rug.

{code:java}
java.lang.Thread.State: RUNNABLE
    at org.ehcache.impl.internal.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3410)
    at org.ehcache.impl.internal.concurrent.ConcurrentHashMap.getEvictionCandidate(ConcurrentHashMap.java:6493)
    at org.ehcache.impl.internal.store.heap.SimpleBackend.getEvictionCandidate(SimpleBackend.java:53)
    at org.ehcache.impl.internal.store.heap.OnHeapStore.evict(OnHeapStore.java:1579)
    at org.ehcache.impl.internal.store.heap.OnHeapStore.enforceCapacity(OnHeapStore.java:1558)
    at org.ehcache.impl.internal.store.heap.OnHeapStore.put(OnHeapStore.java:365)
    at org.ehcache.core.Ehcache.doPut(Ehcache.java:93)
    at org.ehcache.core.EhcacheBase.put(EhcacheBase.java:187)
{code}

was (Author: ben.manes):

Unfortunately, I am still able to reproduce this denial of service attack on Ehcache.
Running a SQL database workload trace took 22.5 minutes on the latest Ehcache3 for a 28% hit rate. It took 42s if I override the key's hashCode using xxHash, but the hit rate drops to 21.97%. By comparison, a simple LRU (e.g. LinkedHashMap) takes 6.9s at a 20.3% hit rate, and Caffeine takes 7.6s at a 45.8% hit rate.

Since this problem gets worse with the cache size, if your caches are small then it might not be worth your time. If so, though, WSS4J's usage of a disk cache would seem unnecessary (defaults to 5,000 heap entries + 10 MB disk). JCache is dead, so even if the API were rich enough, the migration might not be very beneficial in retrospect.

I still find it very frustrating that they have no interest in fixing this exploit, since a developer might actually want a moderately sized cache and be surprised, and that this was disclosed to them during their pre-release (3.0-M3) and promptly swept under the rug.

{code:java}
java.lang.Thread.State: RUNNABLE
    at org.ehcache.impl.internal.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3410)
    at org.ehcache.impl.internal.concurrent.ConcurrentHashMap.getEvictionCandidate(ConcurrentHashMap.java:6493)
    at org.ehcache.impl.internal.store.heap.SimpleBackend.getEvictionCandidate(SimpleBackend.java:53)
    at org.ehcache.impl.internal.store.heap.OnHeapStore.evict(OnHeapStore.java:1579)
    at org.ehcache.impl.internal.store.heap.OnHeapStore.enforceCapacity(OnHeapStore.java:1558)
    at org.ehcache.impl.internal.store.heap.OnHeapStore.put(OnHeapStore.java:365)
    at org.ehcache.core.Ehcache.doPut(Ehcache.java:93)
    at org.ehcache.core.EhcacheBase.put(EhcacheBase.java:187)
{code}

> Option to remove Ehcache
> ------------------------
>
> Key: CXF-8765
> URL: https://issues.apache.org/jira/browse/CXF-8765
> Project: CXF
> Issue Type: Improvement
> Components: JAX-RS Security
> Reporter: Ben Manes
> Assignee: Andriy Redko
> Priority: Major
> Fix For: 3.6.6, 4.0.7, 4.1.1
>
>
> Is it possible to remove or replace Ehcache with an alternative
> provider? For example, if JCache were used, then one could exclude this
> dependency and register an alternative.
>
> I would like to ban Ehcache3 from my dependency tree because it is a trivial
> target for a hash flooding denial of service attack. Unfortunately, this has
> been known and ignored by their team since 2015, and I am still able to
> trivially introduce this problem in my test workloads (outside of CXF). For
> example, in one simple case Ehcache takes 67 minutes whereas a simple LRU
> takes 13 seconds. While I have not seen this exploited, at work we are
> undergoing SOC-2 compliance and I'd like to shore up known deficiencies by
> banning it company-wide.
>
> For background, the problem is that Ehcache uses a forked version of
> ConcurrentHashMap. That map uses a very cheap and weak hash function because
> its bins degrade to red-black trees on collisions, which mitigates the
> problem for the map itself. Ehcache uses a sampling policy that relies on the
> entries being uniformly distributed during its traversal; when they are not,
> the eviction scan degrades to O(n). It is trivial to construct a query
> pattern that is unfriendly to LRU, triggers an eviction, and results in
> threads being stuck performing this eviction scan instead of servicing
> requests. The solution is to update their fork with a more robust hash
> function or ensure that the keys use a good hashCode, which then drops this
> runtime to 1.4 minutes.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
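
To illustrate the background paragraph of the issue description, here is a minimal sketch of why a weak hash spreader can make a sampling scan expensive. It is a simplified model, not Ehcache's code: the class name, the `occupy` and `averageScan` helpers, the table size, and the sample size are all assumptions made for illustration. `weakSpread` mirrors the XOR-shift spreader of the JDK's `java.util.concurrent.ConcurrentHashMap`, and the MurmurHash3 32-bit finalizer stands in for "a more robust hash function". Sequential `Long` keys play the role of poorly hashed keys.

```java
import java.util.Random;

public class HashClusteringSketch {
    static final int HASH_BITS = 0x7fffffff;

    // XOR-shift spreader as used by java.util.concurrent.ConcurrentHashMap.
    static int weakSpread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    // MurmurHash3 32-bit finalizer, used here as a stand-in for a stronger mixer.
    static int strongSpread(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h & HASH_BITS;
    }

    // Marks which bins of a power-of-two table are occupied by sequential Long keys.
    static boolean[] occupy(int tableSize, int entries, boolean strong) {
        boolean[] bins = new boolean[tableSize];
        for (long key = 0; key < entries; key++) {
            int h = Long.hashCode(key);
            int spread = strong ? strongSpread(h) : weakSpread(h);
            bins[spread & (tableSize - 1)] = true;
        }
        return bins;
    }

    // Average number of bins visited to find `sample` occupied bins,
    // scanning forward (with wrap-around) from a random starting bin.
    static double averageScan(boolean[] bins, int sample, int trials, long seed) {
        Random random = new Random(seed);
        long visited = 0;
        for (int t = 0; t < trials; t++) {
            int index = random.nextInt(bins.length);
            int found = 0;
            while (found < sample) {
                if (bins[index]) {
                    found++;
                }
                index = (index + 1) & (bins.length - 1);
                visited++;
            }
        }
        return (double) visited / trials;
    }

    public static void main(String[] args) {
        int tableSize = 1 << 16; // hypothetical table capacity
        int entries = 1_000;     // hypothetical number of cached keys
        int sample = 8;          // hypothetical eviction sample size
        boolean[] weak = occupy(tableSize, entries, false);
        boolean[] strong = occupy(tableSize, entries, true);
        System.out.println("weak spread, avg bins visited:   " + averageScan(weak, sample, 1_000, 42L));
        System.out.println("strong spread, avg bins visited: " + averageScan(strong, sample, 1_000, 42L));
    }
}
```

In this model the weak spreader leaves the sequential keys packed into one contiguous region of the table, so a scan that starts elsewhere walks through long runs of empty bins before it finds its sample. The stronger mixer scatters the same keys, so the scan cost tracks the table's load factor instead.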