Ah, so you want a concurrent cache! This is why it's important to talk about use
cases. Turns out there are a lot of subtle issues here, and there has been a lot of
work in this area outside the JDK. For example, a bit of searching turned up this
Stack Overflow question [1]. The answers aren't particularly valuable, but there is
this comment [2] from Ben Manes, who knows something about this topic:
ConcurrentLinkedHashMap [3] beget Guava's Cache [4] which beget Caffeine. [5]
[6]
[1]:
https://stackoverflow.com/questions/40239485/concurrent-lru-cache-implementation
[2]:
https://stackoverflow.com/questions/40239485/concurrent-lru-cache-implementation#comment67809840_40239485
[3]: https://github.com/ben-manes/concurrentlinkedhashmap/wiki/Design
[4]:
https://github.com/ben-manes/concurrentlinkedhashmap/blob/wiki/ConcurrentCachingAtGoogle.pdf
[5]: http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html
[6]: https://github.com/ben-manes/caffeine
It looks like "ConcurrentLinkedHashMap" is a concurrent hash map that maintains
ordering but that doesn't actually maintain a linked list of entries in the same way
as LinkedHashMap. I haven't looked closely at the implementation, but I'd guess that
it maintains ordering using a weakly consistent side data structure.
Weak consistency seems to be the flavor of most concurrent data structures; you give
up non-essential consistency in order to gain concurrency. For example,
ConcurrentHashMap's views' iterators are "weakly consistent" which roughly means
that an element reported by the iterator was a member at one time, but elements
might not be reported by the iterator if they were added or removed sometime during
the iteration. For an LRU cache, weaker semantics might be that you don't care about
the exact ordering of recency of usage. (Indeed, ordering of events is somewhat
ill-defined in a concurrent system.) Instead, you probably want things that were
used fairly recently to stay in the cache longer, while things that haven't been
used recently should tend to expire earlier.
I'd suggest looking at some of these caching libraries. Or if you're keen to roll
your own, read some of the design documents. Which might convince you that you don't
want to roll your own. :-)
s'marks
On 11/10/23 3:26 AM, Sebastian Fischer wrote:
Hi Stuart.
Thanks, I understand now how the limited thread safety of "synchronized" wrappers
has led to not adding new ones.
My use case is an LRU-cache that is accessed by a large number of (virtual)
threads.
In my initial implementation I constructed a LinkedHashMap passing true for the
accessOrder parameter, implemented removeEldestEntry appropriately and then
wrapped it using Collections.sequencedMap. Thanks to this interface, I did not
need a specialized wrapper supporting putLast or pollFirstEntry to manage the
capacity myself. So, wrapping a LinkedHashMap seemed like a viable approach at first.
But, the map is being used by many threads, and a synchronized wrapper is governed
by a single exclusion lock. I improved throughput using a concurrent map
supporting concurrent reads and (some) writes. Here is a simplified example using
a sequenced collection instead of a map.
public record RecentlyAccessed<T>(int capacity, SequencedCollection<T>
elements) {
public RecentlyAccessed(int capacity) {
this(capacity, new ConcurrentLinkedDeque<>());
}
public void add(T elem) {
elements.addLast(elem);
while (capacity < elements.size()) {
elements.removeFirst();
}
}
}
As the add method is not synchronized, the real capacity might deviate from the
intended capacity. Additionally, ConcurrentLinkedDeque can contain duplicate
elements. I could implement repositioning by removing all occurrences of elem
before calling addLast but that would introduce an intermediate state where elem
is not present (even if it was present before) due to the lack of synchronization.
In my current implementation of the cache I use a ConcurrentHashMap with an
additional ConcurrentLinkedDeque holding keys in access order telling me which
entries to remove from the map. But this approach is not really thread-safe due to
missing synchronization and incorrect due to duplicate keys in the deque.
I was hoping to use a ConcurrentLinkedHashSet (or really ConcurrentLinkedHashMap,
but both seem useful), where repositioning would presumably be implemented in a
thread-safe way. Ideally, the option to override removeEldestEntry would allow it
to also maintain the intended capacity in a thread-safe way.
I'm not familiar enough with the implementation of concurrent collections to judge
whether my hope is justified. But essentially, my use case would be an LRU-cache
that does not use a single exclusion lock to improve concurrent access by a large
number of (virtual) threads.
Sebastian
On Fri, Nov 10, 2023 at 1:03 AM Stuart Marks <stuart.ma...@oracle.com> wrote:
Hi Sebastian,
Regarding the lack of "synchronized" wrappers for Sequenced Collections:
the main
issue is that they provide only a limited sense of "thread safety" and as
such
don't
add much actual value. Specifically, the synchronized wrappers hold a lock
only
around invocations of individual methods. This omits a large class of
common,
compound operations on collections, such as check-then-act operations and
bulk
operations that involve iteration or streaming. These operations aren't
thread-safe
with synchronized wrappers. To perform these operations safely from multiple
threads, the client needs to do its own locking around the compound
operation.
This
mostly defeats the purpose of the synchronized wrappers.
If you need to use something like LinkedHashMap/LinkedHashSet from multiple
threads,
my recommendation is to write your own class that exposes exactly the set of
operations you need and that does proper locking for those operations. It
can
then
delegate the appropriate operations to an internal collection.
Regarding concurrent collections: these are mainly intended for operations
to
proceed concurrently on different parts of the collection, such as updating
different mappings in a ConcurrentHashMap simultaneously, or having
producers/consumers operating on different ends of a Queue. I note that
historically, the concurrent collections haven't included concurrent
versions of
LinkedHashSet/Map or general-purpose List implementations. (There is
CopyOnWriteArrayList/Set, but they're highly specialized and can be
prohibitively
expensive for use cases that involve a lot of updates.)
While it looks like things are missing from the concurrent collections, I'd
have to
ask what use cases would be satisfied by adding something like a "concurrent
sequenced set" (or map) that wouldn't be fulfilled with a customized wrapper
around
LinkedHashSet/Map.
s'marks
On 11/2/23 4:56 AM, Sebastian Fischer wrote:
> Hello.
>
> I am new to this list so might have missed previous discussion about this
but
> could not find what I was looking for in the archives.
>
> The Sequenced collections JEP adds the following static methods to
> java.util.Collections.
>
> - unmodifiableSequencedCollection
> - unmodifiableSequencedSet
> - unmodifiableSequencedMap
>
> However, the set of static methods returning thread-safe versions of their
> arguments (like synchronizedCollection, synchronizedSet, and
synchronizedMap) has
> not been extended.
>
> As a consequence, I don't see a straightforward way to create thread-safe
> sequenced sets or maps where encounter-order corresponds to insertion
order
(like
> LinkedHashSet or LinkedHashMap) and access them as SequencedSet or
SequencedMap.
>
> When using the available methods mentioned above, the result is not a
sequenced type.
>
> For predefined sets and maps where encounter order corresponds to natural
order,
> there are the following methods in java.util.Collections.
>
> - synchronizedNavigableSet
> - synchronizedNavigableMap
> - synchronizedSortedSet
> - synchronizedSortedMap
>
> However, these methods cannot be used with LinkedHashSet or LinkedHashMap.
>
> The following methods would be useful in order to create thread-safe
sequenced
> collections where encounter order is insertion order, but they are
currently
not
> provided.
>
> - synchronizedSequencedCollection
> - synchronizedSequencedSet
> - synchronizedSequencedMap
>
> What are the reasons for or against adding these methods?
>
> There are also no implementations of concurrent sequenced sets and maps
where
> encounter order is insertion order in java.util.concurrent. All of the
provided
> concurrent sequenced sets and maps are based on natural order, and
consequently do
> not support addFirst, and so on.
>
> Are there plans to add concurrent sequenced collections that support the
whole
> interface of sequenced collections?
>
> Kind regards,
> Sebastian