Ah, so you want a concurrent cache! This is why it's important to talk about use cases. Turns out there are a lot of subtle issues here, and there has been a lot of work in this area outside the JDK. For example, a bit of searching turned up this Stack Overflow question [1]. The answers aren't particularly valuable, but there is this comment [2] from Ben Manes, who knows something about this topic:

ConcurrentLinkedHashMap [3] begat Guava's Cache [4] which begat Caffeine [5] [6].

[1]: https://stackoverflow.com/questions/40239485/concurrent-lru-cache-implementation

[2]: https://stackoverflow.com/questions/40239485/concurrent-lru-cache-implementation#comment67809840_40239485

[3]: https://github.com/ben-manes/concurrentlinkedhashmap/wiki/Design

[4]: https://github.com/ben-manes/concurrentlinkedhashmap/blob/wiki/ConcurrentCachingAtGoogle.pdf

[5]: http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html

[6]: https://github.com/ben-manes/caffeine

It looks like ConcurrentLinkedHashMap is a concurrent hash map that maintains ordering, but it doesn't maintain a linked list of entries the way LinkedHashMap does. I haven't looked closely at the implementation, but I'd guess that it maintains ordering using a weakly consistent side data structure.

Weak consistency seems to be the flavor of most concurrent data structures; you give up non-essential consistency in order to gain concurrency. For example, ConcurrentHashMap's views' iterators are "weakly consistent" which roughly means that an element reported by the iterator was a member at one time, but elements might not be reported by the iterator if they were added or removed sometime during the iteration. For an LRU cache, weaker semantics might be that you don't care about the exact ordering of recency of usage. (Indeed, ordering of events is somewhat ill-defined in a concurrent system.) Instead, you probably want things that were used fairly recently to stay in the cache longer, while things that haven't been used recently should tend to expire earlier.
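To make the weak-consistency point concrete, here's a small JDK-only sketch (the class and method names are mine, purely illustrative): iterating a ConcurrentHashMap view while mutating the map neither throws ConcurrentModificationException nor promises to reflect the concurrent updates.

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative demo (not from the thread): ConcurrentHashMap's view iterators
// are weakly consistent, so mutating the map mid-iteration is safe.
class WeaklyConsistentDemo {
    static int iterateWhileMutating() {
        var map = new ConcurrentHashMap<Integer, String>();
        for (int i = 0; i < 8; i++) map.put(i, "v" + i);

        int seen = 0;
        for (Integer key : map.keySet()) {
            seen++;
            // A plain HashMap iterator would throw ConcurrentModificationException here.
            map.put(100 + seen, "added-during-iteration");
        }
        // Elements present for the whole iteration are reported; the entries
        // added mid-iteration may or may not be, so 'seen' is at least 8 but
        // deliberately unspecified beyond that.
        return seen;
    }
}
```

The same "at some point at or since creation" semantics apply to the map's other views and to its spliterators.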

I'd suggest looking at some of these caching libraries. Or if you're keen to roll your own, read some of the design documents. Which might convince you that you don't want to roll your own. :-)
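For reference, a hand-rolled cache along the lines discussed earlier in the thread might look like the following sketch (JDK-only; the class and method names are illustrative). Note that it deliberately uses a single lock around each operation, so it's the baseline the concurrent designs above are trying to beat, not a solution to the throughput problem:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch, not a library API: an LRU cache that delegates to a
// LinkedHashMap in access order and does its own locking around each operation.
class SimpleLruCache<K, V> {
    private final LinkedHashMap<K, V> map;
    private final Object lock = new Object();

    SimpleLruCache(int capacity) {
        // accessOrder = true: get() and put() move the accessed entry to the end
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;  // evict the least recently used entry
            }
        };
    }

    V get(K key) {
        // Even get() mutates the ordering in access-order mode, so it must lock.
        synchronized (lock) { return map.get(key); }
    }

    void put(K key, V value) {
        synchronized (lock) { map.put(key, value); }
    }

    int size() {
        synchronized (lock) { return map.size(); }
    }
}
```

The point of wrapping rather than subclassing is that only the operations you expose need to be made safe, and each one takes the lock exactly once.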

s'marks


On 11/10/23 3:26 AM, Sebastian Fischer wrote:
Hi Stuart.

Thanks, I understand now how the limited thread safety of "synchronized" wrappers has led to not adding new ones.

My use case is an LRU cache that is accessed by a large number of (virtual) threads.

In my initial implementation I constructed a LinkedHashMap passing true for the accessOrder parameter, implemented removeEldestEntry appropriately and then wrapped it using Collections.sequencedMap. Thanks to this interface, I did not need a specialized wrapper supporting putLast or pollFirstEntry to manage the capacity myself. So, wrapping a LinkedHashMap seemed like a viable approach at first.
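Concretely, that initial construction might look something like this sketch (the key/value types and capacity are illustrative): LinkedHashMap itself implements SequencedMap as of JDK 21, and overriding removeEldestEntry lets put() handle eviction, so no manual pollFirstEntry is needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SequencedMap;

// Illustrative sketch of the construction described above (not thread-safe).
class AccessOrderLru {
    static SequencedMap<String, Integer> create(int capacity) {
        // accessOrder = true puts the map in access order; removeEldestEntry
        // keeps the size bounded on every insertion.
        return new LinkedHashMap<String, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > capacity;
            }
        };
    }
}
```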

But, the map is being used by many threads, and a synchronized wrapper is governed by a single exclusion lock. I improved throughput using a concurrent map supporting concurrent reads and (some) writes. Here is a simplified example using a sequenced collection instead of a map.

    public record RecentlyAccessed<T>(int capacity, SequencedCollection<T> elements) {
        public RecentlyAccessed(int capacity) {
            this(capacity, new ConcurrentLinkedDeque<>());
        }

        public void add(T elem) {
            elements.addLast(elem);
            while (capacity < elements.size()) {
                elements.removeFirst();
            }
        }
    }

As the add method is not synchronized, the real capacity might deviate from the intended capacity. Additionally, ConcurrentLinkedDeque can contain duplicate elements. I could implement repositioning by removing all occurrences of elem before calling addLast but that would introduce an intermediate state where elem is not present (even if it was present before) due to the lack of synchronization.

In my current implementation of the cache I use a ConcurrentHashMap with an additional ConcurrentLinkedDeque holding keys in access order telling me which entries to remove from the map. But this approach is not really thread-safe due to missing synchronization and incorrect due to duplicate keys in the deque.

I was hoping to use a ConcurrentLinkedHashSet (or really ConcurrentLinkedHashMap, but both seem useful), where repositioning would presumably be implemented in a thread-safe way. Ideally, the option to override removeEldestEntry would allow it to also maintain the intended capacity in a thread-safe way.

I'm not familiar enough with the implementation of concurrent collections to judge whether my hope is justified. But essentially, my use case is an LRU cache that avoids a single exclusion lock in order to improve concurrent access by a large number of (virtual) threads.

Sebastian



On Fri, Nov 10, 2023 at 1:03 AM Stuart Marks <stuart.ma...@oracle.com> wrote:

    Hi Sebastian,

    Regarding the lack of "synchronized" wrappers for Sequenced Collections: the
    main issue is that they provide only a limited sense of "thread safety" and
    as such don't add much actual value. Specifically, the synchronized wrappers
    hold a lock only around invocations of individual methods. This omits a
    large class of common, compound operations on collections, such as
    check-then-act operations and bulk operations that involve iteration or
    streaming. These operations aren't thread-safe with synchronized wrappers.
    To perform these operations safely from multiple threads, the client needs
    to do its own locking around the compound operation. This mostly defeats the
    purpose of the synchronized wrappers.

    If you need to use something like LinkedHashMap/LinkedHashSet from multiple
    threads, my recommendation is to write your own class that exposes exactly
    the set of operations you need and that does proper locking for those
    operations. It can then delegate the appropriate operations to an internal
    collection.

    Regarding concurrent collections: these are mainly intended for operations
    to proceed concurrently on different parts of the collection, such as
    updating different mappings in a ConcurrentHashMap simultaneously, or having
    producers/consumers operating on different ends of a Queue. I note that
    historically, the concurrent collections haven't included concurrent
    versions of LinkedHashSet/Map or general-purpose List implementations.
    (There is CopyOnWriteArrayList/Set, but they're highly specialized and can
    be prohibitively expensive for use cases that involve a lot of updates.)

    While it looks like things are missing from the concurrent collections, I'd
    have to ask what use cases would be satisfied by adding something like a
    "concurrent sequenced set" (or map) that wouldn't be fulfilled with a
    customized wrapper around LinkedHashSet/Map.

    s'marks


    On 11/2/23 4:56 AM, Sebastian Fischer wrote:
    > Hello.
    >
    > I am new to this list so might have missed previous discussion about this
    > but could not find what I was looking for in the archives.
    >
    > The Sequenced Collections JEP adds the following static methods to
    > java.util.Collections.
    >
    > - unmodifiableSequencedCollection
    > - unmodifiableSequencedSet
    > - unmodifiableSequencedMap
    >
    > However, the set of static methods returning thread-safe versions of
    > their arguments (like synchronizedCollection, synchronizedSet, and
    > synchronizedMap) has not been extended.
    >
    > As a consequence, I don't see a straightforward way to create thread-safe
    > sequenced sets or maps where encounter order corresponds to insertion
    > order (like LinkedHashSet or LinkedHashMap) and access them as
    > SequencedSet or SequencedMap.
    >
    > When using the available methods mentioned above, the result is not a
    > sequenced type.
    >
    > For predefined sets and maps where encounter order corresponds to natural
    > order, there are the following methods in java.util.Collections.
    >
    > - synchronizedNavigableSet
    > - synchronizedNavigableMap
    > - synchronizedSortedSet
    > - synchronizedSortedMap
    >
    > However, these methods cannot be used with LinkedHashSet or LinkedHashMap.
    >
    > The following methods would be useful in order to create thread-safe
    > sequenced collections where encounter order is insertion order, but they
    > are currently not provided.
    >
    > - synchronizedSequencedCollection
    > - synchronizedSequencedSet
    > - synchronizedSequencedMap
    >
    > What are the reasons for or against adding these methods?
    >
    > There are also no implementations of concurrent sequenced sets and maps
    > where encounter order is insertion order in java.util.concurrent. All of
    > the provided concurrent sequenced sets and maps are based on natural
    > order, and consequently do not support addFirst, and so on.
    >
    > Are there plans to add concurrent sequenced collections that support the
    > whole interface of sequenced collections?
    >
    > Kind regards,
    > Sebastian
