[
https://issues.apache.org/jira/browse/COLLECTIONS-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
John Griffin updated COLLECTIONS-885:
-------------------------------------
Description:
*Problem:*
When iterating the PatriciaTrie (or a view of it), a
ConcurrentModificationException will be thrown if a different thread creates
another view of it (submap(), etc.) with a key which is not an exact match for
an existing entry in the trie.
These methods
{code:java}
AbstractPatriciaTrie.ceilingEntry()
AbstractPatriciaTrie.floorEntry()
AbstractPatriciaTrie.higherEntry()
AbstractPatriciaTrie.lowerEntry(){code}
use a pattern where they:
# insert a phantom node into the trie with the search key
# increment the modCount
# find the neighbor of that phantom node via nextEntry()/previousEntry()
# remove the phantom node
# increment the modCount again
# then decrement modCount by 2 to hide the 2 modifications (add + remove)
The referenced modCount is used as a fail-fast mechanism in the iterators.
This behaviour is contrary to expectations as multiple threads which are
performing read-only operations on a collection usually can share a collection
without any CME expected. If it was a shared object which can be updated by
another thread a user would ensure that there is some defensive external
locking done to prevent any CMEs.
*Existing Code:*
{code:java}
TrieEntry<K, V> ceilingEntry(final K key) {
...
if (KeyAnalyzer.isValidBitIndex(bitIndex)) {
final TrieEntry<K, V> added = new TrieEntry<>(key, null, bitIndex);
addEntry(added, lengthInBits);
incrementSize(); // must increment because remove will decrement
final TrieEntry<K, V> ceil = nextEntry(added);
removeEntry(added);
modCount -= 2; // we didn't really modify it.
return ceil;
} {code}
*Error log:*
{code:java}
java.util.ConcurrentModificationException
at
org.apache.commons.collections4.trie.AbstractPatriciaTrie$AbstractTrieIterator.nextEntry(AbstractPatriciaTrie.java:261)
at
org.apache.commons.collections4.trie.AbstractPatriciaTrie$TrieMapIterator.nextEntry(AbstractPatriciaTrie.java:1142)
at
org.apache.commons.collections4.trie.AbstractPatriciaTrie$TrieMapIterator.next(AbstractPatriciaTrie.java:1137)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at org.example.ScratchMain.lambda$main$1(ScratchMain.java:73)
at java.lang.Thread.run(Thread.java:750){code}
*Proposed Fix:*
Replace the phantom add/remove with a read-only implementation that:
# Finds the nearest entry via getNearestEntryForKey().
# Determines whether the search key is less than or greater than the found
entry using isBitSet() at the first differing bit.
# Walks forward (nextEntry()) or backward (previousEntry()) to locate the
correct ceiling/floor neighbor.
This prevents any structural changes in these methods and removes the
*{{modCount-=2}}* hack entirely.
*N.B.* The existing *TODO: Cleanup* comments in the code acknowledged this
change is needed.
{code:java}
TrieEntry<K, V> ceilingEntry(final K key) {
// Basically:
// Follow the steps of adding an entry, but instead...
//
// - If we ever encounter a situation where we found an equal
// key, we return it immediately.
//
// - If we hit an empty root, return the first iterable item.
//
// - If we have to add a new item, we temporarily add it,
// find the successor to it, then remove the added item.
//
// These steps ensure that the returned value is either the
// entry for the key itself, or the first entry directly after
// the key.
// TODO: Cleanup so that we don't actually have to add/remove from the
// tree. (We do it here because there are other well-defined
// functions to perform the search.)
...{code}
*Sequence Diagrams*
!patricia-trie-concurrency-safe.png|width=211,height=548!!patricia-trie-concurrency-unsafe.png|width=225,height=541!
was:
*Problem:*
When iterating the PatriciaTrie (or a view of it), a
ConcurrentModificationException will be thrown if a different thread creates
another view of it (submap(), etc.) with a key which is not an exact match for
an existing entry in the trie.
These methods
{code:java}
AbstractPatriciaTrie.ceilingEntry()
AbstractPatriciaTrie.floorEntry()
AbstractPatriciaTrie.higherEntry()
AbstractPatriciaTrie.lowerEntry(){code}
use a pattern where they:
# insert a phantom node into the trie with the search key
# increment the modCount
# find the neighbor of that phantom node via nextEntry()/previousEntry()
# remove the phantom node
# increment the modCount again
# then decrement modCount by 2 to hide the 2 modifications (add + remove)
The referenced modCount is used as a fail-fast mechanism in the iterators.
This behaviour is contrary to expectations as multiple threads which are
performing read-only operations on a collection usually can share a collection
without any CME expected. If it was a shared object which can be updated by
another thread a user would ensure that there is some defensive external
locking done to prevent any CMEs.
*Existing Code:*
{code:java}
TrieEntry<K, V> ceilingEntry(final K key) {
...
if (KeyAnalyzer.isValidBitIndex(bitIndex)) {
final TrieEntry<K, V> added = new TrieEntry<>(key, null, bitIndex);
addEntry(added, lengthInBits);
incrementSize(); // must increment because remove will decrement
final TrieEntry<K, V> ceil = nextEntry(added);
removeEntry(added);
modCount -= 2; // we didn't really modify it.
return ceil;
} {code}
*Error log:*
{code:java}
java.util.ConcurrentModificationException
at
org.apache.commons.collections4.trie.AbstractPatriciaTrie$AbstractTrieIterator.nextEntry(AbstractPatriciaTrie.java:261)
at
org.apache.commons.collections4.trie.AbstractPatriciaTrie$TrieMapIterator.nextEntry(AbstractPatriciaTrie.java:1142)
at
org.apache.commons.collections4.trie.AbstractPatriciaTrie$TrieMapIterator.next(AbstractPatriciaTrie.java:1137)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at org.example.ScratchMain.lambda$main$1(ScratchMain.java:73)
at java.lang.Thread.run(Thread.java:750){code}
*Proposed Fix:*
Replace the phantom add/remove with a read-only implementation that:
# Finds the nearest entry via getNearestEntryForKey().
# Determines whether the search key is less than or greater than the found
entry using isBitSet() at the first differing bit.
# Walks forward (nextEntry()) or backward (previousEntry()) to locate the
correct ceiling/floor neighbor.
This prevents any structural changes in these methods and removes the
*{{modCount-=2}}* hack entirely.
*N.B.* The existing *TODO: Cleanup* comments in the code acknowledged this
change is needed.
{code:java}
TrieEntry<K, V> ceilingEntry(final K key) {
// Basically:
// Follow the steps of adding an entry, but instead...
//
// - If we ever encounter a situation where we found an equal
// key, we return it immediately.
//
// - If we hit an empty root, return the first iterable item.
//
// - If we have to add a new item, we temporarily add it,
// find the successor to it, then remove the added item.
//
// These steps ensure that the returned value is either the
// entry for the key itself, or the first entry directly after
// the key.
// TODO: Cleanup so that we don't actually have to add/remove from the
// tree. (We do it here because there are other well-defined
// functions to perform the search.)
...{code}
> PatriciaTrie submap and range view operations silently mutate trie structure
> resulting in ConcurrentModificationException for any iterating threads
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: COLLECTIONS-885
> URL: https://issues.apache.org/jira/browse/COLLECTIONS-885
> Project: Commons Collections
> Issue Type: Bug
> Components: Collection, Map
> Affects Versions: 4.5.0
> Reporter: John Griffin
> Priority: Minor
> Attachments: patricia-trie-concurrency-safe.png,
> patricia-trie-concurrency-unsafe.png
>
>
> *Problem:*
> When iterating the PatriciaTrie (or a view of it), a
> ConcurrentModificationException will be thrown if a different thread creates
> another view of it (submap(), etc.) with a key which is not an exact match
> for an existing entry in the trie.
> These methods
> {code:java}
> AbstractPatriciaTrie.ceilingEntry()
> AbstractPatriciaTrie.floorEntry()
> AbstractPatriciaTrie.higherEntry()
> AbstractPatriciaTrie.lowerEntry(){code}
> use a pattern where they:
> # insert a phantom node into the trie with the search key
> # increment the modCount
> # find the neighbor of that phantom node via nextEntry()/previousEntry()
> # remove the phantom node
> # increment the modCount again
> # then decrement modCount by 2 to hide the 2 modifications (add + remove)
> The referenced modCount is used as a fail-fast mechanism in the iterators.
> This behaviour is contrary to expectations as multiple threads which are
> performing read-only operations on a collection usually can share a
> collection without any CME expected. If it was a shared object which can be
> updated by another thread a user would ensure that there is some defensive
> external locking done to prevent any CMEs.
> *Existing Code:*
> {code:java}
> TrieEntry<K, V> ceilingEntry(final K key) {
> ...
> if (KeyAnalyzer.isValidBitIndex(bitIndex)) {
> final TrieEntry<K, V> added = new TrieEntry<>(key, null, bitIndex);
> addEntry(added, lengthInBits);
> incrementSize(); // must increment because remove will decrement
> final TrieEntry<K, V> ceil = nextEntry(added);
> removeEntry(added);
> modCount -= 2; // we didn't really modify it.
> return ceil;
> } {code}
> *Error log:*
> {code:java}
> java.util.ConcurrentModificationException
> at
> org.apache.commons.collections4.trie.AbstractPatriciaTrie$AbstractTrieIterator.nextEntry(AbstractPatriciaTrie.java:261)
> at
> org.apache.commons.collections4.trie.AbstractPatriciaTrie$TrieMapIterator.nextEntry(AbstractPatriciaTrie.java:1142)
> at
> org.apache.commons.collections4.trie.AbstractPatriciaTrie$TrieMapIterator.next(AbstractPatriciaTrie.java:1137)
> at java.util.Iterator.forEachRemaining(Iterator.java:116)
> at org.example.ScratchMain.lambda$main$1(ScratchMain.java:73)
> at java.lang.Thread.run(Thread.java:750){code}
>
> *Proposed Fix:*
> Replace the phantom add/remove with a read-only implementation that:
> # Finds the nearest entry via getNearestEntryForKey().
> # Determines whether the search key is less than or greater than the found
> entry using isBitSet() at the first differing bit.
> # Walks forward (nextEntry()) or backward (previousEntry()) to locate the
> correct ceiling/floor neighbor.
> This prevents any structural changes in these methods and removes the
> *{{modCount-=2}}* hack entirely.
> *N.B.* The existing *TODO: Cleanup* comments in the code acknowledged this
> change is needed.
> {code:java}
> TrieEntry<K, V> ceilingEntry(final K key) {
> // Basically:
> // Follow the steps of adding an entry, but instead...
> //
> // - If we ever encounter a situation where we found an equal
> // key, we return it immediately.
> //
> // - If we hit an empty root, return the first iterable item.
> //
> // - If we have to add a new item, we temporarily add it,
> // find the successor to it, then remove the added item.
> //
> // These steps ensure that the returned value is either the
> // entry for the key itself, or the first entry directly after
> // the key.
> // TODO: Cleanup so that we don't actually have to add/remove from the
> // tree. (We do it here because there are other well-defined
> // functions to perform the search.)
> ...{code}
> *Sequence Diagrams*
> !patricia-trie-concurrency-safe.png|width=211,height=548!!patricia-trie-concurrency-unsafe.png|width=225,height=541!
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)