Hi,

Earlier this week a few colleagues of mine ran into a concurrency problem in Avro, which is a regression of the fix applied in https://issues.apache.org/jira/browse/AVRO-1760.
I created a new ticket to track this: https://issues.apache.org/jira/browse/AVRO-3713

Summary of the problem: in 2016 this change was made to solve the concurrency problem (too much synchronization/locking):

*Original code:*

    private final Map<Field, Object> defaultValueCache =
        Collections.synchronizedMap(new WeakHashMap<Field, Object>());

*New code (using Guava):*

    private final Map<Field, Object> defaultValueCache =
        new MapMaker().weakKeys().makeMap();

Then in 2018 Guava was dropped.

*New code:*

    private final Map<Field, Object> defaultValueCache =
        Collections.synchronizedMap(new WeakHashMap<>());

So we are back where we started, and the same problem has returned.

Looking at this last code change I found that when Guava was dropped, a ConcurrentHashMap was used in many places. Only in this one place, because of the weak keys, a different setup was used.

I looked back into the version history and this code dates back to 2013 (https://issues.apache.org/jira/browse/AVRO-1228). I found no explanation of why the weak keys are used (I checked Jira, the mailing list, and the code).

My question is simply: *Why use weak keys? Is that really needed?*

If we do not need them, I propose to simply go for a ConcurrentHashMap. If we do need them, I think the fix should probably look something like this. Note that the keys must be wrapped: WeakReference does not override equals/hashCode, so a plain ConcurrentHashMap<WeakReference<K>, V> would never find an entry again. (A real implementation would also need to purge entries whose keys have been garbage collected.)

    private final WeakConcurrentMap<Field, Object> defaultValueCache =
        new WeakConcurrentMap<>();

    private static class WeakConcurrentMap<K, V> {
      private final ConcurrentHashMap<WeakKey<K>, V> map = new ConcurrentHashMap<>();

      V get(K key) { return map.get(new WeakKey<>(key)); }

      V put(K key, V value) { return map.put(new WeakKey<>(key), value); }

      // Lookups must compare by referent identity; WeakReference itself
      // only inherits Object's identity-based equals/hashCode.
      private static final class WeakKey<K> extends WeakReference<K> {
        private final int hash;
        WeakKey(K key) { super(key); this.hash = System.identityHashCode(key); }
        @Override public int hashCode() { return hash; }
        @Override public boolean equals(Object o) {
          return o instanceof WeakKey && ((WeakKey<?>) o).get() == get();
        }
      }
    }

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes
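P.S. A minimal, self-contained sketch (not Avro code; the class name WeakKeyDemo is made up for illustration) of why a plain ConcurrentHashMap keyed on raw WeakReference objects cannot work as a cache: WeakReference inherits Object's identity-based equals/hashCode, so a second reference to the very same key object is a different map key and every lookup misses.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;

public class WeakKeyDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<WeakReference<String>, Integer> map = new ConcurrentHashMap<>();
        String key = "someField";
        map.put(new WeakReference<>(key), 42);

        // A second WeakReference to the very same referent does not equal
        // the one used for put(), so the lookup always misses:
        Integer hit = map.get(new WeakReference<>(key));
        System.out.println(hit == null);  // true: the cached value is unreachable
        System.out.println(map.size());   // 1: the entry is still there, just not findable
    }
}
```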