Hi,

Earlier this week a few colleagues of mine ran into a concurrency problem in Avro, which is a regression of the fix applied in https://issues.apache.org/jira/browse/AVRO-1760.
I created a new ticket to track this: https://issues.apache.org/jira/browse/AVRO-3713

Summary of the problem: in 2016 this change was made to solve the concurrency problem (too much synchronization/locking):

*Original code:*

    private final Map<Field, Object> defaultValueCache =
        Collections.synchronizedMap(new WeakHashMap<Field, Object>());

*New code (using Guava):*

    private final Map<Field, Object> defaultValueCache =
        new MapMaker().weakKeys().makeMap();

Then in 2018 Guava was dropped.

*New code:*

    private final Map<Field, Object> defaultValueCache =
        Collections.synchronizedMap(new WeakHashMap<>());

So we are back where we started, and the same problem has returned.

Looking at this last code change I found that when Guava was dropped, a ConcurrentHashMap was used in many places. Only in this one place, because of the weak keys, a different setup was used.

I looked back into the version history and this code dates back to 2013 (https://issues.apache.org/jira/browse/AVRO-1228). I found no explanation of why the weak keys are used (I checked Jira, the mailing list, and the code).

My question is simply: *Why use weak keys? Is that really needed?*

If we do not need them, I propose to simply go for a ConcurrentHashMap. If we do need them, I think the fix should probably look something like this. Note that the keys must be wrapped: WeakReference does not override equals/hashCode, so a plain ConcurrentHashMap<WeakReference<K>, V> would never find an entry again. (A real implementation would also need to purge entries whose keys have been garbage collected.)

    private final WeakConcurrentMap<Field, Object> defaultValueCache =
        new WeakConcurrentMap<>();

    private static class WeakConcurrentMap<K, V> {
      private final ConcurrentHashMap<WeakKey<K>, V> map = new ConcurrentHashMap<>();

      V get(K key) { return map.get(new WeakKey<>(key)); }

      V put(K key, V value) { return map.put(new WeakKey<>(key), value); }

      // Lookups must compare by referent identity; WeakReference itself
      // only inherits Object's identity-based equals/hashCode.
      private static final class WeakKey<K> extends WeakReference<K> {
        private final int hash;
        WeakKey(K key) { super(key); this.hash = System.identityHashCode(key); }
        @Override public int hashCode() { return hash; }
        @Override public boolean equals(Object o) {
          return o instanceof WeakKey && ((WeakKey<?>) o).get() == get();
        }
      }
    }

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes
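P.S. A minimal, self-contained sketch (not Avro code; the class name WeakKeyDemo is made up for illustration) of why a plain ConcurrentHashMap keyed on raw WeakReference objects cannot work as a cache: WeakReference inherits Object's identity-based equals/hashCode, so a second reference to the very same key object is a different map key and every lookup misses.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;

public class WeakKeyDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<WeakReference<String>, Integer> map = new ConcurrentHashMap<>();
        String key = "someField";
        map.put(new WeakReference<>(key), 42);

        // A second WeakReference to the very same referent does not equal
        // the one used for put(), so the lookup always misses:
        Integer hit = map.get(new WeakReference<>(key));
        System.out.println(hit == null);  // true: the cached value is unreachable
        System.out.println(map.size());   // 1: the entry is still there, just not findable
    }
}
```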