Dear all, When releasing an ArrowBuf, we will run the following piece of code:
private int decrement(int decrement) { allocator.assertOpen(); final int outcome; synchronized (allocationManager) { outcome = bufRefCnt.addAndGet(-decrement); if (outcome == 0) { lDestructionTime = System.nanoTime(); allocationManager.release(this); } } return outcome; } It can be seen that we need to acquire the lock for allocation manager lock, no matter if we need to release the buffer. In addition, the operation of decrementing refcount is only carried out after the lock is acquired. This leads to unnecessary lock contention, and may degrade performance. We propose to change the code like this: private int decrement(int decrement) { allocator.assertOpen(); final int outcome; outcome = bufRefCnt.addAndGet(-decrement); if (outcome == 0) { lDestructionTime = System.nanoTime(); synchronized (allocationManager) { allocationManager.release(this); } } return outcome; } Note that this change can be dangerous, as it lies in the core of our code base, so we should be careful with it. On the other hand, it may have non-trivial performance implication. For example, when a distributed task is getting closed, a large number of ArrowBuf will be closed simultaneously. If we reduce the range of the synchronization block, we can significantly improve the performance. Would you please give your valueable feedback? Best, Liya Fan