[ https://issues.apache.org/jira/browse/ARROW-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17662282#comment-17662282 ]
Rok Mihevc commented on ARROW-5259:
-----------------------------------

This issue has been migrated to [issue #21730|https://github.com/apache/arrow/issues/21730] on GitHub. Please see the [migration documentation|https://github.com/apache/arrow/issues/14542] for further details.

> [Java] Add option for ValueVector to allocate buffers with actual size
> ----------------------------------------------------------------------
>
>                 Key: ARROW-5259
>                 URL: https://issues.apache.org/jira/browse/ARROW-5259
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: Java
>            Reporter: Ji Liu
>            Assignee: Ji Liu
>            Priority: Minor
>
> Currently, _BaseValueVector#computeCombinedBufferSize_ calculates the
> buffer size from _valueCount_ and _typeWidth_ and then allocates memory
> for dataBuffer and validityBuffer. However, it always allocates more
> memory than the actual size whenever that size is not already a power of
> 2, because it calls _BaseAllocator.nextPowerOfTwo(bufferSize)_.
> For example, an IntVector with valueCount = 1025 allocates buffers of
> size 8192, so memory usage is almost double what is actually needed. In
> some cases there is enough memory for the actual data, but an OOM is
> thrown once the requested size is rounded up to the next power of 2. I
> think this problem is entirely avoidable.
> Is it feasible to add an option for ValueVector to allocate the actual
> buffer size, rather than rounding it up to the next power of 2, to reduce
> memory allocation?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
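
To make the arithmetic in the description concrete, here is a minimal sketch of the rounding effect for the IntVector example. It does not call Arrow: the class BufferSizeSketch and both helper methods are hypothetical stand-ins. It assumes 4 bytes per value and one validity bit per value, and ignores any alignment padding the real implementation may apply.

```java
class BufferSizeSketch {

    // Hypothetical stand-in for the power-of-two rounding step named in the
    // issue; this is not the Arrow implementation.
    static long nextPowerOfTwo(long value) {
        if (value <= 1) {
            return 1;
        }
        long highest = Long.highestOneBit(value);
        // Already a power of two: keep it. Otherwise round up.
        return highest == value ? value : highest << 1;
    }

    // Bytes needed for the data buffer plus one validity bit per value
    // (assumption: no extra alignment padding).
    static long actualSize(long valueCount, long typeWidth) {
        long dataBytes = valueCount * typeWidth;
        long validityBytes = (valueCount + 7) / 8;
        return dataBytes + validityBytes;
    }

    public static void main(String[] args) {
        long valueCount = 1025;
        long typeWidth = 4; // an IntVector stores 4-byte values

        long actual = actualSize(valueCount, typeWidth);
        long rounded = nextPowerOfTwo(actual);

        System.out.println("actual bytes:  " + actual);
        System.out.println("rounded bytes: " + rounded);
    }
}
```

Under these assumptions the sketch reports 4229 bytes actually needed and 8192 bytes after rounding, which matches the "almost double" figure in the description.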