On Tue, 9 Mar 2021 13:57:24 GMT, Lin Zang <[email protected]> wrote:
>> Update a new patch that reduce the memory consumption issue.
>> As mentioned in the CR, there is internal buffer used for segmented heap
>> dump. The dumped object data firstly write into this buffer and then
>> flush() when the size is known. when the internal buffer is full, the
>> current implementation do:
>>
>> - new a larger buffer, copy data from old buffer into the new one, and then
>> use it as the internal buffer.
>>
>> This could cause large memory consumption because the old buffer data are
>> copied, and also the old buffer can not be "free" until next GC.
>>
>> For example, if the internel buffer's length is 1MB, when it is full, a new
>> 2MB buffer is allocated so there is actually 3MB memory taken (old buffer +
>> new buffer). And in this case, for the ~4GB large array, it keeps generating
>> new buffers and do copying, which takes both CPU and memory intensively and
>> cause the timeout issue.
>>
>> This patch optimize it by creating a array list of byte[]. when old buffer
>> is full, it saved into the list and the new one is created and used as the
>> internal buffer. In this case, the above example takes 2MB(1MB for old,
>> saved in the list; and 1MB for the new buffer)
>>
>> Together with the "write through" mode introduced in this PR, by which all
>> arrays are write through to underlying stream and hence no extra buffer
>> requried. The PR could help fix the LargeArray issue and also save memory.
>>
>> Thanks!
>> Lin
>
> As discussed in CR https://bugs.openjdk.java.net/browse/JDK-8262386, the
> byte[] list is much more like an optimization. Revert it in the PR and I will
> create a separate CR and PR for it.
>
> Thanks,
> Lin
Hi Lin,
One concern I have is the naming used in the fix.
First, the term write-through you use is unusual and confusing.
Would it better to say 'direct or unbuffered output' instead?
612 // Now the total size of data to dump is known and can be filled
to segment header.
613 // Enable write-through mode to avoid internal buffer copies.
614 if (useSegmentedHeapDump) {
615 long longBytes = length * typeSize + headerSize;
616 int bytesToWrite = (int) (longBytes);
617
hprofBufferedOut.fillSegmentSizeAndEnableWriteThrough(bytesToWrite);
618 }
Why 'longBytes' ? The scope of this local variable is very short, and it is
clear that its type is long.
Why not something like 'size' or 'totalSize' ?
There is no need in parentheses around the 'longBytes' at L616.
Also, there is no need to for one more local variable 'bytesToWrite'.
Something like below is more clear:
int size = (int)(length * typeSize + headerSize);
I agree with Chris, it is confusing why the decision about write-trough mode is
made in this method.
The method fillSegmentSizeAndEnableWriteThrough() is called for any array if
useSegmentedHeapDump == true,
so any segmented output for arrays is direct (in write-through mode).
Is this new dimension really needs to be introduced?
Thanks,
Serguei
-------------
PR: https://git.openjdk.java.net/jdk/pull/2803