Re: RFR: 8304020: Speed up test/jdk/java/util/zip/ZipFile/TestTooManyEntries.java and clarify its purpose [v7]

Lance Andersen Mon, 30 Oct 2023 07:33:38 -0700

On Mon, 30 Oct 2023 11:57:09 GMT, Eirik Bjorsnos <d...@openjdk.org> wrote:


>> Please review this PR which speeds up TestTooManyEntries and clarifies its 
>> purpose:
>> 
>> - The name 'TestTooManyEntries' does not clearly convey the purpose of the 
>> test. What is tested is the validation that the total CEN size fits in a 
>> Java byte array. Suggested rename: CenSizeTooLarge
>> - The test creates DEFLATED entries which incurs zlib costs and File Data / 
>> Data Descriptors for no additional benefit. We can use STORED instead.
>> - By creating a single LocalDateTime and setting it with 
>> `ZipEntry.setTimeLocal`, we can avoid repeated time zone calculations. 
>> - Entry names are generated by calling UUID.randomUUID; we could use a 
>> simple counter instead.
>> - The produced file is unnecessarily large. We know how large a CEN entry 
>> is, let's take advantage of that to create a file with the minimal size.
>> - By adding a maximally large extra field to the CEN entries, we get away 
>> with fewer CEN records and save memory.
>> - The summary and comments of the test can be improved to help explain the 
>> purpose of the test and how we reach the limit being tested.
>> - By writing sparse 'holes' until the last CEN entry, we can reduce required 
>> disk space.
>> 
>> These speedups reduced the runtime from 4 min 17 sec to 3 seconds on my 
>> Macbook Pro. The produced ZIP size was reduced from 5.7 GB to ~4K. Memory 
>> consumption is down from 8GB to something like 12MB.
>
> Eirik Bjorsnos has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains 14 additional 
> commits since the last revision:
> 
>  - Run CenSizeTooLarge automatically, since it no longer requires excessive 
> memory or disk space
>  - Use a SparseOutputStream to reduce required diskspace from 2GB to 4K. 
> SparseOutputStream writes 'holes' instead of actual bytes until the final CEN 
> is written.
>  - TestTooManyEntries is renamed to CenSizeTooLarge
>  - Merge branch 'master' into cen-size-too-large
>  - MAX_EXTRA_FIELD_SIZE can be better expressed as 0xFFFF
>  - Bring back '@requires sun.arch.data.model == 64' for now
>  - Spell 'specified' correctly
>  - Give test method a long, meaningful name
>  - Remove blank line
>  - Make CEN headers maximally large such that we need fewer of them and save 
> memory
>  - ... and 4 more: https://git.openjdk.org/jdk/compare/f5d01b56...9629b8d2
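
For context on why maximally large CEN headers reduce the number of records needed, here is a rough back-of-the-envelope sketch. It assumes the 46-byte fixed CEN header size from the ZIP format and the 0xFFFF extra field limit mentioned above. The class and method names are hypothetical, and Integer.MAX_VALUE is used only as a stand-in for the byte array limit; the exact limit ZipFile enforces may differ slightly.

```java
class CenSizeSketch {
    // Fixed portion of a central directory (CEN) header in the ZIP format.
    static final int CEN_FIXED_HEADER = 46;
    // Largest extra field a 16-bit length can describe.
    static final int MAX_EXTRA_FIELD_SIZE = 0xFFFF;

    /** Size in bytes of one CEN record with the given variable-length parts. */
    static long cenEntrySize(int nameLength, int extraLength, int commentLength) {
        return (long) CEN_FIXED_HEADER + nameLength + extraLength + commentLength;
    }

    /** Smallest number of equally sized records whose total size exceeds the limit. */
    static long entriesToExceed(long entrySize, long limit) {
        return limit / entrySize + 1;
    }

    public static void main(String[] args) {
        // Assumption: Integer.MAX_VALUE approximates the largest Java byte array.
        long limit = Integer.MAX_VALUE;

        // A record with a 36-character name (the length of a UUID string) and no extra field.
        long small = cenEntrySize(36, 0, 0);
        // A record with an empty name and a maximally large extra field.
        long large = cenEntrySize(0, MAX_EXTRA_FIELD_SIZE, 0);

        System.out.println("small records needed: " + entriesToExceed(small, limit));
        System.out.println("large records needed: " + entriesToExceed(large, limit));
    }
}
```

The larger each record is, the fewer ZipEntry objects the test has to create before the total CEN size crosses the limit.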

Thanks for restarting the discussion on this test cleanup.

Please see my comments below.

test/jdk/java/util/zip/ZipFile/CenSizeTooLarge.java line 33:

> 31: import org.testng.annotations.BeforeTest;
> 32: import org.testng.annotations.Test;
> 33: 

As this will be a new test, could you please consider converting it to JUnit?

test/jdk/java/util/zip/ZipFile/CenSizeTooLarge.java line 139:

> 137:     public void centralDirectoryTooLargeToFitInByteArray() {
> 138:         ZipException ex = expectThrows(ZipException.class, () -> new ZipFile(hugeZipFile));
> 139:         assertEquals(ex.getMessage(), "invalid END header (central directory size too large)");

Could we make the expected message a class constant, so that if this message 
ever changes it is easier to find and update?
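
As a minimal sketch of what I mean (the class name, constant name, and the `openHugeZip` stand-in are hypothetical and not taken from the PR), the message could live in one place and the assertion would refer to it:

```java
import java.util.zip.ZipException;

class ExpectedMessageSketch {
    // Expected message kept in a single constant, so a future change
    // to the wording only needs to be made here.
    static final String CEN_TOO_LARGE_MESSAGE =
            "invalid END header (central directory size too large)";

    /**
     * Hypothetical stand-in for `new ZipFile(hugeZipFile)`. It simulates the
     * failure the real call is expected to produce, so this sketch runs
     * without creating a huge ZIP file.
     */
    static void openHugeZip() throws ZipException {
        throw new ZipException("invalid END header (central directory size too large)");
    }

    /** Returns true only if opening fails with exactly the expected message. */
    static boolean failsWithExpectedMessage() {
        try {
            openHugeZip();
            return false; // no exception: the validation did not trigger
        } catch (ZipException e) {
            return CEN_TOO_LARGE_MESSAGE.equals(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println("matches expected message: " + failsWithExpectedMessage());
    }
}
```

In the actual test, the assertion on line 139 would simply compare `ex.getMessage()` against the constant.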

test/jdk/java/util/zip/ZipFile/CenSizeTooLarge.java line 187:

> 185:             } else {
> 186:                 channel.position(position);
> 187:                 if (Arrays.equals(LAST_COMMENT_BYTES, 0, LAST_COMMENT_BYTES.length, b, off, len)) {

I would suggest adding a comment here to help future developers. I realize 
there are some earlier breadcrumbs, but I think this would be beneficial.
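
To illustrate the kind of commentary I have in mind, here is a simplified sketch of the hole-writing idea as I understand it from the commit description. This is not the PR's SparseOutputStream; the class names, the marker bytes, and the `writeWithHole` helper are all hypothetical. Whether the skipped region actually becomes a hole on disk depends on the file system.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

class SparseWriteSketch {
    // Hypothetical marker; stands in for the comment bytes of the last CEN entry.
    static final byte[] MARKER = "last-entry".getBytes(StandardCharsets.UTF_8);

    /** Skips written bytes (leaving a hole) until MARKER is seen, then writes for real. */
    static final class HoleSkippingStream extends OutputStream {
        private final FileChannel channel;
        private long position;            // logical offset of the next byte
        private boolean skipping = true;  // true while we are still producing the hole

        HoleSkippingStream(FileChannel channel) {
            this.channel = channel;
        }

        @Override
        public void write(int b) throws IOException {
            write(new byte[] {(byte) b}, 0, 1);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            // The range overload of Arrays.equals takes an exclusive END index,
            // so the compared range of b is [off, off + len).
            if (skipping && Arrays.equals(MARKER, 0, MARKER.length, b, off, off + len)) {
                // Marker reached: jump past the hole and write real bytes from here on.
                skipping = false;
                channel.position(position);
            }
            if (!skipping) {
                ByteBuffer buf = ByteBuffer.wrap(b, off, len);
                while (buf.hasRemaining()) {
                    channel.write(buf);
                }
            }
            position += len;
        }

        @Override
        public void close() throws IOException {
            channel.close();
        }
    }

    /** Writes filler chunks (skipped) followed by MARKER; returns the logical file size. */
    static long writeWithHole(Path file, int fillerChunks, int chunkSize) throws IOException {
        Files.deleteIfExists(file);
        // SPARSE is only a hint and is honored only together with CREATE_NEW.
        FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE_NEW,
                StandardOpenOption.WRITE, StandardOpenOption.SPARSE);
        try (OutputStream out = new HoleSkippingStream(ch)) {
            byte[] filler = new byte[chunkSize];
            for (int i = 0; i < fillerChunks; i++) {
                out.write(filler);
            }
            out.write(MARKER);
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sparse-sketch", ".bin");
        System.out.println("logical size: " + writeWithHole(file, 1024, 1024) + " bytes");
        Files.delete(file);
    }
}
```

A short comment at the marker check, along the lines of the one in the sketch, would make it clear why the stream switches from skipping to writing at that point.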

-------------

PR Review: https://git.openjdk.org/jdk/pull/12991#pullrequestreview-1704236482
PR Review Comment: https://git.openjdk.org/jdk/pull/12991#discussion_r1376286075
PR Review Comment: https://git.openjdk.org/jdk/pull/12991#discussion_r1376295495
PR Review Comment: https://git.openjdk.org/jdk/pull/12991#discussion_r1376302371
