On Mon, 30 Oct 2023 11:57:09 GMT, Eirik Bjorsnos <d...@openjdk.org> wrote:
>> Please review this PR which speeds up TestTooManyEntries and clarifies its
>> purpose:
>>
>> - The name 'TestTooManyEntries' does not clearly convey the purpose of the
>> test. What is tested is the validation that the total CEN size fits in a
>> Java byte array. Suggested rename: CenSizeTooLarge
>> - The test creates DEFLATED entries which incurs zlib costs and File Data /
>> Data Descriptors for no additional benefit. We can use STORED instead.
>> - By creating a single LocalDateTime and setting it with
>> `ZipEntry.setTimeLocal`, we can avoid repeated time zone calculations.
>> - The name of entries is generated by calling UUID.randomUUID, we could use
>> simple counter instead.
>> - The produced file is unnecessarily large. We know how large a CEN entry
>> is, let's take advantage of that to create a file with the minimal size.
>> - By adding a maximally large extra field to the CEN entries, we get away
>> with fewer CEN records and save memory
>> - The summary and comments of the test can be improved to help explain the
>> purpose of the test and how we reach the limit being tested.
>> - By writing sparse 'holes' until the last CEN entry, we can reduce required
>> disk space.
>>
>> These speedups reduced the runtime from 4 min 17 sec to 3 seconds on my
>> Macbook Pro. The produced ZIP size was reduced from 5.7 GB to ~4K. Memory
>> consumption is down from 8GB to something like 12MB.
>
> Eirik Bjorsnos has updated the pull request with a new target base due to a
> merge or a rebase. The incremental webrev excludes the unrelated changes
> brought in by the merge/rebase. The pull request contains 14 additional
> commits since the last revision:
>
> - Run CenSizeTooLarge automatically, since it no longer requires excessive
> memory or disk space
> - Use a SparseOutputStream to reduce required diskspace from 2GB to 4K.
> SparseOutputStream writes 'holes' instead of actual bytes until the final CEN
> is written.
> - TestTooManyEntries is renamed to CenSizeTooLarge
> - Merge branch 'master' into cen-size-too-large
> - MAX_EXTRA_FIELD_SIZE can be better expressed as 0xFFFF
> - Bring back '@requires sun.arch.data.model == 64' for now
> - Spell 'specified' correctly
> - Give test method a long, meaningful name
> - Remove blank line
> - Make CEN headers maximally large such that we need fewer of them and save
> memory
> - ... and 4 more: https://git.openjdk.org/jdk/compare/f5d01b56...9629b8d2

Thanks for restarting the discussion on this test cleanup. Please see my comments below.

test/jdk/java/util/zip/ZipFile/CenSizeTooLarge.java line 33:

> 31: import org.testng.annotations.BeforeTest;
> 32: import org.testng.annotations.Test;
> 33:

As this will be a new test, could you please consider converting it to JUnit?

test/jdk/java/util/zip/ZipFile/CenSizeTooLarge.java line 139:

> 137:     public void centralDirectoryTooLargeToFitInByteArray() {
> 138:         ZipException ex = expectThrows(ZipException.class, () -> new ZipFile(hugeZipFile));
> 139:         assertEquals(ex.getMessage(), "invalid END header (central directory size too large)");

Could we have the expected message as a class constant, so that if this message ever changes it is easier to find and update?

test/jdk/java/util/zip/ZipFile/CenSizeTooLarge.java line 187:

> 185:             } else {
> 186:                 channel.position(position);
> 187:                 if (Arrays.equals(LAST_COMMENT_BYTES, 0, LAST_COMMENT_BYTES.length, b, off, len)) {

I would suggest adding a comment here to help future developers. I realize there are some earlier breadcrumbs, but I think this would be beneficial.

-------------

PR Review: https://git.openjdk.org/jdk/pull/12991#pullrequestreview-1704236482
PR Review Comment: https://git.openjdk.org/jdk/pull/12991#discussion_r1376286075
PR Review Comment: https://git.openjdk.org/jdk/pull/12991#discussion_r1376295495
PR Review Comment: https://git.openjdk.org/jdk/pull/12991#discussion_r1376302371
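
For reference, the sizing argument in the quoted PR description (a maximally large extra field means fewer CEN records) can be sketched as plain arithmetic. This is only a rough illustration, not code from the PR. The class and method names are hypothetical, the name lengths are made-up examples, and the limit is assumed to be Integer.MAX_VALUE bytes; the exact threshold ZipFile enforces is not shown in this thread. The 46-byte fixed header size comes from the ZIP format.

```java
final class CenSizeSketch {
    // Fixed part of a central directory (CEN) file header in the ZIP format.
    static final int CEN_FIXED_HEADER = 46;
    // Largest value a 16-bit length field (name, extra, comment) can hold.
    static final int MAX_FIELD = 0xFFFF;

    /** Size in bytes of one CEN record with the given variable-length parts. */
    public static long cenRecordSize(int nameLen, int extraLen, int commentLen) {
        return (long) CEN_FIXED_HEADER + nameLen + extraLen + commentLen;
    }

    /** Smallest record count whose combined size is strictly greater than limit. */
    public static long minRecordsToExceed(long limit, long recordSize) {
        return limit / recordSize + 1;
    }

    public static void main(String[] args) {
        // Assumption: the limit is taken to be Integer.MAX_VALUE bytes.
        long limit = Integer.MAX_VALUE;
        // Hypothetical 36-character (UUID-length) name, no extra field.
        long small = cenRecordSize(36, 0, 0);
        // Hypothetical 8-character name with a maximally large extra field.
        long large = cenRecordSize(8, MAX_FIELD, 0);
        System.out.println("records needed (small): " + minRecordsToExceed(limit, small));
        System.out.println("records needed (large): " + minRecordsToExceed(limit, large));
    }
}
```

Under these assumptions the large-record variant needs several orders of magnitude fewer records, which is consistent with the memory savings described in the PR.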