On Sat, 30 Mar 2024 03:21:39 GMT, Jiangli Zhou <jian...@openjdk.org> wrote:
>> The other loop uses `readAt` to read in additional data and advance through >> the extra fields, this loop already has access to a buffer that contains all >> of the data for the extra fields and doesn't need to do that. >> >> I considered having the other loop read in all of the data for the extra >> fields so there could be a shared helper, but the data is variable length, >> so that would have required having a large fixed-size buffer or allocating >> memory, which this code seems to be trying to avoid. >> >> Do you see a good way to share the loop? > > How about something like the following (I haven't tried to compile or test, > please double check if everything is correct)? The size of the extra fields > is recorded in the 2-byte field, `extra field length`, so we can just read > all extra fields in a temp buffer. It causes less overhead with just one > read. > > > jboolean read_zip64_ext(int fd, jlong censtart, jlong base_offset, Byte > *cenhdr, > jboolean check_offset_only, > jboolean read_extra_fields) { > jlong cenlen = CENLEN(cenhdr); > jlong censiz = CENSIZ(cenhdr); > jlong cenoff = CENOFF(cenhdr); > if (cenoff == ZIP64_MAGICVAL || > (!check_offset_only && (censiz == ZIP64_MAGICVAL || cenlen == > ZIP64_MAGICVAL))) { > jlong cenext = CENEXT(cenhdr); > if (cenext == 0) { > return JNI_FALSE; > } > > jlong start = censtart + CENHDR + CENNAM(cenhdr); > jlong offset = 0; > Byte *p = start; > > // Read the extra fields if needed. > if (read_extra_fields) { > *p = (Byte*)malloc(cenext); > if (p == NULL) { > return JNI_FALSE; > } > if (!readAt(fd, start + offset, cenext, p)) { > free(p); > return JNI_FALSE; > } > } > > while (offset < cenext) { > short headerId = ZIPEXT_HDR(p + offset); > short headerSize = ZIPEXT_SIZ(p + offset); > if (headerId == ZIP64_EXTID) { > if (!read_zip64_ext(p, &cenlen, &censiz, &cenoff, CENDSK(cenhdr))) { > if (p != start) { > free(p); > } > return JNI_FALSE; > } > break; > } > offset += 4 + headerSize; > } > } > > if (p != start) { > free(p); > } > return JNI_TRUE; > } I think that's similar idea to one of the alternatives I mentioned earlier, won't that allocate for every central directory entry? This callsite has already read the data we need into a buffer, if we end up doing something like that it might be better to do the allocation only for the call site in `validate_lochdr`, and have the shared helper take the pointer to the buffer instead of having a duplicate read. It seems like there are some tradeoffs here, I can definitely restructure this, but I'd also like to get some feedback from a core-libs owner on the overall approach before iterating too much more. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18479#discussion_r1545070043