Hi Commons folks,

I’d like to discuss splitting commons-compress into a small core plus per-format modules, while keeping the current behavior via a meta POM (no shaded jar).

JIRA: COMPRESS-710, “Proposal: Modularize Commons Compress into per-format artifacts to reduce attack surface”

Why: Many users only need TAR/GZip (e.g., testcontainers-java when building/extracting image tarballs). The single jar exposes them to parsers/codecs they never use and broadens the CVE blast radius.

Direction (high level):
* commons-compress-core (shared utils; JDK GZip/Deflate)
* commons-compress-tar, …-zip, …-7z, …-cpio, …-ar, etc.
* Optional codec modules for external deps (…-xz, …-zstd, …-brotli)
* commons-compress meta POM that brings today’s full set
* Preserve existing APIs; ArchiveStreamFactory remains (clear error if a requested format isn’t on the classpath)

PoC I propose (unless maintainers prefer to lead):
* Extract core + tar only, wire tests/CI, add migration notes (“need only TAR/GZip? depend on core + tar”)
* No API breaks; same packages; avoid split packages

Seeking quick feedback on:
* Support in principle for core + modules with a meta POM?
* Monolith behavior: meta POM only (no shaded jar)
* Initial module granularity (any formats to group/exclude) and deprecations (e.g., Pack200)?
* Single version across modules; single repo as a multi-module Maven build?

If there are no -1s within 7 days, I’ll start a small core+tar PoC branch and iterate in the open.

Thanks!
Vladimir
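
P.S. To make the "clear error if a requested format isn't on the classpath" point concrete, here is a rough sketch of one way the core could look up per-format modules. It is only an illustration, not the existing ArchiveStreamFactory code: the ArchiveFormatProvider interface, the FormatRegistry class, the use of java.util.ServiceLoader, and the module name suggested in the error message are all assumptions of mine for discussion.

```java
import java.util.Locale;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.TreeMap;

/** Hypothetical SPI that each per-format module (e.g. the tar module) would implement. */
interface ArchiveFormatProvider {
    /** Short format name such as "tar" or "zip". */
    String formatName();
}

/** Hypothetical core-side registry; a sketch, not the current ArchiveStreamFactory. */
final class FormatRegistry {
    private final Map<String, ArchiveFormatProvider> providers = new TreeMap<>();

    /** Discovers providers that modules register under META-INF/services. */
    static FormatRegistry fromClasspath() {
        FormatRegistry registry = new FormatRegistry();
        for (ArchiveFormatProvider provider : ServiceLoader.load(ArchiveFormatProvider.class)) {
            registry.register(provider);
        }
        return registry;
    }

    /** Registers a provider under its lower-cased format name. */
    void register(ArchiveFormatProvider provider) {
        providers.put(provider.formatName().toLowerCase(Locale.ROOT), provider);
    }

    /** Returns the provider for a format, or fails with an actionable message. */
    ArchiveFormatProvider require(String format) {
        String key = format.toLowerCase(Locale.ROOT);
        ArchiveFormatProvider provider = providers.get(key);
        if (provider == null) {
            // Assumed naming scheme: one artifact per format, as proposed above.
            throw new IllegalStateException(
                "Archive format '" + key + "' is not on the classpath; "
                + "consider adding the commons-compress-" + key + " module. "
                + "Available formats: " + providers.keySet());
        }
        return provider;
    }
}
```

With only a tar provider registered, require("zip") would fail fast with a message naming the missing module, rather than surfacing a NoClassDefFoundError somewhere deeper in the call stack. Whether discovery should use ServiceLoader or an explicit registry is exactly the kind of thing I'd like to settle in the PoC.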
