Hi Commons folks,

I’d like to discuss splitting commons-compress into a small core plus
per-format modules, while keeping the current behavior via a meta POM (no
shaded jar).

JIRA: COMPRESS-710 — “Proposal: Modularize Commons Compress into per-format
artifacts to reduce attack surface”

Why: Many users only need TAR/GZip (e.g., testcontainers-java when
building/extracting image tarballs). The single jar still puts every
other parser/codec on their classpath, so a CVE in a format they never
use still affects them and broadens the blast radius.

Direction (high level):
* commons-compress-core (shared utils; JDK GZip/Deflate)
* commons-compress-tar, …-zip, …-7z, …-cpio, …-ar, etc.
* Optional codec modules for external deps (…-xz, …-zstd, …-brotli)
* commons-compress meta POM that brings today’s full set
* Preserve existing APIs; ArchiveStreamFactory remains (clear error if a
requested format isn’t on the classpath)
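
To make the "clear error" point concrete, here is a rough sketch of one way
format lookup could work once formats live in separate modules. It is only an
illustration: FormatLookupSketch and ArchiveFormatProvider are hypothetical
names that do not exist in Commons Compress, the use of java.util.ServiceLoader
for discovery is my assumption rather than a settled design, and the artifact
name in the error message just follows the commons-compress-<format> pattern
from the list above.

```java
import java.util.Locale;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: none of these types exist in Commons Compress today.
final class FormatLookupSketch {

    /** Hypothetical SPI that each per-format module would implement. */
    public interface ArchiveFormatProvider {
        String formatName(); // e.g. "tar"
    }

    private final Map<String, ArchiveFormatProvider> providers = new ConcurrentHashMap<>();

    /** Builds a registry from whichever provider modules are on the classpath. */
    public static FormatLookupSketch fromClasspath() {
        FormatLookupSketch registry = new FormatLookupSketch();
        // ServiceLoader yields nothing if no module has registered a provider.
        for (ArchiveFormatProvider provider : ServiceLoader.load(ArchiveFormatProvider.class)) {
            registry.register(provider);
        }
        return registry;
    }

    public void register(ArchiveFormatProvider provider) {
        providers.put(provider.formatName().toLowerCase(Locale.ROOT), provider);
    }

    /** Returns the provider, or fails with a message naming the missing module. */
    public ArchiveFormatProvider require(String format) {
        String key = format.toLowerCase(Locale.ROOT);
        ArchiveFormatProvider provider = providers.get(key);
        if (provider == null) {
            // Artifact name is assumed from the proposal's naming pattern.
            throw new IllegalStateException("Archive format '" + key
                    + "' is not available; add the commons-compress-" + key
                    + " module to the classpath.");
        }
        return provider;
    }

    public static void main(String[] args) {
        FormatLookupSketch registry = fromClasspath();
        registry.register(() -> "tar"); // stand-in for a real tar module
        System.out.println(registry.require("tar").formatName());
        try {
            registry.require("7z");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the real code the existing ArchiveStreamFactory entry points would stay as
they are; the sketch only shows the lookup-and-fail-clearly part, and the
exception type used there would need to match what the current API throws.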

PoC I propose (unless maintainers prefer to lead):
* Extract core + tar only, wire tests/CI, add migration notes
(“need only TAR/GZip? depend on core + tar”)
* No API breaks; same packages; avoid split packages

Seeking quick feedback on:
* Support in principle for core + modules with a meta POM?
* Monolith behavior via a meta POM only (no shaded jar)?
* Initial module granularity (any formats to group/exclude) and
deprecations (e.g., Pack200)?
* Single version across modules; single repo as a multi-module Maven build?

If there are no -1s within 7 days, I’ll start a small core+tar PoC branch
and iterate in the open.

Thanks!
Vladimir
