+1

though a lot of thought would have to be given to exactly where to split what.

There are many benefits here beyond security. There's no particular
reason a Strings class and an Arrays class should be in the same
package or jar. Grab bag libraries like Apache Commons and Guava are
not particularly well factored or designed in the first place. They
mostly started 20+ years ago as "do everything the JDK doesn't do
that itches somebody" projects, and in many cases the JDK today
already does that, and does it better.

These utility classes are bundled in one big library for developer
convenience, not because it helps clients. It's less work to put
Arrays and Strings in the same package, repo, and jar than to spin
up a new project for what is clearly a different thing.
Indeed, the JDK itself suffers from this antipattern. JPMS tried and
failed to fix that, but it would be much easier to fix Apache Commons
than the JDK.

On Thu, Oct 30, 2025 at 5:50 AM Vladimir Sitnikov
<[email protected]> wrote:
>
> Hi all,
>
> Following the “Branch protection rules (CTR-style)” thread,
> I’d like to spin off a separate discussion about micro-modularizing some
> Commons libraries
> to reduce CVE blast radius and dependency weight.
>
> Motivation (real-world pain):
>
> As Sebb noted, unused classes shouldn’t affect runtime, however
> vulnerability scanners flag artifacts,
> not “used classes”.
> In practice teams must upgrade/patch even when only a tiny part is
> affected; proving non-impact is often harder than bumping or excluding.
>
> Mere presence of a vulnerable class on the classpath can widen attack
> surface
> (e.g., unsafe deserialization paths + a vulnerable helper available to the
> attacker).
>
> Recent examples show cross-bleed: projects that depend on
> commons-compress:1.25.0 saw multiple CVEs
> (CVE-2024-26308 Pack200 OOM, CVE-2024-25710 DUMP DoS) and also pulled in
> commons-lang3 where ClassUtils CVE-2025-48924 then arrives transitively.
> A modular layout like commons-pack200, commons-dump, commons-stringutils,
> commons-arrayutils, etc.,
> would let consumers pick only what they need and limit exposure.
>
> Concrete proposal (small, testable):
>
> Pilot a commons-stringutils4 artifact containing only StringUtils and
> Strings (and minimal shared internals if any).
> Use org.apache.commons.stringutils4 package so it could co-exist with the
> current commons-lang3.
>
> The existing commons-lang3 could depend on commons-stringutils4 so
> lang3.StringUtils could delegate all the methods to
> stringutils4.StringUtils.
>
> This would keep full backward compatibility for commons-lang3, and it would
> avoid code duplication.
> It would give users the ability to pull only StringUtils.
>
> Questions for the community:
>
> Are folks open to a pilot micro-module (commons-stringutils) released from
> the lang repo?
> Any hard blockers you see?
>
> Success criteria: adoption by projects that currently shade/extract
> StringUtils; fewer CVE flags for users that don’t pull the rest of lang3.
> For instance, even the commons-compress runtime seems to
> require just stringutils and arrayutils.
>
> If there’s interest, I can draft a PR with commons-stringutils4.
>
> Thanks,
> Vladimir
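
For illustration, the delegation described in the proposal could look
roughly like the sketch below. It is not taken from any actual PR. The
nested classes CoreStringUtils and LegacyStringUtils are hypothetical
stand-ins for the proposed stringutils4.StringUtils and the existing
lang3.StringUtils, placed in one file only so the example runs on its
own. The isBlank logic is a simplified assumption, not the real
commons-lang3 implementation.

```java
public class DelegationSketch {

    // Hypothetical stand-in for the proposed
    // org.apache.commons.stringutils4.StringUtils (the only place with logic).
    static final class CoreStringUtils {
        private CoreStringUtils() {}

        // Simplified assumption: null, empty, or whitespace-only counts as blank.
        static boolean isBlank(CharSequence cs) {
            if (cs == null) {
                return true;
            }
            for (int i = 0; i < cs.length(); i++) {
                if (!Character.isWhitespace(cs.charAt(i))) {
                    return false;
                }
            }
            return true;
        }
    }

    // Hypothetical stand-in for the existing org.apache.commons.lang3.StringUtils.
    // It keeps its public signature and forwards every call to the core class,
    // so existing callers keep compiling and no code is duplicated.
    static final class LegacyStringUtils {
        private LegacyStringUtils() {}

        static boolean isBlank(CharSequence cs) {
            return CoreStringUtils.isBlank(cs);
        }
    }

    public static void main(String[] args) {
        System.out.println(LegacyStringUtils.isBlank("  "));  // whitespace only
        System.out.println(LegacyStringUtils.isBlank("abc")); // has content
    }
}
```

In a real split, the two classes would live in separate artifacts and
commons-lang3 would declare a dependency on the new one. Consumers who
only need string helpers could then depend on the smaller artifact
directly.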



-- 
Elliotte Rusty Harold
[email protected]
