As far as I know, ProGuard does this kind of tracing internally, but it only works for trivial cases (like `Class.forName` with a string constant, see [1]). Another simple way is to monitor which classes actually get loaded, using `-verbose:class` on HotSpot [2].
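
To make the `-verbose:class` idea concrete, here is a rough sketch of how the captured output could be reduced to the set of jars that classes were actually loaded from. It assumes the Java 8 HotSpot line shape `[Loaded <class> from <source>]` (newer JVMs log in a different format), and the class name `LoadedJars`, the method `jarsFrom` and the sample paths are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika or ProGuard.
final class LoadedJars {
    // Assumption: Java 8 HotSpot prints "[Loaded <class> from <source>]".
    private static final Pattern LOADED =
            Pattern.compile("^\\[Loaded (\\S+) from (.+)\\]$");

    /** Returns the distinct sources (jars or directories) named in the log lines. */
    static Set<String> jarsFrom(List<String> logLines) {
        Set<String> sources = new TreeSet<>();
        for (String line : logLines) {
            Matcher m = LOADED.matcher(line.trim());
            if (m.matches()) {
                sources.add(m.group(2)); // group(1) is the class name
            }
        }
        return sources;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for real captured output.
        List<String> sample = Arrays.asList(
                "[Loaded java.lang.Object from /usr/lib/jvm/jre/lib/rt.jar]",
                "[Loaded org.example.Foo from file:/repo/example-a.jar]",
                "[Loaded org.example.Bar from file:/repo/example-a.jar]",
                "some unrelated output");
        for (String source : jarsFrom(sample)) {
            System.out.println(source);
        }
    }
}
```

Any jar on the classpath that never shows up in that set after a representative run is a candidate for exclusion, nothing more: it only proves the jar wasn't needed for the inputs that were actually exercised.
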
But the second approach won't show classes that were never loaded because of missing tests, as with the ctakes parser. At least it catches SPI and similar dynamic loading of plugins/modules. We also have optional deps like Stanford CoreNLP (optional because of its license, AFAIK), which neither method would cover.

Fine-grained exclusion would be hard to do, so I advocate a coarse-grained one. It could give a noticeable result with moderate effort, IMHO.

To be honest, I just exclude edu.ucar and similar deps when I use Tika because of their huge footprint: in my cases I can trade support for some scientific formats for a smaller footprint, so this issue doesn't affect me directly.

[1]: http://proguard.sourceforge.net/index.html#manual/usage.html
[2]: http://www.oracle.com/technetwork/java/javase/clopts-139448.html#gbmtm

On Wed, 24 Aug 2016 at 21:16, Ken Krugler <kkrugler_li...@transpac.com> wrote:

> I think excluding more deps would be good…but challenging.
>
> The problem is that some of the jars only wind up getting used for edge
> cases (e.g. you have an encrypted email, and so you need bouncy castle, or
> something like that which had bitten me in the past).
>
> So it’s hard to know what’s really required or not. Is there a good Java
> tool for tracing all possible calls from starting points, to see if it’s
> even possible to reach a jar?
>
> Though that would need some help for cases where we’re dynamically loading
> classes (mostly plug-in support?)
>
> -- Ken
>
> > On Aug 24, 2016, at 10:59am, Konstantin Gribov <gros...@gmail.com> wrote:
> >
> > Hi, folks.
> >
> > It seems that we have too much dependencies in `tika-parsers` and many of
> > them could actually be not used. As Tim found in TIKA-2007 [1]
> > `jackson-core` wasn't necessary for `tika-parsers` at all.
> >
> > When I looked into current parser deps I found a lot of strange deps like
> > `quartz` with `c3p0` (jdbc connection pool impl) and `ehcache-core` via
> > `cdm`, lucene parts (via `ctakes-core`), spring framework 3.x (also via
> > `ctakes-core`) et cetera. Latter could even break app if you have another
> > spring version in transitive deps.
> >
> > Also, there seems to be no tests for ctakes parser on the first glance and
> > I have no easy way to check what I can exclude from deps without breaking
> > things.
> >
> > What do you think about shrinking some of such deps? With at least minimal
> > test coverage to ensure common usecases won't be broken, of course.
> >
> > [1]: https://issues.apache.org/jira/browse/TIKA-2007?focusedCommentId=15435206&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15435206
> >
> > --
> > Best regards,
> > Konstantin Gribov
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr

--
Best regards,
Konstantin Gribov