I think excluding more deps would be good…but challenging.

The problem is that some of the jars only wind up getting used for edge cases
(e.g. you have an encrypted email, and so you need Bouncy Castle, or something
like that, which has bitten me in the past).

So it’s hard to know what’s really required. Is there a good Java tool
for tracing all possible calls from a set of starting points, to see if it’s
even possible to reach a jar?

Though that would need some help for cases where we’re dynamically loading 
classes (mostly plug-in support?)
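
A minimal sketch of such a reachability check, assuming a map from each class
to the classes it statically references is already available (for example,
derived from the output of the JDK's `jdeps` tool). `JarReachability`, the
parameter names, and the toy data are hypothetical and not part of Tika.
References created through reflection or service loading do not appear in
such a map and would have to be added by hand.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: breadth-first reachability over a class-reference graph. */
class JarReachability {

    /**
     * Returns every class reachable from the entry points.
     * refs maps a class name to the classes it statically references;
     * building that map from bytecode is out of scope for this sketch.
     */
    static Set<String> reachableClasses(Map<String, Set<String>> refs,
                                        List<String> entryPoints) {
        Set<String> seen = new LinkedHashSet<>(entryPoints);
        Deque<String> queue = new ArrayDeque<>(entryPoints);
        while (!queue.isEmpty()) {
            String current = queue.removeFirst();
            for (String target : refs.getOrDefault(current, Set.of())) {
                if (seen.add(target)) {   // add() is false if already visited
                    queue.addLast(target);
                }
            }
        }
        return seen;
    }

    /** Collects the jars that contain at least one reachable class. */
    static Set<String> reachableJars(Map<String, Set<String>> refs,
                                     List<String> entryPoints,
                                     Map<String, String> classToJar) {
        Set<String> jars = new LinkedHashSet<>();
        for (String cls : reachableClasses(refs, entryPoints)) {
            String jar = classToJar.get(cls);
            if (jar != null) {            // classes with no known jar are skipped
                jars.add(jar);
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        // Toy data, made up for illustration only.
        Map<String, Set<String>> refs = Map.of(
            "app.Main", Set.of("parser.Email"),
            "parser.Email", Set.of("crypto.Decrypt"),
            "sched.Job", Set.of("pool.Conn"));
        Map<String, String> classToJar = Map.of(
            "parser.Email", "parsers.jar",
            "crypto.Decrypt", "crypto.jar",
            "sched.Job", "scheduler.jar",
            "pool.Conn", "pool.jar");
        System.out.println(reachableJars(refs, List.of("app.Main"), classToJar));
    }
}
```

In the toy data, `scheduler.jar` and `pool.jar` are never reached from
`app.Main`, so they would be exclusion candidates. A jar reached only through
a dynamically loaded class would wrongly look unused unless that edge is added
to `refs` manually.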

— Ken


> On Aug 24, 2016, at 10:59am, Konstantin Gribov <gros...@gmail.com> wrote:
> 
> Hi, folks.
> 
> It seems that we have too many dependencies in `tika-parsers`, and many of
> them may not actually be used. As Tim found in TIKA-2007 [1],
> `jackson-core` wasn't necessary for `tika-parsers` at all.
> 
> When I looked into the current parser deps I found a lot of strange ones, like
> `quartz` with `c3p0` (a JDBC connection pool impl) and `ehcache-core` via
> `cdm`, Lucene parts (via `ctakes-core`), Spring Framework 3.x (also via
> `ctakes-core`), et cetera. The latter could even break an app if you have
> another Spring version in your transitive deps.
> 
> Also, at first glance there seem to be no tests for the ctakes parser, and
> I have no easy way to check what I can exclude from deps without breaking
> things.
> 
> What do you think about trimming some of these deps? With at least minimal
> test coverage to ensure common use cases won't be broken, of course.
> 
> [1]:
> https://issues.apache.org/jira/browse/TIKA-2007?focusedCommentId=15435206&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15435206
> -- 
> 
> Best regards,
> Konstantin Gribov

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr


