All,

I was recently chatting about Tika 2.x with some Tika friends, and they had some hesitation about the names for the three high-level parser modules.
They are currently:

tika-parsers-classic
tika-parsers-extended
tika-parsers-advanced

The quibbles weren't with the delineation, but with the naming. Here are the definitions I've had in mind:

tika-parsers-classic -- with the exception of optional OCR, these should have fairly lightweight dependencies in pure Java, with no parsers/resources that require network calls.

tika-parsers-extended -- these can require native libs and/or have heavier dependencies, including network calls.

tika-parsers-advanced -- anything goes: dl4j as a dependency, etc.

Some options for replacing "classic": basic, base, ...what else?

Any other recommendations for these names?

Thank you!

Best,

Tim