I’d like to see the discriminators on the parsers say more about the type of 
parser and what it’s going to drag along or how it will impact my system. The 
current names reflect more the history of Tika’s evolution.

Starting with the descriptive paragraphs, here is some brainstorming of names:

with the exception of optional OCR, these
should be lightish weight dependencies in pure java with no
parsers/resources that require network calls.

        —tika-parsers-files
        —tika-parsers-alljava
        —tika-parsers-local
        —tika-parsers-simple
        —tika-parsers-lightweight
        —tika-parsers-aluminum

these can require native libs and/or have
heavier dependencies, including network calls.

        —tika-parsers-heavy
        —tika-parsers-complex
        —tika-parsers-extended-dependencies
        —tika-parsers-iron


anything goes. dl4j as a dependency, etc.

        —tika-parsers-anything-goes
        —tika-parsers-sandbox
        —tika-parsers-deep
        —tika-parsers-model-driven
        —tika-parsers-lead




> On Mar 9, 2021, at 12:03 PM, Tim Allison <talli...@apache.org> wrote:
> 
> All,
>  I was recently chatting about Tika 2.x with some Tika friends and
> they had some hesitation about the names for the three high level
> parser modules.
> 
> They are currently:
> 
> tika-parsers-classic
> tika-parsers-extended
> tika-parsers-advanced
> 
> The quibbles weren't with the delineation, but with the naming.
> 
> In my mind, this is what I've been thinking as definitions:
> 
> tika-parsers-classic -- with the exception of optional OCR, these
> should be lightish weight dependencies in pure java with no
> parsers/resources that require network calls.
> 
> tika-parsers-extended -- these can require native libs and/or have
> heavier dependencies, including network calls.
> 
> tika-parsers-advanced -- anything goes. dl4j as a dependency, etc.
> 
> Some options for classic-> basic, base, ...what else?
> 
> Any other recommendations for these names?  Thank you!
> 
> Best,
> 
>           Tim

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | 
My Free/Busy <http://tinyurl.com/eric-cal>  
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
<https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>